StreamNative Pulsar Operator on EKS
The StreamNative Pulsar Operator manages Pulsar service instances deployed on a Kubernetes cluster. In this article, I demonstrate how to use the StreamNative Pulsar operators to deploy a Pulsar cluster on an Amazon EKS cluster.
Create an AWS account
INSTALLING PRE-REQUISITES
Install Kubectl
sudo curl --silent --location -o /usr/local/bin/kubectl \
  https://s3.us-west-2.amazonaws.com/amazon-eks/1.21.2/2021-07-05/bin/linux/amd64/kubectl
sudo chmod +x /usr/local/bin/kubectl
Note: You must use a kubectl version that is within one minor version of your Amazon EKS cluster control plane. For example, a 1.21 kubectl client works with Kubernetes 1.20, 1.21, and 1.22 clusters.
Update AWS cli
sudo pip install --upgrade awscli && hash -r
Install Helm (v3.0.2 or higher)
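The Helm step has no command attached; a common way to install Helm v3 is the official installer script (script URL per the Helm project's documentation; inspect the script before running it):

```shell
# Sketch: install Helm v3 via the official installer script.
# Skip this if helm is already on the PATH.
install_helm() {
  curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3.sh
  chmod 700 get_helm.sh
  ./get_helm.sh          # installs to /usr/local/bin/helm (may prompt for sudo)
}
# install_helm && helm version --short   # confirm v3.0.2 or higher
```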
Install eksctl for managing Amazon EKS
curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin
eksctl version
CREATING AN AMAZON EKS CLUSTER
Amazon Elastic Kubernetes Service (Amazon EKS) is a fully managed Kubernetes service, trusted to run sensitive and mission-critical applications because of its security, reliability, and scalability.
EKS Cluster creation
CLUSTER="<name>"
AWS_REGION="<region>"
eksctl create cluster --name ${CLUSTER} --region ${AWS_REGION} --zones ${AWS_REGION}a,${AWS_REGION}b
Launching EKS and all of its dependencies takes approximately 15 minutes.
Verify the cluster
Test the cluster:
Get cluster Name and Region:
eksctl get clusters
Set CLUSTER: (use the name of the Cluster from earlier step)
CLUSTER=<YOUR_CLUSTER_NAME>
Update kubeconfig:
aws eks update-kubeconfig --name $CLUSTER --region $AWS_REGION
Confirm your nodes:
kubectl get nodes # if we see our 3 nodes, we know we have authenticated correctly
Congratulations!
You now have a fully working Amazon EKS Cluster that is ready to use!
Why use k8s operators?
Kubernetes' operator pattern lets you extend a cluster's behaviour without modifying the code of Kubernetes itself, by linking controllers to one or more custom resources. Operators are clients of the Kubernetes API that act as controllers for a custom resource.
StreamNative provides a good overview of the operator concept here. StreamNative open-sources four operators (ZooKeeper, BookKeeper, broker, and Function Mesh), which are packaged as a Helm chart. For a quick start, you can follow the official installation doc here. This article takes a step-by-step approach to setting up the Pulsar cluster and observing its capabilities.
To install the Pulsar Operator, follow these steps.
1. Add the StreamNative Helm chart repository and update your local chart list.
helm repo add streamnative https://charts.streamnative.io
helm repo update
2. Create a Kubernetes namespace.
kubectl create namespace <k8s-namespace>
3. Install the Pulsar Operator using the pulsar-operator Helm chart.
helm install pulsar-operators --namespace <k8s-namespace> streamnative/pulsar-operator
4. Verify that the Pulsar Operator is installed successfully.
kubectl get po -n <k8s-namespace>
Expected outputs:
$ kubectl get all -n test
NAME READY STATUS RESTARTS AGE
pod/pulsar-operators-bookkeeper-controller-manager-6ff67464ff-szg5z 1/1 Running 0 2m54s
pod/pulsar-operators-pulsar-controller-manager-bc8fb895f-ddnwj 1/1 Running 0 2m54s
pod/pulsar-operators-zookeeper-controller-manager-6568777d4f-wqdpr 1/1 Running 0 2m54s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/pulsar-operators-bookkeeper-controller-manager 1/1 1 1 2m54s
deployment.apps/pulsar-operators-pulsar-controller-manager 1/1 1 1 2m54s
deployment.apps/pulsar-operators-zookeeper-controller-manager 1/1 1 1 2m54s
NAME DESIRED CURRENT READY AGE
replicaset.apps/pulsar-operators-bookkeeper-controller-manager-6ff67464ff 1 1 1 2m54s
replicaset.apps/pulsar-operators-pulsar-controller-manager-bc8fb895f 1 1 1 2m54s
replicaset.apps/pulsar-operators-zookeeper-controller-manager-6568777d4f 1 1 1 2m54s
If you check your Kubernetes API resources, you will find all the newly created APIs.
$ kubectl api-resources | grep pulsar
pulsarbrokers pb,broker pulsar.streamnative.io/v1alpha1 true PulsarBroker
pulsarconnections pconn pulsar.streamnative.io/v1alpha1 true PulsarConnection
pulsarnamespaces pns pulsar.streamnative.io/v1alpha1 true PulsarNamespace
pulsarpermissions ppermission pulsar.streamnative.io/v1alpha1 true PulsarPermission
pulsarproxies pp,proxy pulsar.streamnative.io/v1alpha1 true PulsarProxy
pulsartenants ptenant pulsar.streamnative.io/v1alpha1 true PulsarTenant
pulsartopics ptopic pulsar.streamnative.io/v1alpha1 true PulsarTopic
As shown above, there are three controllers/operators, and each handles a different "kind" of cluster. You can also use the API to create topics, tenants, and permissions. As with a standard Kubernetes controller such as Deployment, we tell the controller what we want by feeding it the cluster definitions. In a regular Kubernetes deployment, you put all kinds of components in a YAML file and use "apply" or "create" to create pods, services, ConfigMaps, and others. Likewise, you could put the ZooKeeper, BookKeeper, and broker cluster definitions in a single YAML file and deploy them in one shot. To make the deployment easier to understand and troubleshoot, I break it down into three steps: ZooKeeper, BookKeeper, then broker. If you want to understand the dependencies among the three, check out Sijie Guo's TGI Pulsar 001. The startup sequence is ZooKeeper, BookKeeper, broker, then the others.
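The three-step sequence can be sketched as a small script. The manifest file names match the samples used later in this article; the --for=condition=Ready checks are an assumption about how the operators report readiness (a "Ready" condition does appear in the controller logs shown below):

```shell
# Sketch of the deployment order: ZooKeeper, then BookKeeper, then the broker.
deploy_pulsar() {
  kubectl apply -f zookeeper-sample.yaml
  kubectl wait zookeepercluster/test-zookeeper -n test --for=condition=Ready --timeout=600s
  kubectl apply -f bookie-sample.yaml
  kubectl wait bookkeepercluster/test-bookie -n test --for=condition=Ready --timeout=600s
  kubectl apply -f broker-sample.yaml
}
# deploy_pulsar
```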
The following is the ZooKeeperCluster definition that the ZooKeeper controller/operator takes to deploy a ZooKeeper cluster. It is very similar to a Deployment: it defines the image location, version, replicas, resources, and persistent-disk properties. There are other properties, such as JVM flags for tuning; I will discuss those later, as the focus here is getting a running cluster, and the operator should keep the extra configuration updated automatically.
Deploy a ZooKeeper cluster.
apiVersion: zookeeper.streamnative.io/v1alpha1
kind: ZooKeeperCluster
metadata:
  name: test-zookeeper
  namespace: test
spec:
  image: streamnative/pulsar:2.8.0.9
  pod:
    resources:
      requests:
        cpu: 50m
        memory: 256Mi
  persistence:
    data:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 8Gi
    dataLog:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 2Gi
    reclaimPolicy: Delete
  replicas: 3
Let’s apply this file and see what happens.
$ kubectl apply -f zookeeper-sample.yaml
zookeepercluster.zookeeper.streamnative.io/test-zookeeper created
$ kubectl get pods -n test
NAME READY STATUS RESTARTS AGE
pulsar-operators-bookkeeper-controller-manager-6ff67464ff-szg5z 1/1 Running 0 23m
pulsar-operators-pulsar-controller-manager-bc8fb895f-ddnwj 1/1 Running 0 23m
pulsar-operators-zookeeper-controller-manager-6568777d4f-wqdpr 1/1 Running 0 23m
test-zookeeper-zk-0 1/1 Running 0 67s
test-zookeeper-zk-1 1/1 Running 0 67s
test-zookeeper-zk-2 1/1 Running 0 67s
$ kubectl logs -n test pulsar-operators-zookeeper-controller-manager-6568777d4f-wqdpr
{"severity":"info","timestamp":"2022-08-03T20:48:28Z","logger":"controllers.ZooKeeperCluster","message":"Reconciling ZooKeeperCluster","Request.Namespace":"test","Request.Name":"test-zookeeper"}
{"severity":"info","timestamp":"2022-08-03T20:48:28Z","logger":"controllers.ZooKeeperCluster","message":"Updating an existing ZooKeeper StatefulSet","StatefulSet.Namespace":"test","StatefulSet.Name":"test-zookeeper-zk"}
{"severity":"debug","timestamp":"2022-08-03T20:48:28Z","logger":"controller","message":"Successfully Reconciled","reconcilerGroup":"zookeeper.streamnative.io","reconcilerKind":"ZooKeeperCluster","controller":"zookeepercluster","name":"test-zookeeper","namespace":"test"}
{"severity":"info","timestamp":"2022-08-03T20:48:28Z","logger":"controllers.ZooKeeperCluster","message":"Reconciling ZooKeeperCluster","Request.Namespace":"test","Request.Name":"test-zookeeper"}
{"severity":"info","timestamp":"2022-08-03T20:48:28Z","logger":"controllers.ZooKeeperCluster","message":"Updating an existing ZooKeeper StatefulSet","StatefulSet.Namespace":"test","StatefulSet.Name":"test-zookeeper-zk"}
{"severity":"debug","timestamp":"2022-08-03T20:48:28Z","logger":"controller","message":"Successfully Reconciled","reconcilerGroup":"zookeeper.streamnative.io","reconcilerKind":"ZooKeeperCluster","controller":"zookeepercluster","name":"test-zookeeper","namespace":"test"}
The ZooKeeper controller runs a reconcile loop that keeps checking the status of the ZooKeeperCluster resource in the test namespace. This is the recommended operator design pattern. The operator pod's log is handy for troubleshooting the Pulsar deployment.
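To watch the reconcile loop yourself, you can check the resource's status and tail the controller's log (the deployment name comes from the kubectl output earlier; the status field layout is an assumption about the CRD schema):

```shell
# Inspect the ZooKeeperCluster status and the controller's recent log lines.
watch_zookeeper() {
  kubectl get zookeepercluster test-zookeeper -n test -o yaml | grep -A 5 'status:'
  kubectl logs -n test deploy/pulsar-operators-zookeeper-controller-manager --tail=20
}
# watch_zookeeper
```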
The next component is the BookKeeper cluster. Following the same pattern, I define the BookKeeperCluster kind as follows.
Deploy a BookKeeper cluster.
apiVersion: bookkeeper.streamnative.io/v1alpha1
kind: BookKeeperCluster
metadata:
  name: test-bookie
  namespace: test
spec:
  image: streamnative/pulsar:2.8.0.9
  replicas: 3
  pod:
    resources:
      requests:
        cpu: 200m
        memory: 256Mi
  storage:
    journal:
      numDirsPerVolume: 1
      numVolumes: 1
      volumeClaimTemplate:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 8Gi
    ledger:
      numDirsPerVolume: 1
      numVolumes: 1
      volumeClaimTemplate:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 16Gi
    reclaimPolicy: Delete
  zkServers: test-zookeeper:2181
$ kubectl apply -f bookie-sample.yaml
bookkeepercluster.bookkeeper.streamnative.io/test-bookie created
$ kubectl get pods -n test
NAME READY STATUS RESTARTS AGE
pulsar-operators-bookkeeper-controller-manager-6ff67464ff-szg5z 1/1 Running 0 32m
pulsar-operators-pulsar-controller-manager-bc8fb895f-ddnwj 1/1 Running 0 32m
pulsar-operators-zookeeper-controller-manager-6568777d4f-wqdpr 1/1 Running 0 32m
test-bookie-bk-init-qfqkk 1/1 Running 0 52s
test-zookeeper-zk-0 1/1 Running 0 10m
test-zookeeper-zk-1 1/1 Running 0 10m
test-zookeeper-zk-2 1/1 Running 0 10m
$ kubectl logs -n test pulsar-operators-bookkeeper-controller-manager-6ff67464ff-szg5z
{"severity":"info","timestamp":"2022-08-03T20:56:48Z","logger":"controllers.BookKeeperCluster","message":"Reconciling BookKeeperCluster","Request.Namespace":"test","Request.Name":"test-bookie"}
{"severity":"info","timestamp":"2022-08-03T20:56:48Z","logger":"controllers.BookKeeperCluster","message":"Updating the status for the BookKeeperCluster","Namespace":"test","Name":"test-bookie","Status":{"observedGeneration":2,"replicas":0,"labelSelector":"","conditions":[{"type":"Ready","status":"False","reason":"Ready","message":"0/0 pod ready","lastTransitionTime":"2022-08-03T20:56:48Z"}]}}
{"severity":"debug","timestamp":"2022-08-03T20:56:48Z","logger":"controller","message":"Successfully Reconciled","reconcilerGroup":"bookkeeper.streamnative.io","reconcilerKind":"BookKeeperCluster","controller":"bookkeepercluster","name":"test-bookie","namespace":"test"}
{"severity":"info","timestamp":"2022-08-03T20:56:48Z","logger":"controllers.BookKeeperCluster","message":"Reconciling BookKeeperCluster","Request.Namespace":"test","Request.Name":"test-bookie"}
{"severity":"info","timestamp":"2022-08-03T20:56:48Z","logger":"controllers.BookKeeperCluster","message":"Updating the status for the BookKeeperCluster","Namespace":"test","Name":"test-bookie","Status":{"observedGeneration":2,"replicas":0,"labelSelector":"","conditions":[{"type":"Ready","status":"False","reason":"Ready","message":"0/0 pod ready","lastTransitionTime":"2022-08-03T20:56:48Z"}]}}
{"severity":"debug","timestamp":"2022-08-03T20:56:48Z","logger":"controller","message":"Successfully Reconciled","reconcilerGroup":"bookkeeper.streamnative.io","reconcilerKind":"BookKeeperCluster","controller":"bookkeepercluster","name":"test-bookie","namespace":"test"}
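You can query the same Ready condition the controller writes into the resource status (the condition type "Ready" appears in the log lines above; the jsonpath below assumes the standard conditions layout):

```shell
# Print the Ready condition of the BookKeeperCluster resource.
bookie_ready() {
  kubectl get bookkeepercluster test-bookie -n test \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}
# bookie_ready   # prints "True" once all bookies are up
```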
Deploy a Pulsar broker.
The next component is the broker cluster. The following is the PulsarBroker YAML file. From the file, you can see that the broker also depends on zkServers. Note that I add "config.custom" to the descriptor to enable the WebSocket service.
apiVersion: pulsar.streamnative.io/v1alpha1
kind: PulsarBroker
metadata:
  name: test-broker
  namespace: test
spec:
  image: streamnative/pulsar:2.8.0.9
  pod:
    resources:
      requests:
        cpu: 200m
        memory: 256Mi
    terminationGracePeriodSeconds: 30
  config:
    custom:
      webSocketServiceEnabled: "true"
  replicas: 1
  zkServers: test-zookeeper:2181
$ kubectl apply -f broker-sample.yaml
pulsarbroker.pulsar.streamnative.io/test-broker created
$ kubectl get all -n test
NAME READY STATUS RESTARTS AGE
pod/my-broker-metadata-init-c55rk 1/1 Running 0 69s
pod/pulsar-operators-bookkeeper-controller-manager-6ff67464ff-szg5z 1/1 Running 0 60m
pod/pulsar-operators-pulsar-controller-manager-bc8fb895f-ddnwj 1/1 Running 0 60m
pod/pulsar-operators-zookeeper-controller-manager-6568777d4f-wqdpr 1/1 Running 0 60m
pod/test-bookie-bk-init-qfqkk 1/1 Running 0 29m
pod/test-zookeeper-zk-0 1/1 Running 0 38m
pod/test-zookeeper-zk-1 1/1 Running 0 38m
pod/test-zookeeper-zk-2 1/1 Running 0 38m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/my-broker ClusterIP 10.100.209.198 <none> 6650/TCP,8080/TCP 69s
service/my-broker-headless ClusterIP None <none> 6650/TCP,8080/TCP 69s
service/test-broker-broker ClusterIP 10.100.168.219 <none> 6650/TCP,8080/TCP 23m
service/test-broker-broker-headless ClusterIP None <none> 6650/TCP,8080/TCP 23m
service/test-zookeeper-zk ClusterIP 10.100.163.15 <none> 2181/TCP,8000/TCP,9990/TCP 38m
service/test-zookeeper-zk-headless ClusterIP None <none> 2181/TCP,2888/TCP,3888/TCP,8000/TCP,9990/TCP 38m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/pulsar-operators-bookkeeper-controller-manager 1/1 1 1 60m
deployment.apps/pulsar-operators-pulsar-controller-manager 1/1 1 1 60m
deployment.apps/pulsar-operators-zookeeper-controller-manager 1/1 1 1 60m
NAME DESIRED CURRENT READY AGE
replicaset.apps/pulsar-operators-bookkeeper-controller-manager-6ff67464ff 1 1 1 60m
replicaset.apps/pulsar-operators-pulsar-controller-manager-bc8fb895f 1 1 1 60m
replicaset.apps/pulsar-operators-zookeeper-controller-manager-6568777d4f 1 1 1 60m
NAME READY AGE
statefulset.apps/test-zookeeper-zk 3/3 38m
NAME COMPLETIONS DURATION AGE
job.batch/my-broker-metadata-init 0/1 69s 69s
job.batch/test-bookie-bk-init 0/1 29m 29m
job.batch/test-broker-broker-metadata-init 0/1 23m
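Once the broker is up, a quick sanity check is to forward its HTTP port locally and hit the Pulsar admin REST API (the service name comes from the kubectl output above; /admin/v2/clusters is a standard Pulsar admin endpoint):

```shell
# Hedged smoke check: port-forward the broker service and list clusters.
verify_broker() {
  kubectl port-forward -n test svc/test-broker-broker 8080:8080 &
  local pf=$!
  sleep 3
  curl -s http://localhost:8080/admin/v2/clusters   # expects a JSON list of cluster names
  kill "$pf"
}
# verify_broker
```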
Deploy a Pulsar Proxy.
apiVersion: pulsar.streamnative.io/v1alpha1
kind: PulsarProxy
metadata:
  name: test-proxy
  namespace: test
spec:
  brokerAddress: broker.test.svc.cluster.local
  dnsNames: []
  image: streamnative/pulsar:2.8.0.9
  issuerRef:
    name: ""
  pod:
    resources:
      requests:
        cpu: 200m
        memory: 256Mi
  replicas: 1
Apply the YAML file to create the Pulsar proxy.
kubectl apply -f /path/to/proxy-sample.yaml
Check whether the Pulsar proxy is created successfully.
$ kubectl get all -n test
NAME READY STATUS RESTARTS AGE
pod/my-broker-metadata-init-c55rk 1/1 Running 0 24m
pod/pulsar-operators-bookkeeper-controller-manager-6ff67464ff-szg5z 1/1 Running 0 14m
pod/pulsar-operators-pulsar-controller-manager-bc8fb895f-ddnwj 1/1 Running 0 84m
pod/pulsar-operators-zookeeper-controller-manager-6568777d4f-wqdpr 1/1 Running 0 84m
pod/test-bookie-bk-init-qfqkk 1/1 Running 0 52m
pod/test-zookeeper-zk-0 1/1 Running 0 61m
pod/test-zookeeper-zk-1 1/1 Running 0 61m
pod/test-zookeeper-zk-2 1/1 Running 0 61m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/my-broker ClusterIP 10.100.209.198 <none> 6650/TCP,8080/TCP 24m
service/my-broker-headless ClusterIP None <none> 6650/TCP,8080/TCP 24m
service/my-proxy-external LoadBalancer 10.100.244.52 10.0.0.37 6650:30091/TCP,8080:30184/TCP 14m
service/my-proxy-headless ClusterIP None <none> 6650/TCP,8080/TCP 14m
service/test-broker-broker ClusterIP 10.100.168.219 <none> 6650/TCP,8080/TCP 46m
service/test-broker-broker-headless ClusterIP None <none> 6650/TCP,8080/TCP 46m
service/test-zookeeper-zk ClusterIP 10.100.163.15 <none> 2181/TCP,8000/TCP,9990/TCP 61m
service/test-zookeeper-zk-headless ClusterIP None <none> 2181/TCP,2888/TCP,3888/TCP,8000/TCP,9990/TCP 61m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/pulsar-operators-bookkeeper-controller-manager 1/1 1 1 60m
deployment.apps/pulsar-operators-pulsar-controller-manager 1/1 1 1 60m
deployment.apps/pulsar-operators-zookeeper-controller-manager 1/1 1 1 60m
NAME DESIRED CURRENT READY AGE
replicaset.apps/pulsar-operators-bookkeeper-controller-manager-6ff67464ff 1 1 1 60m
replicaset.apps/pulsar-operators-pulsar-controller-manager-bc8fb895f 1 1 1 60m
replicaset.apps/pulsar-operators-zookeeper-controller-manager-6568777d4f 1 1 1 60m
NAME READY AGE
statefulset.apps/my-proxy 3/3 14m
statefulset.apps/test-zookeeper-zk 3/3 38m
NAME COMPLETIONS DURATION AGE
job.batch/my-broker-metadata-init 0/1 69s 69s
job.batch/test-bookie-bk-init 0/1 29m 29m
job.batch/test-broker-broker-metadata-init 0/1 23m
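With all components running, an end-to-end test is to produce and consume a message with the pulsar-client CLI bundled in the image. The broker pod name below is an assumption (check kubectl get pods -n test for the actual name); the topic is created on demand:

```shell
# Hedged end-to-end test: produce then consume one message from inside a broker pod.
pulsar_smoke_test() {
  local pod="test-broker-broker-0"   # assumption: first broker StatefulSet pod
  kubectl exec -n test "$pod" -- bin/pulsar-client produce my-topic -m "hello pulsar" -n 1
  kubectl exec -n test "$pod" -- bin/pulsar-client consume my-topic -s my-sub -n 1 -p Earliest
}
# pulsar_smoke_test
```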
Introducing Pulsar Resources Operator for Kubernetes
The Pulsar Resources Operator is an independent controller that automatically manages Pulsar resources on Kubernetes using manifest files. It provides full lifecycle management (creation, update, and deletion) for Pulsar resources such as tenants, namespaces, topics, and permissions.
The tutorial below guides you through creating Pulsar resources automatically by applying resource manifest files to Kubernetes.
Before creating Pulsar resources, you must create a resource called PulsarConnection. The PulsarConnection contains the address of the Pulsar cluster and the authentication information. The operator uses this information to access the Pulsar cluster when creating the other resources.
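As a sketch, a minimal PulsarConnection manifest might look like the following. The apiVersion matches the kubectl api-resources output earlier in this article; the adminServiceURL field name and the service URL are assumptions, so check the Pulsar Resources Operator reference for the exact schema:

```shell
# Write a minimal PulsarConnection manifest (field names are assumptions;
# verify them against the operator's reference documentation).
cat > /tmp/connection-sample.yaml <<'EOF'
apiVersion: pulsar.streamnative.io/v1alpha1
kind: PulsarConnection
metadata:
  name: test-pulsar-connection
  namespace: test
spec:
  adminServiceURL: http://test-broker-broker.test.svc.cluster.local:8080
EOF
# kubectl apply -f /tmp/connection-sample.yaml
```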