4 minutes
Banzai Kafka in Kubernetes
This example deploys a Kafka cluster using Banzai Cloud Kafka Operator. This guide is a condensed version of the official Banzai documentation, the idea being to quickly get up and running with Kafka.
Setup
Prerequisites:
Provision Kubernetes
Any kubernetes cluster will do but for this tutorial I'll use Civo Cloud's offering.
civo k8s create \
--nodes 3 \
--save --switch --wait \
kafka
Provision Dependencies
Let's get the core dependencies provisioned. We'll need:
- Cert Manager: cert manager off loads the heavy lifting for certificate generation and rotation. There are many configurations that can be enabled, just check out the Banzai documentation for more information.
- Prometheus Operator: We only need the core bundle which will enable the use of service monitors and alerts. This is one of the key enablers that allows Banzai kafka clusters to recover and rebalance data.
- Zookeeper: You can use any zookeeper endpoint but Banzai has packaged an operator for use. Zookeeper is the key value database which stores the kafka state.
Note: This tutorial should be fully recreatable from the clodeblocks. However, it may be helpful to wait until each workload finishes deploying before moving on to the next command. I recommend putting a watch on the cluster pods in a separate terminal watch kubectl get pods --all-namespaces
.
We'll also need the Banzai Helm Repo:
helm repo add banzaicloud-stable https://kubernetes-charts.banzaicloud.com/
Cert-Manager:
# create cert-manager deployment
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v0.11.0/cert-manager.yaml
Prometheus Operator:
# create the prometheus operator
kubectl apply -n default -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/bundle.yaml
Zookeeper Operator:
# create zookeeper namespace
kubectl create namespace zookeeper
# create zk operator
helm upgrade --install \
--namespace zookeeper \
--wait \
zookeeper-operator banzaicloud-stable/zookeeper-operator
Zookeeper:
# create zookeeper cluster
cat <<EOF | kubectl apply --namespace zookeeper -f -
apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
name: zookeepercluster
namespace: zookeeper
spec:
replicas: 3
EOF
Provision Banzai Kafka
Again there are lots of configurations details this guide does not cover. Please see the Banzai documentation for more options.
Components:
- Kafka Operator: will maintain the lifecycle, data rebalancing and scaling for all the provisioned Kafkas in the cluster.
- Kafka Instance: this guide provisions a simple kafka instance, configured for internal cluster access and initialized with some basic scaling/rebalancing rules.
Kafka Operator:
# create the kafka namespace
kubectl create namespace kafka
# get the values file configured for prometheus
TMP_FILE=/tmp/kafka-prometheus-alerts.yaml
curl https://raw.githubusercontent.com/gabeduke/civo-kafka-example/v1.0.0/kafka-prometheus-alerts.yaml -o $TMP_FILE -s
# install kafka operator with prometheus alerts
helm upgrade --install \
--namespace kafka \
--values $TMP_FILE \
kafka-operator banzaicloud-stable/kafka-operator
Kafka:
# create the kafka cluster
KAFKA_INSTANCE=https://raw.githubusercontent.com/gabeduke/civo-kafka-example/v1.0.0/kafka.yaml
curl $KAFKA_INSTANCE | kubectl apply -n kafka -f -
# create the service monitor
KAFKA_SERVICE_MONITOR=https://raw.githubusercontent.com/gabeduke/civo-kafka-example/v1.0.0/kafka-prometheus.yaml
curl $KAFKA_SERVICE_MONITOR | kubectl apply -n kafka -f -
Validate
First we will validate the Cruise Control Dashboard is online and healthy. Cruise Control is a tool from LinkedIn which provides exceptional operational control over kafka clusters. The API can be triggered via Prometheus alerts, making Banzai clusters highly resilient. Go ahead and explore the configuration options.
Cruise Control Dashboard:
# proxy to the cruise-control dashboard for kafka maintenance (may take a couple minutes)
kubectl port-forward -n kafka svc/kafka-cruisecontrol-svc 8090:8090 &
echo http://localhost:8090
We can also validate that we can produce and consume from this cluster. First we need to provision a topic to use (note: the cluster must be finished provisioning before the topic can be applied):
# create a topic to which we can produce/consume
cat <<EOF | kubectl apply -f -
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaTopic
metadata:
name: civo-topic
namespace: kafka
spec:
clusterRef:
name: kafka
name: civo-topic
partitions: 3
replicationFactor: 2
config:
"retention.ms": "604800000"
"cleanup.policy": "delete"
EOF
Run the following two commands in separate terminals:
Produce:
# run a producer in a pod
kubectl run kafka-producer \
-n kafka -it --rm=true --restart=Never \
--image=wurstmeister/kafka:2.12-2.3.0 \
-- /opt/kafka/bin/kafka-console-producer.sh \
--broker-list kafka-headless:29092 \
--topic civo-topic
Consume:
# run a consumer in a pod
kubectl run kafka-consumer \
-n kafka -it --rm=true --restart=Never \
--image=wurstmeister/kafka:2.12-2.3.0 \
-- /opt/kafka/bin/kafka-console-consumer.sh \
--bootstrap-server kafka-headless:29092 \
--from-beginning \
--topic civo-topic
Clean
Well that was fun. Go check out the Banzai docs and tune a cluster to your needs. Time to clean up!
# kill any dangling proxies
killall kubectl
# clean up the kubernetes cluster
civo k8s delete kafka
702 Words
2020-01-09 14:30 +0000