This example deploys a Kafka cluster using the Banzai Cloud Kafka Operator. It is a condensed version of the official Banzai documentation, meant to get you up and running with Kafka quickly.

Setup

Prerequisites:

Provision Kubernetes

Any Kubernetes cluster will do, but for this tutorial I'll use Civo Cloud's offering.

civo k8s create \
  --nodes 3 \
  --save --switch --wait \
  kafka

Provision Dependencies

Let's get the core dependencies provisioned. We'll need:

  • Cert Manager: offloads the heavy lifting of certificate generation and rotation. Many configurations can be enabled; see the Banzai documentation for more information.
  • Prometheus Operator: we only need the core bundle, which enables the use of service monitors and alerts. This is one of the key enablers that allow Banzai Kafka clusters to recover and rebalance data.
  • Zookeeper: you can use any Zookeeper endpoint, but Banzai has packaged an operator for use. Zookeeper is the key-value store that holds the Kafka state.

Note: This tutorial should be fully reproducible from the code blocks. However, it may help to wait until each workload finishes deploying before moving on to the next command. I recommend keeping a watch on the cluster pods in a separate terminal: watch kubectl get pods --all-namespaces

We'll also need the Banzai Helm Repo:

helm repo add banzaicloud-stable https://kubernetes-charts.banzaicloud.com/

Cert-Manager:

# create cert-manager deployment
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v0.11.0/cert-manager.yaml
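
Instead of watching the pod list by eye, you can block until a namespace's deployments are available. The following is a minimal sketch built on kubectl wait; the function name wait_for_deployments is my own, and the cert-manager namespace in the example comment is an assumption about where the manifest above installs its workloads.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original guide): block until every
# Deployment in a namespace reports the Available condition.
# Usage: wait_for_deployments NAMESPACE [TIMEOUT]
wait_for_deployments() {
  local namespace="$1"
  local timeout="${2:-300s}"
  kubectl wait deployment --all \
    --namespace "$namespace" \
    --for=condition=Available \
    --timeout="$timeout"
}

# Example (assumes the manifest installs into the "cert-manager" namespace):
#   wait_for_deployments cert-manager
```

The same helper can be reused after the Prometheus Operator and Zookeeper Operator steps by passing the default or zookeeper namespace.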

Prometheus Operator:

# create the prometheus operator
kubectl apply -n default -f https://raw.githubusercontent.com/coreos/prometheus-operator/master/bundle.yaml

Zookeeper Operator:

# create zookeeper namespace
kubectl create namespace zookeeper

# create zk operator
helm upgrade --install \
  --namespace zookeeper \
  --wait \
  zookeeper-operator banzaicloud-stable/zookeeper-operator

Zookeeper:

# create zookeeper cluster
cat <<EOF | kubectl apply --namespace zookeeper -f -

apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
  name: zookeepercluster
  namespace: zookeeper
spec:
  replicas: 3
EOF
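
Since Zookeeper holds the Kafka state, it is worth confirming that all three replicas are up before provisioning Kafka. Here is a rough sketch that polls the pod list and counts ready pods by name prefix. Both function names are hypothetical, and the zookeepercluster- prefix in the example comment is an assumption about how the operator names its pods; check kubectl get pods -n zookeeper for the real names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the original guide).

# Print how many pods in NAMESPACE are Running with all containers ready.
# Relies on the default `kubectl get pods` columns: NAME READY STATUS ...
# An optional PREFIX restricts the count to pods whose name starts with it.
count_ready_pods() {
  local namespace="$1" prefix="${2:-}"
  kubectl get pods --namespace "$namespace" --no-headers 2>/dev/null |
    awk -v prefix="$prefix" '
      (prefix == "" || index($1, prefix) == 1) {
        split($2, ready, "/")
        if ($3 == "Running" && ready[1] == ready[2]) n++
      }
      END { print n + 0 }'
}

# Poll until at least WANT matching pods are ready; give up after TRIES polls.
# Usage: wait_for_ready_pods NAMESPACE PREFIX WANT [TRIES] [DELAY_SECONDS]
wait_for_ready_pods() {
  local namespace="$1" prefix="$2" want="$3"
  local tries="${4:-60}" delay="${5:-5}"
  local i=0
  while [ "$i" -lt "$tries" ]; do
    if [ "$(count_ready_pods "$namespace" "$prefix")" -ge "$want" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $want ready pods in $namespace" >&2
  return 1
}

# Example (the pod name prefix is an assumption):
#   wait_for_ready_pods zookeeper zookeepercluster- 3
```

Filtering by prefix keeps the operator's own pod, which lives in the same namespace, out of the count.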

Provision Banzai Kafka

Again, there are lots of configuration details this guide does not cover. Please see the Banzai documentation for more options.

Components:

  • Kafka Operator: maintains the lifecycle, data rebalancing and scaling of all the Kafka clusters provisioned in the Kubernetes cluster.
  • Kafka Instance: this guide provisions a simple Kafka instance, configured for internal cluster access and initialized with some basic scaling/rebalancing rules.

Kafka Operator:

# create the kafka namespace
kubectl create namespace kafka

# get the values file configured for prometheus
TMP_FILE=/tmp/kafka-prometheus-alerts.yaml

curl -sSf https://raw.githubusercontent.com/gabeduke/civo-kafka-example/v1.0.0/kafka-prometheus-alerts.yaml -o "$TMP_FILE"

# install kafka operator with prometheus alerts
helm upgrade --install \
  --namespace kafka \
  --values $TMP_FILE \
  kafka-operator banzaicloud-stable/kafka-operator

Kafka:

# create the kafka cluster
KAFKA_INSTANCE=https://raw.githubusercontent.com/gabeduke/civo-kafka-example/v1.0.0/kafka.yaml
curl -sSf "$KAFKA_INSTANCE" | kubectl apply -n kafka -f -

# create the service monitor
KAFKA_SERVICE_MONITOR=https://raw.githubusercontent.com/gabeduke/civo-kafka-example/v1.0.0/kafka-prometheus.yaml
curl -sSf "$KAFKA_SERVICE_MONITOR" | kubectl apply -n kafka -f -

Validate

First we will validate that the Cruise Control dashboard is online and healthy. Cruise Control is a tool from LinkedIn that provides exceptional operational control over Kafka clusters. Its API can be triggered via Prometheus alerts, making Banzai clusters highly resilient. Go ahead and explore the configuration options.

Cruise Control Dashboard:

# proxy to the cruise-control dashboard for kafka maintenance (may take a couple minutes)
kubectl port-forward -n kafka svc/kafka-cruisecontrol-svc 8090:8090 &
echo http://localhost:8090
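
Once the port-forward is up, the same health check can be done from the command line against the Cruise Control REST API. This is only a sketch: the function name is hypothetical, and it assumes the API is served under Cruise Control's default /kafkacruisecontrol/ prefix on the forwarded port.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original guide): fetch the Cruise
# Control state summary through the local port-forward.
# Usage: cruise_control_state [BASE_URL]
cruise_control_state() {
  local base_url="${1:-http://localhost:8090}"
  # --fail makes curl exit non-zero on HTTP errors, so this doubles as a probe.
  curl --silent --show-error --fail "${base_url}/kafkacruisecontrol/state"
}

# Example:
#   cruise_control_state && echo "cruise control is answering"
```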

We can also validate that we can produce to and consume from this cluster. First we need to provision a topic to use (note: the cluster must finish provisioning before the topic can be applied):

# create a topic to which we can produce/consume
cat <<EOF | kubectl apply -f -

apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaTopic
metadata:
  name: civo-topic
  namespace: kafka
spec:
  clusterRef:
    name: kafka
  name: civo-topic
  partitions: 3
  replicationFactor: 2
  config:
    "retention.ms": "604800000"
    "cleanup.policy": "delete"
EOF
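
If the apply is rejected because the cluster is still provisioning, one option is to retry it until it succeeds. A minimal retry helper might look like the sketch below; retry is a hypothetical name, and topic.yaml in the example comment stands for a file into which you have saved the KafkaTopic manifest above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original guide): run a command until it
# succeeds, waiting DELAY_SECONDS between tries, for at most ATTEMPTS tries.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example (topic.yaml is a hypothetical file holding the manifest above):
#   retry 30 10 kubectl apply -f topic.yaml
```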

Run the following two commands in separate terminals:

Produce:

# run a producer in a pod
kubectl run kafka-producer \
  -n kafka -it --rm=true --restart=Never \
  --image=wurstmeister/kafka:2.12-2.3.0 \
    -- /opt/kafka/bin/kafka-console-producer.sh \
    --broker-list kafka-headless:29092 \
    --topic civo-topic

Consume:

# run a consumer in a pod
kubectl run kafka-consumer \
  -n kafka -it --rm=true --restart=Never \
  --image=wurstmeister/kafka:2.12-2.3.0 \
    -- /opt/kafka/bin/kafka-console-consumer.sh \
    --bootstrap-server kafka-headless:29092 \
    --from-beginning \
    --topic civo-topic

Clean

Well, that was fun. Go check out the Banzai docs and tune a cluster to your needs. Time to clean up!

# kill any dangling proxies (note: this stops every running kubectl process)
killall kubectl
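
If you have other kubectl sessions open that you would rather keep, a gentler alternative is to remember the PID of the background port-forward and stop only that process. The helpers below are a sketch with hypothetical names and a hypothetical PID file path.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the original guide).

# Run a command in the background and record its PID in PID_FILE.
# Usage: start_tracked PID_FILE COMMAND [ARGS...]
start_tracked() {
  local pid_file="$1"
  shift
  "$@" &
  echo "$!" > "$pid_file"
}

# Stop the process recorded in PID_FILE (if any) and remove the file.
# Usage: stop_tracked PID_FILE
stop_tracked() {
  local pid_file="$1"
  [ -f "$pid_file" ] || return 0
  kill "$(cat "$pid_file")" 2>/dev/null || true
  rm -f "$pid_file"
}

# Example (the PID file path is arbitrary):
#   start_tracked /tmp/cruise-control.pid \
#     kubectl port-forward -n kafka svc/kafka-cruisecontrol-svc 8090:8090
#   stop_tracked /tmp/cruise-control.pid
```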

# clean up the kubernetes cluster
civo k8s delete kafka