Koperator
The Koperator (formerly called Banzai Cloud Kafka Operator) is a Kubernetes operator to automate provisioning, management, autoscaling and operations of Apache Kafka clusters deployed to K8s.
Overview
Apache Kafka is an open-source distributed streaming platform, and some of the main features of the Koperator are:
- the provisioning of secure and production-ready Kafka clusters
- KRaft mode support for ZooKeeper-free Kafka deployments
- fine grained broker configuration support
- advanced and highly configurable External Access via LoadBalancers using Envoy
- graceful Kafka cluster scaling and rebalancing
- monitoring via Prometheus
- encrypted communication using SSL
- automatic reaction and self healing based on alerts (plugin system, with meaningful default alert plugins) using Cruise Control
- graceful rolling upgrade
- advanced topic and user management via CRD

The Koperator helps you create production-ready Apache Kafka clusters on Kubernetes, with scaling, rebalancing, and alert-based self-healing.
Motivation
Apache Kafka predates Kubernetes and was designed mostly for static
on-premise environments. State management, node identity, failover, and similar concerns all come part and parcel with Kafka, so making it work properly on Kubernetes, on top of an underlying dynamic environment, can be a challenge.
There are already several approaches to operating Apache Kafka on Kubernetes, however, we did not find them appropriate for use in a highly dynamic environment, nor capable of meeting our customers’ needs. At the same time, there is substantial interest within the Kafka community for a solution which enables Kafka on Kubernetes, both in the open source and closed source space.
We took a different approach to what’s out there - we believe for a good reason - please read on to understand more about our design motivations and some of the scenarios which were driving us to create the Koperator.
Finally, our motivation is to build an open source solution and a community which drives the innovation and features of this operator. We are long term contributors and active community members of both Apache Kafka and Kubernetes, and we hope to recreate a similar community around this operator.
Koperator features
Design motivations
Kafka is a stateful application. The first piece of the puzzle is the Broker, which is a simple server capable of creating/forming a cluster with other Brokers. Every Broker has its own unique configuration, which differs slightly from all others - the most relevant piece of which is the unique broker ID.
All Kafka on Kubernetes operators use StatefulSet to create a Kafka Cluster. Just to quickly recap from the K8s docs:
StatefulSet manages the deployment and scaling of a set of Pods, and provides guarantees about their ordering and uniqueness. Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of its Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that is maintained across any rescheduling.
How does this look from the perspective of Apache Kafka?
With StatefulSet we get:
- unique Broker IDs generated during Pod startup
- networking between brokers with headless services
- unique Persistent Volumes for Brokers
Using StatefulSet we lose:
- the ability to modify the configuration of individual Brokers
- the ability to remove a specific Broker from the cluster (a StatefulSet always removes the most recently created Broker)
- the ability to use multiple, different Persistent Volumes for each Broker
Koperator uses simple Pods, ConfigMaps, and PersistentVolumeClaims instead of StatefulSet. Using these resources allows us to build an Operator which is better suited to manage Apache Kafka.
With the Koperator you can:
- modify the configuration of unique Brokers
- remove specific Brokers from clusters
- use multiple Persistent Volumes for each Broker
Features
Fine Grained Broker Configuration Support
We needed to be able to react to events in a fine-grained way for each Broker - and not in the limited way StatefulSet does (which, for example, removes the most recently created Brokers). Some of the available solutions try to overcome these deficits by placing scripts inside the container to generate configurations at runtime, whereas the Koperator’s configurations are deterministically placed in specific ConfigMaps.
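As an illustrative sketch (not a complete manifest), per-broker settings in the KafkaCluster custom resource can override the shared broker config group; the field names below follow the kafka.banzaicloud.io/v1beta1 schema, and the values are only examples:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
  name: kafka
spec:
  brokerConfigGroups:
    default:
      storageConfigs:
        - mountPath: "/kafka-logs"
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
  brokers:
    - id: 0
      brokerConfigGroup: "default"
    - id: 1
      brokerConfigGroup: "default"
      # Per-broker override: attach an additional volume to broker 1 only.
      brokerConfig:
        storageConfigs:
          - mountPath: "/kafka-logs2"
            pvcSpec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 10Gi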
Graceful Kafka Cluster Scaling with the help of our CruiseControlOperation custom resource
We know how to operate Apache Kafka at scale (we are contributors and have been operating Kafka on Kubernetes for years now). We believe, however, that LinkedIn has even more experience than we do. To scale Kafka clusters both up and down gracefully, we integrated LinkedIn’s Cruise-Control to do the hard work for us. We already have good defaults (i.e. plugins) that react to events, but we also allow our users to write their own.
External Access via LoadBalancer
The Koperator externalizes access to Apache Kafka using a dynamically (re)configured Envoy proxy. Using Envoy allows us to use a single LoadBalancer, so there’s no need for a LoadBalancer for each Broker.

Communication via SSL
The operator fully automates Kafka’s SSL support.
The operator can provision the required secrets and certificates for you, or you can provide your own.

Monitoring via Prometheus
The Koperator exposes Cruise-Control and Kafka JMX metrics to Prometheus.
Reacting on Alerts
Koperator acts as a Prometheus Alert Manager. It receives alerts defined in Prometheus, and creates actions based on Prometheus alert annotations.
Currently, there are three default actions (which can be extended):
- upscale cluster (add a new Broker)
- downscale cluster (remove a Broker)
- add additional disk to a Broker
Graceful Rolling Upgrade
The operator supports graceful rolling upgrades: it checks that the cluster is healthy by verifying that there are no offline partitions and that all replicas are in sync, and it proceeds only while the number of failures stays below the configured failure threshold.
The operator also allows you to define Prometheus alerts that affect the rolling upgrade state by increasing the error count.
Dynamic Configuration Support
Kafka operates with three types of configuration:
- Read-only
- ClusterWide
- PerBroker
Read-only configuration requires a broker restart to update; all the others may be updated dynamically. The operator’s CRD distinguishes these fields and performs the appropriate action: either a rolling upgrade or a dynamic reconfiguration.
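For example, a minimal (and deliberately incomplete) KafkaCluster spec fragment can keep the two kinds of properties separate; the readOnlyConfig and clusterWideConfig fields are part of the v1beta1 schema, while the property values below are just examples:
spec:
  # Read-only properties: changing these triggers a graceful rolling upgrade.
  readOnlyConfig: |
    auto.create.topics.enable=false
  # Cluster-wide properties: applied to all brokers via dynamic reconfiguration.
  clusterWideConfig: |
    background.threads=10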
KRaft Mode Support (ZooKeeper-free Kafka)
Koperator supports Apache Kafka’s KRaft mode, which eliminates the dependency on ZooKeeper:
- Simplified Architecture: Deploy Kafka clusters without requiring a separate ZooKeeper cluster
- Better Scalability: Improved metadata handling for large-scale deployments
- Future-Ready: KRaft is the future of Kafka and the recommended approach for new deployments
- Flexible Node Roles: Support for dedicated controller nodes, broker nodes, or combined roles
- Production Ready: Full support for SSL, monitoring, and all Koperator features in KRaft mode with Kafka 3.9.1
For detailed KRaft configuration and deployment guides, see the KRaft Mode Documentation.
Seamless Istio mesh support
- Standard Istio Integration: Koperator now uses standard Istio resources (Gateway, VirtualService) instead of deprecated banzaicloud istio-operator
- Flexible Deployment: Works with any Istio installation method (operator, Helm, or manual)
- Service Mesh Ready: The operator allows you to use ClusterIP services instead of headless services, which works better with service meshes
- Sidecar Compatibility: To avoid Kafka initializing before its sidecar is ready, the operator uses a small script to mitigate sidecar container readiness issues
- Istio Ingress Gateways: The operator supports creating Istio ingress gateways for external access to Kafka clusters running inside the mesh
- No Control Plane Dependency: Works with any Istio installation without requiring specific control plane configuration
For detailed Istio integration configuration, troubleshooting, and migration guides, see the Istio Integration Guide.
Getting Started
Quick Start
For users who want to get started quickly with Koperator, follow the standard installation and deployment guides:
- Install the Operator - Install Koperator and its dependencies
- Deploy a Kafka Cluster - Create your first Kafka cluster
- Test Your Deployment - Validate your cluster with producers and consumers
End-to-End Tutorial - The Hard Way
For users who want to understand every component and learn Kafka on Kubernetes from the ground up, we provide a comprehensive tutorial inspired by Kelsey Hightower’s “kubernetes-the-hard-way”:
Kafka on Kubernetes - The Hard Way
This tutorial walks you through:
- Setting up a multi-node Kubernetes cluster with kind
- Installing all dependencies manually (cert-manager, ZooKeeper, Prometheus)
- Deploying a production-ready Kafka cluster with monitoring
- Testing performance, disaster recovery, and troubleshooting
Perfect for:
- Learning how all components work together
- Understanding Kafka deployment architecture
- Preparing for production deployments
- Troubleshooting and debugging skills
Time commitment: 2-3 hours
Apache Kafka, Kafka, and the Kafka logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
1 - Install the operator
The operator installs version 3.9.1 of Apache Kafka, and can run on:
- Minikube v0.33.1+,
- Kubernetes 1.21+, and
- Red Hat OpenShift 4.10-4.11.
The operator supports Kafka 2.6.2-3.9.x.
Prerequisites
- A Kubernetes cluster (minimum 6 vCPU and 10 GB RAM). Red Hat OpenShift is also supported in Koperator version 0.24 and newer, but note that it needs some permissions for certain components to function.
We believe in the separation of concerns
principle, thus the Koperator does not install or manage Apache ZooKeeper or cert-manager. Note that ZooKeeper is only required for traditional Kafka deployments - KRaft mode deployments do not require ZooKeeper.
Install Koperator and its requirements independently
Install cert-manager with Helm
Koperator uses cert-manager for issuing certificates to clients and brokers and cert-manager is required for TLS-encrypted client connections. It is recommended to deploy and configure a cert-manager instance if there is none in your environment yet.
Note:
- Koperator 0.24.0 and newer versions support cert-manager 1.10.0+ (which is a requirement for Red Hat OpenShift)
- Koperator 0.18.1 and newer supports cert-manager 1.5.3-1.9.x
- Koperator 0.8.x-0.17.0 supports cert-manager 1.3.x
-
Install cert-manager’s CustomResourceDefinitions.
kubectl apply \
--validate=false \
-f https://github.com/jetstack/cert-manager/releases/download/v1.11.0/cert-manager.crds.yaml
Expected output:
customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created
-
If you are installing cert-manager on a Red Hat OpenShift version 4.10 cluster, the default security computing profile must be enabled for cert-manager to work.
-
Create a new SecurityContextConstraint object named restricted-seccomp, which will be a copy of the OpenShift built-in restricted SecurityContextConstraint, but will also allow the runtime/default / RuntimeDefault security computing profile, according to the OpenShift documentation.
oc create -f - <<EOF
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: true
allowPrivilegedContainer: false
allowedCapabilities: null
apiVersion: security.openshift.io/v1
defaultAddCapabilities: null
fsGroup:
type: MustRunAs
groups:
- system:authenticated
kind: SecurityContextConstraints
metadata:
annotations:
include.release.openshift.io/ibm-cloud-managed: "true"
include.release.openshift.io/self-managed-high-availability: "true"
include.release.openshift.io/single-node-developer: "true"
kubernetes.io/description: restricted denies access to all host features and requires pods to be run with a UID, and SELinux context that are allocated to the namespace. This is the most restrictive SCC and it is used by default for authenticated users.
release.openshift.io/create-only: "true"
name: restricted-seccomp # ~
priority: null
readOnlyRootFilesystem: false
requiredDropCapabilities:
- KILL
- MKNOD
- SETUID
- SETGID
runAsUser:
type: MustRunAsRange
seLinuxContext:
type: MustRunAs
seccompProfiles: # +
- runtime/default # +
supplementalGroups:
type: RunAsAny
users: []
volumes:
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- projected
- secret
EOF
Expected output:
securitycontextconstraints.security.openshift.io/restricted-seccomp created
-
Elevate the permissions of the namespace containing the cert-manager service account.
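The exact command is not shown here; elevating the namespace typically looks like the following sketch, assuming the restricted-seccomp SCC created above and your cert-manager namespace substituted for the placeholder:
oc adm policy add-scc-to-group restricted-seccomp system:serviceaccounts:{NAMESPACE_FOR_CERT_MANAGER_SERVICE_ACCOUNT}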
Expected output:
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:restricted-seccomp added: "system:serviceaccounts:{NAMESPACE_FOR_CERT_MANAGER_SERVICE_ACCOUNT}"
-
Install cert-manager.
helm install \
cert-manager \
--repo https://charts.jetstack.io cert-manager \
--version v1.11.0 \
--namespace cert-manager \
--create-namespace \
--atomic \
--debug
Expected output:
install.go:194: [debug] Original chart version: "v1.11.0"
install.go:211: [debug] CHART PATH: /Users/pregnor/.cache/helm/repository/cert-manager-v1.11.0.tgz
# ...
NAME: cert-manager
LAST DEPLOYED: Thu Mar 23 08:40:07 2023
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
USER-SUPPLIED VALUES:
{}
COMPUTED VALUES:
# ...
NOTES:
cert-manager v1.11.0 has been deployed successfully!
In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).
More information on the different types of issuers and how to configure them
can be found in our documentation:
https://cert-manager.io/docs/configuration/
For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the `ingress-shim`
documentation:
https://cert-manager.io/docs/usage/ingress/
-
Verify that cert-manager has been deployed and is in running state.
kubectl get pods -n cert-manager
Expected output:
NAME READY STATUS RESTARTS AGE
cert-manager-6b4d84674-4pkh4 1/1 Running 0 117s
cert-manager-cainjector-59f8d9f696-wpqph 1/1 Running 0 117s
cert-manager-webhook-56889bfc96-x8szj 1/1 Running 0 117s
Install zookeeper-operator with Helm (not needed in KRaft mode)
Koperator can use either Apache Zookeeper or KRaft mode for Kafka cluster coordination:
- For traditional deployments: Deploy zookeeper-operator if your environment doesn’t have an instance of it yet, and create a Zookeeper cluster if there is none in your environment yet for your Kafka cluster.
- For KRaft mode deployments: ZooKeeper is not required. Skip this section and see KRaft Mode (ZooKeeper-free Kafka) for KRaft configuration.
If you’re deploying a traditional (non-KRaft) Kafka cluster, complete the steps in this section.
Note: We recommend creating a separate ZooKeeper deployment for each Kafka cluster. If you want to share the same ZooKeeper cluster across multiple Kafka cluster instances, use a unique zk path in the KafkaCluster CR to avoid conflicts (even with previous defunct KafkaCluster instances).
-
If you are installing zookeeper-operator on a Red Hat OpenShift cluster, elevate the permissions of the namespace containing the Zookeeper service account.
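The exact command is not shown here; granting the anyuid SCC to the namespace typically looks like this sketch, with your ZooKeeper namespace substituted for the placeholder:
oc adm policy add-scc-to-group anyuid system:serviceaccounts:{NAMESPACE_FOR_ZOOKEEPER_SERVICE_ACCOUNT}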
Expected output:
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:anyuid added: "system:serviceaccounts:{NAMESPACE_FOR_ZOOKEEPER_SERVICE_ACCOUNT}"
-
Install ZooKeeper using Pravega’s Zookeeper Operator.
helm install \
zookeeper-operator \
--repo https://charts.pravega.io zookeeper-operator \
--version 0.2.14 \
--namespace=zookeeper \
--create-namespace \
--atomic \
--debug
Expected output:
install.go:194: [debug] Original chart version: "0.2.14"
install.go:211: [debug] CHART PATH: /Users/pregnor/.cache/helm/repository/zookeeper-operator-0.2.14.tgz
# ...
NAME: zookeeper-operator
LAST DEPLOYED: Thu Mar 23 08:42:42 2023
NAMESPACE: zookeeper
STATUS: deployed
REVISION: 1
TEST SUITE: None
USER-SUPPLIED VALUES:
{}
COMPUTED VALUES:
# ...
-
Verify that zookeeper-operator has been deployed and is in running state.
kubectl get pods --namespace zookeeper
Expected output:
NAME READY STATUS RESTARTS AGE
zookeeper-operator-5857967dcc-gm5l5 1/1 Running 0 3m22s
Deploy a Zookeeper cluster for Kafka
-
Create a Zookeeper cluster.
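A minimal ZookeeperCluster manifest (a sketch assuming Pravega’s zookeeper.pravega.io/v1beta1 API and a single replica; adjust the name, namespace, and replica count to your environment) looks like this:
kubectl create -f - <<EOF
apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
  name: zookeeper-server
  namespace: zookeeper
spec:
  replicas: 1
EOF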
-
Verify that Zookeeper has been deployed and is in running state with the configured number of replicas.
kubectl get pods -n zookeeper
Expected output:
NAME READY STATUS RESTARTS AGE
zookeeper-server-0 1/1 Running 0 27m
zookeeper-operator-54444dbd9d-2tccj 1/1 Running 0 28m
Install prometheus-operator with Helm
Koperator uses Prometheus for exporting metrics of the Kafka cluster. It is recommended to deploy a Prometheus instance if you don’t already have one.
-
If you are installing prometheus-operator on a Red Hat OpenShift version 4.10 cluster, create a SecurityContextConstraints
object nonroot-v2
with the following configuration for Prometheus admission and operator service accounts to work.
oc create -f - <<EOF
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: false
allowPrivilegedContainer: false
allowedCapabilities:
- NET_BIND_SERVICE
apiVersion: security.openshift.io/v1
defaultAddCapabilities: null
fsGroup:
type: RunAsAny
groups: []
kind: SecurityContextConstraints
metadata:
annotations:
include.release.openshift.io/ibm-cloud-managed: "true"
include.release.openshift.io/self-managed-high-availability: "true"
include.release.openshift.io/single-node-developer: "true"
kubernetes.io/description: nonroot provides all features of the restricted SCC but allows users to run with any non-root UID. The user must specify the UID or it must be specified by the manifest of the container runtime. On top of the legacy 'nonroot' SCC, it also requires to drop ALL capabilities and does not allow privilege escalation binaries. It will also default the seccomp profile to runtime/default if unset, otherwise this seccomp profile is required.
name: nonroot-v2
priority: null
readOnlyRootFilesystem: false
requiredDropCapabilities:
- ALL
runAsUser:
type: MustRunAsNonRoot
seLinuxContext:
type: MustRunAs
seccompProfiles:
- runtime/default
supplementalGroups:
type: RunAsAny
users: []
volumes:
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- projected
- secret
EOF
Expected output:
securitycontextconstraints.security.openshift.io/nonroot-v2 created
-
If you are installing prometheus-operator on a Red Hat OpenShift cluster, elevate the permissions of the Prometheus service accounts.
Note: OpenShift doesn’t let you install Prometheus in the default
namespace due to security considerations.
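The commands themselves are not shown here; the elevation typically looks like the following sketch, using the nonroot-v2 SCC created above and placeholders for your Prometheus service account names (deployed to the prometheus namespace):
oc adm policy add-scc-to-user nonroot-v2 -n prometheus -z {PROMETHEUS_ADMISSION_SERVICE_ACCOUNT_NAME}
oc adm policy add-scc-to-user nonroot-v2 -n prometheus -z {PROMETHEUS_OPERATOR_SERVICE_ACCOUNT_NAME}
oc adm policy add-scc-to-user hostnetwork -n prometheus -z {PROMETHEUS_NODE_EXPORTER_SERVICE_ACCOUNT_NAME}
oc adm policy add-scc-to-user node-exporter -n prometheus -z {PROMETHEUS_NODE_EXPORTER_SERVICE_ACCOUNT_NAME}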
Expected output:
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:nonroot-v2 added: "{PROMETHEUS_ADMISSION_SERVICE_ACCOUNT_NAME}"
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:nonroot-v2 added: "{PROMETHEUS_OPERATOR_SERVICE_ACCOUNT_NAME}"
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:hostnetwork added: "{PROMETHEUS_NODE_EXPORTER_SERVICE_ACCOUNT_NAME}"
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:node-exporter added: "{PROMETHEUS_NODE_EXPORTER_SERVICE_ACCOUNT_NAME}"
-
Install the Prometheus operator and its CustomResourceDefinitions into the prometheus
namespace.
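The install command is not shown here; based on the expected output below, it is likely a kube-prometheus-stack Helm install along these lines (the chart version and options are assumptions, adjust as needed):
helm install \
prometheus \
--repo https://prometheus-community.github.io/helm-charts kube-prometheus-stack \
--version 45.7.1 \
--namespace prometheus \
--create-namespace \
--atomic \
--debug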
Expected output:
install.go:194: [debug] Original chart version: "45.7.1"
install.go:211: [debug] CHART PATH: /Users/pregnor/.cache/helm/repository/kube-prometheus-stack-45.7.1.tgz
# ...
NAME: prometheus
LAST DEPLOYED: Thu Mar 23 09:28:29 2023
NAMESPACE: prometheus
STATUS: deployed
REVISION: 1
TEST SUITE: None
USER-SUPPLIED VALUES:
# ...
COMPUTED VALUES:
# ...
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace prometheus get pods -l "release=prometheus"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
-
Verify that prometheus-operator has been deployed and is in running state.
kubectl get pods -n prometheus
Expected output:
NAME READY STATUS RESTARTS AGE
prometheus-kube-prometheus-operator-646d5fd7d5-s72jn 1/1 Running 0 15m
Install Koperator with Helm
Koperator can be deployed using its Helm chart.
-
Install the Koperator CustomResourceDefinition resources (adjust the version number to the Koperator release you want to install). This is performed in a separate step to allow you to uninstall and reinstall Koperator without deleting your installed custom resources.
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_cruisecontroloperations.yaml
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkaclusters.yaml
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkatopics.yaml
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkausers.yaml
Expected output:
customresourcedefinition.apiextensions.k8s.io/cruisecontroloperations.kafka.banzaicloud.io created
customresourcedefinition.apiextensions.k8s.io/kafkaclusters.kafka.banzaicloud.io created
customresourcedefinition.apiextensions.k8s.io/kafkatopics.kafka.banzaicloud.io created
customresourcedefinition.apiextensions.k8s.io/kafkausers.kafka.banzaicloud.io created
-
If you are installing Koperator on a Red Hat OpenShift cluster:
-
Elevate the permissions of the Koperator namespace.
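The exact command is not shown here; it typically looks like the following sketch, with your Koperator namespace substituted for the placeholder:
oc adm policy add-scc-to-group anyuid system:serviceaccounts:{NAMESPACE_FOR_KOPERATOR}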
Expected output:
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:anyuid added: "system:serviceaccounts:{NAMESPACE_FOR_KOPERATOR}"
-
If the Kafka cluster is going to run in a different namespace than Koperator, elevate the permissions of the Kafka cluster broker service account (ServiceAccountName
provided in the KafkaCluster custom resource).
oc adm policy add-scc-to-user anyuid system:serviceaccount:{NAMESPACE_FOR_KAFKA_CLUSTER_BROKER_SERVICE_ACCOUNT}:{KAFKA_CLUSTER_BROKER_SERVICE_ACCOUNT_NAME}
Expected output:
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:anyuid added: "system:serviceaccount:{NAMESPACE_FOR_KAFKA_CLUSTER_BROKER_SERVICE_ACCOUNT}:{KAFKA_CLUSTER_BROKER_SERVICE_ACCOUNT_NAME}"
-
Install Koperator into the kafka namespace using the OCI Helm chart from GitHub Container Registry:
📦 View available versions: ghcr.io/adobe/koperator/kafka-operator
# Install the latest release
helm install kafka-operator oci://ghcr.io/adobe/helm-charts/kafka-operator --namespace=kafka --create-namespace
# Or install a specific version (replace with desired version)
helm install kafka-operator oci://ghcr.io/adobe/helm-charts/kafka-operator --version 0.28.0-adobe-20250923 --namespace=kafka --create-namespace
Pull and inspect the chart before installation
# Pull the chart locally
helm pull oci://ghcr.io/adobe/helm-charts/kafka-operator --version 0.28.0-adobe-20250923
# Extract and inspect
tar -xzf kafka-operator-0.28.0-adobe-20250923.tgz
helm template kafka-operator ./kafka-operator/
# Install from local chart
helm install kafka-operator ./kafka-operator/ --namespace=kafka --create-namespace
Expected output:
install.go:194: [debug] Original chart version: ""
install.go:211: [debug] CHART PATH: /Users/pregnor/development/src/github.com/adobe/koperator/kafka-operator-0.28.0-adobe-20250923.tgz
# ...
NAME: kafka-operator
LAST DEPLOYED: Thu Mar 23 10:05:11 2023
NAMESPACE: kafka
STATUS: deployed
REVISION: 1
TEST SUITE: None
USER-SUPPLIED VALUES:
# ...
-
Verify that Koperator has been deployed and is in running state.
kubectl get pods -n kafka
Expected output:
NAME READY STATUS RESTARTS AGE
kafka-operator-operator-8458b45587-286f9 2/2 Running 0 62s
Deploy a Kafka cluster
-
Create the Kafka cluster using the KafkaCluster custom resource. You can find various examples for the custom resource in Configure Kafka cluster and in the Koperator repository.
CAUTION:
After the cluster is created, you cannot change the way the listeners are configured without an outage. If a cluster is created with an unencrypted (plain text) listener and you want to switch to SSL-encrypted listeners (or the other way around), you must manually delete each broker pod. The operator will restart the pods with the new listener configuration.
-
To create a sample Kafka cluster that allows unencrypted client connections, run the following command:
kubectl create \
-n kafka \
-f https://raw.githubusercontent.com/adobe/koperator/v0.28.0-adobe-20250923/config/samples/simplekafkacluster.yaml
-
To create a sample Kafka cluster that allows TLS-encrypted client connections, run the following command. For details on the configuration parameters related to SSL, see Enable SSL encryption in Apache Kafka.
kubectl create \
-n kafka \
-f https://raw.githubusercontent.com/adobe/koperator/v0.28.0-adobe-20250923/config/samples/simplekafkacluster_ssl.yaml
-
To create a sample Kafka cluster using KRaft mode (ZooKeeper-free), run the following command. For details on KRaft configuration, see KRaft Mode (ZooKeeper-free Kafka).
kubectl create \
-n kafka \
-f https://raw.githubusercontent.com/adobe/koperator/master/config/samples/kraft/simplekafkacluster_kraft.yaml
Expected output:
kafkacluster.kafka.banzaicloud.io/kafka created
-
Wait and verify that the Kafka cluster resources have been deployed and are in running state.
kubectl -n kafka get kafkaclusters.kafka.banzaicloud.io kafka --watch
Expected output:
NAME CLUSTER STATE CLUSTER ALERT COUNT LAST SUCCESSFUL UPGRADE UPGRADE ERROR COUNT AGE
kafka ClusterReconciling 0 0 5s
kafka ClusterReconciling 0 0 7s
kafka ClusterReconciling 0 0 8s
kafka ClusterReconciling 0 0 9s
kafka ClusterReconciling 0 0 2m17s
kafka ClusterReconciling 0 0 3m11s
kafka ClusterReconciling 0 0 3m27s
kafka ClusterReconciling 0 0 3m29s
kafka ClusterReconciling 0 0 3m31s
kafka ClusterReconciling 0 0 3m32s
kafka ClusterReconciling 0 0 3m32s
kafka ClusterRunning 0 0 3m32s
kafka ClusterReconciling 0 0 3m32s
kafka ClusterRunning 0 0 3m34s
kafka ClusterReconciling 0 0 4m23s
kafka ClusterRunning 0 0 4m25s
kafka ClusterReconciling 0 0 4m25s
kafka ClusterRunning 0 0 4m27s
kafka ClusterRunning 0 0 4m37s
kafka ClusterReconciling 0 0 4m37s
kafka ClusterRunning 0 0 4m39s
kubectl get pods -n kafka
Expected output:
kafka-0-9brj4 1/1 Running 0 94s
kafka-1-c2spf 1/1 Running 0 93s
kafka-2-p6sg2 1/1 Running 0 92s
kafka-cruisecontrol-776f49fdbb-rjhp8 1/1 Running 0 51s
kafka-operator-operator-7d47f65d86-2mx6b 2/2 Running 0 13m
-
If prometheus-operator is deployed, create a Prometheus instance and corresponding ServiceMonitors for Koperator.
kubectl create \
-n kafka \
-f https://raw.githubusercontent.com/adobe/koperator/v0.28.0-adobe-20250923/config/samples/kafkacluster-prometheus.yaml
Expected output:
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
prometheus.monitoring.coreos.com/kafka-prometheus created
prometheusrule.monitoring.coreos.com/kafka-alerts created
serviceaccount/prometheus created
servicemonitor.monitoring.coreos.com/cruisecontrol-servicemonitor created
servicemonitor.monitoring.coreos.com/kafka-servicemonitor created
-
Wait and verify that the Kafka cluster Prometheus instance has been deployed and is in running state.
kubectl get pods -n kafka
Expected output:
NAME READY STATUS RESTARTS AGE
kafka-0-nvx8c 1/1 Running 0 16m
kafka-1-swps9 1/1 Running 0 15m
kafka-2-lppzr 1/1 Running 0 15m
kafka-cruisecontrol-fb659b84b-7cwpn 1/1 Running 0 15m
kafka-operator-operator-8bb75c7fb-7w4lh 2/2 Running 0 17m
prometheus-kafka-prometheus-0 2/2 Running 0 16m
Test your deployment
2 - Upgrade the operator
When upgrading your Koperator deployment to a new version, complete the following steps.
-
Update the CRDs for the new release from the main repository.
CAUTION:
Hazard of data loss! Do not delete the old CRD from the cluster. Deleting the CRD removes your Kafka cluster.
-
Replace the KafkaCluster CRDs with the new ones on your cluster by running the following commands:
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_cruisecontroloperations.yaml
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkaclusters.yaml
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkatopics.yaml
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkausers.yaml
-
Update your Koperator deployment by running:
helm upgrade kafka-operator \
oci://ghcr.io/adobe/helm-charts/kafka-operator \
--namespace=kafka
3 - Test provisioned Kafka Cluster
Create Topic
Topic creation is enabled by default in Apache Kafka, but if it is configured otherwise, you’ll need to create a topic first.
-
You can use the KafkaTopic CR to create a topic called my-topic like this:
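A minimal KafkaTopic manifest (a sketch following the kafka.banzaicloud.io/v1alpha1 samples; the cluster name kafka and the partition and replication values are assumptions) looks like this:
kubectl create -n kafka -f - <<EOF
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaTopic
metadata:
  name: my-topic
spec:
  clusterRef:
    name: kafka
  name: my-topic
  partitions: 1
  replicationFactor: 1
EOF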
Note: The previous command will fail if the cluster has not finished provisioning.
Expected output:
kafkatopic.kafka.banzaicloud.io/my-topic created
-
To create a sample topic from the CLI you can run the following:
For internal listeners exposed by a headless service (KafkaCluster.spec.headlessServiceEnabled set to true):
kubectl -n kafka run kafka-topics -it --image=ghcr.io/adobe/kafka:2.13-3.9.1 --rm=true --restart=Never -- /opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka-headless.kafka:29092 --topic my-topic --create --partitions 1 --replication-factor 1
For internal listeners exposed by a regular service (KafkaCluster.spec.headlessServiceEnabled set to false):
kubectl -n kafka run kafka-topics -it --image=ghcr.io/adobe/kafka:2.13-3.9.1 --rm=true --restart=Never -- /opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka-all-broker.kafka:29092 --topic my-topic --create --partitions 1 --replication-factor 1
After you have created a topic, produce and consume some messages:
Send and receive messages without SSL within a cluster
You can use the following commands to send and receive messages within a Kubernetes cluster when SSL encryption is disabled for Kafka.
-
Produce messages:
-
Start the producer container
kubectl run \
-n kafka \
kafka-producer \
-it \
--image=ghcr.io/adobe/kafka:2.13-3.9.1 \
--rm=true \
--restart=Never \
-- \
/opt/kafka/bin/kafka-console-producer.sh \
--bootstrap-server kafka-headless:29092 \
--topic my-topic
-
Wait for the producer container to start; this may take a couple of seconds.
Expected output:
If you don't see a command prompt, try pressing enter.
-
Press enter to get a command prompt.
Expected output:
-
Type your messages and press enter, each line will be sent through Kafka.
Example:
-
Stop the container. (You can CTRL-D out of it.)
Expected output:
pod "kafka-producer" deleted
-
Consume messages:
-
Start the consumer container.
kubectl run \
-n kafka \
kafka-consumer \
-it \
--image=ghcr.io/adobe/kafka:2.13-3.9.1 \
--rm=true \
--restart=Never \
-- \
/opt/kafka/bin/kafka-console-consumer.sh \
--bootstrap-server kafka-headless:29092 \
--topic my-topic \
--from-beginning
-
Wait for the consumer container to start; this may take a couple of seconds.
Expected output:
If you don't see a command prompt, try pressing enter.
-
The messages sent by the producer should be displayed here.
Example:
-
Stop the container. (You can CTRL-C out of it.)
Expected output:
Processed a total of 3 messages
pod "kafka-consumer" deleted
pod kafka/kafka-consumer terminated (Error)
Send and receive messages with SSL within a cluster
You can use the following procedure to send and receive messages within a Kubernetes cluster when SSL encryption is enabled for Kafka. To test a Kafka instance secured by SSL we recommend using kcat.
To use the Java client instead of kcat, generate the proper truststore and keystore using the official docs.
-
Create a Kafka user. The client will use this user account to access Kafka. You can use the KafkaUser custom resource to customize the access rights as needed. For example:
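A minimal KafkaUser manifest (a sketch based on the kafka.banzaicloud.io/v1alpha1 samples; the user and secret names are examples) looks like this:
kubectl create -n kafka -f - <<EOF
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaUser
metadata:
  name: example-kafkauser
  namespace: kafka
spec:
  clusterRef:
    name: kafka
  secretName: example-kafkauser-secret
EOF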
-
To use Kafka inside the cluster, create a Pod which contains kcat. Create a kafka-test pod in the kafka namespace. Note that the value of the secretName parameter must be the same as you used when creating the KafkaUser resource, for example, example-kafkauser-secret.
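A sketch of such a pod (the kcat image tag is an assumption; the user secret is mounted at /ssl/certs so the commands below can reference the certificates) looks like this:
kubectl create -n kafka -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: kafka-test
spec:
  containers:
  - name: kcat
    image: edenhill/kcat:1.7.1
    # Keep the container running so you can exec into it.
    command: ["sh", "-c", "exec tail -f /dev/null"]
    volumeMounts:
    - name: sslcerts
      mountPath: /ssl/certs
  volumes:
  - name: sslcerts
    secret:
      secretName: example-kafkauser-secret
EOF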
-
Wait until the pod is created, then exec into the container:
kubectl exec -it -n kafka kafka-test -- sh
-
Run the following command to check that you can connect to the brokers.
kcat -L -b kafka-headless:29092 -X security.protocol=SSL -X ssl.key.location=/ssl/certs/tls.key -X ssl.certificate.location=/ssl/certs/tls.crt -X ssl.ca.location=/ssl/certs/ca.crt
The first line of the output should indicate that the communication is encrypted, for example:
Metadata for all topics (from broker -1: ssl://kafka-headless:29092/bootstrap):
-
Produce some test messages. Run:
kcat -P -b kafka-headless:29092 -t my-topic \
-X security.protocol=SSL \
-X ssl.key.location=/ssl/certs/tls.key \
-X ssl.certificate.location=/ssl/certs/tls.crt \
-X ssl.ca.location=/ssl/certs/ca.crt
And type some test messages.
-
Consume some messages.
The following command will use the certificate provisioned with the cluster to connect to Kafka. If you’d like to create and use a different user, create a KafkaUser CR; for details, see the SSL documentation.
kcat -C -b kafka-headless:29092 -t my-topic \
-X security.protocol=SSL \
-X ssl.key.location=/ssl/certs/tls.key \
-X ssl.certificate.location=/ssl/certs/tls.crt \
-X ssl.ca.location=/ssl/certs/ca.crt
You should see the messages you have created.
Send and receive messages outside a cluster
Prerequisites
-
Producers and consumers that are not in the same Kubernetes cluster can access the Kafka cluster only if an external listener is configured in your KafkaCluster CR. Check that the listenersConfig.externalListeners section exists in the KafkaCluster CR.
-
Obtain the external address and port number of the cluster by running the following commands.
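The exact commands depend on how the external listener is exposed; a sketch for a LoadBalancer-type external listener (the service name is a placeholder, list the services in the kafka namespace to find yours) could look like this:
kubectl get services -n kafka
export SERVICE_IP=$(kubectl get service -n kafka <external-listener-service> -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
export SERVICE_PORT=$(kubectl get service -n kafka <external-listener-service> -o jsonpath='{.spec.ports[0].port}')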
-
If the external listener of your Kafka cluster accepts encrypted connections, proceed to SSL enabled. Otherwise, proceed to SSL disabled.
SSL disabled
-
Produce some test messages on the external client.
-
If you have kcat installed, run:
kcat -P -b $SERVICE_IP:$SERVICE_PORT -t my-topic
-
If you have the Java Kafka client installed, run:
kafka-console-producer.sh --bootstrap-server $SERVICE_IP:$SERVICE_PORT --topic my-topic
And type some test messages.
-
Consume some messages.
-
If you have kcat installed, run:
kcat -C -b $SERVICE_IP:$SERVICE_PORT -t my-topic
-
If you have the Java Kafka client installed, run:
kafka-console-consumer.sh --bootstrap-server $SERVICE_IP:$SERVICE_PORT --topic my-topic --from-beginning
You should see the messages you have created.
SSL enabled
You can use the following procedure to send and receive messages from an external host that is outside a Kubernetes cluster when SSL encryption is enabled for Kafka. To test a Kafka instance secured by SSL we recommend using kcat.
To use the Java client instead of kcat, generate the proper truststore and keystore using the official docs.
-
Install kcat.
-
macOS:
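The macOS command is not shown here; kcat is typically installed with Homebrew:
brew install kcat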
-
Ubuntu:
apt-get update
apt-get install kcat
-
Connect to the Kubernetes cluster that runs your Kafka deployment.
-
Create a Kafka user. The client will use this user account to access Kafka. You can use the KafkaUser custom resource to customize the access rights as needed. For example:
-
Download the certificate and the key of the user, and the CA certificate used to verify the certificate of the Kafka server. These are available in the Kubernetes Secret created for the KafkaUser.
kubectl get secrets -n kafka <name-of-the-user-secret> -o jsonpath="{['data']['tls\.crt']}" | base64 -D > client.crt.pem
kubectl get secrets -n kafka <name-of-the-user-secret> -o jsonpath="{['data']['tls\.key']}" | base64 -D > client.key.pem
kubectl get secrets -n kafka <name-of-the-user-secret> -o jsonpath="{['data']['ca\.crt']}" | base64 -D > ca.crt.pem
-
Copy the downloaded certificates to a location that is accessible to the external host.
-
If you haven’t done so already, obtain the external address and port number of the cluster.
-
Produce some test messages on the host that is outside your cluster.
kcat -b $SERVICE_IP:$SERVICE_PORT -P -X security.protocol=SSL \
-X ssl.key.location=client.key.pem \
-X ssl.certificate.location=client.crt.pem \
-X ssl.ca.location=ca.crt.pem \
-t my-topic
And type some test messages.
-
Consume some messages.
kcat -b $SERVICE_IP:$SERVICE_PORT -C -X security.protocol=SSL \
-X ssl.key.location=client.key.pem \
-X ssl.certificate.location=client.crt.pem \
-X ssl.ca.location=ca.crt.pem \
-t my-topic
You should see the messages you have created.
4 - CruiseControlOperation to manage Cruise Control
Koperator version 0.22 introduces the CruiseControlOperation custom resource. Koperator executes Cruise Control related tasks based on the state of the CruiseControlOperation custom resource. This gives you better control over Cruise Control, improving reliability, configurability, and observability.
Overview
When a broker is added to or removed from the Kafka cluster, or when new storage is added for a broker, Koperator creates a CruiseControlOperation custom resource. This custom resource describes a task that Cruise Control executes to move the partitions. Koperator watches the created CruiseControlOperation custom resource and updates its state based on the result of the Cruise Control task. Koperator can re-execute the task if it fails.
Cruise Control can execute only one task at a time, so the priority of the tasks depends on the type of the operation:
- Upscale operations are executed first, then
- downscale operations, then
- rebalance operations.
The following Cruise Control tasks are supported:
- add_broker
- remove_broker
- rebalance
You can follow the progress of the operation through the KafkaCluster custom resource’s status and through the CruiseControlOperation custom resource’s status.
The following example shows the steps of an add_broker (GracefulUpscale*) operation, but the same applies for the remove_broker (GracefulDownscale*) and rebalance (when the volumeState is GracefulDiskRebalance*) operations.
-
Upscale the Kafka cluster by adding a new broker with id “3” into the KafkaCluster
CR:
spec:
...
brokers:
- id: 0
brokerConfigGroup: "default"
- id: 1
brokerConfigGroup: "default"
- id: 2
brokerConfigGroup: "default"
- id: 3
brokerConfigGroup: "default"
...
-
A new broker pod is created, and the cruiseControlOperationReference is added to the KafkaCluster status. This is a reference to the created CruiseControlOperation custom resource. The cruiseControlState shows the CruiseControlOperation state: GracefulUpscaleScheduled, meaning that the CruiseControlOperation has been created and is waiting for the add_broker task to be finished.
status:
...
brokersState:
"3":
...
gracefulActionState:
cruiseControlOperationReference:
name: kafka-addbroker-mhh72
cruiseControlState: GracefulUpscaleScheduled
volumeStates:
/kafka-logs:
cruiseControlOperationReference:
name: kafka-rebalance-h6ntt
cruiseControlVolumeState: GracefulDiskRebalanceScheduled
/kafka-logs2:
cruiseControlOperationReference:
name: kafka-rebalance-h6ntt
cruiseControlVolumeState: GracefulDiskRebalanceScheduled
...
-
The add_broker Cruise Control task is in progress:
status:
...
brokersState:
"3":
...
gracefulActionState:
cruiseControlOperationReference:
name: kafka-addbroker-mhh72
cruiseControlState: GracefulUpscaleRunning
...
-
When the add_broker Cruise Control task is completed:
status:
...
brokersState:
"3":
...
gracefulActionState:
cruiseControlOperationReference:
name: kafka-addbroker-mhh72
cruiseControlState: GracefulUpscaleSucceeded
...
There are two other possible states of cruiseControlState: GracefulUpscaleCompletedWithError and GracefulUpscalePaused.
- GracefulUpscalePaused is a special state. For details, see Control the created CruiseControlOperation.
- GracefulUpscaleCompletedWithError occurs when the Cruise Control task fails. If cruiseControlOperation.spec.errorPolicy is set to retry (which is the default value), Koperator re-executes the failed task every 30 seconds until it succeeds. During re-execution, the cruiseControlState returns to GracefulUpscaleRunning.
status:
...
brokersState:
"3":
...
gracefulActionState:
cruiseControlOperationReference:
name: kafka-addbroker-mhh72
cruiseControlState: GracefulUpscaleCompletedWithError
...
CruiseControlOperation CR overview
The kafka-addbroker-mhh72 CruiseControlOperation custom resource from the previous example looks like:
kind: CruiseControlOperation
metadata:
...
name: kafka-addbroker-mhh72
...
spec:
...
status:
currentTask:
finished: "2022-11-18T09:31:40Z"
httpRequest: http://kafka-cruisecontrol-svc.kafka.svc.cluster.local:8090/kafkacruisecontrol/add_broker?allow_capacity_estimation=true&brokerid=3&data_from=VALID_WINDOWS&dryrun=false&exclude_recently_demoted_brokers=true&exclude_recently_removed_brokers=true&json=true&use_ready_default_goals=true
httpResponseCode: 200
id: 222e30f0-1e7a-4c87-901c-bed2854d69b7
operation: add_broker
parameters:
brokerid: "3"
exclude_recently_demoted_brokers: "true"
exclude_recently_removed_brokers: "true"
started: "2022-11-18T09:30:48Z"
state: Completed
summary:
Data to move: "0"
Intra broker data to move: "0"
Number of intra broker replica movements: "0"
Number of leader movements: "0"
Number of replica movements: "36"
Provision recommendation: '[ReplicaDistributionGoal] Remove at least 4 brokers.'
Recent windows: "1"
errorPolicy: retry
retryCount: 0
- The status.currentTask field describes the Cruise Control task.
- The httpRequest field contains the whole POST HTTP request that has been executed.
- The id is the Cruise Control task identifier number.
- The state shows the progress of the request.
- The summary is Cruise Control’s optimization proposal. It shows the scope of the changes that Cruise Control will apply through the operation.
- The retryCount field shows the number of retries when a task has failed and cruiseControlOperation.spec.errorPolicy is set to retry. In this case, the status.failedTask field shows the history of the failed tasks (including their error messages).
For further information on the fields, see the source code.
Control the created CruiseControlOperation
Stop a task
The task execution can be stopped gracefully by deleting the CruiseControlOperation. In this case, the corresponding cruiseControlState or cruiseControlVolumeState transitions to Graceful*Succeeded.
Handle failed tasks
cruiseControlOperation.spec.errorPolicy defines how a failed Cruise Control task is handled. When errorPolicy is set to retry, Koperator re-executes the failed task every 30 seconds. When it is set to ignore, Koperator treats the failed task as completed, thus the cruiseControlState or cruiseControlVolumeState transitions to Graceful*Succeeded.
Pause a task
When there is a Cruise Control task that cannot be completed without an error and cruiseControlOperation.spec.errorPolicy is set to retry, Koperator re-executes the task until it succeeds. You can pause the automatic re-execution by adding the pause label to the corresponding CruiseControlOperation custom resource, as shown in the example below. To resume the task, remove the label (or set it to any value other than true).
Pausing is useful when the cause of the error cannot be fixed any time soon, but you want to retry the operation later once the problem is resolved.
Paused CruiseControlOperation tasks are ignored when selecting operations for execution: when a new CruiseControlOperation with the same operation type (status.currentTask.operation) is created, the new one is executed and the paused one is skipped.
kind: CruiseControlOperation
metadata:
...
name: kafka-addbroker-mhh72
labels:
pause: "true"
...
Automatic cleanup
You can set an automatic cleanup time for the created CruiseControlOperations in the KafkaCluster custom resource. In the following example, the finished (completed successfully, or completedWithError with errorPolicy: ignore) CruiseControlOperation custom resources are automatically deleted after 300 seconds.
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
...
spec:
...
cruiseControlConfig:
cruiseControlOperationSpec:
ttlSecondsAfterFinished: 300
...
Example for the ignore and pause use-cases
This example shows how to ignore and pause an operation.
-
Using the original example with four Kafka brokers from the Overview as the starting point, this example removes two brokers at the same time by editing the KafkaCluster custom resource and deleting broker 2 and broker 3.
spec:
...
brokers:
- id: 0
brokerConfigGroup: "default"
- id: 1
brokerConfigGroup: "default"
-
Each removed broker gets its own remove_broker operation (kafka-removebroker-lg7qm and kafka-removebroker-4plfq). The example shows that the first one is already in the running state.
status:
...
brokersState:
"2":
...
gracefulActionState:
cruiseControlOperationReference:
name: kafka-removebroker-lg7qm
cruiseControlState: GracefulDownscaleRunning
...
"3":
gracefulActionState:
cruiseControlOperationReference:
name: kafka-removebroker-4plfq
cruiseControlState: GracefulDownscaleScheduled
...
-
Assume that something unexpected happened, so the remove_broker operation enters the GracefulDownscaleCompletedWithError state.
status:
...
brokersState:
"2":
...
gracefulActionState:
cruiseControlOperationReference:
name: kafka-removebroker-lg7qm
cruiseControlState: GracefulDownscaleCompletedWithError
...
"3":
gracefulActionState:
cruiseControlOperationReference:
name: kafka-removebroker-4plfq
cruiseControlState: GracefulDownscaleScheduled
...
-
At this point, you can decide how to handle this problem using one of the three possible options: retry it (which is the default behavior), ignore the error, or use the pause label to pause the operation and let Koperator execute the next operation.
-
Ignore use-case: To ignore the error, set the cruiseControlOperation.spec.errorPolicy field to ignore. The operation will be considered successful, and the broker pod and the persistent volume will be removed from the Kubernetes cluster and from the KafkaCluster status. Koperator will continue to execute the next task: remove_broker for kafka-removebroker-4plfq.
status:
...
brokersState:
...
"3":
gracefulActionState:
cruiseControlOperationReference:
name: kafka-removebroker-4plfq
cruiseControlState: GracefulDownscaleRunning
...
-
Pause use-case: To pause this task, add the pause: "true" label to the kafka-removebroker-lg7qm CruiseControlOperation. Koperator won’t try to re-execute this task, and moves on to the next remove_broker operation.
status:
...
brokersState:
"2":
...
gracefulActionState:
cruiseControlOperationReference:
name: kafka-removebroker-lg7qm
cruiseControlState: GracefulDownscalePaused
...
"3":
gracefulActionState:
cruiseControlOperationReference:
name: kafka-removebroker-4plfq
cruiseControlState: GracefulDownscaleRunning
...
When the second remove_broker operation is finished, only the paused task remains:
status:
...
brokersState:
"2":
...
gracefulActionState:
cruiseControlOperationReference:
name: kafka-removebroker-lg7qm
cruiseControlState: GracefulDownscalePaused
...
When the problem has been resolved, you can retry removing broker 2 by removing the pause label.
status:
...
brokersState:
"2":
...
gracefulActionState:
cruiseControlOperationReference:
name: kafka-removebroker-lg7qm
cruiseControlState: GracefulDownscaleRunning
...
If everything goes well, the broker is removed.
5 - Istio Integration
Istio Integration with Koperator
Koperator now supports Istio integration using standard Istio resources, providing advanced service mesh capabilities for Kafka clusters. This integration replaces the deprecated banzaicloud istio-operator with a more robust approach using standard Kubernetes and Istio resources.
Overview
The Istio integration in Koperator provides:
- Service Mesh Capabilities: Full Istio service mesh integration for Kafka clusters
- Traffic Management: Advanced traffic routing and load balancing
- Security: mTLS encryption and authentication
- Observability: Enhanced monitoring and tracing capabilities
- Gateway Management: Automatic Istio Gateway and VirtualService creation
- No Control Plane Dependency: Works with any Istio installation
Prerequisites
Before using Istio integration with Koperator, ensure you have:
- Istio Installation: Any Istio installation (operator-based or manual)
- Kubernetes Cluster: Version 1.19+ with sufficient resources
- Istio CRDs: Istio Custom Resource Definitions installed
Installation
1. Install Istio (Optional)
If you don’t have Istio installed, you can install it using any method:
# Option 1: Using istioctl (download the release, then install)
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.19.0 sh -
./istio-1.19.0/bin/istioctl install --set values.defaultRevision=default
# Option 2: Using Helm
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
helm install istio-base istio/base -n istio-system --create-namespace
helm install istiod istio/istiod -n istio-system --wait
2. Verify Istio Installation
# Check Istio control plane status (if installed)
kubectl get pods -n istio-system
# Verify Istio CRDs are available
kubectl get crd | grep istio
Configuration
Basic Istio Configuration
Configure your KafkaCluster to use Istio ingress:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
name: kafka
namespace: kafka
spec:
ingressController: "istioingress"
istioIngressConfig:
gatewayConfig:
mode: ISTIO_MUTUAL
# ... rest of your Kafka configuration
Note: The istioControlPlane configuration is no longer required, as Koperator now creates standard Kubernetes resources that work with any Istio installation.
Advanced Configuration Options
Istio Ingress Configuration
spec:
istioIngressConfig:
# Gateway configuration
gatewayConfig:
mode: ISTIO_MUTUAL # or SIMPLE for non-mTLS
# Resource limits and requests
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "2000m"
memory: "1024Mi"
# Replica configuration
replicas: 2
# Node selector for gateway placement
nodeSelector:
kubernetes.io/os: linux
# Tolerations for gateway scheduling
tolerations:
- key: "istio"
operator: "Equal"
value: "true"
effect: "NoSchedule"
# Environment variables
envs:
- name: CUSTOM_VAR
value: "custom-value"
# Annotations for the gateway
annotations:
custom.annotation: "value"
# Service annotations
serviceAnnotations:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
Architecture Changes
Previous Architecture (banzaicloud istio-operator)
The previous implementation used:
- IstioMeshGateway custom resources
- banzaicloud istio-operator dependencies
- Custom Istio operator APIs
New Architecture (Standard Istio Resources)
The new implementation uses:
- Standard Kubernetes Deployment and Service resources
- Native Istio Gateway and VirtualService resources
- No dependency on a specific Istio control plane or operator
Resource Creation
When you create a KafkaCluster with Istio integration, Koperator automatically creates:
- Kubernetes Deployment: Istio proxy deployment with the docker.io/istio/proxyv2:latest image
- Kubernetes Service: Load balancer service for external access
- Istio Gateway: Routes external traffic to Kafka brokers
- VirtualService: Defines routing rules for the gateway
The implementation creates standard Kubernetes resources that work with any Istio installation, making it more flexible and compatible.
Security Features
mTLS Configuration
The Istio integration supports mutual TLS (mTLS) for secure communication:
spec:
istioIngressConfig:
gatewayConfig:
mode: ISTIO_MUTUAL # Enables mTLS
Authentication and Authorization
Istio provides additional security features:
- PeerAuthentication: Configure mTLS policies
- AuthorizationPolicy: Define access control rules
- RequestAuthentication: JWT token validation
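For example, a minimal PeerAuthentication resource (a sketch using the standard security.istio.io/v1beta1 API; the resource name is an example) enforcing strict mTLS in the kafka namespace could look like this:
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: kafka-mtls
  namespace: kafka
spec:
  mtls:
    mode: STRICT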
Monitoring and Observability
Metrics
Istio integration provides enhanced metrics:
- Traffic Metrics: Request rates, latency, error rates
- Gateway Metrics: Istio gateway performance
- Service Mesh Metrics: End-to-end observability
Tracing
Distributed tracing is automatically enabled:
- Jaeger Integration: Automatic trace collection
- Zipkin Support: Alternative tracing backend
- Custom Trace Sampling: Configurable sampling rates
Troubleshooting
Common Issues
-
Istio Control Plane Not Found
# Verify Istio control plane is running
kubectl get pods -n istio-system
-
Gateway Not Receiving Traffic
# Check gateway status
kubectl get gateway -n kafka
kubectl describe gateway kafka-gateway -n kafka
-
mTLS Configuration Issues
# Verify peer authentication
kubectl get peerauthentication -n kafka
Debugging Commands
# Check Istio proxy status
istioctl proxy-status
# Verify configuration
istioctl analyze
# Check gateway configuration
kubectl get gateway,virtualservice -n kafka
# View Istio logs
kubectl logs -n kafka -l app=istio-ingressgateway
Migration from banzaicloud istio-operator
If you’re migrating from the deprecated banzaicloud istio-operator:
- Remove old dependencies: Uninstall banzaicloud istio-operator
- Install upstream Istio: Follow the installation steps above
- Update configurations: Update your KafkaCluster specs
- Test thoroughly: Verify all functionality works as expected
Best Practices
- Resource Planning: Allocate sufficient resources for Istio components
- Security: Always use mTLS in production environments
- Monitoring: Set up comprehensive monitoring and alerting
- Testing: Thoroughly test Istio configurations in non-production environments
- Updates: Keep Istio and Koperator versions compatible
Examples
Complete KafkaCluster with Istio
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
name: kafka
namespace: kafka
spec:
headlessServiceEnabled: false
ingressController: "istioingress"
istioIngressConfig:
gatewayConfig:
mode: ISTIO_MUTUAL
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "2000m"
memory: "1024Mi"
replicas: 2
annotations:
sidecar.istio.io/inject: "true"
zkAddresses:
- "zookeeper-server-client.zookeeper:2181"
clusterImage: "ghcr.io/adobe/koperator/kafka:2.13-3.9.1"
brokers:
- id: 0
brokerConfigGroup: "default"
- id: 1
brokerConfigGroup: "default"
- id: 2
brokerConfigGroup: "default"
brokerConfigGroups:
default:
storageConfigs:
- mountPath: "/kafka-logs"
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
listenersConfig:
internalListeners:
- type: "plaintext"
name: "internal"
containerPort: 29092
usedForInnerBrokerCommunication: true
externalListeners:
- type: "plaintext"
name: "external"
externalStartingPort: 19090
containerPort: 9094
Support
For issues related to Istio integration:
- Koperator Issues: Report to the Koperator GitHub repository
- Istio Issues: Report to the Istio GitHub repository
- Documentation: Check the official Istio documentation
- Community: Join the Istio and Koperator community discussions
6 - Configure Kafka cluster
Koperator provides convenient ways of configuring Kafka resources through Kubernetes custom resources.
Overview
The KafkaCluster custom resource is the main configuration resource for the Kafka clusters.
It defines the Apache Kafka cluster properties, like Kafka brokers and listeners configurations.
By deploying the KafkaCluster custom resource, Koperator sets up your Kafka cluster.
You can change your Kafka cluster properties by updating the KafkaCluster custom resource.
The KafkaCluster custom resource always reflects your Kafka cluster: when something changes in your KafkaCluster custom resource, Koperator reconciles the changes to your Kafka cluster.
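For example, assuming the cluster from the install guide (named kafka in the kafka namespace), you can edit the custom resource and watch Koperator reconcile the change:
kubectl edit kafkaclusters.kafka.banzaicloud.io kafka -n kafka
kubectl get kafkaclusters.kafka.banzaicloud.io kafka -n kafka --watch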
For the CRD reference, see CRD.
6.1 - CRD
The following sections contain the reference documentation of the various custom resource definitions (CRDs) that are specific to Koperator.
For sample YAML files, see the samples directory in the GitHub project.
6.1.1 - KafkaCluster CRD schema reference (group kafka.banzaicloud.io)
KafkaCluster is the Schema for the kafkaclusters API
KafkaCluster
KafkaCluster is the Schema for the kafkaclusters API
- Full name:
- kafkaclusters.kafka.banzaicloud.io
- Group:
- kafka.banzaicloud.io
- Singular name:
- kafkacluster
- Plural name:
- kafkaclusters
- Scope:
- Namespaced
- Versions:
- v1beta1
Version v1beta1
Properties
object
KafkaClusterSpec defines the desired state of KafkaCluster
array
Custom ports to expose in the container. Example use case: a custom Kafka distribution that includes an integrated metrics API endpoint.
object
ContainerPort represents a network port in a single container.
integer
Required
Number of port to expose on the pod’s IP address. This must be a valid port number, 0 < x < 65536.
string
What host IP to bind the external port to.
integer
Number of port to expose on the host. If specified, this must be a valid port number, 0 < x < 65536. If HostNetwork is specified, this must match ContainerPort. Most containers do not need this.
string
If specified, this must be an IANA_SVC_NAME and unique within the pod. Each named port in a pod must have a unique name. Name for the port that can be referred to by services.
string
Protocol for port. Must be UDP, TCP, or SCTP. Defaults to “TCP”.
object
AlertManagerConfig defines configuration for alert manager
integer
DownScaleLimit the limit for auto-downscaling the Kafka cluster. Once the size of the cluster (number of brokers) reaches or falls below this limit the auto-downscaling triggered by alerts is disabled until the cluster size exceeds this limit. This limit is not enforced if this field is omitted or is <= 0.
integer
UpScaleLimit the limit for auto-upscaling the Kafka cluster. Once the size of the cluster (number of brokers) reaches or exceeds this limit the auto-upscaling triggered by alerts is disabled until the cluster size falls below this limit. This limit is not enforced if this field is omitted or is <= 0.
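As an illustration of these two limits, a minimal sketch of the corresponding KafkaCluster spec fragment (field names are assumed to follow the camelCase JSON form of this CRD):
spec:
  alertManagerConfig:
    downScaleLimit: 3   # alert-driven downscaling is disabled at or below 3 brokers
    upScaleLimit: 10    # alert-driven upscaling is disabled at or above 10 brokers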
object
Broker defines the broker basic configuration
object
BrokerConfig defines the broker configuration
object
Any definition received through this field overrides the default behaviour of the OneBrokerPerNode flag, and the operator assumes that the user is aware of how scheduling is done by Kubernetes. Affinity can be set through brokerConfigGroups definitions and for individual brokers as well; the latter setting overrides the group setting (see the sketch after this schema block).
object
Describes node affinity scheduling rules for the pod.
array
The scheduler will prefer to schedule pods to nodes that satisfy the affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding “weight” to the sum if the node matches the corresponding matchExpressions; the node(s) with the highest sum are the most preferred.
object
An empty preferred scheduling term matches all objects with implicit weight 0 (i.e. it’s a no-op). A null preferred scheduling term matches no objects (i.e. is also a no-op).
object
Required
A node selector term, associated with the corresponding weight.
array
A list of node selector requirements by node’s labels.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
array
A list of node selector requirements by node’s fields.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
integer
Required
Weight associated with matching the corresponding nodeSelectorTerm, in the range 1-100.
object
If the affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to an update), the system may or may not try to eventually evict the pod from its node.
array
Required
Required. A list of node selector terms. The terms are ORed.
object
A null or empty node selector term matches no objects. The requirements of them are ANDed. The TopologySelectorTerm type implements a subset of the NodeSelectorTerm.
array
A list of node selector requirements by node’s labels.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
array
A list of node selector requirements by node’s fields.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
object
Describes pod affinity scheduling rules (e.g. co-locate this pod in the same node, zone, etc. as some other pod(s)).
array
The scheduler will prefer to schedule pods to nodes that satisfy the affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding “weight” to the sum if the node has pods which matches the corresponding podAffinityTerm; the node(s) with the highest sum are the most preferred.
object
The weights of all of the matched WeightedPodAffinityTerm fields are added per-node to find the most preferred node(s)
object
Required
Required. A pod affinity term, associated with the corresponding weight.
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
integer
Required
weight associated with matching the corresponding podAffinityTerm, in the range 1-100.
array
If the affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to a pod label update), the system may or may not try to eventually evict the pod from its node. When there are multiple elements, the lists of nodes corresponding to each podAffinityTerm are intersected, i.e. all terms must be satisfied.
object
Defines a set of pods (namely those matching the labelSelector relative to the given namespace(s)) that this pod should be co-located (affinity) or not co-located (anti-affinity) with, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which a pod of the set of pods is running
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
object
Describes pod anti-affinity scheduling rules (e.g. avoid putting this pod in the same node, zone, etc. as some other pod(s)).
array
The scheduler will prefer to schedule pods to nodes that satisfy the anti-affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling anti-affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding “weight” to the sum if the node has pods which matches the corresponding podAffinityTerm; the node(s) with the highest sum are the most preferred.
object
The weights of all of the matched WeightedPodAffinityTerm fields are added per-node to find the most preferred node(s)
object
Required
Required. A pod affinity term, associated with the corresponding weight.
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
integer
Required
weight associated with matching the corresponding podAffinityTerm, in the range 1-100.
array
If the anti-affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the anti-affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to a pod label update), the system may or may not try to eventually evict the pod from its node. When there are multiple elements, the lists of nodes corresponding to each podAffinityTerm are intersected, i.e. all terms must be satisfied.
object
Defines a set of pods (namely those matching the labelSelector relative to the given namespace(s)) that this pod should be co-located (affinity) or not co-located (anti-affinity) with, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which a pod of the set of pods is running
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
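A minimal sketch of the per-group affinity override mentioned above, spreading broker pods across nodes (the label selector is illustrative; match it to your broker pod labels):
spec:
  brokerConfigGroups:
    default:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: kafka            # illustrative label
              topologyKey: kubernetes.io/hostname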
object
Custom annotations for the broker pods - e.g.: Prometheus scraping annotations: prometheus.io/scrape: “true” prometheus.io/port: “9020”
array
BrokerIngressMapping allows setting specific ingress-to-broker mappings. If left empty, all brokers inherit the default one specified under the external listeners config. Only used when ExternalListeners.Config is populated.
object
Custom labels for the broker pods. Example use case: for Prometheus monitoring to capture the group for each broker as a label, e.g.: kafka_broker_group: “default_group”. These labels will not override the reserved labels that the operator relies on, for example, “app”, “brokerId”, and “kafka_cr”.
array
Containers adds extra containers to the Kafka broker pod (see the sketch after this schema block).
object
A single application container that you want to run within a pod.
array
Arguments to the entrypoint. The container image’s CMD is used if this is not provided. Variable references $(VAR_NAME) are expanded using the container’s environment. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Cannot be updated. More info: https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shell
array
Entrypoint array. Not executed within a shell. The container image’s ENTRYPOINT is used if this is not provided. Variable references $(VAR_NAME) are expanded using the container’s environment. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Cannot be updated. More info: https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shell
array
List of environment variables to set in the container. Cannot be updated.
array
List of sources to populate environment variables in the container. The keys defined within a source must be a C_IDENTIFIER. All invalid keys will be reported as an event when the container is starting. When a key exists in multiple sources, the value associated with the last source will take precedence. Values defined by an Env with a duplicate key will take precedence. Cannot be updated.
object
EnvFromSource represents the source of a set of ConfigMaps
object
The ConfigMap to select from
boolean
Specify whether the ConfigMap must be defined
string
An optional identifier to prepend to each key in the ConfigMap. Must be a C_IDENTIFIER.
object
The Secret to select from
boolean
Specify whether the Secret must be defined
object
EnvVar represents an environment variable present in a Container.
string
Required
Name of the environment variable. Must be a C_IDENTIFIER.
string
Variable references $(VAR_NAME) are expanded using the previously defined environment variables in the container and any service environment variables. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Defaults to “”.
object
Source for the environment variable’s value. Cannot be used if value is not empty.
object
Selects a key of a ConfigMap.
boolean
Specify whether the ConfigMap or its key must be defined
object
Selects a field of the pod: supports metadata.name, metadata.namespace, metadata.labels['<KEY>'], metadata.annotations['<KEY>'], spec.nodeName, spec.serviceAccountName, status.hostIP, status.podIP, status.podIPs.
string
Version of the schema the FieldPath is written in terms of, defaults to “v1”.
string
Required
Path of the field to select in the specified API version.
object
Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, limits.ephemeral-storage, requests.cpu, requests.memory and requests.ephemeral-storage) are currently supported.
string
Container name: required for volumes, optional for env vars
Specifies the output format of the exposed resources, defaults to “1”
string
Required
Required: resource to select
object
Selects a key of a secret in the pod’s namespace
string
Required
The key of the secret to select from. Must be a valid secret key.
boolean
Specify whether the Secret or its key must be defined
object
Actions that the management system should take in response to container lifecycle events. Cannot be updated.
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
object
Deprecated. TCPSocket is NOT supported as a LifecycleHandler and is kept for backward compatibility. There is no validation of this field, and lifecycle hooks will fail at runtime when a tcp handler is specified.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
object
PreStop is called immediately before a container is terminated due to an API request or management event such as liveness/startup probe failure, preemption, resource contention, etc. The handler is not called if the container crashes or exits. The Pod’s termination grace period countdown begins before the PreStop hook is executed. Regardless of the outcome of the handler, the container will eventually terminate within the Pod’s termination grace period (unless delayed by finalizers). Other management of the container blocks until the hook completes or until the termination grace period is reached. More info: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
object
Deprecated. TCPSocket is NOT supported as a LifecycleHandler and is kept for backward compatibility. There is no validation of this field, and lifecycle hooks will fail at runtime when a tcp handler is specified.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
object
GRPC specifies an action involving a GRPC port. This is a beta field and requires enabling GRPCContainerProbe feature gate.
integer
Required
Port number of the gRPC service. Number must be in the range 1 to 65535.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
integer
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
object
TCPSocket specifies an action involving a TCP port.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
integer
Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod’s terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset.
string
Required
Name of the container specified as a DNS_LABEL. Each container in a pod must have a unique name (DNS_LABEL). Cannot be updated.
array
List of ports to expose from the container. Not specifying a port here DOES NOT prevent that port from being exposed. Any port which is listening on the default “0.0.0.0” address inside a container will be accessible from the network. Modifying this array with strategic merge patch may corrupt the data. For more information See https://github.com/kubernetes/kubernetes/issues/108255. Cannot be updated.
object
ContainerPort represents a network port in a single container.
integer
Required
Number of port to expose on the pod’s IP address. This must be a valid port number, 0 < x < 65536.
string
What host IP to bind the external port to.
integer
Number of port to expose on the host. If specified, this must be a valid port number, 0 < x < 65536. If HostNetwork is specified, this must match ContainerPort. Most containers do not need this.
string
If specified, this must be an IANA_SVC_NAME and unique within the pod. Each named port in a pod must have a unique name. Name for the port that can be referred to by services.
string
Protocol for port. Must be UDP, TCP, or SCTP. Defaults to “TCP”.
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
object
GRPC specifies an action involving a GRPC port. This is a beta field and requires enabling GRPCContainerProbe feature gate.
integer
Required
Port number of the gRPC service. Number must be in the range 1 to 65535.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
integer
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
object
TCPSocket specifies an action involving a TCP port.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
integer
Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod’s terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset.
boolean
AllowPrivilegeEscalation controls whether a process can gain more privileges than its parent process. This bool directly controls if the no_new_privs flag will be set on the container process. AllowPrivilegeEscalation is true always when the container is: 1) run as Privileged 2) has CAP_SYS_ADMIN Note that this field cannot be set when spec.os.name is windows.
object
The capabilities to add/drop when running containers. Defaults to the default set of capabilities granted by the container runtime. Note that this field cannot be set when spec.os.name is windows.
string
Capability represent POSIX capabilities type
string
Capability represent POSIX capabilities type
boolean
Run container in privileged mode. Processes in privileged containers are essentially equivalent to root on the host. Defaults to false. Note that this field cannot be set when spec.os.name is windows.
string
procMount denotes the type of proc mount to use for the containers. The default is DefaultProcMount which uses the container runtime defaults for readonly paths and masked paths. This requires the ProcMountType feature flag to be enabled. Note that this field cannot be set when spec.os.name is windows.
boolean
Whether this container has a read-only root filesystem. Default is false. Note that this field cannot be set when spec.os.name is windows.
integer
The GID to run the entrypoint of the container process. Uses runtime default if unset. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
boolean
Indicates that the container must run as a non-root user. If true, the Kubelet will validate the image at runtime to ensure that it does not run as UID 0 (root) and fail to start the container if it does. If unset or false, no such validation will be performed. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
integer
The UID to run the entrypoint of the container process. Defaults to user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
object
The SELinux context to be applied to the container. If unspecified, the container runtime will allocate a random SELinux context for each container. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
string
Level is SELinux level label that applies to the container.
string
Role is a SELinux role label that applies to the container.
string
Type is a SELinux type label that applies to the container.
string
User is a SELinux user label that applies to the container.
object
The seccomp options to use by this container. If seccomp options are provided at both the pod & container level, the container options override the pod options. Note that this field cannot be set when spec.os.name is windows.
string
localhostProfile indicates a profile defined in a file on the node should be used. The profile must be preconfigured on the node to work. Must be a descending path, relative to the kubelet’s configured seccomp profile location. Must only be set if type is “Localhost”.
string
Required
type indicates which kind of seccomp profile will be applied. Valid options are:
Localhost - a profile defined in a file on the node should be used. RuntimeDefault - the container runtime default profile should be used. Unconfined - no profile should be applied.
object
The Windows specific settings applied to all containers. If unspecified, the options from the PodSecurityContext will be used. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is linux.
string
GMSACredentialSpecName is the name of the GMSA credential spec to use.
boolean
HostProcess determines if a container should be run as a ‘Host Process’ container. This field is alpha-level and will only be honored by components that enable the WindowsHostProcessContainers feature flag. Setting this field without the feature flag will result in errors when validating the Pod. All of a Pod’s containers must have the same effective HostProcess value (it is not allowed to have a mix of HostProcess containers and non-HostProcess containers). In addition, if HostProcess is true then HostNetwork must also be set to true.
string
The UserName in Windows to run the entrypoint of the container process. Defaults to the user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
object
StartupProbe indicates that the Pod has successfully initialized. If specified, no other probes are executed until this completes successfully. If this probe fails, the Pod will be restarted, just as if the livenessProbe failed. This can be used to provide different probe parameters at the beginning of a Pod’s lifecycle, when it might take a long time to load data or warm a cache, than during steady-state operation. This cannot be updated. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
object
GRPC specifies an action involving a GRPC port. This is a beta field and requires enabling GRPCContainerProbe feature gate.
integer
Required
Port number of the gRPC service. Number must be in the range 1 to 65535.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
integer
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
object
TCPSocket specifies an action involving a TCP port.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
integer
Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod’s terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset.
boolean
Whether this container should allocate a buffer for stdin in the container runtime. If this is not set, reads from stdin in the container will always result in EOF. Default is false.
boolean
Whether the container runtime should close the stdin channel after it has been opened by a single attach. When stdin is true the stdin stream will remain open across multiple attach sessions. If stdinOnce is set to true, stdin is opened on container start, is empty until the first client attaches to stdin, and then remains open and accepts data until the client disconnects, at which time stdin is closed and remains closed until the container is restarted. If this flag is false, a container process that reads from stdin will never receive an EOF. Default is false.
string
Optional: Path at which the file to which the container’s termination message will be written is mounted into the container’s filesystem. Message written is intended to be brief final status, such as an assertion failure message. Will be truncated by the node if greater than 4096 bytes. The total message length across all containers will be limited to 12kb. Defaults to /dev/termination-log. Cannot be updated.
string
Indicate how the termination message should be populated. File will use the contents of terminationMessagePath to populate the container status message on both success and failure. FallbackToLogsOnError will use the last chunk of container log output if the termination message file is empty and the container exited with an error. The log output is limited to 2048 bytes or 80 lines, whichever is smaller. Defaults to File. Cannot be updated.
boolean
Whether this container should allocate a TTY for itself, also requires ‘stdin’ to be true. Default is false.
array
volumeDevices is the list of block devices to be used by the container.
object
volumeDevice describes a mapping of a raw block device within a container.
string
Required
devicePath is the path inside of the container that the device will be mapped to.
string
Required
name must match the name of a persistentVolumeClaim in the pod
array
Pod volumes to mount into the container’s filesystem. Cannot be updated.
object
VolumeMount describes a mounting of a Volume within a container.
string
Required
Path within the container at which the volume should be mounted. Must not contain ‘:’.
string
mountPropagation determines how mounts are propagated from the host to container and the other way around. When not set, MountPropagationNone is used. This field is beta in 1.10.
string
Required
This must match the Name of a Volume.
boolean
Mounted read-only if true, read-write otherwise (false or unspecified). Defaults to false.
string
Path within the volume from which the container’s volume should be mounted. Defaults to “” (volume’s root).
string
Expanded path within the volume from which the container’s volume should be mounted. Behaves similarly to SubPath but environment variable references $(VAR_NAME) are expanded using the container’s environment. Defaults to “” (volume’s root). SubPathExpr and SubPath are mutually exclusive.
string
Container’s working directory. If not specified, the container runtime’s default will be used, which might be configured in the container image. Cannot be updated.
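A minimal sketch of the containers field referenced above, adding an extra (purely illustrative) sidecar container to every broker pod in a config group:
spec:
  brokerConfigGroups:
    default:
      containers:
        - name: sidecar-example       # illustrative sidecar, not part of Koperator
          image: busybox:1.36
          command: ["sh", "-c", "sleep infinity"]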
array
Envs defines environment variables for Kafka broker Pods. Adding the “+” prefix to the name prepends the value to that environment variable instead of overwriting it; adding the “+” suffix appends instead (see the sketch after this schema block).
object
EnvVar represents an environment variable present in a Container.
string
Required
Name of the environment variable. Must be a C_IDENTIFIER.
string
Variable references $(VAR_NAME) are expanded using the previously defined environment variables in the container and any service environment variables. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Defaults to “”.
object
Source for the environment variable’s value. Cannot be used if value is not empty.
object
Selects a key of a ConfigMap.
boolean
Specify whether the ConfigMap or its key must be defined
object
Selects a field of the pod: supports metadata.name, metadata.namespace, metadata.labels['<KEY>'], metadata.annotations['<KEY>'], spec.nodeName, spec.serviceAccountName, status.hostIP, status.podIP, status.podIPs.
string
Version of the schema the FieldPath is written in terms of, defaults to “v1”.
string
Required
Path of the field to select in the specified API version.
object
Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, limits.ephemeral-storage, requests.cpu, requests.memory and requests.ephemeral-storage) are currently supported.
string
Container name: required for volumes, optional for env vars
Specifies the output format of the exposed resources, defaults to “1”
string
Required
Required: resource to select
object
Selects a key of a secret in the pod’s namespace
string
Required
The key of the secret to select from. Must be a valid secret key.
boolean
Specify whether the Secret or its key must be defined
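A minimal sketch of the prepend/append behaviour of the envs field referenced above (variable names and values are illustrative only):
spec:
  brokerConfigGroups:
    default:
      envs:
        # "+" prefix: prepend this value to the existing KAFKA_OPTS
        - name: "+KAFKA_OPTS"
          value: "-Djava.security.auth.login.config=/opt/kafka/config/jaas.conf "
        # "+" suffix: append this value to the existing KAFKA_HEAP_OPTS
        - name: "KAFKA_HEAP_OPTS+"
          value: " -Xms512m"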
object
LocalObjectReference contains enough information to let you locate the referenced object inside the same namespace.
array
InitContainers add extra initContainers to the Kafka broker pod
object
A single application container that you want to run within a pod.
array
Arguments to the entrypoint. The container image’s CMD is used if this is not provided. Variable references $(VAR_NAME) are expanded using the container’s environment. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Cannot be updated. More info: https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shell
array
Entrypoint array. Not executed within a shell. The container image’s ENTRYPOINT is used if this is not provided. Variable references $(VAR_NAME) are expanded using the container’s environment. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Cannot be updated. More info: https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shell
array
List of environment variables to set in the container. Cannot be updated.
array
List of sources to populate environment variables in the container. The keys defined within a source must be a C_IDENTIFIER. All invalid keys will be reported as an event when the container is starting. When a key exists in multiple sources, the value associated with the last source will take precedence. Values defined by an Env with a duplicate key will take precedence. Cannot be updated.
object
EnvFromSource represents the source of a set of ConfigMaps
object
The ConfigMap to select from
boolean
Specify whether the ConfigMap must be defined
string
An optional identifier to prepend to each key in the ConfigMap. Must be a C_IDENTIFIER.
object
The Secret to select from
boolean
Specify whether the Secret must be defined
object
EnvVar represents an environment variable present in a Container.
string
Required
Name of the environment variable. Must be a C_IDENTIFIER.
string
Variable references $(VAR_NAME) are expanded using the previously defined environment variables in the container and any service environment variables. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Defaults to “”.
object
Source for the environment variable’s value. Cannot be used if value is not empty.
object
Selects a key of a ConfigMap.
boolean
Specify whether the ConfigMap or its key must be defined
object
Selects a field of the pod: supports metadata.name, metadata.namespace, metadata.labels['<KEY>'], metadata.annotations['<KEY>'], spec.nodeName, spec.serviceAccountName, status.hostIP, status.podIP, status.podIPs.
string
Version of the schema the FieldPath is written in terms of, defaults to “v1”.
string
Required
Path of the field to select in the specified API version.
object
Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, limits.ephemeral-storage, requests.cpu, requests.memory and requests.ephemeral-storage) are currently supported.
string
Container name: required for volumes, optional for env vars
Specifies the output format of the exposed resources, defaults to “1”
string
Required
Required: resource to select
object
Selects a key of a secret in the pod’s namespace
string
Required
The key of the secret to select from. Must be a valid secret key.
boolean
Specify whether the Secret or its key must be defined
object
Actions that the management system should take in response to container lifecycle events. Cannot be updated.
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
object
Deprecated. TCPSocket is NOT supported as a LifecycleHandler and is kept for backward compatibility. There is no validation of this field, and lifecycle hooks will fail at runtime when a tcp handler is specified.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
object
PreStop is called immediately before a container is terminated due to an API request or management event such as liveness/startup probe failure, preemption, resource contention, etc. The handler is not called if the container crashes or exits. The Pod’s termination grace period countdown begins before the PreStop hook is executed. Regardless of the outcome of the handler, the container will eventually terminate within the Pod’s termination grace period (unless delayed by finalizers). Other management of the container blocks until the hook completes or until the termination grace period is reached. More info: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
object
Deprecated. TCPSocket is NOT supported as a LifecycleHandler and is kept for backward compatibility. There is no validation of this field, and lifecycle hooks will fail at runtime when a tcp handler is specified.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
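As a sketch, the lifecycle handlers described above (exec and httpGet; tcpSocket is deprecated for lifecycle hooks) could be attached to such a container like this. The command and endpoint are hypothetical:

  # fragment of a container spec
  lifecycle:
    postStart:
      exec:
        # exec'd directly, not run in a shell; call a shell explicitly if one is needed
        command: ["/bin/sh", "-c", "echo started >> /tmp/lifecycle.log"]
    preStop:
      httpGet:
        path: /shutdown        # hypothetical endpoint
        port: 8080
        scheme: HTTP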
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
object
GRPC specifies an action involving a GRPC port. This is a beta field and requires enabling GRPCContainerProbe feature gate.
integer
Required
Port number of the gRPC service. Number must be in the range 1 to 65535.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
integer
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
object
TCPSocket specifies an action involving a TCP port.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
integer
Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod’s terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset.
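The probe fields above (the exec, grpc, httpGet, and tcpSocket actions plus failureThreshold, periodSeconds, successThreshold, and terminationGracePeriodSeconds) are the standard Kubernetes probe schema. A minimal sketch of a liveness probe, with a hypothetical endpoint and port name:

  # fragment of a container spec
  livenessProbe:
    httpGet:
      path: /healthz           # hypothetical health endpoint
      port: http-metrics       # named port or a number between 1 and 65535
    periodSeconds: 10          # probe every 10 seconds (the default)
    failureThreshold: 3        # 3 consecutive failures mark the container unhealthy
    successThreshold: 1        # must be 1 for liveness probes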
string
Required
Name of the container specified as a DNS_LABEL. Each container in a pod must have a unique name (DNS_LABEL). Cannot be updated.
array
List of ports to expose from the container. Not specifying a port here DOES NOT prevent that port from being exposed. Any port which is listening on the default “0.0.0.0” address inside a container will be accessible from the network. Modifying this array with strategic merge patch may corrupt the data. For more information See https://github.com/kubernetes/kubernetes/issues/108255. Cannot be updated.
object
ContainerPort represents a network port in a single container.
integer
Required
Number of port to expose on the pod’s IP address. This must be a valid port number, 0 < x < 65536.
string
What host IP to bind the external port to.
integer
Number of port to expose on the host. If specified, this must be a valid port number, 0 < x < 65536. If HostNetwork is specified, this must match ContainerPort. Most containers do not need this.
string
If specified, this must be an IANA_SVC_NAME and unique within the pod. Each named port in a pod must have a unique name. Name for the port that can be referred to by services.
string
Protocol for port. Must be UDP, TCP, or SCTP. Defaults to “TCP”.
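A short sketch of the ContainerPort fields above; the port name and number are illustrative:

  # fragment of a container spec
  ports:
    - name: http-metrics       # IANA_SVC_NAME, unique within the pod
      containerPort: 9020      # must be a valid port number, 0 < x < 65536
      protocol: TCP            # UDP, TCP, or SCTP; defaults to TCP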
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
object
GRPC specifies an action involving a GRPC port. This is a beta field and requires enabling GRPCContainerProbe feature gate.
integer
Required
Port number of the gRPC service. Number must be in the range 1 to 65535.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
integer
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
object
TCPSocket specifies an action involving a TCP port.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
integer
Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod’s terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset.
boolean
AllowPrivilegeEscalation controls whether a process can gain more privileges than its parent process. This bool directly controls if the no_new_privs flag will be set on the container process. AllowPrivilegeEscalation is true always when the container is: 1) run as Privileged 2) has CAP_SYS_ADMIN Note that this field cannot be set when spec.os.name is windows.
object
The capabilities to add/drop when running containers. Defaults to the default set of capabilities granted by the container runtime. Note that this field cannot be set when spec.os.name is windows.
string
Capability represent POSIX capabilities type
string
Capability represent POSIX capabilities type
boolean
Run container in privileged mode. Processes in privileged containers are essentially equivalent to root on the host. Defaults to false. Note that this field cannot be set when spec.os.name is windows.
string
procMount denotes the type of proc mount to use for the containers. The default is DefaultProcMount which uses the container runtime defaults for readonly paths and masked paths. This requires the ProcMountType feature flag to be enabled. Note that this field cannot be set when spec.os.name is windows.
boolean
Whether this container has a read-only root filesystem. Default is false. Note that this field cannot be set when spec.os.name is windows.
integer
The GID to run the entrypoint of the container process. Uses runtime default if unset. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
boolean
Indicates that the container must run as a non-root user. If true, the Kubelet will validate the image at runtime to ensure that it does not run as UID 0 (root) and fail to start the container if it does. If unset or false, no such validation will be performed. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
integer
The UID to run the entrypoint of the container process. Defaults to user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
object
The SELinux context to be applied to the container. If unspecified, the container runtime will allocate a random SELinux context for each container. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
string
Level is SELinux level label that applies to the container.
string
Role is a SELinux role label that applies to the container.
string
Type is a SELinux type label that applies to the container.
string
User is a SELinux user label that applies to the container.
object
The seccomp options to use by this container. If seccomp options are provided at both the pod & container level, the container options override the pod options. Note that this field cannot be set when spec.os.name is windows.
string
localhostProfile indicates a profile defined in a file on the node should be used. The profile must be preconfigured on the node to work. Must be a descending path, relative to the kubelet’s configured seccomp profile location. Must only be set if type is “Localhost”.
string
Required
type indicates which kind of seccomp profile will be applied. Valid options are:
Localhost - a profile defined in a file on the node should be used. RuntimeDefault - the container runtime default profile should be used. Unconfined - no profile should be applied.
object
The Windows specific settings applied to all containers. If unspecified, the options from the PodSecurityContext will be used. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is linux.
string
GMSACredentialSpecName is the name of the GMSA credential spec to use.
boolean
HostProcess determines if a container should be run as a ‘Host Process’ container. This field is alpha-level and will only be honored by components that enable the WindowsHostProcessContainers feature flag. Setting this field without the feature flag will result in errors when validating the Pod. All of a Pod’s containers must have the same effective HostProcess value (it is not allowed to have a mix of HostProcess containers and non-HostProcess containers). In addition, if HostProcess is true then HostNetwork must also be set to true.
string
The UserName in Windows to run the entrypoint of the container process. Defaults to the user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
object
StartupProbe indicates that the Pod has successfully initialized. If specified, no other probes are executed until this completes successfully. If this probe fails, the Pod will be restarted, just as if the livenessProbe failed. This can be used to provide different probe parameters at the beginning of a Pod’s lifecycle, when it might take a long time to load data or warm a cache, than during steady-state operation. This cannot be updated. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
object
GRPC specifies an action involving a GRPC port. This is a beta field and requires enabling GRPCContainerProbe feature gate.
integer
Required
Port number of the gRPC service. Number must be in the range 1 to 65535.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
integer
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
object
TCPSocket specifies an action involving a TCP port.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
integer
Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod’s terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset.
boolean
Whether this container should allocate a buffer for stdin in the container runtime. If this is not set, reads from stdin in the container will always result in EOF. Default is false.
boolean
Whether the container runtime should close the stdin channel after it has been opened by a single attach. When stdin is true the stdin stream will remain open across multiple attach sessions. If stdinOnce is set to true, stdin is opened on container start, is empty until the first client attaches to stdin, and then remains open and accepts data until the client disconnects, at which time stdin is closed and remains closed until the container is restarted. If this flag is false, a container process that reads from stdin will never receive an EOF. Default is false.
string
Optional: Path at which the file to which the container’s termination message will be written is mounted into the container’s filesystem. Message written is intended to be brief final status, such as an assertion failure message. Will be truncated by the node if greater than 4096 bytes. The total message length across all containers will be limited to 12kb. Defaults to /dev/termination-log. Cannot be updated.
string
Indicate how the termination message should be populated. File will use the contents of terminationMessagePath to populate the container status message on both success and failure. FallbackToLogsOnError will use the last chunk of container log output if the termination message file is empty and the container exited with an error. The log output is limited to 2048 bytes or 80 lines, whichever is smaller. Defaults to File. Cannot be updated.
boolean
Whether this container should allocate a TTY for itself, also requires ‘stdin’ to be true. Default is false.
array
volumeDevices is the list of block devices to be used by the container.
object
volumeDevice describes a mapping of a raw block device within a container.
string
Required
devicePath is the path inside of the container that the device will be mapped to.
string
Required
name must match the name of a persistentVolumeClaim in the pod
array
Pod volumes to mount into the container’s filesystem. Cannot be updated.
object
VolumeMount describes a mounting of a Volume within a container.
string
Required
Path within the container at which the volume should be mounted. Must not contain ‘:’.
string
mountPropagation determines how mounts are propagated from the host to container and the other way around. When not set, MountPropagationNone is used. This field is beta in 1.10.
string
Required
This must match the Name of a Volume.
boolean
Mounted read-only if true, read-write otherwise (false or unspecified). Defaults to false.
string
Path within the volume from which the container’s volume should be mounted. Defaults to “” (volume’s root).
string
Expanded path within the volume from which the container’s volume should be mounted. Behaves similarly to SubPath but environment variable references $(VAR_NAME) are expanded using the container’s environment. Defaults to “” (volume’s root). SubPathExpr and SubPath are mutually exclusive.
string
Container’s working directory. If not specified, the container runtime’s default will be used, which might be configured in the container image. Cannot be updated.
string
Override for the default log4j configuration
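A sketch of overriding the broker's default log4j configuration. The log4jConfig field name and the surrounding brokerConfig placement are assumptions based on typical Koperator examples:

  brokerConfig:
    log4jConfig: |                     # assumed field name for the log4j override
      log4j.rootLogger=INFO, stdout
      log4j.appender.stdout=org.apache.log4j.ConsoleAppender
      log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
      log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n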
object
Network throughput information in kB/s used by Cruise Control to determine broker network capacity. By default it is set to 125000, which means 1 Gbit/s in network throughput.
object
External listeners that use a NodePort type service to expose the broker outside the Kubernetes cluster, mapped to the external IP to advertise for the Kafka broker's external listener. The external IP value is ignored for external listeners that use a LoadBalancer type service to expose the broker outside the Kubernetes cluster. Also, when the "hostnameOverride" field of the external listener is set, it overrides the broker's advertised external listener address as described for the "hostnameOverride" field.
string
When "hostnameOverride" and brokerConfig.nodePortExternalIP are empty and the NodePort access method is selected for an external listener, nodePortNodeAddressType defines which address type of the Kafka broker's Kubernetes node shall be used in the advertised.listeners property (see https://kubernetes.io/docs/concepts/architecture/nodes/#addresses). Possible values are Hostname, ExternalIP, InternalIP, InternalDNS, ExternalDNS.
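A sketch of the NodePort-based external access fields described above; the listener name, the example address, and the exact nodePortNodeAddressType field name are assumptions inferred from this description:

  brokerConfig:
    # Advertise a fixed IP for the "external1" listener (illustrative listener name and address):
    nodePortExternalIP:
      external1: 192.0.2.10
    # Alternatively, when nodePortExternalIP and hostnameOverride are empty,
    # advertise one of the Kubernetes node's addresses instead:
    # nodePortNodeAddressType: ExternalIP   # Hostname | ExternalIP | InternalIP | InternalDNS | ExternalDNS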
object
PodSecurityContext holds pod-level security attributes and common container settings. Some fields are also present in container.securityContext. Field values of container.securityContext take precedence over field values of PodSecurityContext.
integer
A special supplemental group that applies to all containers in a pod. Some volume types allow the Kubelet to change the ownership of that volume to be owned by the pod:
1. The owning GID will be the FSGroup 2. The setgid bit is set (new files created in the volume will be owned by FSGroup) 3. The permission bits are OR'd with rw-rw----
If unset, the Kubelet will not modify the ownership and permissions of any volume. Note that this field cannot be set when spec.os.name is windows.
string
fsGroupChangePolicy defines behavior of changing ownership and permission of the volume before being exposed inside Pod. This field will only apply to volume types which support fsGroup based ownership(and permissions). It will have no effect on ephemeral volume types such as: secret, configmaps and emptydir. Valid values are “OnRootMismatch” and “Always”. If not specified, “Always” is used. Note that this field cannot be set when spec.os.name is windows.
integer
The GID to run the entrypoint of the container process. Uses runtime default if unset. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence for that container. Note that this field cannot be set when spec.os.name is windows.
boolean
Indicates that the container must run as a non-root user. If true, the Kubelet will validate the image at runtime to ensure that it does not run as UID 0 (root) and fail to start the container if it does. If unset or false, no such validation will be performed. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
integer
The UID to run the entrypoint of the container process. Defaults to user specified in image metadata if unspecified. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence for that container. Note that this field cannot be set when spec.os.name is windows.
object
The SELinux context to be applied to all containers. If unspecified, the container runtime will allocate a random SELinux context for each container. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence for that container. Note that this field cannot be set when spec.os.name is windows.
string
Level is SELinux level label that applies to the container.
string
Role is a SELinux role label that applies to the container.
string
Type is a SELinux type label that applies to the container.
string
User is a SELinux user label that applies to the container.
object
The seccomp options to use by the containers in this pod. Note that this field cannot be set when spec.os.name is windows.
string
localhostProfile indicates a profile defined in a file on the node should be used. The profile must be preconfigured on the node to work. Must be a descending path, relative to the kubelet’s configured seccomp profile location. Must only be set if type is “Localhost”.
string
Required
type indicates which kind of seccomp profile will be applied. Valid options are:
Localhost - a profile defined in a file on the node should be used. RuntimeDefault - the container runtime default profile should be used. Unconfined - no profile should be applied.
array
A list of groups applied to the first process run in each container, in addition to the container’s primary GID. If unspecified, no groups will be added to any container. Note that this field cannot be set when spec.os.name is windows.
array
Sysctls hold a list of namespaced sysctls used for the pod. Pods with unsupported sysctls (by the container runtime) might fail to launch. Note that this field cannot be set when spec.os.name is windows.
object
Sysctl defines a kernel parameter to be set
string
Required
Name of a property to set
string
Required
Value of a property to set
object
The Windows specific settings applied to all containers. If unspecified, the options within a container’s SecurityContext will be used. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is linux.
string
GMSACredentialSpecName is the name of the GMSA credential spec to use.
boolean
HostProcess determines if a container should be run as a ‘Host Process’ container. This field is alpha-level and will only be honored by components that enable the WindowsHostProcessContainers feature flag. Setting this field without the feature flag will result in errors when validating the Pod. All of a Pod’s containers must have the same effective HostProcess value (it is not allowed to have a mix of HostProcess containers and non-HostProcess containers). In addition, if HostProcess is true then HostNetwork must also be set to true.
string
The UserName in Windows to run the entrypoint of the container process. Defaults to the user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
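The pod-level security fields above are the standard Kubernetes PodSecurityContext. A minimal sketch with hypothetical UID/GID values, assuming the field sits under brokerConfig as podSecurityContext:

  brokerConfig:
    podSecurityContext:
      runAsNonRoot: true
      runAsUser: 1001          # hypothetical UID
      runAsGroup: 1001         # hypothetical GID
      fsGroup: 1001            # volumes become group-owned by this GID
      seccompProfile:
        type: RuntimeDefault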
string
PriorityClassName specifies the priority class name for a broker pod(s). If specified, the PriorityClass resource with this PriorityClassName must be created beforehand. If not specified, the broker pods’ priority is default to zero.
object
ResourceRequirements describes the compute resource requirements.
object
SecurityContext allows to set security context for the kafka container
boolean
AllowPrivilegeEscalation controls whether a process can gain more privileges than its parent process. This bool directly controls if the no_new_privs flag will be set on the container process. AllowPrivilegeEscalation is true always when the container is: 1) run as Privileged 2) has CAP_SYS_ADMIN Note that this field cannot be set when spec.os.name is windows.
object
The capabilities to add/drop when running containers. Defaults to the default set of capabilities granted by the container runtime. Note that this field cannot be set when spec.os.name is windows.
string
Capability represent POSIX capabilities type
string
Capability represent POSIX capabilities type
boolean
Run container in privileged mode. Processes in privileged containers are essentially equivalent to root on the host. Defaults to false. Note that this field cannot be set when spec.os.name is windows.
string
procMount denotes the type of proc mount to use for the containers. The default is DefaultProcMount which uses the container runtime defaults for readonly paths and masked paths. This requires the ProcMountType feature flag to be enabled. Note that this field cannot be set when spec.os.name is windows.
boolean
Whether this container has a read-only root filesystem. Default is false. Note that this field cannot be set when spec.os.name is windows.
integer
The GID to run the entrypoint of the container process. Uses runtime default if unset. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
boolean
Indicates that the container must run as a non-root user. If true, the Kubelet will validate the image at runtime to ensure that it does not run as UID 0 (root) and fail to start the container if it does. If unset or false, no such validation will be performed. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
integer
The UID to run the entrypoint of the container process. Defaults to user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
object
The SELinux context to be applied to the container. If unspecified, the container runtime will allocate a random SELinux context for each container. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
string
Level is SELinux level label that applies to the container.
string
Role is a SELinux role label that applies to the container.
string
Type is a SELinux type label that applies to the container.
string
User is a SELinux user label that applies to the container.
object
The seccomp options to use by this container. If seccomp options are provided at both the pod & container level, the container options override the pod options. Note that this field cannot be set when spec.os.name is windows.
string
localhostProfile indicates a profile defined in a file on the node should be used. The profile must be preconfigured on the node to work. Must be a descending path, relative to the kubelet’s configured seccomp profile location. Must only be set if type is “Localhost”.
string
Required
type indicates which kind of seccomp profile will be applied. Valid options are:
Localhost - a profile defined in a file on the node should be used. RuntimeDefault - the container runtime default profile should be used. Unconfined - no profile should be applied.
object
The Windows specific settings applied to all containers. If unspecified, the options from the PodSecurityContext will be used. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is linux.
string
GMSACredentialSpecName is the name of the GMSA credential spec to use.
boolean
HostProcess determines if a container should be run as a ‘Host Process’ container. This field is alpha-level and will only be honored by components that enable the WindowsHostProcessContainers feature flag. Setting this field without the feature flag will result in errors when validating the Pod. All of a Pod’s containers must have the same effective HostProcess value (it is not allowed to have a mix of HostProcess containers and non-HostProcess containers). In addition, if HostProcess is true then HostNetwork must also be set to true.
string
The UserName in Windows to run the entrypoint of the container process. Defaults to the user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
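A minimal sketch of the container-level SecurityContext for the Kafka container described above; whether these particular settings suit a given deployment is a judgment call, and the brokerConfig placement is assumed:

  brokerConfig:
    securityContext:
      allowPrivilegeEscalation: false
      runAsNonRoot: true
      capabilities:
        drop: ["ALL"]          # drop all POSIX capabilities
      seccompProfile:
        type: RuntimeDefault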
object
StorageConfig defines the broker storage configuration
object
If set, https://kubernetes.io/docs/concepts/storage/volumes#emptydir is used as storage for Kafka broker log dirs. The use of emptyDir as Kafka broker storage is useful in development environments where data loss is not a concern, as data stored on emptyDir-backed storage is lost at pod restarts. Either pvcSpec or emptyDir has to be set. When both pvcSpec and emptyDir fields are set, pvcSpec is used by default.
sizeLimit is the total amount of local storage required for this EmptyDir volume. The size limit is also applicable for memory medium. The maximum usage on memory medium EmptyDir would be the minimum value between the SizeLimit specified here and the sum of memory limits of all containers in a pod. The default is nil which means that the limit is undefined. More info: https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
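A sketch of a broker storage configuration combining the fields above. The storageConfigs list and mountPath field names are assumptions based on common Koperator examples; pvcSpec is a standard PersistentVolumeClaimSpec and emptyDir is the standard EmptyDirVolumeSource:

  brokerConfig:
    storageConfigs:                      # assumed field name for the list of storage configs
      - mountPath: /kafka-logs/kafka     # assumed mount path field
        pvcSpec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: standard     # hypothetical storage class
      # For throwaway development clusters an emptyDir can be used instead
      # (data is lost when the pod restarts):
      # - mountPath: /kafka-logs/kafka
      #   emptyDir:
      #     sizeLimit: 2Gi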
object
dataSource field can be used to specify either: * An existing VolumeSnapshot object (snapshot.storage.k8s.io/VolumeSnapshot) * An existing PVC (PersistentVolumeClaim) If the provisioner or an external controller can support the specified data source, it will create a new volume based on the contents of the specified data source. If the AnyVolumeDataSource feature gate is enabled, this field will always have the same contents as the DataSourceRef field.
string
APIGroup is the group for the resource being referenced. If APIGroup is not specified, the specified Kind must be in the core API group. For any other third-party types, APIGroup is required.
string
Required
Kind is the type of resource being referenced
string
Required
Name is the name of resource being referenced
object
dataSourceRef specifies the object from which to populate the volume with data, if a non-empty volume is desired. This may be any local object from a non-empty API group (non core object) or a PersistentVolumeClaim object. When this field is specified, volume binding will only succeed if the type of the specified object matches some installed volume populator or dynamic provisioner. This field will replace the functionality of the DataSource field and as such if both fields are non-empty, they must have the same value. For backwards compatibility, both fields (DataSource and DataSourceRef) will be set to the same value automatically if one of them is empty and the other is non-empty. There are two important differences between DataSource and DataSourceRef: * While DataSource only allows two specific types of objects, DataSourceRef allows any non-core object, as well as PersistentVolumeClaim objects. * While DataSource ignores disallowed values (dropping them), DataSourceRef preserves all values, and generates an error if a disallowed value is specified. (Beta) Using this field requires the AnyVolumeDataSource feature gate to be enabled.
string
APIGroup is the group for the resource being referenced. If APIGroup is not specified, the specified Kind must be in the core API group. For any other third-party types, APIGroup is required.
string
Required
Kind is the type of resource being referenced
string
Required
Name is the name of resource being referenced
object
resources represents the minimum resources the volume should have. If RecoverVolumeExpansionFailure feature is enabled users are allowed to specify resource requirements that are lower than previous value but must still be higher than capacity recorded in the status field of the claim. More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#resources
object
selector is a label query over volumes to consider for binding.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
string
volumeMode defines what type of volume is required by the claim. Value of Filesystem is implied when not included in claim spec.
string
volumeName is the binding reference to the PersistentVolume backing this claim.
integer
TerminationGracePeriod defines the pod termination grace period
object
The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>.
string
Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
string
Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
string
Operator represents a key’s relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
integer
TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
string
Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.
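The fields above are the standard Kubernetes Toleration schema. A sketch with hypothetical taint keys, assuming the tolerations list sits under brokerConfig:

  brokerConfig:
    tolerations:
      - key: "dedicated"                 # hypothetical taint key
        operator: "Equal"
        value: "kafka"
        effect: "NoSchedule"
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 300           # tolerate the taint for 5 minutes before eviction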
array
VolumeMounts define some extra Kubernetes VolumeMounts for the Kafka broker Pods.
object
VolumeMount describes a mounting of a Volume within a container.
string
Required
Path within the container at which the volume should be mounted. Must not contain ‘:’.
string
mountPropagation determines how mounts are propagated from the host to container and the other way around. When not set, MountPropagationNone is used. This field is beta in 1.10.
string
Required
This must match the Name of a Volume.
boolean
Mounted read-only if true, read-write otherwise (false or unspecified). Defaults to false.
string
Path within the volume from which the container’s volume should be mounted. Defaults to “” (volume’s root).
string
Expanded path within the volume from which the container’s volume should be mounted. Behaves similarly to SubPath but environment variable references $(VAR_NAME) are expanded using the container’s environment. Defaults to “” (volume’s root). SubPathExpr and SubPath are mutually exclusive.
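A sketch of an extra VolumeMount for the broker pods using the fields above; the volume name and path are illustrative and must match an entry in the volumes list described next:

  brokerConfig:
    volumeMounts:
      - name: extra-config               # must match the name of a Volume (see volumes below)
        mountPath: /config/extra         # hypothetical path; must not contain ':'
        readOnly: true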
array
Volumes define some extra Kubernetes Volumes for the Kafka broker Pods.
object
Volume represents a named volume in a pod that may be accessed by any container in the pod.
string
fsType is the filesystem type of the volume that you want to mount. Tip: Ensure that the filesystem type is supported by the host operating system. Examples: “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. More info: https://kubernetes.io/docs/concepts/storage/volumes#awselasticblockstore TODO: how do we prevent errors in the filesystem from compromising the machine
integer
partition is the partition in the volume that you want to mount. If omitted, the default is to mount by volume name. Examples: For volume /dev/sda1, you specify the partition as “1”. Similarly, the volume partition for /dev/sda is “0” (or you can leave the property empty).
object
azureDisk represents an Azure Data Disk mount on the host and bind mount to the pod.
string
cachingMode is the Host Caching mode: None, Read Only, Read Write.
string
Required
diskName is the Name of the data disk in the blob storage
string
Required
diskURI is the URI of data disk in the blob storage
string
fsType is Filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified.
string
kind expected values are Shared (multiple blob disks per storage account), Dedicated (single blob disk per storage account), and Managed (azure managed data disk, only in managed availability set). Defaults to Shared.
boolean
readOnly Defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
object
azureFile represents an Azure File Service mount on the host and bind mount to the pod.
boolean
readOnly defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
string
Required
secretName is the name of secret that contains Azure Storage Account Name and Key
string
Required
shareName is the azure share Name
object
cephFS represents a Ceph FS mount on the host that shares a pod’s lifetime
string
path is Optional: Used as the mounted root, rather than the full Ceph tree, default is /
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Examples: “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. More info: https://examples.k8s.io/mysql-cinder-pd/README.md
object
secretRef is optional: points to a secret object containing parameters used to connect to OpenStack.
object
configMap represents a configMap that should populate this volume
integer
defaultMode is optional: mode bits used to set permissions on created files by default. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. Defaults to 0644. Directories within the path are not affected by this setting. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
array
items if unspecified, each key-value pair in the Data field of the referenced ConfigMap will be projected into the volume as a file whose name is the key and content is the value. If specified, the listed keys will be projected into the specified paths, and unlisted keys will not be present. If a key is specified which is not present in the ConfigMap, the volume setup will error unless it is marked optional. Paths must be relative and may not contain the ‘..’ path or start with ‘..’.
object
Maps a string key to a path within a volume.
string
Required
key is the key to project.
integer
mode is Optional: mode bits used to set permissions on this file. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
path is the relative path of the file to map the key to. May not be an absolute path. May not contain the path element ‘..’. May not start with the string ‘..’.
boolean
optional specify whether the ConfigMap or its keys must be defined
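A sketch of an extra Volume backed by a ConfigMap, matching the mount sketched earlier; the ConfigMap name and keys are hypothetical:

  brokerConfig:
    volumes:
      - name: extra-config               # referenced by the volumeMount sketched above
        configMap:
          name: kafka-extra-config       # hypothetical ConfigMap in the broker's namespace
          defaultMode: 0644              # octal (YAML) or decimal (JSON) mode bits
          items:
            - key: broker.properties
              path: broker.properties
          optional: false                # the ConfigMap must exist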
object
csi (Container Storage Interface) represents ephemeral storage that is handled by certain external CSI drivers (Beta feature).
string
Required
driver is the name of the CSI driver that handles this volume. Consult with your admin for the correct name as registered in the cluster.
string
fsType to mount. Ex. “ext4”, “xfs”, “ntfs”. If not provided, the empty value is passed to the associated CSI driver which will determine the default filesystem to apply.
object
nodePublishSecretRef is a reference to the secret object containing sensitive information to pass to the CSI driver to complete the CSI NodePublishVolume and NodeUnpublishVolume calls. This field is optional, and may be empty if no secret is required. If the secret object contains more than one secret, all secret references are passed.
boolean
readOnly specifies a read-only configuration for the volume. Defaults to false (read/write).
object
volumeAttributes stores driver-specific properties that are passed to the CSI driver. Consult your driver’s documentation for supported values.
object
downwardAPI represents downward API about the pod that should populate this volume
integer
Optional: mode bits used to set permissions on created files by default. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. Defaults to 0644. Directories within the path are not affected by this setting. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
array
Items is a list of downward API volume file
object
DownwardAPIVolumeFile represents information to create the file containing the pod field
object
Required: Selects a field of the pod: only annotations, labels, name and namespace are supported.
string
Version of the schema the FieldPath is written in terms of, defaults to “v1”.
string
Required
Path of the field to select in the specified API version.
integer
Optional: mode bits used to set permissions on this file, must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
Required: Path is the relative path name of the file to be created. Must not be absolute or contain the ‘..’ path. Must be utf-8 encoded. The first item of the relative path must not start with ‘..’
object
Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, requests.cpu and requests.memory) are currently supported.
string
Container name: required for volumes, optional for env vars
Specifies the output format of the exposed resources, defaults to “1”
string
Required
Required: resource to select
sizeLimit is the total amount of local storage required for this EmptyDir volume. The size limit is also applicable for memory medium. The maximum usage on memory medium EmptyDir would be the minimum value between the SizeLimit specified here and the sum of memory limits of all containers in a pod. The default is nil which means that the limit is undefined. More info: https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
object
ephemeral represents a volume that is handled by a cluster storage driver. The volume’s lifecycle is tied to the pod that defines it - it will be created before the pod starts, and deleted when the pod is removed.
Use this if: a) the volume is only needed while the pod runs, b) features of normal volumes like restoring from snapshot or capacity tracking are needed, c) the storage driver is specified through a storage class, and d) the storage driver supports dynamic volume provisioning through a PersistentVolumeClaim (see EphemeralVolumeSource for more information on the connection between this volume type and PersistentVolumeClaim).
Use PersistentVolumeClaim or one of the vendor-specific APIs for volumes that persist for longer than the lifecycle of an individual pod.
Use CSI for light-weight local ephemeral volumes if the CSI driver is meant to be used that way - see the documentation of the driver for more information.
A pod can use both types of ephemeral volumes and persistent volumes at the same time.
object
Will be used to create a stand-alone PVC to provision the volume. The pod in which this EphemeralVolumeSource is embedded will be the owner of the PVC, i.e. the PVC will be deleted together with the pod. The name of the PVC will be <pod name>-<volume name> where <volume name> is the name from the PodSpec.Volumes array entry. Pod validation will reject the pod if the concatenated name is not valid for a PVC (for example, too long).
An existing PVC with that name that is not owned by the pod will not be used for the pod to avoid using an unrelated volume by mistake. Starting the pod is then blocked until the unrelated PVC is removed. If such a pre-created PVC is meant to be used by the pod, the PVC has to updated with an owner reference to the pod once the pod exists. Normally this should not be necessary, but it may be useful when manually reconstructing a broken cluster.
This field is read-only and no changes will be made by Kubernetes to the PVC after it has been created.
Required, must not be nil.
object
May contain labels and annotations that will be copied into the PVC when creating it. No other fields are allowed and will be rejected during validation.
object
Required
The specification for the PersistentVolumeClaim. The entire content is copied unchanged into the PVC that gets created from this template. The same fields as in a PersistentVolumeClaim are also valid here.
object
dataSource field can be used to specify either: * An existing VolumeSnapshot object (snapshot.storage.k8s.io/VolumeSnapshot) * An existing PVC (PersistentVolumeClaim) If the provisioner or an external controller can support the specified data source, it will create a new volume based on the contents of the specified data source. If the AnyVolumeDataSource feature gate is enabled, this field will always have the same contents as the DataSourceRef field.
string
APIGroup is the group for the resource being referenced. If APIGroup is not specified, the specified Kind must be in the core API group. For any other third-party types, APIGroup is required.
string
Required
Kind is the type of resource being referenced
string
Required
Name is the name of resource being referenced
object
dataSourceRef specifies the object from which to populate the volume with data, if a non-empty volume is desired. This may be any local object from a non-empty API group (non core object) or a PersistentVolumeClaim object. When this field is specified, volume binding will only succeed if the type of the specified object matches some installed volume populator or dynamic provisioner. This field will replace the functionality of the DataSource field and as such if both fields are non-empty, they must have the same value. For backwards compatibility, both fields (DataSource and DataSourceRef) will be set to the same value automatically if one of them is empty and the other is non-empty. There are two important differences between DataSource and DataSourceRef: * While DataSource only allows two specific types of objects, DataSourceRef allows any non-core object, as well as PersistentVolumeClaim objects. * While DataSource ignores disallowed values (dropping them), DataSourceRef preserves all values, and generates an error if a disallowed value is specified. (Beta) Using this field requires the AnyVolumeDataSource feature gate to be enabled.
string
APIGroup is the group for the resource being referenced. If APIGroup is not specified, the specified Kind must be in the core API group. For any other third-party types, APIGroup is required.
string
Required
Kind is the type of resource being referenced
string
Required
Name is the name of resource being referenced
object
resources represents the minimum resources the volume should have. If RecoverVolumeExpansionFailure feature is enabled users are allowed to specify resource requirements that are lower than previous value but must still be higher than capacity recorded in the status field of the claim. More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#resources
object
selector is a label query over volumes to consider for binding.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
string
volumeMode defines what type of volume is required by the claim. Value of Filesystem is implied when not included in claim spec.
string
volumeName is the binding reference to the PersistentVolume backing this claim.
object
fc represents a Fibre Channel resource that is attached to a kubelet’s host machine and then exposed to the pod.
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. TODO: how do we prevent errors in the filesystem from compromising the machine
integer
lun is Optional: FC target lun number
boolean
readOnly is Optional: Defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
array
targetWWNs is Optional: FC target worldwide names (WWNs)
array
wwids Optional: FC volume world wide identifiers (wwids) Either wwids or combination of targetWWNs and lun must be set, but not both simultaneously.
object
flexVolume represents a generic volume resource that is provisioned/attached using an exec based plugin.
string
Required
driver is the name of the driver to use for this volume.
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. The default filesystem depends on FlexVolume script.
object
options is Optional: this field holds extra command options if any.
boolean
readOnly is Optional: defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
object
secretRef is Optional: secretRef is reference to the secret object containing sensitive information to pass to the plugin scripts. This may be empty if no secret object is specified. If the secret object contains more than one secret, all secrets are passed to the plugin scripts.
object
flocker represents a Flocker volume attached to a kubelet’s host machine. This depends on the Flocker control service being running
string
datasetName is the name of the dataset, stored as metadata -> name on the dataset for Flocker; it should be considered as deprecated.
string
datasetUUID is the UUID of the dataset. This is unique identifier of a Flocker dataset
string
fsType is filesystem type of the volume that you want to mount. Tip: Ensure that the filesystem type is supported by the host operating system. Examples: “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. More info: https://kubernetes.io/docs/concepts/storage/volumes#gcepersistentdisk TODO: how do we prevent errors in the filesystem from compromising the machine
integer
partition is the partition in the volume that you want to mount. If omitted, the default is to mount by volume name. Examples: For volume /dev/sda1, you specify the partition as “1”. Similarly, the volume partition for /dev/sda is “0” (or you can leave the property empty). More info: https://kubernetes.io/docs/concepts/storage/volumes#gcepersistentdisk
object
gitRepo represents a git repository at a particular revision. DEPRECATED: GitRepo is deprecated. To provision a container with a git repo, mount an EmptyDir into an InitContainer that clones the repo using git, then mount the EmptyDir into the Pod’s container.
string
directory is the target directory name. Must not contain or start with ‘..’. If ‘.’ is supplied, the volume directory will be the git repository. Otherwise, if specified, the volume will contain the git repository in the subdirectory with the given name.
string
revision is the commit hash for the specified revision.
object
hostPath represents a pre-existing file or directory on the host machine that is directly exposed to the container. This is generally used for system agents or other privileged things that are allowed to see the host machine. Most containers will NOT need this. More info: https://kubernetes.io/docs/concepts/storage/volumes#hostpath — TODO(jonesdl) We need to restrict who can use host directory mounts and who can/can not mount host directories as read/write.
boolean
chapAuthDiscovery defines whether support iSCSI Discovery CHAP authentication
boolean
chapAuthSession defines whether support iSCSI Session CHAP authentication
string
fsType is the filesystem type of the volume that you want to mount. Tip: Ensure that the filesystem type is supported by the host operating system. Examples: “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. More info: https://kubernetes.io/docs/concepts/storage/volumes#iscsi TODO: how do we prevent errors in the filesystem from compromising the machine
string
initiatorName is the custom iSCSI Initiator Name. If initiatorName is specified with iscsiInterface simultaneously, new iSCSI interface : will be created for the connection.
string
Required
iqn is the target iSCSI Qualified Name.
string
iscsiInterface is the interface Name that uses an iSCSI transport. Defaults to ‘default’ (tcp).
integer
Required
lun represents iSCSI Target Lun number.
array
portals is the iSCSI Target Portal List. The portal is either an IP or ip_addr:port if the port is other than default (typically TCP ports 860 and 3260).
boolean
readOnly here will force the ReadOnly setting in VolumeMounts. Defaults to false.
object
secretRef is the CHAP Secret for iSCSI target and initiator authentication
string
Required
targetPortal is iSCSI Target Portal. The Portal is either an IP or ip_addr:port if the port is other than default (typically TCP ports 860 and 3260).
boolean
readOnly Will force the ReadOnly setting in VolumeMounts. Default false.
object
photonPersistentDisk represents a PhotonController persistent disk attached and mounted on kubelets host machine
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified.
string
Required
pdID is the ID that identifies Photon Controller persistent disk
object
portworxVolume represents a portworx volume attached and mounted on kubelets host machine
string
fSType represents the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”. Implicitly inferred to be “ext4” if unspecified.
boolean
readOnly defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
string
Required
volumeID uniquely identifies a Portworx volume
object
projected items for all-in-one resources: secrets, configmaps, and downward API
integer
defaultMode are the mode bits used to set permissions on created files by default. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. Directories within the path are not affected by this setting. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
array
sources is the list of volume projections
object
Projection that may be projected along with other supported volume types
object
configMap information about the configMap data to project
array
items if unspecified, each key-value pair in the Data field of the referenced ConfigMap will be projected into the volume as a file whose name is the key and content is the value. If specified, the listed keys will be projected into the specified paths, and unlisted keys will not be present. If a key is specified which is not present in the ConfigMap, the volume setup will error unless it is marked optional. Paths must be relative and may not contain the ‘..’ path or start with ‘..’.
object
Maps a string key to a path within a volume.
string
Required
key is the key to project.
integer
mode is Optional: mode bits used to set permissions on this file. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
path is the relative path of the file to map the key to. May not be an absolute path. May not contain the path element ‘..’. May not start with the string ‘..’.
boolean
optional specify whether the ConfigMap or its keys must be defined
object
downwardAPI information about the downwardAPI data to project
array
Items is a list of DownwardAPIVolume files
object
DownwardAPIVolumeFile represents information to create the file containing the pod field
object
Required: Selects a field of the pod: only annotations, labels, name and namespace are supported.
string
Version of the schema the FieldPath is written in terms of, defaults to “v1”.
string
Required
Path of the field to select in the specified API version.
integer
Optional: mode bits used to set permissions on this file, must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
Required: Path is the relative path name of the file to be created. Must not be absolute or contain the ‘..’ path. Must be utf-8 encoded. The first item of the relative path must not start with ‘..’
object
Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, requests.cpu and requests.memory) are currently supported.
string
Container name: required for volumes, optional for env vars
Specifies the output format of the exposed resources, defaults to “1”
string
Required
Required: resource to select
object
secret information about the secret data to project
array
items if unspecified, each key-value pair in the Data field of the referenced Secret will be projected into the volume as a file whose name is the key and content is the value. If specified, the listed keys will be projected into the specified paths, and unlisted keys will not be present. If a key is specified which is not present in the Secret, the volume setup will error unless it is marked optional. Paths must be relative and may not contain the ‘..’ path or start with ‘..’.
object
Maps a string key to a path within a volume.
string
Required
key is the key to project.
integer
mode is Optional: mode bits used to set permissions on this file. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
path is the relative path of the file to map the key to. May not be an absolute path. May not contain the path element ‘..’. May not start with the string ‘..’.
boolean
optional field specify whether the Secret or its key must be defined
object
serviceAccountToken is information about the serviceAccountToken data to project
string
audience is the intended audience of the token. A recipient of a token must identify itself with an identifier specified in the audience of the token, and otherwise should reject the token. The audience defaults to the identifier of the apiserver.
integer
expirationSeconds is the requested duration of validity of the service account token. As the token approaches expiration, the kubelet volume plugin will proactively rotate the service account token. The kubelet will start trying to rotate the token if the token is older than 80 percent of its time to live or if the token is older than 24 hours. Defaults to 1 hour and must be at least 10 minutes.
string
Required
path is the path relative to the mount point of the file to project the token into.
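A minimal sketch of a projected volume combining the source types described above (configMap, secret, downwardAPI and serviceAccountToken); the ConfigMap/Secret names, keys and the token audience are hypothetical:

```yaml
volumes:
  - name: all-in-one
    projected:
      defaultMode: 420                   # decimal for octal 0644; JSON only accepts decimal
      sources:
        - configMap:
            name: app-config             # hypothetical ConfigMap
            items:
              - key: settings.properties
                path: config/settings.properties
                mode: 292                # decimal for octal 0444
        - secret:
            name: app-secret             # hypothetical Secret
            items:
              - key: password
                path: secrets/password
        - downwardAPI:
            items:
              - path: labels
                fieldRef:
                  fieldPath: metadata.labels
        - serviceAccountToken:
            audience: vault              # hypothetical audience
            expirationSeconds: 3600
            path: token
```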
object
quobyte represents a Quobyte mount on the host that shares a pod’s lifetime
string
group to map volume access to. Default is no group.
boolean
readOnly here will force the Quobyte volume to be mounted with read-only permissions. Defaults to false.
string
Required
registry represents a single or multiple Quobyte Registry services specified as a string as host:port pair (multiple entries are separated with commas) which acts as the central registry for volumes
string
tenant owning the given Quobyte volume in the Backend. Used with dynamically provisioned Quobyte volumes; the value is set by the plugin.
string
user to map volume access to. Defaults to the serviceaccount user.
string
Required
volume is a string that references an already created Quobyte volume by name.
string
fsType is the filesystem type of the volume that you want to mount. Tip: Ensure that the filesystem type is supported by the host operating system. Examples: “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. More info: https://kubernetes.io/docs/concepts/storage/volumes#rbd TODO: how do we prevent errors in the filesystem from compromising the machine
object
scaleIO represents a ScaleIO persistent volume attached and mounted on Kubernetes nodes.
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Default is “xfs”.
string
Required
gateway is the host address of the ScaleIO API Gateway.
string
protectionDomain is the name of the ScaleIO Protection Domain for the configured storage.
boolean
readOnly Defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
object
Required
secretRef references to the secret for ScaleIO user and other sensitive information. If this is not provided, Login operation will fail.
boolean
sslEnabled is a flag to enable/disable SSL communication with the Gateway; default false.
string
storageMode indicates whether the storage for a volume should be ThickProvisioned or ThinProvisioned. Default is ThinProvisioned.
string
storagePool is the ScaleIO Storage Pool associated with the protection domain.
string
Required
system is the name of the storage system as configured in ScaleIO.
string
volumeName is the name of a volume already created in the ScaleIO system that is associated with this volume source.
integer
defaultMode is Optional: mode bits used to set permissions on created files by default. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. Defaults to 0644. Directories within the path are not affected by this setting. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
array
items If unspecified, each key-value pair in the Data field of the referenced Secret will be projected into the volume as a file whose name is the key and content is the value. If specified, the listed keys will be projected into the specified paths, and unlisted keys will not be present. If a key is specified which is not present in the Secret, the volume setup will error unless it is marked optional. Paths must be relative and may not contain the ‘..’ path or start with ‘..’.
object
Maps a string key to a path within a volume.
string
Required
key is the key to project.
integer
mode is Optional: mode bits used to set permissions on this file. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
path is the relative path of the file to map the key to. May not be an absolute path. May not contain the path element ‘..’. May not start with the string ‘..’.
boolean
optional field specify whether the Secret or its keys must be defined
object
storageOS represents a StorageOS volume attached and mounted on Kubernetes nodes.
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified.
boolean
readOnly defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
object
secretRef specifies the secret to use for obtaining the StorageOS API credentials. If not specified, default values will be attempted.
string
volumeName is the human-readable name of the StorageOS volume. Volume names are only unique within a namespace.
string
volumeNamespace specifies the scope of the volume within StorageOS. If no namespace is specified then the Pod’s namespace will be used. This allows the Kubernetes name scoping to be mirrored within StorageOS for tighter integration. Set VolumeName to any name to override the default behaviour. Set to “default” if you are not using namespaces within StorageOS. Namespaces that do not pre-exist within StorageOS will be created.
object
vsphereVolume represents a vSphere volume attached and mounted on kubelets host machine
string
fsType is filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified.
string
storagePolicyID is the storage Policy Based Management (SPBM) profile ID associated with the StoragePolicyName.
string
storagePolicyName is the storage Policy Based Management (SPBM) profile name.
string
Required
volumePath is the path that identifies vSphere volume vmdk
object
ClientSSLCertSecret is a reference to the Kubernetes secret that provides a custom client SSL certificate. It is used by Koperator, Cruise Control, and the Cruise Control metrics reporter to communicate over SSL with the internal listener used for inter-broker communication. The client certificate must share the same chain of trust as the server certificate used by the corresponding internal listener. The secret must contain the keystore and truststore JKS files and their password, base64 encoded, under the keystore.jks, truststore.jks, and password data fields.
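A minimal sketch of such a secret; the name is hypothetical, and the data values must be the base64-encoded JKS keystore, truststore and their password:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: client-ssl-cert                  # hypothetical name, referenced from the KafkaCluster CR
type: Opaque
data:
  keystore.jks: ""                       # base64-encoded client keystore
  truststore.jks: ""                     # base64-encoded truststore (same chain of trust as the internal listener's server certificate)
  password: ""                           # base64-encoded password for the keystore and truststore
```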
object
Required
CruiseControlConfig defines the config for Cruise Control
object
Affinity is a group of affinity scheduling rules.
object
Describes node affinity scheduling rules for the pod.
array
The scheduler will prefer to schedule pods to nodes that satisfy the affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding “weight” to the sum if the node matches the corresponding matchExpressions; the node(s) with the highest sum are the most preferred.
object
An empty preferred scheduling term matches all objects with implicit weight 0 (i.e. it’s a no-op). A null preferred scheduling term matches no objects (i.e. is also a no-op).
object
Required
A node selector term, associated with the corresponding weight.
array
A list of node selector requirements by node’s labels.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
array
A list of node selector requirements by node’s fields.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
integer
Required
Weight associated with matching the corresponding nodeSelectorTerm, in the range 1-100.
object
If the affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to an update), the system may or may not try to eventually evict the pod from its node.
array
Required
Required. A list of node selector terms. The terms are ORed.
object
A null or empty node selector term matches no objects. The requirements of them are ANDed. The TopologySelectorTerm type implements a subset of the NodeSelectorTerm.
array
A list of node selector requirements by node’s labels.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
array
A list of node selector requirements by node’s fields.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
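A minimal sketch of the node affinity fields above, assuming they are set under spec.cruiseControlConfig.affinity in the KafkaCluster custom resource; the node-type label and its values are hypothetical:

```yaml
cruiseControlConfig:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values: ["amd64"]
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 50
          preference:
            matchExpressions:
              - key: node-type           # hypothetical node label
                operator: In
                values: ["analytics"]
```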
object
Describes pod affinity scheduling rules (e.g. co-locate this pod in the same node, zone, etc. as some other pod(s)).
array
The scheduler will prefer to schedule pods to nodes that satisfy the affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding “weight” to the sum if the node has pods which matches the corresponding podAffinityTerm; the node(s) with the highest sum are the most preferred.
object
The weights of all of the matched WeightedPodAffinityTerm fields are added per-node to find the most preferred node(s)
object
Required
Required. A pod affinity term, associated with the corresponding weight.
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
integer
Required
weight associated with matching the corresponding podAffinityTerm, in the range 1-100.
array
If the affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to a pod label update), the system may or may not try to eventually evict the pod from its node. When there are multiple elements, the lists of nodes corresponding to each podAffinityTerm are intersected, i.e. all terms must be satisfied.
object
Defines a set of pods (namely those matching the labelSelector relative to the given namespace(s)) that this pod should be co-located (affinity) or not co-located (anti-affinity) with, where co-located is defined as running on a node whose value of the label with key <topologyKey> matches that of any node on which a pod of the set of pods is running
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
object
Describes pod anti-affinity scheduling rules (e.g. avoid putting this pod in the same node, zone, etc. as some other pod(s)).
array
The scheduler will prefer to schedule pods to nodes that satisfy the anti-affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling anti-affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding “weight” to the sum if the node has pods which matches the corresponding podAffinityTerm; the node(s) with the highest sum are the most preferred.
object
The weights of all of the matched WeightedPodAffinityTerm fields are added per-node to find the most preferred node(s)
object
Required
Required. A pod affinity term, associated with the corresponding weight.
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
integer
Required
weight associated with matching the corresponding podAffinityTerm, in the range 1-100.
array
If the anti-affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the anti-affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to a pod label update), the system may or may not try to eventually evict the pod from its node. When there are multiple elements, the lists of nodes corresponding to each podAffinityTerm are intersected, i.e. all terms must be satisfied.
object
Defines a set of pods (namely those matching the labelSelector relative to the given namespace(s)) that this pod should be co-located (affinity) or not co-located (anti-affinity) with, where co-located is defined as running on a node whose value of the label with key <topologyKey> matches that of any node on which a pod of the set of pods is running
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
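A minimal sketch of the pod anti-affinity fields above, again assuming spec.cruiseControlConfig.affinity and a hypothetical app=kafka broker pod label, to keep the Cruise Control pod off nodes that already run brokers:

```yaml
cruiseControlConfig:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              app: kafka                 # hypothetical broker pod label
```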
object
Annotations to be applied to CruiseControl pod
object
CruiseControlOperationSpec specifies the configuration of the CruiseControlOperation handling
integer
When TTLSecondsAfterFinished is specified, the created and finished (completed successfully, or completedWithError with errorPolicy: ignore) CruiseControlOperation custom resource is deleted after the given time has elapsed. When it is 0, the resource is deleted immediately after the operation finishes. When it is not specified, the resource is not removed. The value must be zero or a positive integer.
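For example, to have finished CruiseControlOperation resources cleaned up after five minutes; the lowerCamelCase key and its placement are assumptions, so check the generated schema for the exact path in your KafkaCluster CR:

```yaml
cruiseControlOperationSpec:
  ttlSecondsAfterFinished: 300           # delete finished CruiseControlOperation CRs after 5 minutes
```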
object
CruiseControlTaskSpec specifies the configuration of the CC Tasks
integer
Required
RetryDurationMinutes describes the amount of time the Operator waits for the task
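A short sketch of the Cruise Control task settings; the lowerCamelCase key is assumed from the field name above:

```yaml
cruiseControlTaskSpec:
  retryDurationMinutes: 5                # how long the operator waits for a Cruise Control task
```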
object
LocalObjectReference contains enough information to let you locate the referenced object inside the same namespace.
array
InitContainers add extra initContainers to CruiseControl pod
object
A single application container that you want to run within a pod.
array
Arguments to the entrypoint. The container image’s CMD is used if this is not provided. Variable references $(VAR_NAME) are expanded using the container’s environment. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Cannot be updated. More info: https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shell
array
Entrypoint array. Not executed within a shell. The container image’s ENTRYPOINT is used if this is not provided. Variable references $(VAR_NAME) are expanded using the container’s environment. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Cannot be updated. More info: https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shell
array
List of environment variables to set in the container. Cannot be updated.
array
List of sources to populate environment variables in the container. The keys defined within a source must be a C_IDENTIFIER. All invalid keys will be reported as an event when the container is starting. When a key exists in multiple sources, the value associated with the last source will take precedence. Values defined by an Env with a duplicate key will take precedence. Cannot be updated.
object
EnvFromSource represents the source of a set of ConfigMaps
object
The ConfigMap to select from
boolean
Specify whether the ConfigMap must be defined
string
An optional identifier to prepend to each key in the ConfigMap. Must be a C_IDENTIFIER.
object
The Secret to select from
boolean
Specify whether the Secret must be defined
object
EnvVar represents an environment variable present in a Container.
string
Required
Name of the environment variable. Must be a C_IDENTIFIER.
string
Variable references $(VAR_NAME) are expanded using the previously defined environment variables in the container and any service environment variables. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Defaults to “”.
object
Source for the environment variable’s value. Cannot be used if value is not empty.
object
Selects a key of a ConfigMap.
boolean
Specify whether the ConfigMap or its key must be defined
object
Selects a field of the pod: supports metadata.name, metadata.namespace, metadata.labels['<KEY>'], metadata.annotations['<KEY>'], spec.nodeName, spec.serviceAccountName, status.hostIP, status.podIP, status.podIPs.
string
Version of the schema the FieldPath is written in terms of, defaults to “v1”.
string
Required
Path of the field to select in the specified API version.
object
Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, limits.ephemeral-storage, requests.cpu, requests.memory and requests.ephemeral-storage) are currently supported.
string
Container name: required for volumes, optional for env vars
Specifies the output format of the exposed resources, defaults to “1”
string
Required
Required: resource to select
object
Selects a key of a secret in the pod’s namespace
string
Required
The key of the secret to select from. Must be a valid secret key.
boolean
Specify whether the Secret or its key must be defined
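A minimal sketch of the env/envFrom fields above on an extra Cruise Control init container; the container, image, ConfigMap and Secret names are all hypothetical:

```yaml
initContainers:
  - name: wait-for-deps                  # hypothetical init container
    image: busybox:1.36                  # placeholder image
    command: ["sh", "-c", "echo $POD_NAMESPACE"]
    envFrom:
      - prefix: COMMON_
        configMapRef:
          name: common-env               # hypothetical ConfigMap
          optional: true
    env:
      - name: POD_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace
      - name: DB_PASSWORD
        valueFrom:
          secretKeyRef:
            name: db-credentials         # hypothetical Secret
            key: password
            optional: false
```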
object
Actions that the management system should take in response to container lifecycle events. Cannot be updated.
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
object
Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept for the backward compatibility. There are no validation of this field and lifecycle hooks will fail in runtime when tcp handler is specified.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
object
PreStop is called immediately before a container is terminated due to an API request or management event such as liveness/startup probe failure, preemption, resource contention, etc. The handler is not called if the container crashes or exits. The Pod’s termination grace period countdown begins before the PreStop hook is executed. Regardless of the outcome of the handler, the container will eventually terminate within the Pod’s termination grace period (unless delayed by finalizers). Other management of the container blocks until the hook completes or until the termination grace period is reached. More info: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
object
Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept for the backward compatibility. There are no validation of this field and lifecycle hooks will fail in runtime when tcp handler is specified.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
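A minimal sketch of the lifecycle handlers described above; the sleep-based preStop hook simply delays the SIGTERM so in-flight work can drain:

```yaml
lifecycle:
  postStart:
    exec:
      command: ["sh", "-c", "echo started > /tmp/started"]
  preStop:
    exec:
      command: ["sh", "-c", "sleep 10"]  # give in-flight work time to finish before the container is stopped
```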
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
object
GRPC specifies an action involving a GRPC port. This is a beta field and requires enabling GRPCContainerProbe feature gate.
integer
Required
Port number of the gRPC service. Number must be in the range 1 to 65535.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
integer
How often (in seconds) to perform the probe. Default to 10 seconds. Minimum value is 1.
integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
object
TCPSocket specifies an action involving a TCP port.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
integer
Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod’s terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset.
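A minimal sketch of a probe built from the fields above (this group most likely describes the container’s livenessProbe); the endpoint and port are hypothetical:

```yaml
livenessProbe:
  httpGet:
    path: /healthz                       # hypothetical endpoint
    port: 9090                           # hypothetical port
    scheme: HTTP
  periodSeconds: 10
  failureThreshold: 3
  successThreshold: 1                    # must be 1 for liveness probes
```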
string
Required
Name of the container specified as a DNS_LABEL. Each container in a pod must have a unique name (DNS_LABEL). Cannot be updated.
array
List of ports to expose from the container. Not specifying a port here DOES NOT prevent that port from being exposed. Any port which is listening on the default “0.0.0.0” address inside a container will be accessible from the network. Modifying this array with strategic merge patch may corrupt the data. For more information See https://github.com/kubernetes/kubernetes/issues/108255. Cannot be updated.
object
ContainerPort represents a network port in a single container.
integer
Required
Number of port to expose on the pod’s IP address. This must be a valid port number, 0 < x < 65536.
string
What host IP to bind the external port to.
integer
Number of port to expose on the host. If specified, this must be a valid port number, 0 < x < 65536. If HostNetwork is specified, this must match ContainerPort. Most containers do not need this.
string
If specified, this must be an IANA_SVC_NAME and unique within the pod. Each named port in a pod must have a unique name. Name for the port that can be referred to by services.
string
Protocol for port. Must be UDP, TCP, or SCTP. Defaults to “TCP”.
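A short sketch of the container port fields above; the port name and number are hypothetical:

```yaml
ports:
  - name: metrics                        # must be an IANA_SVC_NAME, unique within the pod
    containerPort: 9020                  # hypothetical port
    protocol: TCP
```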
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
object
GRPC specifies an action involving a GRPC port. This is a beta field and requires enabling GRPCContainerProbe feature gate.
integer
Required
Port number of the gRPC service. Number must be in the range 1 to 65535.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
integer
How often (in seconds) to perform the probe. Default to 10 seconds. Minimum value is 1.
integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
object
TCPSocket specifies an action involving a TCP port.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
integer
Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod’s terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset.
boolean
AllowPrivilegeEscalation controls whether a process can gain more privileges than its parent process. This bool directly controls if the no_new_privs flag will be set on the container process. AllowPrivilegeEscalation is true always when the container is: 1) run as Privileged 2) has CAP_SYS_ADMIN Note that this field cannot be set when spec.os.name is windows.
object
The capabilities to add/drop when running containers. Defaults to the default set of capabilities granted by the container runtime. Note that this field cannot be set when spec.os.name is windows.
string
Capability represent POSIX capabilities type
string
Capability represent POSIX capabilities type
boolean
Run container in privileged mode. Processes in privileged containers are essentially equivalent to root on the host. Defaults to false. Note that this field cannot be set when spec.os.name is windows.
string
procMount denotes the type of proc mount to use for the containers. The default is DefaultProcMount which uses the container runtime defaults for readonly paths and masked paths. This requires the ProcMountType feature flag to be enabled. Note that this field cannot be set when spec.os.name is windows.
boolean
Whether this container has a read-only root filesystem. Default is false. Note that this field cannot be set when spec.os.name is windows.
integer
The GID to run the entrypoint of the container process. Uses runtime default if unset. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
boolean
Indicates that the container must run as a non-root user. If true, the Kubelet will validate the image at runtime to ensure that it does not run as UID 0 (root) and fail to start the container if it does. If unset or false, no such validation will be performed. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
integer
The UID to run the entrypoint of the container process. Defaults to user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
object
The SELinux context to be applied to the container. If unspecified, the container runtime will allocate a random SELinux context for each container. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
string
Level is SELinux level label that applies to the container.
string
Role is a SELinux role label that applies to the container.
string
Type is a SELinux type label that applies to the container.
string
User is a SELinux user label that applies to the container.
object
The seccomp options to use by this container. If seccomp options are provided at both the pod & container level, the container options override the pod options. Note that this field cannot be set when spec.os.name is windows.
string
localhostProfile indicates a profile defined in a file on the node should be used. The profile must be preconfigured on the node to work. Must be a descending path, relative to the kubelet’s configured seccomp profile location. Must only be set if type is “Localhost”.
string
Required
type indicates which kind of seccomp profile will be applied. Valid options are: Localhost - a profile defined in a file on the node should be used. RuntimeDefault - the container runtime default profile should be used. Unconfined - no profile should be applied.
object
The Windows specific settings applied to all containers. If unspecified, the options from the PodSecurityContext will be used. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is linux.
string
GMSACredentialSpecName is the name of the GMSA credential spec to use.
boolean
HostProcess determines if a container should be run as a ‘Host Process’ container. This field is alpha-level and will only be honored by components that enable the WindowsHostProcessContainers feature flag. Setting this field without the feature flag will result in errors when validating the Pod. All of a Pod’s containers must have the same effective HostProcess value (it is not allowed to have a mix of HostProcess containers and non-HostProcess containers). In addition, if HostProcess is true then HostNetwork must also be set to true.
string
The UserName in Windows to run the entrypoint of the container process. Defaults to the user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
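A minimal sketch of a restrictive container securityContext assembled from the fields above; the UID is hypothetical:

```yaml
securityContext:
  runAsNonRoot: true
  runAsUser: 1001                        # hypothetical non-root UID
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]
  seccompProfile:
    type: RuntimeDefault
```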
object
StartupProbe indicates that the Pod has successfully initialized. If specified, no other probes are executed until this completes successfully. If this probe fails, the Pod will be restarted, just as if the livenessProbe failed. This can be used to provide different probe parameters at the beginning of a Pod’s lifecycle, when it might take a long time to load data or warm a cache, than during steady-state operation. This cannot be updated. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
object
Exec specifies the action to take.
array
Command is the command line to execute inside the container, the working directory for the command is root (‘/’) in the container’s filesystem. The command is simply exec’d, it is not run inside a shell, so traditional shell instructions (‘|’, etc) won’t work. To use a shell, you need to explicitly call out to that shell. Exit status of 0 is treated as live/healthy and non-zero is unhealthy.
integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
object
GRPC specifies an action involving a GRPC port. This is a beta field and requires enabling GRPCContainerProbe feature gate.
integer
Required
Port number of the gRPC service. Number must be in the range 1 to 65535.
object
HTTPGet specifies the http request to perform.
string
Host name to connect to, defaults to the pod IP. You probably want to set “Host” in httpHeaders instead.
array
Custom headers to set in the request. HTTP allows repeated headers.
object
HTTPHeader describes a custom header to be used in HTTP probes
string
Path to access on the HTTP server.
Name or number of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
string
Scheme to use for connecting to the host. Defaults to HTTP.
integer
How often (in seconds) to perform the probe. Default to 10 seconds. Minimum value is 1.
integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
object
TCPSocket specifies an action involving a TCP port.
string
Optional: Host name to connect to, defaults to the pod IP.
Number or name of the port to access on the container. Number must be in the range 1 to 65535. Name must be an IANA_SVC_NAME.
integer
Optional duration in seconds the pod needs to terminate gracefully upon probe failure. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. If this value is nil, the pod’s terminationGracePeriodSeconds will be used. Otherwise, this value overrides the value provided by the pod spec. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset.
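For illustration, a startup probe using the httpGet action described above might be written as follows; this is a minimal sketch with placeholder values (the path, port, and thresholds are not defaults taken from this CRD):

```yaml
startupProbe:
  httpGet:
    path: /healthz        # Path to access on the HTTP server
    port: 9020            # Name or number of the container port (placeholder)
    scheme: HTTP          # Defaults to HTTP when omitted
  periodSeconds: 10       # Probe every 10 seconds
  failureThreshold: 30    # Restart the pod after 30 consecutive failures
```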
boolean
Whether this container should allocate a buffer for stdin in the container runtime. If this is not set, reads from stdin in the container will always result in EOF. Default is false.
boolean
Whether the container runtime should close the stdin channel after it has been opened by a single attach. When stdin is true the stdin stream will remain open across multiple attach sessions. If stdinOnce is set to true, stdin is opened on container start, is empty until the first client attaches to stdin, and then remains open and accepts data until the client disconnects, at which time stdin is closed and remains closed until the container is restarted. If this flag is false, a container process that reads from stdin will never receive an EOF. Default is false.
string
Optional: Path at which the file to which the container’s termination message will be written is mounted into the container’s filesystem. Message written is intended to be brief final status, such as an assertion failure message. Will be truncated by the node if greater than 4096 bytes. The total message length across all containers will be limited to 12kb. Defaults to /dev/termination-log. Cannot be updated.
string
Indicate how the termination message should be populated. File will use the contents of terminationMessagePath to populate the container status message on both success and failure. FallbackToLogsOnError will use the last chunk of container log output if the termination message file is empty and the container exited with an error. The log output is limited to 2048 bytes or 80 lines, whichever is smaller. Defaults to File. Cannot be updated.
boolean
Whether this container should allocate a TTY for itself, also requires ‘stdin’ to be true. Default is false.
array
volumeDevices is the list of block devices to be used by the container.
object
volumeDevice describes a mapping of a raw block device within a container.
string
Required
devicePath is the path inside of the container that the device will be mapped to.
string
Required
name must match the name of a persistentVolumeClaim in the pod
array
Pod volumes to mount into the container’s filesystem. Cannot be updated.
object
VolumeMount describes a mounting of a Volume within a container.
string
Required
Path within the container at which the volume should be mounted. Must not contain ‘:’.
string
mountPropagation determines how mounts are propagated from the host to container and the other way around. When not set, MountPropagationNone is used. This field is beta in 1.10.
string
Required
This must match the Name of a Volume.
boolean
Mounted read-only if true, read-write otherwise (false or unspecified). Defaults to false.
string
Path within the volume from which the container’s volume should be mounted. Defaults to “” (volume’s root).
string
Expanded path within the volume from which the container’s volume should be mounted. Behaves similarly to SubPath but environment variable references $(VAR_NAME) are expanded using the container’s environment. Defaults to “” (volume’s root). SubPathExpr and SubPath are mutually exclusive.
string
Container’s working directory. If not specified, the container runtime’s default will be used, which might be configured in the container image. Cannot be updated.
object
PodSecurityContext holds pod-level security attributes and common container settings. Some fields are also present in container.securityContext. Field values of container.securityContext take precedence over field values of PodSecurityContext.
integer
A special supplemental group that applies to all containers in a pod. Some volume types allow the Kubelet to change the ownership of that volume to be owned by the pod:
1. The owning GID will be the FSGroup 2. The setgid bit is set (new files created in the volume will be owned by FSGroup) 3. The permission bits are OR’d with rw-rw----
If unset, the Kubelet will not modify the ownership and permissions of any volume. Note that this field cannot be set when spec.os.name is windows.
string
fsGroupChangePolicy defines behavior of changing ownership and permission of the volume before being exposed inside Pod. This field will only apply to volume types which support fsGroup based ownership(and permissions). It will have no effect on ephemeral volume types such as: secret, configmaps and emptydir. Valid values are “OnRootMismatch” and “Always”. If not specified, “Always” is used. Note that this field cannot be set when spec.os.name is windows.
integer
The GID to run the entrypoint of the container process. Uses runtime default if unset. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence for that container. Note that this field cannot be set when spec.os.name is windows.
boolean
Indicates that the container must run as a non-root user. If true, the Kubelet will validate the image at runtime to ensure that it does not run as UID 0 (root) and fail to start the container if it does. If unset or false, no such validation will be performed. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
integer
The UID to run the entrypoint of the container process. Defaults to user specified in image metadata if unspecified. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence for that container. Note that this field cannot be set when spec.os.name is windows.
object
The SELinux context to be applied to all containers. If unspecified, the container runtime will allocate a random SELinux context for each container. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence for that container. Note that this field cannot be set when spec.os.name is windows.
string
Level is SELinux level label that applies to the container.
string
Role is a SELinux role label that applies to the container.
string
Type is a SELinux type label that applies to the container.
string
User is a SELinux user label that applies to the container.
object
The seccomp options to use by the containers in this pod. Note that this field cannot be set when spec.os.name is windows.
string
localhostProfile indicates a profile defined in a file on the node should be used. The profile must be preconfigured on the node to work. Must be a descending path, relative to the kubelet’s configured seccomp profile location. Must only be set if type is “Localhost”.
string
Required
type indicates which kind of seccomp profile will be applied. Valid options are:
Localhost - a profile defined in a file on the node should be used. RuntimeDefault - the container runtime default profile should be used. Unconfined - no profile should be applied.
array
A list of groups applied to the first process run in each container, in addition to the container’s primary GID. If unspecified, no groups will be added to any container. Note that this field cannot be set when spec.os.name is windows.
array
Sysctls hold a list of namespaced sysctls used for the pod. Pods with unsupported sysctls (by the container runtime) might fail to launch. Note that this field cannot be set when spec.os.name is windows.
object
Sysctl defines a kernel parameter to be set
string
Required
Name of a property to set
string
Required
Value of a property to set
object
The Windows specific settings applied to all containers. If unspecified, the options within a container’s SecurityContext will be used. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is linux.
string
GMSACredentialSpecName is the name of the GMSA credential spec to use.
boolean
HostProcess determines if a container should be run as a ‘Host Process’ container. This field is alpha-level and will only be honored by components that enable the WindowsHostProcessContainers feature flag. Setting this field without the feature flag will result in errors when validating the Pod. All of a Pod’s containers must have the same effective HostProcess value (it is not allowed to have a mix of HostProcess containers and non-HostProcess containers). In addition, if HostProcess is true then HostNetwork must also be set to true.
string
The UserName in Windows to run the entrypoint of the container process. Defaults to the user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
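A minimal pod-level security context combining several of the fields above could look like the sketch below, assuming the CRD exposes it under a podSecurityContext key; the numeric IDs are illustrative, not defaults:

```yaml
podSecurityContext:
  runAsNonRoot: true        # Kubelet refuses to start containers running as UID 0
  runAsUser: 1001           # UID for the container entrypoints (placeholder)
  runAsGroup: 1001          # GID for the container entrypoints (placeholder)
  fsGroup: 2000             # Supplemental group applied to supported volumes (placeholder)
  seccompProfile:
    type: RuntimeDefault    # Use the container runtime's default seccomp profile
```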
string
PriorityClassName specifies the priority class name for the CruiseControl pod. If specified, the PriorityClass resource with this PriorityClassName must be created beforehand. If not specified, the CruiseControl pod’s priority defaults to zero.
object
ResourceRequirements describes the compute resource requirements.
object
SecurityContext allows to set security context for the CruiseControl container
boolean
AllowPrivilegeEscalation controls whether a process can gain more privileges than its parent process. This bool directly controls if the no_new_privs flag will be set on the container process. AllowPrivilegeEscalation is true always when the container is: 1) run as Privileged 2) has CAP_SYS_ADMIN Note that this field cannot be set when spec.os.name is windows.
object
The capabilities to add/drop when running containers. Defaults to the default set of capabilities granted by the container runtime. Note that this field cannot be set when spec.os.name is windows.
string
Capability represent POSIX capabilities type
string
Capability represent POSIX capabilities type
boolean
Run container in privileged mode. Processes in privileged containers are essentially equivalent to root on the host. Defaults to false. Note that this field cannot be set when spec.os.name is windows.
string
procMount denotes the type of proc mount to use for the containers. The default is DefaultProcMount which uses the container runtime defaults for readonly paths and masked paths. This requires the ProcMountType feature flag to be enabled. Note that this field cannot be set when spec.os.name is windows.
boolean
Whether this container has a read-only root filesystem. Default is false. Note that this field cannot be set when spec.os.name is windows.
integer
The GID to run the entrypoint of the container process. Uses runtime default if unset. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
boolean
Indicates that the container must run as a non-root user. If true, the Kubelet will validate the image at runtime to ensure that it does not run as UID 0 (root) and fail to start the container if it does. If unset or false, no such validation will be performed. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
integer
The UID to run the entrypoint of the container process. Defaults to user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
object
The SELinux context to be applied to the container. If unspecified, the container runtime will allocate a random SELinux context for each container. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is windows.
string
Level is SELinux level label that applies to the container.
string
Role is a SELinux role label that applies to the container.
string
Type is a SELinux type label that applies to the container.
string
User is a SELinux user label that applies to the container.
object
The seccomp options to use by this container. If seccomp options are provided at both the pod & container level, the container options override the pod options. Note that this field cannot be set when spec.os.name is windows.
string
localhostProfile indicates a profile defined in a file on the node should be used. The profile must be preconfigured on the node to work. Must be a descending path, relative to the kubelet’s configured seccomp profile location. Must only be set if type is “Localhost”.
string
Required
type indicates which kind of seccomp profile will be applied. Valid options are:
Localhost - a profile defined in a file on the node should be used. RuntimeDefault - the container runtime default profile should be used. Unconfined - no profile should be applied.
object
The Windows specific settings applied to all containers. If unspecified, the options from the PodSecurityContext will be used. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is linux.
string
GMSACredentialSpecName is the name of the GMSA credential spec to use.
boolean
HostProcess determines if a container should be run as a ‘Host Process’ container. This field is alpha-level and will only be honored by components that enable the WindowsHostProcessContainers feature flag. Setting this field without the feature flag will result in errors when validating the Pod. All of a Pod’s containers must have the same effective HostProcess value (it is not allowed to have a mix of HostProcess containers and non-HostProcess containers). In addition, if HostProcess is true then HostNetwork must also be set to true.
string
The UserName in Windows to run the entrypoint of the container process. Defaults to the user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
object
The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>.
string
Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
string
Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
string
Operator represents a key’s relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
integer
TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
string
Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.
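As an example, a toleration matching a dedicated-node taint could be expressed as follows; the key and value are placeholders:

```yaml
tolerations:
  - key: dedicated       # Taint key to tolerate (placeholder)
    operator: Equal      # Match both key and value
    value: kafka         # Taint value (placeholder)
    effect: NoSchedule   # Only match NoSchedule taints
```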
object
TopicConfig holds info for topic configuration regarding partitions and replicationFactor
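A sketch of such a topic configuration, assuming the nested fields are named partitions and replicationFactor:

```yaml
topicConfig:
  partitions: 12          # Partition count for the Cruise Control metrics topic (assumed field name)
  replicationFactor: 3    # Replication factor for that topic (assumed field name)
```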
array
VolumeMounts define some extra Kubernetes Volume mounts for the CruiseControl Pods.
object
VolumeMount describes a mounting of a Volume within a container.
string
Required
Path within the container at which the volume should be mounted. Must not contain ‘:’.
string
mountPropagation determines how mounts are propagated from the host to container and the other way around. When not set, MountPropagationNone is used. This field is beta in 1.10.
string
Required
This must match the Name of a Volume.
boolean
Mounted read-only if true, read-write otherwise (false or unspecified). Defaults to false.
string
Path within the volume from which the container’s volume should be mounted. Defaults to “” (volume’s root).
string
Expanded path within the volume from which the container’s volume should be mounted. Behaves similarly to SubPath but environment variable references $(VAR_NAME) are expanded using the container’s environment. Defaults to “” (volume’s root). SubPathExpr and SubPath are mutually exclusive.
array
Volumes define some extra Kubernetes Volumes for the CruiseControl Pods.
object
Volume represents a named volume in a pod that may be accessed by any container in the pod.
string
fsType is the filesystem type of the volume that you want to mount. Tip: Ensure that the filesystem type is supported by the host operating system. Examples: “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. More info: https://kubernetes.io/docs/concepts/storage/volumes#awselasticblockstore
integer
partition is the partition in the volume that you want to mount. If omitted, the default is to mount by volume name. Examples: For volume /dev/sda1, you specify the partition as “1”. Similarly, the volume partition for /dev/sda is “0” (or you can leave the property empty).
object
azureDisk represents an Azure Data Disk mount on the host and bind mount to the pod.
string
cachingMode is the Host Caching mode: None, Read Only, Read Write.
string
Required
diskName is the Name of the data disk in the blob storage
string
Required
diskURI is the URI of data disk in the blob storage
string
fsType is Filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified.
string
kind expected values are Shared (multiple blob disks per storage account), Dedicated (single blob disk per storage account), and Managed (azure managed data disk, only in managed availability set). Defaults to Shared.
boolean
readOnly Defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
object
azureFile represents an Azure File Service mount on the host and bind mount to the pod.
boolean
readOnly defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
string
Required
secretName is the name of secret that contains Azure Storage Account Name and Key
string
Required
shareName is the azure share Name
object
cephFS represents a Ceph FS mount on the host that shares a pod’s lifetime
string
path is Optional: Used as the mounted root, rather than the full Ceph tree, default is /
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Examples: “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. More info: https://examples.k8s.io/mysql-cinder-pd/README.md
object
secretRef is optional: points to a secret object containing parameters used to connect to OpenStack.
object
configMap represents a configMap that should populate this volume
integer
defaultMode is optional: mode bits used to set permissions on created files by default. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. Defaults to 0644. Directories within the path are not affected by this setting. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
array
items if unspecified, each key-value pair in the Data field of the referenced ConfigMap will be projected into the volume as a file whose name is the key and content is the value. If specified, the listed keys will be projected into the specified paths, and unlisted keys will not be present. If a key is specified which is not present in the ConfigMap, the volume setup will error unless it is marked optional. Paths must be relative and may not contain the ‘..’ path or start with ‘..’.
object
Maps a string key to a path within a volume.
string
Required
key is the key to project.
integer
mode is Optional: mode bits used to set permissions on this file. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
path is the relative path of the file to map the key to. May not be an absolute path. May not contain the path element ‘..’. May not start with the string ‘..’.
boolean
optional specify whether the ConfigMap or its keys must be defined
object
csi (Container Storage Interface) represents ephemeral storage that is handled by certain external CSI drivers (Beta feature).
string
Required
driver is the name of the CSI driver that handles this volume. Consult with your admin for the correct name as registered in the cluster.
string
fsType to mount. Ex. “ext4”, “xfs”, “ntfs”. If not provided, the empty value is passed to the associated CSI driver which will determine the default filesystem to apply.
object
nodePublishSecretRef is a reference to the secret object containing sensitive information to pass to the CSI driver to complete the CSI NodePublishVolume and NodeUnpublishVolume calls. This field is optional, and may be empty if no secret is required. If the secret object contains more than one secret, all secret references are passed.
boolean
readOnly specifies a read-only configuration for the volume. Defaults to false (read/write).
object
volumeAttributes stores driver-specific properties that are passed to the CSI driver. Consult your driver’s documentation for supported values.
object
downwardAPI represents downward API about the pod that should populate this volume
integer
Optional: mode bits used to set permissions on created files by default. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. Defaults to 0644. Directories within the path are not affected by this setting. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
array
Items is a list of downward API volume file
object
DownwardAPIVolumeFile represents information to create the file containing the pod field
object
Required: Selects a field of the pod: only annotations, labels, name and namespace are supported.
string
Version of the schema the FieldPath is written in terms of, defaults to “v1”.
string
Required
Path of the field to select in the specified API version.
integer
Optional: mode bits used to set permissions on this file, must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
Required: Path is the relative path name of the file to be created. Must not be absolute or contain the ‘..’ path. Must be utf-8 encoded. The first item of the relative path must not start with ‘..’
object
Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, requests.cpu and requests.memory) are currently supported.
string
Container name: required for volumes, optional for env vars
Specifies the output format of the exposed resources, defaults to “1”
string
Required
Required: resource to select
sizeLimit is the total amount of local storage required for this EmptyDir volume. The size limit is also applicable for memory medium. The maximum usage on memory medium EmptyDir would be the minimum value between the SizeLimit specified here and the sum of memory limits of all containers in a pod. The default is nil which means that the limit is undefined. More info: https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
object
ephemeral represents a volume that is handled by a cluster storage driver. The volume’s lifecycle is tied to the pod that defines it - it will be created before the pod starts, and deleted when the pod is removed.
Use this if: a) the volume is only needed while the pod runs, b) features of normal volumes like restoring from snapshot or capacity tracking are needed, c) the storage driver is specified through a storage class, and d) the storage driver supports dynamic volume provisioning through a PersistentVolumeClaim (see EphemeralVolumeSource for more information on the connection between this volume type and PersistentVolumeClaim).
Use PersistentVolumeClaim or one of the vendor-specific APIs for volumes that persist for longer than the lifecycle of an individual pod.
Use CSI for light-weight local ephemeral volumes if the CSI driver is meant to be used that way - see the documentation of the driver for more information.
A pod can use both types of ephemeral volumes and persistent volumes at the same time.
object
Will be used to create a stand-alone PVC to provision the volume. The pod in which this EphemeralVolumeSource is embedded will be the owner of the PVC, i.e. the PVC will be deleted together with the pod. The name of the PVC will be <pod name>-<volume name>, where <volume name> is the name from the PodSpec.Volumes array entry. Pod validation will reject the pod if the concatenated name is not valid for a PVC (for example, too long).
An existing PVC with that name that is not owned by the pod will not be used for the pod to avoid using an unrelated volume by mistake. Starting the pod is then blocked until the unrelated PVC is removed. If such a pre-created PVC is meant to be used by the pod, the PVC has to be updated with an owner reference to the pod once the pod exists. Normally this should not be necessary, but it may be useful when manually reconstructing a broken cluster.
This field is read-only and no changes will be made by Kubernetes to the PVC after it has been created.
Required, must not be nil.
object
May contain labels and annotations that will be copied into the PVC when creating it. No other fields are allowed and will be rejected during validation.
object
Required
The specification for the PersistentVolumeClaim. The entire content is copied unchanged into the PVC that gets created from this template. The same fields as in a PersistentVolumeClaim are also valid here.
object
dataSource field can be used to specify either: * An existing VolumeSnapshot object (snapshot.storage.k8s.io/VolumeSnapshot) * An existing PVC (PersistentVolumeClaim) If the provisioner or an external controller can support the specified data source, it will create a new volume based on the contents of the specified data source. If the AnyVolumeDataSource feature gate is enabled, this field will always have the same contents as the DataSourceRef field.
string
APIGroup is the group for the resource being referenced. If APIGroup is not specified, the specified Kind must be in the core API group. For any other third-party types, APIGroup is required.
string
Required
Kind is the type of resource being referenced
string
Required
Name is the name of resource being referenced
object
dataSourceRef specifies the object from which to populate the volume with data, if a non-empty volume is desired. This may be any local object from a non-empty API group (non core object) or a PersistentVolumeClaim object. When this field is specified, volume binding will only succeed if the type of the specified object matches some installed volume populator or dynamic provisioner. This field will replace the functionality of the DataSource field and as such if both fields are non-empty, they must have the same value. For backwards compatibility, both fields (DataSource and DataSourceRef) will be set to the same value automatically if one of them is empty and the other is non-empty. There are two important differences between DataSource and DataSourceRef: * While DataSource only allows two specific types of objects, DataSourceRef allows any non-core object, as well as PersistentVolumeClaim objects. * While DataSource ignores disallowed values (dropping them), DataSourceRef preserves all values, and generates an error if a disallowed value is specified. (Beta) Using this field requires the AnyVolumeDataSource feature gate to be enabled.
string
APIGroup is the group for the resource being referenced. If APIGroup is not specified, the specified Kind must be in the core API group. For any other third-party types, APIGroup is required.
string
Required
Kind is the type of resource being referenced
string
Required
Name is the name of resource being referenced
object
resources represents the minimum resources the volume should have. If RecoverVolumeExpansionFailure feature is enabled users are allowed to specify resource requirements that are lower than previous value but must still be higher than capacity recorded in the status field of the claim. More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#resources
object
selector is a label query over volumes to consider for binding.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
string
volumeMode defines what type of volume is required by the claim. Value of Filesystem is implied when not included in claim spec.
string
volumeName is the binding reference to the PersistentVolume backing this claim.
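Putting the ephemeral volume fields above together, a generic ephemeral volume could be declared roughly as follows; the storage class, size, and names are placeholders:

```yaml
volumes:
  - name: cc-scratch                     # Must match a volumeMount name (placeholder)
    ephemeral:
      volumeClaimTemplate:
        metadata:
          labels:
            app: cruisecontrol           # Copied into the generated PVC
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: standard     # Placeholder storage class
          resources:
            requests:
              storage: 1Gi               # Placeholder size
```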
object
fc represents a Fibre Channel resource that is attached to a kubelet’s host machine and then exposed to the pod.
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified.
integer
lun is Optional: FC target lun number
boolean
readOnly is Optional: Defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
array
targetWWNs is Optional: FC target worldwide names (WWNs)
array
wwids is Optional: FC volume world wide identifiers (wwids). Either wwids or a combination of targetWWNs and lun must be set, but not both simultaneously.
object
flexVolume represents a generic volume resource that is provisioned/attached using an exec based plugin.
string
Required
driver is the name of the driver to use for this volume.
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. The default filesystem depends on FlexVolume script.
object
options is Optional: this field holds extra command options if any.
boolean
readOnly is Optional: defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
object
secretRef is Optional: secretRef is reference to the secret object containing sensitive information to pass to the plugin scripts. This may be empty if no secret object is specified. If the secret object contains more than one secret, all secrets are passed to the plugin scripts.
object
flocker represents a Flocker volume attached to a kubelet’s host machine. This depends on the Flocker control service being running
string
datasetName is the name of the dataset stored as metadata -> name on the Flocker dataset; this field should be considered deprecated.
string
datasetUUID is the UUID of the dataset. This is unique identifier of a Flocker dataset
string
fsType is the filesystem type of the volume that you want to mount. Tip: Ensure that the filesystem type is supported by the host operating system. Examples: “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. More info: https://kubernetes.io/docs/concepts/storage/volumes#gcepersistentdisk
integer
partition is the partition in the volume that you want to mount. If omitted, the default is to mount by volume name. Examples: For volume /dev/sda1, you specify the partition as “1”. Similarly, the volume partition for /dev/sda is “0” (or you can leave the property empty). More info: https://kubernetes.io/docs/concepts/storage/volumes#gcepersistentdisk
object
gitRepo represents a git repository at a particular revision. DEPRECATED: GitRepo is deprecated. To provision a container with a git repo, mount an EmptyDir into an InitContainer that clones the repo using git, then mount the EmptyDir into the Pod’s container.
string
directory is the target directory name. Must not contain or start with ‘..’. If ‘.’ is supplied, the volume directory will be the git repository. Otherwise, if specified, the volume will contain the git repository in the subdirectory with the given name.
string
revision is the commit hash for the specified revision.
object
hostPath represents a pre-existing file or directory on the host machine that is directly exposed to the container. This is generally used for system agents or other privileged things that are allowed to see the host machine. Most containers will NOT need this. More info: https://kubernetes.io/docs/concepts/storage/volumes#hostpath
boolean
chapAuthDiscovery defines whether support iSCSI Discovery CHAP authentication
boolean
chapAuthSession defines whether support iSCSI Session CHAP authentication
string
fsType is the filesystem type of the volume that you want to mount. Tip: Ensure that the filesystem type is supported by the host operating system. Examples: “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. More info: https://kubernetes.io/docs/concepts/storage/volumes#iscsi
string
initiatorName is the custom iSCSI Initiator Name. If initiatorName is specified with iscsiInterface simultaneously, a new iSCSI interface <target portal>:<volume name> will be created for the connection.
string
Required
iqn is the target iSCSI Qualified Name.
string
iscsiInterface is the interface Name that uses an iSCSI transport. Defaults to ‘default’ (tcp).
integer
Required
lun represents iSCSI Target Lun number.
array
portals is the iSCSI Target Portal List. The portal is either an IP or ip_addr:port if the port is other than default (typically TCP ports 860 and 3260).
boolean
readOnly here will force the ReadOnly setting in VolumeMounts. Defaults to false.
object
secretRef is the CHAP Secret for iSCSI target and initiator authentication
string
Required
targetPortal is iSCSI Target Portal. The Portal is either an IP or ip_addr:port if the port is other than default (typically TCP ports 860 and 3260).
boolean
readOnly Will force the ReadOnly setting in VolumeMounts. Default false.
object
photonPersistentDisk represents a PhotonController persistent disk attached and mounted on kubelets host machine
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified.
string
Required
pdID is the ID that identifies Photon Controller persistent disk
object
portworxVolume represents a portworx volume attached and mounted on kubelets host machine
string
fSType represents the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”. Implicitly inferred to be “ext4” if unspecified.
boolean
readOnly defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
string
Required
volumeID uniquely identifies a Portworx volume
object
projected items for all-in-one resources: secrets, configmaps, and downward API
integer
defaultMode are the mode bits used to set permissions on created files by default. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. Directories within the path are not affected by this setting. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
array
sources is the list of volume projections
object
Projection that may be projected along with other supported volume types
object
configMap information about the configMap data to project
array
items if unspecified, each key-value pair in the Data field of the referenced ConfigMap will be projected into the volume as a file whose name is the key and content is the value. If specified, the listed keys will be projected into the specified paths, and unlisted keys will not be present. If a key is specified which is not present in the ConfigMap, the volume setup will error unless it is marked optional. Paths must be relative and may not contain the ‘..’ path or start with ‘..’.
object
Maps a string key to a path within a volume.
string
Required
key is the key to project.
integer
mode is Optional: mode bits used to set permissions on this file. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
path is the relative path of the file to map the key to. May not be an absolute path. May not contain the path element ‘..’. May not start with the string ‘..’.
boolean
optional specify whether the ConfigMap or its keys must be defined
object
downwardAPI information about the downwardAPI data to project
array
Items is a list of DownwardAPIVolume file
object
DownwardAPIVolumeFile represents information to create the file containing the pod field
object
Required: Selects a field of the pod: only annotations, labels, name and namespace are supported.
string
Version of the schema the FieldPath is written in terms of, defaults to “v1”.
string
Required
Path of the field to select in the specified API version.
integer
Optional: mode bits used to set permissions on this file, must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
Required: Path is the relative path name of the file to be created. Must not be absolute or contain the ‘..’ path. Must be utf-8 encoded. The first item of the relative path must not start with ‘..’
object
Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, requests.cpu and requests.memory) are currently supported.
string
Container name: required for volumes, optional for env vars
Specifies the output format of the exposed resources, defaults to “1”
string
Required
Required: resource to select
object
secret information about the secret data to project
array
items if unspecified, each key-value pair in the Data field of the referenced Secret will be projected into the volume as a file whose name is the key and content is the value. If specified, the listed keys will be projected into the specified paths, and unlisted keys will not be present. If a key is specified which is not present in the Secret, the volume setup will error unless it is marked optional. Paths must be relative and may not contain the ‘..’ path or start with ‘..’.
object
Maps a string key to a path within a volume.
string
Required
key is the key to project.
integer
mode is Optional: mode bits used to set permissions on this file. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
path is the relative path of the file to map the key to. May not be an absolute path. May not contain the path element ‘..’. May not start with the string ‘..’.
boolean
optional field specify whether the Secret or its key must be defined
object
serviceAccountToken is information about the serviceAccountToken data to project
string
audience is the intended audience of the token. A recipient of a token must identify itself with an identifier specified in the audience of the token, and otherwise should reject the token. The audience defaults to the identifier of the apiserver.
integer
expirationSeconds is the requested duration of validity of the service account token. As the token approaches expiration, the kubelet volume plugin will proactively rotate the service account token. The kubelet will start trying to rotate the token if the token is older than 80 percent of its time to live or if the token is older than 24 hours. Defaults to 1 hour and must be at least 10 minutes.
string
Required
path is the path relative to the mount point of the file to project the token into.
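The projected sources above can be combined in a single volume. A minimal sketch, with placeholder names and audience:

```yaml
volumes:
  - name: combined-config
    projected:
      defaultMode: 0440                 # Octal mode bits for created files
      sources:
        - configMap:
            name: cc-extra-config       # Placeholder ConfigMap name
        - serviceAccountToken:
            audience: kafka             # Intended audience of the token (placeholder)
            expirationSeconds: 3600     # Token validity; must be at least 10 minutes
            path: token                 # File path relative to the mount point
```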
object
quobyte represents a Quobyte mount on the host that shares a pod’s lifetime
string
group to map volume access to. Default is no group.
boolean
readOnly here will force the Quobyte volume to be mounted with read-only permissions. Defaults to false.
string
Required
registry represents a single or multiple Quobyte Registry services, specified as a string of host:port pairs (multiple entries are separated with commas), which acts as the central registry for volumes.
string
tenant owning the given Quobyte volume in the backend. Used with dynamically provisioned Quobyte volumes; the value is set by the plugin.
string
user to map volume access to. Defaults to the serviceaccount user.
string
Required
volume is a string that references an already created Quobyte volume by name.
string
fsType is the filesystem type of the volume that you want to mount. Tip: Ensure that the filesystem type is supported by the host operating system. Examples: “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified. More info: https://kubernetes.io/docs/concepts/storage/volumes#rbd
object
scaleIO represents a ScaleIO persistent volume attached and mounted on Kubernetes nodes.
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Default is “xfs”.
string
Required
gateway is the host address of the ScaleIO API Gateway.
string
protectionDomain is the name of the ScaleIO Protection Domain for the configured storage.
boolean
readOnly Defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
object
Required
secretRef references to the secret for ScaleIO user and other sensitive information. If this is not provided, Login operation will fail.
boolean
sslEnabled is a flag to enable/disable SSL communication with the Gateway; defaults to false.
string
storageMode indicates whether the storage for a volume should be ThickProvisioned or ThinProvisioned. Default is ThinProvisioned.
string
storagePool is the ScaleIO Storage Pool associated with the protection domain.
string
Required
system is the name of the storage system as configured in ScaleIO.
string
volumeName is the name of a volume already created in the ScaleIO system that is associated with this volume source.
integer
defaultMode is Optional: mode bits used to set permissions on created files by default. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. Defaults to 0644. Directories within the path are not affected by this setting. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
array
items If unspecified, each key-value pair in the Data field of the referenced Secret will be projected into the volume as a file whose name is the key and content is the value. If specified, the listed keys will be projected into the specified paths, and unlisted keys will not be present. If a key is specified which is not present in the Secret, the volume setup will error unless it is marked optional. Paths must be relative and may not contain the ‘..’ path or start with ‘..’.
object
Maps a string key to a path within a volume.
string
Required
key is the key to project.
integer
mode is Optional: mode bits used to set permissions on this file. Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. If not specified, the volume defaultMode will be used. This might be in conflict with other options that affect the file mode, like fsGroup, and the result can be other mode bits set.
string
Required
path is the relative path of the file to map the key to. May not be an absolute path. May not contain the path element ‘..’. May not start with the string ‘..’.
boolean
optional field specify whether the Secret or its keys must be defined
object
storageOS represents a StorageOS volume attached and mounted on Kubernetes nodes.
string
fsType is the filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified.
boolean
readOnly defaults to false (read/write). ReadOnly here will force the ReadOnly setting in VolumeMounts.
object
secretRef specifies the secret to use for obtaining the StorageOS API credentials. If not specified, default values will be attempted.
string
volumeName is the human-readable name of the StorageOS volume. Volume names are only unique within a namespace.
string
volumeNamespace specifies the scope of the volume within StorageOS. If no namespace is specified then the Pod’s namespace will be used. This allows the Kubernetes name scoping to be mirrored within StorageOS for tighter integration. Set VolumeName to any name to override the default behaviour. Set to “default” if you are not using namespaces within StorageOS. Namespaces that do not pre-exist within StorageOS will be created.
object
vsphereVolume represents a vSphere volume attached and mounted on kubelets host machine
string
fsType is filesystem type to mount. Must be a filesystem type supported by the host operating system. Ex. “ext4”, “xfs”, “ntfs”. Implicitly inferred to be “ext4” if unspecified.
string
storagePolicyID is the storage Policy Based Management (SPBM) profile ID associated with the StoragePolicyName.
string
storagePolicyName is the storage Policy Based Management (SPBM) profile name.
string
Required
volumePath is the path that identifies vSphere volume vmdk
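Tying the volumes and volumeMounts fields together for the CruiseControl pod, an extra ConfigMap-backed volume might be wired in as sketched below; the enclosing cruiseControlConfig key, the ConfigMap name, and the mount path are illustrative:

```yaml
cruiseControlConfig:
  volumes:
    - name: cc-custom-goals
      configMap:
        name: cruisecontrol-custom-goals      # Placeholder ConfigMap name
  volumeMounts:
    - name: cc-custom-goals                   # Must match the volume name above
      mountPath: /opt/cruise-control/custom   # Must not contain ':'
      readOnly: true
```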
object
DisruptionBudget defines the configuration for PodDisruptionBudget where the workload is managed by the kafka-operator
string
The budget to set for the PDB; it can either be a static number or a percentage.
boolean
If set to true, will create a podDisruptionBudget
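Based on the two fields above, a PodDisruptionBudget request could be expressed as follows, assuming the field names are budget and create; the percentage is a placeholder:

```yaml
disruptionBudget:
  create: true     # Create a PodDisruptionBudget for the managed workload
  budget: "20%"    # Static number or percentage, per the field description above
```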
object
EnvoyConfig defines the config for Envoy
object
Affinity is a group of affinity scheduling rules.
object
Describes node affinity scheduling rules for the pod.
array
The scheduler will prefer to schedule pods to nodes that satisfy the affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding “weight” to the sum if the node matches the corresponding matchExpressions; the node(s) with the highest sum are the most preferred.
object
An empty preferred scheduling term matches all objects with implicit weight 0 (i.e. it’s a no-op). A null preferred scheduling term matches no objects (i.e. is also a no-op).
object
Required
A node selector term, associated with the corresponding weight.
array
A list of node selector requirements by node’s labels.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
array
A list of node selector requirements by node’s fields.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
integer
Required
Weight associated with matching the corresponding nodeSelectorTerm, in the range 1-100.
object
If the affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to an update), the system may or may not try to eventually evict the pod from its node.
array
Required
Required. A list of node selector terms. The terms are ORed.
object
A null or empty node selector term matches no objects. The requirements of them are ANDed. The TopologySelectorTerm type implements a subset of the NodeSelectorTerm.
array
A list of node selector requirements by node’s labels.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
array
A list of node selector requirements by node’s fields.
object
A node selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
The label key that the selector applies to.
string
Required
Represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
array
An array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. If the operator is Gt or Lt, the values array must have a single element, which will be interpreted as an integer. This array is replaced during a strategic merge patch.
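For example, a node affinity that requires one node label while preferring another could be sketched as follows; the label keys and values are placeholders:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/arch   # Label key the selector applies to
              operator: In              # One of In, NotIn, Exists, DoesNotExist, Gt, Lt
              values: ["amd64"]
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50                      # In the range 1-100
        preference:
          matchExpressions:
            - key: node-role/ingress    # Placeholder label key
              operator: Exists          # Values must be empty (omitted) for Exists
```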
object
Describes pod affinity scheduling rules (e.g. co-locate this pod in the same node, zone, etc. as some other pod(s)).
array
The scheduler will prefer to schedule pods to nodes that satisfy the affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding “weight” to the sum if the node has pods which matches the corresponding podAffinityTerm; the node(s) with the highest sum are the most preferred.
object
The weights of all of the matched WeightedPodAffinityTerm fields are added per-node to find the most preferred node(s)
object
Required
Required. A pod affinity term, associated with the corresponding weight.
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
integer
Required
weight associated with matching the corresponding podAffinityTerm, in the range 1-100.
array
If the affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to a pod label update), the system may or may not try to eventually evict the pod from its node. When there are multiple elements, the lists of nodes corresponding to each podAffinityTerm are intersected, i.e. all terms must be satisfied.
object
Defines a set of pods (namely those matching the labelSelector relative to the given namespace(s)) that this pod should be co-located (affinity) or not co-located (anti-affinity) with, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which a pod of the set of pods is running.
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
object
Describes pod anti-affinity scheduling rules (e.g. avoid putting this pod in the same node, zone, etc. as some other pod(s)).
array
The scheduler will prefer to schedule pods to nodes that satisfy the anti-affinity expressions specified by this field, but it may choose a node that violates one or more of the expressions. The node that is most preferred is the one with the greatest sum of weights, i.e. for each node that meets all of the scheduling requirements (resource request, requiredDuringScheduling anti-affinity expressions, etc.), compute a sum by iterating through the elements of this field and adding “weight” to the sum if the node has pods which matches the corresponding podAffinityTerm; the node(s) with the highest sum are the most preferred.
object
The weights of all of the matched WeightedPodAffinityTerm fields are added per-node to find the most preferred node(s)
object
Required
Required. A pod affinity term, associated with the corresponding weight.
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
integer
Required
weight associated with matching the corresponding podAffinityTerm, in the range 1-100.
array
If the anti-affinity requirements specified by this field are not met at scheduling time, the pod will not be scheduled onto the node. If the anti-affinity requirements specified by this field cease to be met at some point during pod execution (e.g. due to a pod label update), the system may or may not try to eventually evict the pod from its node. When there are multiple elements, the lists of nodes corresponding to each podAffinityTerm are intersected, i.e. all terms must be satisfied.
object
Defines a set of pods (namely those matching the labelSelector relative to the given namespace(s)) that this pod should be co-located (affinity) or not co-located (anti-affinity) with, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which a pod of the set of pods is running.
object
A label query over a set of resources, in this case pods.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
object
A label query over the set of namespaces that the term applies to. The term is applied to the union of the namespaces selected by this field and the ones listed in the namespaces field. null selector and null or empty namespaces list means “this pod’s namespace”. An empty selector ({}) matches all namespaces.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
namespaces specifies a static list of namespace names that the term applies to. The term is applied to the union of the namespaces listed in this field and the ones selected by namespaceSelector. null or empty namespaces list and null namespaceSelector means “this pod’s namespace”.
string
Required
This pod should be co-located (affinity) or not co-located (anti-affinity) with the pods matching the labelSelector in the specified namespaces, where co-located is defined as running on a node whose value of the label with key topologyKey matches that of any node on which any of the selected pods is running. Empty topologyKey is not allowed.
object
Annotations defines the annotations placed on the envoy ingress controller deployment
object
DisruptionBudget is the pod disruption budget attached to Envoy Deployment(s)
string
The budget to set for the PDB; it can be either a static number or a percentage.
boolean
If set to true, a podDisruptionBudget is created.
string
The strategy to be used, either minAvailable or maxUnavailable
boolean
EnableHealthCheckHttp10 is a toggle for adding HTTP1.0 support to Envoy health-check, default false
object
Envoy command line arguments
array
ImagePullSecrets for the envoy image pull
object
LocalObjectReference contains enough information to let you locate the referenced object inside the same namespace.
string
LoadBalancerIP can be used to specify an exact IP for the LoadBalancer service
object
NodeSelector is the node selector expression for envoy pods
object
PodSecurityContext holds pod-level security attributes and common container settings for the Envoy pods.
integer
A special supplemental group that applies to all containers in a pod. Some volume types allow the Kubelet to change the ownership of that volume to be owned by the pod:
1. The owning GID will be the FSGroup 2. The setgid bit is set (new files created in the volume will be owned by FSGroup) 3. The permission bits are OR’d with rw-rw----
If unset, the Kubelet will not modify the ownership and permissions of any volume. Note that this field cannot be set when spec.os.name is windows.
string
fsGroupChangePolicy defines behavior of changing ownership and permission of the volume before being exposed inside Pod. This field will only apply to volume types which support fsGroup based ownership(and permissions). It will have no effect on ephemeral volume types such as: secret, configmaps and emptydir. Valid values are “OnRootMismatch” and “Always”. If not specified, “Always” is used. Note that this field cannot be set when spec.os.name is windows.
integer
The GID to run the entrypoint of the container process. Uses runtime default if unset. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence for that container. Note that this field cannot be set when spec.os.name is windows.
boolean
Indicates that the container must run as a non-root user. If true, the Kubelet will validate the image at runtime to ensure that it does not run as UID 0 (root) and fail to start the container if it does. If unset or false, no such validation will be performed. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
integer
The UID to run the entrypoint of the container process. Defaults to user specified in image metadata if unspecified. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence for that container. Note that this field cannot be set when spec.os.name is windows.
object
The SELinux context to be applied to all containers. If unspecified, the container runtime will allocate a random SELinux context for each container. May also be set in SecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence for that container. Note that this field cannot be set when spec.os.name is windows.
string
Level is SELinux level label that applies to the container.
string
Role is a SELinux role label that applies to the container.
string
Type is a SELinux type label that applies to the container.
string
User is a SELinux user label that applies to the container.
object
The seccomp options to use by the containers in this pod. Note that this field cannot be set when spec.os.name is windows.
string
localhostProfile indicates a profile defined in a file on the node should be used. The profile must be preconfigured on the node to work. Must be a descending path, relative to the kubelet’s configured seccomp profile location. Must only be set if type is “Localhost”.
string
Required
type indicates which kind of seccomp profile will be applied. Valid options are:
Localhost - a profile defined in a file on the node should be used. RuntimeDefault - the container runtime default profile should be used. Unconfined - no profile should be applied.
array
A list of groups applied to the first process run in each container, in addition to the container’s primary GID. If unspecified, no groups will be added to any container. Note that this field cannot be set when spec.os.name is windows.
array
Sysctls hold a list of namespaced sysctls used for the pod. Pods with unsupported sysctls (by the container runtime) might fail to launch. Note that this field cannot be set when spec.os.name is windows.
object
Sysctl defines a kernel parameter to be set
string
Required
Name of a property to set
string
Required
Value of a property to set
object
The Windows specific settings applied to all containers. If unspecified, the options within a container’s SecurityContext will be used. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. Note that this field cannot be set when spec.os.name is linux.
string
GMSACredentialSpecName is the name of the GMSA credential spec to use.
boolean
HostProcess determines if a container should be run as a ‘Host Process’ container. This field is alpha-level and will only be honored by components that enable the WindowsHostProcessContainers feature flag. Setting this field without the feature flag will result in errors when validating the Pod. All of a Pod’s containers must have the same effective HostProcess value (it is not allowed to have a mix of HostProcess containers and non-HostProcess containers). In addition, if HostProcess is true then HostNetwork must also be set to true.
string
The UserName in Windows to run the entrypoint of the container process. Defaults to the user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence.
string
PriorityClassName specifies the priority class name for the Envoy pod(s). If specified, the PriorityClass resource with this PriorityClassName must be created beforehand. If not specified, the Envoy pods’ priority defaults to zero.
object
ResourceRequirements describes the compute resource requirements.
string
ServiceAccountName is the name of the service account.
object
The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>.
string
Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
string
Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
string
Operator represents a key’s relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
integer
TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
string
Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.
object
TopologySpreadConstraint specifies how to spread matching pods among the given topology.
object
LabelSelector is used to find matching pods. Pods that match this label selector are counted to determine the number of pods in their corresponding topology domain.
array
matchExpressions is a list of label selector requirements. The requirements are ANDed.
object
A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
string
Required
key is the label key that the selector applies to.
string
Required
operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
array
values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
object
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
array
MatchLabelKeys is a set of pod label keys to select the pods over which spreading will be calculated. The keys are used to lookup values from the incoming pod labels, those key-value labels are ANDed with labelSelector to select the group of existing pods over which spreading will be calculated for the incoming pod. Keys that don’t exist in the incoming pod labels will be ignored. A null or empty list means only match against labelSelector.
integer
Required
MaxSkew describes the degree to which pods may be unevenly distributed. When whenUnsatisfiable=DoNotSchedule, it is the maximum permitted difference between the number of matching pods in the target topology and the global minimum. The global minimum is the minimum number of matching pods in an eligible domain or zero if the number of eligible domains is less than MinDomains. For example, in a 3-zone cluster, MaxSkew is set to 1, and pods with the same labelSelector spread as 2/2/1: In this case, the global minimum is 1. | zone1 | zone2 | zone3 | | P P | P P | P | - if MaxSkew is 1, incoming pod can only be scheduled to zone3 to become 2/2/2; scheduling it onto zone1(zone2) would make the ActualSkew(3-1) on zone1(zone2) violate MaxSkew(1). - if MaxSkew is 2, incoming pod can be scheduled onto any zone. When whenUnsatisfiable=ScheduleAnyway, it is used to give higher precedence to topologies that satisfy it. It’s a required field. Default value is 1 and 0 is not allowed.
integer
MinDomains indicates a minimum number of eligible domains. When the number of eligible domains with matching topology keys is less than minDomains, Pod Topology Spread treats “global minimum” as 0, and then the calculation of Skew is performed. And when the number of eligible domains with matching topology keys equals or greater than minDomains, this value has no effect on scheduling. As a result, when the number of eligible domains is less than minDomains, scheduler won’t schedule more than maxSkew Pods to those domains. If value is nil, the constraint behaves as if MinDomains is equal to 1. Valid values are integers greater than 0. When value is not nil, WhenUnsatisfiable must be DoNotSchedule.
For example, in a 3-zone cluster, MaxSkew is set to 2, MinDomains is set to 5 and pods with the same labelSelector spread as 2/2/2: | zone1 | zone2 | zone3 | | P P | P P | P P | The number of domains is less than 5(MinDomains), so “global minimum” is treated as 0. In this situation, new pod with the same labelSelector cannot be scheduled, because computed skew will be 3(3 - 0) if new Pod is scheduled to any of the three zones, it will violate MaxSkew.
This is a beta field and requires the MinDomainsInPodTopologySpread feature gate to be enabled (enabled by default).
string
NodeAffinityPolicy indicates how we will treat Pod’s nodeAffinity/nodeSelector when calculating pod topology spread skew. Options are: - Honor: only nodes matching nodeAffinity/nodeSelector are included in the calculations. - Ignore: nodeAffinity/nodeSelector are ignored. All nodes are included in the calculations.
If this value is nil, the behavior is equivalent to the Honor policy. This is an alpha-level feature enabled by the NodeInclusionPolicyInPodTopologySpread feature flag.
string
NodeTaintsPolicy indicates how we will treat node taints when calculating pod topology spread skew. Options are: - Honor: nodes without taints, along with tainted nodes for which the incoming pod has a toleration, are included. - Ignore: node taints are ignored. All nodes are included.
If this value is nil, the behavior is equivalent to the Ignore policy. This is an alpha-level feature enabled by the NodeInclusionPolicyInPodTopologySpread feature flag.
string
Required
TopologyKey is the key of node labels. Nodes that have a label with this key and identical values are considered to be in the same topology. We consider each as a “bucket”, and try to put balanced number of pods into each bucket. We define a domain as a particular instance of a topology. Also, we define an eligible domain as a domain whose nodes meet the requirements of nodeAffinityPolicy and nodeTaintsPolicy. e.g. If TopologyKey is “kubernetes.io/hostname”, each Node is a domain of that topology. And, if TopologyKey is “topology.kubernetes.io/zone”, each zone is a domain of that topology. It’s a required field.
string
Required
WhenUnsatisfiable indicates how to deal with a pod if it doesn’t satisfy the spread constraint. - DoNotSchedule (default) tells the scheduler not to schedule it. - ScheduleAnyway tells the scheduler to schedule the pod in any location, but giving higher precedence to topologies that would help reduce the skew. A constraint is considered “Unsatisfiable” for an incoming pod if and only if every possible node assignment for that pod would violate “MaxSkew” on some topology. For example, in a 3-zone cluster, MaxSkew is set to 1, and pods with the same labelSelector spread as 3/1/1: | zone1 | zone2 | zone3 | | P P P | P | P | If WhenUnsatisfiable is set to DoNotSchedule, incoming pod can only be scheduled to zone2(zone3) to become 3/2/1(3/1/2) as ActualSkew(2-1) on zone2(zone3) satisfies MaxSkew(1). In other words, the cluster can still be imbalanced, but scheduler won’t make it more imbalanced. It’s a required field.
array
Envs defines environment variables for Kafka broker Pods. Adding the “+” prefix to the name prepends the value to that environment variable instead of overwriting it. Add the “+” suffix to append.
object
EnvVar represents an environment variable present in a Container.
string
Required
Name of the environment variable. Must be a C_IDENTIFIER.
string
Variable references $(VAR_NAME) are expanded using the previously defined environment variables in the container and any service environment variables. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Defaults to “”.
object
Source for the environment variable’s value. Cannot be used if value is not empty.
object
Selects a key of a ConfigMap.
boolean
Specify whether the ConfigMap or its key must be defined
object
Selects a field of the pod: supports metadata.name, metadata.namespace, metadata.labels['<KEY>'], metadata.annotations['<KEY>'], spec.nodeName, spec.serviceAccountName, status.hostIP, status.podIP, status.podIPs.
string
Version of the schema the FieldPath is written in terms of, defaults to “v1”.
string
Required
Path of the field to select in the specified API version.
object
Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, limits.ephemeral-storage, requests.cpu, requests.memory and requests.ephemeral-storage) are currently supported.
string
Container name: required for volumes, optional for env vars
Specifies the output format of the exposed resources, defaults to “1”
string
Required
Required: resource to select
object
Selects a key of a secret in the pod’s namespace
string
Required
The key of the secret to select from. Must be a valid secret key.
boolean
Specify whether the Secret or its key must be defined
string
IngressController specifies the type of the ingress controller to be used for external listeners. The istioingress ingress controller type requires the spec.istioControlPlane field to be populated as well.
object
IstioControlPlane is a reference to the IstioControlPlane resource for envoy configuration. It must be specified if istio ingress is used.
object
IstioIngressConfig defines the config for the Istio Ingress Controller
object
Annotations defines the annotations placed on the istio ingress controller deployment
array
Envs allows to add additional env vars to the istio meshgateway resource
object
EnvVar represents an environment variable present in a Container.
string
Required
Name of the environment variable. Must be a C_IDENTIFIER.
string
Variable references $(VAR_NAME) are expanded using the previously defined environment variables in the container and any service environment variables. If a variable cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. “$$(VAR_NAME)” will produce the string literal “$(VAR_NAME)”. Escaped references will never be expanded, regardless of whether the variable exists or not. Defaults to “”.
object
Source for the environment variable’s value. Cannot be used if value is not empty.
object
Selects a key of a ConfigMap.
boolean
Specify whether the ConfigMap or its key must be defined
object
Selects a field of the pod: supports metadata.name, metadata.namespace, metadata.labels['<KEY>'], metadata.annotations['<KEY>'], spec.nodeName, spec.serviceAccountName, status.hostIP, status.podIP, status.podIPs.
string
Version of the schema the FieldPath is written in terms of, defaults to “v1”.
string
Required
Path of the field to select in the specified API version.
object
Selects a resource of the container: only resources limits and requests (limits.cpu, limits.memory, limits.ephemeral-storage, requests.cpu, requests.memory and requests.ephemeral-storage) are currently supported.
string
Container name: required for volumes, optional for env vars
Specifies the output format of the exposed resources, defaults to “1”
string
Required
Required: resource to select
object
Selects a key of a secret in the pod’s namespace
string
Required
The key of the secret to select from. Must be a valid secret key.
boolean
Specify whether the Secret or its key must be defined
string
REQUIRED if mode is MUTUAL. The path to a file containing certificate authority certificates to use in verifying a presented client side certificate.
array
Optional: If specified, only support the specified cipher list. Otherwise default to the default cipher list supported by Envoy.
string
The credentialName stands for a unique identifier that can be used to identify the serverCertificate and the privateKey. The credentialName appended with suffix “-cacert” is used to identify the CaCertificates associated with this server. Gateway workloads capable of fetching credentials from a remote credential store such as Kubernetes secrets, will be configured to retrieve the serverCertificate and the privateKey using credentialName, instead of using the file system paths specified above. If using mutual TLS, gateway workload instances will retrieve the CaCertificates using credentialName-cacert. The semantics of the name are platform dependent. In Kubernetes, the default Istio supplied credential server expects the credentialName to match the name of the Kubernetes secret that holds the server certificate, the private key, and the CA certificate (if using mutual TLS). Set the ISTIO_META_USER_SDS metadata variable in the gateway’s proxy to enable the dynamic credential fetching feature.
boolean
If set to true, the load balancer will send a 301 redirect for all http connections, asking the clients to use HTTPS.
string
Optional: Maximum TLS protocol version.
string
Optional: Minimum TLS protocol version.
string
Optional: Indicates whether connections to this port should be secured using TLS. The value of this field determines how TLS is enforced.
string
REQUIRED if mode is SIMPLE or MUTUAL. The path to the file holding the server’s private key.
string
REQUIRED if mode is SIMPLE or MUTUAL. The path to the file holding the server-side TLS certificate to use.
array
A list of alternate names to verify the subject identity in the certificate presented by the client.
array
An optional list of hex-encoded SHA-256 hashes of the authorized client certificates. Both simple and colon separated formats are acceptable. Note: When both verify_certificate_hash and verify_certificate_spki are specified, a hash matching either value will result in the certificate being accepted.
array
An optional list of base64-encoded SHA-256 hashes of the SPKIs of authorized client certificates. Note: When both verify_certificate_hash and verify_certificate_spki are specified, a hash matching either value will result in the certificate being accepted.
object
ResourceRequirements describes the compute resource requirements.
object
The pod this Toleration is attached to tolerates any taint that matches the triple <key,value,effect> using the matching operator <operator>.
string
Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
string
Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
string
Operator represents a key’s relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
integer
TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
string
Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.
object
Required
ListenersConfig defines the Kafka listener types
object
ExternalListenerConfig defines the external listener config for Kafka
string
accessMethod defines the method through which the external listener is exposed. Two types are supported: LoadBalancer and NodePort. LoadBalancer is the recommended default; use NodePort in Kubernetes environments that do not support provisioning load balancers.
integer
Configuring anyCastPort allows access to the Kafka cluster without specifying the exact broker.
object
Config allows specifying ingress controller configuration per external listener; if set, it overrides the default KafkaClusterSpec.IstioIngressConfig or KafkaClusterSpec.EnvoyConfig for this external listener.
integer
Required
externalStartingPort is added to each broker ID to get the port number that will be used for external access to the broker. The choice of broker ID and externalStartingPort must satisfy 0 < broker ID + externalStartingPort <= 65535. If accessMethod is NodePort and externalStartingPort is set to 0, then the broker IDs are not added and the NodePort port numbers are chosen automatically by the Kubernetes Service controller.
string
externalTrafficPolicy denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints. “Local” preserves the client source IP and avoids a second hop for LoadBalancer and NodePort type services, but risks potentially imbalanced traffic spreading. “Cluster” obscures the client source IP and may cause a second hop to another node, but should have good overall load-spreading.
string
In case of external listeners using LoadBalancer access method the value of this field is used to advertise the Kafka broker external listener instead of the public IP of the provisioned LoadBalancer service (e.g. can be used to advertise the listener using a URL recorded in DNS instead of public IP). In case of external listeners using NodePort access method the broker instead of node public IP (see “brokerConfig.nodePortExternalIP”) is advertised on the address having the following format: -.
object
ServerSSLCertSecret is a reference to the Kubernetes secret that contains the server certificate for the listener to be used for SSL communication. The secret must contain the keystore and truststore JKS files and the password for them in base64 encoded format under the keystore.jks, truststore.jks, and password data fields. If this field is omitted, Koperator auto-creates a self-signed server certificate using the configuration provided in the ‘sslSecrets’ field.
object
ServiceAnnotations defines annotations which will be placed to the service or services created for the external listener
string
ServiceType describes the ingress method for the service. Only “NodePort” and “LoadBalancer” are supported. The default value is LoadBalancer.
string
SSLClientAuth specifies whether client authentication is required, requested, or not required. This field defaults to “required” if it is omitted
string
Required
SecurityProtocol is the protocol used to communicate with brokers. Valid values are: plaintext, ssl, sasl_plaintext, sasl_ssl.
object
InternalListenerConfig defines the internal listener config for Kafka
object
ServerSSLCertSecret is a reference to the Kubernetes secret that contains the server certificate for the listener to be used for SSL communication. The secret must contain the keystore and truststore JKS files and the password for them in base64 encoded format under the keystore.jks, truststore.jks, and password data fields. If this field is omitted, Koperator auto-creates a self-signed server certificate using the configuration provided in the ‘sslSecrets’ field.
string
SSLClientAuth specifies whether client authentication is required, requested, or not required. This field defaults to “required” if it is omitted
string
Required
SecurityProtocol is the protocol used to communicate with brokers. Valid values are: plaintext, ssl, sasl_plaintext, sasl_ssl.
object
SSLSecrets defines the Kafka SSL secrets
object
ObjectReference is a reference to an object with a given name, kind and group.
string
Group of the resource being referred to.
string
Kind of the resource being referred to.
string
Required
Name of the resource being referred to.
string
PKIBackend represents an interface implementing the PKIManager
object
MonitoringConfig defines the config for monitoring Kafka and Cruise Control
boolean
Required
If true, OneBrokerPerNode ensures that each Kafka broker is placed on a different node, unless a custom Affinity definition overrides this behavior.
object
RackAwareness defines the required fields to enable Kafka’s rack awareness feature.
boolean
When RemoveUnusedIngressResources is true, the unnecessary resources from the previous ingress state are removed. When false, they are kept so the Kafka cluster remains available to those Kafka clients which are still using the previous ingress setting.
object
Required
RollingUpgradeConfig defines the desired config of the RollingUpgrade
integer
Required
FailureThreshold controls how many failures the cluster can tolerate during a rolling upgrade. Once the number of failures reaches this threshold, the rolling upgrade flow stops. The number of failures is computed as the sum of distinct broker replicas with either offline replicas or out-of-sync replicas and the number of alerts triggered by alerts with ‘rollingupgrade’
array
Required
ZKAddresses specifies the ZooKeeper connection string in the form hostname:port where host and port are the host and port of a ZooKeeper server.
string
ZKPath specifies the ZooKeeper chroot path as part of its ZooKeeper connection string which puts its data under some path in the global ZooKeeper namespace.
object
KafkaClusterStatus defines the observed state of KafkaCluster
string
CruiseControlTopicStatus holds info about the CC topic status
object
ListenerStatuses holds information about the statuses of the configured listeners. The internal and external listeners are stored in separate maps, and each listener can be looked up by name.
object
RollingUpgradeStatus defines status of rolling upgrade
integer
Required
ErrorCount keeps track of the number of errors reported by alerts labeled with ‘rollingupgrade’. It’s reset once these alerts stop firing.
string
Required
ClusterState holds info about the cluster state
6.1.2 - KafkaTopic CRD schema reference (group kafka.banzaicloud.io)
KafkaTopic is the Schema for the kafkatopics API
KafkaTopic
KafkaTopic is the Schema for the kafkatopics API
- Full name:
- kafkatopics.kafka.banzaicloud.io
- Group:
- kafka.banzaicloud.io
- Singular name:
- kafkatopic
- Plural name:
- kafkatopics
- Scope:
- Namespaced
- Versions:
- v1alpha1
Version v1alpha1
Properties
object
KafkaTopicSpec defines the desired state of KafkaTopic
object
Required
ClusterReference states a reference to a cluster for topic/user provisioning
integer
Required
Partitions defines the desired number of partitions; must be positive, or -1 to signify using the broker’s default
integer
Required
ReplicationFactor defines the desired replication factor; must be positive, or -1 to signify using the broker’s default
object
KafkaTopicStatus defines the observed state of KafkaTopic
string
Required
ManagedBy describes who is the manager of the Kafka topic. When its value is not “koperator” then modifications to the topic configurations of the KafkaTopic CR will not be propagated to the Kafka topic. Manager of the Kafka topic can be changed by adding the “managedBy: ” annotation to the KafkaTopic CR.
string
Required
TopicState defines the state of a KafkaTopic
6.1.3 - KafkaUser CRD schema reference (group kafka.banzaicloud.io)
KafkaUser is the Schema for the kafka users API
KafkaUser
KafkaUser is the Schema for the kafka users API
- Full name:
- kafkausers.kafka.banzaicloud.io
- Group:
- kafka.banzaicloud.io
- Singular name:
- kafkauser
- Plural name:
- kafkausers
- Scope:
- Namespaced
- Versions:
- v1alpha1
Version v1alpha1
Properties
object
KafkaUserSpec defines the desired state of KafkaUser
object
Annotations defines the annotations placed on the certificate or certificate signing request object
object
Required
ClusterReference states a reference to a cluster for topic/user provisioning
object
ObjectReference is a reference to an object with a given name, kind and group.
string
Group of the resource being referred to.
string
Kind of the resource being referred to.
string
Required
Name of the resource being referred to.
string
SignerName indicates requested signer, and is a qualified name.
object
UserTopicGrant is the desired permissions for the KafkaUser
string
Required
KafkaAccessType hold info about Kafka ACL
string
KafkaPatternType hold the Resource Pattern Type of kafka ACL
object
KafkaUserStatus defines the observed state of KafkaUser
string
Required
UserState defines the state of a KafkaUser
6.2 - KafkaCluster CR Examples
The following KafkaCluster custom resource examples show you some basic use cases.
You can use these examples as a base for your own Kafka cluster.
KafkaCluster CR with detailed explanation
This is our most descriptive KafkaCluster CR. You can find a lot of valuable explanation about the settings.
Kafka cluster with monitoring
This is a very simple KafkaCluster CR with Prometheus monitoring enabled.
Kafka cluster with ACL, SSL, and rack awareness
You can read more details about rack awareness here.
Kafka cluster with broker configuration
Kafka cluster with custom SSL certificates for external listeners
You can specify custom SSL certificates for listeners.
For details about SSL configuration, see Securing Kafka With SSL.
Kafka cluster with SASL
You can use SASL authentication on the listeners.
For details, see Expose the Kafka cluster to external applications.
Kafka cluster with load balancers and brokers in the same availability zone
You can create a broker-ingress mapping to eliminate traffic across availability zones between load balancers and brokers by configuring load balancers for brokers in the same availability zone.
Kafka cluster with Istio
You can use Istio as the ingress controller for your external listeners. Koperator now uses standard Istio resources (Gateway, VirtualService) instead of the deprecated banzaicloud istio-operator, providing better compatibility and working with any Istio installation.
Kafka cluster with custom advertised address for external listeners and brokers
You can set custom advertised IP address for brokers.
This is useful when you’re advertising the brokers on an IP address different from the Kubernetes node IP address.
You can also set custom advertised address for external listeners.
For details, see Expose the Kafka cluster to external applications.
Kafka cluster with Kubernetes scheduler affinity settings
You can set node affinity for your brokers.
Kafka cluster with custom storage class
You can configure your brokers to use custom storage classes.
Kafka cluster with KRaft mode (ZooKeeper-free)
You can deploy Kafka clusters using KRaft mode, which eliminates the need for ZooKeeper by using Kafka’s built-in consensus mechanism. This is the future of Kafka and is recommended for new deployments.
For detailed information about KRaft configuration and deployment, see KRaft Mode (ZooKeeper-free Kafka).
7 - Provisioning Kafka Topics
Create topic
You can create Kafka topics either:
- directly against the cluster with command line utilities, or
- via the KafkaTopic CRD.
Below is an example KafkaTopic CR you can apply with kubectl.
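(The following is a minimal sketch; the topic name, namespace, and config values are illustrative, and clusterRef must point to your KafkaCluster.)
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaTopic
metadata:
  name: example-topic
  namespace: kafka
spec:
  clusterRef:
    name: kafka
  name: example-topic
  partitions: 3
  replicationFactor: 2
  config:
    "retention.ms": "604800000" # illustrative topic-level configuration
    "cleanup.policy": "delete"
You can apply it with kubectl apply -f example-topic.yaml.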
For a full list of configuration options, see the official Kafka documentation.
Update topic
If you want to update the configuration of the topic after it’s been created, you can either:
- edit the manifest and run kubectl apply again, or
- run kubectl edit -n kafka kafkatopic example-topic and then update the configuration in the editor that gets spawned.
You can increase the partition count for a topic the same way, or by running the following one-liner using patch:
kubectl patch -n kafka kafkatopic example-topic --patch '{"spec": {"partitions": 5}}' --type=merge
kafkatopic.kafka.banzaicloud.io/example-topic patched
Note: Topics created by the Koperator are not enforced in any way. From the Kubernetes perspective, Kafka Topics are external resources.
8 - Kafka clusters with pre-provisioned volumes
This guide describes how to configure KafkaCluster
to deploy Apache Kafka clusters which use pre-provisioned volumes instead of dynamically provisioned ones. Using static volumes is useful in environments where dynamic volume provisioning is not supported. Koperator uses persistent volume claim Kubernetes resources to dynamically provision volumes for the Kafka broker log directories.
Kubernetes provides a feature which allows binding persistent volume claims to existing persistent volumes either through the volumeName
or the selector
field. This allows Koperator to use pre-created persistent volumes as Kafka broker log directories instead of dynamically provisioning persistent volumes. For this binding to work:
- the configuration fields specified under
storageConfigs.pvcSpec
(such as accessModes
, storageClassName
) must match the specification of the pre-created persistent volume, and
- the
resources.requests.storage
must fit onto the capacity of the persistent volume.
For further details on how the persistent volume claim binding works, consult the Kubernetes documentation.
In the following example, it is assumed that you (or your administrator) have already created four persistent volumes. The example shows you how to create a Kafka cluster with two brokers, each broker configured with two log directories to use the four pre-provisioned volumes:
apiVersion: v1
kind: PersistentVolume
metadata:
labels:
namespace: kafka # namespace of the kafka cluster this volume is for in case there are multiple kafka clusters with the same name in different namespaces
kafka_cr: kafka # name of the kafka cluster this volume is for
brokerId: "0" # the id of the broker this volume is for
mountPath: kafka-logs-1 # path mounted as broker log dir
spec:
...
capacity:
storage: 10Gi
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
storageClassName: my-storage-class
...
---
apiVersion: v1
kind: PersistentVolume
metadata:
labels:
namespace: kafka # namespace of the kafka cluster this volume is for in case there are multiple kafka clusters with the same name in different namespaces
kafka_cr: kafka # name of the kafka cluster this volume is for
brokerId: "0" # the id of the broker this volume is for
mountPath: kafka-logs-2 # path mounted as broker log dir
spec:
...
capacity:
storage: 10Gi
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
storageClassName: my-storage-class
...
---
apiVersion: v1
kind: PersistentVolume
metadata:
labels:
namespace: kafka # namespace of the kafka cluster this volume is for in case there are multiple kafka clusters with the same name in different namespaces
kafka_cr: kafka # name of the kafka cluster this volume is for
brokerId: "1" # the id of the broker this volume is for
mountPath: kafka-logs-1 # path mounted as broker log dir
spec:
...
capacity:
storage: 10Gi
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
storageClassName: my-storage-class
...
---
apiVersion: v1
kind: PersistentVolume
metadata:
labels:
namespace: kafka # namespace of the kafka cluster this volume is for in case there are multiple kafka clusters with the same name in different namespaces
kafka_cr: kafka # name of the kafka cluster this volume is for
brokerId: "1" # the id of the broker this volume is for
mountPath: kafka-logs-2 # path mounted as broker log dir
spec:
...
capacity:
storage: 10Gi
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
storageClassName: my-storage-class
...
Broker-level storage configuration to use pre-provisioned volumes
The following storageConfigs are specified at the broker level to use the pre-created persistent volumes described above as broker log dirs:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
namespace: kafka
name: kafka
spec:
...
brokers:
- id: 0
brokerConfigGroup: default
brokerConfig:
storageConfigs:
- mountPath: /kafka-logs-1
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
selector: # bind to pre-provisioned persistent volumes by labels
matchLabels:
namespace: kafka
kafka_cr: kafka
brokerId: "0"
# strip '/' from mount path as label selector values
# has to start with an alphanumeric character': https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set
mountPath: '{{ trimPrefix "/" .MountPath }}'
storageClassName: my-storage-class
- mountPath: /kafka-logs-2
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
selector: # bind to pre-provisioned persistent volumes by labels
matchLabels:
namespace: kafka
kafka_cr: kafka
brokerId: "0"
# strip '/' from mount path as label selector values
# has to start with an alphanumeric character': https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set
mountPath: '{{ trimPrefix "/" .MountPath }}'
storageClassName: my-storage-class
- id: 1
brokerConfigGroup: default
brokerConfig:
storageConfigs:
- mountPath: /kafka-logs-1
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
selector: # bind to pre-provisioned persistent volumes by labels
matchLabels:
namespace: kafka
kafka_cr: kafka
brokerId: "1"
# strip '/' from mount path as label selector values
# has to start with an alphanumeric character': https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set
mountPath: '{{ trimPrefix "/" .MountPath }}'
storageClassName: my-storage-class
- mountPath: /kafka-logs-2
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
selector: # bind to pre-provisioned persistent volumes by labels
matchLabels:
namespace: kafka
kafka_cr: kafka
brokerId: "1"
# strip '/' from mount path as label selector values
# has to start with an alphanumeric character': https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set
mountPath: '{{ trimPrefix "/" .MountPath }}'
storageClassName: my-storage-class
Broker configuration group level storage config to use pre-provisioned volumes
The following storageConfigs are specified at the broker configuration group level to use the pre-created persistent volumes described above as broker log dirs:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
namespace: kafka
name: kafka
spec:
brokerConfigGroups:
default:
storageConfigs:
- mountPath: /kafka-logs-1
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
selector: # bind to pre-provisioned persistent volumes by labels
matchLabels:
namespace: kafka
kafka_cr: kafka
brokerId: '{{ .BrokerId }}'
# strip '/' from mount path as label selector values
# has to start with an alphanumeric character': https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set
mountPath: '{{ trimPrefix "/" .MountPath }}'
storageClassName: my-storage-class
- mountPath: /kafka-logs-2
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
selector: # bind to pre-provisioned persistent volumes by labels
matchLabels:
namespace: kafka
kafka_cr: kafka
brokerId: '{{ .BrokerId }}'
# strip '/' from mount path as label selector values
# has to start with an alphanumeric character': https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set
mountPath: '{{ trimPrefix "/" .MountPath }}'
storageClassName: my-storage-class
- mountPath: /mountpath/that/exceeds63characters/kafka-logs-123456789123456789
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
selector: # bind to pre-provisioned persistent volumes by labels
matchLabels:
namespace: kafka
kafka_cr: kafka
brokerId: '{{ .BrokerId }}'
# use sha1sum of mountPath to not exceed the 63 char limit for label selector values
mountPath: '{{ .MountPath | sha1sum }}'
storageClassName: my-storage-class
...
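The examples above bind the persistent volume claims to the pre-created volumes by label selector. Binding through the volumeName field mentioned earlier is also possible; the following is a minimal sketch (the volume name my-kafka-pv-0 is an assumption):
storageConfigs:
  - mountPath: /kafka-logs-1
    pvcSpec:
      accessModes:
        - ReadWriteOnce
      storageClassName: my-storage-class
      volumeName: my-kafka-pv-0 # name of the pre-created PersistentVolume to bind to (assumed)
      resources:
        requests:
          storage: 10Gi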
Storage config data fields
The following data fields are supported in the storage config:
- .BrokerId - resolves to the current broker’s ID
- .MountPath - resolves to the value of the mountPath field of the current storage config
Under the hood, go-templates enhanced with Sprig functions are used to resolve these fields, which also allows altering the resulting value (see the use of the trimPrefix and sha1sum template functions in the examples above).
9 - Securing Kafka With SSL
The Koperator makes securing your Apache Kafka cluster with SSL simple.
Enable SSL encryption in Apache Kafka
To create an Apache Kafka cluster which has listener(s) with SSL encryption enabled, you must enable SSL encryption and configure the secrets in the listenersConfig section of your KafkaCluster Custom Resource. You can either provide your own CA certificate and the corresponding private key, or let the operator create them for you from your cluster configuration. Using sslSecrets, Koperator generates client and server certificates signed by the CA. The server certificate is shared across listeners. The client certificate is used by Koperator, Cruise Control, and the Cruise Control Metrics Reporter to communicate with Kafka brokers over listeners with SSL enabled.
Providing custom certificates per listener is supported from Koperator version 0.21.0. Configurations where certain external listeners use user-provided certificates while others rely on the auto-generated ones provided by Koperator are also supported. See details below.
Using auto-generated certificates (sslSecrets)
CAUTION:
After the cluster is created, you cannot change the way the listeners are configured without an outage. If a cluster is created with unencrypted (plain text) listeners and you want to switch to SSL encrypted listeners (or the other way around), you must manually delete each broker pod. The operator restarts the pods with the new listener configuration.
The following example enables SSL and automatically generates the certificates:
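The referenced example is not reproduced here in full; the following minimal sketch shows the general shape (the listener layout and the tlsSecretName value are placeholders, adjust them to your cluster):
listenersConfig:
  internalListeners:
    - type: "ssl"
      name: "internal"
      containerPort: 29092
      usedForInnerBrokerCommunication: true
  sslSecrets:
    tlsSecretName: "test-kafka-operator"
    create: true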
If sslSecrets.create
is false
, the operator will look for the secret at sslSecrets.tlsSecretName
in the namespace of the KafkaCluster custom resource and expect these values:
| Key | Value |
| --- | ----- |
| caCert | The CA certificate |
| caKey | The CA private key |
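If you provide the CA yourself (sslSecrets.create set to false), a secret with the expected keys can be created along these lines (the secret name and file names are placeholders):
kubectl create secret generic test-kafka-operator -n kafka \
  --from-file=caCert=ca.crt \
  --from-file=caKey=ca.key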
Using own certificates
Listeners not used for internal broker and controller communication
In this KafkaCluster custom resource, SSL is enabled for all listeners, and certificates are automatically generated for “inner” and “controller” listeners. The “external” and “internal” listeners will use the user-provided certificates. The serverSSLCertSecret key is a reference to the Kubernetes secret that contains the server certificate for the listener to be used for SSL communication.
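The full custom resource is not reproduced here; a minimal sketch of an external listener referencing a user-provided server certificate might look like this (the secret name is a placeholder):
listenersConfig:
  externalListeners:
    - type: "ssl"
      name: "external"
      externalStartingPort: 19090
      containerPort: 9094
      serverSSLCertSecret:
        name: server-secret # Kubernetes secret holding keystore.jks, truststore.jks and password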
In the server secret, the following keys must be set:
| Key | Value |
| --- | ----- |
| keystore.jks | Certificate and private key in JKS format |
| truststore.jks | Trusted CA certificate in JKS format |
| password | Password for the keystore and truststore |
The certificates in the listener configuration must be in JKS format.
Listeners used for internal broker or controller communication
In this KafkaCluster custom resource, SSL is enabled for all listeners, and user-provided server certificates are used. When a custom certificate is used for a listener that serves internal broker or controller communication, you must also specify the client certificate. The client certificate is used by Koperator, Cruise Control, and the Cruise Control Metrics Reporter to communicate over SSL. The clientSSLCertSecret key is a reference to the Kubernetes secret where the custom client SSL certificate can be provided. The client certificate must be signed by the same CA as the server certificate of the corresponding listener. The clientSSLCertSecret must be set in the spec field of the KafkaCluster custom resource.
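A minimal sketch of where this reference might be placed (the secret name is a placeholder):
spec:
  clientSSLCertSecret:
    name: client-secret # Kubernetes secret holding keystore.jks, truststore.jks and password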
The client secret must contain the keystore and truststore JKS files and the password for them in base64 encoded format.
In the server secret the following keys must be set:
| Key | Value |
| --- | ----- |
| keystore.jks | Certificate and private key in JKS format |
| truststore.jks | Trusted CA certificate in JKS format |
| password | Password for the keystore and truststore |
In the client secret the following keys must be set:
| Key | Value |
| --- | ----- |
| keystore.jks | Certificate and private key in JKS format |
| truststore.jks | Trusted CA certificate in JKS format |
| password | Password for the keystore and truststore |
Generate JKS certificate
Certificates in JKS format can be generated using OpenSSL and Java’s keytool. You can also use this script. The keystore.jks file must contain only one PrivateKeyEntry.
Kafka listeners use 2-way SSL (mutual) authentication, so you must properly set the CN (Common Name) field and, if needed, the SAN (Subject Alternative Name) fields in the certificates. In the following description we assume that the Kafka cluster is in the kafka namespace.
-
For the client certificate, the CN must be “kafka-controller.kafka.mgt.cluster.local” (where kafka is the namespace of the Kafka cluster).
-
For internal listeners which are exposed by a headless service (kafka-headless), the CN must be “kafka-headless.kafka.svc.cluster.local”, and the SAN field must contain the following:
- *.kafka-headless.kafka.svc.cluster.local
- kafka-headless.kafka.svc.cluster.local
- *.kafka-headless.kafka.svc
- kafka-headless.kafka.svc
- *.kafka-headless.kafka
- kafka-headless.kafka
- kafka-headless
-
For internal listeners which are exposed by a normal service (kafka-all-broker), the CN must be “kafka-all-broker.kafka.svc.cluster.local”.
-
For external listeners, you need to use the advertised load balancer hostname as the CN. The hostname needs to be specified in the KafkaCluster custom resource with hostnameOverride, and the accessMethod has to be “LoadBalancer”. For details about this override, see Step 5 in Expose cluster using a LoadBalancer.
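As a sketch, the JKS files for an internal listener could be produced with keytool and OpenSSL along the following lines. The file names, alias, and the changeit password are placeholders, ca.crt/ca.key are your CA, and the CN/SAN values must follow the rules above:
# Generate the broker key pair directly in the keystore (a single PrivateKeyEntry)
keytool -genkeypair -keystore keystore.jks -alias kafka -keyalg RSA -validity 365 \
  -storepass changeit -keypass changeit \
  -dname "CN=kafka-headless.kafka.svc.cluster.local" \
  -ext "SAN=DNS:*.kafka-headless.kafka.svc.cluster.local,DNS:kafka-headless.kafka.svc.cluster.local"
# Create a CSR and sign it with your CA
keytool -keystore keystore.jks -alias kafka -certreq -file kafka.csr -storepass changeit
openssl x509 -req -CA ca.crt -CAkey ca.key -in kafka.csr -out kafka-signed.crt -days 365 -CAcreateserial
# Import the CA certificate and the signed certificate into the keystore, and the CA into the truststore
keytool -keystore keystore.jks -alias CARoot -import -file ca.crt -storepass changeit -noprompt
keytool -keystore keystore.jks -alias kafka -import -file kafka-signed.crt -storepass changeit -noprompt
keytool -keystore truststore.jks -alias CARoot -import -file ca.crt -storepass changeit -noprompt
# Package the stores and the password into the Kubernetes secret referenced by the listener
kubectl create secret generic server-secret -n kafka \
  --from-file=keystore.jks --from-file=truststore.jks \
  --from-literal=password=changeit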
Using Kafka ACLs with SSL
If you choose not to enable ACLs for your Apache Kafka cluster, you may still use the KafkaUser
resource to create new certificates for your applications.
You can leave the topicGrants
out as they will not have any effect.
-
To enable ACL support for your Apache Kafka cluster, pass the following configurations along with your brokerConfig
:
authorizer.class.name=kafka.security.authorizer.AclAuthorizer
allow.everyone.if.no.acl.found=false
-
The operator will ensure that Cruise Control and the operator itself can still access the cluster; however, to create new clients you will need to generate new certificates signed by the CA, and set ACLs on the topic. The operator can automate this process for you using the KafkaUser CRD.
For example, to create a new producer for the topic test-topic
against the KafkaCluster kafka
, apply the following configuration:
cat << EOF | kubectl apply -n kafka -f -
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaUser
metadata:
name: example-producer
namespace: kafka
spec:
clusterRef:
name: kafka
secretName: example-producer-secret
topicGrants:
- topicName: test-topic
accessType: write
EOF
This will create a user and store its credentials in the secret example-producer-secret
. The secret contains these fields:
| Key | Value |
| --- | ----- |
| ca.crt | The CA certificate |
| tls.crt | The user certificate |
| tls.key | The user private key |
-
You can then mount these secrets to your pod. Alternatively, you can write them to your local machine by running:
kubectl get secret example-producer-secret -o jsonpath="{['data']['ca\.crt']}" | base64 -d > ca.crt
kubectl get secret example-producer-secret -o jsonpath="{['data']['tls\.crt']}" | base64 -d > tls.crt
kubectl get secret example-producer-secret -o jsonpath="{['data']['tls\.key']}" | base64 -d > tls.key
-
To create a consumer for the topic, run this command:
cat << EOF | kubectl apply -n kafka -f -
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaUser
metadata:
name: example-consumer
namespace: kafka
spec:
clusterRef:
name: kafka
secretName: example-consumer-secret
includeJKS: true
topicGrants:
- topicName: test-topic
accessType: read
EOF
-
The operator can also include a Java keystore format (JKS) with your user secret if you’d like. Add includeJKS: true to the spec as shown above, and the user secret will gain these additional fields:
| Key | Value |
| --- | ----- |
| tls.jks | The Java keystore containing both the user keys and the CA (use this for your keystore AND truststore) |
| pass.txt | The password to decrypt the JKS (this will be randomly generated) |
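For example, a Java Kafka client could use this secret with properties along these lines (the path is a placeholder; replace the password values with the contents of pass.txt):
security.protocol=SSL
ssl.keystore.location=/path/to/tls.jks
ssl.keystore.password=<contents of pass.txt>
ssl.truststore.location=/path/to/tls.jks
ssl.truststore.password=<contents of pass.txt>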
10 - Koperator capabilities
As highlighted in the features section, Koperator removes the reliance on StatefulSet and supports several different use cases.
Note: This is not a complete list; if you have a specific requirement or question, see our support options.
Vertical capacity scaling
You may have encountered situations where horizontal scaling of a cluster is impossible. When only one Broker is throttled and needs more CPU or additional disks (because it handles the most partitions), a StatefulSet-based solution is of no help, since it does not distinguish between the specifications of its replicas. Handling such a case requires unique Broker configurations. If a new disk needs to be added to a single Broker, a StatefulSet-based solution wastes disk space (and money): since it cannot add a disk to a specific Broker, it adds one to every replica.
With the Koperator, adding a new disk to any Broker is as easy as changing a CR configuration. Similarly, any Broker-specific configuration can be done on a Broker by Broker basis.
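For example, an extra disk could be attached to a single Broker by adding a storage config to that Broker's entry in the CR. A sketch, with placeholder mount path and size:
brokers:
  - id: 1
    brokerConfigGroup: "default"
    brokerConfig:
      storageConfigs:
        - mountPath: /kafka-logs-extra
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi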
An unhandled error with Broker #1 in a three-Broker cluster
In the event of an error with Broker #1, it is ideal to handle it without disrupting the other Brokers. To handle the error, you would like to temporarily remove this Broker from the cluster and fix its state - for example, by repairing the node that hosts it, or by reconfiguring the Broker with a new configuration. Again, when using StatefulSet, you lose the ability to remove specific Brokers from the cluster. StatefulSet only supports a replicas field that determines how many replicas an application should use. On a downscale or removal, this number can be lowered; however, that means Kubernetes will remove the most recently added Pod (Broker #3) from the cluster - which, in this case, is not what you want.
With a StatefulSet, removing Broker #1 would mean lowering the number of brokers in the cluster from three to one, leaving only the faulty Broker live while killing the healthy Brokers that handle traffic. Koperator supports removing specific brokers without disrupting traffic in the cluster.
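In practice, removing a specific Broker amounts to deleting its entry from the brokers list in the KafkaCluster CR; a sketch of the intent:
brokers:
  - id: 0
    brokerConfigGroup: "default"
  # - id: 1    # entry removed: only this Broker is decommissioned
  - id: 2
    brokerConfigGroup: "default"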
Fine grained Broker config support
Apache Kafka is a stateful application, where Brokers create/form a cluster with other Brokers. Every Broker is uniquely configurable (Koperator supports heterogeneous environments, in which no nodes are the same, act the same, or have the same specifications - from the infrastructure up through the Brokers’ Envoy configuration). Kafka has many Broker configs that can be used to fine-tune specific brokers, and Koperator does not want to limit these to all Brokers at once, as a StatefulSet would. Koperator supports unique Broker configs.
In each of the three scenarios listed above, Koperator does not use StatefulSet, relying, instead, on Pods, PVCs and ConfigMaps. While using StatefulSet is a very convenient starting point, as it handles roughly 80% of scenarios, it also introduces huge limitations when running Kafka on Kubernetes in production.
Monitoring based control
Use of monitoring is essential for any application, and all relevant information about Kafka should be published to a monitoring solution. When using Kubernetes, the de facto solution is Prometheus, which supports configuring alerts based on previously consumed metrics. Koperator was built as a standards-based solution (Prometheus and Alert Manager) that could handle and react to alerts automatically, so human operators wouldn’t have to. Koperator supports alert-based Kafka cluster management.
LinkedIn’s Cruise Control
LinkedIn knows how to operate Kafka at scale: they built a tool, called Cruise Control, to operate their Kafka infrastructure. Koperator is built to handle the infrastructure, not to reinvent the wheel when it comes to operating Kafka itself. It leverages the Kubernetes operator pattern and our Kubernetes expertise to handle all Kafka infrastructure-related issues in the best possible way. Managing Kafka is a separate concern, for which industry-standard tools already exist, so LinkedIn’s Cruise Control is integrated with the Koperator.
11 - KRaft Mode (ZooKeeper-free Kafka)
Apache Kafka KRaft (Kafka Raft) mode is a new consensus mechanism that eliminates the dependency on Apache ZooKeeper. KRaft mode uses Kafka’s built-in Raft consensus algorithm to manage cluster metadata, making Kafka deployments simpler and more scalable.
Overview
KRaft mode offers several advantages over traditional ZooKeeper-based deployments:
- Simplified Architecture: No need to deploy and manage a separate ZooKeeper cluster
- Better Scalability: Improved metadata handling for large clusters
- Faster Recovery: Quicker cluster startup and recovery times
- Reduced Operational Complexity: Fewer moving parts to monitor and maintain
- Future-Ready: KRaft is the future of Kafka and will eventually replace ZooKeeper
- Production Stability: Kafka 3.9.1 includes significant KRaft improvements and bug fixes
Prerequisites
- Koperator version 0.26.0 or later
- Kafka version 3.9.1 or later (minimum: 3.3.0, but 3.9.1+ recommended for stability and features)
- Kubernetes cluster with sufficient resources
Note: While KRaft mode is available starting from Kafka 3.3.0, version 3.9.1 includes significant stability improvements, bug fixes, and performance enhancements for KRaft deployments. For production environments, always use Kafka 3.9.1 or later.
KRaft Architecture in Koperator
In KRaft mode, Koperator deploys Kafka brokers with different process roles:
- Controller nodes: Handle cluster metadata and leader election
- Broker nodes: Handle client requests and data storage
- Combined nodes: Can act as both controller and broker (not recommended for production)
Basic KRaft Configuration
To enable KRaft mode in your KafkaCluster custom resource, set the kRaft
field to true
:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
name: kafka-kraft
spec:
kRaft: true
# ... other configuration
Process Roles Configuration
Configure different process roles for your brokers using the processRoles
field:
Controller-only nodes
brokers:
- id: 3
brokerConfig:
processRoles:
- controller
Broker-only nodes
brokers:
- id: 0
brokerConfig:
processRoles:
- broker
Combined nodes (not recommended for production)
brokers:
- id: 0
brokerConfig:
processRoles:
- controller
- broker
Listener Configuration for KRaft
KRaft mode requires specific listener configuration for controller communication:
listenersConfig:
internalListeners:
- type: "plaintext"
name: "internal"
containerPort: 29092
usedForInnerBrokerCommunication: true
- type: "plaintext"
name: "controller"
containerPort: 29093
usedForInnerBrokerCommunication: false
usedForControllerCommunication: true
Quick Start with KRaft
To quickly deploy a KRaft-enabled Kafka cluster, you can use this sample configuration:
Complete KRaft Example
Here’s a complete example of a KRaft-enabled Kafka cluster:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
name: kafka-kraft
spec:
kRaft: true
headlessServiceEnabled: true
clusterImage: "ghcr.io/adobe/koperator/kafka:2.13-3.9.1"
brokerConfigGroups:
default:
storageConfigs:
- mountPath: "/kafka-logs"
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
broker:
processRoles:
- broker
storageConfigs:
- mountPath: "/kafka-logs-broker"
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
brokers:
# Broker-only nodes
- id: 0
brokerConfigGroup: "broker"
- id: 1
brokerConfigGroup: "broker"
- id: 2
brokerConfigGroup: "broker"
# Controller-only nodes
- id: 3
brokerConfigGroup: "default"
brokerConfig:
processRoles:
- controller
- id: 4
brokerConfigGroup: "default"
brokerConfig:
processRoles:
- controller
- id: 5
brokerConfigGroup: "default"
brokerConfig:
processRoles:
- controller
listenersConfig:
internalListeners:
- type: "plaintext"
name: "internal"
containerPort: 29092
usedForInnerBrokerCommunication: true
- type: "plaintext"
name: "controller"
containerPort: 29093
usedForInnerBrokerCommunication: false
usedForControllerCommunication: true
cruiseControlConfig:
cruiseControlTaskSpec:
RetryDurationMinutes: 5
topicConfig:
partitions: 12
replicationFactor: 3
Kafka Version Recommendations
Why Kafka 3.9.1 is Recommended
Kafka 3.9.1 includes several important improvements for KRaft mode:
- Enhanced Stability: Critical bug fixes for controller failover and metadata consistency
- Improved Performance: Better handling of large metadata operations and faster startup times
- Security Enhancements: Updated security features and vulnerability fixes
- Monitoring Improvements: Better metrics and observability for KRaft clusters
- Production Readiness: Extensive testing and validation for production workloads
Best Practices
Controller Node Configuration
- Use odd numbers: Deploy an odd number of controller nodes (3, 5, or 7) for proper quorum
- Minimum 3 controllers: For production environments, use at least 3 controller nodes
- Separate controllers: Use dedicated controller-only nodes for production workloads
- Resource allocation: Controllers need less CPU and memory than brokers but require fast storage
Storage Considerations
- Fast storage: Use SSD storage for controller nodes to ensure fast metadata operations
- Separate storage: Consider using separate storage for controllers and brokers
- Backup strategy: Implement proper backup strategies for controller metadata
Network Configuration
- Controller listener: Always configure a dedicated listener for controller communication
- Security: Apply the same security configurations (SSL, SASL) to controller listeners
Migration from ZooKeeper
CAUTION:
Migration from ZooKeeper-based clusters to KRaft mode is a complex process that requires careful planning. Always test the migration process in a non-production environment first.
Currently, Koperator does not support automatic migration from ZooKeeper to KRaft mode. For migration scenarios:
- New deployments: Use KRaft mode for all new Kafka clusters
- Existing clusters: Plan for a blue-green deployment strategy
- Data migration: Use tools like MirrorMaker 2.0 for data migration between clusters
Monitoring KRaft Clusters
KRaft clusters expose additional metrics for monitoring controller health:
- kafka.server:type=raft-metrics: Raft consensus metrics
- kafka.server:type=broker-metadata-metrics: Metadata handling metrics
Configure your monitoring to track these KRaft-specific metrics alongside standard Kafka metrics.
Troubleshooting
Common Issues
- Controller quorum loss: Ensure at least (n/2 + 1) controllers are healthy
- Metadata inconsistency: Check controller logs for Raft consensus issues
- Slow startup: Controllers may take longer to start during initial cluster formation
Useful Commands
Check controller status:
kubectl exec -it kafka-controller-3-xxx -- kafka-metadata-shell.sh --snapshot /kafka-logs/__cluster_metadata-0/00000000000000000000.log
View controller logs:
kubectl logs kafka-controller-3-xxx -f
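You can also query the Raft quorum state with Kafka's metadata quorum tool (the pod name, service address, and tool path below follow the examples in this guide and must be adjusted to your cluster):
kubectl exec -it kafka-controller-3-xxx -n kafka -- \
  /opt/kafka/bin/kafka-metadata-quorum.sh \
  --bootstrap-server kafka-headless.kafka:29092 describe --status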
Limitations
- No ZooKeeper migration: Automatic migration from ZooKeeper is not supported
- Kafka version: Requires Kafka 3.3.0 minimum, but 3.9.1 or later is strongly recommended
- Feature parity: Some advanced ZooKeeper features may not be available in early KRaft versions
Resources
12 - Monitoring Apache Kafka on Kubernetes
This documentation shows you how to enable custom monitoring on an Apache Kafka cluster installed using the Koperator.
Using Helm for Prometheus
By default, the Koperator does not set annotations on the broker pods. To set annotations on the broker pods, specify them in the KafkaCluster CR. Also, you must open port 9020 on brokers and in CruiseControl to enable scraping. For example:
brokerConfigGroups:
default:
brokerAnnotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9020"
...
cruiseControlConfig:
cruiseControlAnnotations:
prometheus.io/port: "9020"
prometheus.io/scrape: "true"
Prometheus must be configured to recognize these annotations. The following example contains the required config.
# Example scrape config for pods
#
# The relabeling allows the actual pod scrape endpoint to be configured via the
# following annotations:
#
# * `prometheus.io/scrape`: Only scrape pods that have a value of `true`
# * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
# * `prometheus.io/port`: Scrape the pod on the indicated port instead of the default of `9102`.
- job_name: 'kubernetes-pods'
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
target_label: __address__
If you are using the provided CR, the operator installs the official jmx exporter for Prometheus.
To change this behavior, modify the following lines at the end of the CR.
monitoringConfig:
  # jmxImage describes the Prometheus JMX exporter agent container to use
  jmxImage: "ghcr.io/amuraru/jmx-javaagent:0.19.1"
  # pathToJar describes the path to the jar file in the given image
  pathToJar: "/opt/jmx_exporter/jmx_prometheus_javaagent-0.19.1.jar"
  # kafkaJMXExporterConfig describes the JMX exporter config for Kafka
  kafkaJMXExporterConfig: |
    lowercaseOutputName: true
Using the ServiceMonitors
To use ServiceMonitors, we recommend running Kafka with a unique service per broker instead of a headless service.
Configure the CR the following way:
# Specify if the cluster should use headlessService for Kafka or individual services
# using service/broker may come in handy in case of service mesh
headlessServiceEnabled: false
Disabling the headless service means the operator will set up Kafka with a unique service per broker.
Once you have a cluster up and running, create as many ServiceMonitors as you have brokers.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: kafka-0
spec:
selector:
matchLabels:
app: kafka
brokerId: "0"
kafka_cr: kafka
endpoints:
- port: metrics
interval: 10s
13 - Expose the Kafka cluster to external applications
There are two methods to expose your Apache Kafka cluster so that external client applications that run outside the Kubernetes cluster can access it:
The LoadBalancer
method is a convenient way to publish your Kafka cluster, as you don’t have to set up a Load Balancer, provision public IPs, configure routing rules, and so on, since all these are taken care of for you. Also, this method has the advantage of reducing your attack surface, since you don’t have to make the Kubernetes cluster’s nodes directly reachable from outside, because incoming external traffic is routed to the nodes of the Kubernetes cluster through the Load Balancer.
The NodePort
method provides access to Kafka for external clients through the external public IP of the nodes of the Kubernetes cluster.
This NodePort
method is a good fit when:
- your Kubernetes distribution or hosting environment does not support Load Balancers,
- business requirements make the extra hops introduced by the Load Balancer and Ingress controller unacceptable, or
- the environment where the Kubernetes cluster is hosted is locked down, and thus the Kubernetes nodes are not reachable through their public IPs from outside.
External listeners
You can expose the Kafka cluster outside the Kubernetes cluster by declaring one or more externalListeners in the KafkaCluster
custom resource. The following externalListeners configuration snippet creates two external access points through which the Kafka cluster’s brokers can be reached. These external listeners are registered in the advertised.listeners
Kafka broker configuration as EXTERNAL1://...,EXTERNAL2://...
.
By default, external listeners use the LoadBalancer access method.
listenersConfig:
externalListeners:
- type: "plaintext"
name: "external1"
externalStartingPort: 19090
containerPort: 9094
# anyCastPort sets which port clients can use to reach all the brokers of the Kafka cluster, default is 29092
# valid range: 0 < x < 65536
# this doesn't have impact if using NodePort to expose the Kafka cluster
anyCastPort: 443
# ingressControllerTargetPort sets which port the ingress controller uses to handle the external client traffic through the "anyCastPort", default is 29092
# valid range: 1023 < x < 65536
# this doesn't have impact if using NodePort to expose the Kafka cluster
# if specified, the ingressControllerTargetPort cannot collide with the reserved envoy ports (if using envoy) and the external broker port numbers
ingressControllerTargetPort: 3000
- type: "plaintext"
name: "external2"
externalStartingPort: 19090
containerPort: 9095
Expose cluster using a LoadBalancer
To configure an external listener that uses the LoadBalancer access method, complete the following steps.
- Edit the
KafkaCluster
custom resource.
- Add an
externalListeners
section under listenersConfig
. The following example creates a Load Balancer for the external listener, external1
. Each broker in the cluster receives a dedicated port number on the Load Balancer which is computed as broker port number = externalStartingPort + broker id. This will be registered in each broker’s config as advertised.listeners=EXTERNAL1://<loadbalancer-public-ip>:<broker port number>
.
There are currently two reserved container ports while using Envoy as the ingress controller: 8081 for health-check port, and 8080 for admin port. The external broker port numbers (externalStartingPort + broker id) cannot collide with the reserved envoy ports.
```yaml
listenersConfig:
externalListeners:
- type: "plaintext"
name: "external1"
externalStartingPort: 19090
containerPort: 9094
accessMethod: LoadBalancer
# anyCastPort sets which port clients can use to reach all the brokers of the Kafka cluster, default is 29092
# valid range: 0 < x < 65536
anyCastPort: 443
# ingressControllerTargetPort sets which port the ingress controller uses to handle the external client traffic through the "anyCastPort", default is 29092
# valid range: 1023 < x < 65536
# if specified, the ingressControllerTargetPort cannot collide with the reserved envoy ports (if using envoy) and the external broker port numbers
ingressControllerTargetPort: 3000
```
-
Set the ingress controller. The ingress controllers that are currently supported for load balancing are:
envoy
: uses Envoy proxy as an ingress.
istioingress
: uses Istio proxy gateway as an ingress.
Configure the ingress controller you want to use:
-
To use Envoy, set the ingressController
field in the KafkaCluster
custom resource to envoy
. For an example, see the following configurations.
For OpenShift:
spec:
# ...
envoyConfig:
podSecurityContext:
runAsGroup: 19090
runAsUser: 19090
# ...
ingressController: "envoy"
# ...
For Kubernetes:
spec:
ingressController: "envoy"
-
To use Istio ingress controller set the ingressController
field to istioingress
. Koperator now uses standard Istio resources (Gateway, VirtualService) instead of the deprecated banzaicloud istio-operator. This provides better compatibility and works with any Istio installation. The istioControlPlane
configuration is no longer required.
spec:
ingressController: "istioingress"
istioIngressConfig:
gatewayConfig:
mode: ISTIO_MUTUAL # or SIMPLE for non-mTLS
For detailed Istio integration configuration and advanced features, see the Istio Integration Guide.
-
Configure additional parameters for the ingress controller as needed for your environment, for example, the number of replicas, resource requests, and resource limits. You can configure such parameters using the envoyConfig and istioIngressConfig fields, respectively.
-
(Optional) For external access through a static URL instead of the load balancer’s public IP, specify the URL in the hostnameOverride
field of the external listener that resolves to the public IP of the load balancer. The broker address will be advertised as advertised.listeners=EXTERNAL1://kafka-1.dev.my.domain:<broker port number>
.
listenersConfig:
externalListeners:
- type: "plaintext"
name: "external1"
externalStartingPort: 19090
containerPort: 9094
accessMethod: LoadBalancer
hostnameOverride: kafka-1.dev.my.domain
-
Apply the KafkaCluster
custom resource to the cluster.
Expose cluster using a NodePort
Using the NodePort access method, external listeners make Kafka brokers accessible through either the external IP of a Kubernetes cluster’s node, or on an external IP that routes into the cluster.
To configure an external listener that uses the NodePort access method, complete the following steps.
-
Edit the KafkaCluster
custom resource.
-
Add an externalListeners
section under listenersConfig
. The following example creates a NodePort type service separately for each broker. Brokers can be reached from outside the Kubernetes cluster at <any node public ip>:<broker port number>
where the <broker port number>
is computed as externalStartingPort + broker id. The externalStartingPort must fall into the range allocated for NodePorts on the Kubernetes cluster, which is specified via --service-node-port-range (see the Kubernetes documentation).
listenersConfig:
externalListeners:
- type: "plaintext"
name: "external1"
externalStartingPort: 32000
containerPort: 9094
accessMethod: NodePort
-
(Optional) For external access through a dynamic URL, specify a suffix in the hostnameOverride
field of the external listener:
listenersConfig:
externalListeners:
- type: "plaintext"
name: "external1"
externalStartingPort: 32000
containerPort: 9094
accessMethod: NodePort
      hostnameOverride: .dev.my.domain
The hostnameOverride
behaves differently here than with LoadBalancer access method. In this case, each broker will be advertised as advertised.listeners=EXTERNAL1://<kafka-cluster-name>-<broker-id>.<external listener name>.<namespace><value-specified-in-hostnameOverride-field>:<broker port number>
. If a three-broker Kafka cluster named kafka is running in the kafka namespace, the advertised.listeners
for the brokers will look like this:
- broker 0:
- advertised.listeners=EXTERNAL1://kafka-0.external1.kafka.dev.my.domain:32000
- broker 1:
- advertised.listeners=EXTERNAL1://kafka-1.external1.kafka.dev.my.domain:32001
- broker 2:
- advertised.listeners=EXTERNAL1://kafka-2.external1.kafka.dev.my.domain:32002
-
Apply the KafkaCluster
custom resource to the cluster.
NodePort external IP
The node IP of the node where the broker pod is scheduled will be used in the advertised.listeners broker configuration when the nodePortNodeAddressType
is specified.
Its value determines which IP or domain name of the Kubernetes node will be used, the possible values are: Hostname, ExternalIP, InternalIP, InternalDNS and ExternalDNS.
The hostnameOverride and nodePortExternalIP fields must not be specified in this case.
brokers:
- id: 0
brokerConfig:
nodePortNodeAddressType: ExternalIP
- id: 1
brokerConfig:
nodePortNodeAddressType: ExternalIP
- id: 2
brokerConfig:
nodePortNodeAddressType: ExternalIP
If the hostnameOverride and nodePortExternalIP fields are not set, then the broker address is advertised as follows:
- broker 0:
- advertised.listeners=EXTERNAL1://16.171.47.211:9094
- broker 1:
- advertised.listeners=EXTERNAL1://16.16.66.201:9094
- broker 2:
- advertised.listeners=EXTERNAL1://16.170.214.51:9094
Kafka brokers can be made accessible on external IPs that are not node IP, but can route into the Kubernetes cluster. These external IPs can be set for each broker in the KafkaCluster custom resource as in the following example:
brokers:
- id: 0
brokerConfig:
nodePortExternalIP:
external1: 13.53.214.23 # if "hostnameOverride" is not set for "external1" external listener, then broker is advertised on this IP
- id: 1
brokerConfig:
nodePortExternalIP:
external1: 13.48.71.170 # if "hostnameOverride" is not set for "external1" external listener, then broker is advertised on this IP
- id: 2
brokerConfig:
nodePortExternalIP:
external1: 13.49.70.146 # if "hostnameOverride" is not set for "external1" external listener, then broker is advertised on this IP
If the hostnameOverride field is not set, then the broker address is advertised as follows:
- broker 0:
- advertised.listeners=EXTERNAL1://13.53.214.23:9094
- broker 1:
- advertised.listeners=EXTERNAL1://13.48.71.170:9094
- broker 2:
- advertised.listeners=EXTERNAL1://13.49.70.146:9094
If both hostnameOverride and nodePortExternalIP fields are set:
- broker 0:
- advertised.listeners=EXTERNAL1://kafka-0.external1.kafka.dev.my.domain:9094
- broker 1:
- advertised.listeners=EXTERNAL1://kafka-1.external1.kafka.dev.my.domain:9094
- broker 2:
- advertised.listeners=EXTERNAL1://kafka-2.external1.kafka.dev.my.domain:9094
Note: If nodePortExternalIP or nodePortNodeAddressType is set, then the containerPort from the external listener config is used as a broker port, and is the same for each broker.
SASL authentication on external listeners
To enable sasl_plaintext authentication on the external listener, modify the externalListeners section of the KafkaCluster CR according to the following example. This will enable an external listener on port 19090.
listenersConfig:
externalListeners:
- config:
defaultIngressConfig: ingress-sasl
ingressConfig:
ingress-sasl:
istioIngressConfig:
gatewayConfig:
credentialName: istio://sds
mode: SIMPLE
containerPort: 9094
externalStartingPort: 19090
name: external
type: sasl_plaintext
To connect to this listener using the Kafka 3.1.0 (and above) console producer, complete the following steps:
-
Set the producer properties like this. Replace the parameters between brackets as needed for your environment:
sasl.mechanism=OAUTHBEARER
security.protocol=SASL_SSL
sasl.login.callback.handler.class=org.apache.kafka.common.security.oauthbearer.secured.OAuthBearerLoginCallbackHandler
sasl.oauthbearer.token.endpoint.url=<https://myidp.example.com/oauth2/default/v1/token>
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
clientId="<oauth-client-id>" \
clientSecret="<client-secret>" \
scope="kafka:write";
ssl.truststore.location=/ssl/truststore.jks
ssl.truststore.password=truststorepass
ssl.endpoint.identification.algorithm=
-
Run the following command:
kafka-console-producer.sh --bootstrap-server <your-loadbalancer-ip>:19090 --topic <your-topic-name> --producer.config producer.properties
To consume messages from this listener using the Kafka 3.1.0 (and above) console consumer, complete the following steps:
-
Set the consumer properties like this. Replace the parameters between brackets as needed for your environment:
group.id=consumer-1
group.instance.id=consumer-1-instance-1
client.id=consumer-1-instance-1
sasl.mechanism=OAUTHBEARER
security.protocol=SASL_SSL
sasl.login.callback.handler.class=org.apache.kafka.common.security.oauthbearer.secured.OAuthBearerLoginCallbackHandler
sasl.oauthbearer.token.endpoint.url=<https://myidp.example.com/oauth2/default/v1/token>
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
clientId="<oauth-client-id>" \
clientSecret="<client-secret>" \
scope="kafka:read" ;
ssl.endpoint.identification.algorithm=
ssl.truststore.location=/ssl/truststore.jks
ssl.truststore.password=truststorepass
-
Run the following command:
kafka-console-consumer.sh --bootstrap-server <your-loadbalancer-ip>:19090 --topic <your-topic-name> --consumer.config /opt/kafka/config/consumer.properties --from-beginning
14 - Configure rack awareness
Kafka automatically replicates partitions across brokers, so if a broker fails, the data is safely preserved on another. Kafka’s rack awareness feature spreads replicas of the same partition across different failure groups (racks or availability zones). This extends the guarantees Kafka provides for broker-failure to cover rack and availability zone (AZ) failures, limiting the risk of data loss should all the brokers in the same rack or AZ fail at once.
Note: All brokers deployed by Koperator must belong to the same Kubernetes cluster.
Since rack awareness is so vitally important, especially in multi-region and hybrid-cloud environments, the Koperator provides an automated solution for it, and allows fine-grained broker rack configuration based on pod affinities and anti-affinities.
When well-known Kubernetes labels are available (for example, AZ, node labels, and so on), the Koperator attempts to improve broker resilience by default.
When the broker.rack
configuration option is enabled on the Kafka brokers, Kafka spreads replicas of a partition over different racks. This prevents the loss of data even when an entire rack goes offline at once. According to the official Kafka documentation, “it uses an algorithm which ensures that the number of leaders per broker will be constant, regardless of how brokers are distributed across racks. This ensures balanced throughput.”
Note: The broker.rack
configuration is a read-only config, changing it requires a broker restart.
Enable rack awareness
Enabling rack awareness on a production cluster in a cloud environment is essential, since regions, nodes, and network partitions may vary.
To configure rack awareness, add the following to the spec section of your KafkaCluster CRD.
rackAwareness:
labels:
- "topology.kubernetes.io/region"
- "topology.kubernetes.io/zone"
oneBrokerPerNode: false
- If
oneBrokerPerNode
is set to true
, each broker starts on a new node (that is, literally, one broker per node). If there are not enough nodes for each broker, the broker pod remains in pending
state.
- If
oneBrokerPerNode
is set to false
, the operator tries to schedule the brokers to unique nodes, but if the number of nodes is less than the number of brokers, brokers are scheduled to nodes on which a broker is already running.
Most cloud provider-managed Kubernetes clusters have well-known
labels. One well-known label is topology.kubernetes.io/zone. Kubernetes adds this label to the nodes of the cluster and populates the label with zone information from the cloud provider. (If the node is in an on-prem cluster, the operator can also set this label, but it’s not strictly mandatory.)
On clusters which do not have well-known labels, you can set your own labels in the CR to achieve rack awareness.
Note that depending on your use case, you might need additional configuration on your Kafka brokers and clients. For example, to use follower-fetching, you must also set replica.selector.class: org.apache.kafka.common.replica.RackAwareReplicaSelector in your KafkaCluster CRD, and set the client.rack option in your client configuration to match the region of your brokers.
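A sketch of what this could look like, assuming the broker option is set through the cluster-wide readOnlyConfig field and the client runs in the (illustrative) zone eu-west-1a:
# KafkaCluster CR (broker side)
readOnlyConfig: |
  replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector
# Client configuration (consumer side)
client.rack=eu-west-1a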
Under the hood
As mentioned earlier, broker.rack
is a read-only broker config, so it is set whenever the broker starts or restarts. The Koperator stores each broker’s configuration in its own dedicated ConfigMap.
Getting label values from nodes and using them to generate a ConfigMap is relatively easy, but to determine where the exact broker/pod is scheduled, the operator has to wait until the pod is actually scheduled to a node. Luckily, Kubernetes schedules pods even when a given ConfigMap is unavailable. However, the corresponding pod will remain in a pending state as long as the ConfigMap is not available to mount. The operator makes use of this pending state to gather all the necessary node labels and initialize a ConfigMap with the fetched data. To take advantage of this, we introduced a status field called RackAwarenessState
in our CRD. The operator populates this status field with two values, WaitingForRackAwareness
and Configured
.
When a broker fails
What happens if a broker fails? Will Kubernetes schedule it to a different zone? When a pod fails, the operator fetches all the available information from the node(s) - including zone and region - and tries to place it back into the zone it was previously in. If it can’t, the pod remains pending
.
To manually override this and schedule the broker into a different zone or region, set the broker.rack
config to the location of the broker node.
15 - Supported versions and compatibility matrix
This page shows you the list of supported Koperator versions, and the versions of other components they are compatible with.
Available Koperator images
Note: Starting from version 0.25.0, Koperator images are published to ghcr.io/adobe/koperator
instead of ghcr.io/banzaicloud/kafka-operator
.
| Image | Go version |
| ----- | ---------- |
| ghcr.io/adobe/koperator/kafka-operator:0.28.0-adobe-20250923 | 1.25 |
Available Apache Kafka images
Note: Starting from version 0.25.0, Kafka images are published to ghcr.io/adobe/kafka
instead of ghcr.io/banzaicloud/kafka
.
| Image | Java version |
| ----- | ------------ |
| ghcr.io/adobe/kafka:2.13-3.9.1 | 21 |
Available JMX Exporter images
| Image | Java version |
| ----- | ------------ |
| ghcr.io/amuraru/jmx-javaagent:0.19.1 | 21 |
Available Cruise Control images
| Image | Java version |
| ----- | ------------ |
| adobe/cruise-control:3.0.3-adbe-20250804 | 21 |
16 - Benchmarking Kafka
How to setup the environment for the Kafka Performance Test.
GKE
-
Create a test cluster with 3 nodes for ZooKeeper, 3 for Kafka, 1 master node, and 2 nodes for clients.
Once your cluster is up and running you can set up the Kubernetes infrastructure.
-
Create a StorageClass which enables high performance disk requests.
kubectl create -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-ssd
provisioner: kubernetes.io/gce-pd
parameters:
type: pd-ssd
volumeBindingMode: WaitForFirstConsumer
EOF
EKS
-
Create a test cluster with 3 nodes for ZooKeeper, 3 for Kafka, 1 master node, and 2 nodes for clients.
Once your cluster is up and running you can set up the Kubernetes infrastructure.
-
Create a StorageClass which enables high performance disk requests.
kubectl create -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
type: io1
iopsPerGB: "50"
fsType: ext4
volumeBindingMode: WaitForFirstConsumer
EOF
Install other required components
-
Create a ZooKeeper cluster with 3 replicas using Pravega’s Zookeeper Operator.
helm install zookeeper-operator \
--repo https://charts.pravega.io zookeeper-operator \
--namespace=zookeeper \
--create-namespace
kubectl create -f - <<EOF
apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
name: zookeeper-server
namespace: zookeeper
spec:
replicas: 3
EOF
-
Install the Koperator CustomResourceDefinition resources (adjust the version number to the Koperator release you want to install) and the corresponding version of Koperator, the Operator for managing Apache Kafka on Kubernetes.
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_cruisecontroloperations.yaml
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkaclusters.yaml
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkatopics.yaml
kubectl apply -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkausers.yaml
helm install kafka-operator \
oci://ghcr.io/adobe/helm-charts/kafka-operator \
--namespace=kafka \
--create-namespace
-
Create a 3-broker Kafka Cluster using this YAML file.
This will install 3 brokers with fast SSD. If you would like the brokers in different zones, modify the following configurations to match your environment and use them in the broker configurations:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
...
spec:
...
brokerConfigGroups:
default:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: <node-label-key>
operator: In
values:
- <node-label-value-zone-1>
- <node-label-value-zone-2>
- <node-label-value-zone-3>
...
-
Create a client container inside the cluster
kubectl create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
name: kafka-test
spec:
containers:
- name: kafka-test
image: "wurstmeister/kafka:2.12-2.1.1"
# Just spin & wait forever
command: [ "/bin/bash", "-c", "--" ]
args: [ "while true; do sleep 3000; done;" ]
EOF
-
Exec into this client and create the perftest, perftest2, perftest3
topics.
For internal listeners exposed by a headless service (KafkaCluster.spec.headlessServiceEnabled
is set to true
):
kubectl exec -it kafka-test -n kafka bash
./opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka-headless.kafka:29092 --topic perftest --create --replication-factor 3 --partitions 3
./opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka-headless.kafka:29092 --topic perftest2 --create --replication-factor 3 --partitions 3
./opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka-headless.kafka:29092 --topic perftest3 --create --replication-factor 3 --partitions 3
For internal listeners exposed by a regular service (KafkaCluster.spec.headlessServiceEnabled
set to false
):
kubectl exec -it kafka-test -n kafka bash
./opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka-all-broker.kafka:29092 --topic perftest --create --replication-factor 3 --partitions 3
./opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka-all-broker.kafka:29092 --topic perftest2 --create --replication-factor 3 --partitions 3
./opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka-all-broker.kafka:29092 --topic perftest3 --create --replication-factor 3 --partitions 3
The monitoring environment is installed automatically. To monitor the infrastructure, we used the official Node Exporter dashboard, available with ID 1860
.
Run the tests
-
Run a performance test against the cluster by building this Docker image.
docker build -t <yourname>/perfload:0.1.0 /loadgens
docker push <yourname>/perfload:0.1.0
-
Submit the performance testing application:
kubectl create -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: loadtest
name: perf-load
namespace: kafka
spec:
progressDeadlineSeconds: 600
replicas: 4
revisionHistoryLimit: 10
selector:
matchLabels:
app: loadtest
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: loadtest
spec:
containers:
- args:
- -brokers=kafka-0:29092,kafka-1:29092,kafka-2:29092
- -topic=perftest
- -required-acks=all
- -message-size=512
- -workers=20
- -api-version=3.1.0
image: yourorg/yourimage:yourtag
imagePullPolicy: Always
name: sangrenel
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
EOF
17 - Delete the operator
In case you want to delete the Koperator from your cluster, note that because of dependencies between the various components, they must be deleted in a specific order.
CAUTION:
It’s important to delete the Koperator deployment as the last step.
Uninstall Koperator
-
Delete the Prometheus instance used by the Kafka cluster. If you used the sample Prometheus instance from the Koperator repository you can use the following command, otherwise do this step manually according to the way you deployed the Prometheus instance.
kubectl delete \
-n kafka \
-f https://raw.githubusercontent.com/adobe/koperator/0.28.0-adobe-20250923/config/samples/kafkacluster-prometheus.yaml
Expected output:
clusterrole.rbac.authorization.k8s.io/prometheus deleted
clusterrolebinding.rbac.authorization.k8s.io/prometheus deleted
prometheus.monitoring.coreos.com/kafka-prometheus deleted
prometheusrule.monitoring.coreos.com/kafka-alerts deleted
serviceaccount/prometheus deleted
servicemonitor.monitoring.coreos.com/cruisecontrol-servicemonitor deleted
servicemonitor.monitoring.coreos.com/kafka-servicemonitor deleted
-
Delete the KafkaCluster custom resource (CR) that represents the Kafka cluster and Cruise Control.
kubectl delete kafkaclusters -n kafka kafka
Example output:
kafkacluster.kafka.banzaicloud.io/kafka deleted
Wait for the Kafka resources (Pods, PersistentVolumeClaims, ConfigMaps, etc.) to be removed.
kubectl get pods -n kafka
Expected output:
NAME READY STATUS RESTARTS AGE
kafka-operator-operator-8458b45587-286f9 2/2 Running 0 62s
You also need to delete other Koperator-managed CRs (if any) in the same fashion.
Note: KafkaCluster, KafkaTopic and KafkaUser custom resources are protected with Kubernetes finalizers, so those won’t be actually deleted from Kubernetes until the Koperator removes those finalizers. After the Koperator has finished cleaning up everything, it removes the finalizers. In case you delete the Koperator deployment before it cleans up everything, you need to remove the finalizers manually.
-
Uninstall Koperator deployment.
helm uninstall kafka-operator -n kafka
Expected output:
release "kafka-operator" uninstalled
-
Delete Koperator Custom Resource Definitions (CRDs).
kubectl delete -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_cruisecontroloperations.yaml
kubectl delete -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkaclusters.yaml
kubectl delete -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkatopics.yaml
kubectl delete -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkausers.yaml
Uninstall Prometheus operator
-
Uninstall the prometheus-operator deployment.
helm uninstall -n prometheus prometheus
Expected output:
release "prometheus" uninstalled
-
If no other cluster resource uses the prometheus-operator CRDs, delete the prometheus-operator’s CRDs.
Note: Red Hat OpenShift clusters require those CRDs to function, so do not delete them on such clusters.
kubectl get crd | grep 'monitoring.coreos.com'| awk '{print $1};' | xargs kubectl delete crd
Uninstall Zookeeper Operator
-
Delete Zookeeper CR.
kubectl delete zookeeperclusters -n zookeeper zookeeper-server
Expected output:
zookeeperclusters.zookeeper.pravega.io/zookeeper-server deleted
Wait for the Zookeeper resources (Deployment, PersistentVolumeClaims, ConfigMaps, etc.) to be removed.
kubectl get pods -n zookeeper
Expected output:
NAME READY STATUS RESTARTS AGE
zookeeper-operator-5857967dcc-gm5l5 1/1 Running 0 3m22s
-
Uninstall the zookeeper-operator deployment.
helm uninstall zookeeper-operator -n zookeeper
-
If no other cluster resource uses Zookeeper CRDs, delete the Zookeeper Operator’s CRDs.
kubectl delete customresourcedefinition zookeeperclusters.zookeeper.pravega.io
Uninstall Cert-Manager
Uninstall with Helm
-
Uninstall cert-manager deployment.
helm uninstall -n cert-manager cert-manager
Expected output:
release "cert-manager" uninstalled
-
If no other cluster resource uses cert-manager CRDs, delete cert-manager’s CRDs:
kubectl delete -f https://github.com/jetstack/cert-manager/releases/download/v1.11.0/cert-manager.crds.yaml
Expected output:
customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io deleted
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io deleted
customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io deleted
customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io deleted
customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io deleted
customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io deleted
18 - Tips and tricks for the Koperator
Rebalancing
The Koperator installs Cruise Control (CC) to oversee your Kafka cluster. When you change the cluster (for example, add new nodes), the Koperator engages CC to perform a rebalancing if needed. How and when CC performs rebalancing depends on its settings (see goal settings in the official CC documentation) and on how long CC was trained with Kafka’s behavior (this may take weeks).
You can also trigger rebalancing manually from the CC UI:
kubectl port-forward -n kafka svc/kafka-cruisecontrol-svc 8090:8090
Cruise Control UI will be available at http://localhost:8090.
Headless service
When the headlessServiceEnabled option is enabled (true) in your KafkaCluster CR, it creates a headless service for accessing the Kafka cluster from within the Kubernetes cluster.
When the headlessServiceEnabled option is disabled (false), it creates a ClusterIP service. When using a ClusterIP service, your client application doesn’t need to be aware of every Kafka broker endpoint, it simply connects to kafka-all-broker:29092 which covers dynamically all the available brokers. That way if the Kafka cluster is scaled dynamically, there is no need to reconfigure the client applications.
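For example, a client running inside the cluster can simply point its bootstrap configuration at this service (assuming the cluster runs in the kafka namespace):
bootstrap.servers=kafka-all-broker.kafka.svc.cluster.local:29092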
Retrieving broker configuration during downscale operation
When a broker is removed during a downscale operation, its configuration is no longer present in the kafkaCluster/spec/brokers field. You can retrieve the last known broker configuration with the following command.
echo <value of the kafkaCluster/status/brokerState/brokerID/configurationBackup> | base64 -d | gzip -d
19 - Troubleshooting the operator
The following tips and commands can help you to troubleshoot your Koperator installation.
First things to do
-
Verify that the Koperator pod is running. Issue the following command: kubectl get pods -n kafka|grep kafka-operator
The output should include a running pod, for example:
NAME READY STATUS RESTARTS AGE
kafka-operator-operator-6968c67c7b-9d2xq 2/2 Running 0 10m
-
Verify that the Kafka broker pods are running. Issue the following command: kubectl get pods -n kafka
The output should include a numbered running pod for each broker, with names like kafka-0-zcxk7, kafka-1-2nhj5, and so on, for example:
NAME READY STATUS RESTARTS AGE
kafka-0-zcxk7 1/1 Running 0 3h16m
kafka-1-2nhj5 1/1 Running 0 3h15m
kafka-2-z4t84 1/1 Running 0 3h15m
kafka-cruisecontrol-7f77ccf997-cqhsw 1/1 Running 1 3h15m
kafka-operator-operator-6968c67c7b-9d2xq 2/2 Running 0 3h17m
prometheus-kafka-prometheus-0 2/2 Running 1 3h16m
-
If you see any problems, check the logs of the affected pod, for example:
kubectl logs kafka-0-zcxk7 -n kafka
-
Check the status (State) of your resources. For example:
kubectl get KafkaCluster kafka -n kafka -o jsonpath="{.status}" |jq
-
Check the status of your cluster coordination system:
For ZooKeeper-based clusters: Check the status of your ZooKeeper deployment, and the logs of the zookeeper-operator and zookeeper pods.
kubectl get pods -n zookeeper
For KRaft-based clusters: Check the status of your controller nodes (they will have names like kafka-controller-3-xxx):
kubectl get pods -n kafka | grep controller
Check the KafkaCluster configuration
You can display the current configuration of your Kafka cluster using the following command:
kubectl describe KafkaCluster kafka -n kafka
The output looks like the following:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
creationTimestamp: "2022-11-21T16:02:55Z"
finalizers:
- finalizer.kafkaclusters.kafka.banzaicloud.io
- topics.kafkaclusters.kafka.banzaicloud.io
- users.kafkaclusters.kafka.banzaicloud.io
generation: 4
labels:
controller-tools.k8s.io: "1.0"
name: kafka
namespace: kafka
resourceVersion: "3474369"
uid: f8744017-1264-47d4-8b9c-9ee982728ecc
spec:
brokerConfigGroups:
default:
storageConfigs:
- mountPath: /kafka-logs
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
terminationGracePeriodSeconds: 120
brokers:
- brokerConfigGroup: default
id: 0
- brokerConfigGroup: default
id: 1
clusterImage: ghcr.io/adobe/kafka:2.13-3.9.1
cruiseControlConfig:
clusterConfig: |
{
"min.insync.replicas": 3
}
config: |
...
cruiseControlTaskSpec:
RetryDurationMinutes: 0
disruptionBudget: {}
envoyConfig: {}
headlessServiceEnabled: true
istioIngressConfig: {}
listenersConfig:
externalListeners:
- containerPort: 9094
externalStartingPort: 19090
name: external
type: plaintext
internalListeners:
- containerPort: 29092
name: plaintext
type: plaintext
usedForInnerBrokerCommunication: true
- containerPort: 29093
name: controller
type: plaintext
usedForControllerCommunication: true
usedForInnerBrokerCommunication: false
monitoringConfig: {}
oneBrokerPerNode: false
readOnlyConfig: |
auto.create.topics.enable=false
cruise.control.metrics.topic.auto.create=true
cruise.control.metrics.topic.num.partitions=1
cruise.control.metrics.topic.replication.factor=2
rollingUpgradeConfig:
failureThreshold: 1
zkAddresses:
- zookeeper-server-client.zookeeper:2181
status:
alertCount: 0
brokersState:
"0":
configurationBackup: H4sIAAAAAAAA/6pWykxRsjLQUUoqys9OLXLOz0vLTHcvyi8tULJSSklNSyzNKVGqBQQAAP//D49kqiYAAAA=
configurationState: ConfigInSync
gracefulActionState:
cruiseControlState: GracefulUpscaleSucceeded
volumeStates:
/kafka-logs:
cruiseControlOperationReference:
name: kafka-rebalance-bhs7n
cruiseControlVolumeState: GracefulDiskRebalanceSucceeded
image: ghcr.io/adobe/kafka:2.13-3.9.1
perBrokerConfigurationState: PerBrokerConfigInSync
rackAwarenessState: ""
version: 3.1.0
"1":
configurationBackup: H4sIAAAAAAAA/6pWykxRsjLUUUoqys9OLXLOz0vLTHcvyi8tULJSSklNSyzNKVGqBQQAAP//pYq+WyYAAAA=
configurationState: ConfigInSync
gracefulActionState:
cruiseControlState: GracefulUpscaleSucceeded
volumeStates:
/kafka-logs:
cruiseControlOperationReference:
name: kafka-rebalance-bhs7n
cruiseControlVolumeState: GracefulDiskRebalanceSucceeded
image: ghcr.io/adobe/kafka:2.13-3.9.1
perBrokerConfigurationState: PerBrokerConfigInSync
rackAwarenessState: ""
version: 3.1.0
cruiseControlTopicStatus: CruiseControlTopicReady
listenerStatuses:
externalListeners:
external:
- address: a0abb7ab2e4a142d793f0ec0cb9b58ae-1185784192.eu-north-1.elb.amazonaws.com:29092
name: any-broker
- address: a0abb7ab2e4a142d793f0ec0cb9b58ae-1185784192.eu-north-1.elb.amazonaws.com:19090
name: broker-0
- address: a0abb7ab2e4a142d793f0ec0cb9b58ae-1185784192.eu-north-1.elb.amazonaws.com:19091
name: broker-1
internalListeners:
plaintext:
- address: kafka-headless.kafka.svc.cluster.local:29092
name: headless
- address: kafka-0.kafka-headless.kafka.svc.cluster.local:29092
name: broker-0
- address: kafka-1.kafka-headless.kafka.svc.cluster.local:29092
name: broker-1
rollingUpgradeStatus:
errorCount: 0
lastSuccess: ""
state: ClusterRunning
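If you only need the overall cluster state rather than the full resource, you can query the status field directly (a small convenience query based on the output above):
kubectl get kafkacluster kafka -n kafka -o jsonpath='{.status.state}'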
Getting Support
If you encounter any problems that the documentation does not address, file an issue.
Various support channels are also available for Koperator.
Before asking for help, prepare the following information to make troubleshooting faster (a small collection script follows the list):
- Koperator version
- Kubernetes version (kubectl version)
- Helm/chart version (if you installed the Koperator with Helm)
- Koperator logs, for example kubectl logs kafka-operator-operator-6968c67c7b-9d2xq manager -n kafka and kubectl logs kafka-operator-operator-6968c67c7b-9d2xq kube-rbac-proxy -n kafka
- Kafka broker logs
- Koperator configuration
- Kafka cluster configuration (kubectl describe KafkaCluster kafka -n kafka)
- For ZooKeeper-based clusters:
- ZooKeeper configuration (kubectl describe ZookeeperCluster zookeeper-server -n zookeeper)
- ZooKeeper logs (kubectl logs zookeeper-operator-5c9b597bcc-vkdz9 -n zookeeper)
- For KRaft-based clusters:
- Controller node logs (kubectl logs kafka-controller-3-xxx -n kafka)
Do not forget to remove any sensitive information (for example, passwords and private keys) before sharing.
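If you prefer to gather most of this in one step, the following sketch collects it into a local directory. It assumes the kafka namespace and the operator label selector used elsewhere in this documentation, so adjust names to match your environment:
#!/bin/bash
# Sketch: collect Koperator support information into ./koperator-support
mkdir -p koperator-support
kubectl version > koperator-support/kubectl-version.txt
helm list -n kafka > koperator-support/helm-releases.txt
kubectl logs -l app.kubernetes.io/instance=kafka-operator -c manager -n kafka --tail=1000 > koperator-support/koperator-manager.log
kubectl describe kafkacluster kafka -n kafka > koperator-support/kafkacluster.txt
# Review and redact passwords, keys, and other sensitive data before sharing these files.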
19.1 - Common errors
Upgrade failed
If you get the following error in the logs of the Koperator, update your KafkaCluster CRD. This error typically occurs when you upgrade your Koperator to a new version, but forget to update the KafkaCluster CRD.
Error: UPGRADE FAILED: cannot patch "kafka" with kind KafkaCluster: KafkaCluster.kafka.banzaicloud.io "kafka" is invalid
The recommended way to upgrade the Koperator is to upgrade the KafkaCluster CRD, then update the Koperator. For details, see Upgrade the operator.
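For example, assuming the CRD manifests and the OCI chart used later in this documentation, the order of operations looks roughly like this (substitute the version you are upgrading to, and repeat the apply step for the other Koperator CRDs):
# Apply the updated CRDs first
kubectl apply --server-side -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkaclusters.yaml
# ...then upgrade the operator release
helm upgrade kafka-operator oci://ghcr.io/adobe/helm-charts/kafka-operator \
--namespace=kafka \
--version <new-version>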
20 - Kafka on Kubernetes - The Hard Way
Inspired by Kelsey Hightower’s kubernetes-the-hard-way, this comprehensive tutorial walks you through setting up a complete Kafka environment on Kubernetes using the Koperator from scratch.
What You’ll Learn
This tutorial will teach you how to:
- Set up a multi-node Kubernetes cluster using kind
- Install and configure all required dependencies manually
- Deploy a production-ready Kafka cluster with monitoring
- Test and validate your Kafka deployment
- Handle disaster recovery scenarios
- Troubleshoot common issues
Why “The Hard Way”?
This tutorial is called “the hard way” because it walks through each step manually rather than using automated scripts or simplified configurations. This approach helps you understand:
- How each component works and interacts with others
- The dependencies and relationships between services
- How to troubleshoot when things go wrong
- The complete architecture of a Kafka deployment on Kubernetes
Prerequisites
Before starting this tutorial, you should have:
- Basic knowledge of Kubernetes concepts (pods, services, deployments)
- Familiarity with Apache Kafka fundamentals
- A local development machine with Docker installed
- At least 8GB of RAM and 4 CPU cores available for the kind cluster
Tutorial Structure
This tutorial is organized into the following sections:
- Prerequisites and Setup - Install required tools and prepare your environment
- Kubernetes Cluster Setup - Create a multi-node kind cluster with proper labeling
- Dependencies Installation - Install cert-manager, ZooKeeper operator, and Prometheus operator
- Koperator Installation - Install the Kafka operator and its CRDs
- Kafka Cluster Deployment - Deploy and configure a Kafka cluster with monitoring
- Testing and Validation - Create topics, run producers/consumers, and performance tests
- Disaster Recovery Scenarios - Test failure scenarios and recovery procedures
- Troubleshooting - Common issues and debugging techniques
Architecture Overview
By the end of this tutorial, you’ll have deployed the following architecture:
┌─────────────────────────────────────────────────────────────────┐
│ Kubernetes Cluster (kind) │
├─────────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Control Plane │ │ Worker AZ1 │ │ Worker AZ2 │ │
│ │ │ │ │ │ │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Worker AZ3 │ │ Worker AZ1 │ │ Worker AZ2 │ │
│ │ │ │ │ │ │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
├─────────────────────────────────────────────────────────────────┤
│ Applications │
├─────────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Kafka Cluster │ │ ZooKeeper │ │ Monitoring │ │
│ │ (6 brokers) │ │ (3 nodes) │ │ Stack │ │
│ │ │ │ │ │ │ │
│ │ ┌─────────────┐│ │ ┌─────────────┐│ │ ┌─────────────┐│ │
│ │ │ Broker 101 ││ │ │ ZK-0 ││ │ │ Prometheus ││ │
│ │ │ Broker 102 ││ │ │ ZK-1 ││ │ │ Grafana ││ │
│ │ │ Broker 201 ││ │ │ ZK-2 ││ │ │ AlertMgr ││ │
│ │ │ Broker 202 ││ │ └─────────────┘│ │ └─────────────┘│ │
│ │ │ Broker 301 ││ └─────────────────┘ └─────────────────┘ │
│ │ │ Broker 302 ││ │
│ │ └─────────────┘│ │
│ └─────────────────┘ │
├─────────────────────────────────────────────────────────────────┤
│ Infrastructure │
├─────────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ cert-manager │ │ Koperator │ │ Cruise │ │
│ │ │ │ │ │ Control │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Key Features Demonstrated
This tutorial demonstrates:
- Multi-AZ deployment with rack awareness
- SSL/TLS encryption for secure communication
- Monitoring and alerting with Prometheus and Grafana
- Automatic scaling with Cruise Control
- Persistent storage with proper volume management
- External access configuration
- Disaster recovery and failure handling
Time Commitment
Plan to spend approximately 2-3 hours completing this tutorial, depending on your familiarity with the tools and concepts involved.
Getting Started
Ready to begin? Start with the Prerequisites and Setup section.
Note: This tutorial is designed for learning and development purposes. For production deployments, consider using automated deployment tools and following your organization’s security and operational guidelines.
20.1 - Prerequisites and Setup
Prerequisites and Setup
Before starting this tutorial, you need to install several tools and prepare your development environment. This section will guide you through setting up everything required for the Kafka on Kubernetes deployment.
System Requirements
Hardware Requirements
- CPU: Minimum 4 cores, recommended 8+ cores
- Memory: Minimum 8GB RAM, recommended 16GB+ RAM
- Storage: At least 20GB free disk space
- Network: Stable internet connection for downloading container images
Operating System Support
This tutorial has been tested on:
- macOS: 10.15+ (Catalina and newer)
- Linux: Ubuntu 18.04+, CentOS 7+, RHEL 7+
- Windows: Windows 10+ with WSL2
1. Docker
Docker is required to run the kind Kubernetes cluster.
macOS (using Homebrew)
brew install --cask docker
Linux (Ubuntu/Debian)
# Update package index
sudo apt-get update
# Install required packages
sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
gnupg \
lsb-release
# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
# Set up the stable repository
echo \
"deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker Engine
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
# Add your user to the docker group
sudo usermod -aG docker $USER
Windows
Download and install Docker Desktop from https://www.docker.com/products/docker-desktop
Verify Docker installation:
docker --version
docker run hello-world
2. kubectl
kubectl is the Kubernetes command-line tool.
macOS (using Homebrew)
brew install kubectl
Linux
# Download the latest release
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
# Make it executable
chmod +x kubectl
# Move to PATH
sudo mv kubectl /usr/local/bin/
Windows (using Chocolatey)
choco install kubernetes-cli
Verify kubectl installation:
kubectl version --client
3. kind (Kubernetes in Docker)
kind is a tool for running local Kubernetes clusters using Docker containers.
# For Linux
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.30.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
# For macOS
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.30.0/kind-darwin-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
# For Windows (in PowerShell)
curl.exe -Lo kind-windows-amd64.exe https://kind.sigs.k8s.io/dl/v0.30.0/kind-windows-amd64
Move-Item .\kind-windows-amd64.exe c:\some-dir-in-your-PATH\kind.exe
macOS (using Homebrew)
brew install kind
Verify kind installation:
kind version
4. Helm
Helm is the package manager for Kubernetes.
macOS (using Homebrew)
brew install helm
Linux
# Download and install
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Windows (using Chocolatey)
choco install kubernetes-helm
Verify Helm installation:
helm version
5. Git
Git is required to clone configuration files and examples.
macOS (using Homebrew)
brew install git
Linux (Ubuntu/Debian)
sudo apt-get install -y git
Windows
Download and install from https://git-scm.com/download/win
Verify Git installation:
git --version
Environment Setup
1. Create Working Directory
Create a dedicated directory for this tutorial:
mkdir -p ~/kafka-k8s-tutorial
cd ~/kafka-k8s-tutorial
2. Set Environment Variables
Set up some useful environment variables:
# Export variables for the session
export TUTORIAL_DIR=~/kafka-k8s-tutorial
export KAFKA_NAMESPACE=kafka
export ZOOKEEPER_NAMESPACE=zookeeper
export MONITORING_NAMESPACE=default
# Make them persistent (add to your shell profile)
echo "export TUTORIAL_DIR=~/kafka-k8s-tutorial" >> ~/.bashrc
echo "export KAFKA_NAMESPACE=kafka" >> ~/.bashrc
echo "export ZOOKEEPER_NAMESPACE=zookeeper" >> ~/.bashrc
echo "export MONITORING_NAMESPACE=default" >> ~/.bashrc
# Reload your shell or source the file
source ~/.bashrc
3. Verify Docker Resources
Ensure Docker has sufficient resources allocated:
# Check Docker system info
docker system info
# Check available resources
docker system df
Recommended Docker Desktop settings:
- Memory: 8GB minimum, 12GB+ recommended
- CPUs: 4 minimum, 6+ recommended
- Disk: 20GB minimum
4. Download Tutorial Resources
Clone the reference repository for configuration files:
cd $TUTORIAL_DIR
git clone https://github.com/amuraru/k8s-kafka-the-hard-way.git
cd k8s-kafka-the-hard-way
Verification Checklist
Before proceeding to the next section, verify that all tools are properly installed:
# Check Docker
echo "Docker version:"
docker --version
echo ""
# Check kubectl
echo "kubectl version:"
kubectl version --client
echo ""
# Check kind
echo "kind version:"
kind version
echo ""
# Check Helm
echo "Helm version:"
helm version
echo ""
# Check Git
echo "Git version:"
git --version
echo ""
# Check working directory
echo "Working directory:"
ls -la $TUTORIAL_DIR
Troubleshooting Common Issues
Docker Permission Issues (Linux)
If you get permission denied errors with Docker:
# Add your user to the docker group
sudo usermod -aG docker $USER
# Log out and log back in, or run:
newgrp docker
kubectl Not Found
If kubectl is not found in your PATH:
# Check if kubectl is in your PATH
which kubectl
# If not found, ensure /usr/local/bin is in your PATH
echo $PATH
# Add to PATH if needed
export PATH=$PATH:/usr/local/bin
kind Cluster Creation Issues
If you encounter issues creating kind clusters:
# Check Docker is running
docker ps
# Check available disk space
df -h
# Check Docker resources in Docker Desktop settings
Next Steps
Once you have all prerequisites installed and verified, proceed to the Kubernetes Cluster Setup section to create your kind cluster.
Tip: Keep this terminal session open throughout the tutorial, as the environment variables and working directory will be used in subsequent steps.
20.2 - Kubernetes Cluster Setup
Kubernetes Cluster Setup
In this section, you’ll create a multi-node Kubernetes cluster using kind (Kubernetes in Docker) that simulates a production-like environment with multiple availability zones.
Cluster Architecture
We’ll create a 7-node cluster with the following configuration:
- 1 Control Plane node: Manages the Kubernetes API and cluster state
- 6 Worker nodes: Distributed across 3 simulated availability zones (2 nodes per AZ)
This setup allows us to demonstrate:
- Multi-AZ deployment patterns
- Rack awareness for Kafka brokers
- High availability configurations
- Realistic failure scenarios
Create Cluster Configuration
First, create the kind cluster configuration file:
cd $TUTORIAL_DIR
# Create kind configuration directory
mkdir -p ~/.kind
# Create the cluster configuration
cat > ~/.kind/kind-config.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
kubeadmConfigPatches:
- |
kind: InitConfiguration
nodeRegistration:
kubeletExtraArgs:
node-labels: "ingress-ready=true"
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
containerdConfigPatches:
- |-
[plugins."io.containerd.grpc.v1.cri".containerd]
snapshotter = "overlayfs"
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5000"]
endpoint = ["http://localhost:5000"]
EOF
Create the Kubernetes Cluster
Now create the kind cluster:
# Create the cluster (this may take 5-10 minutes)
kind create cluster \
--name kafka \
--config ~/.kind/kind-config.yaml \
--image kindest/node:v1.33.4
# Wait for cluster to be ready
echo "Waiting for cluster to be ready..."
kubectl wait --for=condition=Ready nodes --all --timeout=300s
Expected output:
Creating cluster "kafka" ...
✓ Ensuring node image (kindest/node:v1.33.4) 🖼
✓ Preparing nodes 📦 📦 📦 📦 📦 📦 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✓ Joining worker nodes 🚜
Set kubectl context to "kind-kafka"
You can now use your cluster with:
kubectl cluster-info --context kind-kafka
Verify Cluster Creation
Verify that all nodes are running and ready:
# Check cluster info
kubectl cluster-info --context kind-kafka
# List all nodes
kubectl get nodes -o wide
# Check node status
kubectl get nodes --show-labels
Expected output:
NAME STATUS ROLES AGE VERSION
kafka-control-plane Ready control-plane 2m v1.33.4
kafka-worker Ready <none> 2m v1.33.4
kafka-worker2 Ready <none> 2m v1.33.4
kafka-worker3 Ready <none> 2m v1.33.4
kafka-worker4 Ready <none> 2m v1.33.4
kafka-worker5 Ready <none> 2m v1.33.4
kafka-worker6 Ready <none> 2m v1.33.4
Label Nodes for Multi-AZ Simulation
To simulate a multi-availability zone environment, we’ll label the nodes with region and zone information:
1. Label Nodes with Region
First, label all worker nodes with the same region:
# Label all worker nodes with region
kubectl label nodes \
kafka-worker \
kafka-worker2 \
kafka-worker3 \
kafka-worker4 \
kafka-worker5 \
kafka-worker6 \
topology.kubernetes.io/region=region1
2. Label Nodes with Availability Zones
Now distribute the worker nodes across three availability zones:
# AZ1: kafka-worker and kafka-worker2
kubectl label nodes kafka-worker kafka-worker2 \
topology.kubernetes.io/zone=az1
# AZ2: kafka-worker3 and kafka-worker4
kubectl label nodes kafka-worker3 kafka-worker4 \
topology.kubernetes.io/zone=az2
# AZ3: kafka-worker5 and kafka-worker6
kubectl label nodes kafka-worker5 kafka-worker6 \
topology.kubernetes.io/zone=az3
3. Verify Zone Configuration
Check that the zone labels are correctly applied:
# Display nodes with region and zone labels
kubectl get nodes \
--label-columns=topology.kubernetes.io/region,topology.kubernetes.io/zone
# Show detailed node information
kubectl describe nodes | grep -E "Name:|topology.kubernetes.io"
Expected output:
NAME STATUS ROLES AGE VERSION REGION ZONE
kafka-control-plane Ready control-plane 5m v1.33.4 <none> <none>
kafka-worker Ready <none> 5m v1.33.4 region1 az1
kafka-worker2 Ready <none> 5m v1.33.4 region1 az1
kafka-worker3 Ready <none> 5m v1.33.4 region1 az2
kafka-worker4 Ready <none> 5m v1.33.4 region1 az2
kafka-worker5 Ready <none> 5m v1.33.4 region1 az3
kafka-worker6 Ready <none> 5m v1.33.4 region1 az3
Configure kubectl Context
Ensure you’re using the correct kubectl context:
# Set the current context to the kind cluster
kubectl config use-context kind-kafka
# Verify current context
kubectl config current-context
# Test cluster access
kubectl get namespaces
Cluster Resource Verification
Check the cluster’s available resources:
# Check node resources
kubectl top nodes 2>/dev/null || echo "Metrics server not yet available"
# Check cluster capacity
kubectl describe nodes | grep -A 5 "Capacity:"
# Check storage classes
kubectl get storageclass
# Check default namespace
kubectl get all
Understanding the Cluster Layout
Your cluster now has the following topology:
┌─────────────────────────────────────────────────────────────────┐
│ kind-kafka cluster │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ │
│ │ Control Plane │ │
│ │ kafka-control- │ │
│ │ plane │ │
│ └─────────────────┘ │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ AZ1 │ │ AZ2 │ │ AZ3 │ │
│ │ │ │ │ │ │ │
│ │ kafka-worker │ │ kafka-worker3 │ │ kafka-worker5 │ │
│ │ kafka-worker2 │ │ kafka-worker4 │ │ kafka-worker6 │ │
│ │ │ │ │ │ │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Troubleshooting
Cluster Creation Issues
If cluster creation fails:
# Delete the failed cluster
kind delete cluster --name kafka
# Check Docker resources
docker system df
docker system prune -f
# Retry cluster creation
kind create cluster --name kafka --config ~/.kind/kind-config.yaml --image kindest/node:v1.33.4
Node Not Ready
If nodes are not ready:
# Check node status
kubectl describe nodes
# Check system pods
kubectl get pods -n kube-system
# Check kubelet logs (from Docker)
docker logs kafka-worker
Context Issues
If kubectl context is not set correctly:
# List available contexts
kubectl config get-contexts
# Set the correct context
kubectl config use-context kind-kafka
# Verify
kubectl config current-context
Cluster Cleanup (Optional)
If you need to start over:
# Delete the cluster
kind delete cluster --name kafka
# Verify deletion
kind get clusters
# Remove configuration
rm ~/.kind/kind-config.yaml
Next Steps
With your Kubernetes cluster ready and properly configured with multi-AZ simulation, you can now proceed to install the required dependencies. Continue to the Dependencies Installation section.
Note: The cluster will persist until you explicitly delete it with kind delete cluster --name kafka. You can stop and start Docker without losing your cluster state.
20.3 - Dependencies Installation
Dependencies Installation
Before installing the Koperator, we need to set up several dependencies that are required for a complete Kafka deployment. This section covers the installation of cert-manager, ZooKeeper operator, and Prometheus operator.
Overview
The dependencies we’ll install are:
- cert-manager: Manages TLS certificates for secure communication
- ZooKeeper Operator: Manages ZooKeeper clusters (required for traditional Kafka deployments)
- Prometheus Operator: Provides monitoring and alerting capabilities
1. Install cert-manager
cert-manager is essential for TLS certificate management in Kafka deployments.
Install cert-manager CRDs
First, install the Custom Resource Definitions:
# Install cert-manager CRDs
kubectl create --validate=false -f https://github.com/cert-manager/cert-manager/releases/download/v1.18.2/cert-manager.crds.yaml
Expected output:
customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created
Create cert-manager Namespace
# Create namespace for cert-manager
kubectl create namespace cert-manager
Install cert-manager using Helm
# Add cert-manager Helm repository
helm repo add cert-manager https://charts.jetstack.io
helm repo update
# Install cert-manager
helm install cert-manager cert-manager/cert-manager \
--namespace cert-manager \
--version v1.18.2
# Wait for cert-manager to be ready
kubectl wait --for=condition=Available deployment --all -n cert-manager --timeout=300s
Verify cert-manager Installation
# Check cert-manager pods
kubectl get pods -n cert-manager
# Check cert-manager services
kubectl get svc -n cert-manager
# Verify cert-manager is working
kubectl get certificates -A
Expected output:
NAME READY STATUS RESTARTS AGE
cert-manager-cainjector-7d55bf8f78-xyz123 1/1 Running 0 2m
cert-manager-webhook-97f8b47bc-abc456 1/1 Running 0 2m
cert-manager-7dd5854bb4-def789 1/1 Running 0 2m
2. Install ZooKeeper Operator
The ZooKeeper operator manages ZooKeeper clusters required by Kafka.
Create ZooKeeper Namespace
# Create namespace for ZooKeeper
kubectl create namespace zookeeper
Install ZooKeeper Operator CRDs
# Install ZooKeeper CRDs
kubectl create -f https://raw.githubusercontent.com/adobe/zookeeper-operator/master/config/crd/bases/zookeeper.pravega.io_zookeeperclusters.yaml
Clone ZooKeeper Operator Repository
# Clone the ZooKeeper operator repository
cd $TUTORIAL_DIR
rm -rf /tmp/zookeeper-operator
git clone --single-branch --branch master https://github.com/adobe/zookeeper-operator /tmp/zookeeper-operator
cd /tmp/zookeeper-operator
Install ZooKeeper Operator using Helm
# Install ZooKeeper operator
helm template zookeeper-operator \
--namespace=zookeeper \
--set crd.create=false \
--set image.repository='adobe/zookeeper-operator' \
--set image.tag='0.2.15-adobe-20250914' \
./charts/zookeeper-operator | kubectl create -n zookeeper -f -
# Wait for operator to be ready
kubectl wait --for=condition=Available deployment --all -n zookeeper --timeout=300s
Deploy ZooKeeper Cluster
Create a 3-node ZooKeeper cluster:
# Create ZooKeeper cluster
kubectl create --namespace zookeeper -f - <<EOF
apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
name: zk
namespace: zookeeper
spec:
replicas: 3
image:
repository: adobe/zookeeper
tag: 3.8.4-0.2.15-adobe-20250914
pullPolicy: IfNotPresent
config:
initLimit: 10
tickTime: 2000
syncLimit: 5
probes:
livenessProbe:
initialDelaySeconds: 41
persistence:
reclaimPolicy: Delete
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
EOF
Verify ZooKeeper Installation
# Check ZooKeeper cluster status
kubectl get zookeepercluster -n zookeeper -o wide
# Watch ZooKeeper cluster creation
kubectl get pods -n zookeeper -w
# Press Ctrl+C to stop watching when all pods are running
# Check ZooKeeper services
kubectl get svc -n zookeeper
# Verify ZooKeeper cluster is ready
kubectl wait --for=condition=Ready pod --all -n zookeeper --timeout=600s
Expected output:
NAME REPLICAS READY REPLICAS VERSION DESIRED VERSION INTERNAL ENDPOINT EXTERNAL ENDPOINT AGE
zk 3 3 3.8.4-0.2.15-adobe-20250914 3.8.4-0.2.15-adobe-20250914 zk-client:2181 5m
3. Install Prometheus Operator
The Prometheus operator provides comprehensive monitoring for Kafka and ZooKeeper.
Install Prometheus Operator CRDs
# Install Prometheus Operator CRDs
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagers.yaml
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagerconfigs.yaml
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/example/prometheus-operator-crd/monitoring.coreos.com_prometheuses.yaml
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/example/prometheus-operator-crd/monitoring.coreos.com_prometheusrules.yaml
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/example/prometheus-operator-crd/monitoring.coreos.com_podmonitors.yaml
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/example/prometheus-operator-crd/monitoring.coreos.com_thanosrulers.yaml
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/example/prometheus-operator-crd/monitoring.coreos.com_probes.yaml
Install Prometheus Stack using Helm
# Add Prometheus community Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Install kube-prometheus-stack (includes Prometheus, Grafana, and AlertManager)
helm install monitoring \
--namespace=default \
prometheus-community/kube-prometheus-stack \
--set prometheusOperator.createCustomResource=false
# Wait for monitoring stack to be ready
kubectl wait --for=condition=Available deployment --all -n default --timeout=600s
Verify Prometheus Installation
# Check monitoring pods
kubectl get pods -l release=monitoring
# Check monitoring services
kubectl get svc -l release=monitoring
# Check Prometheus targets (optional)
kubectl get prometheus -o wide
Expected output:
NAME READY STATUS RESTARTS AGE
monitoring-kube-prometheus-prometheus-node-exporter-* 1/1 Running 0 3m
monitoring-kube-state-metrics-* 1/1 Running 0 3m
monitoring-prometheus-operator-* 1/1 Running 0 3m
monitoring-grafana-* 3/3 Running 0 3m
Access Monitoring Dashboards
Get Grafana Admin Password
# Get Grafana admin password
kubectl get secret --namespace default monitoring-grafana \
-o jsonpath="{.data.admin-password}" | base64 --decode
echo ""
Set Up Port Forwarding (Optional)
You can access the monitoring dashboards using port forwarding:
# Prometheus (in a separate terminal)
kubectl --namespace default port-forward svc/monitoring-kube-prometheus-prometheus 9090 &
# Grafana (in a separate terminal)
kubectl --namespace default port-forward svc/monitoring-grafana 3000:80 &
# AlertManager (in a separate terminal)
kubectl --namespace default port-forward svc/monitoring-kube-prometheus-alertmanager 9093 &
Access the dashboards at:
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (admin/[password from above])
- AlertManager: http://localhost:9093
Verification Summary
Verify all dependencies are properly installed:
echo "=== cert-manager ==="
kubectl get pods -n cert-manager
echo -e "\n=== ZooKeeper ==="
kubectl get pods -n zookeeper
kubectl get zookeepercluster -n zookeeper
echo -e "\n=== Monitoring ==="
kubectl get pods -l release=monitoring
echo -e "\n=== All Namespaces ==="
kubectl get namespaces
Troubleshooting
cert-manager Issues
# Check cert-manager logs
kubectl logs -n cert-manager deployment/cert-manager
# Check webhook connectivity
kubectl get validatingwebhookconfigurations
ZooKeeper Issues
# Check ZooKeeper operator logs
kubectl logs -n zookeeper deployment/zookeeper-operator
# Check ZooKeeper cluster events
kubectl describe zookeepercluster zk -n zookeeper
Prometheus Issues
# Check Prometheus operator logs
kubectl logs -l app.kubernetes.io/name=prometheus-operator
# Check Prometheus configuration
kubectl get prometheus -o yaml
Next Steps
With all dependencies successfully installed, you can now proceed to install the Koperator itself. Continue to the Koperator Installation section.
Note: The monitoring stack will start collecting metrics immediately. You can explore the Grafana dashboards to see cluster metrics even before deploying Kafka.
20.4 - Koperator Installation
Koperator Installation
In this section, you’ll install the Koperator (formerly BanzaiCloud Kafka Operator), which will manage your Kafka clusters on Kubernetes. The installation process involves installing Custom Resource Definitions (CRDs) and the operator itself.
Overview
The Koperator installation consists of:
- Creating the Kafka namespace
- Installing Koperator CRDs (Custom Resource Definitions)
- Installing the Koperator using Helm
- Verifying the installation
1. Create Kafka Namespace
First, create a dedicated namespace for Kafka resources:
# Create namespace for Kafka
kubectl create namespace kafka
# Verify namespace creation
kubectl get namespaces | grep kafka
Expected output:
kafka Active 10s
2. Install Koperator CRDs
The Koperator requires several Custom Resource Definitions to manage Kafka clusters, topics, and users.
Install Required CRDs
# Install KafkaCluster CRD
kubectl apply --server-side -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkaclusters.yaml
# Install KafkaTopic CRD
kubectl apply --server-side -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkatopics.yaml
# Install KafkaUser CRD
kubectl apply --server-side -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkausers.yaml
# Install CruiseControlOperation CRD
kubectl apply --server-side -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_cruisecontroloperations.yaml
Verify CRD Installation
# Check that all CRDs are installed
kubectl get crd | grep kafka.banzaicloud.io
# Get detailed information about the CRDs
kubectl get crd kafkaclusters.kafka.banzaicloud.io -o yaml | head -20
Expected output:
cruisecontroloperations.kafka.banzaicloud.io 2024-01-15T10:30:00Z
kafkaclusters.kafka.banzaicloud.io 2024-01-15T10:30:00Z
kafkatopics.kafka.banzaicloud.io 2024-01-15T10:30:00Z
kafkausers.kafka.banzaicloud.io 2024-01-15T10:30:00Z
3. Install Koperator using Helm
Now install the Koperator using the OCI Helm chart:
# Install Koperator using OCI Helm chart
helm install kafka-operator oci://ghcr.io/adobe/helm-charts/kafka-operator \
--namespace=kafka \
--set webhook.enabled=false \
--version 0.28.0-adobe-20250923
# Wait for the operator to be ready
kubectl wait --for=condition=Available deployment --all -n kafka --timeout=300s
Expected output:
Pulled: ghcr.io/adobe/helm-charts/kafka-operator:0.28.0-adobe-20250923
Digest: sha256:...
NAME: kafka-operator
LAST DEPLOYED: Mon Jan 15 10:35:00 2024
NAMESPACE: kafka
STATUS: deployed
REVISION: 1
4. Verify Koperator Installation
Check Operator Pods
# Check Koperator pods
kubectl get pods -n kafka
# Check pod details
kubectl describe pods -n kafka
# Check operator logs
kubectl logs -l app.kubernetes.io/instance=kafka-operator -c manager -n kafka
Expected output:
NAME READY STATUS RESTARTS AGE
kafka-operator-operator-xyz123-abc456 2/2 Running 0 2m
Check Operator Services
# Check services in kafka namespace
kubectl get svc -n kafka
# Check operator deployment
kubectl get deployment -n kafka
Verify Operator Functionality
# Check if the operator is watching for KafkaCluster resources
kubectl get kafkaclusters -n kafka
# Check operator configuration
kubectl get deployment kafka-operator-operator -n kafka -o yaml | grep -A 10 -B 10 image:
5. Understanding Koperator Components
The Koperator installation includes several components:
Manager Container
- Purpose: Main operator logic
- Responsibilities: Watches Kafka CRDs and manages Kafka clusters
- Resource Management: Creates and manages Kafka broker pods, services, and configurations
Webhook (Disabled)
- Purpose: Admission control and validation
- Status: Disabled in this tutorial for simplicity
- Production Note: Should be enabled in production environments
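If you later decide to enable the webhook, the chart flag set during installation can be flipped with a Helm upgrade; a sketch, assuming the chart and version used in this tutorial:
helm upgrade kafka-operator oci://ghcr.io/adobe/helm-charts/kafka-operator \
--namespace=kafka \
--set webhook.enabled=true \
--version 0.28.0-adobe-20250923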
RBAC Resources
The operator creates several RBAC resources:
- ServiceAccount
- ClusterRole and ClusterRoleBinding
- Role and RoleBinding
# Check RBAC resources
kubectl get serviceaccount -n kafka
kubectl get clusterrole | grep kafka-operator
kubectl get rolebinding -n kafka
6. Operator Configuration
View Operator Configuration
# Check operator deployment configuration
kubectl get deployment kafka-operator-operator -n kafka -o yaml
# Check operator environment variables
kubectl get deployment kafka-operator-operator -n kafka -o jsonpath='{.spec.template.spec.containers[0].env}' | jq .
Key Configuration Options
The operator is configured with the following key settings:
- Webhook disabled: Simplifies the tutorial setup
- Namespace: Operates in the kafka namespace
- Image: Uses the official Adobe Koperator image
- Version: 0.28.0-adobe-20250923
7. Operator Capabilities
The Koperator provides the following capabilities:
Kafka Cluster Management
- Automated broker deployment and scaling
- Rolling updates and configuration changes
- Persistent volume management
- Network policy configuration
Security Features
- TLS/SSL certificate management
- SASL authentication support
- Network encryption
- User and ACL management
Monitoring Integration
- JMX metrics exposure
- Prometheus integration
- Grafana dashboard support
- Custom alerting rules
Advanced Features
- Cruise Control integration for rebalancing
- External access configuration
- Multi-AZ deployment support
- Rack awareness
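As a concrete example of the CRD-driven management listed above, topics can be declared as Kubernetes resources once a cluster exists. The following is a sketch using the kafkatopics.kafka.banzaicloud.io CRD installed earlier; the topic name, partition count, and config values are illustrative:
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaTopic
metadata:
  name: example-topic
  namespace: kafka
spec:
  clusterRef:
    name: kafka
  name: example-topic
  partitions: 3
  replicationFactor: 2
  config:
    "retention.ms": "604800000"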
8. Troubleshooting
Operator Not Starting
If the operator pod is not starting:
# Check pod events
kubectl describe pod -l app.kubernetes.io/instance=kafka-operator -n kafka
# Check operator logs
kubectl logs -l app.kubernetes.io/instance=kafka-operator -c manager -n kafka --previous
# Check resource constraints
kubectl top pod -n kafka
CRD Issues
If CRDs are not properly installed:
# Reinstall CRDs
kubectl delete crd kafkaclusters.kafka.banzaicloud.io
kubectl apply --server-side -f https://raw.githubusercontent.com/adobe/koperator/refs/heads/master/config/base/crds/kafka.banzaicloud.io_kafkaclusters.yaml
# Check CRD status
kubectl get crd kafkaclusters.kafka.banzaicloud.io -o yaml
Helm Installation Issues
If Helm installation fails:
# Check Helm release status
helm list -n kafka
# Uninstall and reinstall
helm uninstall kafka-operator -n kafka
helm install kafka-operator oci://ghcr.io/adobe/helm-charts/kafka-operator \
--namespace=kafka \
--set webhook.enabled=false \
--version 0.28.0-adobe-20250923
9. Verification Checklist
Before proceeding to the next section, ensure:
echo "=== Namespace ==="
kubectl get namespace kafka
echo -e "\n=== CRDs ==="
kubectl get crd | grep kafka.banzaicloud.io
echo -e "\n=== Operator Pod ==="
kubectl get pods -n kafka
echo -e "\n=== Operator Logs (last 10 lines) ==="
kubectl logs -l app.kubernetes.io/instance=kafka-operator -c manager -n kafka --tail=10
echo -e "\n=== Ready for Kafka Cluster Deployment ==="
kubectl get kafkaclusters -n kafka
Expected final output:
=== Namespace ===
NAME STATUS AGE
kafka Active 10m
=== CRDs ===
cruisecontroloperations.kafka.banzaicloud.io 2024-01-15T10:30:00Z
kafkaclusters.kafka.banzaicloud.io 2024-01-15T10:30:00Z
kafkatopics.kafka.banzaicloud.io 2024-01-15T10:30:00Z
kafkausers.kafka.banzaicloud.io 2024-01-15T10:30:00Z
=== Operator Pod ===
NAME READY STATUS RESTARTS AGE
kafka-operator-operator-xyz123-abc456 2/2 Running 0 5m
=== Ready for Kafka Cluster Deployment ===
No resources found in kafka namespace.
Next Steps
With the Koperator successfully installed and running, you’re now ready to deploy a Kafka cluster. Continue to the Kafka Cluster Deployment section to create your first Kafka cluster.
Note: The operator will continuously monitor the kafka namespace for KafkaCluster resources. Once you create a KafkaCluster resource, the operator will automatically provision the necessary Kafka infrastructure.
20.5 - Kafka Cluster Deployment
Kafka Cluster Deployment
In this section, you’ll deploy a production-ready Kafka cluster with monitoring, alerting, and dashboard integration. The deployment will demonstrate multi-AZ distribution, persistent storage, and comprehensive observability.
Overview
We’ll deploy:
- A 6-broker Kafka cluster distributed across 3 availability zones
- Prometheus monitoring with ServiceMonitor resources
- AlertManager rules for auto-scaling and alerting
- Grafana dashboard for Kafka metrics visualization
- Cruise Control for cluster management and rebalancing
1. Deploy Kafka Cluster
Create Kafka Cluster Configuration
First, let’s create a comprehensive Kafka cluster configuration:
cd $TUTORIAL_DIR
# Create the KafkaCluster resource
kubectl create -n kafka -f - <<EOF
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
name: kafka
namespace: kafka
spec:
headlessServiceEnabled: true
zkAddresses:
- "zk-client.zookeeper:2181"
rackAwareness:
labels:
- "topology.kubernetes.io/zone"
brokerConfigGroups:
default:
brokerAnnotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9020"
storageConfigs:
- mountPath: "/kafka-logs"
pvcSpec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
serviceAccountName: "default"
resourceRequirements:
limits:
cpu: "2"
memory: "4Gi"
requests:
cpu: "1"
memory: "2Gi"
jvmPerformanceOpts: "-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true -Dsun.awt.fontpath=/usr/share/fonts/TTF"
config:
"auto.create.topics.enable": "true"
"cruise.control.metrics.topic.auto.create": "true"
"cruise.control.metrics.topic.num.partitions": "1"
"cruise.control.metrics.topic.replication.factor": "2"
"default.replication.factor": "2"
"min.insync.replicas": "1"
"num.partitions": "3"
"offsets.topic.replication.factor": "2"
"replica.lag.time.max.ms": "30000"
"transaction.state.log.replication.factor": "2"
"transaction.state.log.min.isr": "1"
brokers:
- id: 101
brokerConfigGroup: "default"
nodePortExternalIP:
externalIP: "127.0.0.1"
- id: 102
brokerConfigGroup: "default"
nodePortExternalIP:
externalIP: "127.0.0.1"
- id: 201
brokerConfigGroup: "default"
nodePortExternalIP:
externalIP: "127.0.0.1"
- id: 202
brokerConfigGroup: "default"
nodePortExternalIP:
externalIP: "127.0.0.1"
- id: 301
brokerConfigGroup: "default"
nodePortExternalIP:
externalIP: "127.0.0.1"
- id: 302
brokerConfigGroup: "default"
nodePortExternalIP:
externalIP: "127.0.0.1"
rollingUpgradeConfig:
failureThreshold: 1
cruiseControlConfig:
cruiseControlEndpoint: "kafka-cruisecontrol-svc.kafka:8090"
config: |
# Copyright 2017 LinkedIn Corp. Licensed under the BSD 2-Clause License (the "License").
# Sample Cruise Control configuration file.
# Configuration for the metadata client.
# =======================================
# The maximum interval in milliseconds between two metadata refreshes.
metadata.max.age.ms=300000
# Client id for the Cruise Control. It is used for the metadata client.
client.id=kafka-cruise-control
# The size of TCP send buffer for Kafka network client.
send.buffer.bytes=131072
# The size of TCP receive buffer for Kafka network client.
receive.buffer.bytes=131072
# The time to wait for response from a server.
request.timeout.ms=30000
# Configurations for the load monitor
# ===================================
# The number of metric fetcher thread to fetch metrics for the Kafka cluster
num.metric.fetchers=1
# The metric sampler class
metric.sampler.class=com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsReporterSampler
# Configurations for CruiseControlMetricsReporter
cruise.control.metrics.reporter.interval.ms=10000
cruise.control.metrics.reporter.kubernetes.mode=true
# The sample store class name
sample.store.class=com.linkedin.kafka.cruisecontrol.monitor.sampling.KafkaSampleStore
# The config for the Kafka sample store to save the partition metric samples
partition.metric.sample.store.topic=__CruiseControlMetrics
# The config for the Kafka sample store to save the model training samples
broker.metric.sample.store.topic=__CruiseControlModelTrainingSamples
# The replication factor of Kafka metric sample store topic
sample.store.topic.replication.factor=2
# The config for the number of Kafka sample store consumer threads
num.sample.loading.threads=8
# The partition assignor class for the metric samplers
metric.sampler.partition.assignor.class=com.linkedin.kafka.cruisecontrol.monitor.sampling.DefaultMetricSamplerPartitionAssignor
# The metric sampling interval in milliseconds
metric.sampling.interval.ms=120000
# The partition metrics window size in milliseconds
partition.metrics.window.ms=300000
# The number of partition metric windows to keep in memory
num.partition.metrics.windows=1
# The minimum partition metric samples required for a partition in each window
min.samples.per.partition.metrics.window=1
# The broker metrics window size in milliseconds
broker.metrics.window.ms=300000
# The number of broker metric windows to keep in memory
num.broker.metrics.windows=20
# The minimum broker metric samples required for a broker in each window
min.samples.per.broker.metrics.window=1
# The configuration for the BrokerCapacityConfigFileResolver (supports JBOD and non-JBOD broker capacities)
capacity.config.file=config/capacity.json
# Configurations for the analyzer
# ===============================
# The list of goals to optimize the Kafka cluster for with pre-computed proposals
default.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.PotentialNwOutGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.TopicReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderBytesInDistributionGoal
# The list of supported goals
goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.PotentialNwOutGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.TopicReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderBytesInDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.PreferredLeaderElectionGoal
# The list of supported hard goals
hard.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal
# The minimum percentage of well monitored partitions out of all the partitions
min.monitored.partition.percentage=0.95
# The balance threshold for CPU
cpu.balance.threshold=1.1
# The balance threshold for disk
disk.balance.threshold=1.1
# The balance threshold for network inbound utilization
network.inbound.balance.threshold=1.1
# The balance threshold for network outbound utilization
network.outbound.balance.threshold=1.1
# The balance threshold for the replica count
replica.count.balance.threshold=1.1
# The capacity threshold for CPU in percentage
cpu.capacity.threshold=0.8
# The capacity threshold for disk in percentage
disk.capacity.threshold=0.8
# The capacity threshold for network inbound utilization in percentage
network.inbound.capacity.threshold=0.8
# The capacity threshold for network outbound utilization in percentage
network.outbound.capacity.threshold=0.8
# The threshold for the number of replicas per broker
replica.capacity.threshold=1000
# The weight adjustment in the optimization algorithm
cpu.low.utilization.threshold=0.0
disk.low.utilization.threshold=0.0
network.inbound.low.utilization.threshold=0.0
network.outbound.low.utilization.threshold=0.0
# The metric anomaly percentile upper threshold
metric.anomaly.percentile.upper.threshold=90.0
# The metric anomaly percentile lower threshold
metric.anomaly.percentile.lower.threshold=10.0
# How often should the cached proposal be expired and recalculated if necessary
proposal.expiration.ms=60000
# The maximum number of replicas that can reside on a broker at any given time.
max.replicas.per.broker=10000
# The number of threads to use for proposal candidate precomputing.
num.proposal.precompute.threads=1
# the topics that should be excluded from the partition movement.
#topics.excluded.from.partition.movement
# Configurations for the executor
# ===============================
# The max number of partitions to move in/out on a given broker at a given time.
num.concurrent.partition.movements.per.broker=10
# The interval between two execution progress checks.
execution.progress.check.interval.ms=10000
# Configurations for anomaly detector
# ===================================
# The goal violation notifier class
anomaly.notifier.class=com.linkedin.kafka.cruisecontrol.detector.notifier.SelfHealingNotifier
# The metric anomaly finder class
metric.anomaly.finder.class=com.linkedin.kafka.cruisecontrol.detector.KafkaMetricAnomalyFinder
# The anomaly detection interval
anomaly.detection.interval.ms=10000
# The goal violation to detect.
anomaly.detection.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal
# The interested metrics for metric anomaly analyzer.
metric.anomaly.analyzer.metrics=BROKER_PRODUCE_LOCAL_TIME_MS_MAX,BROKER_PRODUCE_LOCAL_TIME_MS_MEAN,BROKER_CONSUMER_FETCH_LOCAL_TIME_MS_MAX,BROKER_CONSUMER_FETCH_LOCAL_TIME_MS_MEAN,BROKER_FOLLOWER_FETCH_LOCAL_TIME_MS_MAX,BROKER_FOLLOWER_FETCH_LOCAL_TIME_MS_MEAN,BROKER_LOG_FLUSH_TIME_MS_MAX,BROKER_LOG_FLUSH_TIME_MS_MEAN
# The zk path to store the anomaly detector state. This is to avoid duplicate anomaly detection due to controller failure.
anomaly.detection.state.path=/CruiseControlAnomalyDetector/AnomalyDetectorState
# Enable self healing for all anomaly detectors, unless the particular anomaly detector is explicitly disabled
self.healing.enabled=true
# Enable self healing for broker failure detector
self.healing.broker.failure.enabled=true
# Enable self healing for goal violation detector
self.healing.goal.violation.enabled=true
# Enable self healing for metric anomaly detector
self.healing.metric.anomaly.enabled=true
# configurations for the webserver
# ================================
# HTTP listen port for the Cruise Control
webserver.http.port=8090
# HTTP listen address for the Cruise Control
webserver.http.address=0.0.0.0
# Whether CORS support is enabled for API or not
webserver.http.cors.enabled=false
# Value for Access-Control-Allow-Origin
webserver.http.cors.origin=http://localhost:8090/
# Value for Access-Control-Request-Method
webserver.http.cors.allowmethods=OPTIONS,GET,POST
# Headers that should be exposed to the Browser (Webapp)
# This is a special header that is used by the
# User Tasks subsystem and should be explicitly
# Enabled when CORS mode is used as part of the
# Admin Interface
webserver.http.cors.exposeheaders=User-Task-ID
# REST API default prefix (dont forget the ending /*)
webserver.api.urlprefix=/kafkacruisecontrol/*
# Location where the Cruise Control frontend is deployed
webserver.ui.diskpath=./cruise-control-ui/dist/
# URL path prefix for UI
webserver.ui.urlprefix=/*
# Time After which request is converted to Async
webserver.request.maxBlockTimeMs=10000
# Default Session Expiry Period
webserver.session.maxExpiryTimeMs=60000
# Session cookie path
webserver.session.path=/
# Server Access Logs
webserver.accesslog.enabled=true
# Location of HTTP Request Logs
webserver.accesslog.path=access.log
# HTTP Request Log retention days
webserver.accesslog.retention.days=14
EOF
Monitor Cluster Deployment
Watch the cluster deployment progress:
# Watch KafkaCluster status
kubectl get kafkacluster kafka -n kafka -w -o wide
# Press Ctrl+C to stop watching when cluster is ready
# Check broker pods
kubectl get pods -n kafka -l kafka_cr=kafka
# Check all resources in kafka namespace
kubectl get all -n kafka
Expected output (after 5-10 minutes):
NAME AGE WARNINGS
kafka 10m
NAME READY STATUS RESTARTS AGE
pod/kafka-101-xyz123 1/1 Running 0 8m
pod/kafka-102-abc456 1/1 Running 0 8m
pod/kafka-201-def789 1/1 Running 0 8m
pod/kafka-202-ghi012 1/1 Running 0 8m
pod/kafka-301-jkl345 1/1 Running 0 8m
pod/kafka-302-mno678 1/1 Running 0 8m
pod/kafka-cruisecontrol-xyz789 1/1 Running 0 6m
2. Configure Monitoring and Alerting
Create Prometheus ServiceMonitor
Create monitoring configuration for Prometheus to scrape Kafka metrics:
# Create ServiceMonitor for Kafka metrics
kubectl apply -n kafka -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: kafka-servicemonitor
namespace: kafka
labels:
app: kafka
release: monitoring
spec:
selector:
matchLabels:
app: kafka
endpoints:
- port: metrics
interval: 30s
path: /metrics
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: kafka-jmx-servicemonitor
namespace: kafka
labels:
app: kafka-jmx
release: monitoring
spec:
selector:
matchLabels:
app: kafka
endpoints:
- port: jmx-metrics
interval: 30s
path: /metrics
EOF
Create AlertManager Rules
Set up alerting rules for Kafka monitoring and auto-scaling:
# Create PrometheusRule for Kafka alerting
kubectl apply -n kafka -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: kafka-alerts
namespace: kafka
labels:
app: kafka
release: monitoring
spec:
groups:
- name: kafka.rules
rules:
- alert: KafkaOfflinePartitions
expr: kafka_controller_kafkacontroller_offlinepartitionscount > 0
for: 5m
labels:
severity: critical
annotations:
summary: "Kafka has offline partitions"
description: "Kafka cluster {{ \$labels.instance }} has {{ \$value }} offline partitions"
- alert: KafkaUnderReplicatedPartitions
expr: kafka_server_replicamanager_underreplicatedpartitions > 0
for: 5m
labels:
severity: warning
annotations:
summary: "Kafka has under-replicated partitions"
description: "Kafka cluster {{ \$labels.instance }} has {{ \$value }} under-replicated partitions"
- alert: KafkaHighProducerRequestRate
expr: rate(kafka_network_requestmetrics_requests_total{request="Produce"}[5m]) > 1000
for: 10m
labels:
severity: warning
command: "upScale"
annotations:
summary: "High Kafka producer request rate"
description: "Kafka producer request rate is {{ \$value }} requests/sec"
- alert: KafkaLowProducerRequestRate
expr: rate(kafka_network_requestmetrics_requests_total{request="Produce"}[5m]) < 100
for: 30m
labels:
severity: info
command: "downScale"
annotations:
summary: "Low Kafka producer request rate"
description: "Kafka producer request rate is {{ \$value }} requests/sec"
EOF
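Optionally, confirm that the ServiceMonitor and PrometheusRule resources were created:
kubectl get servicemonitors,prometheusrules -n kafka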
3. Load Grafana Dashboard
Apply Complete Kafka Dashboard
The complete Kafka Looking Glass dashboard provides comprehensive monitoring with dozens of panels covering all aspects of Kafka performance:
# Apply the complete Kafka Looking Glass dashboard directly
kubectl apply -n default \
-f https://raw.githubusercontent.com/amuraru/k8s-kafka-operator/master/grafana-dashboard.yaml
Dashboard Features
The complete Kafka Looking Glass dashboard includes:
Overview Section:
- Brokers online count
- Cluster version information
- Active controllers
- Topic count
- Offline partitions
- Under-replicated partitions
Performance Metrics:
- Message throughput (in/out per second)
- Bytes throughput (in/out per second)
- Request latency breakdown
- Network request metrics
- Replication rates
Broker Health:
- JVM memory usage
- Garbage collection metrics
- Thread states
- Log flush times
- Disk usage
Topic Analysis:
- Per-topic throughput
- Partition distribution
- Leader distribution
- Consumer lag metrics
ZooKeeper Integration:
- ZooKeeper quorum size
- Leader count
- Request latency
- Digest mismatches
Error Monitoring:
- Offline broker disks
- Orphan replicas
- Under-replicated partitions
- Network issues
4. Verify Deployment
Check Cluster Status
# Describe the KafkaCluster
kubectl describe kafkacluster kafka -n kafka
# Check broker distribution across zones
kubectl get pods -n kafka -l kafka_cr=kafka -o wide
# Check persistent volumes
kubectl get pv,pvc -n kafka
Access Cruise Control
# Port forward to Cruise Control (in a separate terminal)
kubectl port-forward -n kafka svc/kafka-cruisecontrol-svc 8090:8090 &
# Check Cruise Control status (optional)
curl -s http://localhost:8090/kafkacruisecontrol/v1/state | jq .
Verify Monitoring Integration
# Check if Prometheus is scraping Kafka metrics
kubectl port-forward -n default svc/monitoring-kube-prometheus-prometheus 9090 &
# Visit http://localhost:9090 and search for kafka_ metrics
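# Alternatively, query the Prometheus HTTP API through the same port-forward
# (the metric name below is taken from the alert rules defined earlier):
curl -s 'http://localhost:9090/api/v1/query?query=kafka_server_replicamanager_underreplicatedpartitions' | jq .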
Access the Kafka Looking Glass Dashboard
# Get Grafana admin password
kubectl get secret --namespace default monitoring-grafana \
-o jsonpath="{.data.admin-password}" | base64 --decode
echo ""
# Port forward to Grafana
kubectl port-forward -n default svc/monitoring-grafana 3000:80 &
Visit http://localhost:3000 and:
- Login with admin/[password from above]
- Navigate to Dashboards → Browse
- Look for “Kafka Looking Glass” dashboard
- The dashboard should show real-time metrics from your Kafka cluster
The dashboard will automatically detect your cluster using the template variables:
- Namespace: Should auto-select “kafka”
- Cluster Name: Should auto-select “kafka”
- Broker: Shows all brokers (101, 102, 201, 202, 301, 302)
- Topic: Shows all topics in your cluster
Next Steps
With your Kafka cluster successfully deployed and monitoring configured, you can now proceed to test the deployment. Continue to the Testing and Validation section to create topics and run producer/consumer tests.
Note: The cluster deployment may take 10-15 minutes to complete. The brokers will be distributed across the three availability zones you configured earlier, providing high availability and fault tolerance.
20.6 - Testing and Validation
Testing and Validation
In this section, you’ll thoroughly test your Kafka cluster deployment by creating topics, running producers and consumers, and performing performance tests. This validates that your cluster is working correctly and can handle production workloads.
Overview
We’ll perform the following tests:
- Basic connectivity tests - Verify cluster accessibility
- Topic management - Create, list, and configure topics
- Producer/Consumer tests - Send and receive messages
- Performance testing - Load testing with high throughput
- Monitoring validation - Verify metrics collection
- Multi-AZ validation - Confirm rack awareness
1. Basic Connectivity Tests
List Existing Topics
First, verify that you can connect to the Kafka cluster:
# List topics using kubectl run
kubectl run kafka-topics -n kafka --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--list
Expected output:
__CruiseControlMetrics
__CruiseControlModelTrainingSamples
__consumer_offsets
# Get cluster metadata
kubectl run kafka-metadata --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-broker-api-versions.sh \
--bootstrap-server kafka-headless:29092
2. Topic Management
Create a Test Topic
Create a topic with multiple partitions and replicas. The --replica-assignment flag lists the replica placement for each partition in order: each comma-separated group such as 101:201:301 assigns one partition's replicas to brokers in three different availability zones:
# Create a test topic with 12 partitions distributed across brokers
kubectl run kafka-topics --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--topic perf_topic \
--replica-assignment 101:201:301,102:202:302,101:201:301,102:202:302,101:201:301,102:202:302,101:201:301,102:202:302,101:201:301,102:202:302,101:201:301,102:202:302 \
--create
Describe the Topic
# Describe the topic to verify configuration
kubectl run kafka-topics --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--topic perf_topic \
--describe
Expected output:
Topic: perf_topic TopicId: xyz-123-abc PartitionCount: 12 ReplicationFactor: 3
Topic: perf_topic Partition: 0 Leader: 101 Replicas: 101,201,301 Isr: 101,201,301
Topic: perf_topic Partition: 1 Leader: 102 Replicas: 102,202,302 Isr: 102,202,302
...
# Set custom retention period (12 minutes for testing)
kubectl run kafka-configs --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-configs.sh \
--bootstrap-server kafka-headless:29092 \
--alter --entity-name perf_topic \
--entity-type topics \
--add-config retention.ms=720000
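To confirm the override took effect, read the topic configuration back. The pod name below is arbitrary; the bootstrap address is the same one used throughout this section:
# Verify the retention override on the topic
kubectl run kafka-configs-verify --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-configs.sh \
--bootstrap-server kafka-headless:29092 \
--entity-type topics \
--entity-name perf_topic \
--describe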
3. Producer/Consumer Tests
Simple Message Test
Start a Producer
# Start a simple producer (run in one terminal)
kubectl run kafka-producer \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-console-producer.sh \
--bootstrap-server kafka-headless:29092 \
--topic perf_topic
Type some test messages:
Hello Kafka!
This is a test message
Testing multi-AZ deployment
Start a Consumer
# Start a consumer (run in another terminal)
kubectl run kafka-consumer \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-console-consumer.sh \
--bootstrap-server kafka-headless:29092 \
--topic perf_topic \
--from-beginning
You should see the messages you sent from the producer.
Clean Up Test Pods
# Clean up the test pods
kubectl delete pod kafka-producer --ignore-not-found=true
kubectl delete pod kafka-consumer --ignore-not-found=true
4. Performance Testing
Run a high-throughput producer test:
# Start producer performance test
kubectl run kafka-producer-perf \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-producer-perf-test.sh \
--producer-props bootstrap.servers=kafka-headless:29092 acks=all \
--topic perf_topic \
--record-size 1000 \
--throughput 29000 \
--num-records 2110000
Expected output:
100000 records sent, 28500.0 records/sec (27.18 MB/sec), 2.1 ms avg latency, 45 ms max latency.
200000 records sent, 29000.0 records/sec (27.66 MB/sec), 1.8 ms avg latency, 38 ms max latency.
...
In another terminal, run a consumer performance test:
# Start consumer performance test
kubectl run kafka-consumer-perf \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-consumer-perf-test.sh \
--broker-list kafka-headless:29092 \
--group perf-consume \
--messages 10000000 \
--topic perf_topic \
--show-detailed-stats \
--from-latest \
--timeout 100000
Expected output:
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2024-01-15 10:30:00:000, 2024-01-15 10:30:10:000, 95.37, 9.54, 100000, 10000.0, 1500, 8500, 11.22, 11764.7
5. Monitoring Validation
Check Kafka Metrics in Prometheus
# Port forward to Prometheus (if not already done)
kubectl port-forward -n default svc/monitoring-kube-prometheus-prometheus 9090 &
# Check if Kafka metrics are being collected
curl -s "http://localhost:9090/api/v1/query?query=kafka_server_brokertopicmetrics_messagesin_total" | jq .
Access Grafana Dashboard
# Port forward to Grafana (if not already done)
kubectl port-forward -n default svc/monitoring-grafana 3000:80 &
# Get Grafana admin password
kubectl get secret --namespace default monitoring-grafana \
-o jsonpath="{.data.admin-password}" | base64 --decode
echo ""
Visit http://localhost:3000 and:
- Login with admin/[password]
- Navigate to Dashboards
- Look for “Kafka Looking Glass” dashboard
- Verify metrics are being displayed
Check AlertManager
# Port forward to AlertManager
kubectl port-forward -n default svc/monitoring-kube-prometheus-alertmanager 9093 &
Visit http://localhost:9093 to see any active alerts.
6. Multi-AZ Validation
Verify Broker Distribution
Check that brokers are distributed across availability zones:
# Check broker pod distribution
kubectl get pods -n kafka -l kafka_cr=kafka -o wide \
--sort-by='.spec.nodeName'
# Check node labels
kubectl get nodes \
--label-columns=topology.kubernetes.io/zone \
-l topology.kubernetes.io/zone
Verify Rack Awareness
# Check if rack awareness is working by examining topic partition distribution
kubectl run kafka-topics --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--topic perf_topic \
--describe
Verify that replicas are distributed across different broker IDs (which correspond to different AZs).
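To see which zone each broker actually landed in, you can join each pod's node with that node's zone label. A small sketch; it assumes the topology.kubernetes.io/zone label used in the node listing above:
# Map each broker pod to its node's availability zone
for pod in $(kubectl get pods -n kafka -l kafka_cr=kafka -o jsonpath='{.items[*].metadata.name}'); do
node=$(kubectl get pod -n kafka $pod -o jsonpath='{.spec.nodeName}')
zone=$(kubectl get node $node -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}')
echo "$pod -> $node ($zone)"
done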
7. Advanced Testing
Test Topic Creation via CRD
Create a topic using Kubernetes CRD:
# Create topic using KafkaTopic CRD
kubectl apply -n kafka -f - <<EOF
apiVersion: kafka.banzaicloud.io/v1alpha1
kind: KafkaTopic
metadata:
name: test-topic-crd
namespace: kafka
spec:
clusterRef:
name: kafka
name: test-topic-crd
partitions: 6
replicationFactor: 2
config:
"retention.ms": "604800000"
"cleanup.policy": "delete"
EOF
Verify CRD Topic Creation
# Check KafkaTopic resource
kubectl get kafkatopic -n kafka
# Verify topic exists in Kafka
kubectl run kafka-topics --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--list | grep test-topic-crd
Test Consumer Groups
# Create multiple consumers in the same group
for i in {1..3}; do
kubectl run kafka-consumer-group-$i \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-console-consumer.sh \
--bootstrap-server kafka-headless:29092 \
--topic perf_topic \
--group test-group &
done
# Check consumer group status
kubectl run kafka-consumer-groups --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-consumer-groups.sh \
--bootstrap-server kafka-headless:29092 \
--group test-group \
--describe
8. Expected Performance Results
After running the performance tests, you should see metrics similar to:
Producer Performance
- Throughput: 25,000-30,000 records/sec
- Latency: 1-3 ms average
- Record Size: 1KB
Consumer Performance
- Throughput: 10,000+ records/sec
- Lag: Minimal (< 100 records)
Resource Utilization
- CPU: 20-40% per broker
- Memory: 2-3GB per broker
- Disk I/O: Moderate
9. Cleanup Test Resources
# Clean up performance test pods
kubectl delete pod kafka-producer-perf --ignore-not-found=true
kubectl delete pod kafka-consumer-perf --ignore-not-found=true
# Clean up consumer group pods
for i in {1..3}; do
kubectl delete pod kafka-consumer-group-$i --ignore-not-found=true
done
# Optionally delete test topics
kubectl delete kafkatopic test-topic-crd -n kafka
Troubleshooting
Producer/Consumer Connection Issues
# Check broker connectivity
kubectl run kafka-test --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /bin/bash
# Inside the pod, test connectivity
telnet kafka-headless 29092
# Check broker resource usage
kubectl top pods -n kafka
# Check broker logs
kubectl logs -n kafka kafka-101-xyz123
# Check JVM metrics
kubectl exec -n kafka kafka-101-xyz123 -- jps -v
Monitoring Issues
# Check ServiceMonitor
kubectl get servicemonitor -n kafka
# Check Prometheus targets
curl -s "http://localhost:9090/api/v1/targets" | jq '.data.activeTargets[] | select(.labels.job | contains("kafka"))'
Next Steps
With your Kafka cluster thoroughly tested and validated, you can now explore disaster recovery scenarios. Continue to the Disaster Recovery Scenarios section to test failure handling and recovery procedures.
Note: Keep the performance test results for comparison after implementing any configuration changes or during disaster recovery testing.
20.7 - Disaster Recovery Scenarios
Disaster Recovery Scenarios
In this section, you’ll test various failure scenarios to understand how the Koperator handles disasters and recovers from failures. This is crucial for understanding the resilience of your Kafka deployment and validating that data persistence works correctly.
Overview
We’ll test the following disaster scenarios:
- Broker JVM crash - Process failure within a pod
- Broker pod deletion - Kubernetes pod failure
- Node failure simulation - Worker node unavailability
- Persistent volume validation - Data persistence across failures
- Network partition simulation - Connectivity issues
- ZooKeeper failure - Dependency service failure
Prerequisites
Before starting disaster recovery tests, ensure you have:
# Verify cluster is healthy
kubectl get kafkacluster kafka -n kafka
kubectl get pods -n kafka -l kafka_cr=kafka
# Create a test topic with data
kubectl run kafka-producer-dr --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-console-producer.sh \
--bootstrap-server kafka-headless:29092 \
--topic disaster-recovery-test <<EOF
message-1-before-disaster
message-2-before-disaster
message-3-before-disaster
EOF
1. Initial State Documentation
Record Current State
Document the initial state before testing disasters:
echo "=== Initial Kafka Cluster State ==="
# Get broker pods
echo "Kafka Broker Pods:"
kubectl get pods -l kafka_cr=kafka -n kafka -o wide
# Get persistent volumes
echo -e "\nPersistent Volumes:"
kubectl get pv | grep kafka
# Get persistent volume claims
echo -e "\nPersistent Volume Claims:"
kubectl get pvc -n kafka | grep kafka
# Get broker services
echo -e "\nKafka Services:"
kubectl get svc -n kafka | grep kafka
# Save state to file for comparison
kubectl get pods -l kafka_cr=kafka -n kafka -o yaml > /tmp/kafka-pods-initial.yaml
kubectl get pvc -n kafka -o yaml > /tmp/kafka-pvc-initial.yaml
Expected output:
Kafka Broker Pods:
NAME READY STATUS RESTARTS AGE IP NODE
kafka-101 1/1 Running 0 30m 10.244.1.5 kafka-worker
kafka-102 1/1 Running 0 30m 10.244.2.5 kafka-worker2
kafka-201 1/1 Running 0 30m 10.244.3.5 kafka-worker3
kafka-202 1/1 Running 0 30m 10.244.4.5 kafka-worker4
kafka-301 1/1 Running 0 30m 10.244.5.5 kafka-worker5
kafka-302 1/1 Running 0 30m 10.244.6.5 kafka-worker6
2. Broker JVM Crash Test
Simulate JVM Crash
Kill the Java process inside a broker pod:
# Get a broker pod name
BROKER_POD=$(kubectl get pods -n kafka -l kafka_cr=kafka -o jsonpath='{.items[0].metadata.name}')
echo "Testing JVM crash on pod: $BROKER_POD"
# Kill the Java process (PID 1 in the container)
kubectl exec -n kafka $BROKER_POD -- kill 1
# Monitor pod restart
kubectl get pods -n kafka -l kafka_cr=kafka -w
# Press Ctrl+C after observing the restart
Verify Recovery
# Check if pod restarted
kubectl get pods -n kafka -l kafka_cr=kafka
# Verify the same PVC is reused
kubectl describe pod -n kafka $BROKER_POD | grep -A 5 "Volumes:"
# Test data persistence
kubectl run kafka-consumer-dr --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-console-consumer.sh \
--bootstrap-server kafka-headless:29092 \
--topic disaster-recovery-test \
--from-beginning \
--timeout-ms 10000
Expected Result: ✅ PASSED - Pod restarts, same PVC is reused, data is preserved.
3. Broker Pod Deletion Test
Delete a Broker Pod
# Get another broker pod
BROKER_POD_2=$(kubectl get pods -n kafka -l kafka_cr=kafka -o jsonpath='{.items[1].metadata.name}')
echo "Testing pod deletion on: $BROKER_POD_2"
# Record the PVC before deletion
kubectl get pod -n kafka $BROKER_POD_2 -o yaml | grep -A 10 "volumes:" > /tmp/pvc-before-deletion.yaml
# Delete the pod
kubectl delete pod -n kafka $BROKER_POD_2
# Monitor recreation
kubectl get pods -n kafka -l kafka_cr=kafka -w
# Press Ctrl+C after new pod is running
Verify Pod Recreation
# Check new pod is running
kubectl get pods -n kafka -l kafka_cr=kafka
# Verify PVC reattachment
NEW_BROKER_POD=$(kubectl get pods -n kafka -l kafka_cr=kafka | grep $BROKER_POD_2 | awk '{print $1}')
kubectl get pod -n kafka $NEW_BROKER_POD -o yaml | grep -A 10 "volumes:" > /tmp/pvc-after-deletion.yaml
# Compare PVC usage
echo "PVC comparison:"
diff /tmp/pvc-before-deletion.yaml /tmp/pvc-after-deletion.yaml || echo "PVCs are identical - Good!"
# Test cluster functionality
kubectl run kafka-test-after-deletion --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--list
Expected Result: ✅ PASSED - New pod created, same PVC reattached, cluster functional.
4. Node Failure Simulation
Cordon and Drain a Node
# Get a worker node with Kafka pods
NODE_WITH_KAFKA=$(kubectl get pods -n kafka -l kafka_cr=kafka -o wide | grep kafka | head -1 | awk '{print $7}')
echo "Simulating failure on node: $NODE_WITH_KAFKA"
# Get pods on this node before cordoning
echo "Pods on node before cordoning:"
kubectl get pods -n kafka -l kafka_cr=kafka -o wide | grep $NODE_WITH_KAFKA
# Cordon the node (prevent new pods)
kubectl cordon $NODE_WITH_KAFKA
# Drain the node (evict existing pods)
kubectl drain $NODE_WITH_KAFKA --ignore-daemonsets --delete-emptydir-data --force
Monitor Pod Rescheduling
# Watch pods being rescheduled
kubectl get pods -n kafka -l kafka_cr=kafka -o wide -w
# Press Ctrl+C after pods are rescheduled
# Verify pods moved to other nodes
echo "Pods after node drain:"
kubectl get pods -n kafka -l kafka_cr=kafka -o wide | grep -v $NODE_WITH_KAFKA
Restore Node
# Uncordon the node
kubectl uncordon $NODE_WITH_KAFKA
# Verify node is ready
kubectl get nodes | grep $NODE_WITH_KAFKA
Expected Result: ✅ PASSED - Pods rescheduled to healthy nodes, PVCs reattached, cluster remains functional.
5. Persistent Volume Validation
Detailed PVC Analysis
echo "=== Persistent Volume Analysis ==="
# List all Kafka PVCs
kubectl get pvc -n kafka | grep kafka
# Check PV reclaim policy
kubectl get pv | grep kafka | head -3
# Verify PVC-PV binding
for pvc in $(kubectl get pvc -n kafka -o jsonpath='{.items[*].metadata.name}' | grep kafka); do
echo "PVC: $pvc"
kubectl get pvc -n kafka $pvc -o jsonpath='{.spec.volumeName}'
echo ""
done
Test Data Persistence Across Multiple Failures
# Create test data
kubectl run kafka-persistence-test --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-console-producer.sh \
--bootstrap-server kafka-headless:29092 \
--topic persistence-test <<EOF
persistence-message-1
persistence-message-2
persistence-message-3
EOF
# Delete multiple broker pods simultaneously
kubectl delete pods -n kafka -l kafka_cr=kafka --grace-period=0 --force
# Wait for recreation
kubectl wait --for=condition=Ready pod -l kafka_cr=kafka -n kafka --timeout=300s
# Verify data survived
kubectl run kafka-persistence-verify --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-console-consumer.sh \
--bootstrap-server kafka-headless:29092 \
--topic persistence-test \
--from-beginning \
--timeout-ms 10000
Expected Result: ✅ PASSED - All messages preserved across multiple pod deletions.
6. Network Partition Simulation
Create Network Policy to Isolate Broker
# Create a network policy that isolates one broker
kubectl apply -n kafka -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: isolate-broker
namespace: kafka
spec:
podSelector:
matchLabels:
brokerId: "101"
policyTypes:
- Ingress
- Egress
ingress: []
egress: []
EOF
Monitor Cluster Behavior
# Check cluster state during network partition
kubectl run kafka-network-test --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--topic persistence-test \
--describe
# Check under-replicated partitions
kubectl logs -n kafka kafka-101 | grep -i "under.replicated" | tail -5
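kafka-topics.sh can also report under-replicated partitions directly, which is often easier to read than grepping broker logs (the pod name below is arbitrary):
# List any under-replicated partitions cluster-wide
kubectl run kafka-urp-check --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--describe --under-replicated-partitions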
Remove Network Partition
# Remove the network policy
kubectl delete networkpolicy isolate-broker -n kafka
# Verify cluster recovery
sleep 30
kubectl run kafka-recovery-test --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--topic persistence-test \
--describe
Expected Result: ✅ PASSED - Cluster detects partition, maintains availability, recovers when partition is resolved.
7. ZooKeeper Failure Test
Scale Down ZooKeeper
# Check current ZooKeeper state
kubectl get pods -n zookeeper
# Scale down ZooKeeper to 1 replica (simulating failure)
kubectl patch zookeepercluster zk -n zookeeper --type='merge' -p='{"spec":{"replicas":1}}'
# Monitor Kafka behavior
kubectl logs -n kafka kafka-101 | grep -i zookeeper | tail -10
Test Kafka Functionality During ZK Degradation
# Try to create a topic (should fail or be delayed)
timeout 30 kubectl run kafka-zk-test --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--topic zk-failure-test \
--create --partitions 3 --replication-factor 2 || echo "Topic creation failed as expected"
# Test existing topic access (should still work)
kubectl run kafka-existing-test --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-console-producer.sh \
--bootstrap-server kafka-headless:29092 \
--topic persistence-test <<EOF
message-during-zk-failure
EOF
Restore ZooKeeper
# Scale ZooKeeper back to 3 replicas
kubectl patch zookeepercluster zk -n zookeeper --type='merge' -p='{"spec":{"replicas":3}}'
# Wait for ZooKeeper recovery
kubectl wait --for=condition=Ready pod -l app=zookeeper -n zookeeper --timeout=300s
# Verify Kafka functionality restored
kubectl run kafka-zk-recovery --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--topic zk-recovery-test \
--create --partitions 3 --replication-factor 2
Expected Result: ✅ PASSED - Kafka maintains existing functionality during ZK degradation, full functionality restored after ZK recovery.
8. Disaster Recovery Summary
Generate Recovery Report
echo "=== Disaster Recovery Test Summary ==="
echo "Test Date: $(date)"
echo ""
echo "1. Broker JVM Crash Test: PASSED"
echo " - Pod restarted automatically"
echo " - PVC reused successfully"
echo " - Data preserved"
echo ""
echo "2. Broker Pod Deletion Test: PASSED"
echo " - New pod created automatically"
echo " - PVC reattached successfully"
echo " - Cluster remained functional"
echo ""
echo "3. Node Failure Simulation: PASSED"
echo " - Pods rescheduled to healthy nodes"
echo " - PVCs reattached successfully"
echo " - No data loss"
echo ""
echo "4. Persistent Volume Validation: PASSED"
echo " - Data survived multiple pod deletions"
echo " - PVC-PV bindings maintained"
echo " - Storage reclaim policy working"
echo ""
echo "5. Network Partition Test: PASSED"
echo " - Cluster detected partition"
echo " - Maintained availability"
echo " - Recovered after partition resolution"
echo ""
echo "6. ZooKeeper Failure Test: PASSED"
echo " - Existing functionality maintained during ZK degradation"
echo " - Full functionality restored after ZK recovery"
echo ""
# Final cluster health check
echo "=== Final Cluster Health Check ==="
kubectl get kafkacluster kafka -n kafka
kubectl get pods -n kafka -l kafka_cr=kafka
kubectl get pvc -n kafka | grep kafka
9. Recovery Time Objectives (RTO) Analysis
Based on the tests, typical recovery times are:
- JVM Crash Recovery: 30-60 seconds
- Pod Deletion Recovery: 60-120 seconds
- Node Failure Recovery: 2-5 minutes
- Network Partition Recovery: 30-60 seconds
- ZooKeeper Recovery: 1-3 minutes
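If you want to measure these numbers in your own environment rather than rely on the estimates above, a rough single-run timing sketch for the pod-deletion case looks like this:
# Roughly measure pod-deletion recovery time (single run, coarse estimate only)
BROKER_POD=$(kubectl get pods -n kafka -l kafka_cr=kafka -o jsonpath='{.items[0].metadata.name}')
START=$(date +%s)
kubectl delete pod -n kafka $BROKER_POD
kubectl wait --for=condition=Ready pod -l kafka_cr=kafka -n kafka --timeout=600s
END=$(date +%s)
echo "Recovery took $((END - START)) seconds"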
10. Cleanup
# Clean up test topics
kubectl run kafka-cleanup --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--delete --topic disaster-recovery-test
kubectl run kafka-cleanup-2 --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--delete --topic persistence-test
# Remove temporary files
rm -f /tmp/kafka-pods-initial.yaml /tmp/kafka-pvc-initial.yaml
rm -f /tmp/pvc-before-deletion.yaml /tmp/pvc-after-deletion.yaml
Next Steps
With disaster recovery scenarios tested and validated, you now have confidence in your Kafka cluster’s resilience. Continue to the Troubleshooting section to learn about common issues and debugging techniques.
Key Takeaway: The Koperator provides excellent resilience with automatic recovery, persistent data storage, and minimal downtime during various failure scenarios.
20.8 - Troubleshooting Guide
Troubleshooting Guide
This section provides comprehensive troubleshooting guidance for common issues you might encounter during the Kafka deployment and operation. It includes diagnostic commands, common error patterns, and resolution strategies.
Overview
Common categories of issues:
- Cluster Setup Issues - Problems during initial deployment
- Connectivity Issues - Network and service discovery problems
- Performance Issues - Throughput and latency problems
- Storage Issues - Persistent volume and disk problems
- Monitoring Issues - Metrics collection and dashboard problems
- Operator Issues - Koperator-specific problems
1. Diagnostic Commands
Essential Debugging Commands
# Set namespace context for convenience
kubectl config set-context --current --namespace=kafka
# Quick cluster health check
echo "=== Cluster Health Overview ==="
kubectl get kafkacluster kafka -o wide
kubectl get pods -l kafka_cr=kafka
kubectl get svc | grep kafka
kubectl get pvc | grep kafka
Detailed Diagnostics
# Comprehensive cluster diagnostics
function kafka_diagnostics() {
echo "=== Kafka Cluster Diagnostics ==="
echo "Timestamp: $(date)"
echo ""
echo "1. KafkaCluster Resource:"
kubectl describe kafkacluster kafka
echo ""
echo "2. Broker Pods:"
kubectl get pods -l kafka_cr=kafka -o wide
echo ""
echo "3. Pod Events:"
kubectl get events --sort-by=.metadata.creationTimestamp | grep kafka | tail -10
echo ""
echo "4. Persistent Volumes:"
kubectl get pv | grep kafka
echo ""
echo "5. Services:"
kubectl get svc | grep kafka
echo ""
echo "6. Operator Status:"
kubectl get pods -l app.kubernetes.io/instance=kafka-operator
echo ""
}
# Run diagnostics
kafka_diagnostics
2. Cluster Setup Issues
Issue: Koperator Pod Not Starting
Symptoms:
- Operator pod in CrashLoopBackOff or ImagePullBackOff
- KafkaCluster resource not being processed
Diagnosis:
# Check operator pod status
kubectl get pods -l app.kubernetes.io/instance=kafka-operator
# Check operator logs
kubectl logs -l app.kubernetes.io/instance=kafka-operator -c manager --tail=50
# Check operator events
kubectl describe pod -l app.kubernetes.io/instance=kafka-operator
Common Solutions:
# 1. Restart operator deployment
kubectl rollout restart deployment kafka-operator-operator
# 2. Check RBAC permissions
kubectl auth can-i create kafkaclusters --as=system:serviceaccount:kafka:kafka-operator-operator
# 3. Reinstall operator
helm uninstall kafka-operator -n kafka
helm install kafka-operator oci://ghcr.io/adobe/helm-charts/kafka-operator \
--namespace=kafka \
--set webhook.enabled=false \
--version 0.28.0-adobe-20250923
Issue: Kafka Brokers Not Starting
Symptoms:
- Broker pods stuck in Pending or Init state
- Brokers failing health checks
Diagnosis:
# Check broker pod status
kubectl get pods -l kafka_cr=kafka -o wide
# Check specific broker logs
BROKER_POD=$(kubectl get pods -l kafka_cr=kafka -o jsonpath='{.items[0].metadata.name}')
kubectl logs $BROKER_POD --tail=100
# Check pod events
kubectl describe pod $BROKER_POD
Common Solutions:
# 1. Check resource constraints
kubectl describe nodes | grep -A 5 "Allocated resources"
# 2. Check storage class
kubectl get storageclass
# 3. Check ZooKeeper connectivity
kubectl run zk-test --rm -i --tty=true \
--image=busybox \
--restart=Never \
-- telnet zk-client.zookeeper 2181
# 4. Force broker recreation
kubectl delete pod $BROKER_POD
3. Connectivity Issues
Issue: Cannot Connect to Kafka Cluster
Symptoms:
- Timeout errors when connecting to Kafka
- DNS resolution failures
Diagnosis:
# Test DNS resolution
kubectl run dns-test --rm -i --tty=true \
--image=busybox \
--restart=Never \
-- nslookup kafka-headless.kafka.svc.cluster.local
# Test port connectivity
kubectl run port-test --rm -i --tty=true \
--image=busybox \
--restart=Never \
-- telnet kafka-headless 29092
# Check service endpoints
kubectl get endpoints kafka-headless
Solutions:
# 1. Verify service configuration
kubectl get svc kafka-headless -o yaml
# 2. Check network policies
kubectl get networkpolicy -A
# 3. Restart CoreDNS (if DNS issues)
kubectl rollout restart deployment coredns -n kube-system
Issue: External Access Not Working
Diagnosis:
# Check external services
kubectl get svc | grep LoadBalancer
# Check ingress configuration
kubectl get ingress -A
# Test external connectivity
kubectl run external-test --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-broker-api-versions.sh \
--bootstrap-server <EXTERNAL_IP>:<EXTERNAL_PORT>
4. Performance and Storage Issues
Issue: High Latency or Low Throughput
Diagnosis:
# Check broker resource usage
kubectl top pods -n kafka
# Check JVM metrics
kubectl exec -n kafka $BROKER_POD -- jstat -gc 1
# Check disk I/O
kubectl exec -n kafka $BROKER_POD -- iostat -x 1 5
# Check network metrics
kubectl exec -n kafka $BROKER_POD -- ss -tuln
Performance Tuning:
# 1. Increase broker resources
kubectl patch kafkacluster kafka --type='merge' -p='
{
"spec": {
"brokerConfigGroups": {
"default": {
"resourceRequirements": {
"requests": {
"cpu": "2",
"memory": "4Gi"
},
"limits": {
"cpu": "4",
"memory": "8Gi"
}
}
}
}
}
}'
# 2. Optimize JVM settings
kubectl patch kafkacluster kafka --type='merge' -p='
{
"spec": {
"brokerConfigGroups": {
"default": {
"jvmPerformanceOpts": "-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -Xms4g -Xmx4g"
}
}
}
}'
Issue: Disk Space Problems
Diagnosis:
# Check disk usage in broker pods
for pod in $(kubectl get pods -l kafka_cr=kafka -o jsonpath='{.items[*].metadata.name}'); do
echo "=== $pod ==="
kubectl exec $pod -- df -h /kafka-logs
done
# Check PVC usage
kubectl get pvc | grep kafka
Solutions:
# 1. Increase PVC size (if storage class supports expansion)
kubectl patch pvc kafka-101-storage-0 -p='{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'
# 2. Configure log retention
kubectl patch kafkacluster kafka --type='merge' -p='
{
"spec": {
"brokerConfigGroups": {
"default": {
"config": {
"log.retention.hours": "168",
"log.segment.bytes": "1073741824",
"log.retention.check.interval.ms": "300000"
}
}
}
}
}'
5. Monitoring Issues
Issue: Metrics Not Appearing in Prometheus
Diagnosis:
# Check ServiceMonitor
kubectl get servicemonitor -n kafka
# Check Prometheus targets
kubectl port-forward -n default svc/monitoring-kube-prometheus-prometheus 9090 &
curl -s "http://localhost:9090/api/v1/targets" | jq '.data.activeTargets[] | select(.labels.job | contains("kafka"))'
# Check metrics endpoints
kubectl exec -n kafka $BROKER_POD -- curl -s localhost:9020/metrics | head -10
Solutions:
# 1. Recreate ServiceMonitor
kubectl delete servicemonitor kafka-servicemonitor -n kafka
kubectl apply -n kafka -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: kafka-servicemonitor
namespace: kafka
labels:
app: kafka
release: monitoring
spec:
selector:
matchLabels:
app: kafka
endpoints:
- port: metrics
interval: 30s
path: /metrics
EOF
# 2. Check Prometheus configuration
kubectl get prometheus -o yaml | grep -A 10 serviceMonitorSelector
Issue: Grafana Dashboard Not Loading
Diagnosis:
# Check Grafana pod
kubectl get pods -l app.kubernetes.io/name=grafana
# Check dashboard ConfigMap
kubectl get configmap -l grafana_dashboard=1
# Check Grafana logs
kubectl logs -l app.kubernetes.io/name=grafana
Solutions:
# 1. Restart Grafana
kubectl rollout restart deployment monitoring-grafana
# 2. Recreate dashboard ConfigMap
kubectl delete configmap kafka-looking-glass-dashboard
# Then recreate using the configuration from the deployment section
6. Operator Issues
Issue: KafkaCluster Resource Not Reconciling
Diagnosis:
# Check operator logs for errors
kubectl logs -l app.kubernetes.io/instance=kafka-operator -c manager --tail=100
# Check KafkaCluster status
kubectl describe kafkacluster kafka
# Check operator events
kubectl get events --field-selector involvedObject.kind=KafkaCluster
Solutions:
# 1. Restart operator
kubectl rollout restart deployment kafka-operator-operator
# 2. Check CRD versions
kubectl get crd kafkaclusters.kafka.banzaicloud.io -o yaml | grep version
# 3. Force reconciliation
kubectl annotate kafkacluster kafka kubectl.kubernetes.io/restartedAt="$(date +%Y-%m-%dT%H:%M:%S%z)"
7. Common Error Patterns
Error: “No space left on device”
Solution:
# Check disk usage
kubectl exec -n kafka $BROKER_POD -- df -h
# Clean up old log segments
kubectl exec -n kafka $BROKER_POD -- find /kafka-logs -name "*.log" -mtime +7 -delete
# Increase PVC size or configure retention
Error: “Connection refused”
Solution:
# Check if broker is listening
kubectl exec -n kafka $BROKER_POD -- netstat -tuln | grep 9092
# Check broker configuration
kubectl exec -n kafka $BROKER_POD -- cat /opt/kafka/config/server.properties | grep listeners
# Restart broker if needed
kubectl delete pod $BROKER_POD
Error: “ZooKeeper connection timeout”
Solution:
# Check ZooKeeper status
kubectl get pods -n zookeeper
# Test ZooKeeper connectivity
kubectl run zk-test --rm -i --tty=true \
--image=busybox \
--restart=Never \
-- telnet zk-client.zookeeper 2181
# Check ZooKeeper logs
kubectl logs -n zookeeper zk-0
8. Monitoring Access
Start All Port-Forwards
# Function to start all monitoring port-forwards
function start_monitoring() {
echo "Starting monitoring port-forwards..."
# Prometheus
kubectl port-forward -n default svc/monitoring-kube-prometheus-prometheus 9090 &
echo "Prometheus: http://localhost:9090"
# Grafana
kubectl port-forward -n default svc/monitoring-grafana 3000:80 &
echo "Grafana: http://localhost:3000"
echo "Grafana password: $(kubectl get secret --namespace default monitoring-grafana -o jsonpath="{.data.admin-password}" | base64 --decode)"
# AlertManager
kubectl port-forward -n default svc/monitoring-kube-prometheus-alertmanager 9093 &
echo "AlertManager: http://localhost:9093"
# Cruise Control
kubectl port-forward -n kafka svc/kafka-cruisecontrol-svc 8090:8090 &
echo "Cruise Control: http://localhost:8090"
echo "All monitoring tools are now accessible!"
}
# Run the function
start_monitoring
Stop All Port-Forwards
# Function to stop all port-forwards
function stop_monitoring() {
echo "Stopping all port-forwards..."
pkill -f "kubectl port-forward"
echo "All port-forwards stopped."
}
9. Emergency Procedures
Complete Cluster Reset
# WARNING: This will delete all data!
function emergency_reset() {
echo "WARNING: This will delete all Kafka data!"
read -p "Are you sure? (yes/no): " confirm
if [ "$confirm" = "yes" ]; then
# Delete KafkaCluster
kubectl delete kafkacluster kafka -n kafka
# Delete all Kafka pods
kubectl delete pods -l kafka_cr=kafka -n kafka --force --grace-period=0
# Delete PVCs (this deletes data!)
kubectl delete pvc -l app=kafka -n kafka
# Recreate cluster
echo "Recreate your KafkaCluster resource to start fresh"
else
echo "Reset cancelled"
fi
}
Backup Critical Data
# Backup ZooKeeper data
kubectl exec -n zookeeper zk-0 -- tar czf /tmp/zk-backup.tar.gz /data
# Copy backup locally
kubectl cp zookeeper/zk-0:/tmp/zk-backup.tar.gz ./zk-backup-$(date +%Y%m%d).tar.gz
# Backup Kafka topic metadata
kubectl run kafka-backup --rm -i --tty=true \
--image=ghcr.io/adobe/koperator/kafka:2.13-3.9.1 \
--restart=Never \
-- /opt/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-headless:29092 \
--list > kafka-topics-backup-$(date +%Y%m%d).txt
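Because topics created through the operator are represented as KafkaTopic custom resources, it is also worth exporting those; this only covers topics managed via the CRD:
# Back up KafkaTopic custom resources (only topics managed through the CRD)
kubectl get kafkatopic -n kafka -o yaml > kafkatopics-backup-$(date +%Y%m%d).yaml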
10. Getting Help
# Generate support bundle
function collect_support_info() {
local output_dir="kafka-support-$(date +%Y%m%d-%H%M%S)"
mkdir -p $output_dir
# Cluster information
kubectl cluster-info > $output_dir/cluster-info.txt
kubectl get nodes -o wide > $output_dir/nodes.txt
# Kafka resources
kubectl get kafkacluster kafka -n kafka -o yaml > $output_dir/kafkacluster.yaml
kubectl get pods -n kafka -o wide > $output_dir/kafka-pods.txt
kubectl get svc -n kafka > $output_dir/kafka-services.txt
kubectl get pvc -n kafka > $output_dir/kafka-pvcs.txt
# Logs
kubectl logs -l app.kubernetes.io/instance=kafka-operator -c manager --tail=1000 > $output_dir/operator-logs.txt
# Events
kubectl get events -n kafka --sort-by=.metadata.creationTimestamp > $output_dir/kafka-events.txt
# Create archive
tar czf $output_dir.tar.gz $output_dir
rm -rf $output_dir
echo "Support bundle created: $output_dir.tar.gz"
}
# Run support collection
collect_support_info
Next Steps
You’ve now completed the comprehensive Kafka on Kubernetes tutorial! For production deployments, consider:
- Security hardening - Enable SSL/TLS, SASL authentication
- Backup strategies - Implement regular data backups
- Monitoring alerts - Configure production alerting rules (a starting-point rule is sketched after this list)
- Capacity planning - Size resources for production workloads
- Disaster recovery - Plan for multi-region deployments
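For the monitoring-alerts item, a PrometheusRule along these lines can serve as a starting point. It fires when a broker stops reporting incoming messages; the resource name, release label, and threshold are assumptions to adapt to your Prometheus installation:
# Example PrometheusRule (name, labels, and threshold are assumptions)
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kafka-broker-alerts
  namespace: kafka
  labels:
    release: monitoring
spec:
  groups:
  - name: kafka.rules
    rules:
    - alert: KafkaBrokerNoIncomingMessages
      expr: rate(kafka_server_brokertopicmetrics_messagesin_total[10m]) == 0
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "A Kafka broker has reported no incoming messages for 10 minutes"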
Remember: Always test changes in a development environment before applying them to production clusters.
21 - Support
The Koperator helps you create production-ready Apache Kafka cluster on Kubernetes, with scaling, rebalancing, and alerts based self healing.
If you encounter problems while using Koperator that the documentation does not address, open an issue in the project's GitHub repository.
22 - Developer Guide
Contributing
If you find this project useful here’s how you can help:
- Send a pull request with your new features and bug fixes
- Help new users with issues they may encounter
- Support the development of this project and star this repo!
When you open a PR to Koperator for the first time, we will ask you to sign a standard CLA.
How to run Koperator in your cluster with your changes
Koperator is built on the kubebuilder project.
To build the operator and run tests:
- Run make
If you make changes and would like to try your own version, create your own image:
make docker-build IMG={YOUR_USERNAME}/kafka-operator:v0.0.1
make docker-push IMG={YOUR_USERNAME}/kafka-operator:v0.0.1
make deploy IMG={YOUR_USERNAME}/kafka-operator:v0.0.1
Watch the operator’s logs with:
kubectl logs -f -n kafka kafka-operator-controller-manager-0 -c manager
Alternatively, run the operator on your machine:
export KUBECONFIG=<path-to-your-kubeconfig>
make install
make run
Create the CR and let the operator set up Kafka in your cluster (you can change the Kafka spec in the yaml file to suit your needs). Remember that Kafka requires a running Apache ZooKeeper server:
kubectl create -n kafka -f config/samples/simplekafkacluster.yaml
Istio Integration
Koperator now supports Istio integration using standard Istio resources instead of the deprecated banzaicloud istio-operator. This provides better compatibility and works with any Istio installation.
Prerequisites for Istio Integration
- Install Istio in your cluster (any method - operator, Helm, or manual)
- Ensure Istio CRDs are available
- Configure the istioIngressConfig section in your KafkaCluster spec
Example Istio Configuration
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
name: kafka
spec:
ingressController: "istioingress"
istioIngressConfig:
gatewayConfig:
mode: ISTIO_MUTUAL
# ... rest of your configuration
Note: The istioControlPlane configuration is no longer required. Koperator creates standard Kubernetes Deployment and Service resources along with Istio Gateway and VirtualService resources.
For comprehensive Istio integration documentation including advanced configuration, troubleshooting, and migration guides, see the Istio Integration Guide.
Limitations on minikube
Minikube does not provide a load balancer implementation, so the Envoy service will not receive an external IP and the operator will get stuck at this point.
A possible solution to overcome this problem is to use https://github.com/elsonrodriguez/minikube-lb-patch. The operator will be able to proceed if you run the following command:
kubectl create deployment minikube-lb-patch --image=elsonrodriguez/minikube-lb-patch:0.1 --namespace=kube-system
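On recent minikube versions an alternative is minikube's built-in tunnel, which assigns external IPs to LoadBalancer services without deploying anything extra; this is a suggestion rather than an operator requirement:
# Run in a separate terminal and keep it running while you need LoadBalancer IPs
minikube tunnel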
23 - License
Koperator is licensed under the Apache License, Version 2.0.
Copyright
The Koperator project has evolved through multiple ownership transitions:
- Original Development: Copyright © 2018 Banzai Cloud
- Cisco Era: Copyright © 2021 Cisco Systems, Inc. and/or its affiliates
- Current Maintainer: Copyright 2025 Adobe. All rights reserved.
Apache License 2.0
Licensed under the Apache License, Version 2.0 (the “License”);
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an “AS IS” BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Third-Party Components
This project includes third-party components that are subject to their own license terms. See the NOTICE file in the source repository for details.
Contributing
By contributing to this project, you agree that your contributions will be licensed under the same Apache License 2.0.