Apache Kafka is the backbone of modern real-time data pipelines. As a distributed system, Kafka must maintain a consistent, shared state across all its nodes. Every broker in the cluster needs to agree on critical information, such as which broker is the leader for a partition and what the current configuration is. This requirement for agreement, even in the face of network failures or broker crashes, is known as consensus.
For years, Kafka outsourced this critical task to a separate system: Apache ZooKeeper. However, this approach introduced its own set of complexities and limitations. Now, with the introduction of KRaft (Kafka Raft), Kafka is moving towards a self-managed, ZooKeeper-less architecture.
Here, we’ll explore this evolution, comparing the legacy ZooKeeper-based architecture with the new, powerful KRaft mode.
Kafka with ZooKeeper (Legacy)
A Kafka cluster needs a reliable way to manage its shared state and to keep every node in agreement on it. That calls for a consensus protocol: a mechanism that guarantees all brokers hold a consistent view of cluster state, even in the face of network partitions or broker failures.
ZooKeeper is a distributed coordination service: a replicated, hierarchical key-value store built to solve exactly these kinds of consensus problems. In this model, a Kafka deployment was effectively two separate distributed systems that both had to be operated.
Here’s how it worked:
- All Kafka brokers register themselves and maintain a heartbeat with the ZooKeeper quorum.
- ZooKeeper holds an election to choose the controller broker, which is responsible for managing the state of partitions and replicas.
- All state changes, like creating a topic or a broker joining/leaving, are written to ZooKeeper first.
- The controller broker watches ZooKeeper for changes and then pushes updates to the other brokers.
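In ZooKeeper mode, each broker was pointed at the external ensemble through its `server.properties`. A minimal sketch of such a configuration, with placeholder hostnames and illustrative values:

```properties
# Legacy (ZooKeeper mode) broker configuration -- illustrative values only.
broker.id=1
listeners=PLAINTEXT://broker1:9092
log.dirs=/var/lib/kafka/data

# The broker registers itself with, and watches, this external ZooKeeper ensemble,
# which had to be deployed and monitored as its own cluster.
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka
zookeeper.connection.timeout.ms=18000
```

Note that the ZooKeeper ensemble itself required a separate configuration file (`zoo.cfg`) and its own operational runbook, entirely apart from Kafka's.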
The Drawbacks of ZooKeeper
While functional, this architecture had significant downsides:
- Operational Complexity: It required managing and monitoring two separate, complex distributed systems. You had to have expertise in both Kafka and ZooKeeper, with separate configurations, hardware, and failure modes.
- ZooKeeper as a Bottleneck: All metadata changes had to go through ZooKeeper, which could become a performance bottleneck, especially in very large clusters with thousands of partitions.
- Slow Failover: Controller failover could be slow, sometimes taking many seconds or even minutes, leaving the cluster without an active controller for that period. This was a critical issue for systems requiring high availability.
Kafka with KRaft
KRaft is the built-in consensus protocol that replaces ZooKeeper: an implementation of the Raft protocol that runs inside the Kafka cluster itself. Instead of offloading consensus to an external system, Kafka now manages its own metadata state.
This new model introduces the concept of controller nodes within the Kafka cluster:
- A subset of nodes in the Kafka cluster are designated as controller nodes (typically three, so the quorum can tolerate one failure).
- Controller nodes form their own Raft quorum to manage consensus and elect a leader among themselves.
- The cluster’s metadata is no longer stored in ZooKeeper. It is stored in an internal Kafka topic called `__cluster_metadata`. This topic acts as an event log for all state changes: the active controller writes to it, and all other brokers consume from it.
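A dedicated controller node in KRaft mode is configured directly in Kafka's own `server.properties`. A minimal sketch, with placeholder hostnames and illustrative values:

```properties
# KRaft mode -- a dedicated controller node (illustrative values only).
process.roles=controller
node.id=1

# The Raft quorum: id@host:port for every voter. Typically three nodes.
controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093

listeners=CONTROLLER://controller1:9093
controller.listener.names=CONTROLLER
log.dirs=/var/lib/kafka/metadata
```

For small clusters, `process.roles=broker,controller` lets a node serve both roles; larger deployments usually keep the controller quorum on dedicated nodes.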
By using an internal topic, Kafka leverages its own battle-tested replication protocol to manage its own state, a truly self-contained and elegant solution.
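The event-log idea can be sketched in a few lines of plain Python. This models the replay pattern only; the record types here are hypothetical and are not Kafka's actual metadata record schema:

```python
from dataclasses import dataclass

# Hypothetical metadata records, modeled on the event-log pattern KRaft
# uses for __cluster_metadata (not Kafka's real record format).
@dataclass
class TopicCreated:
    name: str
    partitions: int

@dataclass
class PartitionLeaderChanged:
    topic: str
    partition: int
    leader: int

def replay(log):
    """Rebuild metadata state by replaying the event log from the start,
    the way a broker catches up by consuming the metadata topic."""
    state = {"topics": {}, "leaders": {}}
    for record in log:
        if isinstance(record, TopicCreated):
            state["topics"][record.name] = record.partitions
        elif isinstance(record, PartitionLeaderChanged):
            state["leaders"][(record.topic, record.partition)] = record.leader
    return state

# The active controller appends records; every broker replays the same log,
# so every broker converges on the same view of cluster state.
log = [
    TopicCreated("orders", 3),
    PartitionLeaderChanged("orders", 0, leader=2),
    PartitionLeaderChanged("orders", 0, leader=5),  # failover: last write wins
]
state = replay(log)
print(state["leaders"][("orders", 0)])  # -> 5
```

Because state is derived purely from the log, a broker that restarts (or a standby controller that takes over) only has to replay or resume the log to reach the same state as everyone else.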
The Showdown: Performance and Simplicity
The move to KRaft isn’t just a cosmetic change. It brings substantial benefits in performance, scalability, and operational simplicity.
1. Simplified Architecture
The most obvious benefit is the removal of a major dependency. You no longer need to deploy, configure, secure, and monitor a separate ZooKeeper cluster. This reduces hardware footprint, configuration complexity, and the potential points of failure.
2. Faster Recovery and Failover
One of the most significant improvements is the speed of failover and recovery. Since consensus is managed internally by a dedicated set of controller nodes using the highly efficient Raft protocol, leadership changes happen almost instantaneously.
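A back-of-the-envelope sketch shows why a small Raft quorum fails over so readily: a candidate needs only a majority of controller votes to become leader, so a three-node quorum can elect a new leader with one node down. This is a simplified model of quorum arithmetic, not Kafka's implementation:

```python
def majority(quorum_size: int) -> int:
    """Votes a Raft candidate needs to win a leader election."""
    return quorum_size // 2 + 1

def can_elect_leader(quorum_size: int, live_nodes: int) -> bool:
    """A new leader can be elected iff the live nodes form a majority."""
    return live_nodes >= majority(quorum_size)

# With the typical 3-node controller quorum:
assert can_elect_leader(3, 3)      # healthy cluster
assert can_elect_leader(3, 2)      # one controller down: still electable
assert not can_elect_leader(3, 1)  # two down: no majority, no leader
```

Since the standby controllers are already replicating the metadata log, the newly elected leader has the current state in hand and can take over without the lengthy state transfer that made ZooKeeper-based controller failover slow.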
Benchmarks of controlled shutdown and failover recovery illustrate the performance gains:
- Controlled Shutdown Time: With KRaft, shutting down a controller is an order of magnitude faster.
- Recovery Time after Uncontrolled Shutdown: This is where KRaft truly shines. Recovering from a sudden failure is dramatically faster than with ZooKeeper, reducing potential cluster unavailability from minutes to mere seconds.
3. Enhanced Scalability
By removing ZooKeeper as a bottleneck, Kafka in KRaft mode can support a much larger number of partitions in a single cluster, scaling to millions of partitions, compared to the tens or hundreds of thousands in a ZooKeeper-based setup.
Conclusion
The introduction of KRaft marks a new era for Apache Kafka. By bringing consensus management in-house, Kafka has become a more streamlined, resilient, and performant system. The deprecation of ZooKeeper simplifies the operational burden on engineers and paves the way for clusters that are not only larger but also more reliable.
If you are starting a new Kafka deployment, using KRaft is the recommended path forward. It’s a fundamental improvement that solidifies Kafka’s position as the leading platform for real-time data streaming.