Current state: Under Discussion
Discussion thread: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Kafka brokers today rely on Apache Zookeeper. Many in the community have expressed a desire either to replace zkclient with Apache Curator or to allow other systems, such as etcd, Consul, or Apache Cassandra, to take over the role Zookeeper currently plays. This becomes possible if brokers can plug in another system both for storing metadata and for leader election.
This KIP proposes an approach for isolating coordination-related functionality into separate modules. Each module should expose a public interface that can have pluggable implementations.
Zookeeper has advanced low-level primitives for coordinating distributed systems: ephemeral nodes, key-value storage, and watchers. Such concepts may not be available in other consensus frameworks. At the same time, these low-level primitives (especially ephemeral nodes) are error-prone and are often a cause of subtle bugs in Kafka coordination code.
That is why, instead of focusing on the question “how does Kafka do coordination with Zookeeper”, it is proposed to concentrate on the question “what general problems of distributed systems does Kafka solve by means of Zookeeper”. With the interface boundaries defined this way, implementation details can be hidden inside concrete implementations, each built on the facilities native to a particular tool (e.g. ephemeral nodes vs TTLs).
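To illustrate the "ephemeral nodes vs TTLs" point, here is a minimal in-memory sketch in which one membership interface is backed by two different liveness mechanisms. All names (`Membership`, `SessionMembership`, `TtlMembership`, `sessionLost`, `heartbeat`) are hypothetical and are not part of this KIP or of any real client library. The injected clock is an assumption made only to keep the example deterministic.

```scala
import scala.collection.mutable

// Hypothetical interface: callers only ask who is alive,
// not how liveness is tracked by the backend.
trait Membership {
  def join(id: String): Unit
  def leave(id: String): Unit
  def members: Set[String]
}

// Ephemeral-node style: a member lives exactly as long as its session.
// sessionLost simulates the backend dropping the entry when a session dies.
class SessionMembership extends Membership {
  private val live = mutable.Set.empty[String]
  def join(id: String): Unit = live += id
  def leave(id: String): Unit = live -= id
  def sessionLost(id: String): Unit = live -= id
  def members: Set[String] = live.toSet
}

// TTL style: a member must refresh its entry before the TTL runs out.
// `now` is an injected clock (milliseconds) so behaviour is reproducible.
class TtlMembership(ttlMs: Long, now: () => Long) extends Membership {
  private val expiry = mutable.Map.empty[String, Long]
  def join(id: String): Unit = expiry(id) = now() + ttlMs
  def heartbeat(id: String): Unit =
    if (expiry.contains(id)) expiry(id) = now() + ttlMs
  def leave(id: String): Unit = expiry -= id
  // Only entries whose deadline is still in the future count as members.
  def members: Set[String] = {
    val t = now()
    expiry.iterator.collect { case (id, deadline) if deadline > t => id }.toSet
  }
}
```

Code written against `Membership` does not need to know which mechanism is in use. Only the backend-specific parts (`sessionLost`, `heartbeat`) differ.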
It is proposed to separate the following high-level concerns:
- Group membership protocol (Kafka brokers form a cluster; consumer connectors form a consumer group)
- Leader election (electing controller among brokers)
- Distributed key-value storage (topic config storage etc.)
- Data-change listeners (triggering events such as partition reassignment, ISRs catching up etc.)
Each module is presented below by its interface.
(NOTE: The initial version of the interfaces is in Scala to keep it cleaner and shorter. The final version, i.e. the actual Kafka interfaces, is planned to be written in Java.)
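As a rough illustration of how the four concerns could map onto Scala traits, consider the sketch below. It is not the interface set proposed by this KIP. Every trait, method, and class name here is hypothetical, and the in-memory classes exist only to show that an implementation can be swapped in behind the traits.

```scala
import scala.collection.mutable

// Data-change listeners: callback fired when a watched key changes.
// None means the key was removed. (Hypothetical name and signature.)
trait ValueChangeListener {
  def valueChanged(newValue: Option[String]): Unit
}

// Group membership protocol (brokers in a cluster, consumers in a group).
trait GroupMembership {
  def join(group: String, memberId: String): Unit
  def leave(group: String, memberId: String): Unit
  def members(group: String): Set[String]
}

// Leader election (e.g. electing the controller among brokers).
trait LeaderElection {
  def elect(candidateId: String): Boolean // true if candidate is now the leader
  def resign(candidateId: String): Unit
  def leader: Option[String]
}

// Distributed key-value storage, with listener registration per key.
trait KeyValueStore {
  def get(key: String): Option[String]
  def put(key: String, value: String): Unit
  def remove(key: String): Unit
  def addListener(key: String, listener: ValueChangeListener): Unit
}

// Single-process stand-in for a real backend; not distributed, not thread-safe.
class InMemoryStore extends KeyValueStore {
  private val data = mutable.Map.empty[String, String]
  private val listeners = mutable.Map.empty[String, List[ValueChangeListener]]
  def get(key: String): Option[String] = data.get(key)
  def put(key: String, value: String): Unit = { data(key) = value; fire(key) }
  def remove(key: String): Unit = { data -= key; fire(key) }
  def addListener(key: String, listener: ValueChangeListener): Unit =
    listeners(key) = listener :: listeners.getOrElse(key, Nil)
  // Notify every listener registered on this key with the current value.
  private def fire(key: String): Unit =
    listeners.getOrElse(key, Nil).foreach(_.valueChanged(data.get(key)))
}

// First candidate wins and stays leader until it resigns.
class InMemoryElection extends LeaderElection {
  private var current: Option[String] = None
  def elect(candidateId: String): Boolean = {
    if (current.isEmpty) current = Some(candidateId)
    current.contains(candidateId)
  }
  def resign(candidateId: String): Unit =
    if (current.contains(candidateId)) current = None
  def leader: Option[String] = current
}
```

Under this sketch, broker code would depend only on the traits. A Zookeeper-, etcd-, or Consul-backed class would replace the in-memory ones without changing callers.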
Compatibility, Deprecation, and Migration Plan
The shared interface for pluggable consensus and metadata storage systems should be compatible with a Zookeeper-based implementation. This implementation will likely be the default one.
As part of this KIP, some system and replication tools will need to be reworked. Tools will no longer be able to assume that Zookeeper is the metadata storage system, nor to use it to trigger particular administrative commands. Most of the affected tools are related to topic management (create topics, reassign partitions etc.) and consumer group management (offset checker etc.).
The approach to topic tools is covered in KIP-4, which moves all administrative logic to the brokers. KIP-4 is currently under development, and its wire protocol changes have been agreed.
The consumer group tools should be covered separately. The new Java consumer in the 0.9 release, with its server-side coordinator, may let us deprecate the old consumer and thus all tools related to it. Consumer group tools should work as usual if brokers are run with the Zookeeper-based implementation of the shared interface.