Status
Current state: [Under discussion]
Discussion thread: here
JIRA: KIP-64 -Allow underlying distributed filesystem to take over replication depending on configuration
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Distributed data stores can be vastly improved by integrating with Kafka. Some of these improvements are:
- They can participate easily in the whole Kafka ecosystem
- Data ingesting speeds can be improved
Distributed data stores come with their own replication. Kafka replication is a duplication of functionality for them.Kafka should defer replication to underlying file system if the configuration mandates it.
In the newly added configuration a flush to the filesystem should consider a signal that the message is replicated.
Public Interfaces
A new configuration entry which tells the broker that it is running on a distributed storage.
Proposed Changes
Code changes:
- A new configuration entry which tells the broker that it is running on a distributed storage.
- When this configuration is set:
- Replicas do not run the replication code
- Whenever flush returns on a log, all the other brokers in the system are marked as part of ISR
Changes in behavior:
- No replication traffic between the brokers at Kafka level
- Async replication behavior(request.required.acks=0 or 1) does not change
- Replication factor is ignored / taken care by the underlying storage
- After flush all the brokers in the system are marked as in-sync replicas
- request.timeout.ms needs to be greater than log.flush.interval.ms to ensure that unncessary retires do not happen.
Deployment changes:
- Distributed storage is mounted (via NFS/SMB) on log dirs
Compatibility, Deprecation, and Migration Plan
The behavior kicks in only after this particular flag is set. Otherwise the existing code works as is.
Rejected Alternatives
- Keep the existing replication functionality. This is rejected because:
- Duplication of functionality
- Perf deterioration
- Unnecessary copies of data
- Have this as a per topic configuration. This is rejected because:
- Hybrid deployment is not anticipated. Folks will use deployments with either replicated storage OR without it.