Current state: Accepted
Discussion thread: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
KAFKA-5501 introduced a ZooKeeper client wrapper called kafka.zookeeper.ZooKeeperClient that encourages pipelined requests to ZooKeeper. This client pipelines requests by performing a "scatter-gather" of the asynchronous calls provided by the underlying org.apache.zookeeper.ZooKeeper client. That is, the client sends a pipelined sequence of requests and waits for all of their responses.
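The scatter-gather pattern described above can be illustrated with a minimal Java sketch. It is not the ZooKeeperClient implementation: ScatterGatherSketch, sendAll, and the asyncCall parameter are hypothetical names, and CompletableFuture stands in for the callback-based asynchronous calls of the real ZooKeeper client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Illustrative only: asyncCall is a hypothetical stand-in for an asynchronous ZooKeeper operation.
final class ScatterGatherSketch {

    static <Req, Resp> List<Resp> sendAll(List<Req> requests,
                                          Function<Req, CompletableFuture<Resp>> asyncCall) {
        // Scatter: issue every request without waiting for earlier responses.
        List<CompletableFuture<Resp>> pending = new ArrayList<>();
        for (Req request : requests) {
            pending.add(asyncCall.apply(request));
        }
        // Gather: block until each response has arrived, preserving request order.
        List<Resp> responses = new ArrayList<>();
        for (CompletableFuture<Resp> future : pending) {
            responses.add(future.join());
        }
        return responses;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/brokers/ids/0", "/brokers/ids/1", "/controller");
        // Each fake request completes on another thread, as an async client call would.
        List<String> responses = sendAll(paths,
                path -> CompletableFuture.supplyAsync(() -> "data@" + path));
        System.out.println(responses);
    }
}
```

Note that nothing in this sketch limits how many requests are outstanding at once, which is the risk the following paragraphs discuss.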
Left as is, the client risks imposing heavy load on ZooKeeper. ZooKeeper itself has only a coarse-grained throttling mechanism, its zookeeper.globalOutstandingLimit config, which defaults to 1000. This config is insufficient for two reasons:
- the limit is meant to protect ZooKeeper from memory pressure associated with a backlog of requests.
- the limit is applied across all connections. Even with this config, one misbehaving client will affect the other clients.
We need a client-side throttling mechanism to give administrators control over Kafka's impact on ZooKeeper, and for this we propose a new broker config.
This KIP only adds a new broker configuration described below.
This KIP proposes a new broker config called zookeeper.max.in.flight.requests, which represents the maximum number of unacknowledged requests the client will send to ZooKeeper before blocking.
This config must be set to at least 1.
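One way to picture the blocking behavior described above is a counting semaphore that is acquired before each request is sent and released when its response arrives. The sketch below assumes that approach and is not taken from the Kafka source: BoundedPipelineSketch, maxInFlightRequests, and asyncCall are hypothetical names, and CompletableFuture again stands in for the ZooKeeper client's asynchronous calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Illustrative only: a semaphore caps the number of unacknowledged requests.
final class BoundedPipelineSketch<Req, Resp> {
    private final Semaphore inFlight;
    private final Function<Req, CompletableFuture<Resp>> asyncCall;

    BoundedPipelineSketch(int maxInFlightRequests,
                          Function<Req, CompletableFuture<Resp>> asyncCall) {
        // Mirrors the rule that the limit must be at least 1.
        if (maxInFlightRequests < 1) {
            throw new IllegalArgumentException("max in-flight requests must be at least 1");
        }
        this.inFlight = new Semaphore(maxInFlightRequests);
        this.asyncCall = asyncCall;
    }

    List<Resp> sendAll(List<Req> requests) {
        List<CompletableFuture<Resp>> pending = new ArrayList<>();
        for (Req request : requests) {
            // Blocks the sender while the number of outstanding requests is at the limit.
            inFlight.acquireUninterruptibly();
            CompletableFuture<Resp> future;
            try {
                future = asyncCall.apply(request);
            } catch (RuntimeException e) {
                inFlight.release(); // the request was never sent, so return its permit
                throw e;
            }
            // Release the permit once the response (or a failure) has arrived.
            pending.add(future.whenComplete((response, error) -> inFlight.release()));
        }
        List<Resp> responses = new ArrayList<>();
        for (CompletableFuture<Resp> future : pending) {
            responses.add(future.join());
        }
        return responses;
    }

    public static void main(String[] args) {
        // With a limit of 1 the requests are effectively sent one at a time.
        BoundedPipelineSketch<Integer, Integer> client = new BoundedPipelineSketch<>(
                1, n -> CompletableFuture.supplyAsync(() -> n * n));
        List<Integer> input = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            input.add(i);
        }
        System.out.println(client.sendAll(input));
    }
}
```

In this sketch a limit of 1 makes each request wait for the previous response, while a larger limit allows that many requests to overlap.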
The default value is 10. We ran experiments measuring the impact of zookeeper.max.in.flight.requests on completion times for various ZooKeeper-intensive controller protocols. The default was chosen as the smallest value beyond which the experiments showed diminishing returns.
Compatibility, Deprecation, and Migration Plan
From a protocol standpoint, the change is fully backwards compatible. With a default value of 10, administrators may see higher load on ZooKeeper than was seen prior to the controller using KAFKA-5642. Those who wish to retain the existing ZooKeeper load should set zookeeper.max.in.flight.requests to 1.
Several options for the default value were considered:
- Set the default value to be unbounded (Integer.MAX_VALUE), effectively applying no limit and pipelining as aggressively as possible. This was rejected because a config should not default to removing control over the system.
- Set the default value to 1, effectively disabling pipelining and maintaining Kafka's synchronous requests to ZooKeeper. This was rejected because we want users to see the benefits of pipelining.