Current state: Accepted
Discussion thread: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Unlike the LeaderAndIsrRequest, the StopReplicaRequest does not include the leader epoch, which makes it vulnerable to reordering. This KIP proposes adding the leader epoch for each partition to the StopReplicaRequest so that the broker can verify the epoch before acting on the request.
We will bump the version of the StopReplicaRequest/StopReplicaResponse and add the leader epoch for each partition in the request.
The controller will include the leader epoch of each partition when sending out a StopReplicaRequest. The broker will verify the epoch of each partition and return a `FENCED_LEADER_EPOCH` error when the leader epoch received is older than the one it knows. When a topic is deleted, the leader epoch is not bumped. In this case, we will send a sentinel (-2) which overrides any existing epoch. Older versions of the request will use a sentinel (-1) to indicate that the leader epoch is not present, which is the case when the controller is still on the old version during the upgrade.
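The broker-side check described above can be sketched as follows. This is only an illustration, not the actual broker code: the class name `StopReplicaEpochCheck`, the constant names, and the `Result` enum are hypothetical. It assumes that a request carrying either sentinel skips the comparison, and that an epoch equal to the known one is accepted, since the text only fences epochs that are older.

```java
// Hypothetical sketch of the per-partition leader epoch check; names are illustrative.
final class StopReplicaEpochCheck {
    // Sentinel used for older request versions that carry no leader epoch.
    static final int NO_EPOCH = -1;
    // Sentinel sent when a topic is deleted; it overrides any existing epoch.
    static final int EPOCH_DURING_DELETE = -2;

    enum Result { NONE, FENCED_LEADER_EPOCH }

    /**
     * Decides whether the broker should act on a StopReplica entry for one partition.
     *
     * @param requestEpoch leader epoch received in the request (or a sentinel)
     * @param knownEpoch   leader epoch the broker currently knows for the partition
     */
    static Result validate(int requestEpoch, int knownEpoch) {
        // Both sentinels bypass the comparison (assumption: no check is possible or wanted).
        if (requestEpoch == EPOCH_DURING_DELETE || requestEpoch == NO_EPOCH) {
            return Result.NONE;
        }
        // Only an epoch strictly older than the known one is fenced.
        return requestEpoch < knownEpoch ? Result.FENCED_LEADER_EPOCH : Result.NONE;
    }
}
```

For example, `validate(4, 5)` fences the request because epoch 4 is stale, while `validate(-2, 5)` proceeds because the deletion sentinel overrides the known epoch.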
Starting from V3, the controller will send only one StopReplicaRequest, combining the partitions to be deleted and the partitions to be stopped only.
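One way to picture the combined request is a single list of per-partition entries, each carrying its own leader epoch and a flag saying whether the replica should also be deleted. The sketch below is an assumption about the shape, not the actual request schema: `StopReplicaBatch`, `PartitionState`, `deletePartition`, and the `"topic-partition"` string keys are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: merge "delete" and "stop only" partitions into one request payload.
final class StopReplicaBatch {
    // One entry per partition in the combined request (illustrative shape only).
    static final class PartitionState {
        final String topicPartition;   // e.g. "orders-0"
        final int leaderEpoch;         // real epoch, or a sentinel such as -2 on topic deletion
        final boolean deletePartition; // true: stop and delete; false: stop only

        PartitionState(String topicPartition, int leaderEpoch, boolean deletePartition) {
            this.topicPartition = topicPartition;
            this.leaderEpoch = leaderEpoch;
            this.deletePartition = deletePartition;
        }
    }

    /** Builds the entries of a single request from both groups of partitions. */
    static List<PartitionState> combine(Map<String, Integer> toDelete,
                                        Map<String, Integer> toStopOnly) {
        List<PartitionState> states = new ArrayList<>();
        toDelete.forEach((tp, epoch) -> states.add(new PartitionState(tp, epoch, true)));
        toStopOnly.forEach((tp, epoch) -> states.add(new PartitionState(tp, epoch, false)));
        return states;
    }
}
```

With a per-partition flag, the controller no longer needs one request for deletions and another for plain stops; both groups travel together and the broker handles each entry according to its flag.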
Compatibility, Deprecation, and Migration Plan
The change is backward compatible with older brokers.