Versions Compared


Comment: Fix inconsistencies in naming



Partition-level errors:


Previously, the broker sent only its ID to the controller, using the ZK watch mechanism. Because the broker epoch was not included, the controller had to probe the brokers with LeaderAndIsr requests to learn the true state of their replicas. In theory, a broker could have been repaired and restarted before the controller had a chance to react to the event. In this proposal, the request includes both the broker ID and the broker epoch, so the controller can safely update its internal replica state from the request data. If the broker has in fact been restarted since sending the AlterReplicaState request, the controller is gated by the broker epoch and takes no action.
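The epoch gate described above can be sketched roughly as follows. This is a minimal illustration and not the actual controller code: the class `ReplicaStateGate`, its methods, and the in-memory epoch map are hypothetical names introduced here, and the sketch assumes the controller records one current epoch per registered broker.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of broker-epoch gating; not the actual Kafka controller code.
class ReplicaStateGate {
    // Latest broker epoch the controller knows for each registered broker ID.
    private final Map<Integer, Long> currentEpochs = new HashMap<>();

    // Called when a broker registers, or re-registers after a restart, with a new epoch.
    void registerBroker(int brokerId, long brokerEpoch) {
        currentEpochs.put(brokerId, brokerEpoch);
    }

    // Returns true only if the request carries the epoch the controller currently
    // holds for that broker. A request sent before a restart carries an older
    // epoch, so it is ignored rather than applied to the replica state.
    boolean shouldApply(int brokerId, long requestEpoch) {
        Long known = currentEpochs.get(brokerId);
        return known != null && known == requestEpoch;
    }
}
```

With this shape, a request sent by broker 1 at epoch 10 is applied while the controller still holds epoch 10, and dropped once broker 1 has re-registered with epoch 11.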

RPC semantics

  • The NewState field (EventType in the other compared version) in the request will only support the value 0x1, which represents "offline"
  • The Reason field (EventReason in the other compared version) is a textual description of why the event is being sent
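As a rough illustration of these semantics, the sketch below checks the state value and formats the reason text. The class `ReplicaStateFields` and its methods are hypothetical, and rejecting any value other than 0x1 with an exception is an assumption made here; the text above only says that 0x1 is the one supported value.

```java
// Hypothetical sketch of handling the state and reason fields; not Kafka's request-handling code.
class ReplicaStateFields {
    // The only state value the proposal defines: 0x1 means "offline".
    static final byte OFFLINE = 0x1;

    // Returns true if the state value is the single supported one.
    static boolean isSupportedState(byte state) {
        return state == OFFLINE;
    }

    // Builds a human-readable summary of the event.
    // Assumption: unsupported state values are rejected outright.
    static String describe(byte state, String reason) {
        if (!isSupportedState(state)) {
            throw new IllegalArgumentException("Unsupported state value: " + state);
        }
        return "replica state=offline, reason=" + reason;
    }
}
```

The reason string is treated as opaque text here, matching its description as a textual explanation rather than a structured code.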