Status
Current state: Accepted
Discussion thread: here
JIRA: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
A pre-KRaft broker currently registers 0 for every controller metric when it is not the active controller. Pre-KRaft brokers do this because any of them can be elected controller: after a controller failover, the controller metrics of the newly elected node will no longer be 0, since it has become the active controller. KRaft brokers do not need to register 0 for every controller metric because processes with only the "broker" role are not eligible to become controllers. Registering metrics carries some performance overhead, so going forward KRaft processes with only the "broker" role should not expose controller metrics.
Public Interfaces
The set of nodes that expose controller metrics will differ between KRaft clusters and clusters that use Kafka with ZooKeeper. A KRaft node will expose controller metrics only if the process has the "controller" role. A KRaft node that has the "controller" role exposes these metrics even when it is not the active controller. This can help the user see whether one of the standby controllers is misbehaving: because controller metrics are calculated as metadata records are replayed, a misbehaving node's metrics may lag behind those of the other controllers.
All metrics with the following MBean prefix will be exposed on all controllers: kafka.controller:type=KafkaController
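As a rough illustration of how a monitoring tool might check whether a node exposes any metrics under this prefix, the sketch below queries a JMX MBean server with an ObjectName pattern. The class and method names (ControllerMBeanCheck, controllerMBeans) are hypothetical, and the example inspects the local JVM's platform MBean server only; a real tool would more likely connect to a remote Kafka process.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper, not part of Kafka: lists MBeans under the controller prefix.
public class ControllerMBeanCheck {

    // Pattern matching every MBean whose name starts with the prefix from this KIP.
    public static final String CONTROLLER_PATTERN = "kafka.controller:type=KafkaController,*";

    // Returns the names of all registered MBeans matching the controller prefix.
    // An empty set means the node exposes no controller metrics.
    public static Set<ObjectName> controllerMBeans(MBeanServer server) {
        try {
            ObjectName pattern = new ObjectName(CONTROLLER_PATTERN);
            return server.queryNames(pattern, null);
        } catch (MalformedObjectNameException e) {
            // The pattern is a constant, so this would indicate a programming error.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: we inspect the local JVM; outside a Kafka process nothing is registered.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Set<ObjectName> names = controllerMBeans(server);
        if (names.isEmpty()) {
            System.out.println("No controller metrics registered on this node");
        } else {
            names.forEach(name -> System.out.println("Found " + name));
        }
    }
}
```

Under the behavior described here, an empty result would be expected on a KRaft node that has only the "broker" role, while any node with the "controller" role should return a non-empty set.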
Proposed Changes
Pre-KRaft brokers expose 0 for controller metrics unless they are the active controller. KRaft nodes with only the "broker" role should not expose controller metrics at all, since they are not eligible to become controllers.
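The rule above can be sketched as a simple check on a node's roles. This is a minimal illustration, not Kafka's actual implementation: the class and method names are hypothetical, and it assumes the roles arrive as a comma-separated string in the style of the KRaft process.roles setting.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch of the role-based rule; not taken from the Kafka code base.
public class ControllerMetricsGate {

    // Parses a comma-separated roles string such as "broker,controller".
    public static Set<String> parseRoles(String processRoles) {
        return Arrays.stream(processRoles.split(","))
                .map(String::trim)
                .filter(role -> !role.isEmpty())
                .collect(Collectors.toSet());
    }

    // A KRaft node exposes controller metrics only if it has the "controller" role,
    // whether or not it is currently the active controller.
    public static boolean shouldExposeControllerMetrics(Set<String> roles) {
        return roles.contains("controller");
    }

    public static void main(String[] args) {
        for (String roles : new String[] {"broker", "controller", "broker,controller"}) {
            System.out.println(roles + " -> "
                    + shouldExposeControllerMetrics(parseRoles(roles)));
        }
    }
}
```

Note that the check does not consult whether the node is the active controller: standby controllers expose the metrics too, which is what allows a lagging standby to be spotted.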
Compatibility, Deprecation, and Migration Plan
- Tools that expect brokers to expose controller metrics may need to be updated so that they no longer expect these metrics from KRaft nodes that have only the "broker" role.
Rejected Alternatives
- Expose 0 for controller metrics on KRaft brokers.
  - There is a non-negligible performance impact associated with doing this, so it is better to change the behavior of controller metrics going forward instead.
- Expose 0 for controller metrics on KRaft standby controllers.
  - Exposing the controller metrics on both the active controller and the standby controllers can help the user see whether one of the standby controllers is misbehaving.