Current state: "Accepted"
Discussion thread: https://firstname.lastname@example.org/msg77721.html
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Consumer group rebalancing can degrade client performance, and a rebalance can sometimes take longer than expected. It would be useful to have metrics that show how many rebalances are in progress at any given time.
Public Interface Additions and Changes
The group state name "AwaitingSync" is somewhat confusing. It is part of rebalancing, but its name does not contain "Rebalance". We propose renaming this state to "CompletingRebalance", to reflect the fact that it is the final part of the rebalancing operation.
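As a rough illustration of the rename, the sketch below models the group states as a Python enum. The `GroupState` class and the `is_rebalancing` helper are hypothetical names chosen for this example, not the broker's actual code. The point is only that, after the rename, both phases of a rebalance can be recognized from the state name itself.

```python
from enum import Enum


class GroupState(Enum):
    """Consumer group states, using the proposed name for AwaitingSync."""
    EMPTY = "Empty"
    PREPARING_REBALANCE = "PreparingRebalance"
    COMPLETING_REBALANCE = "CompletingRebalance"  # formerly "AwaitingSync"
    STABLE = "Stable"
    DEAD = "Dead"


def is_rebalancing(state: GroupState) -> bool:
    """Hypothetical helper: True for either phase of a rebalance.

    With the new name, a simple check on the state name is enough.
    """
    return "Rebalance" in state.value
```

With the old name, a name-based check like this would have missed the second phase of the rebalance.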
We will also add metrics that report how many consumer groups are in each state:
- NumGroupsPreparingRebalance: the number of consumer groups which are currently in the PreparingRebalance state.
- NumGroupsCompletingRebalance: the number of consumer groups which are currently in the CompletingRebalance state.
- NumGroupsStable: the number of consumer groups which are currently in the Stable state.
- NumGroupsDead: the number of consumer groups which are currently in the Dead state.
- NumGroupsEmpty: the number of consumer groups which are currently in the Empty state.
In combination with the existing NumGroups metric, this will show what percentage of groups are in a particular state at a given time.
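A monitoring system might derive these percentages roughly as follows. This is a minimal sketch, not part of the proposal: the `state_percentages` function, the input shape, and the sample values are all assumptions made for illustration, and how the metric values are actually collected (for example over JMX) is left out.

```python
# Per-state metric names described in this proposal.
STATE_METRICS = (
    "NumGroupsPreparingRebalance",
    "NumGroupsCompletingRebalance",
    "NumGroupsStable",
    "NumGroupsDead",
    "NumGroupsEmpty",
)


def state_percentages(metric_values, num_groups):
    """Return the percentage of groups in each state.

    metric_values: dict of per-state metric name -> current value
                   (hypothetical input; missing metrics count as 0).
    num_groups:    current value of the existing NumGroups metric.
    """
    if num_groups <= 0:
        # No groups at all: avoid dividing by zero.
        return {name: 0.0 for name in STATE_METRICS}
    return {
        name: 100.0 * metric_values.get(name, 0) / num_groups
        for name in STATE_METRICS
    }


# Made-up sample values, for illustration only.
sample = {
    "NumGroupsPreparingRebalance": 1,
    "NumGroupsCompletingRebalance": 0,
    "NumGroupsStable": 2,
    "NumGroupsDead": 0,
    "NumGroupsEmpty": 1,
}
for name, pct in state_percentages(sample, num_groups=4).items():
    print(f"{name}: {pct:.1f}%")
```

A sustained high share of groups in PreparingRebalance or CompletingRebalance would be the signal to investigate.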
Compatibility, Deprecation, and Migration Plan
Rejected Alternatives
Instead of adding a metric, we could look through the broker logs to see when consumer group rebalances begin and end. However, this would be more difficult for metrics monitoring systems to track, since they would have to parse the broker logs.
Another option would be to provide more information about groups through the AdminClient. While this would be useful, it would not serve quite the same purpose as a metric, which gives an at-a-glance summary of what is going on.