...
FailedPartitionsCount - Count of partitions that have failed. Instead of separate metrics, clientId is used as a tag to distinguish between the Replica and ReplicaAlterLogDir fetchers, keeping it consistent with existing metrics such as MaxLag.
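
As a rough illustration of the tagging scheme described above, the following minimal sketch keeps one set of failed partitions per clientId and exposes its size as the count. It is not the broker's implementation: the class name FailedPartitionsSketch, its methods, and the use of "topic-partition" strings are all hypothetical and chosen only for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one counter per clientId tag instead of separate metrics.
public class FailedPartitionsSketch {
    // clientId tag (e.g. "Replica", "ReplicaAlterLogDir") -> failed partitions.
    // Partitions are plain "topic-partition" strings here, an assumption for brevity.
    private final Map<String, Set<String>> failedByClientId = new ConcurrentHashMap<>();

    // Record a failed partition under the given clientId tag.
    public void markFailed(String clientId, String topicPartition) {
        failedByClientId
            .computeIfAbsent(clientId, k -> ConcurrentHashMap.newKeySet())
            .add(topicPartition);
    }

    // Remove a partition from the failed set, e.g. once it has been handled.
    public void clearFailed(String clientId, String topicPartition) {
        Set<String> failed = failedByClientId.get(clientId);
        if (failed != null) {
            failed.remove(topicPartition);
        }
    }

    // The value a FailedPartitionsCount gauge tagged with this clientId would report.
    public int failedPartitionsCount(String clientId) {
        Set<String> failed = failedByClientId.get(clientId);
        return failed == null ? 0 : failed.size();
    }
}
```

Because the partitions are held in a set, marking the same partition as failed twice does not inflate the count, and the two fetcher types never share a counter.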
Proposed Changes
...
- The metric FailedPartitionsCount is a newly added metric which would help keep track of failed partitions. It would help avoid losing several healthy partitions when a partition failure occurs.
...