Status

Current state: Accepted

Vote thread: https://lists.apache.org/thread/vvw45tcdmb23d8gpgb3x4fklx3tp7hpm

Discussion thread: https://lists.apache.org/thread/w4ml9ffkj1j31j8kjpbywq9jsw5ck5sr

JIRA: KAFKA-19254

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Currently, the kafka.server:type=MetadataLoader,name=CurrentMetadataVersion metric tells the operator the metadata.version feature level of every node in a cluster. However, now that there are several supported features beyond metadata.version, a metric that displays those features' levels would help with monitoring upgrade/downgrade scenarios for clusters that use them. Additionally, metrics for the minimum and maximum supported feature levels of each node would help the operator determine whether a feature upgrade/downgrade is safe to perform.

Public Interfaces

Monitoring

Add metrics for each production feature (i.e. metadata.version, kraft.version, transaction.version, group.version, eligible.leader.replicas.version, share.version, and streams.version).

This can be accomplished via generic "feature level" metrics that are tagged by feature name. Using generic metrics allows for new production features to automatically expose their feature level and supported feature levels going forward. 

Name | Type | Description
kafka.server:type=MetadataLoader,name=FinalizedLevel,featureName=X | Short | The finalized value of the feature level for a feature named "X".
kafka.server:type=node-metrics,name=minimum-supported-level,feature-name=X | Short | The minimum supported feature level for a feature named "X" on the node.
kafka.server:type=node-metrics,name=maximum-supported-level,feature-name=X | Short | The maximum supported feature level for a feature named "X" on the node.
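As an illustration only, the sketch below builds JMX object names from the three patterns in the table. It assumes the names in the table can be used literally as JMX object names and that the tag value is the feature name exactly as written in this KIP; neither is confirmed here, and the class FeatureMetricNames and its methods are hypothetical helpers, not part of Kafka.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Hypothetical helper: builds object names from the patterns in the table above. */
final class FeatureMetricNames {
    private FeatureMetricNames() { }

    static ObjectName finalizedLevel(String feature) {
        return parse("kafka.server:type=MetadataLoader,name=FinalizedLevel,featureName=" + feature);
    }

    static ObjectName minimumSupportedLevel(String feature) {
        return parse("kafka.server:type=node-metrics,name=minimum-supported-level,feature-name=" + feature);
    }

    static ObjectName maximumSupportedLevel(String feature) {
        return parse("kafka.server:type=node-metrics,name=maximum-supported-level,feature-name=" + feature);
    }

    // Assumption: the feature name contains no characters that need JMX quoting.
    private static ObjectName parse(String name) {
        try {
            return new ObjectName(name);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid metric name: " + name, e);
        }
    }
}
```

Note that the FinalizedLevel pattern uses a featureName tag while the node-metrics patterns use feature-name, so a monitoring client has to handle both spellings.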

The FinalizedLevel metric reports the finalized feature level of each production feature that has one. A feature without a finalized level has no associated FinalizedLevel metric. Similarly, if a feature's finalized level is "removed" by setting it to 0, the associated FinalizedLevel metric is also removed. Two features are exceptions: metadata.version, whose minimum version is 7, and kraft.version, for which a level of 0 does not mean KRaft is "disabled". The metric stays in the MetadataLoader metric group because its value is derived from the metadata log's feature records.
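The add/remove behavior described above could look roughly like the following. This is a minimal sketch, not the actual MetadataLoader metrics code: the class FinalizedLevelGauges, its method names, and the in-memory map standing in for a metrics registry are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the per-feature FinalizedLevel gauges. */
final class FinalizedLevelGauges {
    // Features for which level 0 is a real level rather than "removed".
    // metadata.version needs no entry: its minimum version is 7, so 0 never occurs.
    private static final Set<String> ZERO_IS_VALID = Set.of("kraft.version");

    // Assumption: one gauge per feature name, modeled here as a plain map.
    private final Map<String, Short> gauges = new ConcurrentHashMap<>();

    /** Applies a finalized level derived from the metadata log's feature records. */
    void onFinalizedLevel(String feature, short level) {
        if (level == 0 && !ZERO_IS_VALID.contains(feature)) {
            gauges.remove(feature); // feature was "removed": drop its metric
        } else {
            gauges.put(feature, level);
        }
    }

    /** Returns the gauge value, or empty if the feature has no FinalizedLevel metric. */
    Optional<Short> value(String feature) {
        return Optional.ofNullable(gauges.get(feature));
    }
}
```

With this shape, a feature that was never finalized and a feature whose level was reset to 0 look the same to a monitoring client: the metric is simply absent.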

The values of minimum-supported-level and maximum-supported-level depend only on the software version being run, so it seems appropriate to define a new metric type, node-metrics, to house these and similar metrics going forward (release version, commit hash, etc.).
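One way an operator's tooling might use the supported-level metrics is sketched below: a target level is treated as safe only if it falls within the supported range reported by every node. This rule is an assumption about how the metrics would be consumed, and FeatureUpgradeCheck and SupportedRange are hypothetical names, not Kafka classes.

```java
import java.util.List;

/** Hypothetical check built on the min/max supported-level metrics of each node. */
final class FeatureUpgradeCheck {
    private FeatureUpgradeCheck() { }

    /** One node's supported range for a single feature. */
    static final class SupportedRange {
        final short min;
        final short max;

        SupportedRange(short min, short max) {
            this.min = min;
            this.max = max;
        }

        boolean supports(short level) {
            return min <= level && level <= max;
        }
    }

    /**
     * Returns true only if every node supports the target level.
     * An empty node list is treated as unsafe, since nothing was observed.
     */
    static boolean isSafeTarget(short target, List<SupportedRange> nodes) {
        return !nodes.isEmpty() && nodes.stream().allMatch(r -> r.supports(target));
    }
}
```

For example, with one node reporting a range of 0 to 2 and another reporting 1 to 3, only levels 1 and 2 pass the check.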

Compatibility, Deprecation, and Migration Plan

We will deprecate the kafka.server:type=MetadataLoader,name=CurrentMetadataVersion metric, since the new per-feature FinalizedLevel metric also exposes this value. The deprecated metric can be removed starting in 5.0.

Test Plan

We will add JUnit tests to verify the new metrics.

Alternatives

Running kafka-features.sh describe against every node is one way to track feature level upgrades/downgrades, but it is not straightforward to monitor because its output is not a metric.

This KIP discusses the above approach: KIP-1160: Enable returning supported features from a specific broker
