Status

Current state: Accepted

Discussion thread: here

Vote thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Kafka currently emits metrics for producers and consumers with different naming conventions in the topic metrics. Specifically, Kafka consumer metrics replace periods (.) in topic names with underscores (_), while Kafka producer metrics retain the period separator. This inconsistency can lead to confusion, misinterpretation of metric names, and additional overhead for users managing Kafka metrics.

This KIP aims to unify the naming conventions for Kafka consumer metrics by following the Kafka producer metrics nomenclature, preserving the original format of topic names (including periods).

The current inconsistency in metric naming across Kafka consumers and producers presents challenges for users monitoring and managing Kafka metrics. Unifying the convention would bring the following benefits:

  • Uniformity: By standardizing topic metric naming, users can streamline metric ingestion and analysis, ensuring a consistent naming convention for Kafka metrics.
  • Reduce Complexity: Metrics tools and dashboards can use the same naming conventions, simplifying the setup and maintenance of monitoring systems.
  • Enhance User Experience: Aligning consumer metrics with the producer naming convention reduces the cognitive load for users and makes metrics easier to interpret.

Currently, when a Kafka consumer metric is emitted, any periods (.) in the topic name are replaced with underscores (_). This behavior diverges from the Kafka producer metrics, which retain periods in topic names.

Context

The initial replacement of periods (.) with underscores (_) in Kafka consumer metrics appears to have been introduced to align with Graphite's hierarchical metrics model (see the PR comment). However, it seems redundant, as the KafkaMetricsGroup class already handles this transformation to support Graphite metrics and reporters that rely on the scope field of Yammer metrics. The scope field of a Yammer MetricName is populated from the metric tags field, with periods (.) replaced by underscores (_). Since this transformation already occurs within the KafkaMetricsGroup class, it is unnecessary to apply it elsewhere, especially in the client, which doesn't support Yammer metrics.

Kafka's JMX and Telemetry reporters use MBean names to output metrics. Unlike Yammer metrics, MBean names don't use the scope field and instead use the unmodified tags field. The unnecessary transformation of topic names in the Kafka consumer's Fetcher metrics is what causes the discrepancy with Kafka producer metrics.
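To make the discrepancy concrete, the following minimal sketch builds JMX-style names from a tags map, once with the topic name untouched (producer behavior) and once with periods replaced (current consumer behavior). The class and method names are hypothetical and are not Kafka APIs; only the metric group names and the period-to-underscore replacement come from the behavior described above, and the name layout is a simplified assumption about how tags end up in an MBean name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch only: MBeanNameSketch and its methods are hypothetical, not Kafka classes.
final class MBeanNameSketch {

    // Builds "domain:type=group,key=value,..." from the tags exactly as given (no scope field involved).
    static String mbeanName(String domain, String group, Map<String, String> tags) {
        String tagPart = tags.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(","));
        return domain + ":type=" + group + (tagPart.isEmpty() ? "" : "," + tagPart);
    }

    // Producer-style tags: the topic name is kept as is.
    static Map<String, String> producerTags(String clientId, String topic) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("client-id", clientId);
        tags.put("topic", topic);
        return tags;
    }

    // Current consumer-style tags: periods in the topic name are replaced with underscores.
    static Map<String, String> legacyConsumerTags(String clientId, String topic) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("client-id", clientId);
        tags.put("topic", topic.replace('.', '_'));
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(mbeanName("kafka.producer", "producer-topic-metrics",
                producerTags("app", "topic.a.b")));
        System.out.println(mbeanName("kafka.consumer", "consumer-fetch-manager-metrics",
                legacyConsumerTags("app", "topic.a.b")));
    }
}
```

Under these assumptions, the same topic appears as topic=topic.a.b on the producer side and as topic=topic_a_b on the consumer side, so a dashboard keyed on the topic tag cannot join the two without extra mapping.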

Public Interfaces

Affected Metrics

The following Kafka consumer metrics will be impacted by this change:

  • consumer-fetch-manager-metrics
    • Topic:
      • bytes-consumed-total
      • bytes-consumed-rate
      • records-consumed-rate
      • fetch-size-max
      • records-consumed-total
      • fetch-size-avg
      • records-per-request-avg
    • Topic-Partition:
      • records-lag-max
      • records-lag
      • records-lag-avg
      • preferred-read-replica
      • records-lead-min
      • records-lead-avg
      • records-lead

With this KIP, these metrics will also be emitted with the topic name unmodified (periods preserved), matching the producer naming scheme. For backward compatibility, the period-replaced metrics will continue to be emitted as well. They are now deprecated and will be removed in a future major Kafka version (5.0).

Proposed Changes

  • Add metrics that preserve periods in topic names to the consumer metrics:

    • If the topic name contains a period (.), add metrics with the same names to the Kafka consumer metrics that preserve the topic name as is, without replacing periods with underscores. Keep emitting the existing period-replaced metrics as well for backward compatibility.
    • If the topic name doesn't contain a period (.), the existing metrics are unchanged.

Note: A drawback of this approach is that metrics for topics containing periods will be emitted twice: once with the actual topic name and once with periods replaced. Users should monitor only one of the two and may need to discard the duplicates for accurate reporting. This is necessary, however, to maintain backward compatibility and prevent disruption to existing user reporting.
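The rule above can be sketched as a small helper that returns the topic tag values under which a consumer topic metric would be registered. This is only an illustration of the proposed behavior; TopicTagSketch and its methods are hypothetical names, not the actual client implementation.

```java
import java.util.List;

// Illustrative sketch only: TopicTagSketch is a hypothetical helper, not part of the Kafka client.
final class TopicTagSketch {

    // Deprecated form of the tag value: periods replaced with underscores.
    static String legacyTopicTag(String topic) {
        return topic.replace('.', '_');
    }

    // Tag values to emit for one topic: always the actual name, plus the
    // deprecated period-replaced form only when the two differ.
    static List<String> topicTagValues(String topic) {
        if (topic.indexOf('.') < 0) {
            return List.of(topic);               // no period: existing metric is unchanged
        }
        return List.of(topic, legacyTopicTag(topic)); // period: new metric plus deprecated one
    }

    public static void main(String[] args) {
        System.out.println(topicTagValues("topic.a.b")); // prints [topic.a.b, topic_a_b]
        System.out.println(topicTagValues("topic_c_d")); // prints [topic_c_d]
    }
}
```

Returning a single value for topics without periods keeps such topics from being registered twice, which matches the "no change to existing metrics" case.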


Example:

Kafka 4.x:

  • Topic Name: topic.a.b
    • Deprecated Metric: A metric with the tag topic_a_b will continue to be emitted, but it will be marked as deprecated.
    • New Metric: A new metric with the same name and the topic tag topic.a.b will be emitted. Users should transition to this new metric.
  • Topic Name: topic_c_d
    • Metric: A metric with the tag topic_c_d will continue to be emitted without any changes.
    • No Additional Metrics: No additional metrics are needed as the existing metric functions correctly.

Kafka 5.0:

The period replacement policy will be completely removed.

Compatibility, Deprecation, and Migration Plan

To ease the transition for users who rely on the existing underscore-based consumer metric naming:

  • Users relying on the existing underscore-based topic names for consumer metrics do not need to change anything immediately, as the old metrics will only be marked as deprecated.
  • However, to ensure accurate reporting, users may need to filter out duplicate metrics, as new metrics with actual topic names will be emitted alongside the deprecated ones.
  • Once the deprecated metrics are removed in a future release, users may need to update their monitoring systems and dashboards to use the original topic names (with periods).
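On the monitoring side, one way to discard the duplicates is to drop any topic tag value that equals the period-replaced form of another observed tag value. The sketch below is a hypothetical user-side filter, not something this KIP provides, and it assumes that no real topic exists whose name is the underscore form of another topic's name.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch only: a hypothetical filter a monitoring pipeline might apply.
final class DuplicateTopicMetricFilter {

    // Keeps every observed topic tag except those that are the deprecated,
    // period-replaced form of another observed tag. Input order is preserved.
    static List<String> dropDeprecatedTopicTags(List<String> observedTopicTags) {
        Set<String> legacyForms = new HashSet<>();
        for (String tag : observedTopicTags) {
            if (tag.indexOf('.') >= 0) {
                legacyForms.add(tag.replace('.', '_'));
            }
        }
        return observedTopicTags.stream()
                .filter(tag -> !legacyForms.contains(tag))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [topic.a.b, topic_c_d]
        System.out.println(dropDeprecatedTopicTags(List.of("topic.a.b", "topic_a_b", "topic_c_d")));
    }
}
```

Users who prefer to stay on the deprecated names for now can invert the filter and drop the tags that contain periods instead, as long as only one of the two forms reaches the reporting layer.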

Test Plan

  • Unit Tests: Update the unit tests for Kafka consumer metrics to validate that metrics with unmodified topic names are additionally emitted, and only for topics whose names contain periods.

Rejected Alternatives

  • Correct the topic names without deprecating existing metrics: Users who are currently monitoring metrics with replaced topic names might experience disruption.

