Status

Current state: Under Discussion

Discussion thread: https://lists.apache.org/thread/mt2xb26v939vzh9hqxkmk2d6mtp78v7d

JIRA: https://issues.apache.org/jira/browse/KAFKA-20199

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

In KRaft mode, NodeToControllerChannelManagerImpl manages communication between nodes and the active controller. This component is used by multiple process roles, including:

- broker-only nodes
- controller-only nodes
- combined broker/controller nodes

Currently, when NodeToControllerChannelManagerImpl creates a Selector, it supplies the metric tag map:

Map.of("BrokerId", String.valueOf(config.brokerId()))

These tags are propagated through Selector and used by SelectorMetrics when registering metrics via:

metrics.metricName(..., metricTags)

As a result, all Selector metrics created by this component include the tag:

BrokerId=<id>

This behavior is appropriate for broker processes. A controller-only node, however, does not act as a broker, so tagging its metrics with BrokerId is misleading and can cause confusion when interpreting metrics in monitoring systems.

Public Interfaces

This KIP changes the metric tag key used for Selector metrics emitted by NodeToControllerChannelManagerImpl when running on controller-only nodes.

- Current behavior:
BrokerId=<id>

- Proposed behavior (controller-only nodes):
NodeId=<id>

The tag map is applied to all metrics registered by SelectorMetrics, including (but not limited to):

- connection-count
- connection-creation-rate
- connection-close-rate
- successful-authentication-rate
- failed-authentication-rate
- incoming-byte-rate
- outgoing-byte-rate
- request-rate
- response-rate
- io-wait-time-ns-avg
- io-time-ns-avg

Metrics emitted by broker-only and combined-role nodes keep the BrokerId tag and are unchanged.

### Example

Before (controller-only node):
connection-count{BrokerId=2}

After (controller-only node):
connection-count{NodeId=2}

Proposed Changes

Modify NodeToControllerChannelManagerImpl so that the metric tag supplied when creating the Selector depends on the process role.

### Current behavior

Map.of("BrokerId", String.valueOf(config.brokerId()))

### Proposed behavior

- Controller-only nodes:
Map.of("NodeId", String.valueOf(config.nodeId()))

- Broker-only and combined-role nodes:
Map.of("BrokerId", String.valueOf(config.brokerId()))

This change affects only the tag key used for controller-only nodes. The tag map continues to be propagated unchanged to SelectorMetrics.
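The role-dependent selection described above can be sketched as follows. This is a minimal, self-contained illustration rather than the actual patch: the `SelectorTagSketch` class, its `ProcessRole` enum, and the `selectorMetricTags` method are hypothetical names introduced here, and the real change would read the roles and node id from the node's configuration inside NodeToControllerChannelManagerImpl.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Kafka code base.
class SelectorTagSketch {

    // Stand-in for the roles a node can be configured with (assumption).
    enum ProcessRole { BROKER, CONTROLLER }

    // Returns the tag map that would be supplied when creating the Selector.
    static Map<String, String> selectorMetricTags(Set<ProcessRole> roles, int nodeId) {
        // Controller-only: the controller role is present and the broker role is not.
        boolean controllerOnly = roles.contains(ProcessRole.CONTROLLER)
                && !roles.contains(ProcessRole.BROKER);
        // Only the tag key changes; the tag value is the same node id in both cases.
        String tagKey = controllerOnly ? "NodeId" : "BrokerId";
        return Map.of(tagKey, String.valueOf(nodeId));
    }

    public static void main(String[] args) {
        // Controller-only node: prints {NodeId=2}
        System.out.println(selectorMetricTags(EnumSet.of(ProcessRole.CONTROLLER), 2));
        // Broker-only node: prints {BrokerId=2}
        System.out.println(selectorMetricTags(EnumSet.of(ProcessRole.BROKER), 2));
        // Combined broker/controller node: prints {BrokerId=2}
        System.out.println(selectorMetricTags(
                EnumSet.of(ProcessRole.BROKER, ProcessRole.CONTROLLER), 2));
    }
}
```

Keeping the decision in one small function means the unit tests in the Test Plan can assert on the returned map directly, without having to create a Selector.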

Compatibility, Deprecation, and Migration Plan

This change affects monitoring setups for controller-only nodes.

Metrics emitted by controller-only nodes will change their tag key from BrokerId to NodeId. Existing dashboards, queries, or alerts that rely on BrokerId for controller-only nodes will need to be updated accordingly.

Metrics emitted by broker-only and combined-role nodes remain unchanged.

No deprecation period is introduced, since the current behavior for controller-only nodes is misleading and inconsistent with the process role.

Test Plan

The following tests will be added:

1. Unit tests verifying that NodeToControllerChannelManagerImpl supplies the correct metric tag based on the process role.

2. Integration tests verifying that:
- controller-only nodes expose Selector metrics tagged with NodeId
- broker-only and combined-role nodes continue to expose metrics tagged with BrokerId

Rejected Alternatives


Keep BrokerId for all roles
Preserves compatibility but leaves misleading metrics for controller-only nodes.

Use NodeId for all roles
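Would also change the tag key on broker-only and combined-role nodes, so existing dashboards, queries, and alerts would break for every node rather than only for controller-only nodes.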

Emit both BrokerId and NodeId
Adds complexity and potential ambiguity without clear operational benefit.
