Status

Current state: Proposed

Discussion thread: Start to End; Also see comment section of KAFKA-18495

JIRA: KAFKA-18495 

Component: Streams

Released: 5.0.0

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).


Motivation

The opennumberfiles metric, exposed by the RocksDB state store implementation in Kafka Streams, is currently invalid. This metric was intended to report the approximate number of open files in the RocksDB instance. It was calculated from RocksDB's internal statistics, one of which is a metric called NO_FILE_CLOSES.

The calculation involved subtracting the value of NO_FILE_CLOSES from another metric representing the total number of files. However, starting from RocksDB version 8.7.3, the NO_FILE_CLOSES metric has been removed. This change in RocksDB's internal metrics broke the calculation of opennumberfiles in Kafka Streams.

As a result, the opennumberfiles metric no longer provides accurate information. As a workaround, the metric currently always returns -1, which is still misleading: it can confuse users who rely on it to monitor their RocksDB-backed state stores, and it may trigger false alarms in their monitoring systems.
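
The calculation described above can be sketched as follows. This is an illustration, not the Kafka Streams code: the class and method names are hypothetical, and the "other metric" is assumed here to be a running count of file opens. When the close counter is unavailable the difference cannot be formed, so the sketch returns -1 to mirror the workaround value.

```java
import java.util.OptionalLong;

// Hypothetical illustration of the derived open-file count; not Kafka Streams code.
class OpenFilesEstimate {

    /**
     * Returns opens minus closes when a close counter is available,
     * or -1 when it is not (the value the workaround reports).
     * Treating the first operand as a count of file opens is an assumption.
     */
    public static long estimateOpenFiles(long fileOpens, OptionalLong fileCloses) {
        if (!fileCloses.isPresent()) {
            return -1L; // no close counter, so the difference cannot be computed
        }
        return fileOpens - fileCloses.getAsLong();
    }

    public static void main(String[] args) {
        // Close counter available: 120 opens, 100 closes
        System.out.println(estimateOpenFiles(120, OptionalLong.of(100)));
        // Close counter removed: only the placeholder can be reported
        System.out.println(estimateOpenFiles(120, OptionalLong.empty()));
    }
}
```

The second call shows why the metric stopped being useful: without the close counter there is nothing meaningful to report.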


Public Interfaces

This KIP proposes to remove the following public metric:

kafka.streams:type=stream-state-metrics,task-id={task-id},rocksdb-state-store={store-name}/opennumberfiles


Proposed Changes

The proposed solution is to completely remove the opennumberfiles metric from the RocksDB state store implementation.

The following changes will be made:

  1. Code Removal in RocksDBMetricsRecorder and RocksDBMetrics:

    • The numberOfOpenFilesSensor field (a Sensor) will be removed from RocksDBMetricsRecorder.
    • The line numberOfOpenFilesSensor = RocksDBMetrics.numberOfOpenFilesSensor(streamsMetrics, metricContext); that initializes the sensor will be removed.
    • The line numberOfOpenFilesSensor.record(numberOfOpenFiles, now); that records the metric's value will be removed.
    • The method RocksDBMetrics.numberOfOpenFilesSensor that creates the Sensor will be removed from the RocksDBMetrics class.
  2. Documentation Update: The opennumberfiles metric will be removed from the Kafka Streams documentation where RocksDB metrics are described.

Explanation of the Changes:

By removing the sensor and the associated recording logic, we ensure that Kafka Streams no longer attempts to collect or report this invalid metric. Removing the numberOfOpenFilesSensor method from the RocksDBMetrics class also cleans up the metric creation code.


Compatibility, Deprecation, and Migration Plan

  • Impact: Users who are currently monitoring the opennumberfiles metric will need to update their monitoring systems and remove any references to this metric.
  • Migration: Since the metric is already invalid and reporting -1, there is no direct replacement. Users should review the updated RocksDB documentation for alternative metrics that provide insights into file management within RocksDB.
  • Deprecation: Given that the metric has been identified as returning a constant value of -1 due to its underlying dependency on a removed RocksDB metric, a deprecation period is deemed unnecessary. The metric's current state does not provide useful information, and its removal aligns with the goal of maintaining accurate and reliable metrics.
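
To help with updating monitoring systems, the following minimal sketch checks whether any MBean in the local JVM still exposes the attribute. It uses only the standard javax.management API. The class and method names are hypothetical, the attribute name follows the spelling used in this KIP, and the sketch assumes JMX reporting is enabled and that it runs inside the Streams application's JVM.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper; not part of Kafka Streams.
class RemovedMetricCheck {

    /** Returns the MBeans matching the pattern that expose the given attribute. */
    public static List<ObjectName> mbeansExposing(MBeanServer server, String pattern, String attribute)
            throws Exception {
        List<ObjectName> hits = new ArrayList<>();
        // queryNames accepts an ObjectName pattern; a null query matches every candidate
        for (ObjectName name : server.queryNames(new ObjectName(pattern), null)) {
            for (MBeanAttributeInfo info : server.getMBeanInfo(name).getAttributes()) {
                if (info.getName().equals(attribute)) {
                    hits.add(name);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        List<ObjectName> hits = mbeansExposing(
                server, "kafka.streams:type=stream-state-metrics,*", "opennumberfiles");
        System.out.println(hits.isEmpty()
                ? "metric not exposed"
                : "metric still exposed by: " + hits);
    }
}
```

An empty result after upgrading suggests that dashboards and alerts referring to the attribute can be removed. Checking a remote application would need a JMX connector instead of the platform MBean server.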


Test Plan

The change will be tested by verifying that:

  1. The opennumberfiles metric is no longer present in the metrics reported by Kafka Streams.
  2. The removal of the metric does not introduce any regressions or unexpected behavior in the RocksDB state store implementation.
  3. The unit tests for RocksDBMetricsRecorder, updated to reflect the removal of the metric, pass.


Rejected Alternatives

1. Keep the metric and always return -1:

  • Why it was rejected: While this is the current workaround, keeping a permanently invalid metric is misleading and goes against the principle of providing accurate monitoring data.

2. Attempt to approximate opennumberfiles using other metrics:

  • Why it was rejected: There are no suitable replacement metrics in the newer version of RocksDB that could be used to accurately calculate the number of open files. Any approximation would likely be inaccurate and potentially more misleading than the current situation.