Status
Current state: Proposed
Discussion thread: Start to End; Also see comment section of KAFKA-18495
JIRA: KAFKA-18495
Component: Streams
Released: 5.0.0
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
The opennumberfiles metric, exposed by the RocksDB state store implementation in Kafka Streams, is currently invalid. This metric was intended to report the approximate number of open files in the RocksDB instance. It was previously calculated using a RocksDB internal metric called NO_FILE_CLOSES (along with potentially other related metrics).
The calculation involved subtracting the value of NO_FILE_CLOSES from another metric representing the total number of files. However, starting from RocksDB version 8.7.3, the NO_FILE_CLOSES metric has been removed. This change in RocksDB's internal metrics broke the calculation of opennumberfiles in Kafka Streams.
As a result, the opennumberfiles metric no longer provides accurate information. The current workaround is to have the metric always return -1, which is still misleading: users who rely on it to monitor their RocksDB-backed state stores may be confused by the constant value, and it may trigger false alarms in their monitoring systems.
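To make the breakage concrete, the following is a minimal sketch of the subtraction described above, written with plain counters rather than the RocksDB API. It is an illustration only and not the Kafka Streams implementation: the class OpenFilesSketch, the method approximateOpenFiles, and the parameter totalFiles (standing in for the "total number of files" metric) are hypothetical names, and an empty OptionalLong is an assumed way to model NO_FILE_CLOSES no longer being available.

```java
import java.util.OptionalLong;

public class OpenFilesSketch {

    // Sentinel value that the KIP describes as the current workaround.
    public static final long INVALID = -1L;

    /**
     * Hypothetical helper: subtracts the file-close count from the total file count.
     * An empty fileCloses models the NO_FILE_CLOSES counter being unavailable,
     * in which case no meaningful value can be computed.
     */
    public static long approximateOpenFiles(long totalFiles, OptionalLong fileCloses) {
        if (!fileCloses.isPresent()) {
            return INVALID;
        }
        return totalFiles - fileCloses.getAsLong();
    }

    public static void main(String[] args) {
        // Counter available: 120 files seen, 100 closed.
        System.out.println(approximateOpenFiles(120, OptionalLong.of(100))); // prints 20
        // Counter removed: only the sentinel can be reported.
        System.out.println(approximateOpenFiles(120, OptionalLong.empty())); // prints -1
    }
}
```

Once the second operand is gone, every call takes the sentinel branch, which is why the metric carries no information in its current state.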
Public Interfaces
This KIP proposes to remove the following public metric:
kafka.streams:type=stream-state-metrics,task-id={task-id},rocksdb-state-store={store-name}/opennumberfiles
Proposed Changes
The proposed solution is to completely remove the opennumberfiles metric from the RocksDB state store implementation.
The following changes will be made:
Code Removal in RocksDBMetricsRecorder:
- The numberOfOpenFilesSensor Sensor object will be removed.
- The line numberOfOpenFilesSensor = RocksDBMetrics.numberOfOpenFilesSensor(streamsMetrics, metricContext); that initializes the sensor will be removed.
- The line numberOfOpenFilesSensor.record(numberOfOpenFiles, now); that records the metric's value will be removed.
- The method RocksDBMetrics.numberOfOpenFilesSensor that creates the Sensor will be removed from the RocksDBMetrics class.
Documentation Update: The opennumberfiles metric will be removed from the Kafka Streams documentation where RocksDB metrics are described.
Explanation of the Changes:
By removing the sensor and the associated recording logic, we ensure that Kafka Streams no longer attempts to collect or report this invalid metric. Removing the numberOfOpenFilesSensor method from the RocksDBMetrics class also cleans up the metric creation code.
Compatibility, Deprecation, and Migration Plan
- Impact: Users who are currently monitoring the opennumberfiles metric will need to update their monitoring systems and remove any references to this metric.
- Migration: Since the metric is already invalid and reporting -1, there is no direct replacement. Users should review the updated RocksDB documentation for alternative metrics that provide insights into file management within RocksDB.
- Deprecation: Given that the metric has been identified as returning a constant value of -1 due to its underlying dependency on a removed RocksDB metric, a deprecation period is deemed unnecessary. The metric's current state does not provide useful information, and its removal aligns with the goal of maintaining accurate and reliable metrics.
Test Plan
The change will be tested by verifying that:
- The opennumberfiles metric is no longer present in the metrics reported by Kafka Streams.
- The removal of the metric does not introduce any regressions or unexpected behavior in the RocksDB state store implementation.
- Unit tests for RocksDBMetricsRecorder are updated to reflect the removal of the metric.
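The first check in the list above can be sketched as a simple absence test over reported metric names. This is a hedged illustration rather than the actual test code: the class RemovedMetricCheck and the method metricAbsent are hypothetical, and the metric names are passed in as plain strings so the example stays self-contained. In a real application the names could be collected from the MetricName keys of the map returned by KafkaStreams#metrics().

```java
import java.util.Arrays;
import java.util.List;

public class RemovedMetricCheck {

    // Name of the metric this KIP removes.
    public static final String REMOVED_METRIC = "opennumberfiles";

    /** Returns true if none of the reported metric names equals the removed metric. */
    public static boolean metricAbsent(List<String> reportedMetricNames) {
        return reportedMetricNames.stream().noneMatch(REMOVED_METRIC::equals);
    }

    public static void main(String[] args) {
        // Placeholder metric names, for illustration only.
        List<String> beforeRemoval = Arrays.asList("metric-a", "opennumberfiles");
        List<String> afterRemoval = Arrays.asList("metric-a");

        System.out.println(metricAbsent(beforeRemoval)); // prints false
        System.out.println(metricAbsent(afterRemoval));  // prints true
    }
}
```

The same predicate could be reused on the monitoring side to confirm that dashboards and alerts no longer reference the removed metric.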
Rejected Alternatives
1. Keep the metric and always return -1:
- Why it was rejected: While this is the current workaround, keeping a permanently invalid metric is misleading and goes against the principle of providing accurate monitoring data.
2. Attempt to approximate opennumberfiles using other metrics:
- Why it was rejected: There are no suitable replacement metrics in the newer version of RocksDB that could be used to accurately calculate the number of open files. Any approximation would likely be inaccurate and potentially more misleading than the current situation.