Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Update KIP name

Table of Contents

Status

Current state: Under Discussion

...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Iterator leaks are caused when a user opens an Iterator on a StateStore, but never closes it. Despite Iterators being AutoCloseable, it's still possible for them to fail to close, if a user does not allocate them in a try-with-resources block. Iterator leaks can directly cause memory leaks, or other performance problems, depending on the StateStore implementation.

...

We introduce several new metrics to aid users in finding leaked Iterators, as well as identifying the cause of a high number of pinned blocks, or other kinds of memory leaks and performance problems.

Public Interfaces

New Metrics

Each added metric will be on store-level and have the following tags:

...

NameRecording LevelMetric TypeDescription
num-open-iteratorsINFOGaugeThe current number of Iterators on the store that have been created, but not yet closed.
iterator-duration-avgDEBUGAverageThe average time spent between creating an Iterator and closing it, in milliseconds.
iterator-duration-maxDEBUGMaximumThe maximum time spent between creating an Iterator and closing it, in milliseconds.
oldest-iterator-open-since-msDEBUGGaugeThe UNIX timestamp the oldest still open iterator was created, in milliseconds.

Proposed Changes

MeteredKeyValue store, which by default automatically wraps all state stores created by Kafka Streams, already defines a MeteredKeyValueIterator , which automatically wraps all Iterators provided by the underlying StateStore. We will add our new metrics to MeteredKeyValueIterator.

...

  • The metric continues to climb indefinitely over time, as it indicates that old Iterators are not being closed.
  • The metric is consistently very high, despite not continuing to climb, which indicates an Iterator that takes a long time to complete iteration and close. This could indicate a performance problem in the application, like iterating over every key in a very large StateStore.

Compatibility, Deprecation, and Migration Plan

  • No impact is expected for existing users, as no existing interfaces or behaviour is being changed.
  • The performance overheads of tracking the opening/closing of Iterators is expected to be negligible to the point that it will not be measurable in real-world applications.

Test Plan

MeteredKeyValueStoreTest will be updated with new tests for each metric.

Rejected Alternatives

  • Originally, this KIP focused on providing metrics exclusively for RocksDB Iterators. However, it was determined that it would be useful to have these metrics available for all StateStores. The original implementation involved using RocksDBs internal Tickers to track open/closed Iterators. Our new implementation maintains these counts in Kafka Streams, which enables these metrics to be reported for any StateStore.