Status
Current state: Adopted
Discussion thread: here
JIRA: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Currently, most client metrics are measured in milliseconds, and the ones that are measured in nanoseconds have the `-ns` infix, or suffix, in their name. However, this naming convention has not been followed in the following 3 cases:
- bufferpool-wait-time-total
- io-waittime-total
- iotime-total
Metrics provided by Kafka should be consistent and deliver what is expected without the need to look at the code or some documentation page.
Additionally, the `io-waittime-total` and `iotime-total` metric names don't follow the general naming convention, as they are not consistent with the rest of their family, such as `io-wait-time-ns-avg` or `io-time-ns-avg`. Note the missing hyphens in "waittime" and "iotime".
Public Interfaces
Three new client metrics will be introduced:
- bufferpool-wait-time-ns-total: with the same values as the current `bufferpool-wait-time-total`
- io-wait-time-ns-total: with the same values as the current `io-waittime-total`
- io-time-ns-total: with the same values as the current `iotime-total`
Proposed Changes
The idea of this KIP is to add these 3 new metrics. `bufferpool-wait-time-ns-total` will be added in org.apache.kafka.clients.producer.internals.BufferPool#BufferPool. `io-wait-time-ns-total` and `io-time-ns-total` will be added to org.apache.kafka.common.network.Selector.SelectorMetrics#SelectorMetrics.
Additionally, these metrics, together with the wrongly named ones, will be added to the documentation site, as they are not currently present there.
The following already existing metrics will be deprecated:
- bufferpool-wait-time-total
- io-waittime-total
- iotime-total
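The renaming above can be summarized as a lookup table. The following is a minimal sketch for tooling on the monitoring side; the class `DeprecatedClientMetrics` and its method are hypothetical and not part of Kafka, and only the metric names come from this KIP.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not part of Kafka): maps each deprecated metric name to its replacement. */
final class DeprecatedClientMetrics {
    // Deprecated name -> replacement name, as listed in this KIP.
    static final Map<String, String> REPLACEMENTS = Map.of(
        "bufferpool-wait-time-total", "bufferpool-wait-time-ns-total",
        "io-waittime-total", "io-wait-time-ns-total",
        "iotime-total", "io-time-ns-total");

    private DeprecatedClientMetrics() { }

    /** Returns the replacement name, or empty if the given name is not deprecated by this KIP. */
    static Optional<String> replacementFor(String metricName) {
        return Optional.ofNullable(REPLACEMENTS.get(metricName));
    }
}
```

A dashboard or alerting script could use such a table to flag queries that still reference a deprecated name.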
Compatibility, Deprecation, and Migration Plan
To ease the migration, the new metrics will live alongside the wrongly named ones, so users relying on the wrong names will have time to migrate to the correct ones.
Both the existing, wrongly named metrics and the new ones will be added to the ops.html page under the "Common monitoring metrics for producer/consumer/connect/streams" section. Each wrongly named metric will carry a warning letting users know that it is deprecated and will be removed in a future release.
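During the transition, monitoring code may run against clients that expose only the old names or both sets. The sketch below shows one way to handle that, assuming the metric values have already been collected into a plain name-to-value map (for example from the client's `metrics()` map or from JMX). The class `IoTimeReader` and its methods are hypothetical and not part of Kafka.

```java
import java.util.Map;
import java.util.OptionalDouble;

/** Hypothetical monitoring-side helper (not part of Kafka). */
final class IoTimeReader {
    private IoTimeReader() { }

    /**
     * Reads the total I/O time in nanoseconds from a snapshot of metric name -> value,
     * preferring the new name and falling back to the deprecated one.
     */
    static OptionalDouble ioTimeNanos(Map<String, Double> snapshot) {
        Double value = snapshot.get("io-time-ns-total");
        if (value == null) {
            // Older clients only expose the deprecated name; it carries the same value.
            value = snapshot.get("iotime-total");
        }
        return value == null ? OptionalDouble.empty() : OptionalDouble.of(value);
    }

    /** Converts a nanosecond total to milliseconds, the unit used by most other client metrics. */
    static double nanosToMillis(double nanos) {
        return nanos / 1_000_000.0;
    }
}
```

Once every monitored client exposes the new names, the fallback branch can be dropped.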
Rejected Alternatives
Renaming the current metrics was rejected because it would break backward compatibility.