Status
Current state: Under Discussion
Discussion thread: https://lists.apache.org/thread/7oprocltq9f94x0c7761bhzs85t8b0jv
JIRA: KAFKA-20180
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Currently, there exists only one metric on the broker that enables visibility into the broker's ability to successfully fetch and apply metadata from the cluster metadata partition. This metric is last-applied-record-lag-ms, which reports the difference between the local system time and the timestamp of the last record from the cluster metadata partition that was applied by the broker.
The main issue with this metric is that it is computed against the broker's local system time at the moment the metric is collected, not at the moment the record was applied. The reported value therefore depends on when and how often metrics are collected, which may make the metric harder to monitor and alert on.
A latency metric would be more intuitive for operators to monitor and alert on. Its value would be the difference between the time at which the broker applied its most recent metadata image locally and the leader's append timestamp for that metadata image.
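To illustrate the difference between the two calculations, here is a minimal sketch. The class and method names are hypothetical and are not part of Kafka or of this proposal; all timestamps are assumed to be epoch milliseconds.

```java
// Illustrative sketch only: MetadataTimingSketch and its methods are
// hypothetical names, not Kafka APIs.
final class MetadataTimingSketch {
    private MetadataTimingSketch() {}

    // Lag in the style of last-applied-record-lag-ms: measured against the
    // broker's clock at collection time, so it changes with every collection.
    static long lagMs(long collectionTimeMs, long leaderAppendTimestampMs) {
        return collectionTimeMs - leaderAppendTimestampMs;
    }

    // Latency in the style proposed here: measured against the time the
    // broker applied the image, so it is fixed once the image is applied.
    static long latencyMs(long appliedAtMs, long leaderAppendTimestampMs) {
        return appliedAtMs - leaderAppendTimestampMs;
    }
}
```

For example, with a leader append timestamp of 10,000 ms and a local apply time of 10,040 ms, `latencyMs` is 40 regardless of when it is read, whereas `lagMs` is 100 when collected at 10,100 ms and 5,000 when collected at 15,000 ms.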
Public Interfaces
Monitoring
We will introduce a histogram metric that captures the p50, p99, p999, and maximum of a broker's last applied image latency.
| Name | Type |
|---|---|
| kafka.server:type=broker-metadata-metrics,name=last-applied-image-latency-ms-50percentile | Histogram |
| kafka.server:type=broker-metadata-metrics,name=last-applied-image-latency-ms-99percentile | Histogram |
| kafka.server:type=broker-metadata-metrics,name=last-applied-image-latency-ms-999percentile | Histogram |
| kafka.server:type=broker-metadata-metrics,name=last-applied-image-latency-ms-max | Histogram |
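As a rough sketch of how the four reported values relate to the recorded latencies, the following stand-alone class keeps every sample and computes nearest-rank percentiles. It is an assumption-laden illustration, not the histogram implementation Kafka would use; the class name, the method names, and the per-mille parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only: not Kafka's histogram implementation.
// Keeps all samples in memory, which a production histogram would avoid.
final class LatencyHistogramSketch {
    private final List<Long> samples = new ArrayList<>();

    // Record one image-apply latency in milliseconds.
    synchronized void record(long latencyMs) {
        samples.add(latencyMs);
    }

    // Nearest-rank percentile, expressed in per-mille to avoid floating-point
    // rounding (500 = p50, 990 = p99, 999 = p999). Returns 0 when empty.
    synchronized long percentilePerMille(int perMille) {
        if (samples.isEmpty()) {
            return 0L;
        }
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        // Ceiling of (perMille * n / 1000), clamped to at least rank 1.
        long rank = ((long) perMille * sorted.size() + 999L) / 1000L;
        return sorted.get((int) Math.max(rank, 1L) - 1);
    }

    // Largest recorded latency, or 0 when empty.
    synchronized long max() {
        return samples.isEmpty() ? 0L : Collections.max(samples);
    }
}
```

Under these assumptions, the four metrics in the table would correspond to `percentilePerMille(500)`, `percentilePerMille(990)`, `percentilePerMille(999)`, and `max()`.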
Compatibility, Deprecation, and Migration Plan
This KIP only introduces new metrics.
Test Plan
Unit and integration tests for the newly added metrics.
2 Comments
Federico Valeri
Kevin Wu hi, just want to let you know that KIP-1279 is already taken by the Cluster Mirroring proposal, and that discussion is ongoing.
Kevin Wu
Got it. Will update the KIP number.