...
The POC implementation of the proposed metrics can be found here: https://github.com/apache/kafka/pull/12391
Metric Name | Level | Type | Description | Notes
---|---|---|---|---
active-restoring-tasks | thread / INFO | count | The number of active tasks currently undergoing restoration | |
standby-updating-tasks | thread / INFO | count | The number of standby tasks currently being updated | |
active-paused-tasks | thread / INFO | count | The number of active tasks whose restoration is paused | |
standby-paused-tasks | thread / INFO | count | The number of standby tasks whose updating is paused | |
idle-ratio | thread / INFO | gauge (percentage) | The fraction of time the thread spent idle | idle-ratio + restore/update-ratio + checkpoint-ratio should sum to 1 |
active-restore-ratio | thread / INFO | gauge (percentage) | The fraction of time the thread spent restoring active tasks | idle-ratio + restore/update-ratio + checkpoint-ratio should sum to 1; only one of active-restore-ratio and standby-update-ratio should be non-zero |
standby-update-ratio | thread / INFO | gauge (percentage) | The fraction of time the thread spent updating standby tasks | idle-ratio + restore/update-ratio + checkpoint-ratio should sum to 1; only one of active-restore-ratio and standby-update-ratio should be non-zero |
checkpoint-ratio | thread / INFO | gauge (percentage) | The fraction of time the thread spent checkpointing restored progress | idle-ratio + restore/update-ratio + checkpoint-ratio should sum to 1 |
active-restore-records-rate | thread / INFO | rate | The average number of records restored per second across all active tasks | min(active-restore-records-rate, standby-update-records-rate) == 0 |
standby-update-records-rate | thread / INFO | rate | The average number of records updated per second across all standby tasks | min(active-restore-records-rate, standby-update-records-rate) == 0 |
restore-call-rate | thread / INFO | rate | The average number of restore calls triggered per second | |
restore-total | task / DEBUG | count | The total number of records processed during restoration | when the task |
restore-rate | task / DEBUG | rate | The average number of records restored per second | |
active-restore-remaining-records-total | task / INFO | count | The number of records remaining to be restored | |
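
The Notes column implies a few consistency rules between the thread-level metrics. As a rough illustration only, the sketch below checks them against a snapshot of metric values. It assumes the ratios are reported as fractions in [0, 1] and that the values have already been collected into a plain map keyed by metric name. The class name, the helper methods, the tolerance, and the treatment of missing metrics as 0 are all hypothetical and are not part of the proposal or of the Kafka API.

```java
import java.util.Map;

public class RestorationMetricInvariants {

    // Hypothetical tolerance for floating-point comparison; not defined by the proposal.
    static final double EPSILON = 1e-6;

    // Assumption of this sketch: a metric missing from the snapshot counts as 0.0.
    static double value(Map<String, Double> metrics, String name) {
        return metrics.getOrDefault(name, 0.0);
    }

    // idle-ratio + restore/update-ratio + checkpoint-ratio should be 1.
    static boolean ratiosSumToOne(Map<String, Double> metrics) {
        double sum = value(metrics, "idle-ratio")
                + value(metrics, "active-restore-ratio")
                + value(metrics, "standby-update-ratio")
                + value(metrics, "checkpoint-ratio");
        return Math.abs(sum - 1.0) < EPSILON;
    }

    // True when at least one of the two metrics is zero, i.e. min(a, b) == 0.
    static boolean atMostOneNonZero(Map<String, Double> metrics, String a, String b) {
        return Math.min(value(metrics, a), value(metrics, b)) == 0.0;
    }

    // Combines the three rules stated in the Notes column.
    static boolean isConsistent(Map<String, Double> metrics) {
        return ratiosSumToOne(metrics)
                && atMostOneNonZero(metrics, "active-restore-ratio", "standby-update-ratio")
                && atMostOneNonZero(metrics,
                        "active-restore-records-rate", "standby-update-records-rate");
    }

    public static void main(String[] args) {
        // Made-up sample values for a thread that is restoring active tasks.
        Map<String, Double> snapshot = Map.of(
                "idle-ratio", 0.25,
                "active-restore-ratio", 0.5,
                "standby-update-ratio", 0.0,
                "checkpoint-ratio", 0.25,
                "active-restore-records-rate", 1200.0,
                "standby-update-records-rate", 0.0);
        System.out.println(isConsistent(snapshot));
    }
}
```

A check like this could serve as a sanity test while experimenting with the POC. How the snapshot map is populated from a running application is deliberately left out of the sketch.
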
Along with these new metrics, we would also deprecate the metrics below:
...