write-waiting-time-(avg|total)

As explained above for bytes-flushed-(rate|total) and flush-time-(avg|min|max), when the memtable is almost full, a background process flushes its data to disk. From time to time RocksDB also reorganises the data on disk with compactions. During flushes and compactions, a write to the database might need to wait until these processes finish. These metrics measure the average and total time that writes wait for flushes and compactions to finish.

If flushes and compactions happen too often, this waiting time may increase and signal a bottleneck. Users can then take action by, e.g., increasing the size of the memtable to decrease the rate of flushes, or by changing the compaction settings.
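To illustrate how the avg and total variants of this metric relate, here is a minimal sketch in plain Java. The class WriteWaitStats and its methods are hypothetical names invented for this example; they are not part of Kafka Streams or RocksDB, and the sketch does not claim to show how either of them computes the metric. The time unit is whatever the caller passes in.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration only; not a Kafka Streams or RocksDB class.
final class WriteWaitStats {
    private final AtomicLong totalWait = new AtomicLong();
    private final AtomicLong waits = new AtomicLong();

    // Record how long one write had to wait for a flush or compaction to finish.
    void record(long waitTime) {
        totalWait.addAndGet(waitTime);
        waits.incrementAndGet();
    }

    // Counterpart of write-waiting-time-total: the sum of all recorded waits.
    long total() {
        return totalWait.get();
    }

    // Counterpart of write-waiting-time-avg: total divided by the number of
    // recorded waits, or 0.0 if nothing has been recorded yet.
    double avg() {
        long n = waits.get();
        return n == 0 ? 0.0 : (double) totalWait.get() / n;
    }
}
```

A total that keeps growing while the average stays flat suggests many short waits, whereas a rising average suggests that individual writes are stalled for longer.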

num-open-files and num-file-errors-total

Part of the data in RocksDB is kept in files. These files need to be opened and closed. The metric num-open-files measures the number of currently open files, and the metric num-file-errors-total measures the total number of file errors. Both metrics may help to find issues connected to the OS and the file system.
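The two metrics differ in kind: num-open-files is a current value that can go up and down, while num-file-errors-total only ever increases. The following plain-Java sketch shows that difference. FileMetricsTracker and its methods are hypothetical names used only for this example and do not reflect how RocksDB or Kafka Streams track these values internally.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical tracker for illustration only; not a Kafka Streams or RocksDB class.
final class FileMetricsTracker {
    private final AtomicLong openFiles = new AtomicLong();
    private final AtomicLong fileErrorsTotal = new AtomicLong();

    // A file was opened: the current-value metric goes up.
    void fileOpened() {
        openFiles.incrementAndGet();
    }

    // A file was closed: the current-value metric goes down again.
    void fileClosed() {
        openFiles.decrementAndGet();
    }

    // A file error occurred: the cumulative metric only ever increases.
    void fileError() {
        fileErrorsTotal.incrementAndGet();
    }

    // Counterpart of num-open-files.
    long numOpenFiles() {
        return openFiles.get();
    }

    // Counterpart of num-file-errors-total.
    long numFileErrorsTotal() {
        return fileErrorsTotal.get();
    }
}
```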

Compatibility, Deprecation, and Migration Plan

Since metrics are added and no other metrics are modified, this KIP should not:
affect backward-compatibility
deprecate public interfaces
need a migration plan other than users adding the new metrics to their own monitoring component

Rejected Alternatives

The metrics bytes-read-compaction-total and bytes-written-compaction-total did not seem useful to me since they would measure bytes moved between memory and disk due to compaction. The metric bytes-flushed-total at least gives a sense of the size of the persisted data in the RocksDB instance.
The number of timed-out writes would
