...

Checkpointing is a routine operation that the Flink engine uses to ensure fault tolerance and enable rescaling. In addition, end-to-end exactly-once processing hinges on committing data at the granularity of each checkpoint, a fundamental requirement for a streaming DB. Making checkpoints light and fast is therefore crucial. While generic log-based checkpointing[1] tackles the problem, it introduces an additional log layer that double-writes state changes and incurs extra CPU and network overhead.

...

While several research projects explore disaggregated, embedded key-value stores (such as disaggregated RocksDB[2] and RocksDB-Cloud[3]), no widely adopted open-source solution exists yet. After carefully weighing usability, extensibility, complexity, and performance, as well as the effort required to integrate with the Flink engine, we decided to build a disaggregated state store named ForSt on top of frocksdb. Additionally, we created a unified file system JNI proxy that leverages the existing file system implementations in Flink, ensuring compatibility with the various file system options.
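The idea of a unified file system proxy can be sketched as follows. This is only an illustration under stated assumptions, not the ForSt implementation: the names `StateFileSystem`, `InMemoryFileSystem`, and `FileSystemProxy` are hypothetical, the JNI boundary itself is omitted, and the in-memory backend merely stands in for a real Flink file system implementation.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical minimal surface that a native state store could call through JNI. */
interface StateFileSystem {
    void write(String path, byte[] data);
    byte[] read(String path);
    boolean exists(String path);
    boolean delete(String path);
}

/** Stand-in backend (assumption); a real backend would delegate to an existing file system implementation. */
final class InMemoryFileSystem implements StateFileSystem {
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    @Override
    public void write(String path, byte[] data) {
        files.put(path, data.clone()); // defensive copy
    }

    @Override
    public byte[] read(String path) {
        byte[] data = files.get(path);
        if (data == null) {
            throw new NoSuchElementException("No such file: " + path);
        }
        return data.clone();
    }

    @Override
    public boolean exists(String path) {
        return files.containsKey(path);
    }

    @Override
    public boolean delete(String path) {
        return files.remove(path) != null;
    }
}

/** Proxy that selects a backend by URI scheme, so the store sees a single API. */
final class FileSystemProxy implements StateFileSystem {
    private final Map<String, StateFileSystem> backendsByScheme = new ConcurrentHashMap<>();

    void register(String scheme, StateFileSystem backend) {
        backendsByScheme.put(scheme, backend);
    }

    private StateFileSystem resolve(String path) {
        int idx = path.indexOf("://");
        // Paths without a scheme are treated as local files in this sketch.
        String scheme = idx < 0 ? "file" : path.substring(0, idx);
        StateFileSystem backend = backendsByScheme.get(scheme);
        if (backend == null) {
            throw new IllegalArgumentException("No file system registered for scheme: " + scheme);
        }
        return backend;
    }

    @Override
    public void write(String path, byte[] data) {
        resolve(path).write(path, data);
    }

    @Override
    public byte[] read(String path) {
        return resolve(path).read(path);
    }

    @Override
    public boolean exists(String path) {
        return resolve(path).exists(path);
    }

    @Override
    public boolean delete(String path) {
        return resolve(path).delete(path);
    }
}
```

In this sketch the store only ever talks to `FileSystemProxy`, so supporting another storage system means registering one more backend rather than changing the store.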

...

  1. Heavy Checkpointing Procedure: A considerable number of files must be uploaded during checkpointing.
  2. Limited Data Structure Flexibility: Confining local disk data to the SST format restricts potential performance gains from alternative caching structures.
  3. Inaccurate Warm/Cold Distinction: File-level classification of data as warm or cold inaccurately reflects actual access patterns, leading to suboptimal resource allocation.
  4. More Complicated File Management: In this architecture, both the local disk and DFS serve as part of the primary storage, so file management must be unified across the two. This is complicated in extreme cases such as error handling.

References

[1] Generic log-based checkpointing: https://flink.apache.org/2022/05/30/improving-speed-and-stability-of-checkpointing-with-generic-log-based-incremental-checkpoints/

[2] Disaggregated RocksDB: https://dl.acm.org/doi/pdf/10.1145/3589772

[3] RocksDB-Cloud/Rockset: https://github.com/rockset/rocksdb-cloud

...