Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Tiered Storage: DFS as complimentary when the local disk is out of space. While initially appealing due to its intuitive nature and potential performance benefits within local capacity, the tiered storage solution with local disk overflow to remote storage ultimately proves unsuitable for disaggregated state management in Flink.

Here's why:

  1. Heavy Checkpointing Procedure: A considerable amount of files need to be uploaded during checkpointing.
  2. Limited Data Structure Flexibility: Confining local disk data to the SST format restricts potential performance gains from alternative caching structures.
  3. Inaccurate Warm/Cold Distinction: File-level classification of data as warm or cold inaccurately reflects actual access patterns, leading to suboptimal resource allocation.
  4. More Complicated File Management: This architecture indicates that both local disk and DFS play part of the primary storage, hence needs to unify the file management of the local disk and DFS, which is complicated in extreme cases of error handling e.t.c. 

Appendix: How to run the PoC

...