Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

One major advantage of Flink is its efficient and easy-to-use state management mechanism. However, this mechanism has not evolved much since its introduction and has not kept pace with the demands of the cloud-native era. In this FLIP, we revisit the current architecture of Flink's state management model and propose an augmented, disaggregated alternative.

...

High-Level Overview and Design

Figure 2: The Disaggregated Model of State Management

...

Fast Checkpoint: Since most state files already reside in DFS, only small incremental updates need to be uploaded during checkpointing, drastically reducing network transfer time. In addition, both working state and checkpoints can reference the same underlying physical files, eliminating duplication. This saves storage space and further accelerates checkpoints.
Restore: With DFS serving as the primary storage, downloading large state files to local disks is avoided, significantly reducing restore time. Local disks (cache) can be gradually warmed up after the job starts, further optimizing performance.
Rescale: Rescaling leverages existing solutions like ClipDB/IngestDB to accelerate rebuilding the state store directly on DFS. Note that since file downloads are eliminated, local disk capacity is no longer a constraint when downscaling.
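
The checkpoint and restore behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the FLIP's implementation: the class `DisaggregatedStore`, the `restore` function, and file names such as `sst-3` are hypothetical, and file sets stand in for real DFS contents.

```python
from dataclasses import dataclass, field


@dataclass
class DisaggregatedStore:
    """Toy model (hypothetical): state files live on DFS as primary storage."""

    dfs_files: set = field(default_factory=set)      # files already on DFS
    pending_files: set = field(default_factory=set)  # new files not yet uploaded

    def write(self, file_name: str) -> None:
        # A newly produced state file stays pending until the next checkpoint.
        if file_name not in self.dfs_files:
            self.pending_files.add(file_name)

    def checkpoint(self) -> tuple:
        """Upload only the pending files; return (uploaded, manifest)."""
        uploaded = sorted(self.pending_files)
        self.dfs_files |= self.pending_files
        self.pending_files = set()
        # The checkpoint is a set of references to the same physical files
        # that the working state uses, not a second copy of them.
        manifest = frozenset(self.dfs_files)
        return uploaded, manifest


def restore(manifest: frozenset) -> DisaggregatedStore:
    # Restore re-attaches to the referenced DFS files; nothing is downloaded.
    return DisaggregatedStore(dfs_files=set(manifest))
```

In this model, a store that already holds `sst-1` and `sst-2` on DFS and then writes `sst-3` uploads only `sst-3` at checkpoint time, and a restored store starts with all three files available and nothing pending.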

Figure 4: Checkpoint/Restore/Rescale Mechanism (Current Model on the left; Disaggregated Model on the right)

...

While the asynchronous execution model (FLIP-425) and network I/O grouping (FLIP-426) have significantly enhanced performance, the PoC access latency results in Table 1 reveal that direct access to remote storage (DFS) remains 95% slower than local disk on average. Therefore, efficiently utilizing available local disks is crucial to maximize overall performance. In this context, local disks act as an optional secondary cache.
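
The role of local disk as an optional secondary cache can be sketched as a read-through cache in front of the remote store. This is a minimal illustration only: the class `LocalDiskCache`, the LRU eviction policy, and the use of a plain dictionary in place of DFS are all assumptions made for the example, not the caching design proposed in this FLIP.

```python
from collections import OrderedDict


class LocalDiskCache:
    """Hypothetical LRU read-through cache in front of a slower remote store."""

    def __init__(self, remote: dict, capacity: int):
        self.remote = remote        # stands in for DFS (primary storage)
        self.capacity = capacity    # max cached entries; 0 disables the cache
        self.local = OrderedDict()  # stands in for local disk
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.local:
            # Cache hit: mark the entry as most recently used.
            self.local.move_to_end(key)
            self.hits += 1
            return self.local[key]
        # Cache miss: read from the remote store, then warm the local cache.
        self.misses += 1
        value = self.remote[key]
        if self.capacity > 0:
            self.local[key] = value
            if len(self.local) > self.capacity:
                # Evict the least recently used entry.
                self.local.popitem(last=False)
        return value
```

With `capacity=0` every read goes to the remote store, which mirrors the point that the local cache is optional: correctness never depends on it, only latency does.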

...

To optimize CPU efficiency across diverse scenarios, we are actively exploring an "Adaptive Local Cache" capable of intelligently transitioning between the aforementioned caching solutions based on workload characteristics. As depicted in Figure 5, this solution aims to achieve optimal performance regardless of the prevailing conditions. Initial testing shows that with an adaptive local cache, we can achieve at least the same performance as long as the state fits on the local disk. More details will be revealed in future FLIP(s).

Figure 5: Adaptive Local Cache

Remote Compaction (FLIP-430)

...

mvn clean package -DskipTests -Dcheckstyle.skip=true -Dspotless.check.skip=true -Denforcer.skip=true -Drat.skip=true -Djapicmp.skip=true

...
