Document the state by adding a label to the CIP page with one of "discussion", "accepted", "released", "rejected".
Discussion thread | |
---|---|
Vote thread | |
JIRA | |
Release | |
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
- Decouple the partition data writer from the file-writing logic.
- Support future needs for different storage types.
Public Interfaces
- Add a proxy to hide the eviction logic from the partition data writer.
    CelebornFileProxy {
        CelebornFile currentFile;
        FileInfo currentFileInfo;

        void write(ByteBuf buf) {
            evict(false);
            currentFile.write(buf);
        }

        // force == true means this eviction is triggered by the memory manager
        void evict(boolean force) {
            if (currentFile.needEvict() || force) {
                CelebornFile nFile = StoragePolicy.getEvictedFile(currentFile);
                currentFile.evict(nFile);
                currentFile = nFile;
            }
        }

        void close();
    }
- Add StoragePolicy
    celeborn.worker.storagePolicy.evictPolicy: (MEMORY,(LOCAL|DFS)),(LOCAL,DFS)
    celeborn.worker.storagePolicy.createFilePolicy: LOCAL,MEMORY,DFS
    celeborn.worker.storagePolicy.evictTrigger: SIZE

    StoragePolicy {
        CelebornFile getEvictedFile(CelebornFile file);
        CelebornFile createFile();
    }
The config "celeborn.worker.storagePolicy.evictPolicy" defines the eviction order.
- Add a file abstraction and implement it for different storage types such as memory, disk, and DFS.
    interface CelebornFile {
        FileInfo fileInfo;
        void write(ByteBuf buf);
        boolean needEvict();
        void evict(CelebornFile file);
        void close();
    }

    CelebornMemoryFile implements CelebornFile
    DiskMemoryFile implements CelebornFile
    DFSMemoryFile implements CelebornFile
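The interplay between the proxy, the policy, and the file abstraction can be illustrated with a small runnable sketch. This is only an illustration under stated assumptions, not the proposed implementation: `SketchFile`, `SimulatedFile`, `MemoryThenLocalPolicy`, and `SketchFileProxy` are hypothetical names, `byte[]` stands in for `ByteBuf`, every tier is simulated in memory, and the behavior when no further tier exists (stay on the current file) is an assumption the proposal does not specify.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for CelebornFile; byte[] replaces ByteBuf to avoid a Netty dependency.
interface SketchFile {
    void write(byte[] buf);
    boolean needEvict();
    void evict(SketchFile target); // move this file's content into target
    String storageType();
    byte[] content();
}

// One in-memory implementation simulates every tier; a real tier would write to disk or DFS.
class SimulatedFile implements SketchFile {
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();
    private final String type;
    private final int evictThreshold; // bytes; 0 or less means "never asks to be evicted"

    SimulatedFile(String type, int evictThreshold) {
        this.type = type;
        this.evictThreshold = evictThreshold;
    }

    public void write(byte[] buf) { data.write(buf, 0, buf.length); }
    public boolean needEvict() { return evictThreshold > 0 && data.size() >= evictThreshold; }
    public void evict(SketchFile target) {
        target.write(data.toByteArray());
        data.reset();
    }
    public String storageType() { return type; }
    public byte[] content() { return data.toByteArray(); }
}

// Hypothetical policy: files start in MEMORY and are evicted to LOCAL; LOCAL has no further tier.
class MemoryThenLocalPolicy {
    private final int memoryThreshold;
    MemoryThenLocalPolicy(int memoryThreshold) { this.memoryThreshold = memoryThreshold; }

    SketchFile createFile() { return new SimulatedFile("MEMORY", memoryThreshold); }

    SketchFile getEvictedFile(SketchFile current) {
        return "MEMORY".equals(current.storageType()) ? new SimulatedFile("LOCAL", 0) : null;
    }
}

class SketchFileProxy {
    private final MemoryThenLocalPolicy policy;
    private SketchFile currentFile;

    SketchFileProxy(MemoryThenLocalPolicy policy) {
        this.policy = policy;
        this.currentFile = policy.createFile();
    }

    void write(byte[] buf) {
        evict(false); // check before writing, as in the proposal
        currentFile.write(buf);
    }

    void evict(boolean force) {
        if (currentFile.needEvict() || force) {
            SketchFile next = policy.getEvictedFile(currentFile);
            if (next == null) return; // assumption: stay put when no tier remains
            currentFile.evict(next);
            currentFile = next;
        }
    }

    SketchFile currentFile() { return currentFile; }
}

class ProxySketchDemo {
    public static void main(String[] args) {
        SketchFileProxy proxy = new SketchFileProxy(new MemoryThenLocalPolicy(4));
        proxy.write("abc".getBytes(StandardCharsets.UTF_8));
        proxy.write("de".getBytes(StandardCharsets.UTF_8));
        // MEMORY now holds 5 bytes, so the next write evicts to LOCAL first.
        proxy.write("f".getBytes(StandardCharsets.UTF_8));
        System.out.println(proxy.currentFile().storageType());
        System.out.println(new String(proxy.currentFile().content(), StandardCharsets.UTF_8));
    }
}
```

The caller only ever talks to the proxy. Because the eviction check runs before each write, a file may exceed its threshold by one write before it is moved.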
Proposed Changes
- Move the writing logic out of partitionDataWriter and hide the details of different storage types.
- The partition data writer will only need to pass data to CelebornFileProxy. The proxy will handle the eviction logic.
- The implementations of the file abstraction layer will perform the actual writes.
- Enable eviction for all shuffle files.
- Add a storage policy.
- Support customizing the file creation priority.
- Support customizing the eviction priority.
- For example, (MEMORY, HDD, HDFS) means that a memory shuffle file can be evicted to local disk or HDFS, with local disk preferred. (HDD, HDFS) means that local shuffle files can be evicted to HDFS.
- Evolution
- Extend CelebornFileProxy so that a partition location can be stored on different storage types.
- Simplify the storage manager logic for managing different writers.
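The eviction priority from the sample above can be resolved roughly as follows. This is a minimal sketch under assumptions, not the proposed parser: the class `EvictPolicySketch` and its method `nextTier` are hypothetical, and it only handles the flat sample form such as "(MEMORY, HDD, HDFS), (HDD, HDFS)", where the first entry of each group is the source tier and the remaining entries are targets in order of preference. It does not parse the nested form shown for "celeborn.worker.storagePolicy.evictPolicy".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps each source tier to its eviction targets in preference order.
class EvictPolicySketch {
    private final Map<String, List<String>> targets = new LinkedHashMap<>();

    EvictPolicySketch(String spec) {
        // Each "(A, B, C)" group reads as: files on A may be evicted to B, else to C.
        Matcher m = Pattern.compile("\\(([^()]*)\\)").matcher(spec);
        while (m.find()) {
            List<String> tiers = new ArrayList<>();
            for (String part : m.group(1).split(",")) {
                String tier = part.trim();
                if (!tier.isEmpty()) {
                    tiers.add(tier);
                }
            }
            if (tiers.size() < 2) {
                throw new IllegalArgumentException("Group needs a source and a target: " + m.group());
            }
            targets.put(tiers.get(0), new ArrayList<>(tiers.subList(1, tiers.size())));
        }
    }

    // Returns the most preferred target tier that is currently available, if any.
    Optional<String> nextTier(String source, Set<String> available) {
        for (String tier : targets.getOrDefault(source, Collections.emptyList())) {
            if (available.contains(tier)) {
                return Optional.of(tier);
            }
        }
        return Optional.empty();
    }
}

class EvictPolicyDemo {
    public static void main(String[] args) {
        EvictPolicySketch policy = new EvictPolicySketch("(MEMORY, HDD, HDFS), (HDD, HDFS)");
        Set<String> all = new HashSet<>(Arrays.asList("HDD", "HDFS"));
        Set<String> onlyHdfs = new HashSet<>(Arrays.asList("HDFS"));
        System.out.println(policy.nextTier("MEMORY", all));      // local disk is preferred
        System.out.println(policy.nextTier("MEMORY", onlyHdfs)); // falls back to HDFS
        System.out.println(policy.nextTier("HDFS", all));        // no rule for HDFS, so empty
    }
}
```

Passing the set of available tiers keeps the preference order in the configuration while letting the worker skip a tier that is missing or full.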
Compatibility, Deprecation, and Migration Plan
- What impact (if any) will there be on existing users?
This change won't affect existing users.
- If we are changing behavior how will we phase out the older behavior?
This change won't break the compatibility guarantees of the Celeborn community.
- If we need special migration tools, describe them here.
Not needed.
- When will we remove the existing behavior?
Test Plan
Describe in a few sentences how the CIP will be tested. We are mostly interested in system tests (since unit tests are specific to implementation details). How will we know that the implementation works as expected? How will we know nothing broke?
This CIP will be tested on a cluster and with unit tests.
Rejected Alternatives
If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.