Motivation
1. Improve the memory efficiency of serializing and deserializing partition locations to and from protobuf.
2. Reduce the size of RPC messages.
Public Interfaces
Add the following new protobuf messages:
protobuf definitions
message PbPackedPartitionLocations {
  repeated int32 ids = 1;
  repeated int32 epoches = 2;
  repeated int32 workerIds = 3;
  repeated string workerIdsSet = 4;
  repeated bytes mapIdBitMap = 5;
  repeated int32 types = 6;
  repeated int32 mountPoints = 7;
  repeated string mountPointsSet = 8;
  repeated bool finalResult = 9;
  repeated string filePaths = 10;
  repeated int32 availableStorageTypes = 11;
}

message PbPackedPartitionLocationsPair {
  PbPackedPartitionLocations locations = 1;
  repeated int32 replicates = 2;
}

message PbPackedWorkerResource {
  PbPackedPartitionLocationsPair locationPairs = 1;
  string networkLocation = 2;
}
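To illustrate why a packed, column-oriented layout can shrink messages, here is a minimal Python sketch of the idea. It is not the actual implementation. It assumes that each entry in workerIds is an index into workerIdsSet, so every distinct worker string is stored only once; the proposal does not spell out these semantics. The PackedLocations class, the pack and unpack helpers, and the "host:port" worker strings are all hypothetical, and only three of the eleven fields are modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PackedLocations:
    """Hypothetical stand-in for a subset of PbPackedPartitionLocations."""
    ids: List[int] = field(default_factory=list)
    epoches: List[int] = field(default_factory=list)
    # Assumption: each entry is an index into worker_ids_set.
    worker_ids: List[int] = field(default_factory=list)
    # Each distinct worker string is stored exactly once.
    worker_ids_set: List[str] = field(default_factory=list)


def pack(rows: List[Tuple[int, int, str]]) -> PackedLocations:
    """Pack (id, epoch, worker) rows into parallel columns plus a string dictionary."""
    packed = PackedLocations()
    index_of: Dict[str, int] = {}
    for loc_id, epoch, worker in rows:
        if worker not in index_of:
            # First time this worker is seen: append it to the dictionary.
            index_of[worker] = len(packed.worker_ids_set)
            packed.worker_ids_set.append(worker)
        packed.ids.append(loc_id)
        packed.epoches.append(epoch)
        packed.worker_ids.append(index_of[worker])
    return packed


def unpack(packed: PackedLocations) -> List[Tuple[int, int, str]]:
    """Rebuild the original rows by resolving each worker index."""
    return [
        (loc_id, epoch, packed.worker_ids_set[worker_index])
        for loc_id, epoch, worker_index in zip(
            packed.ids, packed.epoches, packed.worker_ids
        )
    ]


# Hypothetical sample data: three locations spread over two workers.
rows = [(0, 0, "host-a:9097"), (1, 0, "host-b:9097"), (2, 1, "host-a:9097")]
packed = pack(rows)
restored = unpack(packed)
```

With many partition locations per worker, the repeated worker strings collapse into small integer indexes, which is where most of the saving in this sketch comes from. The same pattern would presumably apply to mountPoints and mountPointsSet.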
Proposed Changes
- Add packed RPC messages to improve memory efficiency.
- Add PbPackedPartitionLocationsPair as a new field in PbFileGroup, PbRegisterShuffleResponse, PbReserveSlots, and PbRequestSlotsResponse.
Compatibility, Deprecation, and Migration Plan
- Old RPC messages will be kept.
- When will we remove the existing behavior?
- The old RPC messages can likely be removed after one or two major releases.
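While both formats coexist, a receiver has to handle either one. The Python sketch below shows one possible approach under stated assumptions: prefer the packed field when it is present and fall back to the legacy field otherwise. Plain dictionaries stand in for decoded protobuf messages, and the keys "packedPartitionLocationsPair" and "partitionLocations" as well as the read_location_ids helper are hypothetical names chosen for illustration, not fields defined by this proposal.

```python
from typing import Any, Dict, List, Optional


def read_location_ids(response: Dict[str, Any]) -> List[int]:
    """Return partition location ids, preferring the packed field when present."""
    packed: Optional[Dict[str, Any]] = response.get("packedPartitionLocationsPair")
    if packed is not None:
        # New-style sender: ids are stored as one packed column.
        return list(packed["locations"]["ids"])
    # Old-style sender: one entry per partition location (hypothetical legacy shape).
    return [loc["id"] for loc in response.get("partitionLocations", [])]


# Hypothetical responses from a new-style and an old-style sender.
new_style = {"packedPartitionLocationsPair": {"locations": {"ids": [0, 1, 2]}}}
old_style = {"partitionLocations": [{"id": 0}, {"id": 1}, {"id": 2}]}

ids_from_new = read_location_ids(new_style)
ids_from_old = read_location_ids(old_style)
```

Once the old messages are removed, the fallback branch in a reader like this could be dropped.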