Motivation

1. Improve the memory efficiency of serializing and deserializing partition locations to and from protobuf.

2. Reduce the size of RPC messages.

Public Interfaces

Add the following new protobuf messages:

Protobuf definitions
message PbPackedPartitionLocations {
  repeated int32 ids = 1;
  repeated int32 epoches = 2;
  repeated int32 workerIds = 3;
  repeated string workerIdsSet = 4;
  repeated bytes mapIdBitMap = 5;
  repeated int32 types = 6;
  repeated int32 mountPoints = 7;
  repeated string mountPointsSet = 8;
  repeated bool finalResult = 9;
  repeated string filePaths = 10;
  repeated int32 availableStorageTypes = 11;
}

message PbPackedPartitionLocationsPair {
  PbPackedPartitionLocations locations = 1;
  repeated int32 replicates = 2;
}

message PbPackedWorkerResource {
  PbPackedPartitionLocationsPair locationPairs = 1;
  string networkLocation = 2;
}

Proposed Changes

  1. Add packed RPC messages to improve memory efficiency.
  2. Add PbPackedPartitionLocationsPair as a new field in PbFileGroup, PbRegisterShuffleResponse, PbReserveSlots, and PbRequestSlotsResponse.
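
The packing idea can be illustrated with a minimal sketch. It assumes, based only on the field names, that each entry in workerIds is an index into workerIdsSet, so that a repeated worker string is stored once. The Python class, the pack and unpack helpers, and the worker strings are hypothetical and cover only a subset of the fields; this is not the actual serialization code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A location here is (partition id, epoch, worker string); a hypothetical simplification.
Location = Tuple[int, int, str]


@dataclass
class PackedLocations:
    """Column-wise stand-in for a subset of PbPackedPartitionLocations."""
    ids: List[int] = field(default_factory=list)
    epoches: List[int] = field(default_factory=list)
    # Assumption: each entry is an index into workerIdsSet.
    workerIds: List[int] = field(default_factory=list)
    # Assumption: each distinct worker string is stored once.
    workerIdsSet: List[str] = field(default_factory=list)


def pack(locations: List[Location]) -> PackedLocations:
    """Store one list per field instead of one message per location."""
    packed = PackedLocations()
    index_of: Dict[str, int] = {}
    for partition_id, epoch, worker in locations:
        if worker not in index_of:
            index_of[worker] = len(packed.workerIdsSet)
            packed.workerIdsSet.append(worker)
        packed.ids.append(partition_id)
        packed.epoches.append(epoch)
        packed.workerIds.append(index_of[worker])
    return packed


def unpack(packed: PackedLocations) -> List[Location]:
    """Rebuild the per-location tuples by resolving worker indexes."""
    return [
        (partition_id, epoch, packed.workerIdsSet[worker_index])
        for partition_id, epoch, worker_index in zip(
            packed.ids, packed.epoches, packed.workerIds
        )
    ]


locations = [
    (0, 0, "host-a:9097"),
    (1, 0, "host-a:9097"),
    (2, 1, "host-b:9097"),
]
packed = pack(locations)
```

With many partitions per worker, the long worker string appears once in workerIdsSet and each location carries only a small integer, which is where the size reduction in this sketch comes from.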

Compatibility, Deprecation, and Migration Plan

  • The old RPC messages will be kept for compatibility.
  • When will we remove the existing behavior?
    • The old RPC messages could be removed after one or two major releases.
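
While both message forms coexist, a receiver could prefer the packed field when it is present and fall back to the old one otherwise. The sketch below shows that pattern with plain dictionaries standing in for decoded messages; the keys packedLocations and locations, and the read_partition_ids helper, are hypothetical names chosen for illustration and do not come from this proposal.

```python
from typing import Any, Dict, List


def read_partition_ids(message: Dict[str, Any]) -> List[int]:
    """Return partition ids, preferring the packed form when it is present."""
    packed = message.get("packedLocations")  # hypothetical new field
    if packed is not None:
        return list(packed["ids"])
    # Fall back to the old per-location form (hypothetical field name).
    return [location["id"] for location in message.get("locations", [])]


# A message from a sender that only knows the old form.
old_message = {"locations": [{"id": 3}, {"id": 4}]}
# A message from a sender that fills the packed form.
new_message = {"packedLocations": {"ids": [3, 4]}}

old_ids = read_partition_ids(old_message)
new_ids = read_partition_ids(new_message)
```

Both calls yield the same ids, so callers do not need to know which form the sender used.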

