...
- Interface and API change
- A new Pinot Server API in TablesResource.java for segment download (PR 4914).
- Start with a straightforward approach to first gzip the segment directory for each request and send it back.
- Need a performance test here to measure the impact on server query performance during segment download.
- A new Pinot Server API in TablesResource.java for uploading a segment to a configured deep store. (PR Pending).
- Used by RealtimeValidationManager to ask servers to update
- Server changes for segment completion (PR 4914)
- Add a new config parameter _enableSegmentUploadToController to IndexLoadingConfig.java (default true for backward compatibility) to control whether servers upload segments to the controller.
- When set to true, the segment completion behavior is exactly the same as in the current code.
- When set to false, the server does not upload the built segment to the controller and instead completes the commit by invoking a new controller endpoint (see controller changes below).
- Add a new config segment.store.root.dir to HelixInstanceDataManagerConfig.java to configure the deep store root directory if necessary.
- In SplitSegmentCommitter.java, if _enableSegmentUploadToController is set to false, skip the synchronous segmentCommitUpload of the segment to the controller during split commit.
- In LLRealtimeSegmentDataManager.java, after the segment commit completes with a successful metadata upload in (c), and assuming this implies that _enableSegmentUploadToController is false:
- If there is a configured deep store (checked via segment.store.root.dir), the server does a best-effort upload of the built segment to the configured external store, i.e., there is no retry even if the upload fails.
- Otherwise just continue.
- Consuming to online Helix state transition: refactor the server download procedure so that it
- first tries to download the segment from the configured segment store if available; this is done with retries and backoff to avoid a race condition with the uploading server; otherwise
- discovers the servers hosting the segment via the external view of the table and then downloads it from a hosting server.
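The refactored download procedure can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the class `SegmentDownloadSketch`, the `Fetcher` interface, and the `maxAttempts` and `initialBackoffMs` parameters are all hypothetical names. It also assumes that the fallback to hosting servers applies both when no deep store is configured and when every deep store attempt fails.

```java
import java.util.List;

final class SegmentDownloadSketch {
  /** Hypothetical fetch abstraction: returns the local path of the downloaded segment, or throws. */
  interface Fetcher {
    String fetch(String source) throws Exception;
  }

  static String download(String deepStoreUri, List<String> hostingServers, Fetcher fetcher,
      int maxAttempts, long initialBackoffMs) {
    // Step 1: if a deep store URI is known, try it with retries and exponential backoff.
    // The retries give a concurrently uploading server time to finish its upload.
    if (deepStoreUri != null && !deepStoreUri.isEmpty()) {
      long backoffMs = initialBackoffMs;
      for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
          return fetcher.fetch(deepStoreUri);
        } catch (Exception e) {
          if (attempt < maxAttempts) {
            pause(backoffMs);
            backoffMs *= 2;
          }
        }
      }
    }
    // Step 2 (assumed fallback): try the servers that host the segment,
    // as discovered from the table's external view by the caller.
    for (String server : hostingServers) {
      try {
        return fetcher.fetch(server);
      } catch (Exception e) {
        // Ignore and try the next hosting server.
      }
    }
    throw new IllegalStateException("Segment unavailable from deep store and all hosting servers");
  }

  private static void pause(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting to retry", e);
    }
  }
}
```

In this sketch the caller resolves the list of hosting servers before calling `download`; a real implementation might defer that lookup until the deep store attempts are exhausted.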
- Controller changes for segment completion (PR 4914)
- Add a new query parameter enableSegmentUpload to LLCSegmentCompletionHandlers.java's segmentCommitEndWithMetadata() to indicate whether the server uploads the segment to the controller during segment completion (default true to maintain backward compatibility).
- If the value is false, there is no need for the controller to move the segment file to its permanent location in a split commit.
- The change in (a) implies that the controller now accepts segment commit metadata with an empty (but not null) deep storage URI.
- Related to (a), add a new param _uploadToControllerEnabled to SegmentCompletionProtocol.java's Params class.
- In PinotLLCRealtimeSegmentManager.java's updateCommittingSegmentZKMetadata(), when calling setDownloadUrl for the segment, use the descriptor's segment location instead of the VIP address.
- A related change here is to update the segment location field of the committing segment's descriptor after the segment is moved during a split commit.
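As a rough illustration of how the controller-side pieces fit together, the sketch below resolves the download URL for a committing segment. The `Descriptor` and `SegmentMover` types and the `resolveDownloadUrl` method are hypothetical stand-ins, not the actual Pinot classes; only the behavior (skip the move when upload is disabled, accept an empty but non-null location, take the URL from the descriptor) comes from the list above.

```java
final class CommitEndSketch {
  /** Hypothetical stand-in for the committing segment's descriptor. */
  static final class Descriptor {
    String segmentLocation;

    Descriptor(String segmentLocation) {
      this.segmentLocation = segmentLocation;
    }
  }

  /** Hypothetical hook that moves the segment file and returns its permanent location. */
  interface SegmentMover {
    String moveToPermanentLocation(String currentLocation);
  }

  static String resolveDownloadUrl(Descriptor descriptor, boolean uploadToControllerEnabled,
      SegmentMover mover) {
    // An empty location is accepted (no deep store copy yet); a null one is rejected.
    if (descriptor.segmentLocation == null) {
      throw new IllegalArgumentException("Segment location must not be null");
    }
    if (uploadToControllerEnabled) {
      // Split commit with upload: move the file, then record the new location on the descriptor.
      descriptor.segmentLocation = mover.moveToPermanentLocation(descriptor.segmentLocation);
    }
    // The download URL comes from the descriptor, not from the controller VIP address.
    return descriptor.segmentLocation;
  }
}
```

With upload disabled, the returned URL may be empty, which is the case the validation manager is later expected to repair.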
- RealtimeValidationManager (PR Pending)
- During segment validation, for any segment without an external storage URI, ask one server to upload it to the store and update the segment store URI in Helix.
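The validation pass described above might look something like the following sketch. All names here (`DeepStoreRepairSketch`, `UploadClient`, `repair`) are hypothetical, the in-memory maps stand in for segment metadata and the external view, and two behaviors are assumptions rather than part of the design: the first hosting server is the one asked, and a failed upload is simply left for a later validation run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DeepStoreRepairSketch {
  /** Hypothetical client for the server upload API: returns the deep store URI of the uploaded segment. */
  interface UploadClient {
    String uploadToDeepStore(String server, String segmentName) throws Exception;
  }

  /**
   * For every segment whose download URL is missing, asks one hosting server to upload it
   * and records the returned URI. Returns the names of the segments that were repaired.
   */
  static List<String> repair(Map<String, String> downloadUrlBySegment,
      Map<String, List<String>> hostingServersBySegment, UploadClient client) {
    List<String> repaired = new ArrayList<>();
    for (Map.Entry<String, String> entry : downloadUrlBySegment.entrySet()) {
      String url = entry.getValue();
      if (url != null && !url.isEmpty()) {
        continue; // Segment already has a deep store copy.
      }
      List<String> servers = hostingServersBySegment.getOrDefault(entry.getKey(), List.of());
      if (servers.isEmpty()) {
        continue; // No server currently hosts this segment.
      }
      try {
        // Ask a single server (assumption: the first one listed) to upload the segment.
        String uploadedUri = client.uploadToDeepStore(servers.get(0), entry.getKey());
        entry.setValue(uploadedUri);
        repaired.add(entry.getKey());
      } catch (Exception e) {
        // Assumption: leave the segment as-is so a later validation run can try again.
      }
    }
    return repaired;
  }
}
```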
...