...

  1. Download the segment from the deep store URI found (if any) in the property store of the ZK server.
    1. Note: the download is retried with exponential backoff to allow for the time the committing server needs to upload the segment.
  2. If (1) fails, get the external view of the table (from the Controller) to find the ONLINE servers hosting the segment.
  3. Download the segment from a randomly chosen ONLINE server. The download format is a gzipped file of the segment on the source server.
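
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the actual Pinot implementation: the `Fetcher` interface, the class and method names, and the retry settings (3 attempts, 10 ms base delay, 1 s cap) are all assumptions made for the example, and the list of ONLINE peer URIs is assumed to have already been derived from the external view.

```java
import java.util.List;
import java.util.Optional;
import java.util.Random;

final class SegmentDownloadFlowSketch {
  /** Hypothetical stand-in for the real segment fetcher. */
  interface Fetcher {
    byte[] fetch(String uri) throws Exception;
  }

  // Illustrative retry settings; these are not values from the design.
  static final int MAX_ATTEMPTS = 3;
  static final long BASE_DELAY_MILLIS = 10;
  static final long MAX_DELAY_MILLIS = 1_000;

  /** Delay after failed attempt `attempt` (0-based): base * 2^attempt, capped at maxMillis. */
  static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
    int shift = Math.min(attempt, 20); // bound the shift so small bases cannot overflow
    return Math.min(baseMillis << shift, maxMillis);
  }

  /** Step 1: try the deep store URI several times, waiting longer after each failure. */
  static Optional<byte[]> fetchWithRetries(Fetcher fetcher, String uri) {
    for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
      try {
        return Optional.of(fetcher.fetch(uri));
      } catch (Exception e) {
        if (attempt + 1 == MAX_ATTEMPTS) {
          break; // out of attempts
        }
        try {
          Thread.sleep(backoffMillis(attempt, BASE_DELAY_MILLIS, MAX_DELAY_MILLIS));
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt(); // preserve the interrupt and give up
          break;
        }
      }
    }
    return Optional.empty();
  }

  /** Steps 1 to 3: deep store first, then one randomly chosen ONLINE server. */
  static Optional<byte[]> download(Optional<String> deepStoreUri, List<String> onlinePeerUris,
      Fetcher fetcher, Random random) {
    if (deepStoreUri.isPresent()) {
      Optional<byte[]> fromDeepStore = fetchWithRetries(fetcher, deepStoreUri.get());
      if (fromDeepStore.isPresent()) {
        return fromDeepStore;
      }
    }
    if (onlinePeerUris.isEmpty()) {
      return Optional.empty(); // no ONLINE server hosts the segment
    }
    String peerUri = onlinePeerUris.get(random.nextInt(onlinePeerUris.size()));
    try {
      return Optional.of(fetcher.fetch(peerUri));
    } catch (Exception e) {
      return Optional.empty();
    }
  }
}
```

Passing the `Random` in keeps the peer choice reproducible in tests. The sketch tries only one peer, as the steps describe; whether to try further peers after a failure is left open here.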

API changes

  1. A new API for segment download from a Pinot server (via the server Admin API port)
    • URI path:  /tables/{tableName}/segments/{segmentName}
    • Usage: Download a realtime or offline table segment as a zipped tar file.
    • Code location:  TablesResource
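
As an illustration of how a client might address this endpoint, the sketch below builds the download URI from a scheme, a server host, its Admin API port, and the URI path given above. The class and method names are hypothetical, and the host, port, table, and segment names in the usage note are placeholder values, not part of the proposal.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class PeerSegmentUriSketch {
  /** Builds scheme://host:adminPort/tables/{tableName}/segments/{segmentName}. */
  static URI peerDownloadUri(String scheme, String host, int adminPort,
      String tableName, String segmentName) {
    if (!"http".equals(scheme) && !"https".equals(scheme)) {
      throw new IllegalArgumentException("Unsupported scheme: " + scheme);
    }
    String path = "/tables/" + tableName + "/segments/" + segmentName;
    try {
      // The multi-argument constructor quotes characters that are illegal in a path.
      return new URI(scheme, null, host, adminPort, path, null, null);
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Cannot build a download URI for " + segmentName, e);
    }
  }
}
```

For example, `peerDownloadUri("http", "server1", 8097, "myTable_REALTIME", "mySegment")` yields a URI whose path is `/tables/myTable_REALTIME/segments/mySegment`.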

Config change

  1. Enable best effort segment upload and download from peer servers.

Option 1: Add a new optional string field peerSegmentDownloadScheme to the SegmentsValidationAndRetentionConfig in the TableConfig. The value can be http or https.

During segment completion phase,

  1. SplitSegmentCommitter will check this value. If it exists, the segment committer will be able to finish the segment commit successfully even if the upload to the segment store fails. The committer will report to the controller that the segment is available on peer servers.
  2. When Pinot servers in LLRealtimeSegmentDataManager fail to download segments from the segment store during the goOnlineFromConsuming() transition, they also check this field's value. If it exists, they can:
    1. First discover the URI u of a server hosting the segment.
    2. Construct the complete URI using the configured scheme (http or https) and use the appropriate segment fetcher to download the segment.
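
The committer-side check in step 1 amounts to a small decision table, sketched below. The enum, class, and method names are hypothetical and only illustrate the rule stated above; how the committer actually reports peer availability to the controller is not specified here.

```java
import java.util.Optional;

final class CommitOutcomeSketch {
  /** Hypothetical outcomes of a segment commit attempt. */
  enum Outcome {
    COMMITTED_WITH_SEGMENT_STORE_COPY, // upload succeeded, segment store URI is available
    COMMITTED_PEER_ONLY,               // upload failed, segment is served by peer servers
    FAILED                             // upload failed and peer download is not enabled
  }

  /**
   * @param uploadSucceeded whether the upload to the segment store succeeded
   * @param peerSegmentDownloadScheme the table-level field; empty when not configured
   */
  static Outcome decide(boolean uploadSucceeded, Optional<String> peerSegmentDownloadScheme) {
    if (uploadSucceeded) {
      return Outcome.COMMITTED_WITH_SEGMENT_STORE_COPY;
    }
    // Best effort: a configured scheme lets the commit finish without a segment store copy.
    return peerSegmentDownloadScheme.isPresent() ? Outcome.COMMITTED_PEER_ONLY : Outcome.FAILED;
  }
}
```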

Note that this is a table-level config. We will test the new download behavior on realtime tables incrementally. Once fully proven, it can be promoted to a server-level config.


Option 2:

  • Add a new optional boolean field enablePeerSegmentDownload to the SegmentsValidationAndRetentionConfig in the TableConfig
  • Add a new file scheme called SERVER. When SegmentFetcherFactory sees this scheme in the server instance file system config, it will initialize a new kind of segment fetcher, PeerServerSegmentFetcher, which, given a segment name, handles both server discovery and segment download. The PeerServerSegmentFetcher can be further configured to use the http or https fetcher for fetching segments from peers.

During segment completion phase,

  1. SplitSegmentCommitter will check this value and behave exactly as in Option 1.
  2. When Pinot servers in LLRealtimeSegmentDataManager fail to download segments from the segment store during the goOnlineFromConsuming() transition, they can use the PeerServerSegmentFetcher (if the SERVER scheme is configured) to discover and download the segment from peers.
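
The shape of Option 2 could look roughly like the sketch below: a scheme-keyed registry standing in for SegmentFetcherFactory, and a PeerServerSegmentFetcher that takes a segment name, asks a discovery function for peer download URIs, and delegates the transfer to an http or https fetcher. The simplified `SegmentFetcher` interface, the `FetcherRegistry` class, and the discovery function are assumptions for illustration and do not mirror Pinot's real interfaces. Trying the discovered peers in order until one succeeds is also an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

final class PeerFetcherSketch {
  /** Hypothetical, simplified fetcher contract: resolve a target string to segment bytes. */
  interface SegmentFetcher {
    byte[] fetch(String target);
  }

  /** Given a segment name, discovers peer URIs and downloads through a delegate fetcher. */
  static final class PeerServerSegmentFetcher implements SegmentFetcher {
    private final Function<String, List<String>> peerDiscovery; // segment name -> peer download URIs
    private final SegmentFetcher delegate;                      // the http or https fetcher

    PeerServerSegmentFetcher(Function<String, List<String>> peerDiscovery, SegmentFetcher delegate) {
      this.peerDiscovery = peerDiscovery;
      this.delegate = delegate;
    }

    @Override
    public byte[] fetch(String segmentName) {
      RuntimeException lastFailure = null;
      for (String peerUri : peerDiscovery.apply(segmentName)) {
        try {
          return delegate.fetch(peerUri);
        } catch (RuntimeException e) {
          lastFailure = e; // remember the failure and try the next peer
        }
      }
      throw new IllegalStateException("No peer could serve segment " + segmentName, lastFailure);
    }
  }

  /** Minimal scheme -> fetcher registry, standing in for SegmentFetcherFactory. */
  static final class FetcherRegistry {
    private final Map<String, SegmentFetcher> fetchers = new HashMap<>();

    void register(String scheme, SegmentFetcher fetcher) {
      fetchers.put(scheme.toLowerCase(Locale.ROOT), fetcher);
    }

    SegmentFetcher forScheme(String scheme) {
      SegmentFetcher fetcher = fetchers.get(scheme.toLowerCase(Locale.ROOT));
      if (fetcher == null) {
        throw new IllegalArgumentException("No fetcher registered for scheme " + scheme);
      }
      return fetcher;
    }
  }
}
```

Compared with Option 1, this keeps discovery and download behind one fetcher, so the caller in the goOnlineFromConsuming() path only needs the segment name and the SERVER scheme.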

Failure cases and handling

...