...

  1. Download the segment from the deep store URI found (if any) in the ZooKeeper property store.
    1. Note: the download is retried with exponential backoff to allow for the time the commit server needs to upload the segment.
  2. If (1) fails, get the external view of the table (from the Controller) to find the ONLINE servers hosting the segment.
  3. Download the segment from a randomly chosen ONLINE server. The download format is a gzipped file of the segment on the source server.
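
The fallback order above can be sketched roughly as follows. This is only an illustration, not Pinot's implementation: `fetch_segment`, `download_from_deep_store`, and `download_from_peer` are hypothetical names, the external view is assumed to be a plain server-to-state mapping for one segment, and the deep store callable is assumed to already apply the retries mentioned in the note.

```python
import random
from typing import Callable, Dict, Optional, Sequence


def fetch_segment(
    deep_store_uri: Optional[str],
    external_view: Dict[str, str],
    download_from_deep_store: Callable[[str], bytes],
    download_from_peer: Callable[[str], bytes],
    choose: Callable[[Sequence[str]], str] = random.choice,
) -> bytes:
    """Hypothetical sketch: try the deep store first, then a random ONLINE peer."""
    # Step 1: use the deep store URI if the property store has one.
    if deep_store_uri:
        try:
            return download_from_deep_store(deep_store_uri)
        except Exception:
            pass  # Deep store download failed; fall back to peer download.

    # Step 2: find the servers that the external view reports as ONLINE.
    online_servers = [s for s, state in external_view.items() if state == "ONLINE"]
    if not online_servers:
        raise RuntimeError("no ONLINE server hosts the segment")

    # Step 3: download the gzipped segment from a randomly chosen ONLINE server.
    return download_from_peer(choose(online_servers))
```

Passing the download functions in as arguments keeps the sketch independent of any particular storage or HTTP client; a real server would plug in its own fetchers.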

Failure cases and handling

  1. The segment upload fails but the preceding metadata commit succeeds
    • In this case, if a server needs to download the segment, it must download it from the commit server, which has a copy of the data.
    • If users want to minimize the chances of downloading from peer servers, the segment completion mode can be set to DEFAULT instead of DOWNLOAD.
    • In the background, RealtimeValidationManager will fix the upload failure by periodically asking the server to upload missing segments.
  2. The segment upload fails and the commit server crashes but the preceding metadata commit succeeds
    • The non-committer server cannot download from the committer server.
    • In DEFAULT segment completion mode, the non-committer server can still try to finish the segment.
    • In DOWNLOAD segment completion mode, the non-committer server will get into ERROR state for the segment.
    • Wait for the RealtimeValidationManager to fix the segment.
  3. The segment upload succeeded but the commit server crashes
    • The non-commit servers can download from the segment store.
  4. The segment upload succeeded but the controller crashes
    • Can be handled similarly to the current failure handling mechanism.
    • If another server is asked to commit and upload the same segment again, let PinotFS handle the segment overwrite.
  5. Another server gets a "download" but the committer has not reached the ONLINE state yet
    • To account for the fact that the metadata commit happens before the segment upload, another server should do retries (with exponential backoff) when downloading.
    • Retrying with a wait between attempts greatly reduces the likelihood of issues caused by this race condition.
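
As a rough illustration of the retry behavior described in case 5, the sketch below retries a download and doubles the wait after each failure. The helper name `retry_with_backoff` and its default values (5 attempts, 1 second initial delay, factor 2) are assumptions made for this example, not values taken from Pinot.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    max_attempts: int = 5,
    initial_delay_s: float = 1.0,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: retry `action`, multiplying the wait by `factor` each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = initial_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last failure to the caller.
            # Wait before the next attempt so the committer has time to
            # finish the upload or reach the ONLINE state.
            sleep(delay)
            delay *= factor
```

With these illustrative defaults, a download that keeps failing waits 1, 2, 4, and 8 seconds between its five attempts, so the race window has to last longer than about 15 seconds before the download gives up.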

...