...

Use the RealtimeValidationManager to fix LLC segments that have only a single copy. Currently the validation manager fixes only the two most recent segments of each partition. With this change, it will scan the ZK data of all segments; if it finds a segment that does not have a deep storage copy, it will instruct a server to upload the segment to deep storage and then update the segment's location URL with the deep storage URI. We need to evaluate the performance penalty of the additional ZK data access by the validation manager, and we need to deploy optimization techniques to reduce the ZK data access rate:

  1. Run a periodic job to obtain segment ZK data instead of letting the RealtimeValidationManager access it every time it runs; and
  2. If commit servers fail to upload a segment, they will report the failures and record them in ZK.
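
The scan-and-fix pass described above can be sketched roughly as follows. This is a minimal illustration, not the actual RealtimeValidationManager code: the `SegmentMetadata` class, the `fix_segments_missing_deep_store_copy` function, and the `upload_via_server` callback are hypothetical names, and the sketch assumes that a segment without a deep storage copy is recognizable by an empty download URL in its ZK metadata.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class SegmentMetadata:
    """Hypothetical stand-in for a segment's ZK metadata record."""
    name: str
    # Assumption: None (or empty) means no deep storage copy exists yet.
    download_url: Optional[str]


def fix_segments_missing_deep_store_copy(
    segments: Dict[str, SegmentMetadata],
    upload_via_server: Callable[[str], Optional[str]],
) -> List[str]:
    """Scan all segments and repair those without a deep storage copy.

    `upload_via_server` is a hypothetical callback that asks a server hosting
    the segment to upload it, returning the deep storage URI on success or
    None on failure. Returns the names of segments that could not be fixed.
    """
    failed: List[str] = []
    for name, metadata in segments.items():
        if metadata.download_url:
            continue  # already has a deep storage copy
        uri = upload_via_server(name)
        if uri is None:
            failed.append(name)  # left for a later validation run
        else:
            metadata.download_url = uri  # point the segment at deep storage
    return failed
```

Because the pass walks every segment rather than only the two most recent per partition, the `segments` map here would ideally come from a periodically refreshed snapshot rather than a fresh ZK read on each run, in line with the first optimization listed above.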

Segment Download

In our new design, the segment upload to deep storage is asynchronous and best effort. As a result, the server cannot depend only on deep storage. Instead, we allow the server to download the segment from peer servers (although this is not preferred). To discover the servers hosting a given segment, we use the external view of the table to find the latest servers hosting the segment. This has the advantage of keeping the server list in sync with the table state for events such as table rebalance, without extra write/read cost. The complete segment download process works as follows:

...
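
The peer discovery step, which looks up hosting servers in the table's external view, can be sketched as below. This is only an illustration under stated assumptions: the external view is modeled as a plain mapping from segment name to a per-server state map, a server is treated as a usable download source only when its state is `ONLINE`, and the function name `find_peer_servers` is hypothetical.

```python
from typing import Dict, List, Optional

# Assumed shape of the external view: segment name -> {server name -> state}.
ExternalView = Dict[str, Dict[str, str]]


def find_peer_servers(
    external_view: ExternalView,
    segment: str,
    requesting_server: Optional[str] = None,
) -> List[str]:
    """Return the servers that currently host `segment` per the external view.

    Assumption: only servers whose state is "ONLINE" hold a complete copy
    that can be downloaded. The requesting server is excluded so that a
    server never tries to download from itself.
    """
    server_states = external_view.get(segment, {})
    return sorted(
        server
        for server, state in server_states.items()
        if state == "ONLINE" and server != requesting_server
    )
```

Since the list is derived from the external view on each lookup, it follows the table state after events such as a rebalance without any separate bookkeeping of segment locations.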