Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Our new approach relies on the fact that "segmentLocation" is now a list of different HTTP end points. Any one location should be sufficient to download this segment (needless to say if one location is unavailable, we can retry using the next location). Exactly how this segment is persisted and when can it transition to ONLINE state can be done in 2 different ways:

1a) No Quorum:

  • The commit leader becomes the owner of the segment being committed. It will make this segment available using a HTTP end point.
  • Commit leader will inform the controller using a similar 'COMMIT' API. This will update the segment location to this HTTP end point.
  • Similarly other replicas will keep updating this 'segment location' via the Controller.

...

  • If the only server that saves a copy of this segment is destroyed and at the same time the other replicas are unable to catch up, then we may completely lose this segment. In this case, we need to do a segment recovery (either from Kafka or from an offline source). One way to get around this is to upload the segment to the controller as it happens in the current protocol and expose this segment via some HTTP URL. This gives additional redundancy without drastically changing the protocol.

2b) Quorum:

We can further improve the availability of this design by requiring the controller to wait for a quorum of servers to finish ingesting the current segment before updating segment metadata and transitioning on an ONLINE state. In this case, the Segment metadata (segment location) will point to a list of HTTP endpoints (for the different replicas that successfully ingested this segment).

...