Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

We can define a few invariants to help narrow the design:

  • Only a single record transaction may exist at a time
  • A transaction cannot be interleaved with other records outside the transaction
  • All scheduled writes to Raft must be atomic, including the two marker records

...

As the controller sends records to Raft, it will optimistically apply the record to its in-memory state. The record is not considered durable (i.e., "committed") until the high watermark of the Raft log increases beyond the offset of the record. Once a record is durable, it is made visible outside the controller to external clients. Since we are defining transactions that may span across Raft commits, we must extend our criteria for visibility to include whole transactions. When the controller sees that a partial transaction has been committed, it must not expose it the records until the entire transaction is committed. This is how the controller will provide atomicity for a transaction.

...

Another aspect of atomicity for transactions is dealing with controller failover. If a transaction has not been completed when a controller crashes, there will be a partial transaction committed to the log. The remaining records for this partial transaction are forever lost, and so we must abort the entire transaction. This will be done by the newly elected controller inserting an AbortMarker.

Detecting Since we only allow for a single transaction at a time, detecting a partial transaction during failover is easily determined by examining the log. An incomplete transaction is would only expected during failover and would occur at the end of the log.

...

There will be control records in the Raft log after this partial transaction (e.g., records pertaining to Raft election), but these are not exposed to the controller as records. To abort this transaction, the controller will write a AbortMarkerRecord to the log.

Broker Support

The records fetched by the brokers will include everything up to the Raft high watermark. This will include partial transactions. The broker must buffer these records and not make them visible to its internal components. Once an EndMarker or AbortMarker is seen, the buffered records can be processed or discarded, respectively. 

...