
Audience: All Cassandra Users and Developers
User Impact: Support for fast general purpose transactions
Whitepaper: Accord


Current state: Accepted

Discussion thread: ,




Users must expend significant effort to modify their database consistently while maintaining scalability. Even simple transactions involving more than one partition may become complex and error-prone, because a distributed state machine must be built atop the database. Conversely, packing all of the state into one partition is not scalable.

Performance also remains an issue, despite recent Paxos improvements: latency is still twice its theoretical minimum over the wide area network, and suffers particularly badly under contention.

This work aims to add support for fast general purpose transactions to Cassandra: transactions that may operate atomically over any set of keys in the database, modifying their contents at once, with any action conditional on the existing contents of any key.


The coordinator C first derives the union of all dependencies received during consensus. It then disseminates t via Commit and, simultaneously, issues a Read to one member of each participating shard (preferably in the same DC), attaching those dependencies known to participate in that shard. The replica waits for all of these dependencies to be committed, then filters out those that have been assigned a later t. It waits for the remaining dependencies to execute and for their results to be applied locally, and only then evaluates the read and returns it to the coordinator. C combines the responses to compute an update and a client response; the update is disseminated by Apply to all replicas, and the response is returned to the client.


Replica R receiving Commit(X, t, deps):
    Committed[X] = true
Coordinator C: 
    send a read to one or more (preferably local) replicas of each shard 
        (containing those deps that apply on the shard)
Replica R receiving Read(X, t, deps): 
    Wait for deps to be committed
    Wait for deps with a lower t to be applied locally
    Reply with result of read
Coordinator C (with a response from each shard):
    result = execute(read responses)
    send Apply(result) to all replicas of each shard
    send result to client
Replica R receiving Apply(X, t, deps, result):
    Wait for deps to be committed
    Wait for deps with a lower t to be applied locally
    Apply result locally
    Applied[X] = true
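
The replica-side waiting rules above can be made concrete with a small single-process sketch. This is only an illustration under stated assumptions, not the Accord implementation: the Replica class, its method names (on_commit, on_read, on_apply), integer timestamps, and the single-shard in-memory store are all hypothetical. "Waiting" is modelled by parking an action and re-checking it whenever the replica's state changes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Replica:
    """Hypothetical single-shard replica; names are illustrative only."""
    data: Dict[str, int] = field(default_factory=dict)
    committed: Dict[str, int] = field(default_factory=dict)  # txn id -> t
    applied: Set[str] = field(default_factory=set)
    waiting: List[Tuple[int, Set[str], Callable[[], None]]] = field(default_factory=list)

    def _ready(self, t: int, deps: Set[str]) -> bool:
        # Every dependency must be committed so that its t is known.
        if any(d not in self.committed for d in deps):
            return False
        # Dependencies with a later t are ignored; the rest must be applied.
        return all(d in self.applied for d in deps if self.committed[d] < t)

    def _drain(self) -> None:
        # Re-check parked actions until none of them can make progress.
        progress = True
        while progress:
            progress = False
            for item in list(self.waiting):
                t, deps, action = item
                if self._ready(t, deps):
                    self.waiting.remove(item)
                    action()
                    progress = True

    def on_commit(self, txn: str, t: int) -> None:
        self.committed[txn] = t
        self._drain()

    def on_read(self, t: int, deps: Set[str], keys: List[str],
                reply: Callable[[Dict[str, int]], None]) -> None:
        # The read is evaluated only once its dependencies allow it.
        self.waiting.append(
            (t, set(deps), lambda: reply({k: self.data.get(k) for k in keys})))
        self._drain()

    def on_apply(self, txn: str, t: int, deps: Set[str],
                 writes: Dict[str, int]) -> None:
        def action() -> None:
            self.data.update(writes)
            self.applied.add(txn)
        self.waiting.append((t, set(deps), action))
        self._drain()


def demo():
    r = Replica(data={"k": 0})
    reads: List[Dict[str, int]] = []
    r.on_commit("X", 1)
    r.on_commit("Y", 2)
    # Y's read names X (earlier t) and Z (not yet committed) as dependencies.
    r.on_read(2, {"X", "Z"}, ["k"], reads.append)
    r.on_commit("Z", 3)   # Z gets a later t, so Y no longer waits on it
    parked = list(reads)  # snapshot taken while X is still unapplied
    r.on_apply("X", 1, set(), {"k": 1})
    # Coordinator step: compute the update from the read response.
    result = {"k": reads[0]["k"] + 1}
    r.on_apply("Y", 2, {"X", "Z"}, result)
    return parked, reads, r.data, r.applied
```

In demo(), Y's read stays parked until X has been applied, while Z stops blocking Y as soon as it is committed with a later t. Like the pseudocode, the sketch keys every wait off the dependency set rather than off message arrival order.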