

Current state: Done.

Release: 4.5.0


Using BookKeeper as a low-latency distributed data store achieving read-your-writes consistency outside the LAC protocol

The primary purpose of BookKeeper is to serve as a distributed transaction log, but thanks to its internal design it can also be used as distributed storage for binary data.

The idea is to store binary objects (BLOBS) as entries inside ledgers. In this case there is no need for "tailing reads" or "fencing".

This way of working can be sketched with this very simple scenario:

  • Writers create ledgers, write BLOBS to them, and store in an external database the ID of each BLOB, a pair (ledgerid, entryid), as its BLOB_ID
  • Once a write has been acked by the configured quorum of bookies, the BLOB can be considered durably stored
  • Readers can use the BLOB_IDs to access data, reading directly from any of the bookies that hold a copy of the BLOB
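
The BLOB_ID from the scenario above can be modeled as a small value type. This is a minimal sketch only: the BlobId class and the "ledgerId:entryId" string encoding used for the external database are assumptions made for illustration and are not part of BookKeeper.

```java
// Hypothetical value type: identifies a BLOB by the ledger and entry that hold it.
final class BlobId {
    final long ledgerId;
    final long entryId;

    BlobId(long ledgerId, long entryId) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
    }

    // Assumed encoding for the external database: "<ledgerId>:<entryId>".
    String encode() {
        return ledgerId + ":" + entryId;
    }

    // Parses the assumed encoding back into a BlobId.
    static BlobId decode(String s) {
        int sep = s.indexOf(':');
        if (sep < 0) {
            throw new IllegalArgumentException("malformed BLOB_ID: " + s);
        }
        long ledgerId = Long.parseLong(s.substring(0, sep));
        long entryId = Long.parseLong(s.substring(sep + 1));
        return new BlobId(ledgerId, entryId);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof BlobId)) {
            return false;
        }
        BlobId other = (BlobId) o;
        return ledgerId == other.ledgerId && entryId == other.entryId;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(ledgerId) * 31 + Long.hashCode(entryId);
    }
}
```

A writer would store `new BlobId(ledgerId, entryId).encode()` after the add has been acked, and a reader would call `BlobId.decode(...)` to learn which ledger to open and which entry to read.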



In BookKeeper 4.4.0 the asyncReadEntries method checks that the range of entries to be retrieved is within the safe range of entry ids according to the LAC protocol, so entries beyond LastAddConfirmed cannot be read through it.

The idea is to add a new BookKeeper client-side method, "readUnconfirmedEntries" (and its async counterpart), which does not check the requested range against LastAddConfirmed.
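
The difference between the two read paths can be illustrated with a toy in-memory model. This is not the BookKeeper client: the ToyLedger class, its fields, and the exception types are hypothetical, and the model simply assumes that the reader's view of LastAddConfirmed can lag behind the entries that are actually stored.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a ledger as seen by a reader; NOT the BookKeeper API.
final class ToyLedger {
    private final List<byte[]> entries = new ArrayList<>();
    // Reader's view of LastAddConfirmed; -1 means nothing is confirmed yet.
    private long lastAddConfirmed = -1;

    // Stores an entry and returns its entry id (ids start at 0).
    long addEntry(byte[] data) {
        entries.add(data);
        return entries.size() - 1;
    }

    // Simulates the reader learning a newer LastAddConfirmed value.
    void advanceLac(long lac) {
        lastAddConfirmed = lac;
    }

    // Checked read: refuses any range that extends past LastAddConfirmed.
    List<byte[]> readEntries(long first, long last) {
        if (last > lastAddConfirmed) {
            throw new IllegalStateException("range exceeds LastAddConfirmed");
        }
        return readUnconfirmedEntries(first, last);
    }

    // Unchecked read: skips the LAC check and fails only if an entry is missing.
    List<byte[]> readUnconfirmedEntries(long first, long last) {
        if (first < 0 || first > last || last >= entries.size()) {
            throw new IllegalArgumentException("no such entry in range");
        }
        return new ArrayList<>(entries.subList((int) first, (int) last + 1));
    }
}
```

In this model, an entry that has just been added is rejected by `readEntries` until `advanceLac` is called, while `readUnconfirmedEntries` returns it immediately, which is the behaviour a reader holding a BLOB_ID relies on.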


In effect, there is an out-of-band contract between the writer and the readers; the only guarantee we want to achieve is "read-your-writes consistency".


Rejected Alternatives:

  • Add a ClientConfiguration option 'readEntriesAllowReadAfterLastAddConfirmed' to disable the check for the method readEntries.
    • This was rejected (Sijie Guo) because a new method is clearer, and because if a client is shared among different reader procedures the option would introduce uncontrolled side effects


Related works:
