...
For simplicity, let’s assume there’s only one partition, __diskless-metadata-0. The Batch Coordinator instance will run on the broker that is the leader of this partition, while read-only Batch Coordinator instances will run on all the brokers that are in-sync replicas for this partition. The Batch Coordinator will handle all the requests that modify the state (such as batch commit and delete requests), while the read-only Batch Coordinators can serve requests that only need to read the state.
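The routing rule described above can be sketched as follows. This is only an illustration, not the proposed implementation: the request kinds, the `PartitionState` type, and the `eligible_brokers` function are hypothetical names, and the sketch assumes that the leader (which is always part of the in-sync replica set) may also serve read-only requests.

```python
from dataclasses import dataclass
from enum import Enum


class RequestKind(Enum):
    # Hypothetical request kinds, named for illustration only.
    COMMIT_BATCH = "commit_batch"
    DELETE_BATCH = "delete_batch"
    FIND_BATCHES = "find_batches"


# Requests that modify the Batch Coordinator state.
MUTATING = {RequestKind.COMMIT_BATCH, RequestKind.DELETE_BATCH}


@dataclass(frozen=True)
class PartitionState:
    """Leader and in-sync replicas of __diskless-metadata-0 (broker ids)."""
    leader: int
    isr: tuple[int, ...]


def eligible_brokers(kind: RequestKind, state: PartitionState) -> list[int]:
    """Return the brokers whose coordinator instance may serve the request."""
    if kind in MUTATING:
        # Only the Batch Coordinator on the partition leader modifies state.
        return [state.leader]
    # Read-only requests may go to any in-sync replica.
    # Assumption: the leader is in the ISR and can serve reads as well.
    return sorted(state.isr)
```

For example, with leader 1 and in-sync replicas 1, 2, and 3, a commit request is routed to broker 1 only, while a read-only lookup may be served by any of the three brokers.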
The topic is the storage and replication medium, but it cannot be queried directly in a performant way, so the state needs to be materialized locally. This, in turn, poses another challenge: the state can become too large to fit into memory, which means the materialization mechanism must be backed by disk. This set of requirements suggests using an embedded disk-based database engine. We propose to use SQLite. SQLite is an open-source, self-contained SQL database engine that can be embedded into an application. It’s widely used in the software industry and can be easily integrated into any codebase. SQLite provides structured querying and indexing of the state, allowing fast and easy access to the Batch Coordinator state without the need to implement all the data structures and algorithms for managing the state on disk.
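To make the materialization idea concrete, here is a minimal sketch that applies records read from the metadata topic to a local SQLite database and then answers a lookup through an index. The table layout, the record format (plain dictionaries with a `type` field), and the function names are assumptions made for this example; the proposal does not specify them. Python's standard `sqlite3` module stands in for whatever binding the broker would actually use.

```python
import sqlite3

# Hypothetical schema: one row per committed batch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS batches (
    batch_id     INTEGER PRIMARY KEY,
    topic        TEXT    NOT NULL,
    partition_id INTEGER NOT NULL,
    base_offset  INTEGER NOT NULL,
    last_offset  INTEGER NOT NULL,
    object_key   TEXT    NOT NULL
);
CREATE INDEX IF NOT EXISTS batches_by_offset
    ON batches (topic, partition_id, last_offset);
"""


def open_state(path=":memory:"):
    """Open (or create) the local materialized state."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def apply_record(conn, record):
    """Apply one record consumed from the metadata topic (assumed format)."""
    with conn:  # commits on success, rolls back on error
        if record["type"] == "commit":
            # INSERT OR REPLACE keeps replay of the same record idempotent.
            conn.execute(
                "INSERT OR REPLACE INTO batches VALUES (?, ?, ?, ?, ?, ?)",
                (record["batch_id"], record["topic"], record["partition_id"],
                 record["base_offset"], record["last_offset"],
                 record["object_key"]),
            )
        elif record["type"] == "delete":
            conn.execute("DELETE FROM batches WHERE batch_id = ?",
                         (record["batch_id"],))


def find_batch(conn, topic, partition_id, offset):
    """Return the object key of the batch containing `offset`, or None."""
    row = conn.execute(
        "SELECT base_offset, object_key FROM batches "
        "WHERE topic = ? AND partition_id = ? AND last_offset >= ? "
        "ORDER BY last_offset LIMIT 1",
        (topic, partition_id, offset),
    ).fetchone()
    if row is None or row[0] > offset:
        return None
    return row[1]
```

Passing a file path instead of `":memory:"` to `open_state` keeps the state on disk, which is the property the paragraph above asks for. The index on `(topic, partition_id, last_offset)` lets the lookup avoid a full table scan.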
...
Topic-partitions are assigned to one specific Batch Coordinator. This mapping is stored in the KRaft metadata.
Risks
The topic-based Batch Coordinator can only function correctly while the __diskless-metadata topic is healthy. This creates some risks when operating this type of Batch Coordinator:
...


