...
While it would allow more complex use cases and we could implement a cluster of clusters, the operational complexity would be much higher. With hierarchical clusters we would need to introduce a separator to address resources in the non-leaf nodes of the cluster tree. There are multiple reasons against this:
- Introducing a new notation for virtual clusters isn't transparent to users: they would need to add it to their requests to address the correct topic. This could break backward compatibility, as users would need to change their clients to use fully qualified resource names when addressing resources in virtual clusters.
- It would also complicate the ACL structure: we would need to introduce the hierarchy into ACLs and create a much more complicated model to handle the relationships within it. In our opinion this isn't worth the price, as we would risk breaking backward compatibility or introducing odd edge cases.
- Besides this, in an average organization there aren’t hundreds of teams who use Kafka, so a flat structure would likely satisfy most needs. Users can continue using “.” as a separator, so they could have “us.amer.ecommerce” as their virtual cluster.
- It is an open question how we would interpret virtual clusters in a hierarchy:
  - Are topics in the children visible to the parent VCs? In this case a parent VC may see too much, as users often wouldn't need to work with all the topics. If we chose not to automatically share topics or other resources, then the hierarchy doesn't really make sense at all. In short, a hierarchy complicates setting fine-grained resource access.
  - If children VC topics are shared in the parent, then all children VC topics have to be unique. This breaks the transparency requirement, as two children can't have a topic with the same name because it would clash in the parent. To handle this, we would need to introduce prefix notation, which in turn would require a protocol change.
The following diagrams illustrate these problematic scenarios:
A topic must have a different name in different namespaces, otherwise it would clash in the parent.
If topics aren't shared, then the hierarchy is mostly not required.
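The first scenario can be sketched in a few lines of Python. This is only an illustrative model, assuming children share all of their topics with the parent; the names `parent_view`, `clashes`, and the VC names `ecommerce` and `logistics` are hypothetical and not part of any Kafka API.

```python
from collections import defaultdict


def parent_view(children):
    """Map each topic name to the child VCs that own a topic with that name.

    Assumes (for illustration only) that every child topic is shared with
    the parent under its unqualified name.
    """
    owners = defaultdict(list)
    for vc, topics in children.items():
        for topic in topics:
            owners[topic].append(vc)
    return owners


def clashes(children):
    """Return the topic names that would be ambiguous in the parent VC."""
    return {
        topic: sorted(vcs)
        for topic, vcs in parent_view(children).items()
        if len(vcs) > 1
    }


# Two hypothetical child VCs that both define a topic called "orders".
children = {
    "ecommerce": {"orders", "payments"},
    "logistics": {"orders", "shipments"},
}

print(clashes(children))  # {'orders': ['ecommerce', 'logistics']}
```

Resolving the ambiguity would require qualifying the name in the parent (for example with a prefix), which is exactly the notation change the flat design avoids.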
KIP-37 Style Namespaces
KIP-37 proposed a similar idea, but since it never got past the discussion phase, its details were never fleshed out. We find that this solution shares the drawbacks of hierarchical clusters and, in addition, hardcodes namespaces in the log directory structure. This isn't beneficial: moving or renaming topics would become hard because the data would need to be replicated, which becomes more costly the more data there is.
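The cost argument can be illustrated with a toy model. This is a sketch under the premise stated above (a namespace encoded in the log directory means a move must replicate the data); the `Partition` class, the `namespace/topic-index` layout, and both helper functions are hypothetical and do not describe Kafka's actual storage code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    topic: str
    index: int
    size_bytes: int


def path_encoded_dir(namespace, partition):
    """Hypothetical on-disk layout where the namespace is part of the path."""
    return f"{namespace}/{partition.topic}-{partition.index}"


def bytes_to_replicate_on_move(partitions, topic):
    """With a path-encoded namespace, moving a topic touches all of its data."""
    return sum(p.size_bytes for p in partitions if p.topic == topic)


def move_by_metadata(topic_to_vc, topic, new_vc):
    """With the namespace kept in metadata, a move is a single mapping update."""
    updated = dict(topic_to_vc)
    updated[topic] = new_vc
    return updated


partitions = [
    Partition("orders", 0, 40_000_000_000),
    Partition("orders", 1, 60_000_000_000),
    Partition("payments", 0, 5_000_000_000),
]

# The cost of the path-encoded move grows with the amount of data.
print(bytes_to_replicate_on_move(partitions, "orders"))  # 100000000000

# The metadata-only move is independent of data size.
print(move_by_metadata({"orders": "ecommerce"}, "orders", "logistics"))
```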
...
