Current state: Under Discussion
Discussion thread: here
The KStream.branch method uses varargs to supply predicates and returns an array of streams ('Each stream in the result array corresponds position-wise (index) to the predicate in the supplied predicates').
This is a poor API design that makes building branches very inconvenient because of the 'impedance mismatch' between arrays and generics in the Java language.
- In general, the code has poor cohesion: we need to define predicates in one place and the respective stream processors in another. When something changes, we must remember to edit both pieces of code.
- If the number of predicates is predefined, this method forces us to use 'magic numbers' to extract the right branch from the result (see examples here).
- If we need to build branches dynamically (e.g. one branch per enum value), we inevitably have to deal with 'generic arrays' and 'unchecked typecasts'.
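The last two points can be reproduced without Kafka at all. The following is a minimal sketch, assuming a hypothetical `ArrayBranching.branch` helper in which a plain `List` stands in for a stream; it is not Kafka Streams code and only mirrors the array-returning shape of the method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for an array-returning branch API; not Kafka Streams code.
final class ArrayBranching {

    // Each item goes to the first predicate it matches; result[i] belongs to predicates[i].
    @SafeVarargs
    @SuppressWarnings("unchecked") // Java cannot create a generic array, so a raw array is cast
    static <T> List<T>[] branch(List<T> source, Predicate<? super T>... predicates) {
        List<T>[] result = (List<T>[]) new List[predicates.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = new ArrayList<>();
        }
        for (T item : source) {
            for (int i = 0; i < predicates.length; i++) {
                if (predicates[i].test(item)) {
                    result[i].add(item);
                    break; // first match wins
                }
            }
        }
        return result;
    }

    static List<String> namesStartingWithB(List<String> names) {
        List<String>[] branches = branch(names,
                s -> s.startsWith("a"),
                s -> s.startsWith("b"));
        // Magic number: index 1 is only correct while the predicate order above stays unchanged.
        return branches[1];
    }
}
```

The unchecked cast sits in the helper here, but any caller that assembles the predicate array dynamically runs into the same generic-array restriction on its own side.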
The proposed new org.apache.kafka.streams.kstream.KafkaStreamsBrancher class introduces a new standard way to build branches on top of KStream.
we could use
Here the new KStream#branch() method returns a KBranchedStream<K, V> object, which in turn provides `branch` and `defaultBranch` methods.
It is critical that the KStream consumers passed to the `branch` methods are invoked immediately, during the `branch` call itself. This is necessary when we need to gather streams that were defined in separate scopes back into one scope using an auxiliary object:
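A minimal sketch of why immediate invocation matters, assuming a hypothetical `BranchedList` class in which a plain `List` stands in for a stream. The method names `branch` and `defaultBranch` follow the text above; everything else (class names, signatures, the map used as the auxiliary object) is illustrative and not taken from the proposal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical stand-in: a List plays the role of a stream; not Kafka Streams API.
final class BranchedList<T> {
    private final List<T> remaining;

    BranchedList(List<T> source) {
        this.remaining = source;
    }

    // Splits off the matching items and hands them to the consumer
    // right away, before this method returns.
    BranchedList<T> branch(Predicate<? super T> predicate, Consumer<? super List<T>> consumer) {
        List<T> matched = new ArrayList<>();
        List<T> rest = new ArrayList<>();
        for (T item : remaining) {
            (predicate.test(item) ? matched : rest).add(item);
        }
        consumer.accept(matched);
        return new BranchedList<>(rest);
    }

    // Hands over everything that no earlier predicate matched.
    void defaultBranch(Consumer<? super List<T>> consumer) {
        consumer.accept(remaining);
    }
}

final class GatherDemo {
    static Map<String, List<String>> gather(List<String> source) {
        Map<String, List<String>> byName = new HashMap<>(); // the auxiliary object
        new BranchedList<>(source)
                .branch(s -> s.startsWith("a"), branch -> byName.put("a", branch))
                .branch(s -> s.startsWith("b"), branch -> byName.put("b", branch))
                .defaultBranch(branch -> byName.put("other", branch));
        // The map is fully populated here only because each consumer has already run.
        return byName;
    }
}
```

If the consumers were deferred instead, `byName` would still be empty when `gather` returns, and code that reads the gathered branches right after the chain would see nothing.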
Add the new KBranchedStream class and a branch() method to KStream (see https://github.com/apache/kafka/pull/6512).
Compatibility, Deprecation, and Migration Plan
The proposed change has no impact on existing code and is backwards compatible. All existing code that uses the branch method will continue to work; we simply gain a new way to perform branching.
Add a KStreamsBrancher class that works the same way but does not require modifying the KStream interface:
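One way such a standalone class could look is sketched below, assuming a hypothetical `ListBrancher` in which a plain `List` stands in for a stream. The class name, the `onTopOf` method, and all signatures are assumptions made for illustration, not the class from the proposal. The builder collects predicate and consumer pairs and receives the source as an argument, so the source type itself needs no new method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical stand-in: a List plays the role of a stream; not the class from the proposal.
final class ListBrancher<T> {
    private final List<Predicate<? super T>> predicates = new ArrayList<>();
    private final List<Consumer<? super List<T>>> consumers = new ArrayList<>();
    private Consumer<? super List<T>> defaultConsumer = branch -> { };

    // Only records the pair; nothing is evaluated yet.
    ListBrancher<T> branch(Predicate<? super T> predicate, Consumer<? super List<T>> consumer) {
        predicates.add(predicate);
        consumers.add(consumer);
        return this;
    }

    ListBrancher<T> defaultBranch(Consumer<? super List<T>> consumer) {
        defaultConsumer = consumer;
        return this;
    }

    // Applies the recorded branches to the source; the first matching predicate wins.
    void onTopOf(List<T> source) {
        List<List<T>> branches = new ArrayList<>();
        for (int i = 0; i < predicates.size(); i++) {
            branches.add(new ArrayList<>());
        }
        List<T> unmatched = new ArrayList<>();
        for (T item : source) {
            int i = 0;
            while (i < predicates.size() && !predicates.get(i).test(item)) {
                i++;
            }
            (i < predicates.size() ? branches.get(i) : unmatched).add(item);
        }
        for (int i = 0; i < consumers.size(); i++) {
            consumers.get(i).accept(branches.get(i));
        }
        defaultConsumer.accept(unmatched);
    }
}
```

In this sketch the consumers run only when `onTopOf` is called, not when `branch` is called, which is a design difference worth weighing for code that gathers branches into a shared object.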