Current state: Stalled
Discussion thread: -
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Presently, windowed aggregations in KafkaStreams fall into two categories:
Unfortunately, Windows is an abstract class instead of an interface, and it forces some fields onto its implementations.
This has led to a number of problems over the years, but so far we have been able to live with them.
Extension points in an API should almost always be interfaces, since an interface lets the system define just the basics of what it needs to know, while implementers supply that information any way they see fit. An abstract class, by contrast, comes with a lot of extra baggage. It forces constructors and fields upon implementers, which may not be appropriate for the semantics of the implementation. In a multi-level hierarchy, it is not clear which layer is responsible for maintaining a given field, which leads to bugs and semantic ambiguity.
All of these problems came to the surface when I implemented grace period and needed to move retention time and segments into the store definition. It should have been as simple as deprecating a couple of methods, but instead there were constructors and fields to contend with, and computing the actual retention time to use became enormously complex. For example, should we resolve the value returned by the getter method or the one in the abstract class's public field? I did the best I could, but the resulting code is very difficult to understand. My plan at the time was to deprecate everything except the necessary getter methods, hoping to convert the abstract class into "effectively" an interface, although it would never be possible to actually convert it into an interface, and there would be no way to ensure a simple design bug wouldn't return us to the same unfortunate state in the future. Plus, we just have this perpetual bizarre state in which we want Windows to be exactly like an interface, except that it's an abstract class. Better to just fix it.
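To make the getter-versus-field ambiguity concrete, here is a minimal sketch. The class, field, and method names (LegacyDefinitionSketch, retentionMillis, retention()) are hypothetical and chosen only for illustration; they are not the actual members of Windows.

```java
// Hypothetical abstract class, NOT the real Windows: it forces a public field
// onto every implementation and also exposes an overridable getter.
abstract class LegacyDefinitionSketch {
    // Field inherited by every subclass, whether or not it makes sense there.
    public long retentionMillis = 86_400_000L; // one day

    // Getter that subclasses are free to override.
    public long retention() {
        return retentionMillis;
    }
}

// A subclass that overrides the getter and never touches the inherited field.
final class OverridingDefinitionSketch extends LegacyDefinitionSketch {
    @Override
    public long retention() {
        return 3_600_000L; // one hour
    }
}
```

A caller holding an OverridingDefinitionSketch now sees two different answers, one from the field and one from the getter, and nothing in the type tells it which is authoritative. An interface with only the getter would not allow this situation to arise.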
My motivation to bring this up now is that I have had several recent conversations in which people were considering new extensions of Windows. We considered extending Windows in KIP-450, and there have been several requests to add calendar-aligned windows or other kinds of windows.
I think the approach I'm proposing here is safer than my earlier plan: instead of living with an interface-like abstract class Windows, I'm proposing to slice it out of the hierarchy. We add a new interface on top of it, and migrate the DSL to expect that interface. We deprecate Windows itself, prompting custom window definitions to stop extending Windows and just implement the interface instead. Then, once we remove Windows in 3.0 or 4.0, the API is safe and clean.
At the same time, we can make a small adjustment to the interface to correct a design bug that prohibits calendar-aligned windows like daily or monthly windows. The TimeWindows algorithm doesn't require fixed-size windows, just enumerable ones, given a record timestamp. I have already corrected the algorithm in the POC PR, and the only remaining use case for the concept of "window size" is to provide a lower bound for retention time. Thus, I'm proposing to replace Windows#size() with EnumerableWindowDefinition#maxSize(), so that variable-sized window definitions like "monthly windows" could give an upper bound like "32 days" on their size, ensuring windows would never get dropped from the store before they are closed.
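As a rough illustration of why maxSize() is enough for variable-sized windows, the sketch below defines a calendar-month window definition in UTC. It is not the interface from the POC PR: the interface name EnumerableWindowDefinitionSketch, the windowsFor signature, and the use of a plain start-to-end map are assumptions made to keep the example self-contained.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.YearMonth;
import java.time.ZoneOffset;
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for the proposed interface; only maxSize() is named
// in the proposal, the rest is assumed for illustration.
interface EnumerableWindowDefinitionSketch {
    // Enumerate the windows containing the timestamp as
    // (start -> exclusive end), both in epoch milliseconds.
    Map<Long, Long> windowsFor(long timestampMs);

    // Upper bound on the size of any window, which in turn serves as a
    // lower bound on retention time.
    long maxSize();
}

// Calendar-aligned monthly windows in UTC: sizes vary from 28 to 31 days.
final class MonthlyWindowsSketch implements EnumerableWindowDefinitionSketch {
    @Override
    public Map<Long, Long> windowsFor(final long timestampMs) {
        final YearMonth month =
            YearMonth.from(Instant.ofEpochMilli(timestampMs).atZone(ZoneOffset.UTC));
        final long start =
            month.atDay(1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        final long end =
            month.plusMonths(1).atDay(1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        return Collections.singletonMap(start, end);
    }

    @Override
    public long maxSize() {
        // Any bound of at least 31 days works; 32 days matches the example above.
        return Duration.ofDays(32).toMillis();
    }
}
```

The windows are enumerable from a record timestamp even though no single "size" describes them, and the retention logic only needs the bound returned by maxSize().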
- Add new interface to take the place of Windows: EnumerableWindowDefinition
- Add "implements EnumerableWindowDefinition" to TimeWindows and UnlimitedWindows
- Do not add the new interface to JoinWindows, which should not be part of this hierarchy. It will naturally become disconnected when we remove Windows
- Add "implements EnumerableWindowDefinition" to Windows and deprecate it.
- Swap out the argument type in both windowBy methods
New interface to take the place of Windows: EnumerableWindowDefinition
Add EnumerableWindowDefinition to TimeWindows and UnlimitedWindows
Add EnumerableWindowDefinition to Windows and deprecate it
This includes deprecating size() and delegating maxSize() to size() for compatibility.
Swap out the argument type in windowBy
Note, because Windows now implements EnumerableWindowDefinition, all existing implementations of Windows will automatically work with this change, so there is no compatibility concern.
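The compatibility argument can be sketched as follows. All types here are simplified, hypothetical stand-ins (the "Stub" names, the single-method interface, and the static windowBy helper are assumptions, not the real Kafka Streams signatures); the point is only to show the delegation from maxSize() to size().

```java
// Trimmed, hypothetical stand-in for the proposed interface.
interface EnumerableWindowDefinitionStub {
    long maxSize();
}

// Stand-in for the deprecated abstract class, which now implements the interface.
@Deprecated
abstract class WindowsStub implements EnumerableWindowDefinitionStub {
    @Deprecated
    public abstract long size();

    // Delegate to size() so existing subclasses satisfy the interface unchanged.
    @Override
    public long maxSize() {
        return size();
    }
}

// A pre-existing custom implementation that knows nothing about maxSize().
@SuppressWarnings("deprecation")
final class LegacyFixedWindows extends WindowsStub {
    private final long sizeMs;

    LegacyFixedWindows(final long sizeMs) {
        this.sizeMs = sizeMs;
    }

    @Override
    public long size() {
        return sizeMs;
    }
}

// Hypothetical stand-in for the DSL entry point after the argument type is swapped.
final class GroupedStreamStub {
    // Returns the bound it reads from the definition, just to make the call observable.
    static long windowBy(final EnumerableWindowDefinitionStub definition) {
        return definition.maxSize();
    }
}
```

Under these assumptions, GroupedStreamStub.windowBy(new LegacyFixedWindows(5000L)) compiles and returns 5000 without any change to LegacyFixedWindows, which is the behavior the note above relies on.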
Compatibility, Deprecation, and Migration Plan
Windows is deprecated. Otherwise, no compatibility issues arise. We can remove Windows cleanly later on.