As-Is

Currently, windows can be individually defined per pipeline element. The whole windowing logic needs to be declared in the controller and runtime logic needs to be individually added based on the selected runtime wrapper.

Problem

As many data processors benefit from using windows, windowing logic is often duplicated as it needs to be implemented for every new pipeline element. In addition, the feature set of supported window operators differs (and often depends on the developer) as it is unclear which windows and parameters should/can be offered.  

Approach

Adding support for explicit window semantics to the SDK/Core would implementing data processors and sinks using windows much easier and less error prone. 

There are currently two proposed options:

  • 1. Introduce a dedicated Window PE which can be used before any existing PE.
    • No need for API changes.
    • But, might have to flag event in a way for the next processor to identify, to which window the event belongs to (i.e with sliding windows, etc...).
    • So, have to change the processing logic of existing PEs to check event flag and process accordingly.
    • So all the existing PEs might not work with window semantics. In that case, we need a way to show window compatible PEs (because, having a window before a normal PE might result in un-expected outputs)
  • 2. Introduce windowed EventProcessor/EventSink APIs which allows users to write their own windowed extensions (i.e aggregators, etc...)
    • No need for event flagging. Can introduce API methods like onCurrentEvent, onExpiredEvent, onResetEvent, etc...
    • Need API changes/refactoring in EventProcessor/EventSink, as well as in existing PEs.
    • Need a way to expose the window related parameters through existing PEs DataProcessorDescription.


  • No labels