Design document to outline desired changes in the Streaming Window Join Operator

 

 

Requirements

  • Joins should initially run only on time windows
  • Support for operator time and event time
  • Event time version must support multiple windows being in progress (or buffered) while waiting for watermarks

  • Support for simple Java heap and Flink-managed memory
  • Flink-managed memory variant should support out-of-core operation

  • Join buffers or hashtables need to be checkpointed.
  • Join buffers or hashtables need to support incremental checkpointing
  • Join buffers or hashtables should support asynchronous checkpointing

  • For sliding time windows, we should consider variants to reuse the join candidates across multiple windows they are contained in.

 

 

 

  • No labels

1 Comment

    • Join maybe need support count based window.
    • Window trigger policy need support by every event.
    • If two join objects have join conditions like ,
      dataStream1.join(dataStream2)
          .onWindow(windowing_params)
          .where(key_in_first)
          .equalTo(key_in_second);

      when saving event in dataStream2 window, the event must be saved as the key of key_in_second for fast finding.