Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Before implementing operations that rely in some way on time it is necessary to define the different concepts of time in Flink and how time is handled in a streaming topology.

Different

...

Notions of Time

In Flink Streaming, every element has a timestamp attached to it. When a value enters a streaming topology through a source this source attaches a timestamp to the value. The timestamp can either be the current system time of the source (ingress time) or it can be a timestamp that is extracted from the value (event time).

...

The concept of watermarks as used in Flink is inspired by Google's MillWheel: http://research.google.com/pubs/pub41378.html and by work building on top of it for Google Cloud Dataflow API: http://strataconf.com/big-data-conference-uk-2015/public/schedule/speaker/193653

Ordering

...

Elements in a Stream

Elements in a stream can be ordered by timestamp. Normally, it would not be possible to order elements of an infinite stream. By using watermarks, however, Flink can order elements in a stream. The ordering operator has to buffer all elements it receives. Then, when it receives a watermark it can sort all elements that have a timestamp that is lower than the watermark and emit them in the sorted order. This is correct because the watermark signals that not more elements can arrive that would be intermixed with the sorted elements. The following code example and diagram illustrate this idea:

...