Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

State in streaming programs

More complex streaming pipelines generally need to keep some sort of operator state to execute the application logic. Examples include, keeping some aggregation or summary of the received elements, or we can imagine more complex states such as keeping a state-machine for detecting patterns for fraudulent financial transactions or holding a model for some machine learning application. While in all mentioned cases we keep some kind of summary of the input history, the concrete requirements vary greatly from one stateful application to another.

This document is intended to serve as a guide for stateful stream processing in Flink and other systems, by identifying some common usage patterns and requirements for implementing stateful operators in streaming systems. 

NOTE: This document is a design draft and does not reflect the state of a stable release, but the work in progress on the current master, and future considerations.

 

This document is intended to give an overview of how operator state is handled and represented in streaming programs.

State in streaming programs

tbd

State access patterns

To design a properly functioning, scalable state representation we need to understand the different state access and handling patterns that applications need in streaming programs. We can distinguish three main state types (access patterns): local state, partitioned state, out-of-core state. 

...