...
Discussion thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-145-Support-SQL-windowing-table-valued-function-td45269.html
Vote thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-145-Support-SQL-windowing-table-valued-function-td455712nd-td46452.html
JIRA: Jira server ASF JIRA serverId 5aa69414-a9e9-3523-82ec-879b028fb15b key FLINK-19604
...
Windowing table-valued functions
Tumbling Windows
The table-valued function TUMBLE
assigns a window for each row of a relation based on a time attribute columnWe would like to propose 4 kinds of window TVF: tumbling, hopping, cumulative, session. The return value of TUMBLE
is a relation window TVF is a relation that includes all columns of data as well as additional two columns named one column named window
to indicate the assigned window. The window column is a composite ROW
type consists of window_start
and , window_end
and window_time
which indicate the assigned window columns. The window_time
column field is the a time attribute of the record after window TVF, it alwsays equal to "window_end - 1"
.
Tumbling Windows
The table-valued function TUMBLE
assigns a window for each row of a relation based on a time attribute column. The return value of TUMBLE
is a relation that includes all columns of data as well as additional a ROW
type column named window
to indicate the assigned window. The original row time attribute "timecol" will be a regular timestamp column after window TVF. All assigned windows have the same length, and that’s why tumbling sometimes is named as “fixed windowing”.
TUMBLE
that takes three required parameters and one optional parameter:
Code Block | ||
---|---|---|
| ||
TUMBLE(data, DESCRIPTOR(timecol), size [, offset ]) |
...
The table-valued function HOP
assigns windows that cover rows within the interval of size and shifting every slide based on a timestamp column. The return value of HOP
is is a relation that includes all columns of data as well as additional two columns named window_start
and window_end
and window_time
which indicate the assigned window. The window_time
column is the time attribute of the record after window TVF, it alwsays equal to "window_end - 1"
. The a ROW
type column named window
to indicate the assigned window. The original row time attribute "timecol" will be a regular timestamp column after window TVF. Windows assigned could have overlapping so hopping sometime is named as “sliding windowing”.
...
The table-valued function CUMULATE
assigns windows that cover rows within an initial interval of step size, and expanding to one more step size (keep window start fixed) every step until to the max window size. The return value of CUMULATE
is a relation that includes all columns of data as well as additional two columns named window_start
and window_end
and window_time
which a ROW
type column named window
to indicate the assigned window. The window_time
column is the time attribute of the record after window TVF, it alwsays equal to "window_end - 1"
. The original row time attribute "timecol" will be a regular timestamp column after window TVF. Windows assigned could have overlapping.
...
The table-valued function SESSION
assigns windows that cover rows based on datetime. Within a session window, distances of rows are less than interval. Session window is applied per key. The returned return value of SESSION
SE
is a relation that includes all columns of data as well as additional three columns named window_start
and window_end
and window_time
which indicate the assigned window. The window_time
column is the time attribute of the record after window TVF, it alwsays equal to "window_end - 1"
a ROW
type column named window
to indicate the assigned window. The original row time attribute "timecol" will be a regular timestamp column after window TVF.
...
Explore more ability of Polymorphic Table Functions
It's very exciting to see the potential ability of PTF. As we can see, in the future, maybe we can support the following features in SQL in a standard PTF way.
- advanced operations supported in Table API (FLIP-29), e.g. drop_columns, user-defined-table-aggregate
- user defined join operator
- a shortcut TopN function
- re-assign watermarks?
- re-partition data, similar feature to Hive DISTRIBUTED BY syntax.
- ...