Motivation

Currently, Flink state (checkpoints and savepoints) can be read through the State Processor Table API. However, the metadata stored with that state, such as the checkpoint ID and the operator names, is not exposed. Because this information is valuable, this proposal adds access to it.

Public Interfaces

A new built-in table function, savepoint_metadata, is intended to be added.

Proposed Changes

The table function will return rows with the following schema:

Key                                       Data Type        Description
----------------------------------------  ---------------  -----------
checkpoint-id                             BIGINT NOT NULL  Checkpoint ID.
operator-name                             STRING           Operator Name.
operator-uid                              STRING           Operator UID.
operator-uid-hash                         STRING NOT NULL  Operator UID hash.
operator-parallelism                      INT NOT NULL     Parallelism of the operator.
operator-max-parallelism                  INT NOT NULL     Maximum parallelism of the operator.
operator-coordinator-state-size-in-bytes  BIGINT NOT NULL  The operator’s coordinator state size in bytes, or zero if no coordinator state.
operator-total-size-in-bytes              BIGINT NOT NULL  Total operator state size in bytes.


The new function can be used as follows:

LOAD MODULE state;
SELECT * FROM savepoint_metadata('/root/dir/of/checkpoint-data');
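
Beyond SELECT *, individual metadata fields can be projected and filtered. The query below is a minimal sketch, not a confirmed part of the proposal. It assumes that the keys in the schema are exposed as column names and that the function is called in the same way as in the example above. Because the names contain hyphens, they have to be quoted with backticks in Flink SQL.

```sql
LOAD MODULE state;

-- Assumption: schema keys such as operator-name are the column names.
-- Hyphenated identifiers must be quoted with backticks in Flink SQL.
SELECT
  `checkpoint-id`,
  `operator-name`,
  `operator-uid`,
  `operator-parallelism`,
  `operator-total-size-in-bytes`
FROM savepoint_metadata('/root/dir/of/checkpoint-data')
-- Keep only the operators that carry coordinator state.
WHERE `operator-coordinator-state-size-in-bytes` > 0;
```

Note that operator-name and operator-uid are nullable in the schema, so a query that relies on them may need to handle NULL values.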


Compatibility, Deprecation, and Migration Plan

No migration is needed.

Test Plan

Automated integration tests are planned.

Rejected Alternatives

None.