Status
Current state: Accepted
Discussion thread: https://lists.apache.org/thread/y5owjkfxq3xs9lmpdbl6d6jmqdgbjqxo
JIRA: FLINK-33581
Released: 1.19
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Currently, job configuration in Flink is spread across several components, including StreamExecutionEnvironment, CheckpointConfig, and ExecutionConfig. This leads to inconsistencies between the configurations stored in these components. For example, the 'execution.checkpointing.interval' in the StreamExecutionEnvironment configuration may differ from the checkpoint interval specified in CheckpointConfig. This can confuse developers, and higher-level components such as the Table layer have to retrieve configuration from multiple sources.
Furthermore, the approaches used to configure these components are different, with some configurations using complex Java objects while others use ConfigOption, which is a key-value configuration approach. This makes it difficult to effectively manage job configuration. For example, validating non-ConfigOption job configuration is challenging, as seen in StreamContextEnvironment#checkCheckpointConfig. Additionally, passing complex Java objects (e.g., state backend and checkpoint storage) between the environment, streamGraph, and jobGraph adds complexity to development.
To address these issues, it is necessary to standardize the configuration approach by migrating non-ConfigOption objects to use ConfigOption. Additionally, adopting a single Configuration object to host all the configuration can also help resolve these challenges.
However, there is a significant blocker to implementing the proposed solution: the non-ConfigOption objects in StreamExecutionEnvironment, CheckpointConfig, and ExecutionConfig have already been exposed to users through the public API, so the existing implementation cannot simply be changed. Therefore, this FLIP aims to deprecate these Java objects and their corresponding getter/setter interfaces, and ultimately remove them in Flink 2.0.
Please note that this FLIP does not cover deprecating fields related to serialization. That deprecation work will be carried out together with the related serialization work for Flink 2.0.
Public Interfaces
Deprecate the following classes, fields, and methods:
RestartStrategy:
Class | Annotation |
org.apache.flink.api.common.restartstrategy.RestartStrategies | @PublicEvolving |
org.apache.flink.api.common.restartstrategy.RestartStrategies.RestartStrategyConfiguration | |
org.apache.flink.api.common.restartstrategy.RestartStrategies.FixedDelayRestartStrategyConfiguration | |
org.apache.flink.api.common.restartstrategy.RestartStrategies.ExponentialDelayRestartStrategyConfiguration | |
org.apache.flink.api.common.restartstrategy.RestartStrategies.FailureRateRestartStrategyConfiguration | |
org.apache.flink.api.common.restartstrategy.RestartStrategies.FallbackRestartStrategyConfiguration |
Method | Annotation |
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration) | @Public |
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#getRestartStrategy() | |
org.apache.flink.api.common.ExecutionConfig#getRestartStrategy() | |
org.apache.flink.api.common.ExecutionConfig#setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration) |
Field | Annotation |
org.apache.flink.api.common.ExecutionConfig#restartStrategyConfiguration | @Public |
Suggested alternative: Users can set the ConfigOptions defined in RestartStrategyOptions, such as "restart-strategy.type", in the configuration, instead of passing a RestartStrategyConfiguration object.
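As an illustration of the key-value form, here is a minimal sketch that builds the entries for a fixed-delay restart strategy using only the JDK. The class RestartStrategyKeys and its method are hypothetical helpers, not part of Flink. The keys "restart-strategy.fixed-delay.attempts" and "restart-strategy.fixed-delay.delay" are assumed to match the options defined in RestartStrategyOptions, and the values 3 and "10 s" are arbitrary examples.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Flink): restart-strategy settings as plain key-value pairs. */
final class RestartStrategyKeys {

    private RestartStrategyKeys() {}

    /**
     * Builds the entries for a fixed-delay restart strategy.
     * The key names are assumed to match Flink's RestartStrategyOptions.
     */
    static Map<String, String> fixedDelay(int attempts, String delay) {
        // LinkedHashMap keeps insertion order, which makes the printed output stable.
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("restart-strategy.type", "fixed-delay");
        entries.put("restart-strategy.fixed-delay.attempts", Integer.toString(attempts));
        entries.put("restart-strategy.fixed-delay.delay", delay);
        return entries;
    }

    public static void main(String[] args) {
        // Print the entries in "key: value" form, one per line.
        fixedDelay(3, "10 s").forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
```

In a Flink job, such entries would typically be loaded into a Configuration (for example with Configuration.fromMap, or by setting the typed options from RestartStrategyOptions directly) and passed to StreamExecutionEnvironment.getExecutionEnvironment(config), replacing the deprecated setRestartStrategy call.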
CheckpointStorage
Method | Annotation |
org.apache.flink.streaming.api.environment.CheckpointConfig#setCheckpointStorage(CheckpointStorage storage) | @Public |
org.apache.flink.streaming.api.environment.CheckpointConfig#setCheckpointStorage(String checkpointDirectory) | |
org.apache.flink.streaming.api.environment.CheckpointConfig#setCheckpointStorage(URI checkpointDirectory) | |
org.apache.flink.streaming.api.environment.CheckpointConfig#setCheckpointStorage(Path checkpointDirectory) | |
org.apache.flink.streaming.api.environment.CheckpointConfig#getCheckpointStorage() |
Suggested alternative: Instead of passing a CheckpointStorage object, users can set "state.checkpoint-storage" in the configuration to either the fully qualified name of the checkpoint storage or one of the Flink-provided shortcut names, such as "jobmanager" and "filesystem", and provide the configuration options needed to build that storage.
StateBackend
Method | Annotation |
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#setStateBackend(StateBackend backend) | @Public |
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#getStateBackend() |
Field | Annotation |
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#defaultStateBackend | @Public |
Suggested alternative: Instead of passing a StateBackend object, users can set "state.backend.type" in the configuration to either the fully qualified name of the state backend or one of the Flink-provided shortcut names, such as "hashmap" and "rocksdb", and provide the configuration options needed to build that StateBackend.
Proposed Changes
We propose deprecating the classes, fields, and methods listed above and updating the documentation on the Flink website accordingly.
Compatibility, Deprecation, and Migration Plan
The methods, fields, and classes listed above are planned to be deprecated in Flink 1.19 and subsequently removed in Flink 2.0. Users who rely on them are advised to switch to the ConfigOption-based configuration. For example, the checkpoint storage and state backend of an application can be configured as follows:
Configuration config = new Configuration();
config.set(CheckpointingOptions.CHECKPOINT_STORAGE, "filesystem");
config.set(CheckpointingOptions.CHECKPOINTS_DIRECTORY, "file://test");
config.set(StateBackendOptions.STATE_BACKEND, "org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackendFactory");
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(config);
Test Plan
N.A.
Rejected Alternatives
N.A.