Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
As Flink moves toward version 2.0, we want to provide users with a better experience with the existing configuration. In this FLIP, we outline several general improvements to the current configuration.
Public Interfaces
This FLIP proposes the following general improvements. Details of each change can be found in the "Proposed Changes" section:
Ensure all the ConfigOptions are properly annotated
Ensure all user-facing configurations are included in the documentation generation process
Make the existing ConfigOptions use the proper type
Mark all internally used ConfigOptions with the @Internal annotation
Proposed Changes
In this section, we describe in detail all the configurations that need updating.
Ensure all the ConfigOptions are properly annotated
Many user-facing ConfigOptions are currently not annotated at all. We will make sure that they are properly annotated.
Mark the following classes as PublicEvolving:
CEPCacheOptions
AlgorithmOptions
HighAvailabilityOptions
RestOptions
SecurityOptions
SlowTaskDetectorOptions
InfluxdbReporterOptions
PrometheusPushGatewayReporterOptions
JobResultStoreOptions
ShuffleServiceOptions
SqlClientOptions
YarnConfigOptions
We will also update the ConfigOptionsDocGenerator to verify that the ConfigOptions are properly annotated.
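The kind of verification we have in mind can be sketched with plain reflection. This is only an illustration and not the actual ConfigOptionsDocGenerator change: the two annotations below are hypothetical stand-ins declared locally (with runtime retention, which is an assumption) so that the sketch compiles without Flink on the classpath, and `AnnotationCheck`, `AnnotatedOptions`, and `UnannotatedOptions` are made-up names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the stability annotations, declared here so the
// sketch is self-contained. Runtime retention is an assumption of this sketch.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface PublicEvolving {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Internal {}

// Hypothetical options classes used only to exercise the check.
@PublicEvolving
final class AnnotatedOptions {}

final class UnannotatedOptions {}

final class AnnotationCheck {
    /** Returns the simple names of the classes that carry no stability annotation. */
    static List<String> findUnannotated(List<Class<?>> optionClasses) {
        List<String> missing = new ArrayList<>();
        for (Class<?> clazz : optionClasses) {
            boolean annotated =
                    clazz.isAnnotationPresent(PublicEvolving.class)
                            || clazz.isAnnotationPresent(Internal.class);
            if (!annotated) {
                missing.add(clazz.getSimpleName());
            }
        }
        return missing;
    }
}
```

A generator-side check along these lines could fail the build whenever the returned list is non-empty.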
Ensure all user-facing configurations are included in the documentation generation process
The following ConfigOptions are defined in classes that are not included in the documentation generation process. We will relocate these ConfigOptions to a class that is included in the documentation generation:
GPUDriver#DISCOVERY_SCRIPT_PATH
GPUDriver#DISCOVERY_SCRIPT_ARG
These ConfigOptions will be moved to a new class GPUDriverOptions at package org.apache.flink.externalresource.gpu.GPUDriver. The docs of these ConfigOptions will be generated as follows, with a dynamic prefix, similar to MetricOptions:
Key | Default | Type | Description |
external-resource.<resource_name>.param.discovery-script.path | (none) | String | The path of the discovery script. It can either be an absolute path, or a relative path to FLINK_HOME when defined or the current directory otherwise. If not explicitly configured, the default script will be used. |
external-resource.<resource_name>.param.discovery-script.args | (none) | String | The arguments passed to the discovery script. For the default discovery script, see Default Script for the available parameters. |
Note that the above ConfigOptions are invisible to the users currently, so we can directly introduce a PublicEvolving class that contains the ConfigOptions above without a deprecation process.
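To make the dynamic prefix concrete, the following minimal sketch shows how a full key could be assembled for a given resource name. The class `GpuDriverKeys` and the method `keyFor` are hypothetical helpers for illustration, not part of the proposed GPUDriverOptions class; only the key layout is taken from the table above.

```java
final class GpuDriverKeys {
    // Key suffixes as they appear in the generated-docs table above.
    static final String DISCOVERY_SCRIPT_PATH_SUFFIX = "param.discovery-script.path";
    static final String DISCOVERY_SCRIPT_ARGS_SUFFIX = "param.discovery-script.args";

    /** Builds "external-resource.<resource_name>.<suffix>" for a concrete resource name. */
    static String keyFor(String resourceName, String suffix) {
        if (resourceName == null || resourceName.isEmpty()) {
            throw new IllegalArgumentException("resource name must not be empty");
        }
        return "external-resource." + resourceName + "." + suffix;
    }
}
```

For a resource named `gpu` (an example name), `keyFor("gpu", DISCOVERY_SCRIPT_PATH_SUFFIX)` yields `external-resource.gpu.param.discovery-script.path`.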
Make the existing ConfigOptions use the proper type
Some ConfigOptions do not specify the type properly. We will update the type of the ConfigOptions in a backward-compatible way.
The following ConfigOptions will be changed to the Duration type:
RpcOptions#TCP_TIMEOUT
RpcOptions#STARTUP_TIMEOUT
ClusterOptions#INITIAL_REGISTRATION_TIMEOUT
ClusterOptions#MAX_REGISTRATION_TIMEOUT
ClusterOptions#ERROR_REGISTRATION_DELAY
ClusterOptions#REFUSED_REGISTRATION_DELAY
ClusterOptions#CLUSTER_SERVICES_SHUTDOWN_TIMEOUT
HighAvailabilityOptions#ZOOKEEPER_SESSION_TIMEOUT
HighAvailabilityOptions#ZOOKEEPER_CONNECTION_TIMEOUT
HighAvailabilityOptions#ZOOKEEPER_RETRY_WAIT
ResourceManagerOptions#JOB_TIMEOUT
ResourceManagerOptions#STANDALONE_CLUSTER_STARTUP_PERIOD_TIME
ResourceManagerOptions#TASK_MANAGER_TIMEOUT
RestOptions#AWAIT_LEADER_TIMEOUT
RestOptions#RETRY_DELAY
RestOptions#CONNECTION_TIMEOUT
RestOptions#IDLENESS_TIMEOUT
InfluxdbReporterOptions#CONNECT_TIMEOUT
InfluxdbReporterOptions#WRITE_TIMEOUT
YarnConfigOptions#CONTAINER_REQUEST_HEARTBEAT_INTERVAL_MILLISECONDS
Note:
When a value without a time unit is set for a Duration-typed ConfigOption, it is interpreted as milliseconds. The change is therefore backward compatible as long as the original ConfigOption uses milliseconds as its time unit, and all the original ConfigOptions above do.
RpcOptions#TCP_TIMEOUT, RpcOptions#STARTUP_TIMEOUT, and ResourceManagerOptions#JOB_TIMEOUT are currently String-typed, but they are all parsed by the method `org.apache.flink.util.TimeUtils#parseDuration`, which is also used to parse Duration-typed ConfigOptions. Therefore, these changes are backward compatible.
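The compatibility argument in the notes above rests on a bare number being read as milliseconds. The sketch below illustrates that rule with a deliberately simplified parser. It is not the implementation of `TimeUtils#parseDuration`; the class `DurationSketch` and the small set of unit labels it accepts are assumptions made for illustration.

```java
import java.time.Duration;
import java.util.Locale;

final class DurationSketch {
    /** Parses "<number>[ ]<unit>"; a bare number is interpreted as milliseconds. */
    static Duration parse(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);

        // Split the input into a leading run of digits and a trailing unit label.
        int pos = 0;
        while (pos < trimmed.length() && Character.isDigit(trimmed.charAt(pos))) {
            pos++;
        }
        if (pos == 0) {
            throw new IllegalArgumentException("No number found in '" + text + "'");
        }
        long value = Long.parseLong(trimmed.substring(0, pos));
        String unit = trimmed.substring(pos).trim();

        switch (unit) {
            case "": // no unit given: fall through to milliseconds
            case "ms":
                return Duration.ofMillis(value);
            case "s":
                return Duration.ofSeconds(value);
            case "min":
                return Duration.ofMinutes(value);
            case "h":
                return Duration.ofHours(value);
            default:
                throw new IllegalArgumentException("Unsupported unit '" + unit + "'");
        }
    }
}
```

Under this rule, an existing configuration value such as `10000` (previously read as a millisecond count) and a new-style value such as `10 s` produce the same Duration.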
The following ConfigOptions will be changed to the Enum type:
NettyShuffleEnvironmentOptions#SHUFFLE_COMPRESSION_CODEC
OptimizerConfigOptions#TABLE_OPTIMIZER_AGG_PHASE_STRATEGY
Note:
The two configurations above already throw an exception if the value is unknown, which is the same behavior they will have once the type is changed to enum, so the changes are backward compatible.
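The behavior described in the note can be illustrated with a small sketch of enum-based parsing. The enum constants, the class `EnumOptionSketch`, and the case-insensitive matching are assumptions made for illustration; they are not taken from the actual option definitions.

```java
import java.util.Locale;

final class EnumOptionSketch {
    // Hypothetical codec names, used only for illustration.
    enum Codec {
        LZ4,
        LZO,
        ZSTD
    }

    /** Converts a configured string to a Codec, rejecting unknown values. */
    static Codec parseCodec(String value) {
        // Enum.valueOf throws IllegalArgumentException when the name matches no constant.
        // Upper-casing the input first is an assumption of this sketch.
        return Codec.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```

A known value such as `lz4` maps to `Codec.LZ4`, while an unknown value such as `snappy` fails with an `IllegalArgumentException` instead of being silently accepted.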
The following ConfigOption will be changed to the Int type:
YarnConfigOptions#APPLICATION_ATTEMPTS
Mark all internally used ConfigOptions with the @Internal annotation
After checking with the committers familiar with each module, we found that the ConfigOptions listed below are currently used only internally; however, they have not yet been marked with the @Internal annotation:
PythonDynamicTableOptions
Dispatcher#CLIENT_ALIVENESS_CHECK_DURATION
ClusterEntrypoint#INTERNAL_CLUSTER_EXECUTION_MODE
FileJobGraphRetriever#JOB_GRAPH_FILE_PATH
Compatibility, Deprecation, and Migration Plan
The changes made in this FLIP are backward compatible. No deprecation or migration plan is needed.
Test Plan
Existing unit and integration tests (UT/IT) can ensure compatibility with the old options. New tests will cover the new options.