Document the state by adding a label to the FLIP page with one of "discussion", "accepted", "released", "rejected".

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Apache Curator is used to interact with ZooKeeper when Flink runs in HA mode. The current setup lacks several configuration options that could be useful in certain Flink deployments.

We want to ensure that the relevant options available in Apache Curator are configurable for Flink users, so that they have every mechanism needed for Flink to interact with ZooKeeper. The listed features could be critical for Flink adoption with ZooKeeper in cloud environments. For example, it is currently not possible to use ZooKeeper with an authorization mechanism together with Flink.

Public Interfaces

Some new configuration options should be exposed under the high-availability.zookeeper configuration.

| Proposed option | Configuration type | Motivation |
| --- | --- | --- |
| high-availability.zookeeper.client.authorization | ConfigOptions#mapType() | Makes it possible to fully use the existing ZooKeeper setup of an environment. For example, in certain cases ZooKeeper requires additional authorization information, such as a list of valid names for the ensemble, to prevent accidentally connecting to the wrong ensemble. |
| high-availability.zookeeper.client.max-close-wait-ms | ConfigOptions#intType() | Lets users adjust to different network speeds. |
| high-availability.zookeeper.client.simulated-session-expiration-percent | ConfigOptions#intType() | Enables session-expiration checking beyond what ZooKeeper itself provides. |

The remaining options provided by the Curator framework are not considered useful.

Proposed Changes

We should incorporate the aforementioned options and translate configuration values into the corresponding Curator builder calls.
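The translation into builder calls could look roughly like the sketch below. It is illustrative only: the nested Builder class is a hypothetical stand-in that mirrors the two integer setters of Curator's CuratorFrameworkFactory.Builder so the example compiles without the Curator dependency, and the apply helper and class name are invented for this sketch. It assumes each value is read as an Optional (for instance via Flink's Configuration#getOptional) and that the builder is only touched when the user actually set the option, so Curator's defaults remain in effect otherwise.

```java
import java.util.Optional;

final class CuratorOptionsSketch {

    /** Hypothetical stand-in mirroring two setters of Curator's CuratorFrameworkFactory.Builder. */
    static final class Builder {
        Integer maxCloseWaitMs;                    // null means "leave the Curator default"
        Integer simulatedSessionExpirationPercent; // null means "leave the Curator default"

        Builder maxCloseWaitMs(int value) {
            this.maxCloseWaitMs = value;
            return this;
        }

        Builder simulatedSessionExpirationPercent(int value) {
            this.simulatedSessionExpirationPercent = value;
            return this;
        }
    }

    /** Calls a builder method only for options that were explicitly configured. */
    static Builder apply(
            Builder builder,
            Optional<Integer> maxCloseWaitMs,
            Optional<Integer> simulatedSessionExpirationPercent) {
        maxCloseWaitMs.ifPresent(builder::maxCloseWaitMs);
        simulatedSessionExpirationPercent.ifPresent(builder::simulatedSessionExpirationPercent);
        return builder;
    }
}
```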

An issue arises from a type mismatch between the Flink configuration parameter high-availability.zookeeper.client.authorization and the corresponding Curator method call. The Curator method expects a list of AuthInfo (see the method javadoc for authorization(List<AuthInfo>)), while the Flink ConfigOptions#mapType() option yields a different Java type, Map<String, String>. To resolve this, we suggest the following conversion: each Map.Entry<String, String> is transformed into an AuthInfo object using the constructor AuthInfo(String, byte[]). The value of entry.getKey() serves as the String scheme argument, while the value of entry.getValue() is first converted to a byte[] using String#getBytes(). That byte array is then passed as the byte[] auth argument when creating the AuthInfo.
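A minimal sketch of this conversion follows. To keep it self-contained, the nested AuthInfo class is a hypothetical stand-in with the same (String scheme, byte[] auth) constructor shape as Curator's AuthInfo; the class name and the toAuthInfos helper are assumptions made for illustration, not part of the proposal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ZooKeeperAuthConversionSketch {

    /** Hypothetical stand-in for Curator's AuthInfo(String scheme, byte[] auth). */
    static final class AuthInfo {
        final String scheme;
        final byte[] auth;

        AuthInfo(String scheme, byte[] auth) {
            this.scheme = scheme;
            this.auth = auth;
        }
    }

    /** Converts the map-typed Flink option value into one AuthInfo per entry. */
    static List<AuthInfo> toAuthInfos(Map<String, String> authorization) {
        List<AuthInfo> result = new ArrayList<>(authorization.size());
        for (Map.Entry<String, String> entry : authorization.entrySet()) {
            // Key becomes the scheme; value is converted with String#getBytes(),
            // which uses the platform default charset, as described in the text.
            result.add(new AuthInfo(entry.getKey(), entry.getValue().getBytes()));
        }
        return result;
    }
}
```

Since String#getBytes() depends on the platform default charset, an implementation may prefer to pass an explicit charset; that choice is left open here.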

Compatibility, Deprecation, and Migration Plan

N/A

Test Plan

Simple manual tests will verify that the given options are applied correctly.

For high-availability.zookeeper.client.authorization we can add a unit test that validates the conversion from Map<String, String> to the list of AuthInfo objects.

Rejected Alternatives

Generic configuration for all Apache Curator options via namespaces

We could consider using namespaces: the FLIP could propose adding namespace support for Apache Curator. For example, high-availability.zookeeper.client.<config_option> could be translated into the corresponding <config_option> of the Curator configuration. That would allow loading any parameter supported by Curator.

Unfortunately, the Curator connection is configured through the Builder pattern, where each configuration value has to be translated into a specific method call on the Builder object, so options cannot be passed through generically.