...
- There is no separate ZK dependency for the client and the server
- Makes it easy to start ZK in "managed" mode (see below)Will we shade ZK away?, because we will ship ZK with Flink
JM/TM/Client configuration
- Add HA mode for JMs (allowing non-HA and HA JM operation)
- HA mode when a ZooKeeper quorum is configured
- zookeeper.quorum: "host:port,[...,]host:port"
- TM, clients need to:
- connect directly to configured JM (non-HA)
- connect to JM leader; ZK coordinates the JM address (HA)
- Note: as of current stable ZK version (3.4.6) the ZK quorum needs to be statically configured (see ZK-107)
...
- ZK client configuration in Flink only requires the address of the ZK quorum to connect to
- Znode root and znode paths relative to this root, e.g. /flink (root), and /flink/leader
ZK server configuration
- ZK managed by Flink (default): Flink startup scripts provides script to start ZK servers for HA mode
zookeeper.mode: "managed" (default)
zookeeper.dir: "/flink" (default)
zookeeper.quorum: "host:port,[...,]host:port"
zookeeper.property.<zk-property>: ... (set single ZK properties)
zookeeper.conf=path-to-zk-conf (similar to current Hadoop conf; has precedence over Flink conf)- Configuration via zoo.cfg
- Dedicated ZK cluster (user sets up ZK)
zookeeper.mode: "dedicated"
zookeeper.quorum: "host:port,[...,]host:port"zookeeper.dir: "/flink" (default)
...
Startup scripts
...
- Add an optional file `backup-masters` or `masters``bin/start-cluster-streaming.sh` takes extra argument for backup masters to start`masters` to use for HA mode (contains all masters)
- No extra startup scripts, HA mode is activated if quorum is configured
- scripts start JobManager on hosts configured in `masters`
Leader Election
TaskManager and Client retrieve leading JobManager using ZooKeeper
Every time they lose connection to the current JobManager, they try to reconnect to the current leader using ZooKeeper
...