...

Solution: implement JobManager (JM) fault tolerance and high availability by running multiple JM instances, with one as leader and the other(s) in standby. The exact coordination and state update protocol between JM, TaskManager (TM), and clients is the topic of this document.
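The leader/standby arrangement described above can be illustrated with a minimal sketch. It is not the protocol this document goes on to define: `InMemoryCoordinator` and its methods are hypothetical names, the coordination service is replaced by an in-memory dictionary, and the rule "the longest-registered live JM is leader" is an assumption chosen only to make the failover behaviour concrete.

```python
from typing import Dict, List, Optional


class InMemoryCoordinator:
    """Hypothetical in-memory stand-in for an external coordination service.

    Each JM registers and receives an increasing sequence number. The live JM
    with the lowest sequence number is treated as leader; all others are
    standbys. This ordering rule is an assumption made for illustration.
    """

    def __init__(self) -> None:
        self._next_seq = 0
        self._alive: Dict[int, str] = {}  # sequence number -> JM id

    def register(self, jm_id: str) -> int:
        """Add a JM instance and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._alive[seq] = jm_id
        return seq

    def fail(self, jm_id: str) -> None:
        """Simulate a JM crash by dropping its registration."""
        self._alive = {s: j for s, j in self._alive.items() if j != jm_id}

    def leader(self) -> Optional[str]:
        """Return the current leader, or None if no JM is alive."""
        if not self._alive:
            return None
        return self._alive[min(self._alive)]

    def standbys(self) -> List[str]:
        """Return the live non-leader JMs in takeover order."""
        ordered = [self._alive[s] for s in sorted(self._alive)]
        return ordered[1:]


if __name__ == "__main__":
    coord = InMemoryCoordinator()
    for jm in ("jm-1", "jm-2", "jm-3"):
        coord.register(jm)
    print(coord.leader(), coord.standbys())
    coord.fail("jm-1")  # the leader crashes; the next standby takes over
    print(coord.leader(), coord.standbys())
```

In a real deployment, TMs and clients would also have to learn about the leader change, and the new leader would have to recover the old leader's state. The sketch leaves both out.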

JIRA: FLINK-2287

Having standby JM instances requires distributed coordination between JM, TM, and clients. For this, we will use ZooKeeper (ZK).

Pros:

...