Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Before any Flink job can be executed, one JobManager and one or more TaskManager have to be started. The TaskManager then registers at the JobManager by sending a RegisterTaskManager message to the JobManager. The JobManager acknowledges a successful registration with an AcknowledgeRegistration message. In case that the TaskManager is already registered at the JobManager, because there were multiple RegisterTaskManager messages sent, an AlreadyRegistered message is returned by the JobManager. If the registration is refused, then the JobManager will respond with a RefuseRegistration message.

A job is submitted to the JobManager by sending a SubmitJob message with the corresponding JobGraph to it. Upon receiving the JobGraph, the JobManager creates an ExecutionGraph out of the JobGraph which serves as the logical representation of the distributed execution. The ExecutionGraph contains the information about the tasks which have to be deployed to the TaskManager in order to be executed.

The JobManager's scheduler is responsible for allocating execution slots on the available TaskManagers. After allocating an execution slot on a TaskManager, a SubmitTask message with all necessary information to execute the task is sent to the respective TaskManager. A successful task deployment is acknowledged by TaskOperationResult. Once the sources of the submitted job are deployed and running, also the job submission is considered successful. The JobManager informs the JobClient about this state by sending a Success message with the corresponding job id.

State updates of the individual task running on the TaskManagers are sent back to the JobManager via UpdateTaskExecutionState messages. With these update messages, the ExecutionGraph can be updated to reflect the current state of the execution.

The JobManager also acts as the input split assigner for data sources. It is responsible for distributing the work across all TaskManager such that data locality is preserved where possible. In order to dynamically balance the load, the Tasks request a new input split after they have finished processing the old one. This request is realized by sending a RequestNextInputSplit to the JobManager. The JobManager responds with a NextInputSplit message. If there are no more input splits, then the input split contained in the message is null.

The `Task`s Tasks are deployed lazily to the TaskManagers.
This  This means that tasks which consume data are only deployed after one of its producers has finished producing some data.
Once  Once the producer has done so, it sends a `ScheduleOrUpdateConsumers` ScheduleOrUpdateConsumers message to the JobManager.
This  This messages says that the consumer can now read the newly produced data.
If  If the consuming task is not yet running, it will be deployed to a TaskManager.