...

FLIP-34 introduced two different types of job stop:

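| Type      | Source OPS                                       | Task Status | Job Status |
|-----------|--------------------------------------------------|-------------|------------|
| SUSPEND   | Checkpoint Barrier, End Of Stream                | Finished    | Finished   |
| TERMINATE | MAX_WATERMARK, Checkpoint Barrier, End Of Stream | Finished    | Finished   |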
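The difference between the two stop types can be summarized in a small model. The sketch below is only an illustration: the enum `StopType`, its methods, and the signal names as strings are hypothetical and are not taken from the Flink code base.

```java
import java.util.List;

/** Hypothetical model of the two stop types and the signals a source emits for each. */
enum StopType {
    // SUSPEND: the source emits only the checkpoint barrier, then end of stream.
    SUSPEND(List.of("CHECKPOINT_BARRIER", "END_OF_STREAM")),
    // TERMINATE: the source additionally emits MAX_WATERMARK before the barrier.
    TERMINATE(List.of("MAX_WATERMARK", "CHECKPOINT_BARRIER", "END_OF_STREAM"));

    private final List<String> sourceSignals;

    StopType(List<String> sourceSignals) {
        this.sourceSignals = sourceSignals;
    }

    /** Signals emitted by the source, in order. */
    List<String> sourceSignals() {
        return sourceSignals;
    }

    /** True only for TERMINATE, which flushes event time with MAX_WATERMARK. */
    boolean emitsMaxWatermark() {
        return sourceSignals.contains("MAX_WATERMARK");
    }
}
```

In both cases the task and job end in the Finished state; only the signals emitted by the source differ.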

To support performing a checkpoint when stopping the job (when retained checkpoints are configured), the following steps need to be implemented:

  1. The Job Manager triggers a synchronous checkpoint at the sources, which also indicates whether the stop type is TERMINATE or SUSPEND.
  2. The sources send a MAX_WATERMARK in case of TERMINATE; nothing is done in case of SUSPEND.
  3. The Task Manager executes the checkpoint in a SYNCHRONOUS way, i.e. it blocks until the state is persisted successfully and notifyCheckpointComplete() has been executed.
  4. The Task Manager acknowledges the successful persistence of the state for the checkpoint.
  5. The Job Manager sends the notification that the checkpoint is completed.
  6. The Task Manager unblocks the synchronous checkpoint execution.
  7. The job finishes starting from the sources, i.e. they shut down and the EOS message propagates through the job.
  8. The Job Manager waits until the job state goes to FINISHED before declaring the operation successful.
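
The ordering of the steps above can be sketched as a simple trace. This is a minimal, single-threaded illustration under stated assumptions: the class `StopWithCheckpointSketch`, the method `stop`, and the event strings are hypothetical, and the real blocking and RPC behaviour between Job Manager and Task Manager is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical trace of the stop-with-checkpoint steps; not Flink code. */
final class StopWithCheckpointSketch {

    /**
     * Returns the ordered events for a stop request.
     * terminate = true models TERMINATE, false models SUSPEND.
     */
    static List<String> stop(boolean terminate) {
        List<String> trace = new ArrayList<>();
        String type = terminate ? "TERMINATE" : "SUSPEND";

        // Step 1: the Job Manager triggers the synchronous checkpoint.
        trace.add("JM: trigger synchronous checkpoint (" + type + ")");
        // Step 2: only TERMINATE makes the sources emit MAX_WATERMARK.
        if (terminate) {
            trace.add("Source: emit MAX_WATERMARK");
        }
        // Step 3: the Task Manager persists state and stays blocked.
        trace.add("TM: persist state and block");
        // Step 4: the Task Manager acknowledges the checkpoint.
        trace.add("TM: acknowledge checkpoint");
        // Step 5: the Job Manager notifies completion.
        trace.add("JM: notify checkpoint complete");
        // Step 6: notifyCheckpointComplete() has run, so the task unblocks.
        trace.add("TM: unblock synchronous checkpoint");
        // Step 7: the sources shut down and EOS propagates downstream.
        trace.add("Source: shut down and propagate EOS");
        // Step 8: the Job Manager observes FINISHED and reports success.
        trace.add("JM: job FINISHED, stop successful");
        return trace;
    }
}
```

Calling `StopWithCheckpointSketch.stop(true)` yields the TERMINATE sequence, while `stop(false)` yields the same sequence without the MAX_WATERMARK event.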

...