Status
Current state: "Under Discussion"
Discussion thread: here (<- link to https://mail-archives.apache.org/mod_mbox/flink-dev/)
JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)
...
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
...
We should also use a real per-job cluster for the attached-per-job case, but that has implications for the functionality of bin/flink; see cluster lifecycle and public interfaces.
JobGraph retrieval
For executing a user program, we need to retrieve a JobGraph from the main() method and hand that off to a client to trigger execution. There are two ways we do this currently: 1. hijacking execute() to throw an exception and 2. injecting a client into the ContextEnvironment.
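To make the first approach concrete, here is a minimal, self-contained sketch of the "hijack execute() and throw an exception" idea. All class and method names (`CapturingEnvironment`, `ProgramAbortSignal`, `PlanSketch`, `retrievePlan`) are hypothetical stand-ins and not Flink classes; the "plan" is just a list of operator names rather than a real JobGraph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; none of these types exist in Flink under these names.
final class HijackExecuteSketch {

    /** Stand-in for a JobGraph: the job name plus the recorded operator names. */
    static final class PlanSketch {
        final String jobName;
        final List<String> operators;

        PlanSketch(String jobName, List<String> operators) {
            this.jobName = jobName;
            this.operators = operators;
        }
    }

    /** Thrown by execute() to abort the user's main() once the plan is captured. */
    static final class ProgramAbortSignal extends RuntimeException {
        ProgramAbortSignal() {
            super("plan captured, aborting user program");
        }
    }

    /** Environment whose execute() records the plan instead of running it. */
    static final class CapturingEnvironment {
        private final List<String> operators = new ArrayList<>();
        private PlanSketch captured;

        CapturingEnvironment addOperator(String name) {
            operators.add(name);
            return this;
        }

        void execute(String jobName) {
            captured = new PlanSketch(jobName, new ArrayList<>(operators));
            throw new ProgramAbortSignal();
        }
    }

    /** Runs the user's main() against the capturing environment and returns the plan. */
    static PlanSketch retrievePlan(Consumer<CapturingEnvironment> userMain) {
        CapturingEnvironment env = new CapturingEnvironment();
        try {
            userMain.accept(env);
        } catch (ProgramAbortSignal expected) {
            // Expected: execute() aborted the program after capturing the plan.
        }
        if (env.captured == null) {
            throw new IllegalStateException("main() never called execute()");
        }
        return env.captured;
    }

    public static void main(String[] args) {
        PlanSketch plan = retrievePlan(env -> {
            env.addOperator("source").addOperator("map").addOperator("sink");
            env.execute("demo-job");
            // Anything after execute() is skipped in this mode.
            System.out.println("never reached");
        });
        System.out.println(plan.jobName + " " + plan.operators);
    }
}
```

The sketch also shows a side effect of this approach: any user code after `execute()` never runs while the plan is being extracted.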
...
This does not change public interfaces. What will change, however, is the feedback that a user gets when running bin/flink in attached mode: failures during job submission will no longer be reported back. See cluster lifecycle.
Design
We propose two changes: 1. stop using the "hijack execute and throw an exception" code path and instead have an execute() method that does the per-job cluster deployment, and 2. also use that new code path for detached-per-job mode.
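A rough sketch of what such an execute() could look like follows. It only illustrates the control flow and is not the proposed implementation. `PerJobClusterDeployer`, `DeployingEnvironment`, and the `detached` flag are hypothetical names, and the assumption that attached mode blocks on the result while detached mode returns immediately is ours for the purpose of the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; the interface and class names are illustrative only.
final class DeployingExecuteSketch {

    /** Assumed hook that starts a dedicated cluster for exactly one job. */
    interface PerJobClusterDeployer {
        CompletableFuture<String> deployAndRun(String jobName, List<String> operators);
    }

    /** Environment whose execute() deploys a per-job cluster itself. */
    static final class DeployingEnvironment {
        private final PerJobClusterDeployer deployer;
        private final boolean detached;
        private final List<String> operators = new ArrayList<>();

        DeployingEnvironment(PerJobClusterDeployer deployer, boolean detached) {
            this.deployer = deployer;
            this.detached = detached;
        }

        DeployingEnvironment addOperator(String name) {
            operators.add(name);
            return this;
        }

        /** Same code path for both modes; only the waiting behavior differs. */
        Optional<String> execute(String jobName) {
            CompletableFuture<String> result =
                    deployer.deployAndRun(jobName, new ArrayList<>(operators));
            if (detached) {
                // Detached (assumed): return right after triggering the deployment.
                return Optional.empty();
            }
            // Attached (assumed): block until the job reports a result.
            return Optional.of(result.join());
        }
    }

    public static void main(String[] args) {
        // Fake deployer that completes immediately, standing in for a real cluster.
        PerJobClusterDeployer fake = (name, ops) ->
                CompletableFuture.completedFuture(name + " ran " + ops.size() + " operators");

        DeployingEnvironment attached = new DeployingEnvironment(fake, false);
        attached.addOperator("source").addOperator("sink");
        System.out.println(attached.execute("demo-job").orElse("no result"));

        DeployingEnvironment detachedEnv = new DeployingEnvironment(fake, true);
        detachedEnv.addOperator("source");
        System.out.println(detachedEnv.execute("demo-job").orElse("no result"));
    }
}
```

Unlike an execute() that throws to abort main(), this execute() returns normally, so user code placed after it keeps running.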
...