Jira Boards
Flink 1.17 Burndown: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=572
Sync meeting
The sync meeting takes place every second Tuesday, starting on the 1st of November 2022, at 9am CET / 4pm China Standard Time / 8am UTC.
As we get closer to the feature freeze, we will switch to a weekly meeting.
Feel free to join on Google Meet. Local dial-in numbers can be found at https://tel.meet/wcx-fjbt-hhz?pin=1940846765126
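As a quick sanity check of the schedule above, the following sketch lists the biweekly Tuesdays and converts the 8am UTC start time into local times. The fixed UTC offsets (+1 for Central Europe in winter, +8 for China Standard Time) and the helper names `biweekly_syncs` and `local_times` are assumptions made for illustration; they are not part of the release process.

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumed fixed offsets for the winter months covered by this schedule.
CENTRAL_EUROPE = timezone(timedelta(hours=1))  # UTC+1
CHINA = timezone(timedelta(hours=8))           # UTC+8


def biweekly_syncs(first: date, last: date) -> list[date]:
    """Return every second week's meeting day from `first` through `last` (inclusive)."""
    days = []
    current = first
    while current <= last:
        days.append(current)
        current += timedelta(weeks=2)
    return days


def local_times(day: date) -> dict[str, str]:
    """Convert the 08:00 UTC start on `day` into the assumed local zones."""
    start = datetime.combine(day, time(8, 0), tzinfo=timezone.utc)
    return {
        "UTC": start.strftime("%H:%M"),
        "Central Europe": start.astimezone(CENTRAL_EUROPE).strftime("%H:%M"),
        "China": start.astimezone(CHINA).strftime("%H:%M"),
    }


# Biweekly phase of the schedule, before the switch to weekly meetings.
syncs = biweekly_syncs(date(2022, 11, 1), date(2023, 1, 10))
for day in syncs:
    print(day.isoformat(), local_times(day))
```

The resulting dates line up with the biweekly entries in the Summary and Status / Follow-ups sections.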
Timeline
- Feature Freeze
Originally January 17th, 2023, end of business CET; moved to January 31st, 2023, end of business CET (ML discussion on change)
- Release
Mid of March 2023 - End of March 2023
Chinese New Year is on January 22nd, followed by a holiday of one week or more. People will be back in early February.
Highlight features
Please feel free to add or suggest features.
Features
List of features announced by contributors and committers that are likely to be ready for the feature freeze:
NOTICE: Preferably, only new features end up here rather than every bug or task separately, so that the page does not become bloated. The exception is a bug fix that is big or important enough to be equivalent to implementing a completely new feature. A good rule of thumb is that each entry on the page could (but does not have to) be included later in a release blog post.
Legend
State
symbol | meaning | comment |
---|---|---|
validated | through cross team testing | |
done | well documented with complete test coverage | |
will make it | there is no reason this effort should not go into 1.17 | |
in danger | there are some concerns about whether the effort will be ready for the feature freeze of 1.17 | |
very unlikely | there are severe concerns about whether the effort can make it into 1.17 | |
won't make it | it was decided against adding this to the 1.17 release; work on the effort has been stopped | |
state unclear | ||
independent | as the artifact could be released independent of Apache Flink |
X-Team verification
symbol | meaning |
---|---|
done | |
not required |
Feature Stage
Please align with the list on the Apache Flink Roadmap (https://flink.apache.org/roadmap.html).
- MVP: Have a look, consider whether this can help you in the future.
- Beta: You can benefit from this, but you should carefully evaluate the feature.
- Ready and Evolving: Ready to use in production, but be aware you may need to make some adjustments to your application and setup in the future, when you upgrade Flink.
- Stable: Unrestricted use in production
- Reaching End-of-Life: Stable, still feel free to use, but think about alternatives. Not a good match for new long-lived projects.
- Deprecated: Start looking for alternatives now
Summary
Numbers are based on the items in the list below, not on the tickets
Date | | | | | | | | ∑ | Remaining weeks |
2022-11-01 | | | | | | | | | 13 |
2022-11-15 | | | | | | | | | 11 |
2022-11-29 | 4 | 10 | 0 | 0 | 0 | 23 | 0 | 37 | 9 | |
2022-12-13 | 7 | 24 | 1 | 0 | 1 | 9 | 0 | 42 | 7 | |
2022-12-27 | 7 | 25 | 1 | 0 | 1 | 8 | 0 | 42 | 5 | |
2023-01-10 | 9 | 27 | 0 | 0 | 3 | 5 | 0 | 44 | 3 | |
2023-01-17 | 13 | 23 | 0 | 0 | 3 | 3 | 2 | 44 | 2 | |
2023-01-24 | 16 | 21 | 0 | 0 | 3 | 3 | 2 | 45 | 1 | |
2023-01-31 | 26 | 5 | 0 | 0 | 16 | 0 | 0 | 47 | 0 | |
2023-02-14 | 5 | 23 | 3 |
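The "Remaining weeks" values in the table are consistent with counting whole weeks from each sync date to the extended feature freeze on January 31st, 2023. A minimal sketch of that arithmetic follows; the function name `remaining_weeks` is hypothetical and only serves this illustration.

```python
from datetime import date

# Extended feature freeze date taken from the timeline on this page.
FEATURE_FREEZE = date(2023, 1, 31)


def remaining_weeks(sync_day: date, freeze: date = FEATURE_FREEZE) -> int:
    """Whole weeks left until the feature freeze (never negative)."""
    return max((freeze - sync_day).days // 7, 0)


# Sync dates listed in the summary table up to the feature freeze.
sync_days = [
    date(2022, 11, 1),
    date(2022, 11, 15),
    date(2022, 11, 29),
    date(2022, 12, 13),
    date(2022, 12, 27),
    date(2023, 1, 10),
    date(2023, 1, 17),
    date(2023, 1, 24),
    date(2023, 1, 31),
]
weeks = [remaining_weeks(day) for day in sync_days]
for day, left in zip(sync_days, weeks):
    print(day.isoformat(), left)
```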
List
Feel free to add categories.
Runtime | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
FLIP in voting | ||||||||||
Shuffle | ||||||||||
Xintong Song | 100% | 10-01-2023 | 10-01-2023 | 14-02-2023 | ||||||
AdaptiveBatchScheduler should support early consumption for dynamic graph. | Weijie Guo | Xintong Song | ||||||||
Yuxin Tan | Xintong Song | 100% | 17-01-2023 | 17-01-2023 | 10-02-2023 | self-test | ||||
Further improvement of production availability of hybrid shuffle | 100% | 17-01-2023 | 17-01-2023 | self-test | ||||||
Deployment & Cluster Coordination | ||||||||||
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
REST API | ||||||||||
100% | 20-02-2023 | 15-11-2022 | 15-11-2022 | self-test | ||||||
The first stage is finished; the second stage consists of improvements and will be finished in the next version. | 100% | 09-01-2023 | 05-01-2023 | 09-01-2023 | ||||||
Reactive Mode | ||||||||||
n/a | ||||||||||
Scheduler | ||||||||||
100% | 17-01-2023 | |||||||||
Biao Liu | 100% | 31-01-2023 | ||||||||
100% | 31-01-2023 | |||||||||
Misc | ||||||||||
n/a | ||||||||||
OLAP | ||||||||||
n/a | ||||||||||
State backend | ||||||||||
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
Hangxiang Yu | 80% | 31-1-2023 | ||||||||
Improve File Management in State Backend | 5% | 31-1-2023 | ||||||||
100% | 31-1-2023 | self-test | ||||||||
70% | 31-1-2023 | |||||||||
40% | 31-1-2023 | |||||||||
100% | 30-1-2023 | Tested by Martijn Visser | ||||||||
100% | 31-1-2023 | Tested by Martijn Visser | ||||||||
Checkpoint | ||||||||||
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
Benchmark Generic Incremental CP + UC + BD | Yuan Mei | |||||||||
100% |
| |||||||||
Benchmark | ||||||||||
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
Yuan Mei | 50% | 31-1-2023 | ||||||||
API | ||||||||||
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
Deprecated | 100% | 08-11-2022 | 08-11-2022 | 08-11-2022 | Self-tested | |||||
Removed | 40% | 31-01-2023 | ||||||||
Dawid Wysakowicz | 90% | 31-01-2023 | ||||||||
FLIP Discussion | Needs to be postponed to the next release | 10-01-2023 | ||||||||
Martijn Visser | Removed | 100% | 14-11-2022 | 14-11-2022 | 14-11-2022 | Self-tested | ||||
90% | 31-01-2023 | Self-tested | ||||||||
Ruan Hang | Coding | |||||||||
SQL | ||||||||||
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
General | ||||||||||
Godfrey He, Yunhong Zheng | 100% | 2023-01-31 | ||||||||
31-01-2023 | ||||||||||
Chesnay Schepler | 31-01-2023 | |||||||||
Table API | ||||||||||
2023-01-20 | ||||||||||
Calcite Update | ||||||||||
Sergey Nuyanzin | 31-01-2023 | |||||||||
Flink Dialect | ||||||||||
100% | 2023-02-14 | self-tested | ||||||||
100% | 2023-02-14 | self-tested | ||||||||
Hive Dialect | ||||||||||
100% | 2023-01-31 | |||||||||
31-01-2023 | ||||||||||
SQL Gateway | ||||||||||
Shengkai Fang, Zelin Yu | writing doc | 100% | 2023-01-10 | |||||||
QE | ||||||||||
100% | 2023-01-10 | |||||||||
QO | ||||||||||
FLINK-27591 - Improve the plan for batch queries when statistics is unavailable (OPEN) | Godfrey He, Yunhong Zheng | Needs to be postponed to the next release | 40% | 2023-1-31 | ||||||
API/Python | ||||||||||
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
HuangXingbo | 100% | 10-01-2023 | self-test | |||||||
HuangXingbo | 0% | 10-01-2023 | ||||||||
HuangXingbo | 100% | 19-01-2023 | self-test | |||||||
HuangXingbo | 100% | 19-01-2023 | self-test | |||||||
HuangXingbo | 100% | 10-01-2023 | self-test | |||||||
Machine Learning | ||||||||||
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
n/a | ||||||||||
CEP | ||||||||||
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
n/a | ||||||||||
Web | ||||||||||
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified |
100% |
Status / Follow-ups
2022-11-01
Agenda
- Kickoff
- Keeping the state of features updated (ideally before the sync)
- Blockers
- Build stability
- Are there any (new) contributors who need a PR reviewed or merged? And if yes, who can help out?
2022-11-15
Agenda
- Kickoff
- Keeping the state of features updated (ideally before the sync)
- Blockers
- Build stability
- Are there any (new) contributors who need a PR reviewed or merged? And if yes, who can help out?
2022-11-29
- Build instabilities
- FLINK-28766: Anton has some new findings on that issue and will get back to it
- Pulsar-related issues:
- Python-related issues:
- FLINK-29461: Matthias Pohl pings Xingbo Huang / Dian Fu
- FLINK-26974
- FLINK-18356: OOM errors are most likely being caused by flink-table-planner (Godfrey He might be a person to reach out to about it)
- FLINK-29427: Qingsheng Ren will look into the PR
- FLINK-27916: Matthias Pohl will ping contributors on the issue, but it's not that urgent since it's not failing that frequently
- Other topics:
- Externalizing Pulsar connector (test instabilities): Martijn Visser is working on externalizing the code base
- Performance test monitoring: Discussion is happening on the mailing list
- Externalizing connectors in general is work-in-progress but looks good
- Public CI documentation can be improved
- Matthias Pohl will work on a first approval
2022-12-13
- Build instabilities
- FLINK-29405 → Qingsheng to have a look at the PR
- Pulsar connector has been synced to the external connector repository. Martijn Visser to open a PR to remove the connector from master
- FLINK-18356 → Qingsheng to ping Godfrey
- FLINK-18356 → The PR is still failing with the same issue that it is supposed to fix. We should ping the author to have a look first
- FLINK-27916 → Martijn to ping Thomas once more
- FLINK-26974 → Xingbo is working on this
- How to have monitoring and quality control for the externalized connectors → Need to have a discussion on the Dev mailing list. Martijn Visser to make a proposal and open a discussion thread on this topic.
2022-12-27
- Meeting skipped due to Christmas holiday/sick leaves
- Discussion started on moving the feature freeze from Jan 17 to Jan 31 due to pandemic situation in China (see dev ML discussion thread)
- Pulsar connector has been externalized. Pulsar-related test instabilities were disabled in release-1.16 and release-1.15 (see FLINK-30351 and parent task)
2023-01-10
- Build instabilities (all 1.17 test instabilities with a priority >=Major)
- FLINK-18356
- Godfrey He and Yunhong Zheng are working on it
- Run tests one after another rather than in parallel
- Don't reuse JVMs
- JUnit has a feature to log memory consumption
- FLINK-26974
- FLINK-29427
- FLINK-18356
- Priorities of test instabilities (docs about it)
- Test instabilities are prioritized as Critical and become blocker as soon as we notice that they are newly introduced
- Feature freeze extended until Jan 31, 2023
- Switching to weekly calls as we're getting closer to the feature freeze?
- Yes, switch to weekly will happen. Next meeting will be Jan 17 due to Chinese New Year coming up and the feature freeze happening soon
- Votes on FLIPs are stalled due to outstanding votes
- Qingsheng Ren will reach out to Martijn Visser about it
- Share in the Slack dev channel to get more people to look into it
- Leader election lacks test coverage (FLIP-285, FLINK-26522)
- No issues popped up on the mailing list since 1.16
2023-01-17
- Build instabilities (all 1.17 test instabilities with a priority >=Major)
- Blockers
- Martijn Visser to check open critical test instabilities to determine whether some need to be assigned and whether we're OK in the overall direction
- FLIP-272: Generalized delegation token support has been merged in and a blog post will be written and published about it, prior to the 1.17 release
- https://github.com/apache/flink/pull/21606 has been merged into Flink, but should also be taken into account for externalized connectors. This is tracked under FLINK-30639
2023-01-24
- Build instabilities (all 1.17 test instabilities with a priority >=Major)
- Performance regressions
- Blockers
- FLINK-29405 → In progress, needs a status update
- FLINK-30727 → In progress, needs a status update
- FLINK-30328 → Resolved
- FLINK-29427 → In progress, needs a status update
- FLINK-30618 → Martijn Visser is looking into this one
- FLINK-30733 → Martijn Visser is looking into this one
2023-01-31
- Today is feature freeze day
- 26 features / improvements are in for Flink 1.17 (47 in at Flink 1.16, 20 for Flink 1.15, 27 for Flink 1.14)
- 5 features are still listed as expected to be completed but are not yet in. 4 of them have been merged and their documentation is being written. Martijn Visser to check and update the status of these items.
- Blockers:
- FLINK-30881 → Matthias will look into this
- FLINK-29427 → Passed Leonard's review, waiting for CI to turn green
- FLINK-29405 → Fixed by Qingsheng
- FLINK-30625 → Should be resolved, pending validation by the benchmarks (related to FLINK-30624). Benchmarks look to be improved, downgraded to Critical.
- FLINK-30624 → Should be resolved, pending validation by the benchmarks (related to FLINK-30625). Benchmarks look to be improved, downgraded to Critical.
- FLINK-30623 → Martijn to reach out to Dong, Rui Fan, Piotr. We're planning to give them until Friday the 3rd of February to come to a conclusion on this ticket; if no consensus is achieved, then the original commit that introduced the regression should be reverted.
- FLINK-30826 → Matthias to check if this has already been resolved via another ticket (multiple related tickets)
- FLINK-30727 → [Critical] PR to update buffers for the test has been merged; we will continue to monitor it. Test downgraded to Critical
- FLINK-30846 → [Critical] Downgraded to Critical as it's only a test-related issue and doesn't indicate a bug in production.
- FLINK-30844 → [Major] Test downgraded to Major. If the test fails again, the contributor will increase the waiting interval for this test as a solution.
- FLINK-30870 → [Major] Leonard downgraded the issue priority to Major as it's a known Slack plugin issue
- Martijn Visser to communicate to the Flink community this evening that the feature freeze has started and we plan to cut the release branch at the end of this week (Friday 3rd of February).
2023-02-07
- master is stabilized enough to cut the release-1.17 branch
- FLINK-30921 - The Azure apt mirror instabilities seem to have been resolved for now.
- FLINK-30908 - The issue turned out to be a problem that existed in previous releases
- Release branch is going to be cut today by Leonard Xu
- Release testing will be announced: 2 weeks will be planned for this
2023-02-14
- Status update on release testing efforts
- FLINK-30926
- End of cross-team testing date is 21st of February 2023. We will monitor the status throughout the week and hopefully conclude everything next week.
- When a feature is cross-team tested, its icon needs to be changed to indicate that the testing has been completed
- We always look for volunteers: picking up a cross-team testing task is much appreciated.
- Proposal: Create Jira issues for release management tasks to document what was done to improve review-ability (alternatively, add expected output to release documentation)
- Do this from now on for the next steps in release management (create release candidate etc.)
- Test instabilities:
- FLINK-31036: Blocker
- FLINK-18356: Started failing more regularly again. Qingsheng Ren to ping Godfrey He and Yunhong Zheng
- FLINK-30972 (openssl version update necessary) continues to fail because of FLINK-30965 (repo-sync doesn't pick up 1.15 changes anymore)
2023-02-21
- Open Blocker issues
- Example Release Jira issues: FLINK-31146 > FLINK-31154
- Release testing not finished yet
- Decision on RC creation moved to next week's Flink release sync call
- Other issues:
- FLINK-31092: 1.17 issue in Hive
- FLINK-30733: no updates on the Slack bot instability so far
- FLINK-31134: OOM issue - waiting for Gabor's response
- FLINK-31145 (Kafka infrastructure umbrella ticket)
2023-02-28
- Blockers: are these really blockers or do we want to move the release forward?
- FLINK-31092 → Qingsheng Ren to check the status with Shengkai Fang
- FLINK-31104 → Qingsheng Ren to check the status with Shengkai Fang
- FLINK-30978 → Qingsheng Ren to check the status with Shengkai Fang
- Other issues:
- FLINK-31134: Revisit OOM in Kafka e2e test in 1.15.3 - we're keeping it as critical for now. There are no other artifacts that we can investigate. The test doesn't involve table-runner code (therefore, it isn't connected to FLINK-18356). → Martijn Visser to check internally
- Dependabot alerts are affecting all committers now
- Release managers to review draft of 1.17 announcement
- Release 1.17 preparation: FLINK-31146
2023-03-07
- Ubuntu mirror instabilities: FLINK-30921. Matthias Pohl will follow up on that one
- API backwards compatibility: FLINK-31167. Do we have additional documentation on API compatibility? Leonard Xu will do another pass over the ticket comments to confirm the findings
- Announcement: FLIPs missing: FLIP-272: Generalized delegation token support, FLIP-217: Support watermark alignment of source splits. Leonard Xu reaches out to contributors to add something related to the announcement
- Instabilities (blocker query):
- FLINK-31351: Leonard Xu has pinged luoyuxia to look into this issue.
- FLINK-31278: Downgraded → [Critical] PR ready, flink-runtime tests will be made sequential
- FLINK-31339: Downgraded → [Critical] Leonard Xu has pinged Godfrey He to look into this issue.
- FLINK-31342: Downgraded → [Critical]
- FLINK-31341: Downgraded → [Critical]
- FLINK-31354: Downgraded → [Critical] Matthias Pohl
- FLINK-31355: Downgraded → [Critical] Leonard Xu has pinged Dian Fu to look into this issue.
2023-03-14
- Flink RC2 for 1.17 is available, needs checkers
2023-03-21
- RC3 for 1.17.0 was created and is in the voting stage (1 binding vote is still missing)
- Robert Metzger agreed to assist with the license check
2023-03-23
- Flink 1.17.0 is officially released today!
Retrospective:
From Qingsheng Ren
- As discussed on the mailing list, we need to trigger a final patch version for 1.15 after releasing 1.17. Some cleanup steps need to be reviewed and changed, such as removing 1.15 data from svn, CI, flink-docker etc. See FLINK-31570
- I like Matthias Pohl's idea of tracking TODOs for the release in JIRA 👍 I used it as a checklist to make sure we don't miss anything. It also helps with collaboration, as we can easily divide work across RMs by assigning JIRA tickets.
From Matthias Pohl
- Google Meet might not be the best choice for the release sync. We need to be able to invite attendees even if the creator of the meeting isn't available (maybe try Zoom or even Jitsi as an open-source alternative instead?)
- Release sync every 2 weeks and a switch to weekly after feature freeze felt reasonable
- Slack worked well as a collaboration tool to document the monitoring tasks (#builds, #flink-dev-benchmarks) in a team with multiple release managers
- The Slack Azure Pipeline bot seems to be buggy. It swallows some build failures. It's not a severe issue, though. We created #builds-debug to monitor whether it's happening consistently. The issue is covered in FLINK-30733
- We experienced occasional issues in the manual steps of the release creation in the past (e.g. japicmp config was not properly pushed). Creating Jira issues for the release helped to make the release creation more transparent and made the steps more reviewable. Additionally, it helped to distribute subtasks to different people with Jira being the tool for documentation and synchronization. That's especially helpful when there is more than one person in charge of creating the release.
- Committers occasionally did backports/merges without PRs during the 1.17 release, which broke master/release branches (probably, changes that were not part of the PR were made locally before merging to have a faster backport experience). It might make sense to remind everyone that this should be avoided. Not sure whether we want to or can restrict that.
- We observed a good response on fixing test instabilities by the end of the release cycle but had some long-running issues earlier in the cycle, which caused extra effort for the release managers due to recurring test failures.
- Release testing picked up "slowly": Initially, we planned 2 weeks for release testing. But there was not really any progress (tickets being created and worked on) in the first week. In the end, we had to extend the phase by another week, resulting in 3 instead of 2 weeks of release testing. I guess we could encourage the community to create release testing tasks earlier and label them properly to be able to monitor the effort. That would even enable us to do release testing for a certain feature after the feature is done and not necessarily only at the end of the release cycle.
- Manual test data generation is tedious (FLINK-31593). But this should be fixed in 1.18 with FLINK-27518 being almost done.
- We started creating documentation for release management. The goal is to collect what tasks are there to help support a Flink release to encourage newcomers to pick up the task.
From Leonard Xu
- We can keep RC0 (a non-votable one) in future releases, as an initial version for developers to validate, so that some issues could be found earlier and avoid repeatedly canceling and re-creating RCs.
From Martijn Visser
- We should be more careful with commits without a PR / green CI, which caused some problems at the end of the 1.17 release cycle. It might not be possible to ban this entirely, but we could give a reminder to committers.