
Jira Boards

Flink 1.17 Burndown: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=572

Sync meeting

The sync meeting takes place every second Tuesday, starting on 1 November 2022 at 9am CET / 4pm China Standard Time / 8am UTC.
As we get closer to the feature freeze, we will hold the meeting on a weekly basis.
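
The biweekly schedule and the local start times can be double-checked with a small sketch. It assumes the first sync at 08:00 UTC on 1 November 2022 as stated on this page; the IANA zone names Europe/Berlin and Asia/Shanghai are illustrative stand-ins for Central European and China Standard Time, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# First sync as stated on this page: 1 November 2022, 08:00 UTC.
FIRST_SYNC_UTC = datetime(2022, 11, 1, 8, 0, tzinfo=timezone.utc)


def sync_meetings(count, interval_weeks=2):
    """Return `count` meeting start times in UTC, spaced `interval_weeks` apart."""
    step = timedelta(weeks=interval_weeks)
    return [FIRST_SYNC_UTC + i * step for i in range(count)]


def local_time(moment, zone_name):
    """Format a UTC moment as wall-clock time in the given IANA time zone."""
    return moment.astimezone(ZoneInfo(zone_name)).strftime("%Y-%m-%d %H:%M %Z")


meetings = sync_meetings(4)
for moment in meetings:
    # Zone names are assumptions chosen to represent the two regions.
    print(local_time(moment, "Europe/Berlin"), "|", local_time(moment, "Asia/Shanghai"))
```

Passing `interval_weeks=1` models the weekly rhythm used closer to the feature freeze.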


Feel free to join on Google Meet. Local dial-in numbers can be found at https://tel.meet/wcx-fjbt-hhz?pin=1940846765126

Timeline

  • Feature Freeze
  • Release
    • Mid of March 2023
    • End of March 2023

Chinese New Year is on 22 January: 1+ week holiday. People will be back in early February.

Highlight features

Please feel free to add/suggest.

Features

List of features announced by contributors and committers that are likely to be ready for the feature freeze:

NOTICE: Preferably, only new features should be listed here rather than every bug or task separately, so that the page does not become bloated. The exception is a bug fix that is big or important enough to be equivalent to implementing a completely new feature. A good rule of thumb is that each entry on the page could (but does not have to) be included in a release blog post later on.

Legend

State

symbol | meaning | comment
(big grin) | validated | through cross team testing
(tick) | done | well documented with complete test coverage
(green star) | will make it | there is no reason this effort should not go into 1.17
(star) | in danger | there are some concerns about whether the effort can be ready for the feature freeze of 1.17
(red star) | very unlikely | there are severe concerns about whether the effort can make it into 1.17
(minus) | won't make it | it was decided against adding this for the 1.17 release; work on the effort has been stopped
(question) | state unclear |
(blue star) | independent | as the artifact could be released independently of Apache Flink

X-Team verification

symbol | meaning
(tick) | done
(blue star) | not required

Feature Stage

Please align with the list on the Apache Flink Roadmap (https://flink.apache.org/roadmap.html).

  • MVP: Have a look, consider whether this can help you in the future.
  • Beta: You can benefit from this, but you should carefully evaluate the feature.
  • Ready and Evolving: Ready to use in production, but be aware you may need to make some adjustments to your application and setup in the future, when you upgrade Flink.
  • Stable: Unrestricted use in production
  • Reaching End-of-Life: Stable, still feel free to use, but think about alternatives. Not a good match for new long-lived projects.
  • Deprecated: Start looking for alternatives now 

Summary

Numbers are based on the items in the list below, not on the tickets 


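Date | (big grin) | (tick) | (green star) | (star) | (red star) | (minus) | (question) | (blue star) | Remaining weeks
2022-11-01 | | | | | | | | | 13
2022-11-15 | | | | | | | | | 11
2022-11-29 | 4 | 10 | 0 | 0 | 0 | 23 | 0 | 37 | 9
2022-12-13 | 7 | 24 | 1 | 0 | 1 | 9 | 0 | 42 | 7
2022-12-27 | 7 | 25 | 1 | 0 | 1 | 8 | 0 | 42 | 5
2023-01-10 | 9 | 27 | 0 | 0 | 3 | 5 | 0 | 44 | 3
2023-01-17 | 13 | 23 | 0 | 0 | 3 | 3 | 2 | 44 | 2
2023-01-24 | 16 | 21 | 0 | 0 | 3 | 3 | 2 | 45 | 1
2023-01-31 | 26 | 5 | 0 | 0 | 16 | 0 | 0 | 47 | 0
2023-02-14 | 5233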
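
As a rough illustration of how such a summary row could be tallied from the list items rather than from tickets, here is a minimal sketch. The sample states are hypothetical and not taken from the list; the feature freeze date of 31 January 2023 is the one recorded in the 2023-01-10 notes further down, and the function names are made up for this example.

```python
from collections import Counter
from datetime import date

# State symbols in the order used by the legend on this page.
STATES = ["(big grin)", "(tick)", "(green star)", "(star)",
          "(red star)", "(minus)", "(question)", "(blue star)"]


def summarize(item_states):
    """Count list items per state symbol and return (per-state counts, total)."""
    counts = Counter(item_states)
    row = [counts.get(state, 0) for state in STATES]
    return row, sum(row)


def remaining_weeks(sync_day, feature_freeze):
    """Whole weeks left between a sync date and the feature freeze (never negative)."""
    return max((feature_freeze - sync_day).days // 7, 0)


# Hypothetical sample: one state symbol per list item.
sample_states = ["(tick)", "(tick)", "(green star)", "(minus)"]
row, total = summarize(sample_states)
print(row, total)

# Assumed feature freeze date: 31 January 2023.
print(remaining_weeks(date(2022, 11, 1), date(2023, 1, 31)))
```

Counting per list item rather than per ticket means an entry that groups several Jira issues still contributes a single unit to its state column.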






List

Feel free to add categories.

Runtime
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified

FLINK-29801
FLIP in voting
(minus)

Shuffle

FLINK-29766
Xintong Song
(big grin) 100% 10-01-2023 10-01-2023 14-02-2023

FLINK-30938
AdaptiveBatchScheduler should support early consumption for dynamic graph.
Weijie Guo Xintong Song
(minus)

FLINK-30469
Yuxin Tan Xintong Song
(tick) 100% 17-01-2023 17-01-2023 10-02-2023 self-test

Further improvement of production availability of hybrid shuffle
(big grin) 100% 17-01-2023 17-01-2023
self-test
Deployment & Cluster Coordination
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified
REST API

FLINK-27060
(big grin) 100% 20-02-2023 15-11-2022 15-11-2022 self-test

FLINK-30583
The first stage is finished; the second stage is an improvement that will be finished in the next version.
(tick) 100% 09-01-2023 05-01-2023 09-01-2023

Reactive Mode
n/a

Scheduler

FLINK-29663
(tick) 100% 17-01-2022

FLINK-30725
Biao Liu
(tick) 100% 31-01-2022

FLINK-31005
FLINK-30682
(tick) 100% 31-01-2023

Misc
n/a

OLAP
n/a
State backend
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified

FLIP-263: Improve resolving schema compatibility
FLINK-29844
Hangxiang Yu
(minus) 80% 31-1-2023

Improve File Management in State Backend
(minus) 5% 31-1-2023

Improve the serializer performance of state change of changelog
FLINK-30345
(big grin) 100% 31-1-2023
self-test

Allow to configure Changelog Storage per program
FLINK-26372
(minus) 70% 31-1-2023

Add a metric for back-pressure from the ChangelogStateBackend
FLINK-24402
(minus) 40% 31-1-2023

FRocksDB cannot run on Apple M1
FLINK-24932
(big grin) 100% 30-1-2023
Tested by Martijn Visser

Release FRocksDB 6.20.3-ververica-2.0
FLINK-30836
(big grin) 100% 31-1-2023
Tested by Martijn Visser


Checkpoint
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified

Benchmark Generic Incremental CP + UC + BD
Yuan Mei
(minus)

FLINK-26803
(tick) 100%

Benchmark
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified

Improve benchmark stability
FLINK-29825
Yuan Mei
(minus) 50% 31-1-2023


API
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified

FLINK-29740
Deprecated
(big grin) 100% 08-11-2022 08-11-2022 08-11-2022 Self-tested

FLINK-28641
Removed
(minus) 40% 31-01-2023

FLINK-24456
Dawid Wysakowicz
(green star) 90% 31-01-2022

FLINK-18647
FLIP Discussion
Need to be postponed to the next release
(minus)
10-01-2023

FLINK-29668
Martijn Visser
Removed
(big grin) 100% 14-11-2022 14-11-2022 14-11-2022 Self-tested

FLINK-30050
(big grin) 90% 31-01-2022
Self-tested

FLINK-25509
Hang Ruan
Coding






SQL
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified
General

FLINK-29942
Godfrey He, Yunhong Zheng
(tick) 100% 2023-01-31

FLINK-28016
(minus)
31-01-2023

FLINK-29281
Chesnay Schepler
(minus)
31-01-2023

Table API

FLINK-30025
(minus)
2023-01-20

Calcite Update

FLINK-20873
(tick)

FLINK-21239
(tick)

FLINK-29932
Sergey Nuyanzin
(tick)
31-01-2023


Flink Dialect

ALTER TABLE API

FLINK-21634
(big grin) 100% 2023-02-14
self-tested

FLINK-30648
(big grin) 100% 2023-02-14
self-tested

Hive Dialect

FLINK-29635
(tick)

FLINK-30951
FLINK-29717
(tick) 100% 2023-01-31

FLINK-26603
(minus)
31-01-2022


SQL Gateway

FLINK-29941
Shengkai Fang, Zelin Yu
writing doc
(green star) 100% 2023-01-10

FLINK-30936

QE

FLINK-30650
(big grin) 100% 2023-01-10

FLINK-31025

QO

FLINK-27591 - Improve the plan for batch queries when statistics is unavailable OPEN
Godfrey He, Yunhong Zheng
Need to be postponed to the next release
(minus) 40% 2023-1-31


API/Python
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified

FLINK-29155
HuangXingbo
(tick) 100% 10-01-2023
self-test

FLINK-29833
HuangXingbo
(minus) 0% 10-01-2023

FLINK-28957
HuangXingbo
(tick) 100% 19-01-2023
self-test

FLINK-29421
HuangXingbo
(tick) 100% 19-01-2023
self-test

FLINK-21223
HuangXingbo
(tick) 100% 10-01-2023
self-test
Machine Learning
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified
n/a
CEP
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified
n/a
Web
Name/JIRA Issue | Responsible Contributor | Reviewer / committer available | Feature Stage | Note | State | % | Updated | Implemented | Documented | X-team verified

FLINK-29995
(tick) 100%



Status / Follow-ups

2022-11-01

Agenda

  • Kickoff
  • Keeping the state of features updated (ideally before the sync)
  • Blockers
  • Build stability
  • Are there any (new) contributors who need a PR reviewed or merged? And if yes, who can help out?

2022-11-15

Agenda

  • Kickoff
  • Keeping the state of features updated (ideally before the sync)
  • Blockers
  • Build stability
    • Growing number of test stability issues with "Exit code 137" errors
  • Are there any (new) contributors who need a PR reviewed or merged? And if yes, who can help out?

2022-11-29

  • Build instabilities
  • Other topics:
    • Externalizing Pulsar connector (test instabilities): Martijn Visser is working on externalizing the code base
    • Performance test monitoring: Discussion is happening on the mailing list
    • Externalizing connectors in general is work-in-progress but looks good
    • Public CI documentation can be improved

2022-12-13

  • Build instabilities
    • FLINK-29405 → Qingsheng to have a look at the PR
    • Pulsar connector has been synced to the external connector repository. Martijn Visser to open a PR to remove the connector from master
    • FLINK-18356 → Qingsheng to ping Godfrey
    • FLINK-18356 → The PR is still failing with the same issue that it is supposed to fix. We should ping the author to have a look first
    • FLINK-27916 → Martijn to ping Thomas once more
    • FLINK-26974 → Xingbo is working on this
  • How to have monitoring and quality control for the externalized connectors → Needs a discussion on the dev mailing list. Martijn Visser to make a proposal and open a discussion thread on this topic.

2022-12-27

  • Meeting skipped due to the Christmas holidays and sick leave
  • Discussion started on moving the feature freeze from Jan 17 to Jan 31 due to the pandemic situation in China (see dev ML discussion thread)
  • Pulsar connector has been externalized. Pulsar-related unstable tests were disabled in release-1.16 and release-1.15 (see FLINK-30351 and its parent task)

2023-01-10

  • Build instabilities (all 1.17 test instabilities with a priority >=Major)
  • Priorities of test instabilities (docs about it)
    • Test instabilities are prioritized as Critical and become blocker as soon as we notice that they are newly introduced
  • Feature freeze extended until Jan 31, 2023
  • Switching to weekly calls as we're getting closer to the feature freeze?
    • Yes, the switch to weekly calls will happen. The next meeting will be on Jan 17 due to Chinese New Year coming up and the feature freeze happening soon
  • Votes on FLIPs are stalled due to outstanding votes
  • Leader election lacks test coverage (FLIP-285, FLINK-26522)
    • No issues popped up on the mailing list since 1.16

2023-01-17

2023-01-24


2023-01-31

  • Today is feature freeze day
  • 26 features / improvements are in for Flink 1.17 (47 for Flink 1.16, 20 for Flink 1.15, 27 for Flink 1.14)
    • 5 features are still listed as expected to be completed but are not yet in; 4 of them have been merged and their documentation is being written. Martijn Visser to check and update the status of these items.
  • Blockers:
      • FLINK-30881 → Matthias will look into this
      • FLINK-29427 → Review by Leonard passed; waiting for CI to turn green
      • FLINK-29405 → Fixed by Qingsheng
      • FLINK-30625 → Should be resolved, pending validation by the benchmarks (related to FLINK-30624). Benchmarks look to be improved, downgraded to Critical.
      • FLINK-30624 → Should be resolved, pending validation by the benchmarks (related to FLINK-30625). Benchmarks look to be improved, downgraded to Critical.
      • FLINK-30623 → Martijn to reach out to Dong, Rui Fan, Piotr. We're planning to give them until Friday the 3rd of February to come to a conclusion on this ticket; if no consensus is achieved, then the original commit that introduced the regression should be reverted.
      • FLINK-30826 → Matthias to check if this has already been resolved via another ticket (multiple related tickets)
      • FLINK-30727 → [Critical] The PR to update buffers for the test has been merged; we will continue to monitor it. Test downgraded to Critical.
      • FLINK-30846 → [Critical] Downgraded to Critical as it's only a test-related issue and doesn't indicate a bug in production.
      • FLINK-30844 → [Major] Test downgraded to Major. If the test fails again, the contributor will increase the waiting interval for this test as a solution.
      • FLINK-30870 → [Major] Leonard downgraded the issue priority to Major as it's a known Slack plugin issue
  • Martijn Visser to communicate to the Flink community this evening that the feature freeze has started and we plan to cut the release branch at the end of this week (Friday 3rd of February). 

2023-02-07

  • master is stable enough to cut the release-1.17 branch
    • FLINK-30921 - The Azure apt mirror instabilities seem to have been resolved for now. 
    • FLINK-30908 - The issue turned out to be a problem that existed in previous releases
  • Release branch is going to be cut today by Leonard Xu 
  • Release testing will be announced: 2 weeks will be planned for this

2023-02-14

  • Status update on release testing efforts (FLINK-30926)
    • End of cross-team testing date is 21st of February 2023. We will monitor the status throughout the week and hopefully conclude everything next week. 
    • When a feature is cross-team tested, the icon needs to be changed from (tick) to (big grin) to indicate that the testing has been completed
    • We always look for volunteers: picking up a cross-team testing task is much appreciated. 
  • Proposal: Create Jira issues for release management tasks to document what was done to improve review-ability (alternatively, add expected output to release documentation)
    • Do this from now on for the next steps in release management (create release candidate etc.)
  • Test instabilities:

2023-02-21

  • Open Blocker issues
  • Example Release Jira issues: FLINK-31146 > FLINK-31154
  • Release testing is not finished yet
  • Decision on RC creation moved to next week's Flink release sync call
  • Other issues:
    • FLINK-31133: Issue in 1.15.3 unexplained
    • FLINK-31092: 1.17 issue in Hive
    • FLINK-30733: no updates on the Slack bot instability so far
    • FLINK-31134: OOM issue - waiting for Gabor's response
    • FLINK-31145 (Kafka infrastructure umbrella ticket)

2023-02-28

2023-03-07

2023-03-14

  • Flink RC2 for 1.17 is available, needs checkers (smile)

2023-03-21

2023-03-23

  • Flink 1.17.0 is officially released today!

Retrospective:

From Qingsheng Ren 

  • As discussed on the mailing list, we need to trigger a final patch version for 1.15 after releasing 1.17. Some cleanup steps need to be reviewed and changed, such as removing 1.15 data from svn, CI, flink-docker etc. See FLINK-31570
  • I like the idea proposed by Matthias Pohl that we track TODOs for releasing on JIRA 👍 I used it as a checklist to make sure we don't miss anything. It also helps with collaboration, as we can easily divide work across RMs by assigning JIRA tickets.

From Matthias Pohl 

  • Google Meet might not be the best choice for the release sync. We need to be able to invite attendees even if the creator of the meeting isn't available (maybe try Zoom or even Jitsi as an OpenSource alternative instead?)

  • Release sync every 2 weeks and a switch to weekly after feature freeze felt reasonable
  • Slack worked well as a collaboration tool to document the monitoring tasks (#builds, #flink-dev-benchmarks) in a team with multiple release managers

  • The Slack Azure Pipeline bot seems to be buggy. It swallows some build failures. It's not a severe issue, though. We created #builds-debug to monitor whether it's happening consistently. The issue is covered in FLINK-30733

  • We experienced occasional issues in the manual steps of the release creation in the past (e.g. japicmp config was not properly pushed). Creating Jira issues for the release helped to make the release creation more transparent and made the steps more reviewable. Additionally, it helped to distribute subtasks to different people with Jira being the tool for documentation and synchronization. That's especially helpful when there is more than one person in charge of creating the release.

  • Committers occasionally performed backports/merges without PRs during the 1.17 release, which broke master/release branches (probably, changes that were not part of the PR were made locally before merging to speed up the backport). It might make sense to remind everyone that this should be avoided. Not sure whether we want to/can restrict that.

  • We observed a good response on fixing test instabilities by the end of the release cycle, but had some long-running issues earlier in the cycle which caused extra effort for the release managers due to recurring test failures.

  • Release testing picked up “slowly”: Initially, we planned 2 weeks for release testing. But there was not really any progress (tickets being created and worked on) in the first week. In the end, we had to extend the phase by another week resulting in 3 instead of 2 weeks of release testing. I guess we could encourage the community to create release testing tasks earlier and label them properly to be able to monitor the effort. That would even enable us to do release testing for a certain feature after the feature is done and not necessarily only at the end of the release cycle.

  • Manual test data generation is tedious (FLINK-31593). But this should be fixed in 1.18 with FLINK-27518 being almost done.
  • We started creating documentation for release management. The goal is to collect what tasks are there to help support a Flink release to encourage newcomers to pick up the task.

From Leonard Xu 

  • We can keep RC0 (a non-votable one) in future releases, as an initial version for developers to validate, so that some issues could be found earlier and avoid repeatedly canceling and re-creating RCs. 

From Martijn Visser 

  • We should be more careful with commits without a PR / green CI, which caused some problems at the end of the 1.17 release cycle. It might not be possible to ban this completely, but we could give a reminder to committers.


