This page is work in progress! 

Flink Project Management consists of the following tasks:

  • Managing the feature development for the release
  • Maintenance of CI builds and infrastructure (for master and the two most recently released Flink versions)
  • Jira maintenance

These tasks are not necessarily done exclusively by the release manager; the community should take care of them as well. The release manager's responsibility is to make sure that the relevant code base reaches a certain level of quality and stability during the release.

Additionally, any release-related documentation should be kept up to date (e.g. Release Management and Feature Plan or Creating a Flink Release).

Organization of the Release Cycle

Preparing a Release Cycle

  • Setting up a release page (see 1.17 Release as a template)
  • Announcing the plan for the release cycle (e.g. feature freeze date)

Regular Sync

It's a good habit to meet regularly to sync on the developments of the current release cycle (so far, bi-weekly before the feature freeze and weekly from the feature freeze until the release has actually happened). A summary should be kept in the release's wiki article (see Release Management and Feature Plan) and sent to the dev mailing list to keep the community up to date.
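
The cadence described above can be sketched in Python. This is only an illustration: the function name sync_dates, its parameters, and the example dates are hypothetical, and only the bi-weekly/weekly intervals are taken from the text.

```python
from datetime import date, timedelta


def sync_dates(first_sync: date, feature_freeze: date, release: date) -> list:
    """Return sync dates: every 14 days before the feature freeze,
    every 7 days from the feature freeze until the release."""
    dates = []
    current = first_sync
    while current <= release:
        dates.append(current)
        # Bi-weekly before the feature freeze, weekly from the freeze onwards.
        step = 14 if current < feature_freeze else 7
        current += timedelta(days=step)
    return dates


# Hypothetical dates, chosen only to show the change in cadence.
for d in sync_dates(date(2023, 1, 2), date(2023, 1, 30), date(2023, 2, 20)):
    print(d.isoformat())
```

In practice the dates would be adjusted by hand (holidays, a moved feature freeze); the sketch only shows how the interval changes once the freeze is reached.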

Inviting the release managers of the previous release to one of the early release syncs to discuss learnings from previous efforts has also worked well in the past.

Feature Freeze

The feature freeze is a set date until which features can be added to master. After the feature freeze, no additional features may be merged into master; only bugfixes and documentation changes are allowed. The goal is to stabilize master before cutting the release branch. The feature freeze date is communicated at the beginning of a release cycle. It's not uncommon for the date to change during the release cycle if there are valid reasons to do so. Such a decision needs to be discussed on the dev mailing list (see 1.17 feature freeze extension discussion).

The time between announcing the feature freeze and cutting the release branch should be as short as possible since it's blocking work that should go into future releases.

Release Testing

Release testing happens after the release branch is cut and CI is stable enough. The goal is to manually test the features that ended up in this build. Additionally, the documentation for these features should be available. Any blocking issues that come up during release testing need to be addressed before going forward with the release.

Release Metrics 

  • Count contributors: the following git command can be used to list the contributors for a given commit range of the current branch:
    • git shortlog --summary startCommitId..endCommitId | awk -F ' ' '{$1=""; print $0}' | sort -n | awk 'BEGIN{ORS=", "}{print $0}'
  • Count resolved issues: the following Jira filter can be used to count the issues resolved in this version:
    • project = flink AND status in (closed, resolved, Fixed, Completed, Done) AND fixVersion in (1.17.0)
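
The two metrics above can also be prepared with a small Python helper. This is a minimal sketch: the function names and the sample author names are hypothetical, and it assumes the usual git shortlog --summary line layout of a commit count, a tab, and the author name.

```python
def parse_shortlog(shortlog_output: str) -> list:
    """Extract unique contributor names from `git shortlog --summary` output.

    Each line is assumed to look like "<commit count><tab><author name>".
    """
    names = set()
    for line in shortlog_output.splitlines():
        parts = line.strip().split(None, 1)  # separate the count from the name
        if len(parts) == 2 and parts[0].isdigit():
            names.add(parts[1].strip())
    return sorted(names)


def resolved_issues_jql(fix_version: str) -> str:
    """Build the Jira filter quoted in the text for a given fixVersion."""
    return (
        "project = flink AND status in (closed, resolved, Fixed, Completed, Done) "
        f"AND fixVersion in ({fix_version})"
    )


# Hypothetical sample output; real input would come from the git command above.
sample = "    12\tAlice Example\n     3\tBob Example\n"
contributors = parse_shortlog(sample)
print(len(contributors), ", ".join(contributors))
print(resolved_issues_jql("1.17.0"))
```

Unlike the shell pipeline, the helper also yields the number of contributors directly, and the filter string can be pasted into Jira's advanced search.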

Maintenance of CI builds and infrastructure

The release manager should ensure the CI stability of master and the two most recently published Flink versions. Builds can be monitored on the AzureCI Flink build overview (see Testing Infrastructure for further details on the build process).

Monitoring CI failures in master and the release branches

Failed builds are reported to Apache Flink's #builds Slack channel. Build failures should be investigated and documented in this Slack channel (i.e. by linking the corresponding ticket in the Slack thread and marking the thread with a "check" emoji once the investigation of this build is done). Documenting the state in the Slack channel allows us to work concurrently on CI failures (i.e. a missing check mark for a build failure means that the build has not been fully investigated yet and should be picked up).
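
The triage rule above (no check mark means the failure still needs investigation) can be illustrated with a short Python sketch. The BuildReport class, its fields, and the example URLs are hypothetical stand-ins for Slack threads; no Slack API is used here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BuildReport:
    """Hypothetical stand-in for one build-failure thread in #builds."""
    build_url: str
    ticket: Optional[str] = None   # Jira ticket linked in the thread, if any
    check_marked: bool = False     # True once the investigation is done


def needs_investigation(reports):
    """Return the reports that have no check mark yet and should be picked up."""
    return [r for r in reports if not r.check_marked]


reports = [
    BuildReport("https://example.invalid/build/1", ticket="FLINK-123", check_marked=True),
    BuildReport("https://example.invalid/build/2"),
]
for report in needs_investigation(reports):
    print("still open:", report.build_url)
```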

There is an issue with clicking the Azure Pipelines links that are reported in the Slack channel. You need to install a redirect routine in your browser to make them work. The instructions can be found in the #builds channel's canvas.

Other tasks:

  • Monitoring the remote branches. Sometimes, remote branches are accidentally created in the Apache Flink repo. Branches should generally be created in forks. We might want to reach out to contributors and ask them to delete these accidentally created remote branches. The following branches shouldn't be touched:
    • master & release-* - Flink versioning branches
    • blink - Branch holding the legacy Blink code. This one is kept for historical purposes.
    • experiment_gha_docs & exp_github_actions - These branches are kept as part of the GitHub Actions migration efforts (see Chesnay Schepler's comment in the related ML post).
    • dependabot/* - These branches are temporarily created by dependabot for version bumps (related ML announcement).
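
The list of untouchable branches above can be turned into a simple filter that flags remote branches worth a closer look. This is a minimal sketch: the function name and the sample branch list are hypothetical, and the patterns are copied from the list above.

```python
from fnmatch import fnmatchcase

# Branches that should not be touched, as listed above.
PROTECTED_PATTERNS = [
    "master",
    "release-*",
    "blink",
    "experiment_gha_docs",
    "exp_github_actions",
    "dependabot/*",
]


def cleanup_candidates(remote_branches):
    """Return branches that match none of the protected patterns."""
    return [
        branch
        for branch in remote_branches
        if not any(fnmatchcase(branch, pattern) for pattern in PROTECTED_PATTERNS)
    ]


# Hypothetical branch names; real input could come from `git branch -r`.
branches = ["master", "release-1.17", "blink", "dependabot/maven/foo", "my-feature"]
print(cleanup_candidates(branches))
```

The result is only a list of candidates; whether a branch was really created by accident still needs to be checked with the contributor before it is deleted.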

Relevant Repositories, Workflows and other artifacts

Performance Regression Tests

Performance regression tests are used to detect changes that reduce the performance of Flink. There is more documentation on this topic in Codespeed / Benchmarks. Regressions are reported in Apache Flink's #flink-dev-benchmarks Slack channel.

Jira maintenance

  • Build failures should be reported in the corresponding Jira issue (or a Jira issue should be created if none exists yet). Contributors should be pinged to fix instabilities as soon as possible to ensure a stable infrastructure over the course of the release cycle. More details on Jira issues can be found on the Flink Jira Process wiki page.
  • Newly created Jira issues should follow the Flink Jira Process guide (e.g. set fixVersion, affectedVersion and component, and add the label test-stability)
  • Important information to improve Jira issue search:
    • Name of the test that failed
    • Link to the test failure (ideally with the relevant log line; both Azure Pipelines and GitHub Actions support log line-specific links)
    • Log snippet that identified the test failure (e.g. assertion error or stacktrace)
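
The three pieces of information listed above can be assembled into a consistent issue description, which makes later searches easier. This is a minimal sketch: the function name, the plain-text layout, and the sample values are hypothetical and not an official template.

```python
def format_failure_report(test_name: str, failure_url: str, log_snippet: str) -> str:
    """Combine test name, failure link and log snippet into one description."""
    return "\n".join(
        [
            f"Test: {test_name}",
            f"Build: {failure_url}",
            "Log:",
            log_snippet.strip(),
        ]
    )


# Hypothetical values for illustration only.
print(
    format_failure_report(
        "SomeITCase.testSomething",
        "https://example.invalid/build/2#L100",
        "java.lang.AssertionError: expected:<1> but was:<2>",
    )
)
```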

Hints around AzureCI/Jira usage

  • Console log output can be linked per log line:
    • GitHub Actions: Click the line number on the left side of the console view to generate the line-specific URL.
    • Azure Pipelines: The link button will appear at the end (i.e. the right side) of the log line when hovering over the log line.
  • There are several URLs with a placeholder (i.e. %s) that might be handy when accessing Jira through your browser using Firefox's bookmark keywords or Chrome's search engine feature:
    • Jira issue lookup by ID (e.g. "<keyword> 123" would lead to FLINK-123):
    • Search for open or closed Jira issues with a substring (this is handy to find test stability issues):
    • Same as above, but only for open issues:
    • Look for most-recently updated test-stability Jira issues by date (number reflects the date range since today):