The Flink community wants to use Azure Pipelines for testing Flink pull requests and changes.

This page first lists everything we know about the current setup, then describes our plans for moving forward.

Once we use Azure Pipelines as our primary testing system, this page will be revised.

Current Setup (December 2019)

Overview of Flink Tests

Testing workloads:

  • "mvn clean verify" – all unit and integration tests, including Java-based end to end tests (3.5 hours)
    • variations
      • hadoop=2.8.3 scala=2.11
      • hadoop=2.4.1
      • scala=2.12
      • hadoop=2.8.3 scala=2.11 jdk=11
  • flink-end-to-end-tests/run-nightly-tests.sh – all bash-based end to end tests (1 hour)
    • variations
      • hadoop=2.8
      • scala=2.12
      • hadoop=none
      • jdk=11
  • flink-end-to-end-tests/run-pre-commit-tests.sh – some lightweight bash-based end to end tests (10 minutes)
  • flink-python/dev/lint-python.sh – python tests
  • tools/travis/docs.sh – documentation links check

Current test-execution approach on Travis CI

  • on each pull request or push to master/release-*
    • (variation: hadoop=2.8.3 scala=2.11)
    • 1. compile-stage
    • 2. test-stage
      • core
      • python
      • libraries
      • blink_planner
      • connectors
      • kafka/gelly
      • tests
      • legacy_scheduler_core
      • legacy_scheduler_tests
      • misc
  • daily cron: master
    • "mvn clean verify" in all 4 variations, split into the same jobs as regular pull request builds
    • "run-nightly-tests.sh" in all 4 variations, split into 7 different jobs
  • weekly cron: release-*
    → same as daily cron.

Azure Pipelines

Flink Testing Infrastructure on Azure Pipelines

Notes on our Azure Pipelines setup:

  • Flink's Azure Pipelines account is located here: https://dev.azure.com/rmetzger/Flink/_build
  • For the Flink repository, the tests do not run directly on the machine, but in a Docker container that provides the correct Maven version (3.2.5) and the dependencies required for the openssl tests. This allows us to maintain a consistent build environment across different build machines.
  • Azure Pipeline builds are controlled via a configuration file, called azure-pipelines.yml and located in the Flink source code.
  • Secrets are defined in the AZP web UI.
  • Cron jobs are defined in the azure-pipelines.yml file.

Available Custom Build Machines

| # machines | Name | Specs | Notes | Provider |
| 4 | CommunityCITest0** | 32 cores, 64 GB memory, x86-64, 100 Mbit connection | Each machine runs two agents, for better resource utilization. More machines available if needed. | Alibaba Cloud |
| 2 | flink-arm-agent-**** | 16 cores, 32 GB memory, aarch64 | Running an outdated build agent, because aarch64 support is not officially released yet. Not tested yet. | https://openlabtesting.org/ |

Please reach out to the dev@ mailing list if you want to offer the Flink community more build capacity.

When to use which machines:


| | Flink Repository | Flink Pull requests | 3rd party repositories (Contributors) |
| Build Environment | Custom Machines | Custom Machines | Azure Pipelines free tier for open source |


Integration with GitHub / Flink's repositories

Apache does not allow the use of the GitHub integration of Azure Pipelines, because it needs write access to the source code (INFRA-17030). To work around this limitation, we will do the following:

  • all branches of "apache/flink" are mirrored to the "flink-ci/flink" repository using a custom script, running every few minutes. Azure Pipelines "listens" on pushes to "flink-ci/flink".
  • ci-bot mirrors all pull requests against "apache/flink" into a branch, where AZP picks up the builds.
  • ci-bot picks up the build status from GitHub for each pull request branch and reports it back to the user via "flinkbot's" build comment.
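The custom mirroring script itself is not shown on this page. As a rough illustration only, a branch mirror of this kind could look like the sketch below. The function name mirror_branches and both URL arguments are hypothetical placeholders, and the real script may work differently.

```shell
#!/usr/bin/env bash
# Sketch: mirror all branches of a source repository into a CI repository.
# Both URLs are placeholders supplied by the caller; this is not Flink's actual script.
set -euo pipefail

mirror_branches() {
  local source_url="$1"   # e.g. the "apache/flink" repository
  local target_url="$2"   # e.g. the "flink-ci/flink" repository
  local workdir
  workdir="$(mktemp -d)"

  # A bare clone maps the source's branches directly to refs/heads/*.
  git clone --quiet --bare "$source_url" "$workdir/mirror.git"

  # Push every branch. --force mirrors rewritten branches,
  # --prune removes target branches that no longer exist in the source.
  git -C "$workdir/mirror.git" push --quiet --force --prune \
    "$target_url" 'refs/heads/*:refs/heads/*'

  rm -rf "$workdir"
}
```

A scheduler (for example a cron entry firing every few minutes) could then call mirror_branches with the source and target URLs.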

AZP Setup: Version 1 

The goal of version 1 is to migrate the existing setup from Travis to Azure Pipelines, to give the community more build capacity and stability.

Current work in progress code: https://github.com/rmetzger/flink/tree/azure_playground

Current work in progress builds: https://dev.azure.com/rmetzger/Flink/_build?definitionId=1&_a=summary

AZP Setup: Version 2

We should revisit the V1 build setup, given the better availability of build resources, and fewer constraints for free builds on AZP.

Idea 1: Trigger nightly e2e, jdk11, scala212, ... tests through "flinkbot", reporting the status back in the PR

Benefit: Pull requests with the potential of breaking e2e tests etc. can be executed on demand

Idea 2: Consider running all tests in one job (instead of splitting them into a compile stage plus 10 second-stage jobs).

This is possible again now, because free builds on AZP give us a timeout of 6 hours per job. Tests would currently finish after 3.5 hours.

We could also consider running different configurations (jdk11, scala212, ...) in one job each, as well as the e2e tests.

Other big data processing systems also seem to be willing to accept longer build times. A Kafka build is 2 hours, a Spark build 4:40h.

The benefit here is that it would drastically reduce the complexity of our build system.
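For reference, the headroom of a single all-in-one job can be worked out from the two figures quoted for Idea 2 (a 6 hour timeout and a 3.5 hour test run). The variable names are only illustrative.

```shell
#!/usr/bin/env bash
# Headroom of one all-in-one job, using the figures quoted on this page.
timeout_min=$((6 * 60))   # 6 hour job timeout for free builds on AZP
tests_min=210             # 3.5 hours for "mvn clean verify"
headroom_min=$((timeout_min - tests_min))
echo "headroom: ${headroom_min} minutes"   # prints "headroom: 150 minutes"
```

This leaves 2.5 hours of slack before a single job would hit the timeout.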

Idea 3: Parallelize our build automatically

The current approach of manually splitting the builds leads to a fairly complicated and hard-to-maintain set of scripts. If we want to have fast builds, and low complexity, we need to look into ways of automatically splitting our builds.
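One possible direction, sketched here purely as an illustration, is to derive the split from measured test durations instead of maintaining it by hand. The function below applies a simple greedy "longest first" heuristic. The function name split_into_jobs and the input format (one "<group> <minutes>" pair per line) are assumptions, not part of the existing build scripts.

```shell
#!/usr/bin/env bash
# Sketch: distribute test groups over N jobs so that job durations stay balanced.
# Reads lines of "<group> <minutes>" from stdin (hypothetical input format).
split_into_jobs() {
  local jobs="$1"
  # Longest groups first, then always add to the currently least-loaded job.
  sort -k2,2nr | awk -v n="$jobs" '
    {
      best = 1
      for (j = 2; j <= n; j++) if (load[j] < load[best]) best = j
      load[best] += $2
      members[best] = members[best] " " $1
    }
    END {
      for (j = 1; j <= n; j++)
        printf "job %d (%d min):%s\n", j, load[j], members[j]
    }'
}
```

It could be fed with lines such as "core 40" or "connectors 20" (durations made up for illustration), for example via printf 'core 40\nconnectors 20\n' | split_into_jobs 2. Real durations would have to come from build metrics.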

AZP Setup: General Considerations

Things we should consider moving forward:

  • Setting up some metrics for Flink / Flink's build system
    • machine utilization
    • test durations
  • Setting up a central log collection service, where we dump all files, indexed by a search engine. This is good for finding logs for very rare errors, or for statistics on the frequency of certain errors.
  • Check out related notes: WIP: Notes on Build Time (in particular about moving away from Maven to Gradle)
  • Running the end to end tests on Windows


(deprecated) Tutorial: Setting up Azure Pipelines for a fork of the Flink repository

Deprecation Note

You can utilize Flink's GitHub Actions workflows on your fork to run CI. No need to use the flink-mirror repo anymore.


This tutorial assumes that you want to authenticate with Microsoft through your GitHub account. You can also use an existing Microsoft account to open an Azure Pipelines account.

  1. Go to https://azure.microsoft.com/en-us/services/devops/pipelines/, click "Start Pipelines Free with GitHub", authenticate with your GitHub Account
  2. Create Microsoft Account / Connect with existing one. Go through the legal stuff
  3. For the new organization, go to Settings / Policies and switch on the "Allow public projects" option
  4. In Azure Pipelines, create a new organization and add a public project.
  5. On the left hand side, click "Pipelines", then "New pipeline".
  6. Select "GitHub" as the code location & authorize it
  7. Select your fork of the "flink" repository & authorize it
  8. The setup tool should detect an existing azure-pipelines.yml file (assuming that file exists on your master branch; if not, update your master branch). Press "Run" in the top right corner to run it for the first time.

Running end to end tests

Our AZP setup supports running the end to end tests on the CI system, through a build time variable (MODE=e2e).

To set this variable, you need to trigger a build yourself.

  1. Click the "Run pipeline" button on the top right
  2. Select your branch / commit, and click on "Variables".
  3. Click "Add Variable" at the bottom, and enter "MODE" as the name and "e2e" as the value. Click "Create" to set the variable, then go back to the "Run pipeline" screen and trigger the build.
  4. You should now see a build where only the "e2e_ci_build" is running
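How the MODE variable is evaluated inside azure-pipelines.yml is not shown on this page. As a rough mental model only, the selection behaves like the following shell sketch. Only MODE=e2e and the job name "e2e_ci_build" come from this page; the function name jobs_for_mode and the default job name "regular_ci_build" are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: choose which job to run based on a MODE build variable.
# "regular_ci_build" is a made-up name standing in for the default jobs.
jobs_for_mode() {
  local mode="${1:-}"
  if [ "$mode" = "e2e" ]; then
    echo "e2e_ci_build"
  else
    echo "regular_ci_build"
  fi
}

# Use the MODE environment variable if it is set, otherwise the default.
jobs_for_mode "${MODE:-}"
```

With MODE unset, the sketch selects the default job; with MODE=e2e, only the end to end job is selected.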


Azure Pipeline Usage Restrictions

Azure Pipelines has restricted usage of free plan accounts to a case-by-case basis, making the setup of Flink's CI on a personal fork difficult. While waiting for Azure to approve your personal, free account, there's a workaround one can use without creating noise on the main apache/flink GitHub repository. Please use this approach responsibly (don't submit too many builds), and only while waiting for Azure to approve your account. Note that they won't reply to your email. Your personal CI will just start working.

To do so, you'll need to fork the flink-ci/flink-mirror repository to your personal GitHub account.

Then, you'll need to add the mirror as a git remote in your local machine's Flink project directory.

cd /path/to/your/flink
git remote add flink-mirror https://github.com/<your-user-name>/flink-mirror.git  


To run your changes on Flink's Azure CI:

  • push your branch to your flink-mirror fork
  • open a draft PR in the flink-ci/flink-mirror GitHub repository
git push -u flink-mirror your-branch


When you're satisfied with your changes, you can:

  • close the draft PR in the flink-ci/flink-mirror repository
  • push your branch to your apache/flink fork
  • open a new PR against the apache/flink GitHub repository
git push -u origin your-branch


