The Flink community wants to use Azure Pipelines for testing Flink pull requests and changes.
This page first lists what we know about the current setup, then describes our plans moving forward.
Once we use Azure Pipelines as our primary testing system, this page will be revised.
Testing workloads:
Notes on our Azure Pipelines setup:
The build definition is azure-pipelines.yml, located in the Flink source code.

#machines | Name | Specs | Notes | Provider |
---|---|---|---|---|
4 | CommunityCITest0** | 32 cores, 64 GB memory, x86-64, 100 Mbit connection | Each machine runs two agents, for better resource utilization. More machines available if needed. | Alibaba Cloud |
2 | flink-arm-agent-**** | 16 cores, 32 GB memory, aarch64 | Running an outdated build agent, because aarch64 support is not tested yet. | https://openlabtesting.org/ |
Please reach out to the dev@ mailing list if you want to offer the Flink community more build capacity.
When to use which machines:
 | Flink Repository | Flink Pull requests | 3rd party repositories (Contributors) |
---|---|---|---|
Build Environment | Custom Machines | Custom Machines | Azure Pipelines free tier for open source |
Apache does not allow the use of the GitHub integration of Azure Pipelines, because it needs write access to the source code. To work around this limitation, we will do the following:
The goal of version 1 is to migrate the existing setup from Travis to Azure Pipelines, to give the community more build capacity and stability.
Current work in progress code: https://github.com/rmetzger/flink/tree/azure_playground
Current work in progress builds: https://dev.azure.com/rmetzger/Flink/_build?definitionId=1&_a=summary
We should revisit the V1 build setup, given the better availability of build resources, and fewer constraints for free builds on AZP.
Benefit: Pull requests with the potential of breaking e2e tests etc. can be executed on demand
This is possible again now, because free builds on AZP have a timeout of 6 hours per job. The tests would currently finish after 3.5 hours.
We could also consider running different configurations (jdk11, scala212, ...) in one job each, as well as the e2e tests.
Other big data processing systems also seem willing to accept longer build times. A Kafka build takes 2 hours, a Spark build 4:40 hours.
The benefit here is that it would drastically reduce the complexity of our build system.
The current approach of manually splitting the builds leads to a fairly complicated and hard-to-maintain set of scripts. If we want to have fast builds, and low complexity, we need to look into ways of automatically splitting our builds.
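One possible direction for automatic splitting, shown here as a minimal sketch and not as a description of Flink's build scripts: assign test modules to a fixed number of jobs greedily by recorded duration, taking the longest module first and placing it in the currently least-loaded job. The helper name `split_modules`, the module names, and the durations are all hypothetical.

```shell
#!/bin/sh
# split_modules N: read "<module> <minutes>" lines on stdin and print
# "<job> <module>" lines, balancing total minutes across N jobs.
split_modules() {
  # Longest modules first, then always fill the least-loaded job.
  sort -k2,2nr | awk -v n="$1" '
    {
      best = 1
      for (i = 2; i <= n; i++)
        if (load[i] + 0 < load[best] + 0) best = i
      load[best] += $2
      print best, $1
    }'
}

# Illustrative input only; these are not measured Flink durations.
printf '%s\n' 'module-a 30' 'module-b 25' 'module-c 20' 'module-d 10' |
  split_modules 2
```

With this input, module-a and module-d land in job 1 (40 minutes) and module-b and module-c in job 2 (45 minutes). A real setup would need durations from previous builds, which this sketch simply assumes are available.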
Things we should consider moving forward:
You can utilize Flink's GitHub Actions workflows on your fork to run CI. No need to use the flink-mirror repo anymore.
This tutorial assumes that you want to authenticate with Microsoft through your GitHub account. You can also use your existing Microsoft account to open an Azure Pipelines account.
Our AZP setup supports running the end-to-end tests on the CI system, through a build-time variable (MODE=e2e).
To set this variable, you need to trigger a build yourself.
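If you prefer the command line over the web UI, a build with this variable can in principle be queued through the Azure DevOps CLI extension (`az pipelines run` with `--variables`). The sketch below is an assumption-laden illustration: the organization URL, project, and pipeline name are placeholders that do not come from this page, `trigger_e2e` is a hypothetical helper, and it assumes your pipeline allows MODE to be set when a run is queued. By default it only prints the command.

```shell
#!/bin/sh
# trigger_e2e BRANCH: queue a run with MODE=e2e (dry run unless DRY_RUN=0).
# ORG_URL, PROJECT and PIPELINE are placeholders you must set yourself.
trigger_e2e() {
  branch="$1"
  set -- az pipelines run \
    --org "${ORG_URL:-https://dev.azure.com/<your-org>}" \
    --project "${PROJECT:-<your-project>}" \
    --name "${PIPELINE:-<your-pipeline>}" \
    --branch "$branch" \
    --variables MODE=e2e
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"   # print the command instead of running it
  else
    "$@"        # requires the az CLI with the azure-devops extension
  fi
}

# Prints the command that would be run for your-branch.
trigger_e2e your-branch
```

Set DRY_RUN=0 once the printed command looks right for your account.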
Azure Pipelines has restricted usage of free plan accounts to a case-by-case basis, making the setup of Flink's CI on a personal fork difficult. While waiting for Azure to approve your personal, free account, there's a workaround one can use without creating noise on the main apache/flink GitHub repository. Please use this approach responsibly (don't submit too many builds), and only while waiting for Azure to approve your account. Note that they won't reply to your email. Your personal CI will just start working.
To do so, you'll need to fork the flink-ci/flink-mirror repository to your personal GitHub account.
Then, you'll need to add the mirror as a git remote in your local machine's Flink project directory.
    cd /path/to/your/flink
    git remote add flink-mirror https://github.com/<your-user-name>/flink-mirror.git
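If you script this step, an idempotent variant can be convenient, since `git remote add` fails when the remote already exists. The following is a minimal sketch; `add_mirror_remote` is a hypothetical helper, not part of Flink's tooling.

```shell
#!/bin/sh
# add_mirror_remote REPO_DIR GITHUB_USER: add or update the flink-mirror
# remote in REPO_DIR, then print the URL it points to.
add_mirror_remote() {
  repo_dir="$1"
  url="https://github.com/$2/flink-mirror.git"
  if git -C "$repo_dir" remote get-url flink-mirror >/dev/null 2>&1; then
    # Remote already exists: just make sure it points at your fork.
    git -C "$repo_dir" remote set-url flink-mirror "$url"
  else
    git -C "$repo_dir" remote add flink-mirror "$url"
  fi
  git -C "$repo_dir" remote get-url flink-mirror
}

# Usage (substitute your own path and GitHub user name):
#   add_mirror_remote /path/to/your/flink <your-user-name>
```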
To run your changes on Flink's Azure CI:
1. Push your branch to your flink-mirror fork.
2. Open a pull request against the flink-ci/flink-mirror GitHub repository.

    git push -u flink-mirror your-branch
When you're satisfied with your changes, you can:
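1. Close your pull request in the flink-ci/flink-mirror repository.
2. Push your branch to your apache/flink fork.
3. Open a pull request against the apache/flink GitHub repository.

    git push -u origin your-branch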
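Because both pushes use `-u`, the branch's upstream moves from the mirror fork to origin with the second push, so a later plain `git push` goes to origin. The sketch below makes that visible; `push_and_report` is a hypothetical helper used only for illustration.

```shell
#!/bin/sh
# push_and_report REMOTE BRANCH: push with -u, then print the upstream
# that the branch tracks afterwards.
push_and_report() {
  git push -q -u "$1" "$2" >/dev/null &&
    git rev-parse --abbrev-ref "$2@{upstream}"
}

# Usage, run inside your Flink clone:
#   push_and_report flink-mirror your-branch   # prints flink-mirror/your-branch
#   push_and_report origin your-branch         # prints origin/your-branch
```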