The purpose of this page is to evaluate the GitHub Merge Queues feature in the context of Apache projects.
This issue has been raised a few times, and a fair amount of discussion took place on the LEGAL-599 ticket. A number of concerns were raised by the ASF Infrastructure and Legal teams, many of which could not be answered without actually using merge queues and learning more about their behavior.
INFRA-25932 was filed to create an ASF-managed repository with merge queues enabled. The purpose of this repository was to simulate how the feature would work in the context of an Apache project. Although the repository is named "kafka-merge-queue-sandbox", nothing in this experiment is specific to Apache Kafka.
Summary
The merge queue provides a mechanism for committers to serialize pull requests into an orderly queue before merging. A configurable number of Pull Requests are added to a "merge group" as part of the queue functionality. This works by creating a temporary branch off the base branch and merging each PR from that merge group into the temporary branch. Each PR is merged according to the configured merge method (squash, rebase, or fast-forward). If a conflict occurs, that PR is removed from the merge group and its status is updated indicating a conflict. If the PRs merge cleanly into the temporary branch, an optional CI workflow can be run before merging the temporary branch into the base branch.
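The flow described above can be sketched as a small model. This is an illustrative simulation only, not GitHub's implementation: the `PullRequest` and `MergeGroupResult` types, the `process_merge_group` function, and the idea that two PRs conflict when they touch the same file are all assumptions made for the example. It also assumes that nothing reaches the base branch when the optional CI check fails.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class PullRequest:
    number: int
    # Hypothetical conflict model: two PRs conflict if they touch the same file.
    touched_files: FrozenSet[str]


@dataclass
class MergeGroupResult:
    staged: List[int] = field(default_factory=list)   # merged into the temporary branch
    removed: List[int] = field(default_factory=list)  # dropped from the group due to a conflict
    landed: List[int] = field(default_factory=list)   # reached the base branch


def process_merge_group(
    prs: List[PullRequest],
    run_ci: Optional[Callable[[List[int]], bool]] = None,
) -> MergeGroupResult:
    """Merge each PR into a temporary branch, then optionally gate on CI."""
    result = MergeGroupResult()
    temp_branch_files = set()  # files already changed on the temporary branch

    for pr in prs:
        if temp_branch_files & pr.touched_files:
            # Conflict: the PR is removed from the merge group.
            result.removed.append(pr.number)
            continue
        temp_branch_files |= pr.touched_files
        result.staged.append(pr.number)

    # The CI workflow is optional; with no workflow the group lands directly.
    ci_ok = run_ci(result.staged) if run_ci is not None else True
    if ci_ok:
        result.landed = list(result.staged)
    return result


group = [
    PullRequest(1, frozenset({"A.java"})),
    PullRequest(2, frozenset({"A.java"})),  # conflicts with PR 1 in this model
    PullRequest(3, frozenset({"B.java"})),
]
outcome = process_merge_group(group)
print(outcome.landed, outcome.removed)
```

In this model, PR 2 is removed because it touches a file that PR 1 already changed on the temporary branch, while PRs 1 and 3 land together.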
A common use case of the merge queue is to run a brief, final check before merging to the mainline branch. This could be something such as a basic compilation check or a license check. By allowing for a CI workflow before hitting the mainline branch, the merge queue can guard against changes that do not conflict at the Git level but do cause breakage in the code. The ability to group PRs into merge groups allows for amortization of this final CI workflow.
It should be noted that the merge group size can be set to 1 (effectively disabling the grouping behavior) and that it is not necessary to have additional CI as part of the merge queue.
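As a rough illustration of the amortization, the number of final CI runs can be estimated from the group size. The function name `merge_group_ci_runs` is hypothetical, and the estimate assumes that every merge group is filled as far as possible and that no group has to be retried after a failure or conflict.

```python
import math


def merge_group_ci_runs(num_prs: int, group_size: int) -> int:
    """Estimate how many merge-queue CI runs are needed for num_prs PRs.

    Assumes full groups and no retries (a simplification for illustration).
    """
    if num_prs < 0:
        raise ValueError("num_prs must not be negative")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return math.ceil(num_prs / group_size)


# Ten PRs with a group size of 5 need two CI runs under these assumptions;
# with a group size of 1 (grouping effectively disabled) they need ten.
print(merge_group_ci_runs(10, 5))
print(merge_group_ci_runs(10, 1))
```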
The mechanics of the merge queue do not change repo access control, commit history, or other project-specific workflows. It is simply a way to automatically coordinate PR merges in an orderly fashion.
GitHub Setup
The following GitHub settings were used to enable the merge queue:
- Branch protection was added for the default branch ("main" in this case)
- Require status checks to pass before merging
- "validate-patch" added as required status check
- Required Merge Queue enabled
- Merge Method set to "Squash and merge"
- Require all queue entries to pass required checks
Committer Approval
When a non-committer raises a PR, it cannot be added to the merge queue without the explicit action of a committer. This is no different from the traditional "Merge Pull Request" UI element in GitHub. This has nothing to do with workflows or required status checks. GitHub treats adding a PR to the merge queue as an act of writing to the repository; therefore, normal access control rules apply.
Here is an example. A non-committer user opens a PR after creating a fork of the repository and making changes on a branch. From their perspective, there is no option to add the PR to the merge queue.
From a committer's perspective (a user with write access), the "Merge when ready" option is available.
This means a PR cannot be added to the merge queue automatically. It must be explicitly submitted to the merge queue by a committer.
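The access rule can be summarized in a few lines. This is only a model of the behavior observed above, not a call into GitHub: the function name `can_add_to_merge_queue` is hypothetical, and the role names are GitHub's standard repository roles, of which "write" and above are assumed to grant the ability to enqueue a PR.

```python
# GitHub's standard repository roles, from least to most privileged:
# read, triage, write, maintain, admin.
ROLES_WITH_WRITE_ACCESS = {"write", "maintain", "admin"}


def can_add_to_merge_queue(role: str) -> bool:
    """Return True if a user with this repository role may enqueue a PR.

    Models the observation that enqueueing is treated as a write to the
    repository, so the normal access control rules apply.
    """
    return role in ROLES_WITH_WRITE_ACCESS


# A contributor working from a fork typically has only read access.
print(can_add_to_merge_queue("read"))
print(can_add_to_merge_queue("write"))
```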
Commit Provenance
For squash merging, the user who opens a Pull Request will be the author of the resulting commit. For fast-forward and rebase merges, the author of each commit will be preserved when merging with the merge queue. This is exactly the same as when merging with the "Merge pull request" button (squash, rebase, or fast-forward). Even though there is automation involved in running the workflows and performing the actual merge behind the scenes, the result in the Git history is as if a committer performed "git merge" (or similar).
When a PR from a non-committer is merged, it appears in the base branch history with the non-committer as the Author. This is exactly the same as when a committer approves and merges a PR from a non-committer. The record of committer approval is maintained in the Pull Request. For example, this commit resulted from merging a PR with the merge queue:
commit 21fbae381f6a4e7e6c433f4482e058170445753e (HEAD -> main, origin/main)
Author: ShivsundarR <shr@confluent.io>
Date:   Mon Jul 8 20:49:31 2024 +0530

    Dummy commit (#15)
On GitHub the commit appears like:
Clicking on the Pull Request (#15), we see that a committer (mumrah) added the PR to the merge queue.
In the case of a Committer (i.e., one who has an ICLA) merging contributor code via the merge queue, we also need a record of the committer adding a Pull Request to the queue. This is available in the merge_group webhook event. For example, this merge_group workflow action was the result of a committer clicking the "Merge when ready" button on a Pull Request made by a non-committer. https://github.com/apache/kafka-merge-queue-sandbox/actions/runs/10101204790/job/27934251206
Action: checks_requested
Sender: mumrah
On branch gh-readonly-queue/main/pr-18-4f7e4fc2543a8546a9f22345970e96a52b8b2954
Your branch is up to date with 'origin/gh-readonly-queue/main/pr-18-4f7e4fc2543a8546a9f22345970e96a52b8b2954'.

nothing to commit, working tree clean
commit bf67d0ed2b9231f4cc38af54c7405219ff459951
Author: ShivsundarR <shr@confluent.io>
Date:   Fri Jul 26 02:12:17 2024 +0530

    Dummy commit 2 (#18)

    Co-authored-by: David Arthur <mumrah@gmail.com>
The "sender" of the merge_group event is the committer. This data is also available using the GitHub API (the Pull Request object includes a "merged_by" field).
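A workflow step could extract this record with a short script. The following is a sketch under stated assumptions, not the script used in the sandbox repository: the function names are hypothetical, and the `gh-readonly-queue/<base>/pr-<number>-<sha>` pattern is inferred from the branch name in the output above rather than taken from a documented contract. The `action`, `sender.login`, and `merge_group.head_ref` fields of the event, the `user.login` and `merged_by` fields of the Pull Request object, and the `GITHUB_EVENT_PATH` environment variable are standard GitHub names.

```python
import json
import os
import re

# Pattern inferred from the example branch name; treat it as an assumption.
QUEUE_REF = re.compile(
    r"^(?:refs/heads/)?gh-readonly-queue/(?P<base>.+)/pr-(?P<number>\d+)-(?P<sha>[0-9a-f]{40})$"
)


def load_event_from_actions() -> dict:
    """Load the webhook payload that GitHub Actions writes for the current run."""
    with open(os.environ["GITHUB_EVENT_PATH"], encoding="utf-8") as f:
        return json.load(f)


def summarize_merge_group_event(event: dict) -> dict:
    """Record who triggered a merge_group event and for which queue branch."""
    match = QUEUE_REF.match(event["merge_group"]["head_ref"])
    return {
        "action": event["action"],
        "sender": event["sender"]["login"],  # the committer who enqueued the PR
        "base_branch": match.group("base") if match else None,
        "pr_number": int(match.group("number")) if match else None,
    }


def merge_audit_record(pull_request: dict) -> dict:
    """Pair the PR author with the committer from a Pull Request API object."""
    merged_by = pull_request.get("merged_by")  # null until the PR is merged
    return {
        "author": pull_request["user"]["login"],
        "merged_by": merged_by["login"] if merged_by else None,
    }


# Trimmed-down payload built from the values shown in the output above.
example_event = {
    "action": "checks_requested",
    "sender": {"login": "mumrah"},
    "merge_group": {
        "head_ref": "refs/heads/gh-readonly-queue/main/pr-18-4f7e4fc2543a8546a9f22345970e96a52b8b2954"
    },
}
print(summarize_merge_group_event(example_event))
```

Inside a workflow triggered by `merge_group`, the same summary could be produced with `summarize_merge_group_event(load_event_from_actions())`.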
Merge Queue Batching
One feature of the merge queue is the ability to batch multiple Pull Requests together. Internally, GitHub creates a temporary branch where all the enqueued PRs are merged using the configured merge method. The "merge_group" workflow is run against that temporary branch. After the workflow completes, the temporary branch is merged into the base branch. The result is the same as if each PR was merged separately.
For example, these two
Both PRs were enqueued at the same time and run as part of the same merge group. However, they result in individual commits with the correct Author.
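The effect of batching on history can be modeled as follows. This is a toy model of the squash merge method, not GitHub's code: the `PullRequest` and `Commit` types, the `squash_merge_group` function, and the contributor names are hypothetical, and the "title (#number)" message format is taken from the commit shown earlier on this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PullRequest:
    number: int
    title: str
    author: str  # the user who opened the PR


@dataclass(frozen=True)
class Commit:
    author: str
    message: str


def squash_merge_group(prs: List[PullRequest]) -> List[Commit]:
    """Produce one squash commit per PR, in queue order.

    Each commit keeps the PR opener as its author, even though the PRs
    were validated together as a single merge group.
    """
    return [
        Commit(author=pr.author, message=f"{pr.title} (#{pr.number})")
        for pr in prs
    ]


# Hypothetical PRs enqueued at the same time.
history = squash_merge_group([
    PullRequest(101, "Fix typo", "contributor-a"),
    PullRequest(102, "Add test", "contributor-b"),
])
for commit in history:
    print(commit.author, "-", commit.message)
```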
If two (or more) conflicting PRs are added to the queue, the later ones are automatically removed from the batch and only the non-conflicting PRs are merged. Here is an example.
Two PRs which will conflict are added to the queue.
After a brief period, the UI shows that the merge queue contains a conflicting patch. After the merge group is finished, one PR is merged and the conflicting PR remains open.