This page documents the steps for using git to contribute to or review Kafka code. These steps probably contain mistakes, and there may be better recipes, so please help improve the page. If you want to push your commits without entering a password, see the Apache git wiki.

Overview

The Kafka development ecosystem involves git (for version control), JIRA (for issue tracking) and Review Board (for reviewing code changes made by contributors). To make contributions easier to manage for both contributors and reviewers, the Kafka project also ships a Python script that automates the steps involved in submitting a patch. These steps are:

  • Creating a patch/diff between the local git repo and the project's remote repo
  • Creating a review request in Review Board and publishing the generated patch/diff
  • Updating the related JIRA with a comment saying that a patch is available and ready for review on Review Board

As you'll notice, this requires automated integration between JIRA and Review Board. The script, named kafka-patch-review.py and included in the Kafka source checkout, is a wrapper around the tools that are shipped by JIRA and Review Board for such integrations. Because it is only a wrapper, you have to install those tools locally before you can use it. This document helps you set up those tools and explains how to use kafka-patch-review.py itself.

Kafka patch review tool

The following sections will help you install and set up the tools that this wrapper script uses for patch submission.

Install/setup jira-python package
Download the jira python package

 

 

(OPTIONAL) Configure JIRA user name and password

During patch submission, kafka-patch-review.py prompts you for the JIRA user name and password that you use for the https://issues.apache.org/jira JIRA instance. The tool uses this information to update the JIRA with the new patch. If you would rather not be prompted each time you submit a patch, you can store your JIRA user name and password in a file named jira.ini in your home directory. The content of such a file would look like:

 

Install/setup review board python tools

This is a quick tutorial on using Review Board with Kafka.

Install the post-review tool

If you are on RHEL, Fedora or CentOS, follow these steps:

If you are on a Debian-based system (like Linux Mint), follow these steps:

If you are on Mac, follow these steps:

For other platforms, follow the instructions here to set up the post-review tool.

Configure Review Board

Then you need to configure a few things to make it work:

First set the Review Board URL to use. You can do this from within git:
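As a minimal sketch, the setting can be stored in the repository's git configuration under the reviewboard.url key, which the Review Board tools read. The URL below is an assumption (the Apache Review Board instance), and the throwaway repository only exists so that the example can be run without touching a real checkout.

```shell
set -eu
# Throwaway repository; in practice, run the git config line inside your Kafka checkout.
sandbox=$(mktemp -d)
git init -q "$sandbox/kafka"
cd "$sandbox/kafka"

# Assumed URL of the Review Board server.
git config reviewboard.url "https://reviews.apache.org"

# Print the stored value to confirm it.
git config --get reviewboard.url
```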

If you checked out using the git-wip http URL, that confusingly won't work with Review Board, so you need to configure an override that uses the non-http URL. You can do this by adding a config file like this:

 
Install the argparse module

Kafka patch review tool usage

 

Upload patch

  1. Specify the branch against which the patch should be created (-b)
  2. Specify the corresponding JIRA (-j)
  3. Specify an optional summary (-s) and description (-d) for the reviewboard

Example:

Update patch

  1. Specify the branch against which the patch should be created (-b)
  2. Specify the corresponding JIRA (-j or --jira)
  3. Specify the rb to be updated (-r)
  4. Specify an optional summary (-s) and description (-d) for the reviewboard, if you want to update it
  5. Specify an optional version of the patch. The version is appended to the JIRA name to create a file named JIRA-<version>.patch, which makes it possible to upload multiple patches to the JIRA. It has no bearing on the Review Board update.

Example:
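For illustration only, the two invocations described above might be assembled as follows. The flags (-b, -j, -s, -d, -r) are the ones listed on this page; the JIRA key, the review request id, and the summary and description text are hypothetical. The sketch prints the command lines instead of running them, so it is safe to try anywhere.

```shell
set -eu
# Dry run: the command lines are only printed, not executed.
remote_branch="origin/trunk"   # must be a remote branch, see the FAQ
jira="KAFKA-0000"              # hypothetical JIRA key
review_id="12345"              # hypothetical Review Board request id

# First upload: branch, JIRA, optional summary and description.
upload_cmd="python kafka-patch-review.py -b $remote_branch -j $jira -s 'Add xyz' -d 'First version'"

# Later update of the same review request.
update_cmd="python kafka-patch-review.py -b $remote_branch -j $jira -r $review_id"

echo "$upload_cmd"
echo "$update_cmd"
```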

FAQ

When I run the script, it throws the following error and exits

Two things can cause this:

  • The code has not been committed to your local branch
  • The -b branch is not pointing to the remote branch. In the example above, "trunk" is specified as the branch, which is the local branch. The correct value for the -b (--branch) option is the remote branch. "git branch -r" gives the list of the remote branch names.

When I run the script, it throws the following error and exits

One of the most common root causes of this error is that the git remote branches are not up to date. Since the script already updates them, the error is probably due to some other problem. Run the script with the --debug option, which makes post-review run in debug mode and report the root cause of the issue.

Simple contributor workflow

This simple workflow works well for developing small features if you don't have direct access to check in to the Apache repository. Let's assume you are working on a feature or bug called xyz:

1. Check out a new repository:

Or, if you already have a copy of the repository, just check for updates:

2. Create and check out a feature branch to work in:

3. Do some work on this branch and periodically commit locally:

4. When done (or periodically) rebase your branch to take any changes from trunk:

5. Make a patch containing your work and upload it to JIRA:

6. You may need to iterate on and rebase your patch a few times as people comment on the code, until a committer checks it in to the main repository.

You will also want to ensure that your username and email are set up correctly so that the source of the contribution is recorded correctly:
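As a rough sketch, the contributor steps above might look like the following. To keep the example safe to run, a throwaway local bare repository stands in for the Apache remote. The paths, the identity, the KAFKA-0000 key and the xyz-v1.patch file name are all placeholders rather than values from this page.

```shell
set -eu
sandbox=$(mktemp -d)

# Sandbox setup: a local bare repo with one commit on trunk stands in for Apache.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/trunk
echo "v1" > README
git add README
git -c user.name="Seed" -c user.email="seed@example.com" commit -q -m "Initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/apache.git"

# Check out a new repository (a real checkout would use the Apache URL).
git clone -q "$sandbox/apache.git" "$sandbox/kafka"
cd "$sandbox/kafka"

# Record your identity so the contribution is attributed correctly.
# Add --global to apply it to every repository on the machine.
git config user.name "Jane Contributor"
git config user.email "jane@example.com"

# Create and check out a feature branch based on the remote trunk.
git checkout -q -b xyz origin/trunk

# Do some work and commit it locally.
echo "feature xyz" > xyz.txt
git add xyz.txt
git commit -q -m "KAFKA-0000: add xyz"

# Pick up upstream changes and rebase the branch onto them.
git fetch -q origin
git rebase -q origin/trunk

# Write the patch to a file that can be attached to the JIRA.
git format-patch origin/trunk --stdout > "$sandbox/xyz-v1.patch"
```

Because the patch is produced with git format-patch, it carries the author name and email configured above, which is what lets a reviewer apply it with the author information intact.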

Reviewer workflow

This assumes you already have a copy of the repository.

1. Make sure your code is up-to-date:

2. Check out the destination branch:

3. See what the patch will do:

4. Check that the patch will apply cleanly (otherwise prod the contributor to rebase):

5. Apply the patch to trunk:

If you get an error that says "Patch does not have a valid e-mail address." then the patch might have been created by doing git diff, in which case you can apply the patch using:

If the am operation failed, you will also need to remove the .git/rebase-apply/ directory that it creates.

6. If things go wrong (tests fail, you find some problem, etc.), you can back out:

7. Push the change back to Apache:
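The reviewer steps above can be tried end to end with a sketch like the one below. It runs in a throwaway repository that builds its own patch, so the initial fetch and the final push are left out; in a real clone the last step would be something like git push origin trunk, assuming the Apache remote is named origin. The --signoff flag, the file names and the KAFKA-0000 key are assumptions made for the example.

```shell
set -eu
sandbox=$(mktemp -d)
export GIT_AUTHOR_NAME="Sandbox User" GIT_AUTHOR_EMAIL="sandbox@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Sandbox setup: a trunk with one commit, plus a contributor patch to review.
git init -q "$sandbox/kafka"
cd "$sandbox/kafka"
git symbolic-ref HEAD refs/heads/trunk
echo "v1" > README
git add README
git commit -q -m "Initial commit"
git checkout -q -b xyz
echo "feature xyz" > xyz.txt
git add xyz.txt
git commit -q -m "KAFKA-0000: add xyz"
git format-patch trunk --stdout > "$sandbox/xyz-v1.patch"

# Check out the destination branch.
git checkout -q trunk

# See what the patch will do.
git apply --stat "$sandbox/xyz-v1.patch"

# Check that it applies cleanly (a non-zero exit status means it does not).
git apply --check "$sandbox/xyz-v1.patch"

# Apply it as a commit, adding a Signed-off-by line for the reviewer.
git am -q --signoff < "$sandbox/xyz-v1.patch"
applied=$(git log -1 --format=%s)

# If tests fail or a problem turns up, back the commit out again.
git reset -q --hard HEAD^

# A patch made with plain git diff has no author header, so git am rejects it;
# git apply changes the working tree without creating a commit.
git diff trunk xyz > "$sandbox/xyz-plain.patch"
git apply "$sandbox/xyz-plain.patch"
```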

Simple Committer Workflow

If you have commit access on the apache repository then you will not be applying patches in the manner described in the reviewer workflow. Instead, once your patch has been reviewed you will check it in yourself as follows:

  1. Create a branch to work on:

  2. Implement the feature.
  3. Rebase:

  4. Post the change to JIRA and get it reviewed.
  5. Push the change back to Apache. Pick one of the following:
    • You should almost always collapse your work into a single check-in in order to avoid cluttering the upstream change-log:

    • If you are absolutely sure you want to preserve your local intermediate check-in history then push directly from your feature branch instead of the above merge (or use merge without the squash option):
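A minimal sketch of the squash option, run in a throwaway repository: two work-in-progress commits on the feature branch end up as a single commit on trunk. The branch name, commit messages and KAFKA-0000 key are placeholders. In a real clone, the squashed commit would then be pushed to the Apache remote.

```shell
set -eu
sandbox=$(mktemp -d)
export GIT_AUTHOR_NAME="Sandbox User" GIT_AUTHOR_EMAIL="sandbox@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Sandbox setup: a trunk with one commit.
git init -q "$sandbox/kafka"
cd "$sandbox/kafka"
git symbolic-ref HEAD refs/heads/trunk
echo "v1" > README
git add README
git commit -q -m "Initial commit"

# Feature branch with two work-in-progress commits.
git checkout -q -b xyz
echo "step 1" > xyz.txt
git add xyz.txt
git commit -q -m "wip: start xyz"
echo "step 2" >> xyz.txt
git commit -q -a -m "wip: finish xyz"

# Collapse the whole branch into one commit on trunk.
git checkout -q trunk
git merge -q --squash xyz
git commit -q -m "KAFKA-0000: add xyz"

# Count the commits on trunk: the initial one plus the squashed one.
trunk_commits=$(git rev-list --count trunk)
echo "commits on trunk: $trunk_commits"
```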

[Note] For the reviewer and simple committer workflows, remember to resolve the corresponding JIRA ticket and set its fix version after the patch is committed.

 

GitHub Workflow

Apache doesn't seem to provide a place to stash your work-in-progress branches, or some of the nice social features GitHub has. This can be a problem for larger features. Here are instructions for using GitHub as a place to stash your work-in-progress changes.

Setting Up

1. As in the other workflows begin by checking out kafka (if you haven't already):

This automatically sets up the remote alias "origin", which refers back to the Apache repo.

2. Create a new GitHub repository on your GitHub account to use for stashing changes. There are various ways to do this; I just forked the apache/kafka repo (https://github.com/apache/kafka), which creates a repo such as https://github.com/jkreps/kafka (where jkreps would be your user name).

3. Add an alias for the GitHub repo to your local repository to avoid typing the full URL:

Now you can push either to origin or to github.

Doing Work

1. Create a branch named xyz in your local repository and check it out:

2. To set up a second machine to work on, clone the GitHub URL.

3. To save your branch to your GitHub repo, do:

4. To pull these changes onto the other machine where you have a copy of the repository, do:

Reviewing and pushing changes back to Apache work just as before.
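The setup and the two-machine round trip described in this section can be sketched as follows. Local bare repositories stand in for the Apache repo and for your GitHub fork, and two directories stand in for the two machines; in practice the "github" remote would point at the URL of your fork. All paths and names here are placeholders.

```shell
set -eu
sandbox=$(mktemp -d)
export GIT_AUTHOR_NAME="Sandbox User" GIT_AUTHOR_EMAIL="sandbox@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Sandbox setup: bare repos standing in for the Apache repo and the GitHub fork.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/trunk
echo "v1" > README
git add README
git commit -q -m "Initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/apache.git"
git clone -q --bare "$sandbox/seed" "$sandbox/fork.git"

# First machine: clone from "Apache" (remote alias origin), then add the fork as "github".
git clone -q "$sandbox/apache.git" "$sandbox/machine1"
cd "$sandbox/machine1"
git remote add github "$sandbox/fork.git"

# Second machine: clone the fork directly, naming the remote "github".
git clone -q -o github "$sandbox/fork.git" "$sandbox/machine2"

# Back on the first machine: create a branch, do some work, save it to the fork.
git checkout -q -b xyz
echo "work in progress" > xyz.txt
git add xyz.txt
git commit -q -m "wip: xyz"
git push -q github xyz

# On the second machine: fetch the new branch and check it out.
cd "$sandbox/machine2"
git fetch -q github
git checkout -q -b xyz github/xyz
```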

Merging GitHub Pull Requests

This section documents the process for reviewing and merging code changes contributed via GitHub Pull Requests. It assumes you have a clone of Kafka's Git repository.

kafka-merge-pr.py is a script that automates the process of accepting a code change into the project. It creates a temporary branch from apache/trunk, squashes the commits in the pull request, rewrites the commit message of the squashed commit to follow a standard format that includes information about each original commit, merges the squashed commit into the temporary branch, pushes the code to apache/trunk, and closes the JIRA ticket. The push is then mirrored to apache-github/trunk, which causes the PR to be closed because of the pattern in the commit message. The script asks for confirmation before making remote updates (i.e. git push and closing the JIRA ticket), so it can still be used if you want to skip those steps.

Setting Up

1. Add aliases for the remotes expected by the merge script (if you haven't already):

2. Install jira-python as described above.
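For step 1, the remote names follow from the branches the script refers to (apache/trunk and apache-github/trunk). A hedged sketch of adding them is shown below: the GitHub URL is the apache/kafka repo mentioned earlier on this page, while the Apache URL is a placeholder that you would replace with the repository URL you normally push to. A throwaway repository is used so that the example can be run safely.

```shell
set -eu
# Throwaway repository; in practice, run the two "git remote add" lines in your Kafka clone.
sandbox=$(mktemp -d)
git init -q "$sandbox/kafka"
cd "$sandbox/kafka"

# Placeholder: substitute the real Apache repository URL.
apache_url="https://example.invalid/asf/kafka.git"

git remote add apache "$apache_url"
git remote add apache-github "https://github.com/apache/kafka.git"

# List the configured remotes to confirm.
git remote -v
```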

Merging

Once the pull request is ready to be merged (it has been reviewed, feedback has been addressed, CI build has been successful and the branch merges cleanly into trunk):

1. Set the JIRA_USERNAME and JIRA_PASSWORD environment variables with the appropriate credentials if you intend to ask the script to close the issue associated with the pull request.

2. Run the merge script:

3. Answer the questions prompted by the script.

How to get your patches reviewed

Please ping the dev mailing list if you have a patch that needs a review, and it will be added to the queue. Issues that currently have a patch available and an assigned reviewer are tracked in JIRA.

