We are always excited to have contributions from the community, especially from new contributors! There are many different ways that you can make contributions to the Daffodil project, including wiki updates, mailing list support, testing, and new code.
The following steps document the workflow to get you started contributing code to Apache Daffodil.
- If you do not have a JIRA account, request one by visiting https://selfserve.apache.org/jira-account.html. Once granted, you will be able to assign bugs to yourself and create new bugs.
You may also want to subscribe to the dev@daffodil.apache.org and commits@daffodil.apache.org mailing lists by sending an email to dev-subscribe@daffodil.apache.org and commits-subscribe@daffodil.apache.org, respectively, and following the instructions.
Search for an existing issue or create a new issue in JIRA that represents the change you would like to make or bug to fix.
If you are a beginner to Daffodil development, a good place to start is with the Daffodil beginner bugs.
See the Daffodil Issue Tracker information for creating issues and what information/discussions should take place in JIRA.
- Assign the issue to yourself so others know that you are working on it.
Visit the Apache Daffodil GitHub and create a fork by clicking on "Fork" in the top right.
Clone your new fork using ssh (you will need to create an ssh key and add it to GitHub if you haven't already). This will be your origin remote:
$ git clone git@github.com:<github_username>/daffodil.git
$ cd daffodil
Add the ASF upstream repository as a new git remote, calling it asf:
$ git remote add asf https://github.com/apache/daffodil.git
$ git fetch asf
It is also recommended to change the push URL of the asf remote to a nonsense string to prevent accidentally pushing to it--branches should only be pushed to your fork:
$ git remote set-url --push asf "push to apache/daffodil disabled"
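To confirm that the push URL was changed, you can inspect the remote. The following is a minimal sketch, run in a throwaway repository so it does not touch your real clone (the temporary directory and the variable names are assumptions for illustration):

```shell
# Throwaway repository used only for illustration
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Remote setup from the steps above; neither command contacts the network
git remote add asf https://github.com/apache/daffodil.git
git remote set-url --push asf "push to apache/daffodil disabled"

# The fetch URL is unchanged; the push URL is now the nonsense string
fetch_url=$(git remote get-url asf)
push_url=$(git remote get-url --push asf)
echo "fetch: $fetch_url"
echo "push:  $push_url"
```

In your real clone, only the two `git remote get-url` commands are needed.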
Create a new branch off of the asf/main branch named daffodil-XYZ-description, where XYZ is the JIRA bug number and -description is an optional, very short description of the bug making it easier to differentiate between multiple development branches. For example:
$ git checkout -b daffodil-123-bitorder-feature asf/main
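If you create branches often, the naming convention can be captured in a small shell function. This is only a sketch: `branch_name` is a hypothetical helper, not something provided by the Daffodil repository.

```shell
# Hypothetical helper: build a branch name from a JIRA bug number and an
# optional short description, following the convention described above
branch_name() {
  number=$1
  description=${2:-}
  if [ -n "$description" ]; then
    echo "daffodil-${number}-${description}"
  else
    echo "daffodil-${number}"
  fi
}

branch_name 123 bitorder-feature
branch_name 456
```

It could then be used as `git checkout -b "$(branch_name 123 bitorder-feature)" asf/main`.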
Make changes to the branch, frequently adding new commits. For example, the following process should repeat until your code is ready to be reviewed:
edit files
$ git add <files that have changed>
$ git commit
Code changes should follow the Daffodil Code Style Guidelines and should add appropriate tests using the Test Data Markup Language (TDML) or unit tests.
General guidelines for a good commit message:
- The first line of a commit message should contain a short (~50 characters) description of the changes.
- The second line should be blank, followed by a longer description of the change, wrapped at 72 characters. This long description should describe what was changed and, more importantly, why those changes were made. The 'what' can be determined by inspecting the code, but the 'why' is often less obvious.
- If there are any changes that deprecate functionality or are non-backwards compatible, a section should follow labeled with the "Deprecation/Compatibility:" keyword with a description that can be copy/pasted into release notes. This should be more user focused, including what was deprecated/non-backwards compatible and a migration guide.
- At the end of the commit should be a blank line followed by the JIRA bug number, e.g. DAFFODIL-123. Multiple bugs referenced in a single commit should be separated by a comma on the same line.
An example of a commit message is:
Add support for the dfdl:bitOrder feature

Longer explanation of what changes were made to support the bitOrder
feature, including a description of why the changes were made. Multiple
lines are wrapped at 72 characters.

Deprecation/Compatibility:

The dfdlx:bitDirection extension property is now deprecated in favor of
the new dfdl:bitOrder property:
- dfdlx:bitDirection="l2r" becomes dfdl:bitOrder="mostSignificantBitFirst"
- dfdlx:bitDirection="r2l" becomes dfdl:bitOrder="leastSignificantBitFirst"

DAFFODIL-123
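The mechanical parts of these guidelines can be checked with a short script. The sketch below is an assumption for illustration, not a tool shipped with Daffodil: `check_commit_msg` is a hypothetical helper, and it treats the approximate 50-character summary guideline as a hard limit for simplicity.

```shell
# Hypothetical checker for the guidelines above. Reads a commit message
# from the given file (or stdin), prints each problem found, and returns
# non-zero if there were any.
check_commit_msg() {
  awk '
    BEGIN { bad = 0 }
    NR == 1 && length($0) > 50 { print "summary line exceeds 50 characters"; bad = 1 }
    NR == 2 && length($0) > 0  { print "second line is not blank"; bad = 1 }
    NR > 2  && length($0) > 72 { print "line " NR " exceeds 72 characters"; bad = 1 }
    length($0) > 0 { last = $0 }   # remember the last non-empty line
    END {
      if (last !~ /^DAFFODIL-[0-9]+(, *DAFFODIL-[0-9]+)*$/) {
        print "last line does not list JIRA bug numbers"; bad = 1
      }
      exit bad
    }
  ' "$@"
}

# Check a sample message written to a temporary file
msg=$(mktemp)
cat > "$msg" <<'EOF'
Add support for the dfdl:bitOrder feature

Longer explanation of what changes were made and why.

DAFFODIL-123
EOF

if check_commit_msg "$msg"; then
  echo "commit message follows the guidelines"
fi
```

To check the most recent commit on your branch, the same function can read from a pipe: `git log -1 --format=%B | check_commit_msg`.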
Your IDE or operating system environment may add files to the git repository that are specific to your development environment, such as temporary backups and IDE project files. These files should be ignored and not committed by git. However, because they are specific to your environment, the Daffodil .gitignore file does not contain entries for them. You may want to add such entries to one of the following system-specific gitignore files instead:
$GIT_DIR/info/exclude
$XDG_CONFIG_HOME/git/ignore
$HOME/.config/git/ignore
See the gitignore man page for more information.
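As a sketch of how this might look, the following adds a few patterns to the per-clone exclude file and checks that they take effect. It runs in a throwaway repository, and the patterns shown are examples only; use whichever ones match the files your own tools create.

```shell
# Throwaway repository used only for illustration
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Per-clone excludes live in $GIT_DIR/info/exclude and are never committed.
# The patterns below are examples, not a recommended list.
exclude_file="$(git rev-parse --git-dir)/info/exclude"
mkdir -p "$(dirname "$exclude_file")"
printf '%s\n' '.idea/' '*.swp' '*~' >> "$exclude_file"

# git check-ignore reports which pattern, if any, matches a given path
touch notes.txt~
git check-ignore -v notes.txt~
```

To apply the same patterns to every repository on your machine, append them to `$HOME/.config/git/ignore` instead.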
When changes are complete, rebase your commits onto the latest asf/main and verify that all tests pass:
$ git fetch asf
$ git rebase asf/main
$ sbt test
Note that you should not use git pull or git merge to sync to the asf repo. Always fetch/rebase and avoid merge commits. Pull requests containing merge commits will be rejected.
If multiple commits were made in step 8 (making changes to the branch), use git rebase -i asf/main to interactively rebase and squash the commits into the smallest number of logical commits. Most commonly this should be a single commit, but there may be some rare cases where multiple commits make sense.
Push your branch to your fork:
$ git push origin daffodil-123-bitorder-feature
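The rules above (rebased onto the latest upstream, no merge commits) can be checked locally, ideally just before pushing. The sketch below is an assumption for illustration: `check_branch` is a hypothetical helper, and the demonstration uses a throwaway repository in which a local main branch stands in for asf/main.

```shell
# Hypothetical helper: succeeds only if HEAD sits on top of the given
# upstream ref and contains no merge commits beyond it
check_branch() {
  upstream=$1
  git merge-base --is-ancestor "$upstream" HEAD || {
    echo "not rebased onto $upstream"; return 1; }
  merges=$(git rev-list --merges --count "$upstream"..HEAD)
  [ "$merges" -eq 0 ] || { echo "$merges merge commit(s) found"; return 1; }
  echo "$(git rev-list --count "$upstream"..HEAD) commit(s) ahead of $upstream"
}

# Demonstration in a throwaway repository; main stands in for asf/main
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Developer"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b daffodil-123-bitorder-feature
git commit -q --allow-empty -m "Add support for the dfdl:bitOrder feature"

check_branch main
```

In a real clone you would run `check_branch asf/main` after `git fetch asf`.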
- Use the GitHub interface to create a pull request for your new branch.
Wait for review comments. There must be at least two +1's from other committers before the change can be merged. If there are any review comments that require changes or the automated Travis CI build fails, create a new commit on your branch (do not squash your changes yet or use git commit --amend) and push your branch with new commits to GitHub for further review. The process should look like:
edit files
$ git add <files that changed>
$ git commit
$ git push origin daffodil-123-bitorder-feature
The pull request will automatically update with your new commit. Repeat this step until at least two +1's are received from committers.
Once at least two +1's are received from committers, a committer can accept the pull request. If you made extra commits in step 12 (responding to review comments), you should now fetch the latest asf, rebase and squash the changes into a single commit (fixing potential conflicts), and push to origin using the --force option:
$ git fetch asf
$ git rebase -i asf/main
$ git push --force origin daffodil-123-bitorder-feature
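When the goal is a single commit, one alternative to picking and squashing in the interactive editor is `git reset --soft`, which moves the branch pointer back to the upstream ref while keeping all changes staged. This is a different technique from the interactive rebase described above, and it is only safe once the branch has already been rebased onto the latest asf/main. The sketch below demonstrates it in a throwaway repository, where a local main branch stands in for asf/main and the file and commit names are made up for illustration.

```shell
# Throwaway repository; a local main branch stands in for asf/main
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Developer"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"

# A feature branch with one original commit and two review-fix commits
git checkout -q -b daffodil-123-bitorder-feature
for n in 1 2 3; do
  echo "change $n" >> file.txt
  git add file.txt
  git commit -q -m "Work in progress $n"
done

# Move the branch pointer back to the upstream ref, keeping every change
# staged, then record the staged changes as one commit
git reset -q --soft main
git commit -q -m "Add support for the dfdl:bitOrder feature

DAFFODIL-123"

# Number of commits now on the branch beyond main
git rev-list --count main..HEAD
```

The branch then still needs to be pushed with `--force`, exactly as in the commands above.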
A committer can now merge the pull request using the GitHub GUI. This is to be done by clicking the "Merge pull request" drop down and selecting "Rebase and merge". The "Create merge commit" and "Squash and merge" options should not be used. For new committers, you may need to link your GitHub and ASF accounts by visiting https://gitbox.apache.org before you can merge.
- The committer that merged the pull request should now mark the JIRA bug as "Resolved" and add a comment with the git commit hash that includes the fix. If you would like to clean up, you can now delete your development branch, either via the GitHub user interface or:
$ git push --delete origin daffodil-123-bitorder-feature
$ git branch -D daffodil-123-bitorder-feature
3 Comments
Mike Beckerle
Use case: A developer does a large volume of work (worth saving), but for some reason has to stop working on it, and we'd like to hand off that work to someone else. It is not ready to be merged.
How is this accomplished?
My assumption is that this developer's fork of the GitHub mirror becomes a remote for someone else, who pulls the branches into their own fork?
Is that correct?
Steve Lawrence
Correct. The original developer (DevA) should push their branch to their DevA fork. The new developer (DevB) should add DevA's fork as a remote, fetch the new branch, and follow this workflow, pushing to their own DevB fork when changes are complete.
Mike Beckerle
I want to fold these learnings into the flow above.
Many of our patch sets are large. We're still refactoring the internals of Daffodil to improve maintainability and performance. So a patch set might modify 80 files. Generally review for large sets of changes like this requires several iterations of a review-fix cycle.
So what we learned is that each time you want code reviewed, you want to push (without --force) a single separate new commit to your branch. This commit should not be squashed together with any commit from a prior review, but should squash together all commits for changes since the prior review. Your work may go through several cycles of commit, push, get review comments, make changes to respond to them (doing local commits as often as you want), squash local commits into a "next review commit", push, repeat.
Let's say your review-fix cycle takes 3 iterations. Then at the end there should be 3 commits on that branch that become part of the pull-request for review. Each commit gathers comments as part of reviewing, and the response to those comments is a new separate commit on the branch.
Many developers use a "commit often" discipline. Those commits are to your local clone of your fork repository. Commit as often as you want there. When it is time to code review, squash all the commits together into a single commit of changes since the prior review.
When review comments from two reviewers come back +1, then it is time to incorporate the change into the main branch.
At this point, squash together all the review commits so you have one commit.
This one you must 'git push --force' to your branch on origin (aka your fork repo).
The code-review UI, where you can see changes and comments interleaved into the code, can no longer retrieve those comments once the commits they are on have been squashed together.
That means review comments are not a matter of permanent record. Every one is sent to the dev mailing list, so they are recorded in that way, but you won't be able to open the pull request and revisit comments on the individual review commits any longer.
This means we need to follow this policy:
That means when doing code review - it's useful to remind contributors to put things into code comments.