We are always excited to have contributions from the community, especially from new contributors! There are many different ways that you can make contributions to the Daffodil project, including wiki updates, mailing list support, testing, and new code.

The following steps document the workflow for contributing code to Apache Daffodil.

  1. If you do not have a JIRA account, request one by visiting https://selfserve.apache.org/jira-account.html. Once granted, you will be able to assign bugs to yourself and create new bugs.
     
  2. You may also want to subscribe to the dev@daffodil.apache.org and commits@daffodil.apache.org mailing lists by sending an email to dev-subscribe@daffodil.apache.org and commits-subscribe@daffodil.apache.org, respectively, and following the instructions.

     
  3. Search for an existing issue or create a new issue in JIRA that represents the change you would like to make or bug to fix.

    If you are a beginner to Daffodil development, a good place to start is with the Daffodil beginner bugs.

    See the Daffodil Issue Tracker information for creating issues and what information/discussions should take place in JIRA.
     

  4. Assign the issue to yourself so others know that you are working on it.

  5. Visit the Apache Daffodil GitHub and create a fork by clicking on "Fork" in the top right.
     

  6. Clone your new fork using ssh (you will need to create an ssh key and add it to GitHub if you haven't already). This will be your origin remote:

    $ git clone git@github.com:<github_username>/daffodil.git
    $ cd daffodil

  7. Add the ASF upstream repository as a new git remote, calling it asf:

    $ git remote add asf https://github.com/apache/daffodil.git
    $ git fetch asf

    It is also recommended to change the push URL of the asf remote to a nonsense string to prevent accidentally pushing to it--branches should only be pushed to your fork:

    $ git remote set-url --push asf "push to apache/daffodil disabled"

  8. Create a new branch off of the asf/main branch named daffodil-XYZ-description, where XYZ is the JIRA bug number and -description is an optional, very short description of the bug making it easier to differentiate between multiple development branches. For example:

    $ git checkout -b daffodil-123-bitorder-feature asf/main

  9. Make changes to the branch, frequently adding new commits. For example, the following process should repeat until your code is ready to be reviewed:

    edit files
    $ git add <files that have changed>
    $ git commit

    Code changes should follow the Daffodil Code Style Guidelines and should add appropriate tests using the Test Data Markup Language (TDML) or unit tests.

    General guidelines for a good commit message:

    - The first line of a commit message should contain a short (~50 characters) description of the changes.

    - The second line should be blank, followed by a longer description of the change, wrapped at 72 characters. This long description should describe what was changed and, more importantly, why those changes were made. The 'what' can be determined by inspecting the code, but the 'why' is often less obvious.

    - If there are any changes that deprecate functionality or are non-backwards compatible, a section should follow labeled with the "Deprecation/Compatibility:" keyword with a description that can be copy/pasted into release notes. This should be more user focused, including what was deprecated/non-backwards compatible and a migration guide.

    - At the end of the commit should be a blank line followed by the JIRA bug number, e.g. DAFFODIL-123. Multiple bugs referenced in a single commit should be separated by a comma on the same line.

    An example of a commit message is:

    Add support for the dfdl:bitOrder feature
     
    Longer explanation of what changes were made to support the bitOrder
    feature, including a description of why the changes were made. Multiple
    lines are wrapped at 72 characters.
    
    Deprecation/Compatibility:
    
    The dfdlx:bitDirection extension property is now deprecated in favor of
    the new dfdl:bitOrder property:
    
    - dfdlx:bitDirection="l2r" becomes dfdl:bitOrder="mostSignificantBitFirst"
    - dfdlx:bitDirection="r2l" becomes dfdl:bitOrder="leastSignificantBitFirst"
    
    DAFFODIL-123


    Your IDE or operating system environment may add files to the git repository that are specific to your development environment, such as temporary backups and IDE project files. These files should be ignored and not committed to git. However, because they are specific to your environment, the Daffodil .gitignore file does not contain entries for them. You may want to add such entries to one of the following system-specific gitignore files instead:

    • $GIT_DIR/info/exclude 
    • $XDG_CONFIG_HOME/git/ignore 
    • $HOME/.config/git/ignore 

    See the gitignore man page for more information.

  10. When changes are complete, rebase your commits onto the latest asf/main and verify that all tests pass:

    $ git fetch asf
    $ git rebase asf/main
    $ sbt test

    Note that you should not use git pull or git merge to sync to the asf repo. Always fetch/rebase and avoid merge commits. Pull requests containing merge commits will be rejected.


  11. If multiple commits were made in step 9, use git rebase -i asf/main to interactively rebase and squash the commits into the smallest number of logical commits. Most commonly this should be a single commit, but there may be some rare cases where multiple commits make sense.

  12. Push your branch to your fork:

    $ git push origin daffodil-123-bitorder-feature

  13. Use the GitHub interface to create a pull request for your new branch.

  14. Wait for review comments. There must be at least two +1's from other committers before the change can be merged. If any review comments require changes or the automated Travis CI build fails, create a new commit on your branch (do not squash your changes yet or use git commit --amend) and push your branch with the new commits to GitHub for further review. The process should look like:

    edit files
    $ git add <files that changed>
    $ git commit
    $ git push origin daffodil-123-bitorder-feature

    The pull request will automatically update with your new commit. Repeat this step until at least two +1's are received from committers.
     

  15. Once at least two +1's are received from committers, a committer can accept the pull request. If you made extra commits in step 14, you should now fetch the latest asf, rebase and squash the changes into a single commit (fixing potential conflicts), and push to origin using the --force option:

    $ git fetch asf
    $ git rebase -i asf/main
    $ git push --force origin daffodil-123-bitorder-feature

  16. A committer can now merge the pull request using the GitHub GUI by clicking the "Merge pull request" drop-down and selecting "Rebase and merge". The "Create merge commit" and "Squash and merge" options should not be used. New committers may need to link their GitHub and ASF accounts by visiting https://gitbox.apache.org before they can merge.
     

  17. The committer that merged the pull request should now mark the JIRA bug as "Resolved" and add a comment with the git commit hash that includes the fix.

  18. If you would like to clean up, you can now delete your development branch, either via the GitHub user interface or:

    $ git push --delete origin daffodil-123-bitorder-feature
    $ git branch -D daffodil-123-bitorder-feature
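
Steps 6 through 12 can be rehearsed locally before touching GitHub. The following is a minimal sketch, not part of the official workflow. It assumes two throwaway bare repositories (the hypothetical names asf.git and fork.git) stand in for apache/daffodil and your fork, and it uses a made-up README file, commit messages, and author identity. Because a script cannot drive the interactive editor, it swaps git rebase -i for git reset --soft followed by a fresh commit, which gives the same result in the common case of squashing everything into a single commit.

```shell
#!/bin/sh
# Rehearsal of steps 6-12 in a scratch directory. All paths, file names, and
# identities below are hypothetical stand-ins, not real Daffodil resources.
set -eu

work=$(mktemp -d)
cd "$work"

# Throwaway identity so commits succeed even without a global git config.
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.invalid

# Seed repository with one commit on main, then two bare copies:
# asf.git plays the ASF repository, fork.git plays your GitHub fork.
git init --quiet seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "initial" > seed/README
git -C seed add README
git -C seed commit --quiet -m "Initial commit"
git clone --quiet --bare seed asf.git
git clone --quiet --bare seed fork.git

# Step 6: clone the fork; it becomes the origin remote.
git clone --quiet fork.git daffodil
cd daffodil

# Step 7: add the asf remote and disable pushing to it.
git remote add asf "$work/asf.git"
git fetch --quiet asf
git remote set-url --push asf "push to apache/daffodil disabled"

# Step 8: create the development branch from asf/main.
git checkout --quiet -b daffodil-123-bitorder-feature asf/main

# Step 9: commit often while working.
echo "first change" >> README
git commit --quiet -am "Work in progress"
echo "second change" >> README
git commit --quiet -am "More work in progress"

# Step 10: sync with fetch and rebase, never pull or merge.
git fetch --quiet asf
git rebase --quiet asf/main

# Step 11 (non-interactive stand-in for 'git rebase -i asf/main'):
# keep the changes staged, drop the intermediate commits, commit once.
git reset --quiet --soft asf/main
git commit --quiet -m "Add support for the dfdl:bitOrder feature" -m "DAFFODIL-123"

# Step 12: push the branch to the fork.
git push --quiet origin daffodil-123-bitorder-feature

# An accidental push to asf must fail because of the nonsense push URL.
if git push --quiet asf daffodil-123-bitorder-feature 2>/dev/null; then
  echo "unexpected: push to asf succeeded" >&2
  exit 1
fi

commits_ahead=$(git rev-list --count asf/main..HEAD)
echo "commits ahead of asf/main: $commits_ahead"
echo "asf push URL: $(git remote get-url --push asf)"
```

In real use, replace the local paths with the GitHub URLs from steps 6 and 7, and use git rebase -i asf/main for the squash whenever more than one logical commit should survive.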

3 Comments

  1. Use case: A developer does a large volume of work (worth saving), but for some reason has to stop working on it, and we'd like to hand off that work to someone else. It is not ready to be merged.

    How is this accomplished?

    My assumption is that this developer's fork of the GitHub mirror becomes a remote for someone else, who pulls the branches into their own fork?

    Is that correct?

    1. Correct. The original developer (DevA) should push their branch to their DevA fork. The new developer (DevB) should add DevA's fork as a remote, fetch the new branch, and follow this workflow, pushing to their own DevB fork when changes are complete. 

  2. I want to fold these learnings into the flow above.

    Many of our patch sets are large. We're still refactoring the internals of Daffodil to improve maintainability and performance. So a patch set might modify 80 files. Generally review for large sets of changes like this requires several iterations of a review-fix cycle.

    So what we learned is that each time you want code reviewed, you should push (without --force) a single separate new commit to your branch. This commit should not be squashed together with any commit from a prior review, but should squash together all commits for changes since the prior review. Your work may go through several cycles of commit, push, get review comments, make changes to respond to them (doing local commits as often as you want), squash local commits into a "next review commit", push, repeat.

    Let's say your review-fix cycle takes 3 iterations. Then at the end there should be 3 commits on that branch that become part of the pull-request for review. Each commit gathers comments as part of reviewing, and the response to those comments is a new separate commit on the branch.

    Many developers use a "commit often" discipline. Those commits are to your local clone of your fork repository. Commit as often as you want there. When it is time to code review, squash all the commits together into a single commit of changes since the prior review.

    • It's very important that each new commit you add for review be separate, not squashed into anything already reviewed before it. This preserves the commentary on prior commits for your pull request. It allows reviewers to see how your new changes addressed the prior comments.
    • You should never need to "git push --force ...." anything during the review-fix cycle. If you do, you have done something wrong, such as squashing reviewed commits together with post-review ones.

    When review comments from two reviewers come back +1, then it is time to incorporate the change into the main branch.

    At this point, squash together all the review commits so you have one commit.

    This one you must 'git push --force' to your branch on origin (aka your fork repo).

    • Note: doing this will make it impossible to revisit review comments from the commits that have been squashed together - basically the commentary is lost once the commit is squashed together.

    The code-review UI, where you can see changes and comments interleaved with the code, can no longer retrieve those comments once the commits they are attached to have been squashed together.

    That means review comments are not a matter of permanent record - though every one is sent to the dev mailing list, so they're recorded in that way, but you won't be able to open the pull request and revisit comments on the individual review commits any longer.

    This means we need to follow this policy:

    • If a review comment contains any description or discussion that should be preserved, it should be edited into a comment in the code (perhaps with a TODO or FIXME tag so it's easy to find).

    That means when doing code review - it's useful to remind contributors to put things into code comments.
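
The review-fix cycle described in the comment above can be tried out in a scratch directory. This is a rough sketch under stated assumptions, not an agreed project procedure: the bare repositories asf.git and fork.git, the README file, the commit messages, and the author identity are all hypothetical. It uses git reset --soft against the remote-tracking branch as one possible way to squash only the commits made since the last pushed review commit. The comment itself does not name a specific command for that.

```shell
#!/bin/sh
# Rehearsal of one review-fix iteration followed by the final squash.
# All paths, file names, and identities are hypothetical stand-ins.
set -eu

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.invalid
branch=daffodil-123-bitorder-feature

# Local stand-ins for apache/daffodil (asf.git) and your fork (fork.git).
git init --quiet seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "initial" > seed/README
git -C seed add README
git -C seed commit --quiet -m "Initial commit"
git clone --quiet --bare seed asf.git
git clone --quiet --bare seed fork.git
git clone --quiet fork.git daffodil
cd daffodil
git remote add asf "$work/asf.git"
git fetch --quiet asf
git checkout --quiet -b "$branch" asf/main

# First review commit: pushed and opened as a pull request.
echo "feature" >> README
git commit --quiet -am "Add support for the dfdl:bitOrder feature"
git push --quiet origin "$branch"

# Responding to review comments: commit locally as often as you like.
echo "fix 1" >> README
git commit --quiet -am "wip: address comment 1"
echo "fix 2" >> README
git commit --quiet -am "wip: address comment 2"

# Squash only the commits made since the last push into one
# "next review commit". origin/$branch still points at the reviewed commit.
git reset --quiet --soft "origin/$branch"
git commit --quiet -m "Address first round of review comments"

# This push is a fast-forward, so no --force is needed.
git push --quiet origin "$branch"
review_commits=$(git rev-list --count asf/main..HEAD)

# After two +1's: rebase, squash everything into one commit, force push.
git fetch --quiet asf
git rebase --quiet asf/main
git reset --quiet --soft asf/main
git commit --quiet -m "Add support for the dfdl:bitOrder feature" -m "DAFFODIL-123"
git push --quiet --force origin "$branch"
final_commits=$(git rev-list --count asf/main..HEAD)

echo "commits under review before final squash: $review_commits"
echo "commits after final squash: $final_commits"
```

The only forced push in the sketch is the last one, which matches the rule that --force should never be needed while review is still in progress.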