
  1. If your source code was checked out from the SVN repo, scroll down for the guide to migrating to the git repo.
  2. If your source code was cloned from the previously used git repo for primary development, scroll down for the guide to migrating to the Github repo.
  3. Currently, the Github repository can be used as the primary repo. If you are looking for fresh getting-started instructions, skip the migration guides and start with the Developer Workflow and/or the Contribution workflow using Github's pull requests.

Migrating from an existing SVN checkout of Tika to Git


  1. svn status (ensure there are no local changes)
  2. mv .svn .svn.old (or simply find . -name "*.svn" -exec rm -rf {} \;)
  3. git init
  4. git remote add origin
  5. git checkout -b merge-branch
  6. git fetch --all
  7. git reset --hard origin/master
  8. git checkout master

And on my Tika 2.x checkout the last two steps were changed to:

  1. git reset --hard origin/2.x
  2. git checkout 2.x
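
The SVN-to-git steps above can be collected into one shell function. This is only a minimal sketch, not part of the original guide: the function name `migrate_svn_checkout` and its arguments are hypothetical, the repository URL has to be supplied by you (it is not shown in the steps above), and `svn status` is assumed to have been checked by hand beforehand.

```shell
# Sketch: turn the SVN checkout in the current directory into a git checkout.
# Usage: migrate_svn_checkout <repo-url> [branch]
# Assumption: "svn status" was already run by hand and showed no local changes.
migrate_svn_checkout() {
    repo_url="$1"            # hypothetical argument: the git repository URL
    branch="${2:-master}"    # pass 2.x for a Tika 2.x checkout

    # Keep the old SVN metadata around instead of deleting it.
    if [ -d .svn ]; then
        mv .svn .svn.old || return 1
    fi

    git init -q . &&
    git remote add origin "$repo_url" &&
    git checkout -q -b merge-branch &&
    git fetch -q --all &&
    git reset -q --hard "origin/$branch" &&
    git checkout -q "$branch"
}
```

Note that `git reset --hard` overwrites the files in the working directory with the contents of the remote branch, which is why the check for local changes comes first.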

Migrating to Github from

If you cloned the repo from its previous host, git-wip-us.a.o, and want to migrate to Github as your primary repo, follow these instructions:

  1. Update your repository's remote URL:

     git remote set-url origin 

  2. If it is not already enabled, go to your Github account settings and turn on two-factor authentication (2FA) for your account. This step is mandatory.
  3. Connect your Apache account to your Github account using the setup page. If you type the URL manually, note that there is a '/' at the end of setup/
  4. Authorize your Apache account, and then authorize your Github account.

     1. If all three boxes are green with a tick inside them, your setup is complete.
     2. The Infra team has noted on another mailing list that the 'MFA status' box sometimes does not tick on the first pass. The MFA status updater runs hourly, so wait at least an hour. This usually happens when you authorize a Github account that does not yet have 2FA enabled.
     3. If you face issues, report them to the Infra team.
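
After changing the remote in step 1, it can help to confirm that `origin` really points at the new location. The following is a small sketch under assumptions: the function name `switch_origin` is hypothetical, and the URL is whatever Github repository URL you pass in (the guide does not show it here).

```shell
# Sketch: point "origin" at a new URL, then print the URL git now uses.
# Usage: switch_origin <new-url>
switch_origin() {
    new_url="$1"    # hypothetical argument: the Github repository URL
    git remote set-url origin "$new_url" &&
    git remote get-url origin
}
```

If the printed URL is not the one you expected, rerun `git remote set-url` before fetching or pushing.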


Once you have the code, you can modify one or more files, stage them, and commit them. You can then push the commits to the master repository if you have write access, that is, if you are a member of the PMC and/or a committer. If you don't have write access to the repository, you can follow this guide for issuing pull requests that committers/PMC members can merge.


  1. git checkout -b TIKA-xxx (creates the branch and switches to it)
  2. Modify files, test, etc.
  3. git add <changed files> (to see them, try git status; or, to stage all changes, git add *)
  4. git commit -m "Fixes for TIKA-xxx contributed by <your first name> <your last name> <your email>"
  5. git checkout master
  6. git diff master..TIKA-xxx > TIKA-xxx.<your last name>.<yyMMdd>.patch.txt
  7. Attach the patch created in step 6 to JIRA.
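
The patch-creation step can be wrapped in a small helper that builds the file name from the naming pattern above. This is a sketch rather than an official tool: the function name `make_patch` and its arguments are hypothetical, and it assumes the base branch is called master.

```shell
# Sketch: write the diff between master and a feature branch to a patch file
# named <issue>.<last name>.<yyMMdd>.patch.txt, then print the file name.
# Usage: make_patch <issue-branch> <last-name>
make_patch() {
    issue="$1"        # feature branch name, e.g. TIKA-xxx
    last_name="$2"
    patch_file="$issue.$last_name.$(date +%y%m%d).patch.txt"
    git diff "master..$issue" > "$patch_file" &&
    echo "$patch_file"
}
```

For example, `make_patch TIKA-xxx <your last name>` run from the repository root leaves the patch file in the current directory, ready to attach to JIRA.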

If you are not looking for a review or you have already had a review and are ready to commit the changes, push them:


  1. git push -u origin master

Suggested User Contribution Workflow


  1. git apply < TIKA-xxx.<your last name>.<yyMMdd>.patch.txt
  2. Steps 3-4 from the Developer workflow guide.
  3. The final two steps from the Developer workflow guide (that is, merging the feature branch and then pushing to origin master).
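
Before applying a contributed patch, it can be useful to check that it applies cleanly. The sketch below uses `git apply --check` as a dry run before the real `git apply`; the function name `apply_patch` is hypothetical and the dry-run step is an addition not found in the list above.

```shell
# Sketch: apply a contributed patch only if it applies cleanly.
# Usage: apply_patch <patch-file>
apply_patch() {
    patch_file="$1"
    # Dry run: reports problems without changing any files.
    git apply --check "$patch_file" || return 1
    # Real run: modifies the working tree; staging and committing come next.
    git apply "$patch_file"
}
```

If the dry run fails, the working tree is left untouched, so you can ask the contributor for an updated patch.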

Github contribution

If a contributor submits a Pull Request on Github for review, you can merge it into your local feature branch using Git. Assume the Github user is "user01" and the branch they created is "fix-tika-stuff". You can find this information in the Tika pull request; for example, in pull request #65 the username is smahda and the branch name is TIKA-1803. First make sure you are on the local feature branch for your issue, TIKA-xxx, by typing git checkout TIKA-xxx if you aren't already. Then merge the pull request into your local feature branch like so: