Comment: HIVE-28420: URL to Hive mailing lists is incorrect

...

Get the source code on your local drive using git. See Understanding Hive Branches below to understand which branch you should be using.

...

This checklist tells you how to create accounts and obtain permissions needed by Hive contributors. See the Hive website for additional information.

  •  Request an Apache Software Foundation JIRA account, if you do not already have one.
    • The ASF JIRA system dashboard is here.
    • The Hive JIRA is here. 
  •  To review patches for JIRA tickets, use GitHub or the Review Board. If you need a Review Board account, register here. See below for more information.
    • All Hive patches posted for review are listed here.
    • Individual JIRA tickets provide a link to the issue on the review board when a review request has been made. For simple reviews, you can just read the patch attached to the JIRA ticket and post a comment.
  •  To contribute to the Hive wiki, follow the instructions in About This Wiki.
  •  To edit the Hive website, follow the instructions in How to edit the website.
  •  Join the Hive mailing lists to receive email about issues and discussions.

Making Changes

If you're a newcomer, feel free to contribute by working on a newbie task.

Before you start, send a message to the Hive developer mailing list, or file a bug report in JIRA. Describe your proposed changes and check that they fit in with what others are doing and have planned for the project. Be patient; it may take folks a while to understand your requirements.

...

Understanding Hive Branches

Hive has a few "main lines", master and branch-X.

...

  • Add a new XXXXXX.q file in ql/src/test/queries/clientpositive. (Optionally, add a new XXXXXX.q file for a query that is expected to fail in ql/src/test/queries/clientnegative.)
  • Run mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=XXXXXX.q -Dtest.output.overwrite=true. This will generate a new XXXXXX.q.out file in ql/src/test/results/clientpositive.
    • If you want to run multiple .q files in the test run, you can specify comma separated .q files, for example -Dqfile="X1.q,X2.q". You can also specify a Java regex, for example -Dqfile_regex='join.*'. (Note that it takes Java regex, i.e., 'join.*' and not 'join*'.) The regex match first removes the .q from the file name before matching regex, so specifying join*.q will not work.
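
As a rough sketch of the options described above, the helper below only prints the Maven command for one or more .q files so that it can be reviewed before running. The function name qtest_cmd is hypothetical and not part of Hive; the driver name and flags are the ones mentioned in this section.

```shell
# Hypothetical helper (not part of Hive): prints the Maven command for the
# given test driver and .q files instead of running it.
qtest_cmd() {
  driver="$1"
  shift
  # Join the remaining arguments with commas, the form -Dqfile accepts.
  qfiles="$(IFS=,; echo "$*")"
  echo "mvn test -Dtest=${driver} -Dqfile=${qfiles} -Dtest.output.overwrite=true"
}

qtest_cmd TestMiniLlapLocalCliDriver X1.q X2.q
```

Copy the printed line into your shell once it looks right. To select files by pattern instead, replace the -Dqfile argument with -Dqfile_regex and a Java regex such as 'join.*'.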

...

If you need to make changes, push further commits to the branch. Keep in mind that each new commit triggers a new test run, and that a test run takes hours to complete.

Please do not:

  • reformat code unrelated to the issue being fixed: formatting changes should be separate patches/commits;
  • comment out code that is now obsolete: just remove it;
  • insert comments around each change, marking the change: folks can use git to figure out what's changed and by whom;
  • make things public that are not required by end-users.

...

  • try to adhere to the coding style of files you edit;
  • comment code whose function or rationale is not obvious;
  • add one or more unit tests (see Add a Unit Test above);
  • update documentation (such as Javadocs including package.html files and this wiki).

Precommit Tests by Hive QA

If the name of your patch conforms to the naming convention shown above, the automated testing system will run precommit tests and post the results as a JIRA comment from Hive QA. The results give advisory +1 or -1 votes (SUCCESS or ERROR) based on whether all of the tests executed successfully and, more recently, whether existing tests are modified or new tests are included in the patch to cover the code changes. For examples, see the Hive QA comments on HIVE-9534 and HIVE-11752. Note that sometimes tests fail for reasons unrelated to the patch.

To prevent precommit testing, include the case-sensitive phrase NO PRECOMMIT TESTS in the Description section of the JIRA issue. You can remove it later as needed. For examples, see HIVE-5289, HIVE-7343, and HIVE-7375.

Applying a Patch

To apply a patch that you either generated or found from JIRA, you can issue:

Code Block
patch -p0 < cool_patch.patch

If you prefer to use git to apply the patch, the following command patches the tree and runs git add on the changed files. This is very useful because it does not miss added or renamed files, and it lets git keep track of conflicts, so git mergetool can be used to resolve them:

Code Block
git apply -3 -p0 HIVE-1111.1.patch

If you just want to check whether the patch applies you can run patch with --dry-run option:

Code Block
patch -p0 --dry-run < cool_patch.patch
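
To see what the dry run does and does not change, here is a small self-contained sketch. It works in a throwaway directory with a made-up file (src/greeting.txt) and a hand-written patch; the function name demo_patch_dry_run and all file contents are illustrative assumptions, not part of Hive.

```shell
# Illustrative only: everything happens in a temporary directory that is
# removed when the function's subshell exits.
demo_patch_dry_run() (
  set -e
  workdir="$(mktemp -d)"
  trap 'rm -rf "$workdir"' EXIT
  cd "$workdir"

  mkdir -p src
  printf 'hello\n' > src/greeting.txt

  # A minimal unified diff with paths relative to the tree root (hence -p0).
  cat > cool_patch.patch <<'EOF'
--- src/greeting.txt
+++ src/greeting.txt
@@ -1 +1 @@
-hello
+hello hive
EOF

  # The dry run only reports whether the patch would apply.
  patch -p0 --dry-run < cool_patch.patch > /dev/null
  echo "after dry run: $(cat src/greeting.txt)"

  # The real run modifies the file.
  patch -p0 < cool_patch.patch > /dev/null
  echo "after apply: $(cat src/greeting.txt)"
)
```

Calling demo_patch_dry_run prints the file content after each step: the dry run leaves the file untouched, and only the second invocation changes it.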

If you are an Eclipse user, you can apply a patch by:

  1. Right click project name in Package Explorer.
  2. Team -> Apply Patch.

Review Process

See Review Board for instructions.

  • Use Hadoop's code review checklist as a rough guide when doing reviews.
  • In JIRA, use 'Submit Patch' to get your review request into the queue.
  • If a committer requests changes, set the issue status to 'Resume Progress', then once you're ready, submit an updated patch with necessary fixes and then request another round of review with 'Submit Patch' again.
  • Once your patch is accepted, be sure to upload a final version which grants rights to the ASF.

Contributing Your Work

Finally, patches should be attached to an issue report in JIRA via the Attach File link on the issue's JIRA. Please add a comment that asks for a code review. Please note that the attachment should be granted license to ASF for inclusion in ASF works (as per the Apache License).

When you believe that your patch is ready to be committed, select the Submit Patch link on the issue's JIRA. Unit tests will run automatically if the file is named according to the naming standards. See Hive PreCommit Patch Testing. Tests should all pass. If your patch involves performance optimizations, they should be validated by benchmarks that demonstrate an improvement.

Fetching a PR from Github

You can fetch a pull request into a local branch using:

Code Block
git fetch origin pull/ID/head:BRANCHNAME

Suppose you want to pull the changes of PR-1234 into a local branch named "radiator":

Code Block
git fetch origin pull/1234/head:radiator
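
If you fetch pull requests often, the command can be wrapped in a small helper. The sketch below is only an illustration: the function name fetch_pr_cmd is hypothetical, and it assumes that origin is the remote that hosts the pull requests. It prints the command rather than running it, and rejects a non-numeric ID.

```shell
# Hypothetical helper: builds the fetch command for a PR id and branch name.
# Assumes the remote named "origin" hosts the pull requests.
fetch_pr_cmd() {
  pr_id="$1"
  branch="$2"
  case "$pr_id" in
    ''|*[!0-9]*)
      echo "PR id must be a number: '$pr_id'" >&2
      return 1
      ;;
  esac
  echo "git fetch origin pull/${pr_id}/head:${branch}"
}

fetch_pr_cmd 1234 radiator
```

After running the printed command, git checkout radiator switches to the fetched branch.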

Contributing Your Work

You should open a JIRA ticket about the issue you are about to fix.

Upload your changes to your GitHub fork and open a PR against the Hive repo.

If your patch creates an incompatibility with the latest major release, then you must set the Incompatible change flag on the issue's JIRA and fill in the Release Note field with an explanation of the impact of the incompatibility and the necessary steps users must take.

If your patch implements a major feature or improvement, then you must fill in the Release Note field on the issue's JIRA with an explanation of the feature that will be comprehensible by the end user.

The Release Note field can also document changes in the user interface (such as new HiveQL syntax or configuration parameters) prior to inclusion in the wiki documentation.

A committer should evaluate the patch within a few days and either: commit it; or reject it with an explanation.

Please be patient. Committers are busy people too. If no one responds to your patch after a few days, please make friendly reminders. Please incorporate others' suggestions into your patch if you think they're reasonable. Finally, remember that even a patch that is not committed is useful to the community.

Should your patch receive a "-1", select Resume Progress on the issue's JIRA, upload a new patch with necessary fixes, and then select the Submit Patch link again.

Committers: for non-trivial changes, it is best to get another committer to review your patches before commit. Use the Submit Patch link like other contributors, and then wait for a "+1" from another committer before committing. Please also try to frequently review things in the patch queue.

JIRA

Hive uses JIRA for issues/case management. You must have a JIRA account in order to log cases and issues.

Requests for the creation of new accounts can be submitted via the following form: https://selfserve.apache.org/jira-account.html

Guidelines

Please comment on issues in JIRA, making your concerns known. Please also vote for issues that are a high priority for you.

Please refrain from editing descriptions and comments if possible, as edits spam the mailing list and clutter JIRA's "All" display, which is otherwise very useful. Instead, preview descriptions and comments using the preview button (icon below the comment box) before posting them.

Keep descriptions brief and save more elaborate proposals for comments, since descriptions are included in JIRA's automatically sent messages. If you change your mind, note this in a new comment, rather than editing an older comment. The issue should preserve this history of the discussion.

To open a JIRA issue, click the Create button on the top line of the Hive summary page or any Hive JIRA issue.

Please leave Fix Version/s empty when creating the issue – it should not be tagged until an issue is closed, and then, it is tagged by the committer closing it to indicate the earliest version(s) the fix went into. Instead of Fix Version/s, use Target Version/s to request which versions the new issue's patch should go into. (Target Version/s was added to the Create Issue form in November 2015. You can add target versions to issues created before that with the Edit button, which is in the upper left corner.)

Consider using bi-directional links when referring to other tickets. It is very common and convenient to refer to other tickets by adding the HIVE-XXXXX pattern in summary, description, and comments. The pattern allows someone to navigate quickly to an older JIRA from the current one but not the other way around. Ideally, along with the mention (HIVE-XXXXX) pattern, it helps to add an explicit link (relates to, causes, depends upon, etc.) so that the relationship between tickets is visible from both ends.

Add the "backward-incompatible" label to tickets that change the behavior of a component or introduce modifications to public APIs. There are various other labels available for similar purposes, but this is the most widely used across projects, so it is better to stick to it to keep things uniform.

When in doubt about how to fill in the Create Issue form, take a look at what was done for other issues. Here are several Hive JIRA issues that you can use as examples:

Many examples of uncommitted issues are available in the "Added recently" list on the issues panel.

Generating Thrift Code

Some portions of the Hive code are generated by Thrift. For most Hive changes, you don't need to worry about this, but if you modify any of the Thrift IDL files (e.g., standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift and service-rpc/if/TCLIService.thrift), then you'll also need to regenerate these files and submit their updated versions as part of your patch.

Here are the steps relevant to hive_metastore.thrift:

  1. Don't make any changes to hive_metastore.thrift until instructed below.
  2. Use the approved version of Thrift. This is currently thrift-0.14.1, which you can obtain from http://thrift.apache.org/.
    1. For Mac via Homebrew (since the version we need is not available by default):

      Code Block
      languagebash
      brew tap-new $USER/local-tap
      brew extract --version='0.14.1' thrift $USER/local-tap
      brew install thrift@0.14.1
      mkdir -p /usr/local/share/fb303/if
      cp /usr/local/Cellar/thrift@0.14.1/0.14.1/share/fb303/if/fb303.thrift /usr/local/share/fb303/if


    2. For Mac, building from sources:

      Code Block
      languagebash
      wget http://archive.apache.org/dist/thrift/0.14.1/thrift-0.14.1.tar.gz
      tar xzf thrift-0.14.1.tar.gz
      cd thrift-0.14.1

      # Install OpenSSL before configuring, since configure is pointed at it below.
      brew install openssl
      sudo ln -s /usr/local/opt/openssl/include/openssl/ /usr/local/include/

      # If configure fails with "syntax error near unexpected token `QT5", run "brew install pkg-config".
      ./bootstrap.sh
      # Adjust the --with-openssl path to the OpenSSL version actually installed.
      sudo ./configure --with-openssl=/usr/local/Cellar/openssl@1.1/1.1.1j --without-erlang --without-nodejs --without-python --without-py3 --without-perl --without-php --without-php_extension --without-ruby --without-haskell --without-go --without-swift --without-dotnetcore --without-qt5
      sudo make
      sudo make install

      mkdir -p /usr/local/share/fb303/if
      cp path/to/thrift-0.14.1/contrib/fb303/if/fb303.thrift /usr/local/share/fb303/if/fb303.thrift
      # or alternatively the following command
      curl -o /usr/local/share/fb303/if/fb303.thrift https://raw.githubusercontent.com/apache/thrift/master/contrib/fb303/if/fb303.thrift


    3. For Linux:

      Code Block
      languagebash
      cd /path/to/thrift-0.14.1
      ./configure --without-erlang --without-nodejs --without-python --without-py3 --without-perl --without-php --without-php_extension --without-ruby --without-haskell --without-go --without-swift --without-dotnetcore --without-qt5
      sudo make
      sudo make install 
      sudo mkdir -p /usr/local/share/fb303/if
      sudo cp /path/to/thrift-0.14.1/contrib/fb303/if/fb303.thrift /usr/local/share/fb303/if/fb303.thrift
  3. Before proceeding, verify that which thrift returns the build of Thrift you just installed (typically /usr/local/bin on Linux); if not, edit your PATH and repeat the verification. Also verify that the command 'thrift -version' returns the expected version number of Thrift.
  4. Now you can run the Maven 'thriftif' profile to generate the Thrift code:
    1. cd /path/to/hive/
    2. mvn clean install -Pthriftif -DskipTests -Dthrift.home=/usr/local
  5. Verify that the code generation was a no-op, which should be the case if you have the correct Thrift version and everyone has been following these instructions. You can use git status or svn status to check this. If you can't figure out what is going wrong, ask for help from a committer.
  6. Now make your changes to hive_metastore.thrift, and then run the compiler again, from /path/to/hive-trunk/<hive_metastore.thrift's module>:
    1. mvn clean install -Pthriftif -DskipTests -Dthrift.home=/usr/local  -Phadoop-2
  7. Now use svn status and svn diff or git status and git diff to verify that the regenerated code corresponds only to the changes you made to hive_metastore.thrift. You may also need svn add or git add if new files were generated (and svn remove or git rm if some files are now obsolete).
  8. cd /path/to/hive-trunk
  9. mvn clean package -DskipTests (at the time of writing, "-Dmaven.javadoc.skip" is also needed)
  10. Verify that Hive is still working correctly with both embedded and remote metastore configurations.
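
The two verification steps above (checking the Thrift version and checking that code generation was a no-op) can be scripted. The following is a minimal sketch under stated assumptions: that thrift -version prints a line of the form "Thrift version 0.14.1", and that the working tree is a git checkout. The function names thrift_version_ok and codegen_is_noop are hypothetical.

```shell
# Assumption: `thrift -version` prints a line such as "Thrift version 0.14.1".
EXPECTED_THRIFT_VERSION="0.14.1"

# Succeeds if the last word of the given version line is the expected version.
thrift_version_ok() {
  actual="${1##* }"
  [ "$actual" = "$EXPECTED_THRIFT_VERSION" ]
}

# Succeeds if git reports no modified or untracked files in the current
# checkout, i.e. regenerating the Thrift code changed nothing.
codegen_is_noop() {
  [ -z "$(git status --porcelain)" ]
}

# Only run the version check when thrift is actually on the PATH.
if command -v thrift >/dev/null 2>&1; then
  if thrift_version_ok "$(thrift -version)"; then
    echo "thrift version OK"
  else
    echo "unexpected thrift version: $(thrift -version)" >&2
  fi
fi
```

Run codegen_is_noop from the root of the Hive checkout right after the first run of the thriftif profile, before editing hive_metastore.thrift.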

...

Contributors should join the Hive mailing lists, in particular the dev list (to join discussions of changes) and the user list (to help others).

...