...

Table of Contents
Info
titleMavenization is complete

Hive now uses Maven as its build tool, as opposed to Ant.

Getting the source code

First of all, you need the Hive source code.

...

This is an optional step. Eclipse has a lot of advanced features for Java development, and it makes life much easier for Hive developers as well.

How do I import into eclipse?

Making Changes

Before you start, send a message to the Hive developer mailing list, or file a bug report in JIRA. Describe your proposed changes and check that they fit in with what others are doing and have planned for the project. Be patient, it may take folks a while to understand your requirements.

...

  • All public classes and methods should have informative Javadoc comments.
    • Do not use @author tags.
  • Code should be formatted according to Sun's conventions, with one exception:
    • Indent two (2) spaces per level, not four (4).
    • Line length limit is 100 chars, instead of 80 chars.
  • Contributions should not introduce new Checkstyle violations.
    • Check for new Checkstyle violations by running ant checkstyle, and then inspect the results in the build/checkstyle directory.
    • If you use Eclipse you should install the eclipse-cs Checkstyle plugin. This plugin highlights violations in your code and is also able to automatically correct some types of violations.
  • Contributions should pass existing unit tests.
  • New unit tests should be provided to demonstrate bugs and fixes. JUnit is our test framework:
    • You must implement a class that extends junit.framework.TestCase and whose class name starts with Test.
    • Define methods within your class whose names begin with test, and call JUnit's many assert methods to verify conditions; these methods will be executed when you run mvn test.
    • You can run all the unit tests with the command mvn test, or you can run a specific unit test with the command mvn test -Dtest=<class name without package prefix> (for example mvn test -Dtest=TestFileSystem). A minimal example of such a test class follows this list.
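
For instance, here is a sketch of such a test class (the class and method names are made up for illustration and are not part of the Hive codebase); it also follows the Javadoc and two-space-indent conventions listed above:

Code Block

import junit.framework.TestCase;

/**
 * A hypothetical test illustrating the conventions above: informative
 * Javadoc, two-space indents, a class name starting with "Test" that
 * extends junit.framework.TestCase, and test methods whose names begin
 * with "test".
 */
public class TestExample extends TestCase {

  /** Verifies that String.trim() strips surrounding whitespace. */
  public void testTrimRemovesSurroundingWhitespace() {
    assertEquals("abc", "  abc  ".trim());
  }

  /** Verifies that String.trim() leaves inner whitespace intact. */
  public void testTrimPreservesInnerWhitespace() {
    assertEquals("a b", " a b ".trim());
  }
}

Running mvn test -Dtest=TestExample would then execute just these two methods.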

...

Understanding Maven

Hive is a multi-module Maven project. If you are new to Maven, the articles below may be of interest:

...

Additionally, Hive actually has two projects, "core" and "itests". The reason that itests is not connected to the core reactor is that itests requires the packages to be built.
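
In practice this means the core modules must be built and installed into the local Maven repository before itests will compile. A sketch of the resulting build order (the authoritative commands are on the HiveDeveloperFAQ page):

Code Block

> mvn clean install -DskipTests    # build and install the core modules first
> cd itests
> mvn clean install -DskipTests    # itests can now resolve the freshly installed packages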

The actual Maven commands you will need are located on the HiveDeveloperFAQ page.

Hadoop Dependencies

The Hive build downloads a number of different Hadoop versions via Maven in order to compile "shims" which allow for compatibility with these Hadoop versions. However, by default, the rest of Hive is only built and tested against a single Hadoop version (1.2.1 as of this writing, but check pom.xml for the latest).

The Maven build has two profiles: one for Hadoop 1 (0.20 and 1.x) and one for Hadoop 2 (2.x). By default the hadoop-1 profile is used; to use the hadoop-2 profile, just specify -Phadoop-2.
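
For example, to build against Hadoop 2 (a sketch; see the HiveDeveloperFAQ page for the authoritative commands):

Code Block

> mvn clean install -DskipTests -Phadoop-2    # same build, but with the hadoop-2 profile (Hadoop 2.x)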

Trunk builds of Hive require Hadoop version at least 0.20.1; older versions are no longer supported.

Unit Tests

Please make sure that all unit tests succeed before and after applying your patch and that no new javac compiler warnings are introduced by your patch. Also see the information in the previous section about testing with different Hadoop versions if you want to verify compatibility with something other than the default Hadoop version.

When submitting a patch it's highly recommended you execute tests locally which you believe will be impacted, in addition to any new tests. The full test suite can be executed by Hive PreCommit Patch Testing. See the Hive Developer FAQ to see how to execute a specific set of tests. With Maven:

Code Block
> cd hive-trunk
> mvn clean install -DskipTests && cd itests && mvn clean install -DskipTests
> mvn test -Dtest=SomeTest

After a while, if you see

Code Block
[INFO] BUILD SUCCESS

all is ok, but if you see

Code Block
[INFO] BUILD FAILURE

then you should fix things before proceeding.

Unit tests take a long time (several hours) to run sequentially even on a very fast machine; for information on how to run them in parallel, see Hive PreCommit Patch Testing.

Add a Unit Test

There are two kinds of unit tests in Hive:

  • Normal unit test: These are used for testing a particular component of Hive.
    • We just need to add a new class (name must start with "Test") in the */src/test directory.
    • We can run "mvn test -Dtest=TestAbc" where TestAbc is the name of the new class. This will run only the new test case, which is faster than "mvn test", which runs all test cases.
  • A new query: If the new feature can be tested using the Hive command line, we just need to add a new *.q file and a new *.q.out file:
    • If the feature is added in ql
      • Add a new XXXXXX.q file in ql/src/test/queries/clientpositive
      • Run "ant mvn test -DtestcaseDcase=TestCliDriver -Dqfile=XXXXXX.q -Doverwrite=true -Dtest.output.silentoverwrite=falsetrue". This will generate a new XXXXXX.q.out file in ql/src/test/results/clientpositive.
        • If you want to run multiple .q files in the test run, you can specify comma separated .q files, for example- -Dqfile="X1.q,X2.q" . You can also specify a java regex, for example -Dqfile_regex='join.*'. (Note that it takes java regex, ie 'join.' and not 'join'). The regex match first removes the .q from the file name before matching regex, so specifying "join*.q" will not work.
      • If you are using hive-0.11.0 or later, you can specify -Dmodule=ql
    • If the feature is added in contrib
      • Do the steps above, replacing "ql" with "contrib", and "TestCliDriver" with "TestContribCliDriver".
      • If you are using hive-0.11.0 or later, you can specify -Dmodule=contrib
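
As a worked example, here is a sketch of the whole flow for a hypothetical query file named mynewfeature.q (the file name and query are made up for illustration):

Code Block

> cd hive-trunk
# 1. Add the new test query.
> echo "SHOW TABLES;" > ql/src/test/queries/clientpositive/mynewfeature.q
# 2. Run it once with -Dtest.output.overwrite=true to generate the expected output.
> mvn test -Dtest=TestCliDriver -Dqfile=mynewfeature.q -Dtest.output.overwrite=true
# 3. Review ql/src/test/results/clientpositive/mynewfeature.q.out and include both files in your patch.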

...