Welcome contributors! We strive to include everyone's contributions. This page provides necessary guidelines on how to contribute effectively towards furthering the development and evolution of Sqoop.
Note: This guide applies to general contributors. If you are a committer, please read the Guide for Committers as well.
There are many ways you can contribute towards the project. A few of these are:
Jump in on discussions: Someone may start a thread on the mailing list describing a problem that you have dealt with in the past. You can help the project by chiming in on that thread and guiding that user to overcome or work around the problem or limitation.
File Bugs: If you notice a problem and are sure it is a bug, go ahead and file a JIRA. If, however, you are not sure that it is a bug, first confirm it by discussing it on the mailing lists.
Review Code: If you see that a JIRA ticket has a 'Patch Available' status, go ahead and review it. It cannot be stressed enough that you must be kind in your review and explain the rationale for your feedback and suggestions. Also note that not all review feedback is accepted; often it is a compromise between the contributor and the reviewer. If you are happy with the change and do not spot any major issues, +1 it. More information on this is available in the following sections.
Provide Patches: We encourage you to assign the relevant JIRA issue to yourself and supply a patch for it. The patch you provide can be code, documentation, build changes, or any combination of these. More information on this is available in the following sections.
To set up your development environment, you need a Linux system, such as Ubuntu or CentOS, with administrative privileges and an Internet connection. You also need sufficient disk space for checking out and building the code and for installing the databases and other software that you may need for testing.
Once you have your Linux system ready with sufficient disk space and Internet connection, go ahead and install the following software:
To get the source code, check out the Subversion "trunk" using the following command:
$ svn co https://svn.apache.org/repos/asf/incubator/sqoop/trunk/ sqoop
If you prefer using git, you can clone the Sqoop repository from the Apache Git mirror with the following command:
$ git clone git://git.apache.org/sqoop.git
Once you have the code, you can build it with the following commands:
$ cd sqoop
$ ant jar-all
You can use the clean target to delete previously built files from the workspace and run jar-all again to do a fresh build.
To see a list of all targets available in the build, type the following command:
$ ant -p
If you prefer working in Eclipse, you can generate the necessary project definitions as follows:
$ ant eclipse
Once these definitions are generated, you can import them in Eclipse as an existing project.
Sqoop source code contains many unit tests that exercise its functionality. These tests can be run simply by using the following command:
$ ant test
Create a directory somewhere convenient on your development system. This directory will hold all the JDBC drivers that the tests will use. Once it is created, create (or edit) the build.properties file in the Sqoop workspace root directory and set the full path of this directory as the value of the property sqoop.thirdparty.lib.dir. For example:
sqoop.thirdparty.lib.dir=/opt/ws/3rd-party-lib
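The property can also be set from the command line. The following is a minimal sketch, not part of Sqoop: set_prop is a hypothetical helper name, and the sketch assumes a plain KEY=VALUE properties file with one property per line.

```shell
# set_prop FILE KEY VALUE
# Hypothetical helper (not part of Sqoop): sets KEY=VALUE in a properties
# file, replacing an existing "KEY=" line or appending a new one.
set_prop() {
  file=$1
  key=$2
  value=$3
  touch "$file"                 # create the file if it does not exist yet
  tmp=$(mktemp) || return 1
  # Keep every line that does not start with "KEY=".
  awk -v prefix="$key=" 'index($0, prefix) != 1' "$file" > "$tmp"
  # Append the new setting.
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Copy the result back in place, then remove the scratch file.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, running set_prop build.properties sqoop.thirdparty.lib.dir /opt/ws/3rd-party-lib from the workspace root would produce the line shown in the example. Running it again with a different path replaces the earlier value rather than adding a duplicate line.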
Third-party tests are end-to-end integration tests that exercise the basic Sqoop functionality against third-party databases. You should run these tests to rule out regressions when testing any changes to the core system. Before you run these tests, you must set up the following databases:
MySQL, together with its client tools mysqldump and mysqlimport.
Set the MySQL connection URL in the build.properties file as the value of the property sqoop.test.mysql.connectstring.host_url. This property defaults to jdbc:mysql://localhost/, which assumes a local installation on the default port. If, however, your MySQL server is installed on a different host or on a different port, specify it explicitly as follows:
sqoop.test.mysql.connectstring.host_url=jdbc:mysql://<mysqlhost>:<port>/
$ mysql -u root -p
mysql> CREATE DATABASE sqooppasstest;
mysql> CREATE DATABASE sqooptestdb;
mysql> use mysql;
mysql> GRANT ALL PRIVILEGES ON sqooppasstest.* TO 'sqooptest'@'localhost' IDENTIFIED BY '12345';
mysql> GRANT ALL PRIVILEGES ON sqooptestdb.* TO 'yourusername'@'localhost';
mysql> flush privileges;
mysql> \q
Replace localhost with the appropriate client host value and yourusername with your actual user name before issuing these commands.
PostgreSQL, together with its client tool psql.
Set the PostgreSQL connection URL in the build.properties file as the value of the property sqoop.test.postgresql.connectstring.host_url. This property defaults to jdbc:postgresql://localhost/, which assumes a local installation on the default port. If, however, your PostgreSQL server is installed on a different host or on a different port, specify it explicitly as follows:
sqoop.test.postgresql.connectstring.host_url=jdbc:postgresql://<pgsqlhost>:<pgsqlport>/
Edit the pg_hba.conf file and set up the authentication scheme to allow for testing. In a secured environment, it may be easiest to set up full trust-based access by adding the following lines to this file and commenting out any other lines that reference 127.0.0.1 or ::1.
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
In the postgresql.conf file, uncomment the line that starts with listen_addresses and set its value to '*' as follows:
listen_addresses = '*'
$ sudo -u postgres psql -U postgres template1
template1=> CREATE USER sqooptest;
template1=> CREATE DATABASE sqooptest;
template1=> \q
For Oracle, set the connection string in the build.properties file as the value of the property sqoop.test.oracle.connectstring. This property defaults to jdbc:oracle:thin:@//localhost/xe, which assumes a local installation on the default port. If, however, your Oracle server is installed on a different host or on a different port, specify it explicitly as follows:
sqoop.test.oracle.connectstring=jdbc:oracle:thin:@//<oraclehost>:<port>/<sid>
$ sqlplus system/<password>@<sid>
SQL> CREATE USER SQOOPTEST identified by 12345;
SQL> GRANT CONNECT, RESOURCE to SQOOPTEST;
SQL> CREATE USER SQOOPTEST2 identified by ABCDEF;
SQL> GRANT CONNECT, RESOURCE to SQOOPTEST2;
SQL> exit
If you encounter the error ORA-12516, TNS:listener could not find available handler with matching protocol stack, you are likely running into a connection exhaustion problem. To circumvent this, log in to the Oracle server as SYSTEM, run the command below, and restart your server.
$ sqlplus system/<password>@<sid>
SQL> ALTER SYSTEM SET processes=200 scope=spfile;
SQL> exit
Once you have installed and configured the databases above (MySQL, PostgreSQL, and Oracle), you are ready to run the third-party tests. To run them, issue the following command:
$ ant test -Dthirdparty=true
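Before starting a long test run, it can help to confirm that the client tools mentioned in the setup steps are on your PATH. The following is a minimal sketch: missing_tools is a hypothetical helper name, and the list of tools is taken from the setup steps in this guide rather than from the test suite itself.

```shell
# missing_tools NAME...
# Hypothetical helper: prints each NAME that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Client tools named in the setup steps of this guide.
missing=$(missing_tools mysqldump mysqlimport psql sqlplus)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Not found on PATH:" $missing
else
  echo "All client tools found."
fi
```

The check only looks for the executables. It does not verify that the database servers are running or that the test users and databases exist.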
Certain third-party tests are categorized as manual tests because they were introduced at a later stage, and adding them to the third-party test suite would have required every test environment to install additional databases.
For SQL Server, set the connection URL in the build.properties file as the value of the property sqoop.test.sqlserver.connectstring.host_url. This property defaults to jdbc:sqlserver://sqlserverhost:1433, which assumes an installation on a host called sqlserverhost listening on port 1433. If, however, your SQL Server instance is installed on a different host or on a different port, specify it explicitly as follows:
sqoop.test.sqlserver.connectstring.host_url=jdbc:sqlserver://<sqlserverhost>:<port>
Create a database named SQOOPTEST.
Create a login named SQOOPUSER with the password PASSWORD.
Grant all access on SQOOPTEST to the login SQOOPUSER.
For DB2, set the connection URL in the build.properties file as the value of the property sqoop.test.db2.connectstring.host_url. This property defaults to jdbc:db2://db2host:50000, which assumes an installation on a host called db2host listening on port 50000. If, however, your DB2 server is installed on a different host or on a different port, specify it explicitly as follows:
sqoop.test.db2.connectstring.host_url=jdbc:db2://<db2host>:<port>
Create a database named SQOOP.
Create a login named SQOOP with the password PASSWORD.
Grant all access on SQOOP to the login SQOOP.
Once you have installed and configured the databases above (SQL Server and DB2), you are ready to run the manual tests. To run them, issue the following command:
$ ant test -Dmanual=true
To build Sqoop documentation, run the following command from the workspace root directory:
$ ant docs
This generates the documentation in the build/docs directory. To view it, open the file build/docs/index.html in a web browser, where you will find links to the user and developer guides. The generated man pages are available directly under the build/docs directory with names of the form <name>.1.gz. You can view these man pages without installing them by using the following command:
$ man -l sqoop.1.gz
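To see which man pages the build produced, you can list the files that match the <name>.1.gz pattern. The following is a minimal sketch: list_man_pages is a hypothetical helper name, and the sketch assumes it is run from the workspace root after building the documentation.

```shell
# list_man_pages DIR
# Hypothetical helper: prints the man pages (*.1.gz) found directly under DIR.
list_man_pages() {
  for page in "$1"/*.1.gz; do
    # If nothing matches, the pattern stays unexpanded and fails this check.
    [ -e "$page" ] && printf '%s\n' "$page"
  done
  return 0
}

list_man_pages build/docs
```

Any path it prints can be passed to man -l as shown above.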
To build the tar-ball for distribution, use the following command:
$ ant tar
This produces a tar-ball distribution file named sqoop-<version>.tar.gz under the build directory.
Sqoop uses the Apache Review Board for code reviews. For a change to be reviewed, it should be either posted on the review board or attached to the JIRA. If the change is minor, affecting only a few lines and not appearing to impact the main logic of the affected sources, it need not be posted on the review board. However, if the code change is large or otherwise impacts the core logic of the affected sources, it should be posted on the review board. Feel free to comment on the JIRA requesting that the assignee post the patch for review on the review board.
Note: Not all patches attached to a JIRA are ready for review. Sometimes patches are attached just to solicit early feedback on the implementation direction. Feel free to look them over and give your feedback in the JIRA as necessary. A patch is considered ready for review either when it has been posted on the review board or when the JIRA status has been changed to 'Patch Available'.
The net outcome from the review should be the same - which is to ensure the following:
Following are some guidelines on how to do a code review. You may use any other approach instead as long as the above stated goals are met. That said, here is an approach that works fine generally:
Once you have collected your comments, concerns, and feedback, you need to send them back to the contributor. In doing so, please be as courteous as possible and ensure the following:
Once you have provided your feedback, wait for the developer to respond. It is possible that the developer may need further clarification on your feedback, in which case you should promptly provide it where necessary. In general, the dialog between the reviewer and developer should lead to finding a reasonable middle ground where key concerns are satisfied and the goals of the review have been met.
If a change has met all your criteria for review, please +1 the change to indicate that you are happy with it.
To provide patches, please follow these guidelines:
$ svn diff > /path/to/patch-file.patch
$ git diff --no-prefix > /path/to/patch-file.patch
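Before attaching a patch, it can be worth checking that the file is not empty and that it touches only the files you intended. The following is a minimal sketch: patch_files is a hypothetical helper name, and the sketch assumes a unified diff as produced by either command above.

```shell
# patch_files PATCH
# Hypothetical helper: prints the paths a unified diff touches, taken from
# its "+++ " header lines. Fails if the patch file is missing or empty.
patch_files() {
  [ -s "$1" ] || { echo "patch file is missing or empty: $1" >&2; return 1; }
  # Split on tabs so a trailing annotation after the path is dropped, then
  # strip the 4-character "+++ " prefix.
  awk -F'\t' '/^\+\+\+ / { print substr($1, 5) }' "$1"
}
```

For example, patch_files /path/to/patch-file.patch prints one path per changed file. This is a rough check only: an added source line whose own text begins with "++ " would also be reported as a path.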