Welcome contributors! We strive to include everyone's contributions. This page provides necessary guidelines on how to contribute effectively towards furthering the development and evolution of Sqoop.
Note: This guide applies to general contributors. If you are a committer, please read the Guide for Committers as well.
What can be contributed?
There are many ways you can contribute towards the project. A few of these are:
Jump in on discussions: It is possible that someone initiates a thread on the mailing list describing a problem that you have dealt with in the past. You can help the project by chiming in on that thread and guiding that user to overcome or work around that problem or limitation.
File Bugs: If you notice a problem and are sure it is a bug, then go ahead and file a JIRA. If, however, you are not sure that it is a bug, you should first confirm it by discussing it on the Mailing Lists.
Review Code: If you see that a JIRA ticket has a 'Patch Available' status, go ahead and review it. It cannot be stressed enough that you must be kind in your review and explain the rationale for your feedback and suggestions. Also note that not all review feedback is accepted; often it is a compromise between the contributor and the reviewer. If you are happy with the change and do not spot any major issues, +1 it. More information on this is available in the following sections.
Provide Patches: We encourage you to assign the relevant JIRA issue to yourself and supply a patch for it. The patch you provide can be code, documentation, build changes, or any combination of these. More information on this is available in the following sections.
Setting up your development environment
To set up your development environment, you need a Linux system, such as Ubuntu or CentOS, with administrative privileges and an Internet connection. You also need sufficient disk space for checking out and building the code, and for installing the databases and other software that you may need for testing.
Getting ready to build
Once you have your Linux system ready with sufficient disk space and an Internet connection, go ahead and install the following software:
- Subversion client and/or Git
- A recent update of JDK 1.6
- Asciidoc version 8.6 or above
- Apache Ant 1.7 or above
- Findbugs version 1.3.9 or above
- Latest Eclipse IDE (or your IDE/Editor of choice)
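Before moving on, it can help to confirm that the tools listed above are actually reachable from your shell. The following is a minimal sketch, not part of the Sqoop build: check_tools is a hypothetical helper, and the command names in the usage comment are assumptions that may differ on your distribution. A tool installed outside your PATH will be reported as missing.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given commands are not on PATH.
# Prints one missing command per line and returns non-zero if any are missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (command names are assumptions; use svn instead of git if you prefer):
#   check_tools git java ant asciidoc || echo "Install the tools printed above." >&2
```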
Building the Sources
To get the source code, check out the Subversion "trunk" using the following command:
$ svn co https://svn.apache.org/repos/asf/incubator/sqoop/trunk/ sqoop
If you prefer using Git, you can clone the Sqoop repository from the Apache Git mirror with the following command:
$ git clone git://git.apache.org/sqoop.git
Once you have the code, you can build it with the following commands:
$ cd sqoop
$ ant jar-all
You can use the clean target to delete previously built files from the workspace and run jar-all again to do a fresh build.
To see a list of all targets that are available in the build, type the following command:
$ ant -p
If you prefer working in Eclipse, you can generate the necessary project definitions as follows:
$ ant eclipse
Once these definitions are generated, you can import them in Eclipse as an existing project.
Running Tests
Running unit tests
Sqoop source code contains many unit tests that exercise its functionality. These tests can be run simply by using the following command:
$ ant test
Setting up and running third-party tests
Third-party tests are end-to-end integration tests that exercise the basic Sqoop functionality against third-party databases. You should run these tests in order to rule out regressions when testing any changes to the core system. Before you run these tests, you must complete the following setup steps:
Create third-party lib directory
Create a directory somewhere convenient on your development system. This directory will hold all the JDBC drivers that the tests will use. Once created, create (or edit) the build.properties file in the Sqoop workspace root directory and set the full path of this directory as the value of the property sqoop.thirdparty.lib.dir. For example:
sqoop.thirdparty.lib.dir=/opt/ws/3rd-party-lib
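The two steps above (creating the directory and recording it in build.properties) could be scripted roughly as follows. This is a sketch under stated assumptions: setup_thirdparty_lib_dir is a hypothetical helper that does not exist in the Sqoop sources, and it assumes that build.properties is a plain key=value file that is safe to rewrite.

```shell
#!/bin/sh
# Hypothetical helper: create the JDBC driver directory and record its full
# path in build.properties.
#   $1: directory that will hold the JDBC drivers
#   $2: Sqoop workspace root (where build.properties lives)
setup_thirdparty_lib_dir() {
    mkdir -p "$1" || return 1
    # Resolve to a full path, since the property expects one.
    lib_dir=$(cd "$1" && pwd) || return 1
    props=$2/build.properties
    touch "$props" || return 1
    # Drop any earlier setting so the property appears only once.
    sed '/^sqoop\.thirdparty\.lib\.dir=/d' "$props" > "$props.tmp" || return 1
    echo "sqoop.thirdparty.lib.dir=$lib_dir" >> "$props.tmp"
    mv "$props.tmp" "$props"
}

# Example, run from the workspace root:
#   setup_thirdparty_lib_dir /opt/ws/3rd-party-lib .
```

Running the helper a second time replaces the existing entry rather than appending a duplicate.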
Setting up MySQL
- Install MySQL version 5.1.x with necessary client tools. You can install the server on a different host than your development host if necessary. However, you must have the client tools available on your development host, including the JDBC driver and batch utilities such as mysqldump and mysqlimport.
- Place the JDBC driver in the third-party lib directory that you created earlier.
- The location of MySQL server is specified in the build.properties file by the value for the property sqoop.test.mysql.connectstring.host_url. This property defaults to jdbc:mysql://localhost/ which assumes local installation and default port setup. If however your MySQL server is installed on a different host or on a different port you should specify it explicitly as follows:
sqoop.test.mysql.connectstring.host_url=jdbc:mysql://<mysqlhost>:<port>/
- In order to run the MySQL third-party tests, you would need to configure the database as follows:
$ mysql -u root -p
mysql> CREATE DATABASE sqooppasstest;
mysql> CREATE DATABASE sqooptestdb;
mysql> use mysql;
mysql> GRANT ALL PRIVILEGES ON sqooppasstest.* TO 'sqooptest'@'localhost' IDENTIFIED BY '12345';
mysql> GRANT ALL PRIVILEGES ON sqooptestdb.* TO 'yourusername'@'localhost';
mysql> flush privileges;
mysql> \q
- Note: If the installation of MySQL server is on a different host, you must replace localhost with the appropriate client host value. You should also replace yourusername with your actual user name before issuing the commands.
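Because the user name and client host have to be substituted by hand in the MySQL statements above, one option is to generate the statements from a small shell function. This is only a sketch: mysql_test_setup_sql is a hypothetical helper, the database names, the sqooptest account and its password are taken from the steps above, and the GRANT ... IDENTIFIED BY form assumes the MySQL 5.1.x server this guide targets.

```shell
#!/bin/sh
# Hypothetical helper: print the MySQL setup statements for the third-party tests.
#   $1: user name that will run the tests
#   $2: client host as seen by the MySQL server (defaults to localhost)
mysql_test_setup_sql() {
    user=$1
    host=${2:-localhost}
    cat <<EOF
CREATE DATABASE sqooppasstest;
CREATE DATABASE sqooptestdb;
GRANT ALL PRIVILEGES ON sqooppasstest.* TO 'sqooptest'@'$host' IDENTIFIED BY '12345';
GRANT ALL PRIVILEGES ON sqooptestdb.* TO '$user'@'$host';
FLUSH PRIVILEGES;
EOF
}

# Example: review the output first, then pipe it to the client.
#   mysql_test_setup_sql "$(id -un)"
#   mysql_test_setup_sql "$(id -un)" | mysql -u root -p
```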
Setting up PostgreSQL
- Install PostgreSQL 8.3.9 or later along with client tools. You can install the server on a different host than your development host if necessary. However, you must have the client tools available on your development host, including the JDBC driver and the command line utility psql.
- Place the JDBC driver in the third-party lib directory that you created earlier.
- The location of PostgreSQL server is specified in the build.properties file by the value for the property sqoop.test.postgresql.connectstring.host_url. This property defaults to jdbc:postgresql://localhost/ which assumes local installation and default port setup. If however your PostgreSQL server is installed on a different host or on a different port you should specify it explicitly as follows:
sqoop.test.postgresql.connectstring.host_url=jdbc:postgresql://<pgsqlhost>:<pgsqlport>/
- In order to run PostgreSQL third-party tests, you would need to configure the database as follows:
- Edit the pg_hba.conf file and set up the authentication scheme to allow for testing. In a secured environment, it may be easy to set up full trust-based access by adding the following lines to this file and commenting out any other lines referencing 127.0.0.1 or ::1.
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
- Also, in the postgresql.conf file, uncomment the line that starts with listen_addresses and set its value to '*' as follows:
listen_addresses = '*'
- Restart your PostgreSQL server after modifying the configuration files above.
- Create the necessary user and database for Sqoop testing as follows:
$ sudo -u postgres psql -U postgres template1
template1=> CREATE USER sqooptest;
template1=> CREATE DATABASE sqooptest;
template1=> \q
$
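The postgresql.conf change in the steps above can be sketched with sed. This is a minimal sketch, assuming the parameter is spelled listen_addresses (the name PostgreSQL itself uses) and that it appears on a single, possibly commented-out, line. The helper name enable_listen_all is hypothetical. The function only prints the modified configuration, so the original file is left untouched until you choose to replace it.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of postgresql.conf in which the
# listen_addresses line is uncommented and set to '*'.
#   $1: path to postgresql.conf (not modified)
# Any trailing comment on the matched line is dropped.
enable_listen_all() {
    sed "s/^[#[:space:]]*listen_addresses[[:space:]]*=.*/listen_addresses = '*'/" "$1"
}

# Example (the path is an assumption; it varies by distribution):
#   enable_listen_all /etc/postgresql/postgresql.conf > /tmp/postgresql.conf.new
#   diff /etc/postgresql/postgresql.conf /tmp/postgresql.conf.new
```

After reviewing the difference, copy the new file into place and restart the server as described above.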
Setting up Oracle
- Install Oracle 10.2.x or later and download the corresponding JDBC driver.
- Place the JDBC driver in the third-party lib directory that you created earlier.
- The location of Oracle server is specified in the build.properties file by the value for the property sqoop.test.oracle.connectstring. This property defaults to jdbc:oracle:thin:@//localhost/xe which assumes local installation and default port setup. If however your Oracle server is installed on a different host or on a different port you should specify it explicitly as follows:
sqoop.test.oracle.connectstring=jdbc:oracle:thin:@//<oraclehost>:<port>/<sid>
- In order to run Oracle third-party tests, you would need to configure the database as follows:
$ sqlplus system/<password>@<sid>
SQL> CREATE USER SQOOPTEST identified by 12345;
SQL> GRANT CONNECT, RESOURCE to SQOOPTEST;
SQL> CREATE USER SQOOPTEST2 identified by ABCDEF;
SQL> GRANT CONNECT, RESOURCE to SQOOPTEST2;
SQL> exit
$
- Note: If you are using Oracle XE and see an error like ORA-12516, TNS:listener could not find available handler with matching protocol stack, you are likely running into a connection exhaustion problem. To circumvent this, log into the Oracle server as SYSTEM, run the commands below and restart your server.
$ sqlplus system/<password>@<sid>
SQL> ALTER SYSTEM SET processes=200 scope=spfile;
SQL> exit
$
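If your test databases run on other hosts, the three connection properties described in the MySQL, PostgreSQL and Oracle sections can be produced from host, port and SID values. The functions below are hypothetical helpers, shown only as a sketch. The property names and URL shapes come from this guide, while the host names in the usage comment are placeholders. The ports shown there are the usual defaults for each database.

```shell
#!/bin/sh
# Hypothetical helpers: print build.properties entries for remote test databases.
#   $1: host, $2: port, and for Oracle $3: SID

mysql_host_url() {
    echo "sqoop.test.mysql.connectstring.host_url=jdbc:mysql://$1:$2/"
}

postgresql_host_url() {
    echo "sqoop.test.postgresql.connectstring.host_url=jdbc:postgresql://$1:$2/"
}

oracle_connectstring() {
    echo "sqoop.test.oracle.connectstring=jdbc:oracle:thin:@//$1:$2/$3"
}

# Example (placeholder hosts), appended to build.properties in the workspace root:
#   mysql_host_url mysql.example.com 3306 >> build.properties
#   postgresql_host_url pg.example.com 5432 >> build.properties
#   oracle_connectstring ora.example.com 1521 xe >> build.properties
```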