Here are instructions for setting up a development environment for Hadoop under the Eclipse IDE. Please feel free to make additions or modifications to this page.
This document assumes you already have Eclipse downloaded, installed, and configured to your liking. It also assumes that you are aware of the HowToContribute page and have given that a read.
We will begin by downloading the Hadoop source. The hadoop-common source tree has three subprojects underneath it that you will see after you pull down the source code: hadoop-common, hdfs, and mapreduce.
Get the latest source from Git. (Note that a copy is also mirrored on GitHub, but it lags slightly behind the Apache read-only Git repository.)
git clone git://git.apache.org/hadoop-common.git
This will create a hadoop-common folder in your current directory. If you "cd" into that folder, you will see all the available subprojects. Now we will build the code to get it ready for importing into Eclipse.
From the directory you just "cd"-ed into (also known as the top-level directory of a branch or trunk checkout), run:
$ mvn install -DskipTests
$ mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true
Note: This may take a while the first time, as all libraries are fetched from the internet, and the whole build is performed.
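The checkout and build steps above can be chained in a small wrapper script. The following is only a sketch: the DRY_RUN variable and the run_step and setup_hadoop_eclipse functions are hypothetical helpers introduced here for illustration and are not part of Hadoop, Git, or Maven. By default it only prints the commands it would run; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, run_step and setup_hadoop_eclipse are hypothetical
# helpers for this page, not something shipped with Hadoop or Maven.

# Default to printing the commands; set DRY_RUN=0 to really run them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command and, unless in dry-run mode, execute it.
run_step() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@" || { echo "step failed: $*" >&2; return 1; }
    fi
}

# Run the steps from this page in order, stopping at the first failure.
setup_hadoop_eclipse() {
    run_step git clone git://git.apache.org/hadoop-common.git &&
    run_step cd hadoop-common &&
    run_step mvn install -DskipTests &&
    run_step mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true
}

setup_hadoop_eclipse
```

Chaining with && means a failed clone or a failed build stops the script before the Eclipse project files are generated, so a half-built tree is less likely to be imported by accident.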
After the above, do the following to finally have projects in Eclipse ready and waiting for you to go on that scratch-itching development spree:
Note: in the case of MapReduce, the testjar package is broken. This is expected, since it is part of a test case that checks for incorrect packaging, and it is nothing to worry about.
To run tests from Eclipse you need to additionally do the following:
build directory of the current project