This page has been updated for Whirr 0.8.1, in particular the multi-node instructions under "Launch a cluster". Please also follow the current Quick Start Guide linked from whirr.apache.org.
Getting Started with Whirr
See also http://incubator.apache.org/whirr/quick-start-guide.html
Whirr CLI
Pre-requisites
You need to install Java 6 on your machine. Also, you need to have an account with a cloud provider, such as Amazon EC2.
Install Whirr
Download or build Whirr. Call the directory which contains the Whirr JAR files WHIRR_HOME (you might like to define this environment variable).
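One way to define the variable is sketched below. The path $HOME/whirr is only a placeholder (an assumption, not a location Whirr requires), and the JAR count is just a quick sanity check that the directory is the right one.

```shell
# Hypothetical install location; replace with the directory you unpacked or built Whirr into.
WHIRR_HOME="${WHIRR_HOME:-$HOME/whirr}"
export WHIRR_HOME

# Count the JAR files visible in that directory (sanity check only).
whirr_jar_count=0
for jar in "$WHIRR_HOME"/*.jar; do
  # The pattern stays unexpanded when nothing matches, so test for existence.
  if [ -e "$jar" ]; then
    whirr_jar_count=$((whirr_jar_count + 1))
  fi
done
echo "WHIRR_HOME=$WHIRR_HOME ($whirr_jar_count JAR files found)"
```

Putting the first two lines in your shell profile makes the variable available in every new session.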
You can test that Whirr is working by running:
% java -jar $WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar
(The above JAR no longer includes a main reference in its manifest, so this command is kept for informational purposes only. The preferred way to start Whirr is the script in bin/. As noted above, follow the new instructions instead.)
It is handy to create an alias for whirr, and for one including cloud credentials:
% alias whirr='java -jar $WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar'
% alias whirr-ec2='whirr --identity=$AWS_ACCESS_KEY_ID --credential=$AWS_SECRET_ACCESS_KEY'
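The whirr-ec2 alias expands the two AWS variables when it runs, so an unset variable silently produces an empty credential. A small guard along these lines can catch that early. The function name check_aws_credentials is hypothetical and not part of Whirr.

```shell
# Hypothetical helper: succeeds only if both variables used by the whirr-ec2 alias are non-empty.
check_aws_credentials() {
  missing=0
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ]; then
    echo "AWS_ACCESS_KEY_ID is not set" >&2
    missing=1
  fi
  if [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "AWS_SECRET_ACCESS_KEY is not set" >&2
    missing=1
  fi
  return "$missing"
}
```

Run check_aws_credentials before using the alias. It prints which variable is missing and returns a non-zero status if either one is empty.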
Launch a cluster
The following will launch a Hadoop cluster with a single machine for the namenode and jobtracker, and a further machine for a datanode and tasktracker.
% whirr-ec2 launch-cluster --service-name=hadoop --cluster-name=tomhadoopcluster \
    --instance-templates='1 nn+jt 1 dn+tt'
Once the cluster has launched you can browse it by connecting to http://master-host:50030, where master-host is the public hostname of the machine running the namenode and jobtracker.
The following will launch a multi-node Hadoop cluster on AWS EC2. You may want to review or use the attached hbase.properties file:
jongwook@localhost:~/whirr$ bin/whirr launch-cluster --config ./recipes/hbase.properties --private-key-file ~/.ssh/id_rsa_whirr
Log in to the remote master node
Once the launch succeeds, Whirr prints the SSH commands for all nodes:
You can log into instances using the following ssh commands:
[hadoop-datanode+hadoop-tasktracker+hbase-regionserver]: ssh -i /home/jongwook/.ssh/id_rsa_whirr -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no jongwook@60.xxx.xx.xxx
[hadoop-datanode+hadoop-tasktracker+hbase-regionserver]: ssh -i /home/jongwook/.ssh/id_rsa_whirr -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no jongwook@54.xxx.xx.xxx
[zookeeper+hadoop-namenode+hadoop-jobtracker+hbase-master]: ssh -i /home/jongwook/.ssh/id_rsa_whirr -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no jongwook@50.xxx.xx.xxx
Log in to the master node (the last one listed) to run Hadoop code with HBase data. From there you can run Hadoop jobs that are integrated with HBase. The user name is your local login, e.g. jongwook:
ssh -i /home/jongwook/.ssh/id_rsa_whirr -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no jongwook@50.xxx.xx.xxx
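If you saved the launch output to a file, the master's SSH command can be picked out rather than copied by hand. The sketch below assumes each role line has the form "[roles]: ssh ..." as shown above and that the master is the line whose role list contains hbase-master. The function name master_ssh_command is hypothetical.

```shell
# Hypothetical helper: read Whirr launch output on stdin and print the ssh
# command from the first line whose role list contains "hbase-master".
master_ssh_command() {
  sed -n 's/^ *\[[^]]*hbase-master[^]]*]: *\(ssh .*\)$/\1/p' | head -n 1
}
```

For example, if the launch output was captured with tee into launch.log, then master_ssh_command < launch.log prints the command to run.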
Set up PATH and CLASSPATH to run HBase and Hadoop code. Check which Hadoop and HBase versions are installed under /usr/local and adjust the paths below to match.
export HADOOP_HOME=/usr/local/hadoop-0.20.2
export HBASE_HOME=/usr/local/hbase-0.90.0
export ZOOKEEPER_HOME=/usr/local/zookeeper-3.3.3
export PATH=$HADOOP_HOME/bin:$HBASE_HOME/bin:$PATH
# CLASSPATH for HADOOP
export CLASSPATH=$HADOOP_HOME/hadoop-0.20.2-core.jar:$HADOOP_HOME/hadoop-0.20.2-ant.jar:$CLASSPATH
export CLASSPATH=$HADOOP_HOME/hadoop-0.20.2-examples.jar:$HADOOP_HOME/hadoop-0.20.2-test.jar:$CLASSPATH
export CLASSPATH=$HADOOP_HOME/hadoop-0.20.2-tools.jar:$CLASSPATH
#export CLASSPATH=$HADOOP_HOME/commons-logging-1.0.4.jar:$HADOOP_HOME/commons-logging-api-1.0.4.jar:$CLASSPATH
# CLASSPATH for HBASE
export CLASSPATH=$HBASE_HOME/hbase-0.90.0.jar:$HBASE_HOME/lib/zookeeper-3.3.2.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/commons-logging-1.1.1.jar:$HBASE_HOME/lib/avro-1.3.3.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/log4j-1.2.16.jar:$HBASE_HOME/lib/commons-cli-1.2.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/jackson-core-asl-1.5.5.jar:$HBASE_HOME/lib/jackson-mapper-asl-1.4.2.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/commons-httpclient-3.1.jar:$HBASE_HOME/lib/jetty-6.1.26.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/jetty-util-6.1.26.jar:$HBASE_HOME/lib/hadoop-core.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/hbase-0.90.0.jar:$HBASE_HOME/lib/hsqldb-1.8.0.10.jar:$CLASSPATH
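Listing each JAR by name breaks as soon as the installed versions differ from the ones above. As an alternative sketch, a loop can add whatever JARs are present. Note the assumptions: append_jars is a hypothetical helper, it appends rather than prepends (so the ordering differs from the explicit list), and it pulls in every JAR in a directory, which may be more than your job needs.

```shell
# Hypothetical helper: append every *.jar in the given directory to CLASSPATH.
append_jars() {
  for jar in "$1"/*.jar; do
    # Skip the unexpanded pattern when the directory has no JAR files.
    if [ -e "$jar" ]; then
      CLASSPATH="${CLASSPATH:+$CLASSPATH:}$jar"
    fi
  done
  export CLASSPATH
}

# Same directories as the explicit list; adjust to the versions under /usr/local.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop-0.20.2}"
HBASE_HOME="${HBASE_HOME:-/usr/local/hbase-0.90.0}"
append_jars "$HADOOP_HOME"
append_jars "$HBASE_HOME"
append_jars "$HBASE_HOME/lib"
```

Directories that do not exist or contain no JAR files are skipped without error, so the snippet is safe to run before checking what is installed.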
First, run the Hadoop pi demo on the remote node to make sure that Hadoop works:
[jongwook@ip-10-xx-xx-xx ~]# cd /usr/local/hadoop-0.20.2/
[jongwook@ip-10-xx-xx-xx hadoop-0.20.2]# bin/hadoop jar hadoop-0.20.2-examples.jar pi 20 1000
Second, run the HBase demo to make sure that HBase works:
jongwook@ip-10-xx-xx-xx:/usr/local$ cd hbase-0.90.0/
jongwook@ip-10-xx-xx-xx:/usr/local/hbase-0.90.0$ ls
bin          conf  hbase-0.90.0.jar        hbase-webapps  LICENSE.txt  pom.xml     src
CHANGES.txt  docs  hbase-0.90.0-tests.jar  lib            NOTICE.txt   README.txt
jongwook@ip-10-xx-xx-xx:/usr/local/hbase-0.90.0$ bin/hbase shell
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version: 0.90.0, r1056514, Fri Jan 7 21:22:53 UTC 2011
hbase(main):001:0> status 'simple'
2 live servers
    domU-12-xx-xx-xx-xx-xx.compute-1.internal:60020 1358812084397
        requests=0, regions=2, usedHeap=36, maxHeap=1974
    domU-12-xx-xx-xx-xx-xx.compute-1.internal:60020 1358812084972
        requests=0, regions=0, usedHeap=57, maxHeap=1974
0 dead servers
Aggregate load: 0, regions: 2
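To check the result from a script instead of reading it by eye, the live server count can be extracted from the status output. This is a sketch that assumes the "N live servers" line appears on its own line, as in HBase 0.90.0. The function name live_servers is hypothetical.

```shell
# Hypothetical helper: read HBase shell output on stdin and print the number
# from the first "N live servers" line.
live_servers() {
  sed -n 's/^ *\([0-9][0-9]*\) live servers.*$/\1/p' | head -n 1
}
```

The HBase shell can usually be fed commands on standard input, so something like echo "status 'simple'" | bin/hbase shell | live_servers should print 2 for the cluster above. Treat that invocation as an assumption and verify it on your version.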
Configuration
Whirr is configured using a properties file, and optionally using command line arguments when using the CLI. Command line arguments take precedence over properties specified in a properties file.
See the Configuration Guide for more on configuration.
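As an illustration, the single-node-pair launch options used earlier could be written to a properties file as follows. The file name hadoop.properties is hypothetical, and the whirr.* keys and the ${env:...} syntax are assumed to match your Whirr version, so check them against the Configuration Guide before relying on them.

```shell
# Hypothetical file name; any path passed to --config should work.
config_file=hadoop.properties

# Each key is assumed to be the command-line option name prefixed with "whirr.".
# The quoted 'EOF' keeps the shell from expanding ${env:...}, which is meant
# to be resolved by Whirr when it reads the file.
cat > "$config_file" <<'EOF'
whirr.service-name=hadoop
whirr.cluster-name=tomhadoopcluster
whirr.instance-templates=1 nn+jt 1 dn+tt
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
EOF
```

With this file, bin/whirr launch-cluster --config hadoop.properties --cluster-name=othercluster would use othercluster, because the command line argument takes precedence over the value in the file.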
Destroy a cluster
When you've finished using a cluster, you can terminate the instances and clean up resources with:
% whirr-ec2 destroy-cluster --service-name hadoop --cluster-name tomhadoopcluster
The following will destroy a multi-node Hadoop cluster on AWS EC2:
jongwook@localhost:~/whirr$ bin/whirr destroy-cluster --config ./recipes/hbase.properties --private-key-file ~/.ssh/id_rsa_whirr
Whirr API
Whirr provides a Java API for stopping and starting clusters. Please see the unit test source code for how to achieve this.
There's also some example code at http://github.com/hammer/whirr-demo.