This page has been updated for Whirr 0.8.1, in particular the multi-node instructions beginning at "Launch a cluster". Please also follow the current Quick Start Guide linked from whirr.apache.org.

Getting Started with Whirr

See also http://incubator.apache.org/whirr/quick-start-guide.html

Whirr CLI

Pre-requisites

You need Java 6 installed on your machine, and an account with a cloud provider such as Amazon EC2.
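
Before going further, it helps to confirm the Java version and to export your cloud credentials as environment variables, since the stock recipes typically reference them via ${env:...} lookups. The AWS variable names below are an assumption about your provider:

Code Block

% java -version    # should report a 1.6.x JVM
% export AWS_ACCESS_KEY_ID=<your access key ID>
% export AWS_SECRET_ACCESS_KEY=<your secret access key>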

...

Code Block
% java -jar $WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar

(The above JAR no longer includes a main class in its manifest, so the command is kept for historical reference only. The preferred way to start Whirr is the script in bin/; as noted above, follow the current instructions instead.)
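
With a current release, the equivalent invocation goes through the launcher script instead; printing the version makes a quick smoke test (a minimal sketch, assuming WHIRR_HOME points at your Whirr installation):

Code Block

% $WHIRR_HOME/bin/whirr version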

It is handy to create an alias for whirr, and another that includes your cloud credentials:

...
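
The elided aliases above might look like the sketch below; --identity and --credential are real Whirr CLI options, while the JAR path and the credential variable names are assumptions about your setup:

Code Block

% alias whirr='java -jar $WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar'
% alias whirr-ec2='whirr --identity=$AWS_ACCESS_KEY_ID --credential=$AWS_SECRET_ACCESS_KEY'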

Once the cluster has launched, you can browse it by connecting to the Hadoop JobTracker web interface at http://master-host:50030.

The following will launch a Hadoop cluster with multiple nodes on AWS EC2. You may want to take a look at, or use, the attached hbase.properties file (a sketch of a typical recipe follows the launch command below):

Code Block

jongwook@localhost:~/whirr$ bin/whirr launch-cluster --config ./recipes/hbase.properties  --private-key-file ~/.ssh/id_rsa_whirr
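
For reference, a minimal hbase.properties recipe might look like the following sketch. The whirr.* keys are standard Whirr configuration properties; the concrete values (cluster name, instance counts, key path) are placeholders to adapt:

Code Block

whirr.cluster-name=hbase
whirr.instance-templates=1 zookeeper+hadoop-namenode+hadoop-jobtracker+hbase-master,2 hadoop-datanode+hadoop-tasktracker+hbase-regionserver
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa_whirr
whirr.public-key-file=${whirr.private-key-file}.pub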

Log in to the remote master node

Once the launch succeeds, you will see SSH commands for logging in to all of the nodes:

Code Block

You can log into instances using the following ssh commands:
[hadoop-datanode+hadoop-tasktracker+hbase-regionserver]: ssh -i /home/jongwook/.ssh/id_rsa_whirr -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no jongwook@60.xxx.xx.xxx
[hadoop-datanode+hadoop-tasktracker+hbase-regionserver]: ssh -i /home/jongwook/.ssh/id_rsa_whirr -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no jongwook@54.xxx.xx.xxx
[zookeeper+hadoop-namenode+hadoop-jobtracker+hbase-master]: ssh -i /home/jongwook/.ssh/id_rsa_whirr -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no jongwook@50.xxx.xx.xxx

Log in to the master node (the last entry in the list); from there you can run your Hadoop jobs integrated with HBase. The user name is your local login, e.g. jongwook:

Code Block

ssh -i /home/jongwook/.ssh/id_rsa_whirr -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no jongwook@50.xxx.xx.xxx

Set up PATH and CLASSPATH so that the Hadoop and HBase programs can be run. Make sure you check which Hadoop and HBase versions are installed under /usr/local, and adjust the paths below accordingly.

Code Block

export HADOOP_HOME=/usr/local/hadoop-0.20.2
export HBASE_HOME=/usr/local/hbase-0.90.0
export ZOOKEEPER_HOME=/usr/local/zookeeper-3.3.3
export PATH=$HADOOP_HOME/bin:$HBASE_HOME/bin:$PATH

# CLASSPATH for HADOOP
export CLASSPATH=$HADOOP_HOME/hadoop-0.20.2-core.jar:$HADOOP_HOME/hadoop-0.20.2-ant.jar:$CLASSPATH
export CLASSPATH=$HADOOP_HOME/hadoop-0.20.2-examples.jar:$HADOOP_HOME/hadoop-0.20.2-test.jar:$CLASSPATH
export CLASSPATH=$HADOOP_HOME/hadoop-0.20.2-tools.jar:$CLASSPATH
#export CLASSPATH=$HADOOP_HOME/commons-logging-1.0.4.jar:$HADOOP_HOME/commons-logging-api-1.0.4.jar:$CLASSPATH

# CLASSPATH for HBASE
export CLASSPATH=$HBASE_HOME/hbase-0.90.0.jar:$HBASE_HOME/lib/zookeeper-3.3.2.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/commons-logging-1.1.1.jar:$HBASE_HOME/lib/avro-1.3.3.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/log4j-1.2.16.jar:$HBASE_HOME/lib/commons-cli-1.2.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/jackson-core-asl-1.5.5.jar:$HBASE_HOME/lib/jackson-mapper-asl-1.4.2.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/commons-httpclient-3.1.jar:$HBASE_HOME/lib/jetty-6.1.26.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/jetty-util-6.1.26.jar:$HBASE_HOME/lib/hadoop-core.jar:$CLASSPATH
export CLASSPATH=$HBASE_HOME/lib/hbase-0.90.0.jar:$HBASE_HOME/lib/hsqldb-1.8.0.10.jar:$CLASSPATH
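
As a quick sanity check that the Hadoop classes resolve from this CLASSPATH, you can run Hadoop's VersionInfo class directly (java picks up the CLASSPATH environment variable automatically; this prints the Hadoop version and build details):

Code Block

% java org.apache.hadoop.util.VersionInfo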

First, run the Hadoop pi example on the remote node to make sure that Hadoop works:

Code Block

[jongwook@ip-10-xx-xx-xx ~]$ cd /usr/local/hadoop-0.20.2/
[jongwook@ip-10-xx-xx-xx hadoop-0.20.2]$ bin/hadoop jar hadoop-0.20.2-examples.jar pi 20 1000

Second, run the HBase shell to make sure that HBase works:

Code Block

jongwook@ip-10-xx-xx-xx:/usr/local$ cd hbase-0.90.0/
jongwook@ip-10-xx-xx-xx:/usr/local/hbase-0.90.0$ ls
bin	     conf  hbase-0.90.0.jar	   hbase-webapps  LICENSE.txt  pom.xml	   src
CHANGES.txt  docs  hbase-0.90.0-tests.jar  lib		  NOTICE.txt   README.txt
jongwook@ip-10-xx-xx-xx:/usr/local/hbase-0.90.0$ bin/hbase shell
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version: 0.90.0, r1056514, Fri Jan  7 21:22:53 UTC 2011

hbase(main):001:0> status 'simple'
2 live servers
    domU-12-xx-xx-xx-xx-xx.compute-1.internal:60020 1358812084397
        requests=0, regions=2, usedHeap=36, maxHeap=1974
    domU-12-xx-xx-xx-xx-xx.compute-1.internal:60020 1358812084972
        requests=0, regions=0, usedHeap=57, maxHeap=1974
0 dead servers
Aggregate load: 0, regions: 2
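
To go a step beyond status, you can create a small table from the shell and read it back. These are standard HBase shell commands; the table and column family names are arbitrary examples:

Code Block

hbase(main):002:0> create 'testtable', 'cf'
hbase(main):003:0> put 'testtable', 'row1', 'cf:greeting', 'hello'
hbase(main):004:0> scan 'testtable'
hbase(main):005:0> disable 'testtable'
hbase(main):006:0> drop 'testtable'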

Configuration

Whirr is configured using a properties file and, when using the CLI, optionally with command line arguments. Command line arguments take precedence over properties specified in the properties file.
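
For example, a property set in the recipe can be overridden on the command line; the flag name is the property name with the whirr. prefix dropped, so whirr.cluster-name becomes --cluster-name:

Code Block

% bin/whirr launch-cluster --config ./recipes/hbase.properties --cluster-name my-test-cluster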

...

Code Block
% whirr-ec2 destroy-cluster --service-name hadoop --cluster-name tomhadoopcluster

The following will destroy a Hadoop cluster with multiple nodes on AWS EC2:

Code Block

jongwook@localhost:~/whirr$ bin/whirr destroy-cluster --config ./recipes/hbase.properties  --private-key-file ~/.ssh/id_rsa_whirr

Whirr API

Whirr provides a Java API for stopping and starting clusters. Please see the unit test source code for how to achieve this, or the sketch below.
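
A minimal sketch of launching and destroying a cluster through the API, assuming the Whirr 0.8 classes ClusterSpec and ClusterController and a recipe file on the local path (the unit tests remain the authoritative reference):

Code Block

import org.apache.commons.configuration.PropertiesConfiguration;
import org.apache.whirr.Cluster;
import org.apache.whirr.ClusterController;
import org.apache.whirr.ClusterSpec;

public class LaunchAndDestroy {
  public static void main(String[] args) throws Exception {
    // Build a cluster specification from the same recipe the CLI uses.
    ClusterSpec spec = new ClusterSpec(new PropertiesConfiguration("recipes/hbase.properties"));
    ClusterController controller = new ClusterController();

    // Launch the cluster; the returned Cluster describes the running instances.
    Cluster cluster = controller.launchCluster(spec);
    System.out.println("Launched: " + cluster.getInstances());

    // ... run jobs against the cluster here ...

    // Tear all of the instances down again.
    controller.destroyCluster(spec);
  }
}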

There's also some example code at http://github.com/hammer/whirr-demo.