Getting Started with Whirr
Whirr CLI
Pre-requisites
Install Java 6 on your machine. You also need an account with a cloud provider, such as Amazon EC2.
Install Whirr
Download or build Whirr. This guide refers to the directory that contains the Whirr JAR files as WHIRR_HOME
(you might like to define this as an environment variable).
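As a minimal sketch of defining that variable, the snippet below sets WHIRR_HOME and checks that the CLI JAR named in this guide is present. The default location `$HOME/whirr` is a hypothetical placeholder, not something Whirr prescribes; substitute the directory where you downloaded or built Whirr.

```shell
# Hypothetical install location; replace with the directory that holds the Whirr JARs.
export WHIRR_HOME="${WHIRR_HOME:-$HOME/whirr}"

# Report whether the CLI JAR used in this guide is present.
whirr_jar="$WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar"
if [ -f "$whirr_jar" ]; then
  echo "Found $whirr_jar"
else
  echo "No CLI JAR at $whirr_jar; check WHIRR_HOME"
fi
```

Adding the export line to your shell startup file keeps the variable available in new sessions.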
You can test that Whirr is working by running:
% java -jar $WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar
It is handy to create an alias for whirr, and for one including cloud credentials:
% alias whirr='java -jar $WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar'
% alias whirr-ec2='whirr --identity=$AWS_ACCESS_KEY_ID --credential=$AWS_SECRET_ACCESS_KEY'
Launch a cluster
The following will launch a Hadoop cluster with a single machine for the namenode and jobtracker, and a further machine for a datanode and tasktracker.
% whirr-ec2 launch-cluster --service-name=hadoop --cluster-name=tomhadoopcluster \
  --instance-templates='1 nn+jt 1 dn+tt'
Once the cluster has launched, you can browse it by connecting to http://master-host:50030, where master-host is the hostname of the machine running the namenode and jobtracker.
Configuration
Whirr is configured using a properties file, and optionally using command line arguments when using the CLI. Command line arguments take precedence over properties specified in a properties file.
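As a rough sketch, the launch options used earlier could be kept in a properties file instead. The key names below are an assumption: they take the command line option names and add a `whirr.` prefix. The file name `hadoop.properties` and the `--config` option in the commented invocation are also assumptions, so check them against the Configuration Guide for your Whirr version.

```shell
# Write a properties file. Key names are ASSUMED to be the CLI option
# names with a "whirr." prefix; verify against the Configuration Guide.
cat > hadoop.properties <<'EOF'
whirr.service-name=hadoop
whirr.cluster-name=tomhadoopcluster
whirr.instance-templates=1 nn+jt 1 dn+tt
EOF

# Assumed invocation (the --config option is an assumption, not confirmed here):
# whirr-ec2 launch-cluster --config hadoop.properties
```

Because command line arguments take precedence, a value such as the cluster name could still be overridden for a single run without editing the file.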
See the Configuration Guide for more on configuration.
Destroy a cluster
When you've finished using a cluster, you can terminate the instances and clean up resources with:
% whirr-ec2 destroy-cluster --service-name=hadoop --cluster-name=tomhadoopcluster
Whirr API
Whirr provides a Java API for starting and stopping clusters. See the unit test source code for examples of how to use it.
There's also some example code at http://github.com/hammer/whirr-demo.