Getting Started with Whirr

Whirr CLI

Pre-requisites

You need Java 6 installed on your machine and an account with a cloud provider, such as Amazon EC2.

Install Whirr

Download or build Whirr. This guide refers to the directory that contains the Whirr JAR files as WHIRR_HOME. The commands below assume you have defined it as an environment variable.

You can test that Whirr is working by running:

% java -jar $WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar
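If the command fails, the first thing to check is that WHIRR_HOME points at the right directory. The sketch below is one way to do that. The /opt/whirr default and the check_whirr_jar helper are illustrative assumptions, not part of Whirr.

```shell
# Hypothetical install location; change this to wherever you unpacked or built Whirr.
export WHIRR_HOME="${WHIRR_HOME:-/opt/whirr}"

# check_whirr_jar is an illustrative helper, not something Whirr ships.
# It reports whether the CLI JAR named in this guide exists under WHIRR_HOME.
check_whirr_jar() {
  jar="$WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar"
  if [ -f "$jar" ]; then
    echo "found: $jar"
  else
    echo "missing: $jar" >&2
    return 1
  fi
}
```

Running check_whirr_jar before the java command gives a clearer message than the Java launcher does when the path is wrong.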

It is handy to create an alias for whirr, and a second alias that includes your cloud credentials:

% alias whirr='java -jar $WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar'
% alias whirr-ec2='whirr --identity=$AWS_ACCESS_KEY_ID --credential=$AWS_SECRET_ACCESS_KEY'
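Aliases are convenient at an interactive prompt, but bash does not expand them in scripts by default. If you want the same shortcuts in a script, shell functions are one alternative. The following is a minimal sketch, assuming WHIRR_HOME and the two AWS variables are set as above. The function names whirr and whirr_ec2 are chosen here for illustration (an underscore is used because hyphens are not portable in function names).

```shell
# Function equivalent of the whirr alias; "$@" forwards all arguments unchanged.
whirr() {
  java -jar "$WHIRR_HOME/whirr-cli-0.1.0-SNAPSHOT.jar" "$@"
}

# Function equivalent of the whirr-ec2 alias, with an added guard that
# refuses to run when either credential variable is unset or empty.
whirr_ec2() {
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set" >&2
    return 1
  fi
  whirr --identity="$AWS_ACCESS_KEY_ID" --credential="$AWS_SECRET_ACCESS_KEY" "$@"
}
```

The guard is an addition for safety: without it, an unset variable would silently pass an empty identity or credential to the CLI.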

Launch a cluster

The following command launches a two-machine Hadoop cluster: one machine runs the namenode and jobtracker, and the other runs a datanode and tasktracker.

% whirr-ec2 launch-cluster --service-name=hadoop --cluster-name=tomhadoopcluster \
  --instance-templates='1 nn+jt 1 dn+tt'
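Because each instance costs money, it can help to check how many machines a template string asks for before launching. The sketch below assumes the template is a space-separated sequence of "count roles" pairs, which is inferred only from the example in this guide. The count_instances helper is hypothetical and not part of Whirr.

```shell
# count_instances TEMPLATE: sum the instance counts in a template string.
# Assumes alternating "count roles" words, as in '1 nn+jt 1 dn+tt'.
count_instances() {
  total=0
  # Deliberately unquoted: rely on word splitting to break the template apart.
  set -- $1
  while [ $# -ge 2 ]; do
    total=$((total + $1))   # $1 is the count, $2 is the role list (ignored here)
    shift 2
  done
  echo "$total"
}

count_instances '1 nn+jt 1 dn+tt'   # prints 2
```

Raising the second count, for example '1 nn+jt 3 dn+tt', would request four machines under the same assumption.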

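Once the cluster has launched, you can browse it by connecting to http://master-host:50030 (the jobtracker web interface), where master-host is the hostname of the machine running the namenode and jobtracker.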

Configuration

Whirr is configured with a properties file and, when you use the CLI, optionally with command line arguments. Command line arguments take precedence over properties specified in a properties file.
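The precedence rule can be illustrated with a toy lookup. This is not Whirr's implementation, only a sketch of the behaviour described above. The lookup helper is hypothetical, and the property name whirr.cluster-name in the usage note is an assumption that should be checked against the Configuration Guide.

```shell
# lookup KEY FILE [CLI_VALUE]
# Print CLI_VALUE when one is given; otherwise print KEY's value from FILE.
lookup() {
  key=$1
  file=$2
  cli_value=${3:-}
  if [ -n "$cli_value" ]; then
    # A command line argument wins over the properties file.
    echo "$cli_value"
  else
    # Match the key exactly and print everything after the first '='.
    awk -F= -v k="$key" '$1 == k { print substr($0, length(k) + 2) }' "$file"
  fi
}
```

For a file containing the line whirr.cluster-name=fromfile, calling lookup whirr.cluster-name FILE prints fromfile, while lookup whirr.cluster-name FILE fromcli prints fromcli.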

See the Configuration Guide for details.

Destroy a cluster

When you've finished using a cluster, you can terminate the instances and clean up resources with:

% whirr-ec2 destroy-cluster --service-name=hadoop --cluster-name=tomhadoopcluster
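A forgotten cluster keeps running, and keeps costing money, so in a script it can be worth pairing every launch with a destroy. The following is a sketch only. It assumes a command or shell function named whirr_ec2 is available that wraps the CLI with your credentials (an alias will not work in a script by default). The with_cluster helper is hypothetical.

```shell
# with_cluster NAME COMMAND...
# Launch cluster NAME, run COMMAND, then destroy NAME whether or not COMMAND succeeded.
with_cluster() {
  name=$1
  shift
  whirr_ec2 launch-cluster --service-name=hadoop --cluster-name="$name" \
    --instance-templates='1 nn+jt 1 dn+tt' || return 1
  status=0
  "$@" || status=$?
  # Destroy even when COMMAND failed, so instances are not left running.
  whirr_ec2 destroy-cluster --service-name=hadoop --cluster-name="$name"
  return "$status"
}
```

The function returns the exit status of COMMAND, so a calling script can still detect a failed job after the cluster has been cleaned up.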

Whirr API

Whirr provides a Java API for stopping and starting clusters. See the unit test source code for examples of how to use it.

There's also some example code at http://github.com/hammer/whirr-demo.
