Download And Build

The most recent CAS-Curator project can be downloaded from the OODT website or it can be checked out from the OODT repository using Subversion. We recommend checking out the latest released version (v1.0.0 at the time of writing).
Maven is the build management system used for OODT projects. We currently support Maven 2.0 and later. For more information on Maven, see our Maven Guide.
Assuming a *nix-like environment, with both Maven and Subversion clients installed and on your path, an example of the checkout and build process is presented below:

mkdir /usr/local/src
cd /usr/local/src
svn checkout http://oodt/repo/cas-curator/tags/1_0_0_release \

After the Subversion command completes, you will have the source for the CAS-Curator project in the /usr/local/src/cas-curator-v1.0.0 directory.
In order to build the WAR (Web ARchive) file from this source, issue the following commands:

cd /usr/local/src/cas-curator-v1.0.0
mvn package

Once the Maven command completes successfully, you should have a target directory under cas-curator-v1.0.0/. The WAR file, called cas-curator-1.0.0.war, can be found under target/.
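Before moving on, it can help to confirm that the build actually produced the WAR file. The following is a minimal sketch and not part of the CAS-Curator distribution: check_war is a hypothetical helper name, and the WAR file name is the one given in the text above.

```shell
# Hypothetical helper (not part of CAS-Curator): confirm that the Maven
# build left the expected WAR file under <source dir>/target/.
check_war() {
  src_dir="$1"
  war="$src_dir/target/cas-curator-1.0.0.war"
  if [ -f "$war" ]; then
    echo "found: $war"
  else
    echo "missing: $war" >&2
    return 1
  fi
}
```

For the layout used in this guide, it would be called as check_war /usr/local/src/cas-curator-v1.0.0.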
In the next section, we will discuss deploying this WAR file to a Tomcat instance.

Tomcat Deployment

Once you have built a WAR file, it must be deployed to a servlet container such as Tomcat or Jetty. For the purposes of this guide, we will assume that you are using Tomcat. Tomcat can be installed in a user account or at the system level. The base configuration launches a web server on port 8080. You can learn more about Tomcat and download the latest release from the Tomcat website. NOTE: At the time of writing, there are three concurrent versions of Tomcat: 5.5.X, 6.0.X and 7.0.X. CAS-Curator is compatible with all three.
We will assume that you have downloaded Tomcat to an appropriate directory, are using the default configuration, and have taken the appropriate steps to allow access to port 8080. See your System Administrator if you have any questions about firewall security and policy regarding port access. We will further assume that you have set an environment variable, $TOMCAT_HOME, to the base directory of your Tomcat installation.
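Since the rest of this guide relies on $TOMCAT_HOME, a quick sanity check may save some confusion later. This is only a sketch: check_tomcat_home is a hypothetical helper, and it assumes a standard Tomcat layout in which the base directory contains conf/ and webapps/.

```shell
# Hypothetical helper: verify that TOMCAT_HOME is set and looks like a
# Tomcat base directory (assumed to contain conf/ and webapps/).
check_tomcat_home() {
  if [ -z "${TOMCAT_HOME:-}" ]; then
    echo "TOMCAT_HOME is not set" >&2
    return 1
  fi
  for sub in conf webapps; do
    if [ ! -d "$TOMCAT_HOME/$sub" ]; then
      echo "missing directory: $TOMCAT_HOME/$sub" >&2
      return 1
    fi
  done
  echo "TOMCAT_HOME looks usable: $TOMCAT_HOME"
}
```

After exporting the variable for your own installation, running check_tomcat_home should print a single confirmation line.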
There are a number of ways to deploy a WAR file to Tomcat, though we recommend using a context file. A context file is an XML file that provides Tomcat with "context" for using a particular web application. In order to create a context file for the CAS-Curator, open your favorite text editor and copy and paste the following:

<Context path="/my-curator"
  <Parameter name=""
  <Parameter name="org.apache.oodt.cas.curator.projectName"
        value="My Project"/>

Save the context file to $TOMCAT_HOME/conf/Catalina/localhost/my-curator.xml. Now you can point a web browser to http://localhost:8080/my-curator and you should see a log-in screen for CAS-Curator. Note: Tomcat only uses the path attribute when the context is defined in server.xml. For a context file saved under conf/Catalina/localhost, Tomcat derives the context path from the XML file name instead. See the Tomcat documentation for further information.
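To make the file-name rule concrete, the sketch below mimics how a context path follows from the name of a context file. It illustrates the naming convention described in the Tomcat documentation; context_path_for is a hypothetical helper, not a Tomcat command, and it ignores version suffixes.

```shell
# Sketch of Tomcat's naming rule for context files placed under
# conf/Catalina/localhost/: the context path comes from the file name.
# context_path_for is a hypothetical helper, not a Tomcat command.
context_path_for() {
  name=$(basename "$1" .xml)
  if [ "$name" = "ROOT" ]; then
    # ROOT.xml maps to the root context, whose path is the empty string.
    echo ""
  else
    # A '#' in the file name stands for '/' in the context path.
    echo "/$name" | tr '#' '/'
  fi
}
```

Calling context_path_for with the file saved above yields /my-curator, which is why the application is served at http://localhost:8080/my-curator. Renaming the file would change the URL.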

The parameter that we set in the context file configures the CAS-Curator for a "dummy" log-in to its Single Sign On service. Because of this, we are able to log into the web application with a blank user name and a blank password. For help in implementing security with CAS-Curator, see our Advanced Guide.

In the next sections, we will talk about setting up staging areas, metadata extractors, and launching a CAS-Filemgr instance into which CAS-Curator will ingest data products.

Staging Area Setup

Staging areas are directories on your local machine that hold data products to be curated. The staging area can have arbitrary structure. The only requirement that CAS-Curator has with regard to this structure is that the directory structure be mirrored in a metadata generation area. This generation area is used by CAS-Curator to create metadata files to associate with data products.
For example, if there is a product, say an MP3 file of Bach's Der Geist hilft unsrer Schwachheit auf, in the staging area at:


Then the CAS-Curator will generate all associated metadata products in [metadata_gen_base]/audio/classical/bach/.
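The mirroring rule can be sketched as a small path calculation. This is an illustration of the rule as described here, not CAS-Curator's own code: met_dir_for is a hypothetical helper, and the base directories and file name in the usage note are placeholders.

```shell
# Hypothetical helper: map a product in the staging area to the directory
# in the metadata generation area that mirrors its location.
met_dir_for() {
  staging_base="${1%/}"   # drop one trailing slash, if any
  met_base="${2%/}"
  product="$3"
  case "$product" in
    "$staging_base"/*) ;;
    *) echo "not under staging area: $product" >&2; return 1 ;;
  esac
  rel=${product#"$staging_base"/}   # path relative to the staging base
  rel_dir=$(dirname "$rel")
  if [ "$rel_dir" = "." ]; then
    # Product sits directly in the staging base.
    echo "$met_base"
  else
    echo "$met_base/$rel_dir"
  fi
}
```

With placeholder bases /data/staging and /data/met, a product at /data/staging/audio/classical/bach/example.mp3 maps to /data/met/audio/classical/bach.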
In order to set up the staging area and the metadata generation area, we first create base directories for each, shown below:

mkdir /usr/local/staging
mkdir /usr/local/staging/products
mkdir /usr/local/staging/metadata

Next, we will set the following parameters in the CAS-Curator context file:

<Parameter name="org.apache.oodt.cas.curator.stagingAreaPath"
<Parameter name="org.apache.oodt.cas.curator.metAreaPath"
<Parameter name="org.apache.oodt.cas.curator.metExtension"

The org.apache.oodt.cas.curator.stagingAreaPath parameter should be set to the product staging area and the org.apache.oodt.cas.curator.metAreaPath parameter should be set to the metadata generation area. Additionally, we specified the parameter org.apache.oodt.cas.curator.metExtension to be .met. This parameter specifies the extension for all of the metadata files produced in the metadata generation area.
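As a rough sketch of what these three elements might look like once filled in, the helper below prints them for a given pair of directories and extension. The values passed at the bottom are assumptions drawn from the directories created earlier and the .met extension mentioned above; the value attribute follows the form used for the projectName parameter, and print_staging_params is a hypothetical name.

```shell
# Hypothetical helper: print the three staging-related <Parameter> elements.
# The value="..." attribute follows the form used for projectName earlier
# in this guide; the argument values are assumptions taken from the prose.
print_staging_params() {
  staging_path="$1"
  met_path="$2"
  met_ext="$3"
  cat <<EOF
  <Parameter name="org.apache.oodt.cas.curator.stagingAreaPath"
        value="$staging_path"/>
  <Parameter name="org.apache.oodt.cas.curator.metAreaPath"
        value="$met_path"/>
  <Parameter name="org.apache.oodt.cas.curator.metExtension"
        value="$met_ext"/>
EOF
}

# Assumed values for the layout used in this guide.
print_staging_params /usr/local/staging/products /usr/local/staging/metadata .met
```

The printed elements would go inside the Context element of the context file, alongside the parameters already defined there.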
For illustrative purposes, we will load an mp3 file into the staging area:

mkdir /usr/local/staging/products/mp3
cd /usr/local/staging/products/mp3
curl -LO

We should note that this music file was produced by the Fulda Symphonic Orchestra and is freely distributed under the EFF Open Audio License, version 1.0. We have edited the ID3 tag of this file (in order to make the later metadata extraction example more interesting), but original authorship is retained. Now back to the tutorial...
Remember that we need to mirror the product staging area in the metadata generation area, so we will also need to create the matching directory structure there:

mkdir /usr/local/staging/metadata/mp3
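For staging areas with many subdirectories, creating each mirror directory by hand becomes tedious. The following sketch recreates the directory tree (directories only, no files) in one pass. mirror_dirs is a hypothetical helper and not something CAS-Curator provides.

```shell
# Hypothetical helper: recreate every directory found under the product
# staging area beneath the metadata generation area. Files are not copied.
mirror_dirs() {
  src="$1"
  dst="$2"
  if [ ! -d "$src" ]; then
    echo "no such directory: $src" >&2
    return 1
  fi
  # List directories relative to src, then create each one under dst.
  (cd "$src" && find . -type d) | while IFS= read -r dir; do
    mkdir -p "$dst/$dir"
  done
}
```

For the layout in this guide, the call would be mirror_dirs /usr/local/staging/products /usr/local/staging/metadata. Running it again after adding new product directories is harmless, since mkdir -p skips directories that already exist.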

Once you restart Tomcat, the changes you have made to the context file will be used. The staging area will now be set to /usr/local/staging/products.

Double-clicking on "mp3", we can see that the staging area path in the top left is now /mp3 and Bach-SuiteNo2.mp3 can be seen in the main left staging pane. For the time being, no metadata is detected (as reported in the main right staging pane), but in the next section we will set up a basic, command-line metadata extractor in order to show how extractors are integrated into CAS-Curator.
