1. Defining a Workflow with CAS-PGE

See CAS-PGE Learn by Example

2. Adding Metadata

Workflow Context Metadata

Every workflow instance has the following core metadata keys:

-TaskId
-WorkflowInstId
-JobId
-ProcessingNode
-WorkflowManagerUrl
-QueueName
-TaskLoad

These metadata keys can be referenced inside a PGE by enclosing the key name in square brackets. For example:

<customMetadata>
    <metadata key="JobWorkDir" val="[PGE_WORK_DIR]/[JobId]"/>
</customMetadata>
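
As a further sketch, several context keys can be combined in a single value. The key names LogFile and NodeLabel below are hypothetical and chosen only for illustration; [PGE_WORK_DIR] is assumed to be defined elsewhere, as in the example above.

```xml
<customMetadata>
    <!-- Hypothetical key: one log file per job, grouped by workflow instance -->
    <metadata key="LogFile" val="[PGE_WORK_DIR]/[WorkflowInstId]/[JobId].log"/>
    <!-- Hypothetical key: records which node ran the task -->
    <metadata key="NodeLabel" val="[ProcessingNode]"/>
</customMetadata>
```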

In addition to the above keys, you can add your own metadata to the workflow with the --metaData option when kicking off a workflow event.

The --metaData command line option adds key-value pairs to the workflow context metadata, as shown below:

./wmgr-client --url http://localhost:9001 --operation --sendEvent --eventName fileconcatenator-pge --metaData --key RunID testNumber1

The key can then be referenced inside a PGE, for example in the augmented metadata (<customMetadata>), as shown below:

..
<customMetadata>
    <metadata key="InputFile" val="SQL(FORMAT='$Filename') {SELECT Filename FROM GenericFile WHERE RID = '[RunID]' }"/>
</customMetadata>

Augmenting Metadata in a PGE

This part is taken from 'CAS-Workflow 2: A User Guide' by Brian Foster.

The element for augmenting metadata is <customMetadata>. Although this element appears at the end of the pge-config.xml file, it is not loaded last. It is actually the first element loaded; the only element loaded before it is the import element, which is not used in this example. Any number of <metadata> elements are allowed inside <customMetadata>.

  • To pass a metadata key on to the following tasks in a workflow, set the attribute workflowMet='true'. For example: <metadata key='filename' val='data.dat' workflowMet='true'/>
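
A minimal sketch of the difference, assuming that a key without the workflowMet attribute stays local to the current task. The ScratchDir key and its value are hypothetical, and [PGE_WORK_DIR] is assumed to be defined elsewhere.

```xml
<customMetadata>
    <!-- Passed on to the following tasks in the workflow -->
    <metadata key="filename" val="data.dat" workflowMet="true"/>
    <!-- No workflowMet attribute: assumed to be visible to this task only -->
    <metadata key="ScratchDir" val="[PGE_WORK_DIR]/scratch"/>
</customMetadata>
```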

Product-Type Metadata

The product-type metadata is the metadata for the files that are ingested during the workflow. It is defined in a met file that is specified in the "args" attribute of the 'files' element in the PgeConfig.xml:

<files name="FiletoIngest" metFileWriterClass="org.apache.oodt.cas.pge.writers.metlist.MetadataListPcsMetFileWriter" args="PGE_CONFIG_HOME/MetOut_FiletoIngest.xml"/>

The MetOut_FiletoIngest.xml typically looks like this:

<?xml version="1.0" encoding="UTF-8"?>
  <metadataList>
      <!-- Any File -->
      <metadata key="ProductName" val="[Filename]"/>
      <metadata key="Filename"/>
      <metadata key="FileLocation"/>
      <metadata key="FileSize"/>
      <metadata key="ProductType"/>
      <!--Add any element specified in your elements.xml that you want to be written out 
          as metadata for the output file-->
  </metadataList>

The metFileWriters create the metadata (.met) file for the output files that will be ingested by the file manager. 
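
For orientation, a generated .met file might look roughly like the sketch below. It assumes the standard CAS metadata XML layout; only the key names come from the met list above, and all values are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<cas:metadata xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
    <!-- One keyval entry per key listed in the met list; values are made up -->
    <keyval>
        <key>ProductName</key>
        <val>output.txt</val>
    </keyval>
    <keyval>
        <key>Filename</key>
        <val>output.txt</val>
    </keyval>
    <keyval>
        <key>FileLocation</key>
        <val>/hypothetical/path/to/job/dir</val>
    </keyval>
    <keyval>
        <key>ProductType</key>
        <val>GenericFile</val>
    </keyval>
</cas:metadata>
```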
