1. Defining a Workflow with CAS-PGE
2. Adding Metadata
Workflow Context Metadata
Every workflow instance has the following core metadata keys:
The met keys above can be accessed inside a PGE script.
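As a minimal sketch of such access, a pge-config.xml command might reference the keys as shown here. It assumes CAS-PGE's square-bracket replacement syntax ([KeyName]) and assumes that WorkflowInstId and ProcessingNode are among the core keys; neither assumption is confirmed by this page, and the working directory is a placeholder.

```xml
<pgeConfig>
  <!-- Illustrative only: key names and the working directory are assumptions -->
  <exe dir="/tmp" shell="/bin/sh">
    <!-- Bracketed keys are expected to be replaced with metadata values before the command runs -->
    <cmd>echo "Instance [WorkflowInstId] running on [ProcessingNode]"</cmd>
  </exe>
</pgeConfig>
```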
In addition to the keys above, you can add your own metadata when kicking off a workflow event: the --metaData command-line option adds key-value pairs to the workflow context metadata.
A key added this way can then be used inside a PGE, for example in the augmented metadata (<customMetadata>).
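The fragment below sketches that usage. It assumes the workflow event was sent with a hypothetical key named Filename (for instance, with the value data.dat); the key name, the InputFile key, and the staging path are all made up for illustration.

```xml
<customMetadata>
  <!-- Filename is a hypothetical key supplied on the command line when the event was sent -->
  <!-- [Filename] is expected to be replaced with that key's value from the workflow context -->
  <metadata key="InputFile" val="/data/staging/[Filename]"/>
</customMetadata>
```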
Augmenting Metadata in a PGE
This part is taken from 'CAS-Workflow 2: A User Guide' by Brian Foster.
The element for augmenting metadata is <customMetadata>. Although this element usually appears at the end of pge-config.xml, that does not mean it is loaded last. <customMetadata> is actually the first element loaded in pge-config.xml; the only element loaded before it is <import>, when one is present. Inside <customMetadata>, any number of <metadata> elements are allowed.
To pass metadata through all tasks in a workflow, you can specify the attribute workflowMet='true'. For example: <metadata key='filename' val='data.dat' workflowMet='true'/>
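Putting these pieces together, a pge-config.xml might look roughly like the sketch below. The WorkDir key, its path, and the echo command are hypothetical; only the <customMetadata>, <metadata>, and workflowMet names come from the text above.

```xml
<pgeConfig>
  <!-- <exe> can reference custom keys even though <customMetadata> appears later in the file,
       because <customMetadata> is loaded first -->
  <exe dir="[WorkDir]" shell="/bin/sh">
    <cmd>echo "Processing [filename]"</cmd>
  </exe>

  <customMetadata>
    <!-- Hypothetical key, visible to this task only -->
    <metadata key="WorkDir" val="/tmp/pge-work"/>
    <!-- workflowMet='true' passes this key on to the other tasks in the workflow -->
    <metadata key="filename" val="data.dat" workflowMet="true"/>
  </customMetadata>
</pgeConfig>
```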
Metadata elements specified in a different file can be accessed in a PGE using the <import> tag. For example, metadata shared by several tasks can be kept in a file such as common-metadata.xml and imported into each PGE task config (PgeConfig).
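A sketch of the two files follows. It assumes that the imported file uses the same pgeConfig layout as a task config and that <import> takes a file attribute; both are assumptions to check against your OODT version, and the key names and path are hypothetical.

```xml
<!-- common-metadata.xml (assumed layout) -->
<pgeConfig>
  <customMetadata>
    <metadata key="DataRoot" val="/data/archive"/>
  </customMetadata>
</pgeConfig>
```

```xml
<!-- PGE task config that imports the shared keys -->
<pgeConfig>
  <!-- The file attribute name and the path are assumptions -->
  <import file="/path/to/common-metadata.xml"/>

  <exe dir="/tmp" shell="/bin/sh">
    <cmd>ls [DataRoot]</cmd>
  </exe>
</pgeConfig>
```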
The product-type metadata is the metadata for the files that are ingested during the workflow. It is defined in a met file that is specified in the "args" attribute of the <files> element in PgeConfig.xml:
<files name="FiletoIngest" metFileWriterClass="org.apache.oodt.cas.pge.writers.metlist.MetadataListPcsMetFileWriter" args="PGE_CONFIG_HOME/MetOut_FiletoIngest.xml"/>
MetOut_FiletoIngest.xml holds the metadata definitions for the file to be ingested.
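As a rough sketch, such a file could be a flat list of <metadata> entries. The root element name and the Instrument key are assumptions made for illustration; GenericFile is used only as a placeholder product type.

```xml
<!-- MetOut_FiletoIngest.xml (assumed layout; the root element name is a guess) -->
<metadataList>
  <!-- Product type under which the file manager should catalog the file -->
  <metadata key="ProductType" val="GenericFile"/>
  <!-- Hypothetical additional key written to the .met file -->
  <metadata key="Instrument" val="example-instrument"/>
</metadataList>
```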
The metFileWriters create the metadata (.met) file for the output files that will be ingested by the file manager.