The Default Clinical Pipeline is a good starting point for new users of a binary installation.  New developers should start with the ctakes-examples project instead.

The Default Clinical Pipeline produces the most commonly desired output from cTAKES.  This includes annotations for Anatomical sites, Signs/Symptoms, Procedures, Diseases/Disorders and Medications.  For each annotation there are normalized UMLS CUIs, plus values for negation, uncertainty and subject.

Figure 1.  A sample sentence processed by the Default Clinical Pipeline.

Step-by-step guide

Run via command line.

  1. Execute bin/runClinicalPipeline -i inputDirectory --xmiOut outputDirectory --user umlsUsername --pass umlsPassword
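
The command in step 1 can be wrapped in a small shell function that checks the input directory before launching. This is only a sketch: the function name run_clinical_pipeline and the DRY_RUN switch are hypothetical and are not part of cTAKES. The flags are the ones shown in step 1. It assumes a POSIX shell started from the cTAKES installation root; if the bare launcher name is not found on your platform, use the script file that ships in bin/.

```shell
# Hypothetical helper: validate the input directory, then call the launcher
# with the same flags as step 1. Run from the cTAKES installation root.
run_clinical_pipeline() {
  in_dir=$1; out_dir=$2; user=$3; pass=$4

  # Fail early instead of letting the pipeline start with nothing to read.
  if [ ! -d "$in_dir" ]; then
    echo "input directory not found: $in_dir" >&2
    return 1
  fi

  # DRY_RUN=1 (hypothetical switch) prints the command, with the password
  # masked, instead of running it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "bin/runClinicalPipeline -i $in_dir --xmiOut $out_dir --user $user --pass ***"
    return 0
  fi

  bin/runClinicalPipeline -i "$in_dir" --xmiOut "$out_dir" --user "$user" --pass "$pass"
}
```

For example, DRY_RUN=1 run_clinical_pipeline notes out umlsUsername umlsPassword prints the command that would be run, provided the notes directory exists.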

The pipeline writes log information to the screen and writes one XMI file for each file in inputDirectory and its subdirectories.  The directory tree below inputDirectory is mirrored in outputDirectory.
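
One way to confirm that the directory tree was mirrored is to count the XMI files in each output subdirectory and compare the result with the input tree. The sketch below does that with standard POSIX tools. The function name count_xmi_by_dir is hypothetical, and the only assumption about the output is that the files end in .xmi.

```shell
# Hypothetical helper: print "<count> <relative directory>" for every
# directory under the output root that contains .xmi files.
count_xmi_by_dir() {
  out_dir=$1
  (
    cd "$out_dir" || exit 1
    find . -type f -name '*.xmi' |
      sed 's|/[^/]*$||' |   # drop the file name, keep its directory
      sort |
      uniq -c |
      sed 's/^ *//'         # trim the padding that uniq -c adds
  )
}
```

Calling count_xmi_by_dir outputDirectory lists one line per subdirectory, which can be checked against the number of input files in the matching subdirectory of inputDirectory.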

You can view information in the XMI files using the UIMA Cas Visual Debugger (CVD).

  1. Execute bin/runctakesCVD
  2. Select File > Read Type System File
  3. Select TypeSystem.xml in resources/org/apache/ctakes/typesystem/types/
  4. Select File > Read XMI CAS File
  5. Select any .xmi file in your outputDirectory

Selecting items in the tree on the left highlights the corresponding text in the document on the right.  Browsing annotations is not always straightforward.  See the CVD main area documentation for how to use the CVD, and the cTAKES 4.0 Component Use Guide for cTAKES annotations and their attributes.

For cTAKES 4.0, if runClinicalPipeline fails with "ERROR PipelineBuilder - No Collection Reader specified.", verify that you passed -i inputDirectory on the command line.
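
Because the pipeline logs to the screen, this error is easy to miss in a long run. A minimal sketch, assuming the screen output was saved to a file first (for example by piping the command through tee): the function name has_missing_reader_error and the log file name are hypothetical, and the search string is the message quoted above.

```shell
# Hypothetical helper: succeed if the captured log contains the
# "No Collection Reader specified" error.
has_missing_reader_error() {
  grep -q 'No Collection Reader specified' "$1"
}

# Example use, after saving the pipeline output to pipeline.log:
#   if has_missing_reader_error pipeline.log; then
#     echo "Check that -i inputDirectory was passed." >&2
#   fi
```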

The bin/runClinicalPipeline command runs the Piper File DefaultFastPipeline.piper, found in resources/org/apache/ctakes/clinical/pipeline/