The default configuration of Dictionary Lookup uses an HSQLDB database containing terms and normalized codes (CUIs).  Dictionary databases containing the information typically desired from the UMLS are available on SourceForge.

However, the standard dictionaries may not suit every use case.  For this reason cTAKES has a GUI that assists in creating custom dictionaries.  The GUI currently allows only the most basic customization: desired source vocabularies, semantic types, and additional vocabulary codes of interest.

    * Greater customization is available, but it requires editing property files and is outside the scope of this document.

Step-by-step guide

  1. From a command line in the cTAKES root directory, execute: bin\runDictionaryCreator

  2. Select a cTAKES installation directory.  The default directory should be correct.
  3. Select a UMLS installation directory.  This is the directory containing the META/ subdirectory (which contains RRF files).
    After you select the UMLS installation directory, the GUI gathers the available vocabularies.
  4. Select Source Vocabularies.  These are the vocabularies containing the CUIs that interest you.
  5. Select Target Vocabularies.  The dictionary will include codes from these vocabularies.
  6. Select Semantic Types.  The standard cTAKES types are selected by default.
  7. Type a Dictionary Name.  Use all lower case.
  8. Click Build Dictionary.
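
Before launching the GUI, it can help to confirm that the directory you plan to select in step 3 has the expected layout and to preview which source vocabularies it holds. The following is a minimal Python sketch, not part of cTAKES. The function names are hypothetical, and the sketch assumes the standard UMLS release layout, in which META/MRCONSO.RRF is pipe-delimited with the source abbreviation (SAB) in the twelfth field. How the GUI itself gathers vocabularies is not described here and may differ.

```python
from pathlib import Path


def find_rrf_files(umls_dir):
    """Return the sorted names of .RRF files in <umls_dir>/META, or [] if none."""
    meta = Path(umls_dir) / "META"
    if not meta.is_dir():
        return []
    return sorted(p.name for p in meta.iterdir() if p.suffix.upper() == ".RRF")


def list_source_vocabularies(umls_dir):
    """Collect distinct SAB values from META/MRCONSO.RRF (assumed layout)."""
    mrconso = Path(umls_dir) / "META" / "MRCONSO.RRF"
    vocabularies = set()
    with open(mrconso, encoding="utf-8") as rrf:
        for line in rrf:
            fields = line.rstrip("\n").split("|")
            if len(fields) > 11:  # SAB is the 12th pipe-delimited field
                vocabularies.add(fields[11])
    return sorted(vocabularies)
```

If `find_rrf_files` returns an empty list, the chosen directory is probably not the UMLS installation directory that step 3 expects.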


Once a new dictionary has been built, point cTAKES to it in one of two ways, replacing DictionaryName with the name entered in step 7:

    * Set the fast dictionary parameter LookupXml to org/apache/ctakes/dictionary/lookup/fast/DictionaryName.xml
    * Set the runClinicalPipeline or runPiperFile command-line parameter -l to org/apache/ctakes/dictionary/lookup/fast/DictionaryName.xml
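
Both options take the same path, which varies only in the dictionary name. As a small illustration, the hypothetical Python helper below derives that path from a dictionary name and builds the -l argument pair. It rejects names that are not all lower case, following the naming guidance in the step-by-step guide. This is a sketch for scripting around cTAKES, not part of its API, and it assumes the descriptor file is named exactly after the dictionary.

```python
LOOKUP_PACKAGE = "org/apache/ctakes/dictionary/lookup/fast"


def lookup_xml_path(dictionary_name):
    """Return the LookupXml value for a custom dictionary name."""
    if not dictionary_name or dictionary_name != dictionary_name.lower():
        raise ValueError("dictionary name should be non-empty and all lower case")
    return f"{LOOKUP_PACKAGE}/{dictionary_name}.xml"


def lookup_arguments(dictionary_name):
    """Return the -l option and its value as a list of command-line tokens."""
    return ["-l", lookup_xml_path(dictionary_name)]
```

For a dictionary named mydict (a made-up name), `lookup_arguments("mydict")` yields the two tokens `-l` and `org/apache/ctakes/dictionary/lookup/fast/mydict.xml`, which can be appended to a runClinicalPipeline or runPiperFile invocation.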

UMLS License

Please ensure that you comply with the UMLS License.