The default configuration of Dictionary Lookup uses an HSQLDB database containing terms and normalized codes (CUIs). Dictionary databases containing the UMLS information that is typically desired are available on SourceForge.
However, the standard dictionaries do not suit every use case. For this reason cTAKES has a GUI that assists in the creation of custom dictionaries. The GUI currently allows only the most basic customization: the desired source vocabularies, semantic types, and additional vocabulary codes of interest.
* Greater customization is available, but it requires editing property files and is outside the scope of this document.
From a command line in the cTAKES root directory, execute:
- Select a cTAKES installation directory. The default directory should be correct.
- Select a UMLS installation directory. This is the directory containing the META/ subdirectory (which contains RRF files).
After selecting the UMLS installation directory, the available vocabularies are gathered.
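To make the gathering step concrete, here is a minimal sketch of how the available vocabularies could be read from a UMLS installation. It is not the GUI's actual implementation. It assumes the standard UMLS file META/MRCONSO.RRF, which is pipe-delimited with the source vocabulary abbreviation (SAB) in the twelfth column; the function name `list_source_vocabularies` is made up for this example.

```python
from pathlib import Path

# Zero-based index of the SAB (source abbreviation) column in MRCONSO.RRF.
SAB_COLUMN = 11


def list_source_vocabularies(umls_dir):
    """Return the sorted SAB values found in <umls_dir>/META/MRCONSO.RRF.

    Hypothetical helper for illustration; the cTAKES GUI may gather
    vocabularies differently.
    """
    mrconso = Path(umls_dir) / "META" / "MRCONSO.RRF"
    if not mrconso.is_file():
        raise FileNotFoundError(f"No META/MRCONSO.RRF under {umls_dir}")

    vocabularies = set()
    with mrconso.open(encoding="utf-8") as rrf:
        for line in rrf:
            fields = line.rstrip("\n").split("|")
            # Skip malformed rows that are too short to hold a SAB value.
            if len(fields) > SAB_COLUMN:
                vocabularies.add(fields[SAB_COLUMN])
    return sorted(vocabularies)
```

Calling `list_source_vocabularies` on a directory without a META/ subdirectory raises `FileNotFoundError`, which is a quick way to confirm that the right UMLS directory was chosen.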
- Select Source Vocabularies. Source vocabularies contain CUIs that interest you.
- Select Target Vocabularies. The dictionary will contain target vocabulary codes.
- Select Semantic Types. The standard cTAKES types are selected by default.
- Type a Dictionary Name. Use all lower case.
- Click Build Dictionary.
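The following sketch illustrates how the source vocabulary and semantic type selections might combine to decide which CUIs end up in a dictionary. It is an assumption about the general idea, not the logic the GUI actually runs. It relies on the standard UMLS layouts of MRCONSO.RRF (CUI in column 1, SAB in column 12) and MRSTY.RRF (CUI in column 1, TUI in column 2); `select_cuis` is a hypothetical name.

```python
def select_cuis(mrconso_rows, mrsty_rows, source_vocabularies, semantic_types):
    """Return CUIs found in a wanted source vocabulary AND typed with a wanted TUI.

    Illustrative only: the real dictionary builder may apply further rules.
    Rows are raw pipe-delimited lines from MRCONSO.RRF and MRSTY.RRF.
    """
    # CUIs that have at least one term from a selected source vocabulary.
    cuis_in_sources = set()
    for row in mrconso_rows:
        fields = row.split("|")
        if len(fields) > 11 and fields[11] in source_vocabularies:
            cuis_in_sources.add(fields[0])

    # CUIs that carry at least one selected semantic type (TUI).
    cuis_with_types = set()
    for row in mrsty_rows:
        fields = row.split("|")
        if len(fields) > 1 and fields[1] in semantic_types:
            cuis_with_types.add(fields[0])

    return cuis_in_sources & cuis_with_types
```

For example, passing `{"SNOMEDCT_US"}` and `{"T047"}` (Disease or Syndrome) keeps only concepts that appear in SNOMEDCT_US and are typed as diseases or syndromes.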
Once a new dictionary has been built, point to it in one of two ways:
- Set the fast dictionary parameter LookupXml to
- Set the runClinicalPipeline or runPiperFile command-line parameter -l to org/apache/ctakes/dictionary/lookup/fast/DictionaryName/DictionaryName.xml
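Because the dictionary name appears twice in the descriptor path, it is easy to mistype. The small sketch below builds the value for -l from a dictionary name by substituting it for DictionaryName in the path pattern given above. The function name `lookup_xml_argument` is hypothetical, and any further options that runClinicalPipeline or runPiperFile need are not shown.

```python
def lookup_xml_argument(dictionary_name):
    """Return the ["-l", path] argument pair for a custom dictionary.

    Hypothetical helper: it only fills the dictionary name into the
    path pattern org/apache/ctakes/dictionary/lookup/fast/<name>/<name>.xml.
    """
    name = dictionary_name.strip()
    if not name:
        raise ValueError("dictionary name must not be empty")
    # The name is used exactly as typed in the GUI (all lower case is
    # recommended there); it is not altered here.
    path = f"org/apache/ctakes/dictionary/lookup/fast/{name}/{name}.xml"
    return ["-l", path]
```

The returned pair can be appended to whatever argument list is used to launch the pipeline script.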