Apache Solr Documentation

*** As of June 2017, the latest Solr Ref Guide is located at https://lucene.apache.org/solr/guide ***

You can integrate the Apache Unstructured Information Management Architecture (UIMA) with Solr. UIMA lets you define custom pipelines of Analysis Engines that incrementally add metadata to your documents as annotations.

For more information about Solr UIMA integration, see https://wiki.apache.org/solr/SolrUIMA.

Configuring UIMA

The SolrUIMA UpdateRequestProcessor is a custom update request processor that takes documents being indexed, sends them to a UIMA pipeline, and then returns the documents enriched with the specified metadata. To configure UIMA for Solr, follow these steps:

  1. Copy solr-uima-VERSION.jar (under /solr-VERSION/dist/) and its libraries (under contrib/uima/lib) to a Solr libraries directory, or add <lib/> tags to solrconfig.xml that point to those JAR files.

  2. In schema.xml, add the metadata fields you want UIMA to populate, specifying appropriate values for the type, indexed, stored, and multiValued options.

  3. Add an update request processor chain that uses the UIMA update processor to solrconfig.xml. Its settings are as follows:

    VALID_ALCHEMYAPI_KEY is your AlchemyAPI Access Key. You need to register an AlchemyAPI Access key to use AlchemyAPI services: http://www.alchemyapi.com/api/register.html.

    VALID_OPENCALAIS_KEY is your Calais Service Key. You need to register a Calais Service key to use the Calais services: http://www.opencalais.com/apikey.

    analysisEngine must be the classpath location of an Analysis Engine (AE) descriptor.

    analyzeFields must list the input fields that UIMA should analyze. If merge=true, their content is merged and analyzed only once.

    The field mappings describe, for each UIMA type, which feature is written to which Solr field.

  4. In solrconfig.xml, replace the existing default UpdateRequestHandler, or create a new one, so that it uses the UIMA update chain.
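
    As a sketch of this step, a handler along the following lines routes incoming updates through the UIMA chain. The chain name uima is an assumption (it must match whatever name you gave the chain in step 3); solr.UpdateRequestHandler is the handler class in Solr 4 and later, and update.chain is the request parameter that selects a chain.

    ```xml
    <!-- solrconfig.xml: send /update requests through the UIMA chain.
         "uima" is an assumed chain name; use the name of your own chain. -->
    <requestHandler name="/update" class="solr.UpdateRequestHandler">
      <lst name="defaults">
        <str name="update.chain">uima</str>
      </lst>
    </requestHandler>
    ```

    Setting the chain under defaults means individual requests can still override it by passing their own update.chain parameter.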

Once the configuration is complete, your documents are automatically enriched with the specified fields when you index them.
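
For orientation, steps 1 to 3 above might be sketched as follows. This is an illustrative outline rather than the guide's exact configuration: the relative lib paths, the field names (language, concept, sentence) and their types, the chain name uima, the descriptor path, the runtime parameter names, and the UIMA type and feature names are all assumptions modeled on the example configuration shipped with the Solr UIMA contrib. Adapt each of them to your installation and to the AE you actually use.

```xml
<!-- Step 1 (solrconfig.xml): load the UIMA contrib jars.
     Paths are relative to the core's instance directory and are assumptions. -->
<lib dir="../../contrib/uima/lib" />
<lib dir="../../dist/" regex="solr-uima-\d.*\.jar" />

<!-- Step 2 (schema.xml): fields that receive the UIMA metadata.
     Field names and types are examples; the types must exist in your schema. -->
<field name="language" type="string" indexed="true" stored="true" required="false"/>
<field name="concept" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="sentence" type="text_general" indexed="true" stored="true" multiValued="true" required="false"/>

<!-- Step 3 (solrconfig.xml): the update chain that calls the UIMA pipeline. -->
<updateRequestProcessorChain name="uima">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <!-- Parameters passed to the AE; the names depend on the AE descriptor. -->
      <lst name="runtimeParameters">
        <str name="concept_apikey">VALID_ALCHEMYAPI_KEY</str>
        <str name="lang_apikey">VALID_ALCHEMYAPI_KEY</str>
        <str name="oc_licenseID">VALID_OPENCALAIS_KEY</str>
      </lst>
      <!-- Classpath location of the AE descriptor (assumed example path). -->
      <str name="analysisEngine">/org/apache/uima/desc/OverridingParamsExtServicesAE.xml</str>
      <!-- Input fields to analyze; merge=false analyzes each field separately. -->
      <lst name="analyzeFields">
        <bool name="merge">false</bool>
        <arr name="fields">
          <str>text</str>
        </arr>
      </lst>
      <!-- For each UIMA type: which feature goes into which Solr field. -->
      <lst name="fieldMappings">
        <lst name="type">
          <str name="name">org.apache.uima.alchemy.ts.concept.ConceptFS</str>
          <lst name="mapping">
            <str name="feature">text</str>
            <str name="field">concept</str>
          </lst>
        </lst>
        <lst name="type">
          <str name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
          <lst name="mapping">
            <str name="feature">language</str>
            <str name="field">language</str>
          </lst>
        </lst>
        <lst name="type">
          <str name="name">org.apache.uima.SentenceAnnotation</str>
          <lst name="mapping">
            <str name="feature">coveredText</str>
            <str name="field">sentence</str>
          </lst>
        </lst>
      </lst>
    </lst>
  </processor>
  <!-- Standard processors that log and then apply the update. -->
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

The first two fragments belong at the top level of solrconfig.xml and schema.xml respectively; each field named in a mapping must be declared in the schema, otherwise indexing the enriched document fails.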

4 Comments

  1. The 2nd to last code block on this page is another place where large blank space is being inserted into the example in the PDF.

  2. I wonder whether this documentation has been updated for Solr 5.

    These texts should be updated to make the documentation compatible with Solr 5:

    solr-uima-4.x.y.jar (under /solr-4.x.y/dist/) -> solr-uima-5.x.y.jar (under /solr-5.x.y/dist/)

    <requestHandler name="/update" class="solr.XmlUpdateRequestHandler"> -> <requestHandler name="/update" class="solr.UpdateRequestHandler">

  3. Alchemy API has been deprecated by IBM and starting today, it will not be offered in their catalog. More details are available at https://www.ibm.com/watson/developercloud/doc/alchemydata-news/index.html