Apache Solr Documentation


*** As of June 2017, the latest Solr Ref Guide is located at https://lucene.apache.org/solr/guide ***

Please note comments on these pages have now been disabled for all users.


Managed resources expose a REST API endpoint for performing Create-Read-Update-Delete (CRUD) operations on a Solr object. Any long-lived Solr object that has configuration settings and/or data is a good candidate to be a managed resource. Managed resources complement other programmatically manageable components in Solr, such as the RESTful schema API for adding fields to a managed schema.

Consider a Web-based UI that offers Solr-as-a-Service, where users need to configure a set of stop words and synonym mappings as part of the initial setup process for their search application. This use case is supported by the Managed Stop Filter and Managed Synonym Filter Factories provided by Solr, through the managed resources REST API.

Users can also write their own custom plugins that leverage the same internal hooks to make additional resources REST-managed.

All of the examples in this section assume you are running the "techproducts" Solr example:

Overview

Let's begin learning about managed resources by looking at a couple of examples provided by Solr for managing stop words and synonyms using a REST API. After reading this section, you'll be ready to dig into the details of how managed resources are implemented in Solr so you can start building your own implementation.

Stop words

To begin, you need to define a field type that uses the ManagedStopFilterFactory, such as:

There are two important things to notice about this field type definition. First, the filter implementation class is solr.ManagedStopFilterFactory. This is a special implementation of the StopFilterFactory that uses a set of stop words that are managed from a REST API. Second, the managed="english" attribute gives a name to the set of managed stop words, in this case indicating the stop words are for English text.

The REST endpoint for managing the English stop words in the techproducts collection is: /solr/techproducts/schema/analysis/stopwords/english.

The example resource path should be mostly self-explanatory. Note that the ManagedStopFilterFactory implementation determines the /schema/analysis/stopwords part of the path, which makes sense because this is an analysis component defined by the schema. It follows that a field type that uses the following filter:

would resolve to path: /solr/techproducts/schema/analysis/stopwords/french.

So now let’s see this API in action, starting with a simple GET request:
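As a minimal sketch, the GET request could be built with Python's standard library as shown here. The base URL (a local Solr on port 8983 serving the techproducts collection) and the helper names stopwords_request and fetch_json are assumptions made for illustration; they are not part of Solr.

```python
import json
import urllib.request

# Assumption: Solr runs locally on port 8983 with the techproducts example.
SOLR_BASE = "http://localhost:8983/solr/techproducts"


def stopwords_request(name: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a managed stop word set."""
    url = f"{SOLR_BASE}/schema/analysis/stopwords/{name}"
    return urllib.request.Request(url, method="GET")


def fetch_json(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body (needs a live Solr)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


req = stopwords_request("english")
print(req.get_method(), req.full_url)
```

Calling fetch_json(req) against a running server would return the parsed response document; swapping "english" for "french" addresses the second set mentioned above.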

Assuming you sent this request to Solr, the response body is a JSON document:

The sample_techproducts_configs config set ships with a pre-built set of managed stop words; however, you should only interact with this file using the API and not edit it directly.

One thing that should stand out to you in this response is that it contains a managedList of words as well as initArgs. This is an important concept in this framework: managed resources typically have configuration and data. For stop words, the only configuration parameter is a boolean that determines whether to ignore the case of tokens during stop word filtering (ignoreCase=true|false). The data is a list of words, which is represented as a JSON array named managedList in the response.

Now, let’s add a new word to the English stop word list using an HTTP PUT:

Here we’re using cURL to PUT a JSON list containing a single word “foo” to the managed English stop words set. Solr will return 200 if the request was successful. You can also put multiple words in a single PUT request.
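For readers who prefer a scripted client over cURL, the same PUT can be sketched in Python. The base URL and the helper name add_stopwords_request are assumptions for illustration only.

```python
import json
import urllib.request

# Assumption: Solr runs locally on port 8983 with the techproducts example.
SOLR_BASE = "http://localhost:8983/solr/techproducts"


def add_stopwords_request(name, words):
    """Build a PUT request whose body is a JSON list of stop words to add."""
    body = json.dumps(list(words)).encode("utf-8")
    return urllib.request.Request(
        f"{SOLR_BASE}/schema/analysis/stopwords/{name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# A single word, as in the example above.
req = add_stopwords_request("english", ["foo"])
print(req.get_method(), req.full_url, req.data)
```

Passing several words, for example add_stopwords_request("english", ["foo", "bar"]), produces one request that adds them all; urllib.request.urlopen(req) would send it to a running server.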

You can test to see if a specific word exists by sending a GET request for that word as a child resource of the set, such as:

This request will return a status code of 200 if the child resource (foo) exists, or 404 if it does not exist in the managed list.

To delete a stop word, you would do:
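Both the existence check and the delete address a word as a child resource of the set. The sketch below assumes a local Solr at port 8983; the helper names, and the injectable opener argument that lets the logic run without a server, are illustrative choices rather than Solr features.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: Solr runs locally on port 8983 with the techproducts example.
SOLR_BASE = "http://localhost:8983/solr/techproducts"


def child_url(name, word):
    """URL of a single word addressed as a child of a managed stop word set."""
    encoded = urllib.parse.quote(word, safe="")
    return f"{SOLR_BASE}/schema/analysis/stopwords/{name}/{encoded}"


def stopword_exists(name, word, opener=urllib.request.urlopen):
    """Return True on HTTP 200 and False on HTTP 404; re-raise other errors."""
    request = urllib.request.Request(child_url(name, word), method="GET")
    try:
        with opener(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def delete_stopword_request(name, word):
    """Build a DELETE request that removes one word from the managed set."""
    return urllib.request.Request(child_url(name, word), method="DELETE")


req = delete_stopword_request("english", "foo")
print(req.get_method(), req.full_url)
```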

Note: PUT/POST adds terms to an existing list rather than replacing the list entirely. Adding a term is more common than replacing a whole list, so the API favors incremental additions, especially since deleting individual terms is also supported.

Synonyms

For the most part, the API for managing synonyms behaves similarly to the API for stop words, except that instead of working with a list of words, it uses a map, where the value for each entry in the map is a set of synonyms for a term. As with stop words, the sample_techproducts_configs config set includes a pre-built set of synonym mappings suitable for the sample data, which is activated by the following field type definition in schema.xml:

To get the map of managed synonyms, send a GET request to:

This request will return a response that looks like:

Managed synonyms are returned under the managedMap property, which contains a JSON map where the value of each entry is a set of synonyms for a term; in the example above, "happy" has the synonyms "glad" and "joyful".

To add a new synonym mapping, you can PUT/POST a single mapping such as:

The API will return status code 200 if the PUT request was successful. To determine the synonyms for a specific term, send a GET request for the child resource; for example, /schema/analysis/synonyms/english/mad would return ["angry","upset"].
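A request that adds the "mad" mapping could be sketched as follows. The base URL and the helper name add_synonyms_request are assumptions for illustration; the payload is a JSON map from a term to its list of synonyms.

```python
import json
import urllib.request

# Assumption: Solr runs locally on port 8983 with the techproducts example.
SOLR_BASE = "http://localhost:8983/solr/techproducts"


def add_synonyms_request(name, mapping):
    """Build a PUT request whose body maps each term to a list of synonyms."""
    body = json.dumps(mapping).encode("utf-8")
    return urllib.request.Request(
        f"{SOLR_BASE}/schema/analysis/synonyms/{name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = add_synonyms_request("english", {"mad": ["angry", "upset"]})
print(req.get_method(), req.full_url, req.data)
```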

You can also PUT a list of symmetric synonyms, which will be expanded into a mapping for each term in the list. For example, you could PUT the following list of symmetric synonyms using the JSON list syntax instead of a map:

Note that the expansion is performed when processing the PUT request, so the underlying persistent state is still a managed map. Consequently, if after sending the previous PUT request you did a GET for /schema/analysis/synonyms/english/jocular, you would receive a list containing ["funny", "entertaining", "whimsical"]. Once you've created synonym mappings using a list, each term must be managed separately.
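The expansion described above can be illustrated with a few lines of Python. This is only a model of the documented behavior, not Solr's implementation; the function name expand_symmetric is hypothetical, and the input list is inferred from the result quoted above.

```python
def expand_symmetric(terms):
    """Map every term to the other terms in the list, keeping list order."""
    return {term: [other for other in terms if other != term] for term in terms}


# Assumed input list: the quoted result for "jocular" plus "jocular" itself.
mapping = expand_symmetric(["funny", "entertaining", "whimsical", "jocular"])
print(mapping["jocular"])  # ['funny', 'entertaining', 'whimsical']
```

Because the stored state is a map with one entry per term, later edits have to target each term's entry individually.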

Lastly, you can delete a mapping by sending a DELETE request to the managed endpoint.

Applying Changes

Changes made to managed resources via this REST API are not applied to the active Solr components until the Solr collection (or Solr core in single server mode) is reloaded. For example, after adding or deleting a stop word, you must reload the core/collection before the change becomes active. The related APIs are the CoreAdmin API and the Collections API.
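As a sketch, the reload URLs for the two APIs can be built like this. The host and port are assumptions; the RELOAD action with its core and name parameters belongs to the CoreAdmin and Collections APIs respectively, and the helper names are illustrative.

```python
from urllib.parse import urlencode

# Assumption: Solr runs locally on port 8983.
SOLR_ROOT = "http://localhost:8983/solr"


def core_reload_url(core):
    """CoreAdmin API reload URL for a core in single server mode."""
    query = urlencode({"action": "RELOAD", "core": core})
    return f"{SOLR_ROOT}/admin/cores?{query}"


def collection_reload_url(collection):
    """Collections API reload URL for a collection in distributed mode."""
    query = urlencode({"action": "RELOAD", "name": collection})
    return f"{SOLR_ROOT}/admin/collections?{query}"


print(core_reload_url("techproducts"))
print(collection_reload_url("techproducts"))
```

Issuing a GET to the appropriate URL after a batch of edits applies all of them at once.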

This approach is required when running in distributed mode so that changes are applied to all cores in a collection at the same time, which keeps behavior consistent and predictable. You don't want one of your replicas working with a different set of stop words or synonyms than the others.

One subtle outcome of this apply-changes-at-reload approach is that once you make changes with the API, there is no way to read the active data. In other words, the API returns the most up-to-date data from an API perspective, which could differ from what Solr components are currently using. However, the intent of this API implementation is that changes are applied with a reload shortly after they are made, so the window in which the data returned by the API differs from what is active in the server should be negligible.

Changing things like stop words and synonym mappings typically requires re-indexing existing documents if they are used by index-time analyzers. The RestManager framework does not guard you from this; it simply makes it possible to programmatically build up a set of stop words, synonyms, etc.

RestManager Endpoint

Metadata about registered ManagedResources is available using the /schema/managed endpoint for each collection. Assuming you have the managed_en field type shown above defined in your schema.xml, sending a GET request to the following resource will return metadata about which schema-related resources are being managed by the RestManager:

The response body is a JSON document containing metadata about managed resources under the /schema root:

You can also create a new managed resource using PUT/POST to the appropriate URL, before ever configuring anything that uses the resource.

For example, imagine we want to build up a set of German stop words. Before we can start adding stop words, we need to create the endpoint:

/solr/techproducts/schema/analysis/stopwords/german

To create this endpoint, send the following PUT/POST request to the endpoint we wish to create:

Solr will respond with status code 200 if the request is successful. Effectively, this action registers a new endpoint for a managed resource in the RestManager. From here you can start adding German stop words as we saw above:
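The two steps, registering the endpoint and then adding words to it, might be scripted as follows. This is a sketch under assumptions: the base URL is a local Solr on port 8983, the helper names are hypothetical, and the class name in the creation body is assumed to be Solr's managed word set resource class, which should be verified against your Solr version. The German words are sample data.

```python
import json
import urllib.request

# Assumption: Solr runs locally on port 8983 with the techproducts example.
SOLR_BASE = "http://localhost:8983/solr/techproducts"
# Assumption: resource class used for managed word sets; check your Solr version.
WORD_SET_CLASS = "org.apache.solr.rest.schema.analysis.ManagedWordSetResource"


def _put_json(url, payload):
    """Build a PUT request carrying a JSON-encoded payload."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def create_stopwords_resource_request(name):
    """Step 1: register a new managed stop word endpoint with the RestManager."""
    url = f"{SOLR_BASE}/schema/analysis/stopwords/{name}"
    return _put_json(url, {"class": WORD_SET_CLASS})


def add_stopwords_request(name, words):
    """Step 2: add words to the endpoint created in step 1."""
    url = f"{SOLR_BASE}/schema/analysis/stopwords/{name}"
    return _put_json(url, list(words))


create_req = create_stopwords_resource_request("german")
add_req = add_stopwords_request("german", ["die", "und"])
print(create_req.full_url, create_req.data)
print(add_req.full_url, add_req.data)
```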

For most users, creating resources in this way should never be necessary, since managed resources are created automatically when configured.

However, you may want to explicitly delete managed resources if they are no longer being used by a Solr component.

For instance, the managed resource for German that we created above can be deleted because there are no Solr components that are using it, whereas the managed resource for English stop words cannot be deleted because there is a token filter declared in schema.xml that is using it.

Related Topics

 


21 Comments

  1. FYI: The initial version of this page was a much longer adaptation of Tim's blog post that went into a lot more low-level detail and discussed some of the design choices.

    Tim & I talked about it in IRC, and in order to target a more user-based audience and make the tone more reference-like, we wound up cutting out a lot of info.

  2. Hoss Man, is there a possible way to upload a dictionary for use with a suggester through the REST manager?

  3. We should mention what file names these things map to, so that people are not surprised when they look on the filesystem.

  4. Does /config/managed actually work? Both the original article and this one mention it, but (at least on 5.5) it returns a 404 error. Neither article gives an example of what it is supposed to return.

    1. It works for me with Solr 6.0.1.

    2. It doesn't seem to work for me on solr 6.2 either. /schema/managed works fine.

      1. It doesn't work for me either. I see it referenced in the test (TestRestManager.java), but the test actually only tests schema/managed. I would suggest filing an issue, or we can just remove it from documentation.

        1. I can't figure out from the history who added it in either. Nor, what it is supposed to do. So, I suspect we can just delete it.

          1. I removed it - just one reference.

  5. "schema.xml" is mentioned in this section.  But I no longer see "schema.xml" in Solr 6.0.1. It seems to have been renamed to "managed-schema" (without .xml suffix; I wonder why). This document needs to be updated to mention the new name, if the new name is here to stay.

    Also, the managed-schema file itself has this comment:

     This file should be named "schema.xml" and 
    should be in the conf directory under the solr home
    (i.e. ./solr/conf/schema.xml by default)

    This needs to be updated.  The wiki page the file refers to, http://wiki.apache.org/solr/SchemaXml, no longer seems valid.

     

  6. Hello everyone!

    We are trying to perform this curl command:

    curl -X POST -H "Content-type:application/json" -u "USERNAME":"PASSWORD" "https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/CLUSTER_ID/solr/COLLECTION_NAME/schema/analysis/synonyms/english" --data-binary "["ARS","Argentinian Peso"]"

    The result:

    {
    "responseHeader": {
    "status": 500,
    "QTime": 2
    },
    "error": {
    "msg": "Expected ',' or ']': char=(EOF),position=16 BEFORE='[ARS,Argentinian'",
    "trace": "org.noggit.JSONParser$ParseException: Expected ',' or ']': char=(EOF),position=16 BEFORE='[ARS,Argentinian'\n\tat org.noggit.JSONParser.err(JSONParser.java:356)\n\tat org.noggit.JSONParser.nextEvent(JSONParser.java:983)\n\tat org.noggit.ObjectBuilder.getArray(ObjectBuilder.java:149)\n\tat org.noggit.ObjectBuilder.getVal(ObjectBuilder.java:59)\n\tat org.noggit.ObjectBuilder.getVal(ObjectBuilder.java:37)\n\tat org.noggit.ObjectBuilder.fromJSON(ObjectBuilder.java:33)\n\tat org.apache.solr.rest.RestManager$ManagedEndpoint.parseJsonFromRequestBody(RestManager.java:425)\n\tat org.apache.solr.rest.RestManager$ManagedEndpoint.post(RestManager.java:351)\n\tat org.restlet.resource.ServerResource.doHandle(ServerResource.java:454)\n\tat org.restlet.resource.ServerResource.doConditionalHandle(ServerResource.java:359)\n\tat org.restlet.resource.ServerResource.handle(ServerResource.java:1044)\n\tat org.restlet.resource.Finder.handle(Finder.java:236)\n\tat org.restlet.routing.Filter.doHandle(Filter.java:150)\n\tat org.restlet.routing.Filter.handle(Filter.java:197)\n\tat org.restlet.routing.Router.doHandle(Router.java:422)\n\tat org.restlet.routing.Router.handle(Router.java:639)\n\tat org.restlet.routing.Filter.doHandle(Filter.java:150)\n\tat org.restlet.routing.Filter.handle(Filter.java:197)\n\tat org.restlet.routing.Filter.doHandle(Filter.java:150)\n\tat org.restlet.routing.Filter.handle(Filter.java:197)\n\tat org.restlet.routing.Filter.doHandle(Filter.java:150)\n\tat org.restlet.engine.application.StatusFilter.doHandle(StatusFilter.java:140)\n\tat org.restlet.routing.Filter.handle(Filter.java:197)\n\tat org.restlet.routing.Filter.doHandle(Filter.java:150)\n\tat org.restlet.routing.Filter.handle(Filter.java:197)\n\tat org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:202)\n\tat org.restlet.engine.application.ApplicationHelper.handle(ApplicationHelper.java:75)\n\tat org.restlet.Application.handle(Application.java:385)\n\tat 
org.restlet.routing.Filter.doHandle(Filter.java:150)\n\tat org.restlet.routing.Filter.handle(Filter.java:197)\n\tat org.restlet.routing.Router.doHandle(Router.java:422)\n\tat org.restlet.routing.Router.handle(Router.java:639)\n\tat org.restlet.routing.Filter.doHandle(Filter.java:150)\n\tat org.restlet.routing.Filter.handle(Filter.java:197)\n\tat org.restlet.routing.Router.doHandle(Router.java:422)\n\tat org.restlet.routing.Router.handle(Router.java:639)\n\tat org.restlet.routing.Filter.doHandle(Filter.java:150)\n\tat org.restlet.routing.Filter.handle(Filter.java:197)\n\tat org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:202)\n\tat org.restlet.Component.handle(Component.java:408)\n\tat org.restlet.Server.handle(Server.java:507)\n\tat org.restlet.engine.connector.ServerHelper.handle(ServerHelper.java:63)\n\tat org.restlet.engine.adapter.HttpServerHelper.handle(HttpServerHelper.java:143)\n\tat org.restlet.ext.servlet.ServerServlet.service(ServerServlet.java:1117)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:808)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:595)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:191)\n\tat 
org.eclipse.jetty.server.Dispatcher.forward(Dispatcher.java:72)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:266)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:499)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)\n\tat org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)\n\tat java.lang.Thread.run(Thread.java:745)\n",
    "code": 500
    }
    }
    curl: (3) [globbing] unmatched close brace/bracket in column 5

    Do you have any idea why? The API doesn't support "multi-term" synonyms?

    Thank you for your help.

    1. Daniel, please ask questions like this on the user list; for access/subscription information, see https://lucene.apache.org/solr/resources.html#mailing-lists. In this case, it looks like you have quoting issues: I'm guessing your shell is stripping some quotes before giving values to curl. Try using single quotes around the data-binary param:

      --data-binary '["ARS","Argentinian Peso"]'


  7. Hi!

    Is it possible to extract all synonyms added as managed resources as CSV?

    Thanks.

  8. Link at related topics is dead.

  9. Hi all!

    I will need your help guys. 

    We now need to know whether the following definition is correct for applying synonyms at query time, configured as managed resources, for the watson_text type, which is the type used for indexed fields.

    <fieldType name="watson_text" class="com.ibm.watson.hector.plugins.fieldtype.WatsonTextField" omitNorms="false" omitTermFreqAndPositions="false" indexed="true" termOffsets="true" stored="true" termPositions="true" termVectors="true">
    <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="lang/protwords_en.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
    <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ManagedSynonymFilterFactory" managed="english"/>
    <filter class="solr.StopFilterFactory" words="lang/stopwords_en.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="lang/protwords_en.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
    </fieldType>

     

    Thank you, I'll wait your answer asap.

    Cheers,

    DM

    1. Daniel, please use the mailing list for questions like this.  See my reply to your previous comment on this page for more information.

      1. Hi Steve Rowe !

        Where can I see your reply? Sorry for the post, I'll use the mailing list next time!

        Best,
        DM 

        1. Scroll up or search on this page in your browser?