Apache Solr Documentation


This Unreleased Guide Will Cover Apache Solr 6.6


The Schema API utilizes the ManagedIndexSchemaFactory class, which is the default schema factory in modern Solr versions.  See the section Schema Factory Definition in SolrConfig for more information about choosing a schema factory for your index.

This API provides read and write access to the Solr schema for each collection (or core, when using standalone Solr). Read access to all schema elements is supported.  Fields, dynamic fields, field types and copyField rules may be added, removed or replaced. Future Solr releases will extend write access to allow more schema elements to be modified.

Why is hand editing of the managed schema discouraged?

The file named "managed-schema" in the example configurations may include a note that recommends never hand-editing the file.  Before the Schema API existed, such edits were the only way to make changes to the schema, and users may have a strong desire to continue making changes this way.

Hand-editing is discouraged because manual edits may be lost if the Schema API described here is later used to make a change, unless the core or collection is reloaded or Solr is restarted before the Schema API is used.  If you always reload or restart after a manual edit, such edits cause no problems.

The API supports two output modes for all calls: JSON or XML. When requesting the complete schema, a third output mode is available: XML modeled after the managed-schema file itself.

When modifying the schema with the API, a core reload will automatically occur in order for the changes to be available immediately for documents indexed thereafter.  Previously indexed documents will not be automatically updated - they must be re-indexed if existing index data uses schema elements that you changed.

Re-index after schema modifications!

If you modify your schema, you will likely need to re-index all documents. If you do not, you may lose access to documents, or not be able to interpret them properly, e.g. after replacing a field type.

Modifying your schema will never modify any documents that are already indexed. You must re-index documents in order to apply schema changes to them.  Queries and updates made after the change may encounter errors that were not present before the change.  Completely deleting the index and rebuilding it is usually the only option to fix such errors.

The base address for the API is http://<host>:<port>/solr/<collection_name>. For example, if you run Solr's "cloud" example (via the bin/solr command), which creates a "gettingstarted" collection, then the base URL for that collection (as in all the sample URLs in this section) would be http://localhost:8983/solr/gettingstarted.

 

API Entry Points

/schema: retrieve the schema, or modify the schema to add, remove, or replace fields, dynamic fields, copy fields, or field types
/schema/fields: retrieve information about all defined fields or a specific named field
/schema/dynamicfields: retrieve information about all dynamic field rules or a specific named dynamic rule
/schema/fieldtypes: retrieve information about all field types or a specific field type
/schema/copyfields: retrieve information about copy fields
/schema/name: retrieve the schema name
/schema/version: retrieve the schema version
/schema/uniquekey: retrieve the defined uniqueKey
/schema/similarity: retrieve the global similarity definition
/schema/solrqueryparser/defaultoperator: retrieve the default operator

Modify the Schema

POST /collection/schema

To add, remove or replace fields, dynamic field rules, copy field rules, or field types, send a POST request to the /collection/schema/ endpoint with a sequence of commands to perform the requested actions. The following commands are supported:

  • add-field: add a new field with parameters you provide.
  • delete-field: delete a field.
  • replace-field: replace an existing field with one that is differently configured.
  • add-dynamic-field: add a new dynamic field rule with parameters you provide.
  • delete-dynamic-field: delete a dynamic field rule.
  • replace-dynamic-field: replace an existing dynamic field rule with one that is differently configured.
  • add-field-type: add a new field type with parameters you provide.
  • delete-field-type: delete a field type.
  • replace-field-type: replace an existing field type with one that is differently configured.
  • add-copy-field: add a new copy field rule.
  • delete-copy-field: delete a copy field rule.

These commands can be issued in separate POST requests or in the same POST request. Commands are executed in the order in which they are specified.

In each case, the response will include the status and the time to process the request, but will not include the entire schema.

When modifying the schema with the API, a core reload will automatically occur in order for the changes to be available immediately for documents indexed thereafter.  Previously indexed documents will not be automatically handled - they must be re-indexed if they used schema elements that you changed.

Add a New Field

The add-field command adds a new field definition to your schema. If a field with the same name exists, an error is thrown.

All of the properties available when defining a field with manual schema.xml edits can be passed via the API. These request attributes are described in detail in the section Defining Fields.

For example, to define a new stored field named "sell-by", of type "tdate", you would POST the following request:
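A minimal Python sketch of such a request, assuming the "gettingstarted" base URL used in this section. The helper name build_schema_post is hypothetical, and the request is only built here, not sent:

```python
import json
import urllib.request

# Assumed endpoint: the "gettingstarted" collection on a local Solr.
SCHEMA_URL = "http://localhost:8983/solr/gettingstarted/schema"


def build_schema_post(commands):
    """Build, without sending, a JSON POST carrying Schema API commands."""
    return urllib.request.Request(
        SCHEMA_URL,
        data=json.dumps(commands).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# add-field command for a stored "sell-by" field of type "tdate".
request = build_schema_post(
    {"add-field": {"name": "sell-by", "type": "tdate", "stored": True}}
)
# With Solr running, urllib.request.urlopen(request) would submit it.
```

Building the request separately from sending it makes the JSON body easy to inspect before it reaches Solr.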

Delete a Field

The delete-field command removes a field definition from your schema. If the field does not exist in the schema, or if the field is the source or destination of a copy field rule, an error is thrown. 

For example, to delete a field named "sell-by", you would POST the following request:
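As a sketch, the JSON body for that request could be produced in Python as follows. The variable name is illustrative, and the body would be POSTed to the collection's /schema endpoint with a JSON content type:

```python
import json

# delete-field only needs the name of the field to remove.
delete_field_body = json.dumps({"delete-field": {"name": "sell-by"}})
```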

Replace a Field

The replace-field command replaces a field's definition.  Note that you must supply the full definition for a field - this command will not partially modify a field's definition.  If the field does not exist in the schema, an error is thrown.

All of the properties available when defining a field with manual schema.xml edits can be passed via the API. These request attributes are described in detail in the section Defining Fields.

For example, to replace the definition of an existing field "sell-by", to make it be of type "date" and to not be stored, you would POST the following request:
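A possible request body, sketched in Python (the variable name is illustrative). Because replace-field takes a full definition, every property the field should keep has to be listed, not just the ones that change:

```python
import json

# Full replacement definition: type "date", not stored.
replace_field_body = json.dumps(
    {"replace-field": {"name": "sell-by", "type": "date", "stored": False}}
)
```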

Add a Dynamic Field Rule

The add-dynamic-field command adds a new dynamic field rule to your schema. 

All of the properties available when editing schema.xml can be passed with the POST request. The section Dynamic Fields has details on all of the attributes that can be defined for a dynamic field rule.

For example, to create a new dynamic field rule where all incoming fields ending with "_s" would be stored and have field type "string", you can POST a request like this:
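One way to build that request body in Python, as a sketch. It assumes the rule's name is the glob pattern "*_s", and the variable name is illustrative:

```python
import json

# Dynamic field rule: any field ending in "_s" is a stored string.
add_dynamic_field_body = json.dumps(
    {"add-dynamic-field": {"name": "*_s", "type": "string", "stored": True}}
)
```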

Delete a Dynamic Field Rule

The delete-dynamic-field command deletes a dynamic field rule from your schema. If the dynamic field rule does not exist in the schema, or if the schema contains a copy field rule with a source or destination that matches only this dynamic field rule, an error is thrown.

For example, to delete a dynamic field rule matching "*_s", you can POST a request like this:
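A sketch of the corresponding body in Python, with an illustrative variable name. The rule is identified by its pattern:

```python
import json

# The rule to delete is named by its glob pattern.
delete_dynamic_field_body = json.dumps({"delete-dynamic-field": {"name": "*_s"}})
```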

Replace a Dynamic Field Rule

The replace-dynamic-field command replaces a dynamic field rule in your schema.  Note that you must supply the full definition for a dynamic field rule - this command will not partially modify a dynamic field rule's definition.  If the dynamic field rule does not exist in the schema, an error is thrown.

All of the properties available when editing schema.xml can be passed with the POST request. The section Dynamic Fields has details on all of the attributes that can be defined for a dynamic field rule.

For example, to replace the definition of the "*_s" dynamic field rule with one where the field type is "text_general" and it's not stored, you can POST a request like this:
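Sketched in Python, the body might look like this (the variable name is illustrative):

```python
import json

# Full replacement for the "*_s" rule: text_general, not stored.
replace_dynamic_field_body = json.dumps(
    {"replace-dynamic-field": {"name": "*_s", "type": "text_general", "stored": False}}
)
```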

Add a New Field Type

The add-field-type command adds a new field type to your schema. 

All of the field type properties available when editing schema.xml by hand are available for use in a POST request. The structure of the command is a JSON mapping of the standard field type definition, including the name, class, index and query analyzer definitions, etc. Details of all of the available options are described in the section Solr Field Types.

For example, to create a new field type named "myNewTxtField", you can POST a request as follows:
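A minimal sketch of such a body in Python. The tokenizer and filter classes are illustrative choices rather than a recommended analysis chain, and the variable names are hypothetical:

```python
import json

add_field_type = {
    "add-field-type": {
        "name": "myNewTxtField",
        "class": "solr.TextField",
        # A single analyzer section, used for both indexing and querying.
        # The tokenizer and filter below are illustrative assumptions.
        "analyzer": {
            "tokenizer": {"class": "solr.WhitespaceTokenizerFactory"},
            "filters": [{"class": "solr.LowerCaseFilterFactory"}],
        },
    }
}
add_field_type_body = json.dumps(add_field_type, indent=2)
```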

Note in this example that we have only defined a single analyzer section that will apply to both index analysis and query analysis. To define separate analysis, replace the analyzer section in the above example with separate sections for indexAnalyzer and queryAnalyzer, as in this example:
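A sketch of a field type with separate index and query analysis, written as a Python dictionary. The tokenizer classes are illustrative assumptions and the variable names are hypothetical:

```python
import json

add_field_type_split = {
    "add-field-type": {
        "name": "myNewTxtField",
        "class": "solr.TextField",
        # Index-time analysis (illustrative tokenizer choice).
        "indexAnalyzer": {
            "tokenizer": {"class": "solr.StandardTokenizerFactory"},
        },
        # Query-time analysis (illustrative tokenizer choice).
        "queryAnalyzer": {
            "tokenizer": {"class": "solr.KeywordTokenizerFactory"},
        },
    }
}
add_field_type_split_body = json.dumps(add_field_type_split, indent=2)
```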

Delete a Field Type

The delete-field-type command removes a field type from your schema.  If the field type does not exist in the schema, or if any field or dynamic field rule in the schema uses the field type, an error is thrown.  

For example, to delete the field type named "myNewTxtField", you can make a POST request as follows:
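The body for this request can be sketched in Python as follows (the variable name is illustrative):

```python
import json

# delete-field-type only needs the name of the type to remove.
delete_field_type_body = json.dumps({"delete-field-type": {"name": "myNewTxtField"}})
```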

Replace a Field Type

The replace-field-type command replaces a field type in your schema.  Note that you must supply the full definition for a field type - this command will not partially modify a field type's definition.  If the field type does not exist in the schema, an error is thrown.

All of the field type properties available when editing schema.xml by hand are available for use in a POST request. The structure of the command is a JSON mapping of the standard field type definition, including the name, class, index and query analyzer definitions, etc. Details of all of the available options are described in the section Solr Field Types.

For example, to replace the definition of a field type named "myNewTxtField", you can make a POST request as follows:
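A sketch of a replacement definition in Python. The analyzer shown is an illustrative assumption, and the whole definition is supplied because the command does not merge with the existing one:

```python
import json

replace_field_type = {
    "replace-field-type": {
        "name": "myNewTxtField",
        "class": "solr.TextField",
        # Illustrative analyzer; the full definition must be repeated here.
        "analyzer": {
            "tokenizer": {"class": "solr.StandardTokenizerFactory"},
        },
    }
}
replace_field_type_body = json.dumps(replace_field_type, indent=2)
```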

Add a New Copy Field Rule

The add-copy-field command adds a new copy field rule to your schema.

The attributes supported by the command are the same as when creating copy field rules by manually editing the schema.xml, as below: 

| Name | Required | Description |
|---|---|---|
| source | Yes | The source field. |
| dest | Yes | A field or an array of fields to which the source field will be copied. |
| maxChars | No | The upper limit for the number of characters to be copied. The section Copying Fields has more details. |

For example, to define a rule to copy the field "shelf" to the "location" and "catchall" fields, you would POST the following request:
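A sketch of that request body in Python, passing dest as an array to name both destinations (the variable name is illustrative):

```python
import json

# One source field copied to two destination fields.
add_copy_field_body = json.dumps(
    {"add-copy-field": {"source": "shelf", "dest": ["location", "catchall"]}}
)
```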

Delete a Copy Field Rule

The delete-copy-field command deletes a copy field rule from your schema.  If the copy field rule does not exist in the schema, an error is thrown.

The source and dest attributes are required by this command.

For example, to delete a rule to copy the field "shelf" to the "location" field, you would POST the following request:
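Sketched in Python, with both required attributes supplied (the variable name is illustrative):

```python
import json

# Both source and dest are needed to identify the rule to delete.
delete_copy_field_body = json.dumps(
    {"delete-copy-field": {"source": "shelf", "dest": "location"}}
)
```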

Multiple Commands in a Single POST

It is possible to perform one or more commands in a single POST request. The API is transactional: all commands in a single call either succeed or fail together.

The commands are executed in the order in which they are specified. This means that if you want to create a new field type and in the same request use the field type on a new field, the section of the request that creates the field type must come before the section that creates the new field. Similarly, since a field must exist for it to be used in a copy field rule, a request to add a field must come before a request for the field to be used as either the source or the destination for a copy field rule.

The syntax for making multiple requests supports several approaches. First, the commands can simply be made serially, as in this request to create a new field type and then a field that uses that type:
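A sketch of the serial form in Python. The field name "sell-by-note" and the analyzer are hypothetical; what matters is that the field type command precedes the field that uses it. Python dictionaries preserve insertion order, so json.dumps emits the commands in the order written:

```python
import json

serial_commands = {
    # The field type must be created first...
    "add-field-type": {
        "name": "myNewTxtField",
        "class": "solr.TextField",
        "analyzer": {"tokenizer": {"class": "solr.WhitespaceTokenizerFactory"}},
    },
    # ...so that this field (hypothetical name) can refer to it.
    "add-field": {"name": "sell-by-note", "type": "myNewTxtField", "stored": True},
}
serial_body = json.dumps(serial_commands)
```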

Or, the same command can be repeated, as in this example:
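A sketch of the repeated-command form. A Python dictionary cannot hold the same key twice, so the body is written as raw JSON text here; the field names and type are illustrative. Parsing with object_pairs_hook shows that both commands are present in the text, while a plain parse keeps only the last one:

```python
import json

# Raw JSON body with the "add-field" key repeated.
repeated_body = """{
  "add-field": {"name": "shelf", "type": "myNewTxtField", "stored": true},
  "add-field": {"name": "location", "type": "myNewTxtField", "stored": true}
}"""

# Keep every key/value pair in order instead of collapsing duplicates.
command_pairs = json.loads(repeated_body, object_pairs_hook=list)
command_names = [name for name, _ in command_pairs]

# A plain parse into a dict silently drops the first "add-field".
collapsed = json.loads(repeated_body)
```

Client code that round-trips such a body through a dictionary would lose commands, which is one reason to prefer the array form where it is available.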

Finally, repeated commands can be sent as an array:
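A sketch of the array form in Python, where one command key carries a list of definitions (field names and type are illustrative):

```python
import json

# One "add-field" command whose value is an array of field definitions.
array_body = json.dumps(
    {
        "add-field": [
            {"name": "shelf", "type": "myNewTxtField", "stored": True},
            {"name": "location", "type": "myNewTxtField", "stored": True},
        ]
    }
)
```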

Schema Changes among Replicas

When running in SolrCloud mode, changes made to the schema on one node will propagate to all replicas in the collection. You can pass the updateTimeoutSecs parameter with your request to set the number of seconds to wait until all replicas confirm they applied the schema updates. This helps your client application be more robust in that you can be sure that all replicas have a given schema change within a defined amount of time.

If agreement is not reached by all replicas in the specified time, the request fails and the error message will include information about which replicas had trouble. In most cases, the only option is to retry the change after waiting a brief amount of time. If the problem persists, you'll likely need to investigate the server logs on the replicas that had trouble applying the changes.

If you do not supply an updateTimeoutSecs parameter, the default behavior is for the receiving node to return immediately after persisting the updates to ZooKeeper. All other replicas will apply the updates asynchronously. Consequently, without supplying a timeout, your client application cannot be sure that all replicas have applied the changes.
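A small Python sketch of adding the timeout to the request URL. It assumes updateTimeoutSecs is passed as a query parameter on the /schema endpoint; the base URL and the helper name schema_update_url are illustrative:

```python
from urllib.parse import urlencode

# Assumed base URL for the "gettingstarted" collection.
BASE_URL = "http://localhost:8983/solr/gettingstarted"


def schema_update_url(timeout_secs=None):
    """Return the /schema URL, optionally with updateTimeoutSecs appended."""
    if timeout_secs is None:
        # No timeout: the receiving node returns without waiting for replicas.
        return f"{BASE_URL}/schema"
    query = urlencode({"updateTimeoutSecs": timeout_secs})
    return f"{BASE_URL}/schema?{query}"


url_with_timeout = schema_update_url(15)
```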

Retrieve Schema Information

The following endpoints allow you to read how your schema has been defined. You can GET the entire schema, or only portions of it as needed.

To modify the schema, see the previous section Modify the Schema.

Retrieve the Entire Schema

GET /collection/schema

INPUT

Path Parameters

| Key | Description |
|---|---|
| collection | The collection (or core) name. |

Query Parameters

The query parameters should be added to the API request after '?'.

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| wt | string | No | json | Defines the format of the response. The options are json, xml or schema.xml. If not specified, JSON will be returned by default. |

OUTPUT

Output Content

The output will include all fields, field types, dynamic rules and copy field rules, in the format requested (JSON or XML). The schema name and version are also included.

EXAMPLES

Get the entire schema in JSON.

 

Get the entire schema in XML.

 

Get the entire schema in "schema.xml" format.
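The three requests above differ only in the wt parameter. A Python sketch that builds the URLs, assuming the "gettingstarted" base URL; the helper name schema_url is hypothetical:

```python
from urllib.parse import urlencode

# Assumed base URL for the "gettingstarted" collection.
BASE_URL = "http://localhost:8983/solr/gettingstarted"


def schema_url(wt):
    """Return the URL for a GET of the entire schema in the given format."""
    return f"{BASE_URL}/schema?{urlencode({'wt': wt})}"


# One URL per supported output format.
schema_urls = [schema_url(wt) for wt in ("json", "xml", "schema.xml")]
```

Each URL can be fetched with any HTTP client, for example urllib.request.urlopen, once Solr is running.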

List Fields

GET /collection/schema/fields

GET /collection/schema/fields/fieldname

INPUT

Path Parameters

| Key | Description |
|---|---|
| collection | The collection (or core) name. |
| fieldname | The specific fieldname (if limiting request to a single field). |

Query Parameters

The query parameters can be added to the API request after a '?'.

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| wt | string | No | json | Defines the format of the response. The options are json or xml. If not specified, JSON will be returned by default. |
| fl | string | No | (all fields) | Comma- or space-separated list of one or more fields to return. If not specified, all fields will be returned by default. |
| includeDynamic | boolean | No | false | If true, and if the fl query parameter is specified or the fieldname path parameter is used, matching dynamic fields are included in the response and identified with the dynamicBase property. If neither the fl query parameter nor the fieldname path parameter is specified, the includeDynamic query parameter is ignored. If false, matching dynamic fields will not be returned. |
| showDefaults | boolean | No | false | If true, all default field properties from each field's field type will be included in the response (e.g. tokenized for solr.TextField). If false, only explicitly specified field properties will be included. |

OUTPUT

Output Content

The output will include each field and any defined configuration for each field. The defined configuration can vary for each field, but will minimally include the field name, the type, if it is indexed and if it is stored. If multiValued is defined as either true or false (most likely true), that will also be shown. See the section Defining Fields for more information about each parameter.

EXAMPLES

Get a list of all fields.
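A Python sketch of building field-listing URLs with the path and query parameters described above. The base URL, the helper name fields_url and the field name "id" are assumptions for illustration:

```python
from urllib.parse import urlencode

# Assumed base URL for the "gettingstarted" collection.
BASE_URL = "http://localhost:8983/solr/gettingstarted"


def fields_url(fieldname=None, **params):
    """Return a /schema/fields URL, optionally for one field and with query parameters."""
    path = f"{BASE_URL}/schema/fields"
    if fieldname:
        path += f"/{fieldname}"
    return f"{path}?{urlencode(params)}" if params else path


all_fields = fields_url()
# Restrict to two fields and include matching dynamic fields and defaults.
some_fields = fields_url(fl="id,sell-by", includeDynamic="true", showDefaults="true")
# A single named field, returned as XML.
one_field = fields_url("sell-by", wt="xml")
```

Note that urlencode percent-encodes the comma in the fl list.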

The sample output below has been truncated to only show a few fields.

List Dynamic Fields

GET /collection/schema/dynamicfields

GET /collection/schema/dynamicfields/name

INPUT

Path Parameters

| Key | Description |
|---|---|
| collection | The collection (or core) name. |
| name | The name of the dynamic field rule (if limiting request to a single dynamic field rule). |

Query Parameters

The query parameters can be added to the API request after a '?'.

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| wt | string | No | json | Defines the format of the response. The options are json or xml. If not specified, JSON will be returned by default. |
| showDefaults | boolean | No | false | If true, all default field properties from each dynamic field's field type will be included in the response (e.g. tokenized for solr.TextField). If false, only explicitly specified field properties will be included. |

OUTPUT

Output Content

The output will include each dynamic field rule and the defined configuration for each rule. The defined configuration can vary for each rule, but will minimally include the dynamic field name, the type, if it is indexed and if it is stored. See the section Dynamic Fields for more information about each parameter.

EXAMPLES

Get a list of all dynamic field declarations:

The sample output below has been truncated.

List Field Types

GET /collection/schema/fieldtypes

GET /collection/schema/fieldtypes/name

INPUT

Path Parameters

| Key | Description |
|---|---|
| collection | The collection (or core) name. |
| name | The name of the field type (if limiting request to a single field type). |

Query Parameters

The query parameters can be added to the API request after a '?'.

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| wt | string | No | json | Defines the format of the response. The options are json or xml. If not specified, JSON will be returned by default. |
| showDefaults | boolean | No | false | If true, all default field properties from each field type will be included in the response (e.g. tokenized for solr.TextField). If false, only explicitly specified field properties will be included. |

OUTPUT

Output Content

The output will include each field type and any defined configuration for the type. The defined configuration can vary for each type, but will minimally include the field type name and the class. If query or index analyzers, tokenizers, or filters are defined, those will also be shown with other defined parameters. See the section Solr Field Types for more information about how to configure various types of fields.

EXAMPLES

Get a list of all field types.

The sample output below has been truncated to show a few different field types from different parts of the list.

List Copy Fields

GET /collection/schema/copyfields

INPUT

Path Parameters

| Key | Description |
|---|---|
| collection | The collection (or core) name. |

Query Parameters

The query parameters can be added to the API request after a '?'.

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| wt | string | No | json | Defines the format of the response. The options are json or xml. If not specified, JSON will be returned by default. |
| source.fl | string | No | (all source fields) | Comma- or space-separated list of one or more copyField source fields to include in the response - copyField directives with all other source fields will be excluded from the response. If not specified, all copyField-s will be included in the response. |
| dest.fl | string | No | (all dest fields) | Comma- or space-separated list of one or more copyField dest fields to include in the response - copyField directives with all other dest fields will be excluded. If not specified, all copyField-s will be included in the response. |

OUTPUT

Output Content

The output will include the source and destination of each copy field rule defined in schema.xml. For more information about copying fields, see the section Copying Fields.

EXAMPLES

Get a list of all copyfields.

The sample output below has been truncated to the first few copy definitions.

Show Schema Name

GET /collection/schema/name

INPUT

Path Parameters

| Key | Description |
|---|---|
| collection | The collection (or core) name. |

Query Parameters

The query parameters can be added to the API request after a '?'.

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| wt | string | No | json | Defines the format of the response. The options are json or xml. If not specified, JSON will be returned by default. |

OUTPUT

Output Content
The output will be simply the name given to the schema.

EXAMPLES

Get the schema name.

Show the Schema Version

GET /collection/schema/version

INPUT

Path Parameters

| Key | Description |
|---|---|
| collection | The collection (or core) name. |

Query Parameters

The query parameters can be added to the API request after a '?'.

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| wt | string | No | json | Defines the format of the response. The options are json or xml. If not specified, JSON will be returned by default. |

OUTPUT

Output Content

The output will simply be the schema version in use.

EXAMPLES

Get the schema version.

 

List UniqueKey

GET /collection/schema/uniquekey

INPUT

Path Parameters

| Key | Description |
|---|---|
| collection | The collection (or core) name. |

Query Parameters

The query parameters can be added to the API request after a '?'.

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| wt | string | No | json | Defines the format of the response. The options are json or xml. If not specified, JSON will be returned by default. |

OUTPUT

Output Content

The output will include simply the field name that is defined as the uniqueKey for the index.

EXAMPLES

List the uniqueKey.

Show Global Similarity

GET /collection/schema/similarity

INPUT

Path Parameters

| Key | Description |
|---|---|
| collection | The collection (or core) name. |

Query Parameters

The query parameters can be added to the API request after a '?'.

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| wt | string | No | json | Defines the format of the response. The options are json or xml. If not specified, JSON will be returned by default. |

OUTPUT

Output Content

The output will include the class name of the global similarity defined (if any).

EXAMPLES

Get the similarity implementation.

 

Get the Default Query Operator

GET /collection/schema/solrqueryparser/defaultoperator

INPUT

Path Parameters

| Key | Description |
|---|---|
| collection | The collection (or core) name. |

Query Parameters

The query parameters can be added to the API request after a '?'.

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| wt | string | No | json | Defines the format of the response. The options are json or xml. If not specified, JSON will be returned by default. |

OUTPUT

Output Content

The output will simply be the default operator, including when none has been explicitly defined by the user.

EXAMPLES

Get the default operator.

Manage Resource Data

The Managed Resources REST API provides a mechanism for any Solr plugin to expose resources that should support CRUD (Create, Read, Update, Delete) operations. Depending on what Field Types and Analyzers are configured in your Schema, additional /schema/ REST API paths may exist.  See the Managed Resources section for more information and examples.

 


19 Comments

  1. I have no idea why, or how to fix it, but in the draft PDF Hoss made, the JSON example of the entire schema is putting 3/4 of a page between the 2nd to last and last copyField in the sample response. My only guess is that the ellipsis (the '...') is confusing it, and I don't know if it would still happen if the last bit of that response was removed.

  2. Typo? In copyfield Example the URL still has /field vs. /copyfield (the earlier view examples, not the create example at the bottom)

    1. I fixed this typo, and found another one at the same time that I fixed.

  3. When creating multiple fields it uses POST, vs. the singular field form uses PUT. I thought at first that the latter was for modifying a field vs. adding, but even in the single-field / PUT mode, it still says create. If this really matters the wiki could highlight it and explain the difference.

    1. Could you explain this further? I'm not quite getting what you see as a difference that isn't explained on this page.

  4. The structure of this page (in particular the "Modify the schema" section) doesn't really make sense to me now: there are evidently 2 different APIs that can be used to add fields ("add-field" sub-section, with "POST /schema" vs "Create new schema fields" subsection with "POST /collection/schema/fields") but there isn't any explanation of when/why users should use one API vs the other ... and the structure/formatting of the sub-sections about these 2 different APIs are inconsistent with each other.

    likewise: the same problem/confusion exists between "add-copy-field" vs "Create new copyField directives" subsections.

    Also very confusing is that the "add-field" subsection example explicitly points out that an "array" of fields can be declared at one time -- but it's not clear if the same array approach can be used with add-copy-field, add-dynamic-field, & add-field-type which also use "POST /schema" ... in fact there is no mention *anywhere* in the descriptive text of these apis that an array can be specified -- just the comment in the example.

    (as opposed to the "Create new schema fields" and "Create new copyField directives" sub-sections where the descriptions do explicitly state "The  JSON must contain an array of one or more new field specifications, ..."  and "The body must contain an array of zero or more copyField directives,...")

     

  5. The objective is to remove the individual API sections to avoid this confusion. We are anyway planning to deprecate those APIs and we may remove them in a later release

    1. how does poor formatting and incomplete examples "avoid this confusion"

      even if some of these API end points are deprecated, they still need to be documented, and the docs need to make sense. As things stand now, they don't:

      • the new apis you recently added are formatted totally inconsistently with the rest of the page
      • it's not clear which of the new APIs you recently added support a "list" of things vs a single "thing"
      • it's not clear when/why someone would/should choose to use the various APIs that serve similar purposes (if the answer is "this is deprecated" then note that, if the answer is "this is less efficient" then note that.)

      I would rather we have ONE documented way to do X, that is well formatted, with good examples, and makes sense - than have TWO documented ways to do X, where the first method found when reading the docs is poorly formatted, makes no sense, and has an incomplete example ... better to remove that part of the doc and add it back later if/when we are ready to say something meaningful about the new APIs and if/why they should be used.

      1. I understand your concern.

        The document is  "Work in progress" and before the release we should get it in shape. 

        There will be no two ways to do one thing. We will have a consistent way of performing an operation.

  6. Don't want to jump into editing this page because I haven't really used/worked with the Schema API, but looks like this comment is also outdated:

    "All types of schema data support GET (read) access, but in only new fields and copyField directives may be added to the schema with PUT or POST. Future Solr releases will extend this functionality to allow more schema elements to be updated."

  7. Why is it that when using managed schema in SolrCloud, adding fields into the schema would SOMETIMES end up prompting "Can't find resource 'schema.xml' in classpath or '/configs/collection1', cwd=/export/solr/solr-5.1.0/server"? There is of course no schema.xml in configs, but there are 'schema.xml.bak' and 'managed-schema'.

  8. I added a new field to store the name of my documents because my schema doesn't have a name field, and I reindexed my data from system files (.doc, .docx, .pdf) in Solr 5.3, but the new field returns nothing. What should I do?

     

    1. I would recommend sending your question to the solr-user mailing list. Access to the list is described here: http://lucene.apache.org/solr/resources.html#community. Please be sure to include a more detailed description of the problem, including how you indexed the files, if you see anything in the logs, your schema, etc. Tips for good user list questions are available here: http://wiki.apache.org/solr/UsingMailingLists.

  9. For add-field-type, it is not documented that you can define a multiTermAnalyzer.

  10. Is it possible to change the default schema name ("example") to something more specific via a POST call?

    1. Unfortunately no, not yet, see SOLR-7242.