Apache Solr Documentation

*** As of June 2017, the latest Solr Ref Guide is located at https://lucene.apache.org/solr/guide ***


Schemaless Mode is a set of Solr features that, when used together, allow users to rapidly construct an effective schema by simply indexing sample data, without having to manually edit the schema. These Solr features, all controlled via solrconfig.xml, are:

  1. Managed schema: Schema modifications are made at runtime through Solr APIs, which requires a schemaFactory that supports these changes. See Schema Factory Definition in SolrConfig for more details.
  2. Field value class guessing: Previously unseen fields are run through a cascading set of value-based parsers, which guess the Java class of field values - parsers for Boolean, Integer, Long, Float, Double, and Date are currently available.
  3. Automatic schema field addition, based on field value class(es): Previously unseen fields are added to the schema, based on field value Java classes, which are mapped to schema field types - see Solr Field Types.
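
The cascade described in items 2 and 3 can be pictured with a small shell sketch. It is only an approximation: Solr's real parsers are Java update processors, the regular expressions below are simplified assumptions, and the function name guess_field_type is hypothetical. The returned names (booleans, tlongs, tdoubles, tdates, strings) are the field types that the data_driven_schema_configs config set maps guessed classes to.

```shell
# Rough imitation of the value-based cascade: try Boolean, then Long,
# then Double, then Date, and fall back to the default type "strings".
# The patterns are simplified assumptions, not Solr's actual parsing rules.
guess_field_type() {
  if printf '%s' "$1" | grep -Eq '^(true|false)$'; then
    echo booleans
  elif printf '%s' "$1" | grep -Eq '^[+-]?[0-9]+$'; then
    echo tlongs
  elif printf '%s' "$1" | grep -Eq '^[+-]?[0-9]*\.[0-9]+$'; then
    echo tdoubles
  elif printf '%s' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}([T ].*)?$'; then
    echo tdates
  else
    echo strings
  fi
}

# Usage: guess_field_type 4.93
```

The order matters: a value is given the type of the first parser that accepts it, which is why an integral value such as 14 never reaches the Double step.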

Using the Schemaless Example

The three features of schemaless mode are pre-configured in the data_driven_schema_configs config set in the Solr distribution. To start an example instance of Solr using these configs, run the following command:

    bin/solr start -e schemaless

This will launch a Solr server, and automatically create a collection (named "gettingstarted") that contains only three fields in the initial schema: id, _version_, and _text_.

You can use the /schema/fields Schema API to confirm this: the output of curl http://localhost:8983/solr/gettingstarted/schema/fields lists only these three fields.

The data_driven_schema_configs config set includes a copyField directive that copies all content into a predefined "catch-all" _text_ field, which enables single-field search across all fields' content. As a result, the index will be larger than it would be without the copyField. Once you have settled on your schema, consider removing the _text_ field and the corresponding copyField directive if you don't need them.
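
One way to drop the catch-all field is through the Schema API. The sketch below rests on assumptions: that the copyField directive copies from source * into _text_ (the /schema/copyfields endpoint shows the actual directive), that the collection is the gettingstarted example, and that nothing else in your configuration still refers to _text_. The function names are hypothetical, and the curl calls are wrapped in functions because they need a running Solr.

```shell
# Assumed URL of the example collection.
SOLR_COLLECTION_URL="http://localhost:8983/solr/gettingstarted"

# Schema API commands: remove the copyField first, then the field itself.
# The "*" source is an assumption; verify it with list_copy_fields.
remove_catch_all_payload() {
  printf '%s' '{"delete-copy-field":{"source":"*","dest":"_text_"},"delete-field":{"name":"_text_"}}'
}

# Show the copyField directives currently in the schema.
list_copy_fields() {
  curl -s "$SOLR_COLLECTION_URL/schema/copyfields"
}

# Send both commands in one request.
remove_catch_all() {
  curl -s -X POST -H 'Content-type:application/json' \
    --data-binary "$(remove_catch_all_payload)" \
    "$SOLR_COLLECTION_URL/schema"
}

# Usage: list_copy_fields, then remove_catch_all
```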

Configuring Schemaless Mode

As described above, there are three configuration elements that need to be in place to use Solr in schemaless mode. In the data_driven_schema_configs config set included with Solr these are already configured. If, however, you would like to implement schemaless on your own, you should make the following changes.

Enable Managed Schema

As described in the section Schema Factory Definition in SolrConfig, Managed Schema support is enabled by default, unless your configuration specifies that ClassicIndexSchemaFactory should be used.

You can configure the ManagedIndexSchemaFactory (and control the resource file used, or disable future modifications) by adding an explicit <schemaFactory/> like the one below. See Schema Factory Definition in SolrConfig for more details on the available options.

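    <schemaFactory class="ManagedIndexSchemaFactory">
      <bool name="mutable">true</bool>
      <str name="managedSchemaResourceName">managed-schema</str>
    </schemaFactory>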

Define an UpdateRequestProcessorChain

The UpdateRequestProcessorChain allows Solr to guess field types, and you can define the default field type classes to use. To start, you should define it as follows (see the javadoc links below for update processor factory documentation):

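    <updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
      <processor class="solr.UUIDUpdateProcessorFactory"/>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.DistributedUpdateProcessorFactory"/>
      <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>
      <processor class="solr.FieldNameMutatingUpdateProcessorFactory">
        <str name="pattern">[^\w-\.]</str>
        <str name="replacement">_</str>
      </processor>
      <processor class="solr.ParseBooleanFieldUpdateProcessorFactory"/>
      <processor class="solr.ParseLongFieldUpdateProcessorFactory"/>
      <processor class="solr.ParseDoubleFieldUpdateProcessorFactory"/>
      <processor class="solr.ParseDateFieldUpdateProcessorFactory">
        <arr name="format">
          <str>yyyy-MM-dd'T'HH:mm:ss.SSSZ</str>
          <str>yyyy-MM-dd'T'HH:mm:ss,SSSZ</str>
          <str>yyyy-MM-dd'T'HH:mm:ss.SSS</str>
          <str>yyyy-MM-dd'T'HH:mm:ss,SSS</str>
          <str>yyyy-MM-dd'T'HH:mm:ssZ</str>
          <str>yyyy-MM-dd'T'HH:mm:ss</str>
          <str>yyyy-MM-dd'T'HH:mmZ</str>
          <str>yyyy-MM-dd'T'HH:mm</str>
          <str>yyyy-MM-dd HH:mm:ss.SSSZ</str>
          <str>yyyy-MM-dd HH:mm:ss,SSSZ</str>
          <str>yyyy-MM-dd HH:mm:ss.SSS</str>
          <str>yyyy-MM-dd HH:mm:ss,SSS</str>
          <str>yyyy-MM-dd HH:mm:ssZ</str>
          <str>yyyy-MM-dd HH:mm:ss</str>
          <str>yyyy-MM-dd HH:mmZ</str>
          <str>yyyy-MM-dd HH:mm</str>
          <str>yyyy-MM-dd</str>
        </arr>
      </processor>
      <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
        <str name="defaultFieldType">strings</str>
        <lst name="typeMapping">
          <str name="valueClass">java.lang.Boolean</str>
          <str name="fieldType">booleans</str>
        </lst>
        <lst name="typeMapping">
          <str name="valueClass">java.util.Date</str>
          <str name="fieldType">tdates</str>
        </lst>
        <lst name="typeMapping">
          <str name="valueClass">java.lang.Long</str>
          <str name="valueClass">java.lang.Integer</str>
          <str name="fieldType">tlongs</str>
        </lst>
        <lst name="typeMapping">
          <str name="valueClass">java.lang.Number</str>
          <str name="fieldType">tdoubles</str>
        </lst>
      </processor>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>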

Javadocs for update processor factories mentioned above:

Make the UpdateRequestProcessorChain the Default for the UpdateRequestHandler

Once the UpdateRequestProcessorChain has been defined, you must instruct your UpdateRequestHandlers to use it when working with index updates (i.e., adding, removing, replacing documents). Here is an example using InitParams to set the defaults on all /update request handlers:

    <initParams path="/update/**">
      <lst name="defaults">
        <str name="update.chain">add-unknown-fields-to-the-schema</str>
      </lst>
    </initParams>

After all of these changes have been made, Solr should be restarted (or you can reload the cores to load the new solrconfig.xml definitions).
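
Reloading a core without a restart can be done over HTTP with the CoreAdmin API. The following is a minimal sketch, assuming a standalone Solr on localhost:8983 and a core named gettingstarted; the function names are hypothetical. In SolrCloud, the Collections API RELOAD action would be the counterpart.

```shell
# Assumed base URL of a standalone Solr instance.
SOLR_BASE_URL="http://localhost:8983/solr"

# Build the CoreAdmin RELOAD URL for the core named in $1.
reload_core_url() {
  printf '%s/admin/cores?action=RELOAD&core=%s' "$SOLR_BASE_URL" "$1"
}

# Ask Solr to reload the core so it picks up the edited solrconfig.xml.
reload_core() {
  curl -s "$(reload_core_url "$1")"
}

# Usage (requires a running Solr): reload_core gettingstarted
```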

Examples of Indexed Documents

Once schemaless mode has been enabled (whether you configured it manually or are using data_driven_schema_configs), documents that include fields not defined in your schema should be added to the index, and the new fields added to the schema.

For example, adding a CSV document causes any of its fields that are not in the schema to be added, with fieldTypes based on the field values.
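
A request of this kind might look like the sketch below. The column names mirror the fields reported in the schema output for this example; the Album column and all row values are illustrative assumptions, as are the function names. The curl call is wrapped in a function because it needs the running example collection.

```shell
# Assumed URL of the example collection.
SOLR_COLLECTION_URL="http://localhost:8983/solr/gettingstarted"

# One header row plus one document. Values are made up so that each
# column exercises a different guess: string, date, double, long.
csv_document() {
  printf '%s\n' \
    'id,Artist,Album,Released,Rating,FromDistributor,Sold' \
    '44C,Old Shews,Mead for Walking,1988-08-13,0.01,14,0'
}

# Post the CSV to the /update handler and commit immediately.
index_csv() {
  csv_document | curl -s "$SOLR_COLLECTION_URL/update?commit=true" \
    -H 'Content-type:application/csv' --data-binary @-
}

# Usage (requires the running example): index_csv
```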


Output indicating success:

    <response>
      <lst name="responseHeader"><int name="status">0</int><int name="QTime">106</int></lst>
    </response>

The fields now in the schema (output from curl http://localhost:8983/solr/gettingstarted/schema/fields):

    {
      "responseHeader":{...},
      "fields":[
        { "name":"Album", "type":"strings"},           // Field value guessed as String -> strings fieldType
        { "name":"Artist", "type":"strings"},          // Field value guessed as String -> strings fieldType
        { "name":"FromDistributor", "type":"tlongs"},  // Field value guessed as Long -> tlongs fieldType
        { "name":"Rating", "type":"tdoubles"},         // Field value guessed as Double -> tdoubles fieldType
        { "name":"Released", "type":"tdates"},         // Field value guessed as Date -> tdates fieldType
        { "name":"Sold", "type":"tlongs"},             // Field value guessed as Long -> tlongs fieldType
        { "name":"_text_", ... },
        { "name":"_version_", ... },
        { "name":"id", ... }]}

You Can Still Be Explicit

Even if you want to use schemaless mode for most fields, you can still use the Schema API to pre-emptively create some fields, with explicit types, before you index documents that use them.

Internally, the Schema API and the Schemaless Update Processors both use the same Managed Schema functionality.
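
Pre-creating a field through the Schema API could look like the following sketch. It assumes the gettingstarted example collection; the function names and the choice of the Sold field with type tdoubles are illustrative, and the curl call is wrapped in a function because it needs a running Solr.

```shell
# Assumed URL of the example collection.
SOLR_COLLECTION_URL="http://localhost:8983/solr/gettingstarted"

# Build an add-field command for field name $1 with field type $2.
add_field_payload() {
  printf '{"add-field":{"name":"%s","type":"%s","stored":true}}' "$1" "$2"
}

# Send the command to the collection's /schema endpoint.
add_field() {
  curl -s -X POST -H 'Content-type:application/json' \
    --data-binary "$(add_field_payload "$1" "$2")" \
    "$SOLR_COLLECTION_URL/schema"
}

# Usage (before indexing any document that contains the field):
#   add_field Sold tdoubles
```

Declaring a field this way before the first document arrives means the type is chosen deliberately rather than guessed from whatever value happens to be indexed first.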

Once a field has been added to the schema, its field type is fixed. As a consequence, adding documents with field value(s) that conflict with the previously guessed field type will fail. For example, after adding the above document, the "Sold" field has the fieldType tlongs, so a document with a non-integral decimal value in this field, such as 4.93, conflicts with it.
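
A conflicting document can be sketched as follows. The id 19F and the Sold value 4.93 are taken from the error output for this example; the Description column, its value, and the function names are hypothetical.

```shell
# Assumed URL of the example collection.
SOLR_COLLECTION_URL="http://localhost:8983/solr/gettingstarted"

# A document whose Sold value is a decimal, which does not fit the
# tlongs type that was guessed when the field was first seen.
conflicting_csv_document() {
  printf '%s\n' \
    'id,Description,Sold' \
    '19F,Cassettes by the pound,4.93'
}

# Post the CSV; Solr is expected to reject it with an HTTP 400 error.
index_conflicting_csv() {
  conflicting_csv_document | curl -s "$SOLR_COLLECTION_URL/update?commit=true" \
    -H 'Content-type:application/csv' --data-binary @-
}

# Usage (requires the running example): index_conflicting_csv
```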


This document will fail, as shown in this output:

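    <response>
      <lst name="responseHeader">
        <int name="status">400</int>
        <int name="QTime">7</int>
      </lst>
      <lst name="error">
        <str name="msg">ERROR: [doc=19F] Error adding field 'Sold'='4.93' msg=For input string: "4.93"</str>
        <int name="code">400</int>
      </lst>
    </response>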


  1. I'm new to the "schemaless mode", I followed the instruction, and used the default "data_driven_schema_configs", and was able to get the example csv indexed, but it doesn't work on XML for some reason? Here is an example xml:

    1. I think the XML file must follow the Solr doc schema. Your data needs to be transformed to something like:

      <field name="accession">Q14524</field>
      <field name="name">SCN5A_HUMAN</field>