The Term Vector Component (TVC) is a SearchComponent that returns the per-document term information Solr stores when the termVectors attribute is set on a field:
<field name="features" type="text" indexed="true" stored="true" multiValued="true" termVectors="true" termPositions="true" termOffsets="true"/>
For each document, the TVC can return the term vector, the term frequency, inverse document frequency, and position and offset information. As with most components, there are a number of options that are outlined in the samples below.
All examples are based on using the Solr example server.
You need to enable the TermVectorComponent in your Solr configuration (it is already enabled in the example solrconfig.xml):
<searchComponent name="tvComponent" class="org.apache.solr.handler.component.TermVectorComponent"/>
A RequestHandler configuration using this component could look like this:
<requestHandler name="tvrh" class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <bool name="tv">true</bool>
  </lst>
  <arr name="last-components">
    <str>tvComponent</str>
  </arr>
</requestHandler>
In the example schema, the "includes" field has term vectors enabled. The following example HTTP request asks for the term vectors of all documents with something in the includes field.
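A minimal sketch of how such a request could be built, written in Python for illustration. The base URL http://localhost:8983/solr/select (the usual address of the Solr example server), the qt=tvrh handler selection, the fl=id field list, and the helper name build_tv_request are assumptions and do not come from this page. The range query includes:[* TO *] matches every document that has some value in the includes field.

```python
from urllib.parse import urlencode

# Assumption: the Solr example server listens at its usual default address.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_tv_request(base_url=SOLR_SELECT_URL):
    """Build a URL asking for term vectors of all docs with a value in 'includes'."""
    params = [
        ("qt", "tvrh"),              # assumption: select the tvrh handler by name
        ("q", "includes:[* TO *]"),  # any document with something in 'includes'
        ("fl", "id"),                # assumption: return only the id field
        ("tv", "true"),              # turn the Term Vector Component on
    ]
    # urlencode percent-encodes the query so the URL is safe to send as-is.
    return base_url + "?" + urlencode(params)


print(build_tv_request())
```

If tv=true is already set in the handler defaults, as in the configuration above, the explicit tv parameter is redundant but harmless.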
In the example server, the component is associated with a request handler named tvrh, but you can associate it with any RequestHandler. To turn on the component for a request, add the tv=true parameter (or add it to your RequestHandler defaults configuration).
Example output: See TermVectorComponentExampleEnabled.
An example HTTP request using these options:
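As a sketch of what a request with options might look like, the Python snippet below builds a query string that switches individual features on. The parameter names tv.tf, tv.df, tv.positions, and tv.offsets are taken from the general Solr documentation rather than from this page, and the helper tv_options_query is hypothetical, so treat both as assumptions to check against your Solr version.

```python
from urllib.parse import urlencode


def tv_options_query(tf=False, df=False, positions=False, offsets=False):
    """Return a query string that enables only the requested TVC features."""
    params = [
        ("qt", "tvrh"),              # assumption: handler selected by name
        ("q", "includes:[* TO *]"),
        ("fl", "id"),
        ("tv", "true"),
    ]
    # Assumed option names; each one is sent only when switched on.
    flags = {
        "tv.tf": tf,
        "tv.df": df,
        "tv.positions": positions,
        "tv.offsets": offsets,
    }
    for name, enabled in flags.items():
        if enabled:
            params.append((name, "true"))
    return urlencode(params)


# Ask for term frequency and document frequency only.
print(tv_options_query(tf=True, df=True))
```

Options left at False are simply omitted, so the component falls back to whatever the handler defaults specify.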
(Solr3.1) Options may be specified per field with the f.fieldName prefix, similar to the way per-field options work in faceting.
If you specify f.fieldName options, you must also explicitly declare &tv.fl or &fl.
In this example, all features are requested, but then term frequency is turned off for the "includes" field (the only field returned)
In this example, all features are requested, but then offsets are turned off for the "includes" field (the only field returned)
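The two per-field examples above could be built along these lines. This is a sketch in Python, assuming the per-field form f.<field>.tv.<option> and the tv.all and tv.fl parameter names, none of which are spelled out in full on this page. The helper per_field_query is hypothetical.

```python
from urllib.parse import urlencode


def per_field_query(field, overrides):
    """Request all TVC features, then override selected options for one field."""
    params = [
        ("qt", "tvrh"),        # assumption: handler selected by name
        ("q", "*:*"),
        ("tv.all", "true"),    # assumed switch that requests every feature
        ("tv.fl", field),      # per-field options require tv.fl (or fl)
    ]
    for option, value in overrides.items():
        # Assumed per-field syntax, mirroring per-field faceting parameters.
        params.append((f"f.{field}.tv.{option}", "true" if value else "false"))
    return urlencode(params)


# All features, with term frequency turned off for "includes".
no_tf = per_field_query("includes", {"tf": False})
# All features, with offsets turned off for "includes".
no_offsets = per_field_query("includes", {"offsets": False})
print(no_tf)
print(no_offsets)
```

Passing the field through tv.fl satisfies the requirement that a per-field option is always accompanied by an explicit field list.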
If you specify a field but no per-field options for it, the general options apply to that field.
If a requested field does not support the options specified, the response includes warnings indicating that the field does not support that option. There are three types of warnings:
Each of these items is a list of strings containing the names of the fields that do not support the specified option.
There is a patch in progress for strongly-typed SolrJ support.