StatsComponent

Introduced in Solr 1.4.

The stats component returns simple statistics for indexed numeric fields within the DocSet.

Parameters

param         description
stats         Set to true to enable the stats component.
stats.field   The field to compute statistics for. Add one stats.field parameter for each field that needs statistics.
stats.facet   Return sub-results for each value of the given facet field.

Example

With the example data loaded: http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&stats.field=popularity&rows=0&indent=true

<lst name="stats">
 <lst name="stats_fields">
  <lst name="price">
    <double name="min">0.0</double>
    <double name="max">2199.0</double>
    <double name="sum">5251.2699999999995</double>
    <long name="count">15</long>
    <long name="missing">11</long>
    <double name="sumOfSquares">6038619.160300001</double>
    <double name="mean">350.08466666666664</double>
    <double name="stddev">547.737557906113</double>
  </lst>
  <lst name="popularity">
    <double name="min">0.0</double>
    <double name="max">10.0</double>
    <double name="sum">90.0</double>
    <long name="count">26</long>
    <long name="missing">0</long>
    <double name="sumOfSquares">628.0</double>
    <double name="mean">3.4615384615384617</double>
    <double name="stddev">3.5578731762756157</double>
  </lst>
 </lst>
</lst>
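
As an illustration of consuming this output, here is a minimal Python sketch that turns a stats fragment into a dictionary. It assumes the fragment is passed in as a string whose root is the <lst name="stats"> element (in a full response it is nested inside the response document); the names parse_stats and SAMPLE_STATS_XML are hypothetical and not part of Solr.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example output above (popularity field only).
SAMPLE_STATS_XML = """
<lst name="stats">
 <lst name="stats_fields">
  <lst name="popularity">
    <double name="min">0.0</double>
    <double name="max">10.0</double>
    <double name="sum">90.0</double>
    <long name="count">26</long>
    <long name="missing">0</long>
    <double name="sumOfSquares">628.0</double>
  </lst>
 </lst>
</lst>
"""

def parse_stats(stats_xml):
    """Return {field name: {statistic name: number}} for a stats fragment."""
    root = ET.fromstring(stats_xml)
    stats_fields = root.find("lst[@name='stats_fields']")
    result = {}
    for field in stats_fields.findall("lst"):
        values = {}
        for child in field:
            # <double> and <long> carry the statistics; nested <lst>
            # elements (such as facets) are skipped in this sketch.
            if child.tag == "double":
                values[child.get("name")] = float(child.text)
            elif child.tag == "long":
                values[child.get("name")] = int(child.text)
        result[field.get("name")] = values
    return result

stats = parse_stats(SAMPLE_STATS_XML)
print(stats["popularity"]["count"], stats["popularity"]["max"])
```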

The same results faceted on inStock, obtained by appending &stats.facet=inStock to the request:

<lst name="stats">
 <lst name="stats_fields">
  <lst name="price">
    <double name="min">0.0</double>
    <double name="max">2199.0</double>
    <double name="sum">5251.2699999999995</double>
    <long name="count">15</long>
    <long name="missing">11</long>
    <double name="sumOfSquares">6038619.160300001</double>
    <double name="mean">350.08466666666664</double>
    <double name="stddev">547.737557906113</double>
    <lst name="facets">
     <lst name="inStock">
      <lst name="false">
        <double name="min">11.5</double>
        <double name="max">649.99</double>
        <double name="sum">1161.39</double>
        <long name="count">4</long>
        <long name="missing">0</long>
        <double name="sumOfSquares">653369.2551</double>
        <double name="mean">290.3475</double>
        <double name="stddev">324.63444676281654</double>
      </lst>
      <lst name="true">
        <double name="min">0.0</double>
        <double name="max">2199.0</double>
        <double name="sum">4089.879999999999</double>
        <long name="count">11</long>
        <long name="missing">0</long>
        <double name="sumOfSquares">5385249.905200001</double>
        <double name="mean">371.8072727272727</double>
        <double name="stddev">621.6592938755265</double>
      </lst>
     </lst>
    </lst>
  </lst>
 </lst>
</lst>
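
In this example the facet buckets add up to the top-level figures for price: the counts (4 + 11 = 15), the sums, and the sums of squares are additive, and min and max combine directly. The Python sketch below checks this with the numbers from the output above. The helper merge_buckets is hypothetical and is not how Solr itself computes the totals; missing is left out because the bucket values in the example (0 and 0) do not add up to the top-level value (11).

```python
import math

def merge_buckets(buckets):
    """Combine per-facet-value statistics into overall statistics."""
    return {
        "min": min(b["min"] for b in buckets),
        "max": max(b["max"] for b in buckets),
        "sum": sum(b["sum"] for b in buckets),
        "count": sum(b["count"] for b in buckets),
        "sumOfSquares": sum(b["sumOfSquares"] for b in buckets),
    }

# Values copied from the inStock facet buckets in the example output.
in_stock_false = {"min": 11.5, "max": 649.99, "sum": 1161.39,
                  "count": 4, "sumOfSquares": 653369.2551}
in_stock_true = {"min": 0.0, "max": 2199.0, "sum": 4089.879999999999,
                 "count": 11, "sumOfSquares": 5385249.905200001}

merged = merge_buckets([in_stock_false, in_stock_true])
print(merged["count"], merged["min"], merged["max"])
```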

Notes

  • The facet field can be applied selectively. That is, if you want stats on fields "A" and "B", you can facet A on "X" and B on "Y" using &stats.field=A&f.A.stats.facet=X&stats.field=B&f.B.stats.facet=Y
  • Warning: as implemented, all facet results are returned, so be careful which fields you ask for!
  • Multi-valued fields and facets may be slow.
  • Computing statistics using stats.facet over a multi-valued field does not work properly; see https://issues.apache.org/jira/browse/SOLR-1782
  • Multi-valued fields rely on UnInvertedField.java for their implementation. This is like a FieldCache, so be aware of your memory footprint.
  • Trie fields have to use a precisionStep of -1 to avoid using UnInvertedField.java. Consider using one field for stats and a separate field for range faceting.
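
The per-field facet syntax from the first note can be generated programmatically. The following Python sketch builds the query string for a mapping of stats fields to facet fields; the function name per_field_facet_params is hypothetical, and "A", "B", "X", "Y" are the placeholder field names used in the note.

```python
from urllib.parse import urlencode

def per_field_facet_params(field_to_facet):
    """Build stats parameters where each stats field has its own facet field."""
    params = [("stats", "true")]
    for field, facet in field_to_facet.items():
        params.append(("stats.field", field))
        # f.<field>.stats.facet applies the facet to that stats field only
        params.append((f"f.{field}.stats.facet", facet))
    return urlencode(params)

query_string = per_field_facet_params({"A": "X", "B": "Y"})
print(query_string)
# prints: stats=true&stats.field=A&f.A.stats.facet=X&stats.field=B&f.B.stats.facet=Y
```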

Results

value          description
min            The minimum value
max            The maximum value
sum            Sum of all values
count          Number of non-null values
missing        Number of null values
sumOfSquares   Sum of the squares of all values (useful for computing stddev)
mean           The average, (v1 + v2 + ... + vN) / N
stddev         Standard deviation: a measure of how widely spread the values in a data set are
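
To show how sumOfSquares relates to stddev, here is a small Python sketch that derives mean and stddev from sum, count, and sumOfSquares. It assumes the sample (N - 1) form of the standard deviation, which is consistent with the figures in the example output above but is not stated explicitly on this page; the function name mean_and_stddev is hypothetical.

```python
import math

def mean_and_stddev(total, count, sum_of_squares):
    """Derive mean and sample standard deviation from the running totals."""
    mean = total / count
    if count < 2:
        # The N - 1 denominator is undefined for a single value.
        return mean, 0.0
    variance = (count * sum_of_squares - total * total) / (count * (count - 1))
    # Guard against tiny negative values caused by floating-point rounding.
    return mean, math.sqrt(max(variance, 0.0))

# popularity from the example: sum=90.0, count=26, sumOfSquares=628.0
mean, stddev = mean_and_stddev(90.0, 26, 628.0)
print(mean, stddev)
```

With the popularity values this reproduces the mean and stddev shown in the example output to within floating-point rounding.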
