StatsComponent
The stats component returns simple statistics for indexed numeric fields within the DocSet.
Parameters
param | description
stats | Set to true to enable the stats component.
stats.field | A field to compute statistics for. Add one stats.field parameter for each field that needs statistics.
stats.facet | Return sub-results for each value of the given facet field.
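As a rough illustration of how these parameters combine into a request, the sketch below assembles a query string with Python's standard library. The helper name build_stats_url and its defaults are hypothetical and not part of Solr; the base URL assumes the example Solr instance on localhost:8983.

```python
from urllib.parse import urlencode


def build_stats_url(base_url, fields, facets=(), query="*:*"):
    """Build a select URL asking for stats on the given fields.

    Hypothetical helper for illustration only; it is not a Solr API.
    """
    params = [("q", query), ("stats", "true")]
    # stats.field is repeated once per field that needs statistics.
    params += [("stats.field", name) for name in fields]
    # One stats.facet entry per facet field passed in (none by default).
    params += [("stats.facet", name) for name in facets]
    # rows=0 leaves out the document list; indent=true pretty-prints the response.
    params += [("rows", "0"), ("indent", "true")]
    # Keep '*' and ':' unescaped so the match-all query stays readable.
    return base_url + "?" + urlencode(params, safe="*:")


url = build_stats_url("http://localhost:8983/solr/select", ["price", "popularity"])
print(url)
```

Passing facets=["inStock"] to the same helper adds a stats.facet=inStock parameter to the request.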
Example
With the example data loaded: http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&stats.field=popularity&rows=0&indent=true
<lst name="stats">
  <lst name="stats_fields">
    <lst name="price">
      <double name="min">0.0</double>
      <double name="max">2199.0</double>
      <double name="sum">5251.2699999999995</double>
      <long name="count">15</long>
      <long name="missing">11</long>
      <double name="sumOfSquares">6038619.160300001</double>
      <double name="mean">350.08466666666664</double>
      <double name="stddev">547.737557906113</double>
    </lst>
    <lst name="popularity">
      <double name="min">0.0</double>
      <double name="max">10.0</double>
      <double name="sum">90.0</double>
      <long name="count">26</long>
      <long name="missing">0</long>
      <double name="sumOfSquares">628.0</double>
      <double name="mean">3.4615384615384617</double>
      <double name="stddev">3.5578731762756157</double>
    </lst>
  </lst>
</lst>
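A client might read such a response roughly as follows. This is a minimal sketch using Python's xml.etree.ElementTree; the function name parse_stats is hypothetical, and SAMPLE is an abbreviated copy of the response above containing only the price field.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example response (price only).
SAMPLE = """
<lst name="stats">
  <lst name="stats_fields">
    <lst name="price">
      <double name="min">0.0</double>
      <double name="max">2199.0</double>
      <double name="sum">5251.2699999999995</double>
      <long name="count">15</long>
      <long name="missing">11</long>
      <double name="sumOfSquares">6038619.160300001</double>
      <double name="mean">350.08466666666664</double>
      <double name="stddev">547.737557906113</double>
    </lst>
  </lst>
</lst>
"""


def parse_stats(xml_text):
    """Return {field name: {stat name: number}} from a stats response."""
    root = ET.fromstring(xml_text.strip())
    # Accept either the bare stats element or a larger response wrapping it.
    if root.get("name") == "stats":
        stats = root
    else:
        stats = root.find(".//lst[@name='stats']")
    fields = stats.find("lst[@name='stats_fields']")
    result = {}
    for field in fields.findall("lst"):
        values = {}
        for child in field:
            if child.tag == "long":
                values[child.get("name")] = int(child.text)
            elif child.tag == "double":
                values[child.get("name")] = float(child.text)
            # Nested <lst> children (such as "facets") are skipped in this sketch.
        result[field.get("name")] = values
    return result


price = parse_stats(SAMPLE)["price"]
print(price["count"], price["missing"], price["mean"])
```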
The same request faceted on inStock, made by adding this parameter to the URL above:
&stats.facet=inStock
<lst name="stats">
  <lst name="stats_fields">
    <lst name="price">
      <double name="min">0.0</double>
      <double name="max">2199.0</double>
      <double name="sum">5251.2699999999995</double>
      <long name="count">15</long>
      <long name="missing">11</long>
      <double name="sumOfSquares">6038619.160300001</double>
      <double name="mean">350.08466666666664</double>
      <double name="stddev">547.737557906113</double>
      <lst name="facets">
        <lst name="inStock">
          <lst name="false">
            <double name="min">11.5</double>
            <double name="max">649.99</double>
            <double name="sum">1161.39</double>
            <long name="count">4</long>
            <long name="missing">0</long>
            <double name="sumOfSquares">653369.2551</double>
            <double name="mean">290.3475</double>
            <double name="stddev">324.63444676281654</double>
          </lst>
          <lst name="true">
            <double name="min">0.0</double>
            <double name="max">2199.0</double>
            <double name="sum">4089.879999999999</double>
            <long name="count">11</long>
            <long name="missing">0</long>
            <double name="sumOfSquares">5385249.905200001</double>
            <double name="mean">371.8072727272727</double>
            <double name="stddev">621.6592938755265</double>
          </lst>
        </lst>
      </lst>
    </lst>
  </lst>
</lst>
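One way to sanity-check the faceted output is to confirm that the per-bucket values combine back into the overall price statistics. The sketch below assumes every counted document falls into exactly one inStock bucket, which appears to hold here because the bucket counts (4 and 11) add up to the overall count of 15. The helper merge_stats is hypothetical, not a Solr API, and it does not cover the missing count.

```python
import math

# Per-bucket price stats copied from the inStock facet output above.
in_stock_false = {"min": 11.5, "max": 649.99, "sum": 1161.39,
                  "count": 4, "sumOfSquares": 653369.2551}
in_stock_true = {"min": 0.0, "max": 2199.0, "sum": 4089.879999999999,
                 "count": 11, "sumOfSquares": 5385249.905200001}


def merge_stats(a, b):
    """Combine the additive and extreme-value stats of two disjoint buckets."""
    return {
        "min": min(a["min"], b["min"]),
        "max": max(a["max"], b["max"]),
        "sum": a["sum"] + b["sum"],
        "count": a["count"] + b["count"],
        "sumOfSquares": a["sumOfSquares"] + b["sumOfSquares"],
    }


merged = merge_stats(in_stock_false, in_stock_true)
# Compare against the top-level price values, allowing for floating-point rounding.
print(merged["count"] == 15)
print(math.isclose(merged["sum"], 5251.2699999999995))
print(math.isclose(merged["sumOfSquares"], 6038619.160300001))
```

The mean and stddev are not merged directly because they are not additive; they can be recomputed from the merged sum, count, and sumOfSquares.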
Notes
- Faceting can be applied selectively per field. For example, to get stats on fields "A" and "B", faceting A on "X" and B on "Y", use &stats.field=A&f.A.stats.facet=X&stats.field=B&f.B.stats.facet=Y
- Warning: as implemented, all facet results are returned, so be careful which fields you ask for!
- Multi-valued fields and facets may be slow.
- Computing statistics using stats.facet over a multi-valued field does not work properly. https://issues.apache.org/jira/browse/SOLR-1782
- Multi-valued fields rely on UnInvertedField.java for their implementation. This behaves like a FieldCache, so be aware of your memory footprint.
- Trie fields must use a precisionStep of -1 to avoid using UnInvertedField.java. Consider using one field for stats and a separate field for range faceting.
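The per-field facet syntax from the first note can be generated mechanically. A minimal sketch, with a hypothetical helper name and the placeholder field names A, B, X, and Y taken from the note:

```python
from urllib.parse import urlencode


def per_field_facet_params(field_facets):
    """Map {stats field: facet field} to stats.field / f.<field>.stats.facet pairs.

    Hypothetical helper for illustration only; it is not a Solr API.
    """
    params = []
    for field, facet in field_facets.items():
        params.append(("stats.field", field))
        # The f.<field>. prefix scopes stats.facet to that one stats field.
        params.append((f"f.{field}.stats.facet", facet))
    return urlencode(params)


query_string = per_field_facet_params({"A": "X", "B": "Y"})
print(query_string)
```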
Results
value | description
min | The minimum value
max | The maximum value
sum | Sum of all values
count | Number of non-null values
missing | Number of null (missing) values
sumOfSquares | Sum of the squares of all values (useful for computing stddev)
mean | The average: (v1 + v2 + ... + vN) / N
stddev | Standard deviation, a measure of how widely spread the values in a data set are
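To show why sumOfSquares is useful for stddev, the sketch below recomputes mean and stddev from sum, count, and sumOfSquares alone. The example values on this page are consistent with the sample (N - 1) form of the standard deviation; that is inferred from the numbers rather than stated here, so treat the formula as an assumption. The function name is hypothetical, and returning 0.0 for fewer than two values is a choice made for this sketch.

```python
import math


def mean_and_stddev(total, count, sum_of_squares):
    """Derive mean and sample standard deviation from summary values only."""
    if count == 0:
        return 0.0, 0.0
    mean = total / count
    if count == 1:
        return mean, 0.0
    # Sample variance: (N * sum(v^2) - (sum(v))^2) / (N * (N - 1))
    variance = (count * sum_of_squares - total * total) / (count * (count - 1))
    # Guard against tiny negative values caused by floating-point rounding.
    return mean, math.sqrt(max(variance, 0.0))


# Values taken from the price and popularity examples on this page.
price_mean, price_stddev = mean_and_stddev(5251.2699999999995, 15, 6038619.160300001)
pop_mean, pop_stddev = mean_and_stddev(90.0, 26, 628.0)
print(price_mean, price_stddev)
print(pop_mean, pop_stddev)
```

The results agree with the mean and stddev values in the example responses up to floating-point rounding.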