Function queries enable you to generate a relevancy score using the actual value of one or more numeric fields. Function queries are supported by the DisMax, Extended DisMax, and standard query parsers.
Function queries use functions. A function argument can be a constant (numeric or string literal), a field, another function, or a parameter substitution argument. You can use these functions to modify the ranking of results for users, for example based on a user's location or some other calculation.
Using Function Query
Functions must be expressed as function calls (for example, sum(a,b) instead of simply a+b).
There are several ways of using function queries in a Solr query:
Via an explicit QParser that expects function arguments, such as func or frange.
In a sort expression.
As pseudo-fields: the results of functions can be added to documents in query results.
In a parameter that is explicitly for specifying functions, such as the EDisMax query parser's boost parameter or the DisMax query parser's bf (boost function) parameter. (Note that the bf parameter actually takes a list of function queries separated by white space, each with an optional boost. Make sure you eliminate any internal white space in single function queries when using bf.)
Inline in the lucene QParser, with the _val_ keyword.
Only functions with fast random access are recommended.
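The ways of passing a function query listed above can be sketched as request parameters. This is only an illustration: the field names (popularity, price, x, y) and the search term are hypothetical, and the parameter values follow the commonly documented Solr syntax rather than anything preserved on this page.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical field names (popularity, price, x, y); adjust to your schema.
examples = {
    # Explicit QParsers that expect a function as their argument.
    "func": {"q": "{!func}div(popularity,price)"},
    "frange": {"q": "*:*", "fq": "{!frange l=0 u=10}div(popularity,price)"},
    # Sort expression.
    "sort": {"q": "*:*", "sort": "div(popularity,price) desc, score desc"},
    # Function result returned as a pseudo-field.
    "pseudo-field": {"q": "*:*", "fl": "id,sum(x,y),score"},
    # Parameters that exist specifically to carry functions.
    "boost": {"q": "ipod", "defType": "edismax", "boost": "div(popularity,price)"},
    # bf: whitespace separates the functions, so none may contain spaces.
    "bf": {"q": "ipod", "defType": "dismax",
           "bf": "ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3"},
    # Inline in the lucene QParser.
    "_val_": {"q": 'ipod _val_:"div(popularity,price)"'},
}


def to_query_string(params):
    """URL-encode a parameter dict so it can be appended to a select URL."""
    return urlencode(params)


for name, params in examples.items():
    print(f"{name}: {to_query_string(params)}")
```

URL-encoding matters here because function queries routinely contain spaces, quotes, braces, and carets.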
Available Functions
The table below summarizes the functions available for function queries.
Function  Description  Syntax Examples 

abs  Returns the absolute value of the specified value or function. 

"constant"  Specifies a floating point constant. 

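def  Short for "default". Returns the value of the specified field or, for documents that have no value in that field, the default value supplied as the second argument. 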


div  Divides one value or function by another. div(x,y) divides x by y. 

dist  Returns the distance between two vectors (points) in an n-dimensional space. Takes in the power, plus two or more ValueSource instances, and calculates the distance between the two vectors. Each ValueSource must be a number. There must be an even number of ValueSource instances passed in, and the method assumes that the first half represent the first vector and the second half represent the second vector. 

docfreq(field,val)  Returns the number of documents that contain the term in the field. This is a constant (the same value for all documents in the index). 

field  Returns the numeric docValues or indexed value of the field with the specified name. In its simplest (single argument) form, this function can only be used on single valued fields, and can be called using the name of the field as a string or, for most conventional field names, by simply using the field name by itself without the field(...) syntax. When using docValues, an optional 2nd argument can be specified to select the "min" or "max" value of multivalued fields. 0 is returned for documents without a value in the field.  These 3 examples are all equivalent:
The last form is convenient when your field name is atypical:
For multivalued docValues fields:

hsin  The Haversine distance calculates the distance between two points on a sphere when traveling along the sphere. The values must be in radians. 

idf  Inverse document frequency; a measure of whether the term is common or rare across all documents. Obtained by dividing the total number of documents by the number of documents containing the term, and then taking the logarithm of that quotient. See also tf. 

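if  Enables conditional function queries. In if(test,value1,value2), test is a logical value or an expression that returns one, value1 is the value returned when test is true, and value2 is the value returned when test is false. An expression can be any function which outputs boolean values, or even a function returning numeric values, in which case the value 0 is interpreted as false, or strings, in which case the empty string is interpreted as false. 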

linear  Implements m*x+c, where m and c are constants and x is an arbitrary function. 

log  Returns the log base 10 of the specified function. 

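map  Maps any values of an input function x that fall within min and max inclusive to the specified target. The arguments min and max must be constants. The arguments target and default can be constants or functions. If the value of x does not fall between min and max, then either the value of x is returned, or a default value is returned if one is specified as a 5th argument. 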

max  Returns the maximum numeric value of multiple nested functions or constants, which are specified as arguments. (Use the field(myfield,max) syntax to select the maximum value of a single multivalued field.) 

maxdoc  Returns the number of documents in the index, including those that are marked as deleted but have not yet been purged. This is a constant (the same value for all documents in the index). 

min  Returns the minimum numeric value of multiple nested functions or constants, which are specified as arguments. (Use the field(myfield,min) syntax to select the minimum value of a single multivalued field.)  min(myfield,myotherfield,0) 
ms  Returns milliseconds of difference between its arguments. Dates are relative to the Unix or POSIX time epoch, midnight, January 1, 1970 UTC. Arguments may be the name of an indexed date field, or date math based on a constant date or NOW. 


norm(field)  Returns the "norm" stored in the index for the specified field. This is the product of the index time boost and the length normalization factor, according to the Similarity for the field. 

numdocs  Returns the number of documents in the index, not including those that are marked as deleted but have not yet been purged. This is a constant (the same value for all documents in the index). 

ord  Returns the ordinal of the indexed field value within the indexed list of terms for that field in Lucene index order (lexicographically ordered by Unicode value), starting at 1. In other words, for a given field, all values are ordered lexicographically; this function then returns the offset of a particular value in that ordering. The field must have a maximum of one value per document (not multivalued). 0 is returned for documents without a value in the field.
See also rord. 

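payload  Returns the float value computed from the decoded payloads of the term specified. The return value is computed using the min, max, or average of the decoded payloads; the special first function returns the first decoded payload found. 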
 payload(payloaded_field_dpf,term,0.0,first) 
pow  Raises the specified base to the specified power. 

product  Returns the product of multiple values or functions, which are specified in a comma-separated list. 

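query  Returns the score for the given subquery, or the default value for documents not matching the query. Any type of subquery is supported, through either parameter dereferencing ($otherparam) or direct specification of the query string in the local parameters through the v key. 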

recip  Performs a reciprocal function, with recip(x,m,a,b) implementing a/(m*x+b), where m, a, and b are constants and x is any arbitrarily complex function. When a and b are equal, and x>=0, this function has a maximum value of 1 that drops as x increases. Increasing the value of a and b together results in a movement of the entire function to a flatter part of the curve. These properties can make this an ideal function for boosting more recent documents when x is rord(datefield). 

rord  Returns the reverse ordering of that returned by ord. 

scale  Scales values of the function x such that they fall between the specified minTarget and maxTarget inclusive. The current implementation cannot distinguish documents that have been deleted from documents that have no value; it uses 0.0 values for both cases. This means that if values are normally all greater than 0.0, one can still end up with 0.0 as the min value to map from. In these cases, an appropriate map() function could be used as a workaround to change 0.0 to a value in the real range, as shown here: 

sqedist  The Square Euclidean distance calculates the 2-norm (Euclidean distance) but does not take the square root, thus saving a fairly expensive operation. It is often the case that applications that care about Euclidean distance do not need the actual distance, but instead can use the square of the distance. There must be an even number of ValueSource instances passed in, and the method assumes that the first half represent the first vector and the second half represent the second vector. 

sqrt  Returns the square root of the specified value or function. 

strdist  Calculates the distance between two strings. Uses the Lucene spell checker StringDistance interface. 

sub  Returns x-y from sub(x,y). 

sum  Returns the sum of multiple values or functions, which are specified in a comma-separated list. 

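sumtotaltermfreq  Returns the sum of totaltermfreq values for all terms in the field in the entire index (that is, the number of indexed tokens for that field).  If doc1:(fieldX:A B C) and doc2:(fieldX:A A A A): 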
termfreq  Returns the number of times the term appears in the field for that document. 

tf  Term frequency; returns the term frequency factor for the given term, using the Similarity for the field. The 

top  Causes the function query argument to derive its values from the top-level IndexReader containing all parts of an index. For example, the ordinal of a value in a single segment will be different from the ordinal of that same value in the complete index. The ord() and rord() functions implicitly use top(), so ord(foo) is equivalent to top(ord(foo)). 

totaltermfreq  Returns the number of times the term appears in the field in the entire index. (Aliases totaltermfreq as ttf.) 
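To make the arithmetic behind a few of the functions in the table concrete, here is a minimal Python sketch. It is not Solr's implementation. It assumes the commonly documented forms linear(x,m,c) = m*x+c, recip(x,m,a,b) = a/(m*x+b), and map(x,min,max,target,default). The dist sketch only handles positive powers and infinity, and hsin takes a radius plus two latitude/longitude pairs in radians; those argument orders are assumptions made for illustration.

```python
import math


def linear(x, m, c):
    """linear(x,m,c): m*x + c."""
    return m * x + c


def recip(x, m, a, b):
    """recip(x,m,a,b): a / (m*x + b)."""
    return a / (m * x + b)


def map_range(x, lo, hi, target, default=None):
    """map(x,min,max,target[,default]): target when lo <= x <= hi (inclusive)."""
    if lo <= x <= hi:
        return target
    return x if default is None else default


def _split_vectors(coords):
    """First half of the arguments is one vector, second half the other."""
    if len(coords) < 2 or len(coords) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    half = len(coords) // 2
    return coords[:half], coords[half:]


def dist(power, *coords):
    """p-norm distance; power 1 is Manhattan, 2 is Euclidean, inf is the max norm."""
    if power <= 0:
        raise ValueError("this sketch only handles positive powers")
    first, second = _split_vectors(coords)
    diffs = [abs(p - q) for p, q in zip(first, second)]
    if math.isinf(power):
        return max(diffs)
    return sum(d ** power for d in diffs) ** (1.0 / power)


def sqedist(*coords):
    """Squared Euclidean distance: the 2-norm without the square root."""
    first, second = _split_vectors(coords)
    return sum((p - q) ** 2 for p, q in zip(first, second))


def hsin(radius, lat1, lon1, lat2, lon2):
    """Haversine distance on a sphere; all angles are in radians."""
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))
```

With a and b equal, recip(0,1,1000,1000) is 1 and the value falls as x grows, which is the property the recip entry relies on for boosting recent documents.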

The following functions are boolean: they return true or false. They are mostly useful as the first argument of the if function, and some of these can be combined. If used somewhere else, they will yield a '1' or '0'.
Function  Description  Syntax Examples 

and  Returns a value of true if and only if all of its operands evaluate to true. 

or  A logical disjunction. 

xor  Logical exclusive disjunction, or one or the other but not both. 

not  The logically negated value of the wrapped function. 

exists  Returns TRUE if any member of the field exists. 

gt, gte, lt, lte, eq  5 comparison functions: Greater Than, Greater Than or Equal, Less Than, Less Than or Equal, Equal  if(lt(ms(mydatefield),315569259747),0.8,1) translates to this pseudocode: if mydatefield < 315569259747 then 0.8 else 1 
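The truthiness rules described for if (0 and the empty string count as false) and the way the boolean functions combine can be sketched in Python. The helper names below (truthy, if_, and_, and so on) are hypothetical stand-ins for illustration, not Solr APIs, and the sample date value is made up.

```python
def truthy(value):
    """Interpret a value as the text describes: 0 and "" are false."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, str):
        return value != ""
    raise TypeError(f"unsupported value: {value!r}")


def if_(test, value1, value2):
    """if(test,value1,value2): value1 when test is true, otherwise value2."""
    return value1 if truthy(test) else value2


def and_(*operands):
    return all(truthy(o) for o in operands)


def or_(*operands):
    return any(truthy(o) for o in operands)


def xor(a, b):
    return truthy(a) != truthy(b)


def not_(a):
    return not truthy(a)


def lt(a, b):
    return a < b


def as_number(flag):
    """A boolean used outside of if() yields 1 or 0."""
    return 1.0 if truthy(flag) else 0.0


# Mirrors if(lt(ms(mydatefield),315569259747),0.8,1) with a made-up field value.
mydatefield_ms = 100_000_000_000
boost = if_(lt(mydatefield_ms, 315569259747), 0.8, 1)
print(boost)
```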
Example Function Queries
To give you a better understanding of how function queries can be used in Solr, suppose an index stores the dimensions in meters x, y, z of some hypothetical boxes, with arbitrary names stored in the field boxname. Suppose we want to search for boxes matching the name findbox, ranked according to the volumes of the boxes. The query parameters would be:
q=boxname:findbox _val_:"product(x,y,z)"
This query will rank the results based on volumes. In order to get the computed volume, you will need to request the score, which will contain the resultant volume:
&fl=*, score
Suppose that you also have a field storing the weight of the box as weight. To sort by the density of the box and return the value of the density in score, you would submit the following query:
http://localhost:8983/solr/select/?q=boxname:findbox _val_:"div(weight,product(product(x,y),z))"
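As a rough illustration of what the volume and density functions compute, the sketch below evaluates product(x,y,z) and div(weight,product(x,y,z)) in plain Python over a few made-up box documents and ranks them. The sample data is hypothetical, and the sketch ignores the contribution that the boxname:findbox clause itself would make to the score.

```python
from urllib.parse import urlencode

# Hypothetical documents; dimensions in meters, weight in arbitrary units.
boxes = [
    {"id": "a", "boxname": "findbox", "x": 1.0, "y": 2.0, "z": 3.0, "weight": 12.0},
    {"id": "b", "boxname": "findbox", "x": 2.0, "y": 2.0, "z": 2.0, "weight": 4.0},
    {"id": "c", "boxname": "findbox", "x": 0.5, "y": 1.0, "z": 1.0, "weight": 5.0},
]


def volume(box):
    """Equivalent of product(x,y,z)."""
    return box["x"] * box["y"] * box["z"]


def density(box):
    """Equivalent of div(weight,product(x,y,z))."""
    return box["weight"] / volume(box)


# Highest function value first, as a relevancy ranking would order them.
by_volume = sorted(boxes, key=volume, reverse=True)
by_density = sorted(boxes, key=density, reverse=True)

# The corresponding request parameters, URL-encoded for use in a select URL.
params = {"q": 'boxname:findbox _val_:"div(weight,product(x,y,z))"', "fl": "*,score"}
print(urlencode(params))
print([box["id"] for box in by_density])
```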
Sort By Function
You can sort your query results by the output of a function. For example, to sort results by distance, you could enter:
Sort by function also supports pseudo-fields: fields can be generated dynamically and returned in results as though they were normal fields in the index. For example,
&fl=id,sum(x, y),score
would return each document's id, the computed value of sum(x, y) as a pseudo-field, and the score.
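A sort-by-function request combined with a pseudo-field might be assembled as in the sketch below. The helper name sort_by_function is hypothetical, and the total: prefix relies on the field-aliasing syntax of the fl parameter, which is assumed here rather than taken from this page.

```python
from urllib.parse import urlencode, parse_qs


def sort_by_function(function, direction="asc", q="*:*", fl=None):
    """Build select parameters that sort results by a function's output."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    params = {"q": q, "sort": f"{function} {direction}"}
    if fl:
        params["fl"] = fl
    return params


# Sort by sum(x,y) and also return it under the (assumed) alias "total".
params = sort_by_function("sum(x,y)", "desc", fl="id,total:sum(x,y),score")
query_string = urlencode(params)
print(query_string)
```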