Apache Solr Documentation

This Unreleased Guide Will Cover Apache Solr 6.6

Query Re-Ranking allows you to run a simple query (A) for matching documents and then re-rank the top N documents using the scores from a more complex query (B). Since the more costly ranking from query B is only applied to the top N documents, it will have less impact on performance than just using the complex query B by itself. The trade-off is that documents which score very low using the simple query A may not be considered during the re-ranking phase, even if they would score very highly using query B.

Specifying A Ranking Query

A ranking query can be specified using the "rq" request parameter. The "rq" parameter must specify a query string that, when parsed, produces a RankQuery. Three rank queries are currently included in the Solr distribution. You can also configure a custom QParserPlugin you have written, but most users can just use a parser provided with Solr.

Parser    QParserPlugin class
rerank    ReRankQParserPlugin
xport     ExportQParserPlugin
ltr       LTRQParserPlugin

ReRank Query Parser

The "rerank" parser wraps a query specified by a local parameter, along with additional parameters indicating how many documents should be re-ranked and how the final scores should be computed:

Parameter       Default        Description
reRankQuery     (Mandatory)    The query string for your complex ranking query. In most cases a variable will be used to refer to another request parameter.
reRankDocs      200            The number of top N documents from the original query that should be re-ranked. This number is treated as a minimum, and may be increased internally in order to rank enough documents to satisfy the query (i.e., start+rows).
reRankWeight    2.0            A multiplicative factor that is applied to the score from the reRankQuery for each of the top matching documents, before that score is added to the original score.
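
The "minimum" behavior of reRankDocs can be pictured with a small sketch. It assumes, based only on the description above, that the effective number of re-ranked documents is at least start+rows. The function name is hypothetical and this is not Solr's actual implementation.

```python
def effective_rerank_docs(rerank_docs=200, start=0, rows=10):
    """Illustrative only: how many top documents end up being re-ranked.

    Assumption: reRankDocs is raised when the requested page
    (start + rows) reaches past it, so the whole page is re-ranked.
    """
    return max(rerank_docs, start + rows)


# First page: the default of 200 already covers rows 0..9.
print(effective_rerank_docs(rerank_docs=200, start=0, rows=10))
# Deep page: start+rows exceeds reRankDocs, so more documents are re-ranked.
print(effective_rerank_docs(rerank_docs=200, start=300, rows=10))
```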

In the example below, the top 1000 documents matching the query "greetings" will be re-ranked using the query "(hi hello hey hiya)". The resulting score for each of those 1000 documents will be 3 times its score from the "(hi hello hey hiya)" query, plus its score from the original "greetings" query:
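
A request matching this description might look like the following sketch, which builds the query string in Python. The "rqq" parameter name follows the convention of using a variable to refer to another request parameter. The host and the "techproducts" collection name are assumptions for illustration only.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host and collection for your deployment.
SOLR_SELECT_URL = "http://localhost:8983/solr/techproducts/select"


def build_rerank_params(main_query, rerank_query, rerank_docs=200, rerank_weight=2.0):
    """Return request parameters that re-rank the top results of main_query."""
    return {
        "q": main_query,
        # The rerank parser reads the complex query through the $rqq reference.
        "rq": "{!rerank reRankQuery=$rqq reRankDocs=%d reRankWeight=%s}"
        % (rerank_docs, rerank_weight),
        "rqq": rerank_query,
    }


params = build_rerank_params(
    "greetings", "(hi hello hey hiya)", rerank_docs=1000, rerank_weight=3
)
# urlencode takes care of escaping the braces, spaces, and parentheses.
url = SOLR_SELECT_URL + "?" + urlencode(params)
print(url)
```

Keeping the complex query in its own parameter avoids quoting problems inside the local-params block.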

If a document matches the original query but does not match the re-ranking query, the document keeps its original score.
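
The score arithmetic described above can be sketched as follows. This is a simplified model of the documented behavior (original score plus reRankWeight times the reRankQuery score), not Solr's actual code. The function name and the use of None for "no match" are assumptions made for the example.

```python
def combined_score(original_score, rerank_score, rerank_weight=2.0):
    """Illustrative final score for one of the top N documents.

    rerank_score is None when the document does not match the
    re-ranking query, in which case the original score is kept.
    """
    if rerank_score is None:
        return original_score
    return original_score + rerank_weight * rerank_score


# Document matches both queries, with reRankWeight=3: 1.5 + 3 * 0.5
print(combined_score(1.5, 0.5, rerank_weight=3))
# Document matches only the original query: score is unchanged.
print(combined_score(1.5, None, rerank_weight=3))
```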

LTR Query Parser

The "ltr" parser name stands for Learning To Rank. Please see Learning To Rank for more detailed information.

Combining Ranking Queries With Other Solr Features

The "rq" parameter and the re-ranking feature in general work well with other Solr features. For example, re-ranking can be used in conjunction with the collapse parser to re-rank the group heads after they've been collapsed. It also preserves the order of documents elevated by the elevation component. And it even has its own custom explain, so you can see how the re-ranking scores were derived when looking at debug information.
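
As a rough illustration of combining these features, the sketch below builds request parameters that collapse results on a field, re-rank the group heads, and ask for debug output. The field name "group_id" and the function name are hypothetical, and reRankDocs and reRankWeight are left at their defaults.

```python
from urllib.parse import urlencode


def build_collapse_rerank_params(main_query, rerank_query, collapse_field):
    """Return parameters that collapse on a field and re-rank the group heads."""
    return {
        "q": main_query,
        # Collapse to one document (the group head) per distinct field value.
        "fq": "{!collapse field=%s}" % collapse_field,
        # Re-rank the collapsed results using the query held in $rqq.
        "rq": "{!rerank reRankQuery=$rqq}",
        "rqq": rerank_query,
        # Request debug information, including the re-ranking explain.
        "debugQuery": "true",
    }


collapse_params = build_collapse_rerank_params(
    "greetings", "(hi hello hey hiya)", "group_id"
)
print(urlencode(collapse_params))
```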

6 Comments

  1. Joel: as someone unfamiliar with the code, and trying to view this doc from the perspective of a solr user, i'm having a lot of trouble making sense of this page on its own - or even in conjunction with the RankQuery API page

    If I'm understanding correctly, the key things users need to understand about query reranking are:

      • use the "rq" param to indicate that you want re-ranking done (i guess this is interpreted by SearchHandler?)

      • when using rq, the "rerank" parser must be specified
        • is rerank the default parser for "rq"?
        • in general any parser that returns RankQuery is ok, but rerank is the only one (i think?) included with Solr that does so, and writing custom plugins is usually outside the scope of the ref guide
      • reRankDocs is a localparam that the "rerank" parser looks for, telling you how many of the top scoring docs you want to re-rank
        • does it have a default value?
      • reRankQuery is a localparam that the "rerank" parser looks for and is a nested query that decides the scores of the docs that get re-ranked
        • why isn't this just "v" like other parser wrappers?
        • since it's not v, how does this parser behave if you specify a body? ie: what does this do with the "foo" body? ....
          • rq={!rerank reRankDocs=100 reRankQuery=$rqq reRankWeight=3}foo
      • reRankWeight is a localparam that the "rerank" parser looks for that affects the score in some way
        • how exactly does this affect things? is it optional? what is the default value?

    In general, it feels like this page should be added as a child of Searching with a very cursory mention of ReRankQParserPlugin added to Other Parsers that just mentions its existence and links back here – because (i think?) there are no other uses for the "rerank" parser ... correct?

     

    can you please verify some of the assumptions i'm making here, and provide answers for some of these questions so we can think about how to refactor/refine this page?

    1. I went ahead and revised the page after reviewing the code to answer my questions

  2. Hoss, sorry it took so long to respond to this. The page looks great. I just made an update to this page based on this thread:

    http://www.mail-archive.com/solr-user@lucene.apache.org/msg100870.html

     

     

  3. The result paging section of this page can be removed.

    This issue was resolved in SOLR-6323.

    -- Done

     

  4. I have data for stores and their products. I want to search for products by nearest location, but sorting returns all of the products from the nearest store first. Instead, I want the results to show some products from one store, then some products from the next store, and so on, i.e. a mix of products from the nearest store and the following stores. How do I do that?

  5. Just to add to the below query....


    I have data like this:

    Store Id, Product Id, Distance, Rank
    Store1, 12345, distance-1.5, rank-2
    Store1, 12346, distance-1.5, rank-85
    Store1, 12347, distance-1.5, rank-44
    Store1, 12348, distance-1.5, rank-3
    Store1, 12349, distance-1.5, rank-7
    store2, 23453, distance-2.6, rank-1
    store2, 23454, distance-2.6, rank-55
    store2, 23455, distance-2.6, rank-25
    store2, 23456, distance-2.6, rank-5
    store2, 23457, distance-2.6, rank-4

    I want the result to look like this:

    Store Id, Product Id, Distance, Rank
    Store1, 12345, distance-1.5, rank-2
    Store1, 12348, distance-1.5, rank-3
    Store1, 12349, distance-1.5, rank-7
    store2, 23453, distance-2.6, rank-1
    store2, 23457, distance-2.6, rank-4
    store2, 23456, distance-2.6, rank-5

    Please help me find a solution to this.