...
- Absolute Truth / Matrix / Grid / TREC / Relevancy Assertions
- The correct answers for each search are known ahead of time
- Human judges often decide these correct answers, which are stored as Relevancy Assertions
- Can be labor intensive to set up
- ... continued on the Relevancy Assertion Testing page
- AB Testing / User Preference
- Tracks explicit or implicit user preferences between engine A and engine B
- Often dispenses with the notion of the "correct" answer
- Can be easier to set up, but some fear that both engines will miss the best answers
- continued on the AB Testing page
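As a rough illustration of how the two approaches differ in what they measure, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `ASSERTIONS` table, the document IDs, the vote labels, and the function names `precision_recall` and `tally_preferences` are hypothetical and do not come from any particular engine or test harness. The first function scores a result list against answers known ahead of time; the second only counts which engine was preferred, with no notion of a "correct" answer.

```python
from collections import Counter

# Hypothetical relevancy assertions: query -> document IDs judged correct.
ASSERTIONS = {
    "jaguar speed": {"doc1", "doc4"},
}


def precision_recall(query, returned_ids, assertions=ASSERTIONS):
    """Score one result list against the known-correct answers for a query."""
    relevant = assertions[query]
    # Drop repeated IDs (keeping order) so a duplicate hit is not counted twice.
    returned = list(dict.fromkeys(returned_ids))
    hits = sum(1 for doc_id in returned if doc_id in relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def tally_preferences(votes):
    """Count A/B preferences; each vote is "A", "B", or "tie"."""
    counts = Counter(votes)
    decided = counts["A"] + counts["B"]
    # Share of decided votes that went to engine A (ties are ignored).
    share_a = counts["A"] / decided if decided else 0.0
    return counts["A"], counts["B"], share_a
```

For example, `precision_recall("jaguar speed", ["doc1", "doc2", "doc4", "doc9"])` compares four returned documents against two asserted answers, while `tally_preferences(["A", "B", "A", "tie", "A"])` summarizes five preference votes. Note that the second function would report a winner even if neither engine returned `doc1` or `doc4`, which is the concern raised above about best answers being missed by both engines.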
Beyond Precision and Recall: How Engines are Judged
...