Here are some existing test collections, with queries and relevance judgements, that can be downloaded from the internet. It might be worth improving the Lucene benchmark package so that it can easily download these collections and run the evaluations.
Hamshahri: http://ece.ut.ac.ir/dbrg/Hamshahri/
Mahak: http://ce.sharif.edu/~shesmail/Mahak/
Several small ones here: http://www.cs.utk.edu/~lsi/corpa.html
LISA: http://ir.dcs.gla.ac.uk/resources/test_collections/lisa/
NPL: http://ir.dcs.gla.ac.uk/resources/test_collections/npl/
TREC-5 confusion: http://trec.nist.gov/data/t5_confusion.html
TREC-9 filtering: http://trec.nist.gov/data/t9_filtering.html
braun corpus: http://ilps.science.uva.nl/resources/hdr
Tempo and Kompas: http://ilps.science.uva.nl/resources/bahasa
Jelita Asian's corpus: http://goanna.cs.rmit.edu.au/~jelita/corpus.html
> So I hope this use of your corpus is acceptable to you (it is not for any
> commercial purpose, just to improve lucene).
>
Yes, that is all right. That is what my corpus made for
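
As a rough illustration of the kind of evaluation mentioned at the top of this page, here is a minimal Python sketch that scores a ranked result list against relevance judgements. It assumes the judgements are in the TREC qrels layout (`topic iteration docno relevance`, one per line); several of the collections listed above use other formats and would need their own parser. The function names `parse_qrels` and `average_precision` are hypothetical and are not part of the Lucene benchmark package.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Parse TREC-style qrels lines: 'topic iteration docno relevance'.

    Returns a dict mapping each topic id to the set of relevant docnos.
    Assumption: any relevance value greater than zero counts as relevant.
    """
    relevant = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic, _iteration, docno, rel = parts
        if int(rel) > 0:
            relevant[topic].add(docno)
    return relevant


def average_precision(ranked_docnos, relevant_docnos):
    """Average precision of one ranked list for one topic."""
    if not relevant_docnos:
        return 0.0
    seen = set()
    hits = 0
    precision_sum = 0.0
    rank = 0
    for docno in ranked_docnos:
        if docno in seen:
            continue  # ignore duplicate entries in the ranking
        seen.add(docno)
        rank += 1
        if docno in relevant_docnos:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_docnos)


# Tiny made-up example (not taken from any of the collections above).
example_qrels = parse_qrels(["1 0 d1 1", "1 0 d2 0", "1 0 d3 1"])
example_ap = average_precision(["d1", "d2", "d3"], example_qrels["1"])
```

Averaging `average_precision` over all topics of a collection gives mean average precision, one common way to compare ranking changes on a fixed test collection.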