Information Retrieval Evaluation

definition: The task of Information Retrieval (IR) systems is to find as many documents as possible that are relevant to a query, and as few irrelevant documents as possible. IR systems are evaluated using a collection of documents, a set of queries, and a set of relevance judgements for document/query pairs. Text collections typically comprise several gigabytes of data; some terabyte-sized collections are available. The fundamental measures in information retrieval are precision and recall. Precision is the proportion of relevant documents among all documents found by an IR system. Recall is the proportion of relevant documents found by a system among all relevant documents in the test collection. Since 1992, the National Institute of Standards and Technology (NIST) has organised the Text Retrieval Conference (TREC), which distributes evaluation data to participating research groups and analyses the results they submit. The TREC evaluations comprise various tracks, including spoken document retrieval, interactive retrieval, web retrieval, and question answering. Since 2000, the Cross-Language Evaluation Forum (CLEF) has organised evaluations for cross-language retrieval systems. Large-scale evaluation campaigns such as TREC and CLEF are regarded as having had a great influence on progress in the field of IR.
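The precision and recall measures defined above can be sketched in a few lines of code. This is an illustrative sketch only, not part of any TREC or CLEF evaluation toolkit; the function names and document identifiers are hypothetical.

```python
# Illustrative sketch of the two fundamental IR measures.
# "retrieved" is the set of documents returned by a system for a query;
# "relevant" is the set of documents judged relevant for that query.

def precision(retrieved, relevant):
    """Proportion of retrieved documents that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)

def recall(retrieved, relevant):
    """Proportion of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)

# Hypothetical example: the system returns 4 documents, 3 of them
# relevant; the test collection contains 5 relevant documents in total.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d2", "d3", "d7", "d9"]
print(precision(retrieved, relevant))  # 3/4 = 0.75
print(recall(retrieved, relevant))     # 3/5 = 0.6
```

A system that returns every document in the collection trivially achieves perfect recall but very low precision, which is why the two measures are always reported together.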
See also the corresponding HLT Survey chapter: http://www.lt-world.org/hlt_survey/ltw-chapter13-2.pdf
related person(s):
related publication(s):

TREC publications

Proceedings of LREC