A test collection for evaluating retrieval of studies for inclusion in systematic reviews

Scells, Harrisen and Zuccon, Guido and Koopman, Bevan and Deacon, Anthony and Azzopardi, Leif and Geva, Shlomo (2017) A test collection for evaluating retrieval of studies for inclusion in systematic reviews. In: SIGIR '17 Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, JPN, pp. 1237-1240. ISBN 9781450350228 (https://doi.org/10.1145/3077136.3080707)

Text. Filename: Scells_etal_SIGIR2017_A_test_collection_for_evaluating_retrieval_of_studies_for_inclusion.pdf
Accepted Author Manuscript

Abstract

This paper introduces a test collection for evaluating the effectiveness of different methods used to retrieve research studies for inclusion in systematic reviews. Systematic reviews appraise and synthesise studies that meet specific inclusion criteria. Systematic reviews intended for a biomedical science audience rely on Boolean queries with many, often complex, search clauses to retrieve studies; the retrieved studies are then manually screened to determine eligibility for inclusion in the review. This process is expensive and time-consuming. The development of systems that improve retrieval effectiveness will have an immediate impact by reducing the complexity and resources required for this process. Our test collection consists of approximately 26 million research studies extracted from the freely available MEDLINE database, 94 review (query) topics extracted from Cochrane systematic reviews, and corresponding relevance assessments. We describe the tasks for which the collection can be used to evaluate information retrieval systems, and we demonstrate its use by evaluating common baselines on one such task. The test collection is available at https://github.com/ielab/SIGIR2017-PICO-Collection.
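
To make concrete the kind of evaluation the abstract refers to, the following is a minimal sketch of scoring the unranked set of studies returned by one Boolean query against the relevance assessments for one review topic. It assumes set-based precision, recall and F1 are the measures of interest; the abstract does not name the measures, and the function name `evaluate_set_retrieval` and the document identifiers are hypothetical, not taken from the collection or its file formats.

```python
from typing import Dict, Set


def evaluate_set_retrieval(retrieved: Set[str], relevant: Set[str]) -> Dict[str, float]:
    """Compute set-based precision, recall and F1 for a single topic.

    `retrieved` holds the identifiers returned by a query; `relevant` holds
    the identifiers judged relevant (e.g. studies included in the review).
    """
    true_positives = len(retrieved & relevant)
    # Guard against empty sets so that a topic with no results scores 0.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: these identifiers are invented for illustration only.
retrieved_by_query = {"d1", "d2", "d3", "d4"}
included_in_review = {"d2", "d4", "d7"}

scores = evaluate_set_retrieval(retrieved_by_query, included_in_review)
print(scores)
```

In this toy case two of the four retrieved studies are relevant and two of the three relevant studies are retrieved, which illustrates the trade-off the abstract alludes to: a broad Boolean query raises recall at the cost of more studies to screen manually.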