cwl_eval: An evaluation tool for information retrieval

Azzopardi, Leif and Thomas, Paul and Moffat, Alistair (2019) cwl_eval: an evaluation tool for information retrieval. In: SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, Paris, France, pp. 1321-1324. ISBN 9781450361729 (https://doi.org/10.1145/3331184.3331398)


Abstract

We present a tool ("cwl_eval") which unifies many metrics typically used to evaluate information retrieval systems using test collections. In the C/W/L framework, a metric is specified via a single function from which a number of related measurements can be derived: Expected Utility per item, Expected Total Utility, Expected Cost per item, Expected Total Cost, and Expected Depth. The C/W/L framework brings together several independent approaches for measuring the quality of a ranked list, and provides a coherent, user model-based framework for developing measures based on utility (gain) and cost. Here we outline the C/W/L measurement framework, describe the cwl_eval architecture, and provide examples of how to use it. We provide implementations of a number of recent metrics, including Time Biased Gain, U-Measure, Bejewelled Measure, and the Information Foraging Based Measure, as well as earlier metrics such as Precision, Average Precision, Discounted Cumulative Gain, Rank-Biased Precision, and INST. By providing state-of-the-art and traditional metrics within the same framework, we promote a standardised approach to evaluating search effectiveness.
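To make the abstract's central claim concrete, the following is a minimal Python sketch of how a single continuation function C(i) can be used to derive the five measurements listed above. It is not the cwl_eval implementation or its API; the helper name cwl_measurements, the gain vector, the unit default costs, the 1000-rank horizon, and the constant continuation probability p = 0.8 in the example are illustrative assumptions.

def cwl_measurements(continuation, gains, costs=None, horizon=1000):
    """Derive the C/W/L measurements from a continuation function C(i), i >= 1."""
    gains = list(gains) + [0.0] * (horizon - len(gains))
    costs = list(costs or []) + [1.0] * (horizon - len(costs or []))

    # V[i]: probability the user examines rank i+1, i.e. prod_{j<=i} C(j).
    V = [1.0]
    for i in range(1, horizon):
        V.append(V[-1] * continuation(i))
    total_mass = sum(V)

    W = [v / total_mass for v in V]                                  # share of attention at each rank
    L = [V[i] * (1.0 - continuation(i + 1)) for i in range(horizon)] # P(rank i+1 is the last examined)

    eu_per_item = sum(W[i] * gains[i] for i in range(horizon))       # Expected Utility per item
    ec_per_item = sum(W[i] * costs[i] for i in range(horizon))       # Expected Cost per item
    expected_depth = total_mass                                      # Expected Depth (items examined)

    etu = etc = cum_gain = cum_cost = 0.0
    for i in range(horizon):                                         # Expected Total Utility / Cost
        cum_gain += gains[i]
        cum_cost += costs[i]
        etu += L[i] * cum_gain
        etc += L[i] * cum_cost
    return eu_per_item, etu, ec_per_item, etc, expected_depth


# Example: a constant continuation probability (p = 0.8) recovers Rank-Biased Precision
# as the Expected Utility per item for this illustrative gain vector.
print(cwl_measurements(lambda i: 0.8, gains=[1, 0, 1, 0, 0]))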