Algorithmic bias : do good systems make relevant documents more retrievable?

Wilkie, Colin and Azzopardi, Leif; (2017) Algorithmic bias : do good systems make relevant documents more retrievable? In: CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management. ACM, SGP, pp. 2375-2378. ISBN 9781450349185 (https://doi.org/10.1145/3132847.3133135)

Accepted Author Manuscript

Abstract

Algorithmic bias presents a difficult challenge within Information Retrieval. Long has it been known that certain algorithms favour particular documents due to attributes of these documents that are not directly related to relevance. The evaluation of bias has recently been made possible through the use of retrievability, a quantifiable measure of bias. While evaluating bias is relatively novel, the evaluation of performance has been common since the dawn of the Cranfield approach and TREC. To evaluate performance, a pool of documents to be judged by human assessors is created from the collection. This pooling approach has faced accusations of bias because state-of-the-art algorithms were used to create it, so biases associated with these algorithms may be included in the pool. The introduction of retrievability has provided a mechanism to evaluate the bias of these pools. This work evaluates the varying degrees of bias present in the groups of relevant and non-relevant documents for topics. The differentiating power of a system is also evaluated by examining the documents from the pool that are retrieved for each topic. The analysis finds that the systems that perform better tend to have a higher chance of retrieving a relevant document rather than a non-relevant document for a topic prior to retrieval, indicating that retrieval systems which perform better at TREC are already predisposed to agree with the judgements regardless of the query posed.