Improving BERT-based query-by-document retrieval with multi-task optimization

Abolghasemi, Amin and Verberne, Suzan and Azzopardi, Leif; Hagen, Matthias and Verberne, Suzan and Macdonald, Craig and Seifert, Christin and Balog, Krisztian and Nørvåg, Kjetil and Setty, Vinay, eds. (2022) Improving BERT-based query-by-document retrieval with multi-task optimization. In: Advances in Information Retrieval. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer, NOR, pp. 3-12. ISBN 9783030997397 (https://doi.org/10.1007/978-3-030-99739-7_1)

Text. Filename: Abolghasemi-etal-Springer-2020-Improving-BERT-based-query-by-document-retrieval.pdf
Accepted Author Manuscript
License: Strathprints license 1.0

Abstract

Query-by-document (QBD) retrieval is an Information Retrieval task in which a seed document acts as the query and the goal is to retrieve related documents; it is particularly common in professional search tasks. In this work we improve the retrieval effectiveness of the BERT re-ranker by proposing an extension to its fine-tuning step that better exploits the context of queries. To this end, we add a document-level representation learning objective to the ranking objective when fine-tuning the BERT re-ranker. Our experiments on two QBD retrieval benchmarks show that the proposed multi-task optimization significantly improves ranking effectiveness without changing the BERT re-ranker or using additional training samples. In future work, the generalizability of our approach to other retrieval tasks should be further investigated.
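
The idea of fine-tuning with two objectives at once can be illustrated with a minimal sketch. The abstract does not specify the loss functions or how they are combined, so everything concrete here is an assumption made for illustration: a pairwise softmax cross-entropy as the ranking objective, a triplet margin loss on cosine distance as the document-level representation objective, a weighted sum as the combination, and the hypothetical names `ranking_loss`, `representation_loss`, `multi_task_loss`, `weight` and `margin`. Plain Python lists stand in for the scores and document vectors that a BERT re-ranker would produce.

```python
import math


def ranking_loss(score_pos, score_neg):
    """Pairwise softmax cross-entropy (assumed ranking objective).

    Equals -log(exp(s+) / (exp(s+) + exp(s-))) = log(1 + exp(s- - s+)),
    so it shrinks as the relevant candidate outscores the non-relevant one.
    """
    return math.log1p(math.exp(score_neg - score_pos))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def representation_loss(query_vec, pos_vec, neg_vec, margin=1.0):
    """Triplet margin loss on cosine distance (assumed representation objective).

    Pulls the query document's vector towards the relevant document's vector
    and away from the non-relevant document's vector.
    """
    dist_pos = 1.0 - cosine(query_vec, pos_vec)
    dist_neg = 1.0 - cosine(query_vec, neg_vec)
    return max(0.0, dist_pos - dist_neg + margin)


def multi_task_loss(score_pos, score_neg, query_vec, pos_vec, neg_vec,
                    weight=1.0, margin=1.0):
    """Weighted sum of the two objectives (the weighting scheme is an assumption)."""
    rank = ranking_loss(score_pos, score_neg)
    rep = representation_loss(query_vec, pos_vec, neg_vec, margin)
    return rank + weight * rep


# Toy inputs: two re-ranker scores and three 2-dimensional document vectors.
query, relevant, non_relevant = [1.0, 0.0], [0.9, 0.1], [0.1, 0.9]
loss = multi_task_loss(2.0, -1.0, query, relevant, non_relevant, weight=0.5)
print(f"combined loss: {loss:.4f}")
```

In this sketch both terms are computed from the same training triple, which mirrors the abstract's point that the extra objective needs no additional training samples; setting `weight=0.0` reduces the combined loss to the ranking objective alone.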