Improving BERT-based query-by-document retrieval with multi-task optimization

Abolghasemi, Amin and Verberne, Suzan and Azzopardi, Leif; Hagen, Matthias and Verberne, Suzan and Macdonald, Craig and Seifert, Christin and Balog, Krisztian and Nørvåg, Kjetil and Setty, Vinay, eds. (2022) Improving BERT-based query-by-document retrieval with multi-task optimization. In: Advances in Information Retrieval. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer, NOR, pp. 3-12. ISBN 9783030997397 (https://doi.org/10.1007/978-3-030-99739-7_1)

Text. Filename: Abolghasemi-etal-Springer-2020-Improving-BERT-based-query-by-document-retrieval.pdf
Accepted Author Manuscript
License: Strathprints license 1.0


Abstract

Query-by-document (QBD) retrieval is an Information Retrieval task in which a seed document acts as the query and the goal is to retrieve related documents; it is particularly common in professional search tasks. In this work we improve the retrieval effectiveness of the BERT re-ranker, proposing an extension to its fine-tuning step to better exploit the context of queries. To this end, we use an additional document-level representation learning objective besides the ranking objective when fine-tuning the BERT re-ranker. Our experiments on two QBD retrieval benchmarks show that the proposed multi-task optimization significantly improves the ranking effectiveness without changing the BERT re-ranker or using additional training samples. In future work, the generalizability of our approach to other retrieval tasks should be further investigated.
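
To illustrate the general idea of combining a ranking objective with a document-level representation objective, here is a minimal sketch in plain Python. The abstract does not specify the loss functions, so everything here is an assumption for illustration: the pairwise softmax ranking loss, the cosine-distance triplet loss, the `weight` and `margin` parameters, and all function names are hypothetical and may differ from what the paper uses.

```python
import math


def pairwise_ranking_loss(pos_score, neg_score):
    """Assumed ranking objective: softmax cross-entropy over the scores of
    one relevant and one non-relevant candidate document."""
    # -log(exp(pos) / (exp(pos) + exp(neg))) == log(1 + exp(neg - pos))
    return math.log1p(math.exp(neg_score - pos_score))


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def representation_loss(query_vec, pos_vec, neg_vec, margin=1.0):
    """Assumed document-level objective: a triplet loss that pulls the query
    document's vector towards the relevant document and away from the
    non-relevant one, using cosine distance."""
    dist_pos = 1.0 - cosine(query_vec, pos_vec)
    dist_neg = 1.0 - cosine(query_vec, neg_vec)
    return max(0.0, dist_pos - dist_neg + margin)


def multi_task_loss(pos_score, neg_score, query_vec, pos_vec, neg_vec,
                    weight=1.0, margin=1.0):
    """Weighted sum of the two objectives; `weight` is a hypothetical
    hyperparameter balancing ranking against representation learning."""
    rank = pairwise_ranking_loss(pos_score, neg_score)
    rep = representation_loss(query_vec, pos_vec, neg_vec, margin)
    return rank + weight * rep
```

In an actual fine-tuning setup the scores and vectors would come from the BERT re-ranker for each training triple, and the combined loss would be minimised by gradient descent. Because both terms are computed from the same training triples, a scheme of this shape needs no additional training samples, in line with the abstract's description.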

ORCID iDs

Abolghasemi, Amin, Verberne, Suzan and Azzopardi, Leif ORCID: https://orcid.org/0000-0002-6900-0557; Hagen, Matthias, Verberne, Suzan, Macdonald, Craig, Seifert, Christin, Balog, Krisztian, Nørvåg, Kjetil and Setty, Vinay