Recognizing semantic relations by combining transformers and fully connected models

Roussinov, Dmitri and Sharoff, Serge and Puchnina, Nadezhda (2020) Recognizing semantic relations by combining transformers and fully connected models. In: Calzolari, Nicoletta and Béchet, Frédéric and Blache, Philippe and Choukri, Khalid and Cieri, Christopher and Declerck, Thierry and Goggi, Sara and Isahara, Hitoshi and Maegaard, Bente and Mariani, Joseph and Mazo, Hélène and Moreno, Asuncion and Odijk, Jan and Piperidis, Stelios, eds., LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings. European Language Resources Association (ELRA), FRA, pp. 5838-5845. ISBN 9791095546344. (https://www.aclweb.org/anthology/2020.lrec-1.715/)

Text. Filename: Roussinov_etal_LREC_2021_Recognizing_semantic_relations_by_combining_transformers_and_fully_connected_models.pdf
Final Published Version
License: Creative Commons Attribution-NonCommercial 4.0


Abstract

Automatically recognizing an existing semantic relation (e.g. "is a", "part of", "property of", "opposite of", etc.) between two words (phrases, concepts, etc.) is an important task affecting many NLP applications and has been the subject of extensive experimentation and modeling. Current approaches to automatically determining whether a relation exists between two given concepts X and Y can be grouped into two types: 1) those modeling word-paths connecting X and Y in text, and 2) those modeling the distributional properties of X and Y separately, not necessarily in proximity to each other. Here, we investigate how both types can be improved and combined. We propose a distributional approach based on an attention-based transformer. We have also developed a novel word-path model that combines useful properties of a convolutional network with a fully connected language model. While our transformer-based approach works better, both our models significantly outperform the state of the art within their classes of approaches. We also demonstrate that combining the two approaches yields additional gains, since they draw on somewhat different data sources.
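
To make the distributional type of approach concrete, the following is a minimal sketch, not the authors' code: a pretrained transformer fine-tuned to classify the relation between a pair of concepts X and Y. The model name ("bert-base-uncased"), the four-relation label set, and the sentence-pair input format are illustrative assumptions, and the classification head is untrained until fine-tuned on labeled pairs.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Example label set taken from the relations named in the abstract.
RELATIONS = ["is a", "part of", "property of", "opposite of"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(RELATIONS)
)
model.eval()

def predict_relation(x: str, y: str) -> str:
    # Encode the pair as "[CLS] X [SEP] Y [SEP]" and classify from the [CLS] head.
    inputs = tokenizer(x, y, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return RELATIONS[int(logits.argmax(dim=-1))]

# Output is arbitrary until the head is fine-tuned on labeled (X, Y, relation) pairs.
print(predict_relation("wheel", "car"))

The combination of the two approach types reported in the abstract could then be realized at score level, for example by averaging this model's class probabilities with those produced by a word-path model.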