Role of punctuation in semantic mapping between brain and transformer models

Lamprou, Zenon and Pollick, Frank and Moshfeghi, Yashar; Nicosia, Giuseppe and Giuffrida, Giovanni and Ojha, Varun and La Malfa, Emanuele and La Malfa, Gabriele and Pardalos, Panos and Di Fatta, Giuseppe and Umeton, Renato, eds. (2023) Role of punctuation in semantic mapping between brain and transformer models. In: Machine Learning, Optimization, and Data Science - 8th International Conference, LOD 2022, Revised Selected Papers. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer Science and Business Media Deutschland GmbH, ITA, pp. 458-472. ISBN 9783031258916 (https://doi.org/10.1007/978-3-031-25891-6_35)

Text. Filename: Lamprou_etal_ACAIN_2022_Role_of_punctuation_in_semantic_mapping_between_brain_and_transformer_models.pdf
Accepted Author Manuscript
Restricted to Repository staff only until 10 March 2025.
License: Strathprints license 1.0

Abstract

Modern neural networks specialised in natural language processing (NLP) are not implemented with any explicit rules regarding language. It has been hypothesised that they might learn something generic about language. Because of this property, much research has been conducted on interpreting their inner representations. A novel approach has utilised an experimental procedure that uses human brain recordings to investigate whether a mapping from brain to neural network representations can be learned. Since this approach was introduced, more advanced NLP models have become available. In this research we use the approach to test four new NLP models in order to find the most brain-aligned model. Moreover, in our effort to unravel important information on how the brain processes text semantically, we modify the text in the hope of getting a better mapping out of the models. We remove punctuation using four different scenarios to determine the effect of punctuation on semantic understanding by the human brain. Our results show that RoBERTa is the most brain-aligned model. It achieves a higher accuracy score on our evaluation than BERT. Our results also show that, for BERT, a higher accuracy was achieved when punctuation was removed, and that as the context length increased, the accuracy decreased less than in the original results that include punctuation.

ORCID iDs

Lamprou, Zenon, Pollick, Frank and Moshfeghi, Yashar ORCID: https://orcid.org/0000-0003-4186-1088; Nicosia, Giuseppe, Giuffrida, Giovanni, Ojha, Varun, La Malfa, Emanuele, La Malfa, Gabriele, Pardalos, Panos, Di Fatta, Giuseppe and Umeton, Renato