Sentiment intensity prediction using neural word embeddings

Htait, Amal and Azzopardi, Leif; (2021) Sentiment intensity prediction using neural word embeddings. In: ICTIR '21 : Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval. ACM, NLD, pp. 93-102. ISBN 9781450386111 (https://doi.org/10.1145/3471158.3472254)

Accepted Author Manuscript

Abstract

Sentiment analysis is central to the process of mining opinions and attitudes from online texts. While much attention has been paid to the sentiment classification problem, much less work has tackled the problem of predicting the intensity of the sentiment. The go-to method is VADER, an unsupervised lexicon-based approach to scoring sentiment. However, such approaches are limited by the vocabulary mismatch problem. In this paper, we present in detail and evaluate our AWESSOME framework (A Word Embedding Sentiment Scorer Of Many Emotions) for sentiment intensity scoring, which capitalizes on pre-existing lexicons, does not require training, and provides fine-grained and accurate sentiment intensity scores of words, phrases and text. In our experiments, we used seven sentiment collections to evaluate the proposed approach against lexicon-based approaches (e.g., VADER) and supervised methods such as deep-learning-based approaches (e.g., SentiBERT). The results show that, despite not surpassing supervised approaches, the unsupervised AWESSOME approach significantly outperforms existing lexicon approaches and therefore provides a simple and effective approach for sentiment analysis. The AWESSOME framework can be flexibly adapted to different seed lexicons and different neural word embedding models in order to produce corpus-specific lexicons, without the need for extensive supervised learning or retraining.
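
To make the vocabulary mismatch point concrete, the following is a minimal sketch of the general idea of combining a seed lexicon with word embeddings. It is not the AWESSOME implementation, whose details are not given in this abstract. The embeddings, the seed lexicon, the intensity values, and the function names (`cosine`, `score_word`, `score_text`) are all hypothetical and chosen only for illustration. A word missing from the lexicon is scored as a similarity-weighted average of the seed intensities, so no training step is involved.

```python
import math

# Hypothetical toy embeddings (3-d, hand-written for illustration only;
# a real setup would load vectors from a pre-trained embedding model).
EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [1.0, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "awful": [-1.0, 0.2, 0.1],
    "superb": [0.8, 0.3, 0.1],     # absent from the seed lexicon
    "dreadful": [-0.8, 0.3, 0.1],  # absent from the seed lexicon
}

# Hypothetical seed lexicon: word -> sentiment intensity in [-1, 1].
SEED_LEXICON = {"good": 0.5, "great": 0.8, "bad": -0.6, "awful": -0.8}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def score_word(word):
    """Similarity-weighted average of seed intensities (assumed scheme).

    Negative similarities are clipped to zero; words without an
    embedding get a neutral score of 0.0.
    """
    if word not in EMBEDDINGS:
        return 0.0
    weights = {
        seed: max(cosine(EMBEDDINGS[word], EMBEDDINGS[seed]), 0.0)
        for seed in SEED_LEXICON
    }
    total = sum(weights.values())
    if total == 0.0:
        return 0.0
    return sum(weights[s] * SEED_LEXICON[s] for s in SEED_LEXICON) / total


def score_text(text):
    """Mean word score over whitespace tokens (a deliberately naive choice)."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(score_word(t) for t in tokens) / len(tokens)
```

In this toy setup, `score_word("superb")` is positive and `score_word("dreadful")` is negative even though neither word appears in the seed lexicon, which is the gap a purely lexicon-based scorer cannot bridge.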