Community question answering platforms vs. Twitter for predicting characteristics of urban neighbourhoods

Saeidi, Marzieh and Venerandi, Alessandro and Capra, Licia and Riedel, Sebastian (2017) Community question answering platforms vs. Twitter for predicting characteristics of urban neighbourhoods. Preprint / Working Paper. arXiv.org, Ithaca, N.Y.

Text. Filename: Saeidi_etal_Arxiv_2017_Comminity_question_answering_platforms_vs_twitter.pdf
Final Published Version


Abstract

In this paper, we investigate whether text from a Community Question Answering (QA) platform can be used to predict and describe real-world attributes. We experiment with predicting 62 demographic attributes for neighbourhoods of London. We use text from the QA platform Yahoo! Answers and compare our results to those obtained from Twitter microblogs. Results show that the attributes predicted from Yahoo! Answers discussions correlate with the observed demographic attributes, reaching an average Pearson correlation coefficient of ρ = 0.54, slightly higher than that of the predictions obtained using Twitter data. Our qualitative analysis indicates that there is semantic relatedness between the most highly correlated terms extracted from both datasets and their respective demographic attributes. Furthermore, the correlations highlight the different natures of the information contained in Yahoo! Answers and Twitter. While the former seems to offer more encyclopedic content, the latter provides information related to current sociocultural aspects or phenomena.
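As an illustration of the evaluation measure named in the abstract, the following is a minimal sketch of how a Pearson correlation between predicted and observed attribute values could be computed and then averaged over attributes. The function names, the dictionary layout, and the toy numbers are assumptions made for this example; the abstract does not specify how the authors implemented or averaged the coefficient.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def average_pearson(predicted, observed):
    """Mean of per-attribute correlations.

    Both arguments are hypothetical dicts mapping an attribute name to a
    list of values, one value per neighbourhood, in the same order.
    """
    scores = [pearson(predicted[name], observed[name]) for name in predicted]
    return sum(scores) / len(scores)


# Toy values for illustration only; these are not data from the paper.
toy_predicted = {"attr_a": [1.0, 2.0, 3.0], "attr_b": [1.0, 2.0, 3.0]}
toy_observed = {"attr_a": [2.0, 4.0, 6.0], "attr_b": [3.0, 2.0, 1.0]}
toy_average = average_pearson(toy_predicted, toy_observed)
```

In this toy case one attribute is perfectly positively correlated and the other perfectly negatively correlated, so the two coefficients cancel in the average.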

ORCID iDs

Saeidi, Marzieh; Venerandi, Alessandro (ORCID: https://orcid.org/0000-0003-4887-0120); Capra, Licia; Riedel, Sebastian