Picture of smart phone in human hand

World leading smartphone and mobile technology research at Strathclyde...

The Strathprints institutional repository is a digital archive of University of Strathclyde's Open Access research outputs. Strathprints provides access to thousands of Open Access research papers by University of Strathclyde researchers, including by Strathclyde researchers from the Department of Computer & Information Sciences involved in researching exciting new applications for mobile and smartphone technology. But the transformative application of mobile technologies is also the focus of research within disciplines as diverse as Electronic & Electrical Engineering, Marketing, Human Resource Management and Biomedical Enginering, among others.

Explore Strathclyde's Open Access research on smartphone technology now...

Towards semantic category verification with arbitrary precision

Roussinov, Dmitri (2011) Towards semantic category verification with arbitrary precision. In: Advances in Information Retrieval Theory. Lecture Notes in Computer Science, Springer, pp. 274-284. ISBN 9783642233173

Full text not available in this repository. (Request a copy from the Strathclyde author)

Abstract

Many tasks related to or supporting information retrieval, such as query expansion, automated question answering, reasoning, or heterogeneous database integration, involve verification of a semantic category (e.g. “coffee” is a drink, “red” is a color, while “steak” is not a drink and “big” is not a color). We present a novel framework to automatically validate a membership in an arbitrary, not a trained a priori semantic category up to a desired level of accuracy. Our approach does not rely on any manually codified knowledge but instead capitalizes on the diversity of topics and word usage in a large corpus (e.g. World Wide Web). Using TREC factoid questions that expect the answer to belong to a specific semantic category, we show that a very high level of accuracy can be reached by automatically identifying more training seeds and more training patterns when needed. We develop a specific quantitative validation model that takes uncertainty and redundancy in the training data into consideration. We empirically confirm the important aspects of our model through ablation studies.