Towards semantic category verification with arbitrary precision

Roussinov, Dmitri (2011) Towards semantic category verification with arbitrary precision. In: Advances in Information Retrieval Theory. Lecture Notes in Computer Science, Springer, pp. 274-284. ISBN 9783642233173 (https://doi.org/10.1007/978-3-642-23318-0_25)

Full text not available in this repository.

Abstract

Many tasks related to or supporting information retrieval, such as query expansion, automated question answering, reasoning, or heterogeneous database integration, involve verification of a semantic category (e.g. “coffee” is a drink, “red” is a color, while “steak” is not a drink and “big” is not a color). We present a novel framework to automatically validate membership in an arbitrary semantic category, one not trained a priori, up to a desired level of accuracy. Our approach does not rely on any manually codified knowledge but instead capitalizes on the diversity of topics and word usage in a large corpus (e.g. the World Wide Web). Using TREC factoid questions that expect the answer to belong to a specific semantic category, we show that a very high level of accuracy can be reached by automatically identifying more training seeds and more training patterns when needed. We develop a specific quantitative validation model that takes uncertainty and redundancy in the training data into consideration. We empirically confirm the important aspects of our model through ablation studies.
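
To make the idea of corpus-based category verification concrete, the following is a minimal illustrative sketch and not the model described in the paper. The abstract does not specify the patterns, the seeds, or the quantitative validation model, so everything here is an assumption made for illustration: the toy `CORPUS`, the two surface `PATTERNS`, and the helper names `pattern_hits` and `is_member` are all hypothetical. The sketch only counts how often a candidate and a category co-occur in a fixed phrase.

```python
import re

# Toy stand-in for a large corpus (assumption: a real system would use web-scale text).
CORPUS = [
    "coffee is a drink served hot",
    "drinks such as coffee and tea",
    "tea is a drink",
    "red is a color",
    "colors such as red and blue",
    "steak is a dish",
]

# Hypothetical surface patterns; {x} is the candidate and {c} is the category.
PATTERNS = ["{x} is a {c}", "{c}s such as {x}"]


def pattern_hits(candidate, category, corpus=CORPUS, patterns=PATTERNS):
    """Count corpus sentences that match any instantiated pattern."""
    hits = 0
    for pattern in patterns:
        phrase = pattern.format(x=candidate, c=category)
        # Word boundaries keep "red" from matching inside "redo", for example.
        regex = re.compile(r"\b" + re.escape(phrase) + r"\b")
        hits += sum(1 for sentence in corpus if regex.search(sentence))
    return hits


def is_member(candidate, category, min_hits=1):
    """Accept the candidate if enough pattern matches are found (assumed threshold)."""
    return pattern_hits(candidate, category) >= min_hits


if __name__ == "__main__":
    for cand, cat in [("coffee", "drink"), ("steak", "drink"),
                      ("red", "color"), ("big", "color")]:
        print(cand, cat, is_member(cand, cat))
```

A raw hit count like this ignores the uncertainty and redundancy in the training data that the abstract says the actual model accounts for; it is meant only to show the general shape of the task.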