The accessibility dimension for structured document retrieval

Roelleke, Thomas and Lalmas, Mounia and Kazai, Gabriella and Ruthven, Ian and Quicker, Stefan (2002) The accessibility dimension for structured document retrieval. In: Advances in Information Retrieval. Lecture Notes in Computer Science, 2291. Springer, Germany, pp. 284-302. ISBN 978-3-540-43343-9

    Abstract

    Structured document retrieval aims at retrieving the document components that best satisfy a query, instead of merely retrieving pre-defined document units. This paper reports on an investigation of a tf-idf-acc approach, where tf and idf are the classical term frequency and inverse document frequency, and acc is a new parameter, called accessibility, that captures the structure of documents. The tf-idf-acc approach is defined using a probabilistic relational algebra. To investigate the retrieval quality and estimate the acc values, we developed a method that automatically constructs diverse test collections of structured documents from a standard test collection, with which experiments were carried out. The analysis of the experiments provides estimates of the acc values.
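
    The abstract does not give the scoring formula, and the paper defines the approach in a probabilistic relational algebra rather than in code. The following is therefore only a toy sketch of the general idea, under assumptions that are not taken from the paper: a component's term count is its own count plus the counts of its child components discounted by a single acc value per level, idf is log(1 + N/df) over all components, and the names Node, term_counts and score are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """A document component (hypothetical structure for illustration)."""
    name: str
    terms: list[str] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)


def all_nodes(root):
    """Yield the root and every component below it."""
    yield root
    for child in root.children:
        yield from all_nodes(child)


def term_counts(node, acc):
    """Own terms count 1 each; child counts are discounted by acc per level.

    This additive propagation rule is an assumption, not the paper's definition.
    """
    counts = {}
    for t in node.terms:
        counts[t] = counts.get(t, 0.0) + 1.0
    for child in node.children:
        for t, c in term_counts(child, acc).items():
            counts[t] = counts.get(t, 0.0) + acc * c
    return counts


def score(root, query, acc):
    """Score every component against the query with a tf * idf sum."""
    nodes = list(all_nodes(root))
    n = len(nodes)
    # Document frequency counts components that contain the term directly.
    df = {t: sum(1 for nd in nodes if t in nd.terms) for t in query}
    scores = {}
    for nd in nodes:
        counts = term_counts(nd, acc)
        scores[nd.name] = sum(
            counts.get(t, 0.0) * math.log(1 + n / df[t])
            for t in query
            if df[t] > 0
        )
    return scores


# A two-section document; only sec1 mentions the query term.
doc = Node("doc", [], [
    Node("sec1", ["retrieval", "structure"]),
    Node("sec2", ["algebra"]),
])
scores = score(doc, ["retrieval"], acc=0.5)
```

    In this sketch, with acc below 1 the section that contains the query term scores higher than the whole document, while acc = 1 would give both the same score. That is the kind of trade-off between components and pre-defined document units that an accessibility parameter is meant to control.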

    Item type: Book Section
    ID code: 2463
    Keywords: structured document retrieval, probabilistic relational algebra, accessibility dimension, Electronic computers. Computer science
    Subjects: Science > Mathematics > Electronic computers. Computer science
    Department: Faculty of Science > Computer and Information Sciences
    Depositing user: Strathprints Administrator
    Date Deposited: 23 Jan 2007
    Last modified: 06 Sep 2014 13:44
    URI: http://strathprints.strath.ac.uk/id/eprint/2463
