Picture of person typing on laptop with programming code visible on the laptop screen

World class computing and information science research at Strathclyde...

The Strathprints institutional repository is a digital archive of University of Strathclyde's Open Access research outputs. Strathprints provides access to thousands of Open Access research papers by University of Strathclyde researchers, including by researchers from the Department of Computer & Information Sciences involved in mathematically structured programming, similarity and metric search, computer security, software systems, combinatronics and digital health.

The Department also includes the iSchool Research Group, which performs leading research into socio-technical phenomena and topics such as information retrieval and information seeking behaviour.

Explore

Updating collection representations for federated search

Shokouhi, M. and Baillie, M. and Azzopardi, L. (2007) Updating collection representations for federated search. In: 30th annual international ACM SIGIR conference on Research and development in information retrieval, 2007-07-23 - 2007-07-27. (Unpublished)

Full text not available in this repository. Request a copy from the Strathclyde author

Abstract

To facilitate the search for relevant information across a set of online distributed collections, a federated information retrieval system typically represents each collection, centrally, by a set of vocabularies or sampled documents. Accurate retrieval is therefore related to how precise each representation reflects the underlying content stored in that collection. As collections evolve over time, collection representations should also be updated to reflect any change, however, a current solution has not yet been proposed. In this study we examine both the implications of out-of-date representation sets on retrieval accuracy, as well as proposing three different policies for managing necessary updates. Each policy is evaluated on a testbed of forty-four dynamic collections over an eight-week period. Our findings show that out-of-date representations significantly degrade performance over time, however, adopting a suitable update policy can minimise this problem.