Experiments with document archive size detection

Wu, S. and Gibb, F. and Crestani, F. (2003) Experiments with document archive size detection. In: Advances in information retrieval : 25th European Conference on IR Research, ECIR 2003, Pisa, Italy, April 14-16, 2003 : proceedings. Lecture Notes in Computer Science, 2633. Springer, Berlin, Germany, pp. 294-304. ISBN 3540012745

Text (strathprints001912)
strathprints001912.pdf - Accepted Author Manuscript

Abstract

The size of a document archive is a very important parameter for resource selection in distributed information retrieval systems. In this paper, we present a method for automatically detecting the size (i.e., the number of documents) of a document archive when the archive itself does not provide such information. In addition, a method for detecting incremental changes in the archive size is presented, which can be useful for deciding whether a resource description has become obsolete and needs to be regenerated. An experimental evaluation of these methods shows that they provide quite accurate information.