Experiments with document archive size detection

Wu, S. and Gibb, F. and Crestani, F.; Sebastiani, F., ed. (2003) Experiments with document archive size detection. In: Advances in Information Retrieval. Lecture Notes in Computer Science, 2633. Springer, Berlin, Germany, pp. 294-304. ISBN 3540012745

Accepted Author Manuscript

Abstract

The size of a document archive is a very important parameter for resource selection in distributed information retrieval systems. In this paper, we present a method for automatically detecting the size (i.e. the number of documents) of a document archive when the archive itself does not provide this information. We also present a method for detecting incremental changes in archive size, which can be useful for deciding whether a resource description has become obsolete and needs to be regenerated. An experimental evaluation of these methods shows that they provide quite accurate information.