Experiments with document archive size detection

Wu, S. and Gibb, F. and Crestani, F.; Sebastiani, F., ed. (2003) Experiments with document archive size detection. In: Advances in information retrieval: 25th European Conference on IR Research, ECIR 2003, Pisa, Italy, April 14-16, 2003: proceedings. Lecture Notes in Computer Science, 2633. Springer, Berlin, Germany, pp. 294-304. ISBN 3540012745

Accepted Author Manuscript


    The size of a document archive is an important parameter for resource selection in distributed information retrieval systems. In this paper, we present a method for automatically detecting the size (i.e., the number of documents) of a document archive when the archive itself does not provide this information. We also present a method for detecting incremental changes in archive size, which can be useful for deciding whether a resource description has become obsolete and needs to be regenerated. An experimental evaluation shows that these methods provide quite accurate information.