Evaluation of Jensen-Shannon distance over sparse data
Connor, Richard and Cardillo, Franco Alberto and Moss, Robert and Rabitti, Fausto; Brisaboa, Nieves and Pedreira, Oscar and Zezula, Pavel, eds. (2013) Evaluation of Jensen-Shannon distance over sparse data. In: Similarity Search and Applications. Lecture Notes in Computer Science, 8199 . Springer, ESP, pp. 163-168. ISBN 9783642410611 (https://doi.org/10.1007/978-3-642-41062-8_16)
Preview |
Text.
Filename: Connor_etal_LNCS2013_Evaluation_Jensen_Shannon_distance_over_sparse_data.pdf
Accepted Author Manuscript Download (249kB)| Preview |
Abstract
Jensen-Shannon divergence is a symmetrised, smoothed version of Küllback-Leibler. It has been shown to be the square of a proper distance metric, and has other properties which make it an excellent choice for many high-dimensional spaces in R*. The metric as defined is however expensive to evaluate. In sparse spaces over many dimensions the Intrinsic Dimensionality of the metric space is typically very high, making similarity-based indexing ineffectual. Exhaustive searching over large data collections may be infeasible. Using a property that allows the distance to be evaluated from only those dimensions which are non-zero in both arguments, and through the identification of a threshold function, we show that the cost of the function can be dramatically reduced.
ORCID iDs
Connor, Richard ORCID: https://orcid.org/0000-0003-4734-8103, Cardillo, Franco Alberto, Moss, Robert and Rabitti, Fausto; Brisaboa, Nieves, Pedreira, Oscar and Zezula, Pavel-
-
Item type: Book Section ID code: 52372 Dates: DateEvent13 September 2013PublishedSubjects: Science > Mathematics > Electronic computers. Computer science Department: Faculty of Science > Computer and Information Sciences Depositing user: Pure Administrator Date deposited: 31 Mar 2015 07:23 Last modified: 23 Nov 2024 01:24 Related URLs: URI: https://strathprints.strath.ac.uk/id/eprint/52372