Quantifying the specificity of near-duplicate image classification functions

Connor, Richard and Cardillo, Franco Alberto (2016) Quantifying the specificity of near-duplicate image classification functions. In: 11th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2016-02-27 - 2016-02-29.

Full text: Connor_Cardillo_VISAPP_2016_quantifying_the_specificity_of_near_duplicate_image_classification_functions.pdf - Accepted Author Manuscript (787kB)

Abstract

There are many published methods for detecting similar and near-duplicate images. Here, we consider their use in the context of unsupervised near-duplicate detection, where the task is to find a (relatively small) near-duplicate intersection of two large candidate sets. Such scenarios are of particular importance in forensic near-duplicate detection. The essential properties of such a function are: performance, sensitivity, and specificity. We show that, as collection sizes increase, specificity becomes the most important of these, as without very high specificity huge numbers of false positive matches will be identified. This makes even very fast, highly sensitive methods completely useless. Until now, to our knowledge, no attempt has been made to measure the specificity of near-duplicate finders, or even to compare them with each other. Recently, a benchmark set of near-duplicate images has been established which allows such assessment by giving a near-duplicate ground truth over a large general image collection. Using this we establish a methodology for calculating specificity. A number of the most likely candidate functions are compared with each other and accurate measurements of sensitivity vs. specificity are given. We believe these are the first such figures to be calculated for any such function.
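
The argument that specificity dominates as collections grow can be illustrated with a short calculation. The following is a minimal sketch, not the paper's methodology: it assumes every image in one set is compared with every image in the other, that specificity is measured per compared pair, and that the set sizes, number of true near-duplicate pairs, and specificity values are purely illustrative. The function names are hypothetical.

```python
def expected_false_positives(n_a: int, n_b: int, true_pairs: int,
                             specificity: float) -> float:
    """Expected false positive pairs under exhaustive pairwise comparison.

    Assumption: specificity is the fraction of non-duplicate pairs that
    the classification function correctly rejects.
    """
    negative_pairs = n_a * n_b - true_pairs
    return negative_pairs * (1.0 - specificity)


def expected_precision(n_a: int, n_b: int, true_pairs: int,
                       sensitivity: float, specificity: float) -> float:
    """Fraction of reported matches that are genuine near-duplicates."""
    true_positives = true_pairs * sensitivity
    false_positives = expected_false_positives(n_a, n_b, true_pairs, specificity)
    reported = true_positives + false_positives
    return true_positives / reported if reported > 0 else 0.0


if __name__ == "__main__":
    # Illustrative values only: two sets of one million images sharing
    # 1,000 true near-duplicate pairs, with a perfectly sensitive function.
    n_a = n_b = 1_000_000
    true_pairs = 1_000
    for specificity in (0.99, 0.9999, 0.999999, 0.99999999):
        fp = expected_false_positives(n_a, n_b, true_pairs, specificity)
        prec = expected_precision(n_a, n_b, true_pairs, 1.0, specificity)
        print(f"specificity={specificity}: "
              f"false positives ~ {fp:,.0f}, precision ~ {prec:.6f}")
```

Because the number of non-duplicate pairs grows with the product of the two set sizes while the true intersection stays small, even a tiny shortfall in specificity yields false positives that swamp the genuine matches under these assumptions.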