Identification of MIR-Flickr near-duplicate images : a benchmark collection for near-duplicate detection

Connor, Richard and MacKenzie-Leigh, Stewart and Cardillo, Franco Alberto and Moss, Robert (2015) Identification of MIR-Flickr near-duplicate images : a benchmark collection for near-duplicate detection. In: 10th International Conference on Computer Vision Theory and Applications (VISAPP 2015), 2015-03-11 - 2015-03-14. (https://doi.org/10.5220/0005359705650571)

Text. Filename: Connor_etal_VISAPP_2015_identification_of_mir_flickr_near_duplicate_images.pdf
Accepted Author Manuscript

Abstract

There are many contexts where the automated detection of near-duplicate images is important, for example the detection of copyright infringement or images of child abuse. There are many published methods for the detection of similar and near-duplicate images; however, it is still uncommon for methods to be objectively compared with each other, probably because of a lack of any good framework in which to do so. Published sets of near-duplicate images exist, but are typically small, specialist, or generated. Here, we give a new test set based on a large, serendipitously selected collection of high-quality images. Having observed that the MIR-Flickr 1M image set contains a significant number of near-duplicate images, we have discovered the majority of these. We disclose a set of 1,958 near-duplicate clusters from within the set, and show that this is very likely to contain almost all of the near-duplicate pairs that exist. The main contribution of this publication is the identification of these images, which may then be used by other authors to make comparisons as they see fit. In particular, however, near-duplicate classification functions may now be accurately tested for sensitivity and specificity over a general collection of images.
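
As an illustration of the sensitivity and specificity testing mentioned in the abstract, the following is a minimal sketch and not the authors' evaluation code. It assumes the ground truth is available as a list of clusters of image identifiers and that a classifier's output is a list of image pairs it has flagged as near-duplicates. The function names `cluster_pairs` and `sensitivity_specificity`, the integer identifiers, and the toy data are all hypothetical. It also treats the ground truth as complete, whereas the abstract claims only that the clusters very likely contain almost all near-duplicate pairs.

```python
from itertools import combinations


def cluster_pairs(clusters):
    """Expand clusters of image IDs into a set of unordered pairs (a, b) with a < b."""
    pairs = set()
    for cluster in clusters:
        # Every two distinct images in the same cluster form one near-duplicate pair.
        for a, b in combinations(sorted(set(cluster)), 2):
            pairs.add((a, b))
    return pairs


def sensitivity_specificity(clusters, predicted_pairs, n_images):
    """Score predicted near-duplicate pairs against ground-truth clusters.

    Assumption: every pair of images not covered by a cluster is a true negative.
    """
    truth = cluster_pairs(clusters)
    predicted = {tuple(sorted(p)) for p in predicted_pairs}

    total_pairs = n_images * (n_images - 1) // 2
    tp = len(truth & predicted)       # near-duplicates correctly flagged
    fn = len(truth - predicted)       # near-duplicates missed
    fp = len(predicted - truth)       # unrelated pairs wrongly flagged
    tn = total_pairs - tp - fn - fp   # unrelated pairs correctly left alone

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (hypothetical): 10 images, two ground-truth clusters.
toy_clusters = [[1, 2, 3], [7, 8]]
toy_predictions = [(2, 1), (1, 3), (4, 5)]
sens, spec = sensitivity_specificity(toy_clusters, toy_predictions, n_images=10)
```

For a collection of one million images the number of candidate pairs is roughly 5 × 10^11, so the sketch computes the true-negative count arithmetically instead of enumerating pairs.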


Item metadata

Item type:	Conference or Workshop Item(Paper)
ID code:	55814
Dates:	14 March 2015 (Published); 23 December 2014 (Accepted)
Subjects:	Science > Mathematics > Electronic computers. Computer science
Department:	Faculty of Science > Computer and Information Sciences
Depositing user:	Pure Administrator
Date deposited:	09 Mar 2016 10:39
Last modified:	24 Jul 2024 00:47
URI:	https://strathprints.strath.ac.uk/id/eprint/55814
