Measuring distances among graphs en route to graph clustering

Kyosev, Ivan and Paun, Iulia and Moshfeghi, Yashar and Ntarmos, Nikos; Wu, Xintao and Jermaine, Chris and Xiong, Li and Hu, Xiaohua and Kotevska, Olivera and Lu, Siyuan and Xu, Weijia and Aluru, Srinivas and Zhai, Chengxiang and Al-Masri, Eyhab and Chen, Zhiyuan and Saltz, Jeff, eds. (2021) Measuring distances among graphs en route to graph clustering. In: 2020 IEEE International Conference on Big Data. IEEE, USA, pp. 3632-3641. ISBN 9781728162515 (https://doi.org/10.1109/BigData50022.2020.9378333)

Text. Filename: Kyosev_etal_IEEE_ICBD2020_Measuring_distances_among_graphs.pdf
Accepted Author Manuscript
License: Strathprints license 1.0

Abstract

The graph data structure offers a highly expressive way of representing many real-world constructs, such as social networks, chemical compounds, the World Wide Web, and street maps. In essence, any collection of entities and the relationships between them can be modelled using a graph, thus preserving more information about the real-world objects than a simple vector space model. An issue that arises when operating on collections of graphs, however, is that most statistical analysis and machine learning methods expect their input data to be in the form of multidimensional vectors, where all items can be compared with each other using well-understood metrics such as Euclidean or Manhattan distance. This paper presents a variety of approaches for computing distances between graphs with known node correspondence, with the aim of applying those measures alongside clustering algorithms to discover patterns in a given dataset. The performance of each distance measure is then evaluated through its ability to identify communities of graphs with similar features. We show that, because the considered distance metrics highlight different structural properties, the method that produces the highest-quality result will depend on the characteristics of the processed graph population.
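The general pipeline described in the abstract (pairwise distances between graphs over a shared node set, followed by clustering) can be illustrated with a minimal sketch. The abstract does not name the specific distance measures or clustering algorithms used in the paper, so everything below is an assumption chosen for illustration: the distance is a simple edge-set edit distance (the number of edges present in exactly one of the two graphs), the grouping is a threshold-based single-linkage step, and the function names `edge_set`, `edit_distance`, `distance_matrix`, and `threshold_clusters` are hypothetical.

```python
from itertools import combinations


def edge_set(edges):
    """Normalise an undirected edge list into a set of frozensets.

    Node labels are assumed to be shared across all graphs
    (known node correspondence).
    """
    return {frozenset(e) for e in edges}


def edit_distance(g1, g2):
    """Illustrative distance: edges present in exactly one of the graphs."""
    return len(g1 ^ g2)


def distance_matrix(graphs):
    """Symmetric matrix of pairwise distances between all graphs."""
    n = len(graphs)
    dist = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = edit_distance(graphs[i], graphs[j])
    return dist


def threshold_clusters(dist, max_dist):
    """Group graphs whose distance is <= max_dist (single linkage).

    Uses a small union-find; returns one representative index per graph.
    """
    parent = list(range(len(dist)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(dist)), 2):
        if dist[i][j] <= max_dist:
            parent[find(j)] = find(i)
    return [find(i) for i in range(len(dist))]


# Four toy graphs over the shared node set {a, b, c, d} (made-up data).
graphs = [
    edge_set([("a", "b"), ("b", "c"), ("c", "d")]),
    edge_set([("a", "b"), ("b", "c"), ("a", "d")]),
    edge_set([("a", "c"), ("b", "d"), ("a", "d")]),
    edge_set([("a", "c"), ("b", "d")]),
]

dist = distance_matrix(graphs)
labels = threshold_clusters(dist, max_dist=2)
```

With this toy data the first two graphs end up in one group and the last two in another. A different distance (for example, one sensitive to path lengths or node degrees rather than raw edge overlap) could group the same graphs differently, which is the kind of dependence on the chosen measure that the abstract refers to.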

Item metadata

Item type:	Book Section
ID code:	76345
Dates:	Published: 19 March 2021; Accepted: 20 October 2020
Notes:	© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Subjects:	Science > Mathematics > Electronic computers. Computer science
Department:	Faculty of Science > Computer and Information Sciences
Depositing user:	Pure Administrator
Date deposited:	06 May 2021 10:57
Last modified:	21 Apr 2024 00:12
Related URLs:	https://eprints.gla.ac.uk/226315/
URI:	https://strathprints.strath.ac.uk/id/eprint/76345