A bounded distance metric for comparing tree structure

Connor, R. and Simeoni, F. and Iakovos, M. and Moss, R. (2011) A bounded distance metric for comparing tree structure. Information Systems, 36 (4). pp. 748-764. ISSN 0306-4379

Abstract

Comparing tree-structured data for structural similarity is a recurring theme and one on which much effort has been spent. Most approaches so far are grounded, implicitly or explicitly, in algorithmic information theory, being approximations to an information distance derived from Kolmogorov complexity. In this paper we propose a novel complexity metric, also grounded in information theory, but calculated via Shannon's entropy equations. This is used to formulate a directly and efficiently computable metric for the structural difference between unordered trees. The paper explains the derivation of the metric in terms of information theory, and proves the essential property that it is a distance metric. The property of boundedness means that the metric can be used in contexts such as clustering, where second-order comparisons are required. The distance metric property means that the metric can be used in the context of similarity search and metric spaces in general, allowing trees to be indexed and stored within this domain. We are not aware of any other tree similarity metric with these properties.
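
The abstract does not state the metric's formula, so the following is only an illustrative sketch of how a bounded, entropy-based distance between unordered trees might be computed, not the paper's method. It rests on two assumptions that are not taken from the source: each tree is summarised as a multiset of root-to-node label paths (which discards sibling order), and the distance is the square root of the base-2 Jensen-Shannon divergence between the resulting path distributions, a standard quantity built from Shannon entropy that is known to be a metric on probability distributions and to lie in [0, 1]. The names `path_counts`, `distribution`, `entropy` and `js_distance` are hypothetical.

```python
from collections import Counter
from math import log2, sqrt

# Assumed representation: a tree is (label, [children]); child order is ignored.


def path_counts(tree, prefix=()):
    """Count every root-to-node label path in the tree (a multiset of paths)."""
    label, children = tree
    path = prefix + (label,)
    counts = Counter([path])
    for child in children:
        counts.update(path_counts(child, path))
    return counts


def distribution(tree):
    """Normalise the path counts into a probability distribution."""
    counts = path_counts(tree)
    total = sum(counts.values())
    return {path: n / total for path, n in counts.items()}


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def js_distance(t1, t2):
    """Square root of the base-2 Jensen-Shannon divergence (assumed stand-in)."""
    p, q = distribution(t1), distribution(t2)
    keys = set(p) | set(q)
    mix = {k: 0.5 * p.get(k, 0.0) + 0.5 * q.get(k, 0.0) for k in keys}
    jsd = entropy(mix) - 0.5 * entropy(p) - 0.5 * entropy(q)
    return sqrt(max(jsd, 0.0))  # clamp tiny negative rounding errors


if __name__ == "__main__":
    a = ("root", [("x", []), ("y", [("z", [])])])
    b = ("root", [("y", [("z", [])]), ("x", [])])  # same tree, siblings swapped
    c = ("other", [("w", [])])
    print(js_distance(a, b), js_distance(a, c))
```

In this sketch, two trees that differ only in sibling order are at distance 0, and two trees with no label path in common are at distance 1. Because distinct trees can share the same path distribution, the sketch is strictly a pseudometric on trees; the paper's own construction and proof should be consulted for the actual metric.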