Clustering methods based on variational analysis in the space of measures

Van Lieshout, M.N.M. and Molchanov, I.S. and Zuev, S.A. (2001) Clustering methods based on variational analysis in the space of measures. Biometrika, 88 (4). pp. 1021-1033. ISSN 1464-3510

strathprints004606.pdf - Accepted Author Manuscript

Abstract

We formulate clustering as a minimisation problem in the space of measures by modelling the cluster centres as a Poisson process with unknown intensity function. We derive a Ward-type clustering criterion which, under the Poisson assumption, can easily be evaluated explicitly in terms of the intensity function. We show that asymptotically, i.e. for increasing total intensity, the optimal intensity function is proportional to a dimension-dependent power of the density of the observations. For fixed finite total intensity, no explicit solution seems available. However, the Ward-type criterion to be minimised is convex in the intensity function, so that the steepest descent method of Molchanov and Zuyev (2001) can be used to approximate the global minimum. It turns out that the gradient is similar in form to the functional to be optimised. If we discretise over a grid, the steepest descent algorithm at each iteration step increases the current intensity function at those points where the gradient is minimal at the expense of regions with a large gradient value. The algorithm is applied to a toy one-dimensional example, a simulation from a popular spatial cluster model and a real-life dataset from Strauss (1975) concerning the positions of redwood seedlings. Finally, we discuss the relative merits of our approach compared to classical hierarchical and partition clustering techniques as well as to modern model-based clustering methods using Markov point processes and mixture distributions.
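
The grid-based descent step described in the abstract can be illustrated with a small one-dimensional sketch. It is not the paper's criterion or the Molchanov and Zuyev algorithm as published; every specific below is an assumption made for illustration. The criterion is taken to be the expected squared distance from an observation to its nearest cluster centre, truncated at a hypothetical cap `t_max` and evaluated through the Poisson void probability exp(-mass in the ball). The intensity is represented as masses `mu[i]` on grid points, and each step moves mass from the occupied grid point with the largest gradient to the grid point with the smallest gradient, halving the amount moved until the criterion decreases. The function names, step size and quadrature are likewise hypothetical.

```python
import numpy as np


def criterion_and_gradient(mu, grid, data, t_grid):
    """Toy Ward-type criterion on a 1-D grid, with its gradient in mu.

    Assumed form: mean over observations x_j of E[min(D_j^2, t_max)], where
    D_j is the distance from x_j to the nearest point of a Poisson process
    that puts mass mu[i] at grid[i]. Uses E[min(D^2, T)] = int_0^T P(D^2 > t) dt.
    """
    # d2[j, i]: squared distance from observation j to grid point i
    d2 = (data[:, None] - grid[None, :]) ** 2
    # inside[j, k, i]: grid point i lies within sqrt(t_grid[k]) of x_j
    inside = d2[:, None, :] <= t_grid[None, :, None]
    # Poisson void probability: no centre within sqrt(t_k) of x_j
    void = np.exp(-(inside * mu).sum(axis=2))
    dt = t_grid[1] - t_grid[0]
    value = void.sum() * dt / len(data)
    # Differentiating exp(-mass) gives -exp(-mass) where grid point i is inside,
    # so the gradient has the same integral form as the criterion itself.
    grad = -(void[:, :, None] * inside).sum(axis=(0, 1)) * dt / len(data)
    return value, grad


def steepest_descent(data, grid, total_mass, n_iter=200, step=0.05,
                     t_max=1.0, n_t=50):
    """Mass-transfer descent that keeps the total intensity fixed."""
    t_grid = np.linspace(t_max / n_t, t_max, n_t)
    mu = np.full(len(grid), total_mass / len(grid))  # uniform starting intensity
    history = []
    for _ in range(n_iter):
        value, grad = criterion_and_gradient(mu, grid, data, t_grid)
        history.append(value)
        donors = np.flatnonzero(mu > 0)
        src = donors[np.argmax(grad[donors])]  # occupied point, largest gradient
        dst = np.argmin(grad)                  # point with smallest gradient
        if grad[src] - grad[dst] <= 1e-12:
            break  # no transfer improves the criterion to first order
        moved = min(step * total_mass, mu[src])
        # Backtracking: halve the transferred mass until the criterion drops
        while moved > 1e-9:
            trial = mu.copy()
            trial[src] -= moved
            trial[dst] += moved
            if criterion_and_gradient(trial, grid, data, t_grid)[0] < value:
                mu = trial
                break
            moved /= 2
        else:
            break  # no improving step found
    return mu, history
```

Calling `steepest_descent(data, grid, total_mass)` with a one-dimensional sample and an evenly spaced grid returns the fitted grid masses and the criterion value recorded at each iteration. Because each void probability is the exponential of a negative linear function of `mu`, this toy criterion is convex in `mu`, which is the property the abstract relies on for the descent to approach a global minimum.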