Shannon's entropy of partitions determined by hierarchical clustering trees in asymmetry and dimension identification

Bibliographic Details
Published in: Communications in statistics. Simulation and computation, Vol. 51, no. 10, pp. 5954-5966
Main Authors: Corredor, J. S., Quiroz, A. J.
Format: Journal Article
Language: English
Published: Philadelphia Taylor & Francis 03-10-2022
Taylor & Francis Ltd
Description
Summary: In the multivariate statistics community, it is commonly acknowledged that, among hierarchical clustering tree (HCT) procedures, the single linkage rule for inter-cluster distance tends to produce trees that are significantly more asymmetric than those obtained using other rules, such as complete linkage. We consider the use of Shannon's entropy of the partitions determined by HCTs as a measure of the asymmetry of the clustering trees. In a different direction, our simulations show an unexpected relationship between Shannon's entropy of partitions and the dimension of the data. Based on this observation, a procedure for intrinsic dimension identification based on entropy of partitions is proposed and studied. A theoretical result is established for the dimension identification method, stating that, locally, for continuous data on a d-dimensional manifold, the entropy of partitions behaves as if the local data were uniformly sampled from the unit ball of R^d. Evaluation on simulated examples shows that the proposed method compares favorably with other procedures for dimension identification available in the literature.
ISSN: 0361-0918
1532-4141
DOI: 10.1080/03610918.2020.1788586
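
Illustration: the summary above describes using Shannon's entropy of the partitions determined by an HCT as an asymmetry measure. The sketch below shows one plausible reading of that idea, not the authors' procedure: cut the tree into at most k clusters and compute the entropy of the cluster-size proportions. The function names, the choice of k, the uniform test data, and the use of SciPy's `linkage` and `fcluster` are all assumptions made for illustration. Under this reading, a more asymmetric tree (a few large clusters plus small ones) would give a lower entropy than a balanced one.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def partition_entropy(labels):
    """Shannon entropy (in nats) of the cluster-size proportions of a partition."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))


def hct_partition_entropy(X, method, k):
    """Entropy of the partition obtained by cutting an HCT into at most k clusters.

    Assumption: the partition is taken from a single cut of the tree; the
    paper may define the partitions differently.
    """
    Z = linkage(X, method=method)                      # build the tree
    labels = fcluster(Z, t=k, criterion="maxclust")    # cut into <= k clusters
    return partition_entropy(labels)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(200, 2))  # hypothetical test data
    for method in ("single", "complete"):
        print(method, hct_partition_entropy(X, method, k=4))
```

The entropy lies between 0 (all points in one cluster) and log(k) (k clusters of equal size), so values from different linkage rules can be compared directly at the same k.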