A Global Optimization RLT-based Approach for Solving the Hard Clustering Problem
The field of cluster analysis is primarily concerned with the sorting of data points into different clusters so as to optimize a certain criterion. Rapid advances in technology have made it possible to address clustering problems via optimization theory. In this paper, we present a global optimizati...
Saved in:
Published in: | Journal of global optimization Vol. 32; no. 2; pp. 281 - 306 |
---|---|
Main Authors: | , |
Format: | Journal Article |
Language: | English |
Published: |
Dordrecht
Springer Nature B.V
01-06-2005
|
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | The field of cluster analysis is primarily concerned with the sorting of data points into different clusters so as to optimize a certain criterion. Rapid advances in technology have made it possible to address clustering problems via optimization theory. In this paper, we present a global optimization algorithm to solve the hard clustering problem, where each data point is to be assigned to exactly one cluster. The hard clustering problem is formulated as a nonlinear program, for which a tight linear programming relaxation is constructed via the Reformulation-Linearization Technique (RLT) in concert with additional valid inequalities that serve to defeat the inherent symmetry in the problem. This construct is embedded within a specialized branch-and-bound algorithm to solve the problem to global optimality. Pertinent implementation issues that can enhance the efficiency of the branch-and-bound algorithm are also discussed. Computational experience is reported using several standard data sets found in the literature as well as using synthetically generated larger problem instances. The results validate the robustness of the proposed algorithmic procedure and exhibit its dominance over the popular k-means clustering technique. Finally, a heuristic procedure to obtain a good quality solution at a relative ease of computational effort is also described. [PUBLICATION ABSTRACT] |
---|---|
ISSN: | 0925-5001 1573-2916 |
DOI: | 10.1007/s10898-004-2706-7 |