Estimating the Number of Clusters Based on Sequential Clustering Algorithms
Published in: | 2016 5th Brazilian Conference on Intelligent Systems (BRACIS) pp. 229 - 234 |
---|---|
Main Author: | |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE 01-10-2016 |
Subjects: | |
Summary: | The main goal of clustering algorithms is to organize a given set of data patterns into groups (clusters), and their main strategy is to group patterns based on their similarity. However, some clustering algorithms also require, as an input parameter, either the number of clusters the induced clustering should have or a threshold value that limits the number of induced clusters. Both the number of clusters and the threshold value are often unknown; however, it is well known that the results of clustering tasks can be very sensitive to them. This work presents a method for empirically estimating both values. The method is based on multiple runs of sequential clustering algorithms using increasing threshold values. Results from experiments conducted on several data domains from two repositories, UCI and KEEL, as well as on a few artificially created data sets, are presented, and a comparative analysis is carried out as evidence of the good estimates of both values given by the method. |
---|---|
DOI: | 10.1109/BRACIS.2016.050 |
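
The summary describes the method only in outline: repeated runs of a sequential clustering algorithm over increasing threshold values. The following is a minimal sketch of that general idea, not the paper's procedure. It assumes a BSAS-style single-pass scheme with running-mean representatives, takes the most frequent cluster count over several shuffled presentation orders at each threshold, and then picks the count that persists over the longest run of consecutive thresholds. The function names, the `runs` and `seed` parameters, and the plateau rule are all illustrative assumptions.

```python
import random
from collections import Counter
from math import dist


def bsas_cluster_count(points, threshold):
    """One pass of a BSAS-style sequential scheme (assumed variant).

    Each point joins the nearest cluster if its distance to that cluster's
    running mean is at most `threshold`; otherwise it starts a new cluster.
    Returns the number of clusters created.
    """
    centroids = []  # running mean of each cluster
    sizes = []      # number of points absorbed by each cluster
    for p in points:
        if centroids:
            k = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            if dist(p, centroids[k]) <= threshold:
                sizes[k] += 1
                # incremental update of the running mean
                centroids[k] = tuple(
                    c + (x - c) / sizes[k] for c, x in zip(centroids[k], p)
                )
                continue
        centroids.append(tuple(p))
        sizes.append(1)
    return len(centroids)


def estimate_clusters(points, thresholds, runs=10, seed=0):
    """Return (estimated cluster count, thresholds that produce it).

    Hypothetical selection rule: the count greater than 1 that holds over
    the longest run of consecutive threshold values.
    """
    rng = random.Random(seed)
    per_threshold = []
    for t in thresholds:
        counts = Counter()
        for _ in range(runs):
            shuffled = list(points)
            rng.shuffle(shuffled)  # sequential schemes depend on input order
            counts[bsas_cluster_count(shuffled, t)] += 1
        per_threshold.append(counts.most_common(1)[0][0])

    best_k, best_len, best_start = 1, 0, 0
    i = 0
    while i < len(per_threshold):
        j = i
        while j < len(per_threshold) and per_threshold[j] == per_threshold[i]:
            j += 1
        if per_threshold[i] > 1 and j - i > best_len:
            best_k, best_len, best_start = per_threshold[i], j - i, i
        i = j
    return best_k, thresholds[best_start:best_start + best_len]
```

Called with a list of coordinate tuples and an increasing list of thresholds, `estimate_clusters` returns both a cluster-count estimate and the threshold range that supports it, which mirrors the summary's aim of estimating both values. The trivial single-cluster result is excluded from the plateau search because very large thresholds always merge everything.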