Enhancement of K-means clustering in big data based on equilibrium optimizer algorithm

Bibliographic Details
Published in: Journal of Intelligent Systems, Vol. 32, no. 1, pp. 99-106
Main Authors: Al-kababchee, Sarah Ghanim Mahmood; Algamal, Zakariya Yahya; Qasim, Omar Saber
Format: Journal Article
Language: English
Published: Berlin: De Gruyter (Walter de Gruyter GmbH), 16-02-2023
Abstract: Clustering is a primary data mining method with several uses, including gene analysis. It is an unsupervised learning problem in which a set of unlabeled data is divided into clusters according to data features, so that data within a cluster are more similar to one another than to data in other clusters. However, the number of clusters has a direct impact on how well the K-means algorithm performs. Finding the best solutions to such real-world optimization problems requires techniques that explore the search space properly. This research proposes an enhancement of K-means clustering based on an equilibrium optimization approach. The proposed approach adjusts the number of clusters while simultaneously selecting the best attributes in order to find the optimal solution. Results on five datasets show that the proposed method outperforms existing algorithms, including traditional methods, in terms of intra-cluster distances and the Rand index. In conclusion, the proposed technique can be successfully employed for data clustering and can offer significant support.
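The two evaluation criteria named in the abstract, intra-cluster distance and the Rand index, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of Euclidean distance to the cluster centroid, and the optional 0/1 `feature_mask` argument (standing in for the attribute selection the abstract mentions) are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def intra_cluster_distance(points, labels, feature_mask=None):
    """Sum of Euclidean distances from each point to its cluster centroid.

    feature_mask is a hypothetical 0/1 list; features marked 0 are ignored.
    """
    if feature_mask is not None:
        points = [[x for x, keep in zip(p, feature_mask) if keep] for p in points]
    # Group points by their assigned cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        c = centroid(members)
        total += sum(dist(p, c) for p in members)
    return total


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two labelings agree.

    A pair agrees when both labelings put the two points in the same
    cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

Under these assumptions, a lower intra-cluster distance indicates tighter clusters, and a Rand index closer to 1 indicates closer agreement with reference labels. An optimizer that proposes a number of clusters and a feature mask could use such functions to score each candidate.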
Author_xml – sequence: 1
  givenname: Sarah Ghanim Mahmood
  surname: Al-kababchee
  fullname: Al-kababchee, Sarah Ghanim Mahmood
  email: sarahghanim@uohamdaniya.edu.iq
  organization: Department of Mathematics, Education College, University of AL-Hamdaniya, 41019 Bartella, Iraq
– sequence: 2
  givenname: Zakariya Yahya
  orcidid: 0000-0002-0229-7958
  surname: Algamal
  fullname: Algamal, Zakariya Yahya
  email: zakariya.algamal@uomosul.edu.iq
  organization: College of Engineering, University of Warith Al-Anbiyaa, 56001 Karbala, Iraq
– sequence: 3
  givenname: Omar Saber
  surname: Qasim
  fullname: Qasim, Omar Saber
  email: omar.saber@uomosul.edu.iq
  organization: Department of Mathematics, University of Mosul, 41002 Mosul, Iraq
ContentType Journal Article
Copyright 2023. This work is published under http://creativecommons.org/licenses/by/4.0 (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1515/jisys-2022-0230
Discipline: Computer Science
EISSN: 2191-026X
End page: 106
ISSN: 2191-026X, 0334-1860
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License This work is licensed under the Creative Commons Attribution 4.0 International License.
ORCID 0000-0002-0229-7958
OpenAccessLink https://doaj.org/article/73c542ad0f8c439381067fd88e7c1fc3
PageCount 12
Subjects: Algorithms; Big Data; Cluster analysis; Clustering; Data mining; equilibrium optimizer algorithm; feature selection; k-means; Machine learning; Optimization; penalized method; swarms; Unsupervised learning; Vector quantization
URI http://www.degruyter.com/doi/10.1515/jisys-2022-0230
https://www.proquest.com/docview/2776989869
https://doaj.org/article/73c542ad0f8c439381067fd88e7c1fc3