Enhancement of K-means clustering in big data based on equilibrium optimizer algorithm
Published in: | Journal of intelligent systems Vol. 32; no. 1; pp. 99 - 106 |
---|---|
Main Authors: | Al-kababchee, Sarah Ghanim Mahmood; Algamal, Zakariya Yahya; Qasim, Omar Saber |
Format: | Journal Article |
Language: | English |
Published: | Berlin, De Gruyter (Walter de Gruyter GmbH), 16-02-2023 |
Subjects: | |
Abstract | Data mining’s primary clustering method has several uses, including gene analysis. A set of unlabeled data is divided into clusters using data features in a clustering study, which is an unsupervised learning problem. Data in a cluster are more comparable to one another than to those in other groups. However, the number of clusters has a direct impact on how well the K-means algorithm performs. In order to find the best solutions for these real-world optimization issues, it is necessary to use techniques that properly explore the search spaces. In this research, an enhancement of K-means clustering is proposed by applying an equilibrium optimization approach. The suggested approach adjusts the number of clusters while simultaneously choosing the best attributes to find the optimal answer. The findings establish the usefulness of the suggested method in comparison to existing algorithms in terms of intra-cluster distances and Rand index based on five datasets. Through the results shown and a comparison of the proposed method with the rest of the traditional methods, it was found that the proposal is better in terms of the internal dimension of the elements within the same cluster, as well as the Rand index. In conclusion, the suggested technique can be successfully employed for data clustering and can offer significant support. |
---|---|
Author | Qasim, Omar Saber; Al-kababchee, Sarah Ghanim Mahmood; Algamal, Zakariya Yahya |
Author_xml | 1. Al-kababchee, Sarah Ghanim Mahmood (sarahghanim@uohamdaniya.edu.iq), Department of Mathematics, Education College, University of AL-Hamdaniya, 41019 Bartella, Iraq; 2. Algamal, Zakariya Yahya (ORCID 0000-0002-0229-7958; zakariya.algamal@uomosul.edu.iq), College of Engineering, University of Warith Al-Anbiyaa, 56001 Karbala, Iraq; 3. Qasim, Omar Saber (omar.saber@uomosul.edu.iq), Department of Mathematics, University of Mosul, 41002 Mosul, Iraq |
CitedBy_id | crossref_primary_10_1080_03610918_2023_2249271 |
ContentType | Journal Article |
Copyright | 2023. This work is published under http://creativecommons.org/licenses/by/4.0 (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DOI | 10.1515/jisys-2022-0230 |
DatabaseName | CrossRef ProQuest Computer Science Collection Directory of Open Access Journals |
DatabaseTitle | CrossRef ProQuest Computer Science Collection |
DatabaseTitleList | ProQuest Computer Science Collection CrossRef |
Database_xml | – sequence: 1 dbid: DOA name: Directory of Open Access Journals url: http://www.doaj.org/ sourceTypes: Open Website |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
EISSN | 2191-026X |
EndPage | 106 |
ExternalDocumentID | oai_doaj_org_article_73c542ad0f8c439381067fd88e7c1fc3 10_1515_jisys_2022_0230 10_1515_jisys_2022_0230321 |
IEDL.DBID | DOA |
ISSN | 2191-026X 0334-1860 |
IngestDate | Tue Oct 22 15:12:49 EDT 2024 Thu Oct 10 17:58:48 EDT 2024 Fri Aug 23 00:35:34 EDT 2024 Thu Mar 16 03:15:34 EDT 2023 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | English |
License | This work is licensed under the Creative Commons Attribution 4.0 International License. |
LinkModel | DirectLink |
ORCID | 0000-0002-0229-7958 |
OpenAccessLink | https://doaj.org/article/73c542ad0f8c439381067fd88e7c1fc3 |
PQID | 2776989869 |
PQPubID | 2031329 |
PageCount | 12 |
ParticipantIDs | doaj_primary_oai_doaj_org_article_73c542ad0f8c439381067fd88e7c1fc3 proquest_journals_2776989869 crossref_primary_10_1515_jisys_2022_0230 walterdegruyter_journals_10_1515_jisys_2022_0230321 |
PublicationCentury | 2000 |
PublicationDate | 2023-02-16 |
PublicationDateYYYYMMDD | 2023-02-16 |
PublicationDate_xml | – month: 02 year: 2023 text: 2023-02-16 day: 16 |
PublicationDecade | 2020 |
PublicationPlace | Berlin |
PublicationPlace_xml | – name: Berlin |
PublicationTitle | Journal of intelligent systems |
PublicationYear | 2023 |
Publisher | De Gruyter Walter de Gruyter GmbH |
Publisher_xml | – name: De Gruyter – name: Walter de Gruyter GmbH |
SourceID | doaj proquest crossref walterdegruyter |
SourceType | Open Website Aggregation Database Publisher |
StartPage | 99 |
SubjectTerms | Algorithms; Big Data; Cluster analysis; Clustering; Data mining; equilibrium optimizer algorithm; feature selection; k-means; Machine learning; Optimization; penalized method; swarms; Unsupervised learning; Vector quantization |
Title | Enhancement of K-means clustering in big data based on equilibrium optimizer algorithm |
URI | http://www.degruyter.com/doi/10.1515/jisys-2022-0230 https://www.proquest.com/docview/2776989869 https://doaj.org/article/73c542ad0f8c439381067fd88e7c1fc3 |
Volume | 32 |
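
The abstract evaluates clusterings by intra-cluster distance and Rand index. As a minimal sketch of what those two measures compute, the following assumes that intra-cluster distance means the sum of Euclidean distances from each point to its cluster centroid and that the Rand index is the plain (unadjusted) pairwise-agreement form; the record does not give the paper's exact definitions, and the function names here are illustrative only.

```python
from itertools import combinations
from math import dist


def intra_cluster_distance(points, labels):
    """Sum of Euclidean distances from each point to its cluster centroid.

    Assumed definition; the paper may use a different variant.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        # Centroid is the coordinate-wise mean of the cluster's members.
        centroid = [sum(coord) / len(members) for coord in zip(*members)]
        total += sum(dist(p, centroid) for p in members)
    return total


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two labelings agree.

    A pair agrees when both labelings put it in the same cluster,
    or both put it in different clusters. Needs at least two points.
    """
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 0), (10, 2)]
    print(intra_cluster_distance(pts, [0, 0, 1, 1]))
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

Lower intra-cluster distance indicates tighter clusters, while a Rand index closer to 1 indicates closer agreement with the reference labels; the Rand index does not depend on how the clusters are numbered.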