Analysis of Transformer's Attention Behavior in Sleep Stage Classification and Limiting It to Improve Performance
Published in: IEEE Access, Vol. 12, pp. 95914–95925
Main Authors: Kim, Dongyoung; Ko, Young-Woong; Kim, Dong-Kyu; Lee, Jeong-Gun
Format: Journal Article
Language: English
Published: Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2024
Online Access: https://ieeexplore.ieee.org/document/10586958
Abstract | The transformer architecture has been applied to many tasks, such as natural language processing and computer vision. The most common requirement for using a transformer-based architecture is that the model must be pretrained on a large-scale dataset before it can be fine-tuned for a specific task such as classification or object detection. However, in this paper, we find that the transformer architecture generalizes better than CNN-based architectures when capturing features from data samples for sleep stage classification, even when trained on a small-scale dataset without pretraining on a large-scale dataset. This outcome contradicts the widely held belief that a transformer architecture is more effective when trained on large datasets. In this paper, we investigate the attention behavior of a transformer model and demonstrate how global and local attention influence the attention map in a transformer architecture. Finally, through experiments, we show that restricting global attention using 'Masked Multi-Head Self-Attention (M-MHSA)' leads to improved model generalization in sleep stage classification compared with previous methodologies and the original transformer-based architecture on three different datasets.
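As a rough illustration of the masking idea described in the abstract, the sketch below restricts PyTorch multi-head self-attention to a local window around each sequence position, which is one way to suppress global attention. The window size, embedding dimension, and the MaskedMHSA class itself are assumptions made for this example and are not taken from the paper.

```python
# Minimal sketch of masked multi-head self-attention (not the authors' code).
# Each position may only attend to neighbours within `window` steps, i.e.
# global attention is blocked and only local attention remains.
import torch
import torch.nn as nn


class MaskedMHSA(nn.Module):
    def __init__(self, dim: int, num_heads: int, window: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.window = window

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); seq_len = number of epochs in a sequence
        seq_len = x.size(1)
        idx = torch.arange(seq_len, device=x.device)
        # Boolean attn_mask: True entries are *blocked* by nn.MultiheadAttention
        block = (idx[None, :] - idx[:, None]).abs() > self.window
        out, _ = self.attn(x, x, x, attn_mask=block)
        return out


if __name__ == "__main__":
    # Illustrative shapes: 4 recordings, 20 epochs each, 128-d epoch embeddings
    x = torch.randn(4, 20, 128)
    y = MaskedMHSA(dim=128, num_heads=8, window=3)(x)
    print(y.shape)  # torch.Size([4, 20, 128])
```

In this sketch, pairs of positions farther apart than `window` are blocked in the boolean `attn_mask`, so the resulting attention map is band-diagonal (local) rather than global.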
Author | Ko, Young-Woong; Kim, Dongyoung; Lee, Jeong-Gun; Kim, Dong-Kyu
Author_xml |
– sequence: 1; givenname: Dongyoung; surname: Kim; fullname: Kim, Dongyoung; orcidid: 0000-0003-1998-7784; organization: Department of Computer Engineering, Hallym University, Chuncheon, South Korea
– sequence: 2; givenname: Young-Woong; surname: Ko; fullname: Ko, Young-Woong; organization: Department of Computer Engineering, Hallym University, Chuncheon, South Korea
– sequence: 3; givenname: Dong-Kyu; surname: Kim; fullname: Kim, Dong-Kyu; orcidid: 0000-0003-4917-0177; email: doctordk@naver.com; organization: Department of Otorhinolaryngology-Head and Neck Surgery, Chuncheon Sacred Heart Hospital, Chuncheon, South Korea
– sequence: 4; givenname: Jeong-Gun; surname: Lee; fullname: Lee, Jeong-Gun; orcidid: 0000-0001-6218-4560; email: jeonggun.lee@hallym.ac.kr; organization: Department of Computer Engineering, Hallym University, Chuncheon, South Korea
CODEN | IAECCG |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
DOI | 10.1109/ACCESS.2024.3424236 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005-present IEEE Open Access Journals IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library Online CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Engineered Materials Abstracts METADEX Technology Research Database Materials Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional DOAJ Directory of Open Access Journals |
DatabaseTitle | CrossRef Materials Research Database Engineered Materials Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace METADEX Computer and Information Systems Abstracts Professional |
DatabaseTitleList | Materials Research Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISSN | 2169-3536 |
EndPage | 95925 |
ExternalDocumentID | oai_doaj_org_article_3d459dcfbb3441ccb3ec37d2734f2123 10_1109_ACCESS_2024_3424236 10586958 |
Genre | orig-research |
GrantInformation_xml |
– fundername: University of Minnesota; grantid: U01HL53934
– fundername: University of California, Davis; grantid: U01HL53916
– fundername: New York University; grantid: U01HL53931
– fundername: NRF through the Basic Science Research Program; grantid: 2021R1F1A1047963
– fundername: University of Arizona; grantid: U01HL53938
– fundername: Johns Hopkins University; grantid: U01HL53937; U01HL64360
– fundername: Case Western Reserve University; grantid: U01HL63463
– fundername: National Heart, Lung, and Blood Institute; grantid: R24 HL114473; 75N92019R002
– fundername: Bio and Medical Technology Development Program of the National Research Foundation (NRF) funded by Korean Government [Ministry of Science and Information-Communication-Technology (MSIT)]; grantid: RS-2023-00223501
– fundername: Boston University; grantid: U01HL53941
ISSN | 2169-3536 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
LinkModel | DirectLink |
ORCID | 0000-0003-4917-0177 0000-0003-1998-7784 0000-0001-6218-4560 |
OpenAccessLink | https://ieeexplore.ieee.org/document/10586958 |
PQID | 3081864773 |
PQPubID | 4845423 |
PageCount | 12 |
ParticipantIDs | proquest_journals_3081864773 doaj_primary_oai_doaj_org_article_3d459dcfbb3441ccb3ec37d2734f2123 crossref_primary_10_1109_ACCESS_2024_3424236 ieee_primary_10586958 |
PublicationCentury | 2000 |
PublicationDate | 2024-01-01
PublicationDateYYYYMMDD | 2024-01-01 |
PublicationDecade | 2020 |
PublicationPlace | Piscataway |
PublicationTitle | IEEE access |
PublicationTitleAbbrev | Access |
PublicationYear | 2024 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
SourceID | doaj proquest crossref ieee |
SourceType | Open Website Aggregation Database Publisher |
StartPage | 95914 |
SubjectTerms | Analytical models; Brain modeling; Classification; Classification algorithms; Computer architecture; Computer vision; Convolutional neural networks; Data models; Datasets; Deep learning; Electroencephalography; Electromyography; Feature extraction; Object recognition; Recurrent neural networks; Sleep; Sleep stage classification; Task analysis; Training; Transformer; Transformers
Title | Analysis of Transformer's Attention Behavior in Sleep Stage Classification and Limiting It to Improve Performance |
URI | https://ieeexplore.ieee.org/document/10586958 https://www.proquest.com/docview/3081864773 https://doaj.org/article/3d459dcfbb3441ccb3ec37d2734f2123 |
Volume | 12 |