Analysis of Transformer's Attention Behavior in Sleep Stage Classification and Limiting It to Improve Performance

Bibliographic Details
Published in: IEEE Access, Vol. 12, pp. 95914-95925
Main Authors: Kim, Dongyoung; Ko, Young-Woong; Kim, Dong-Kyu; Lee, Jeong-Gun
Format: Journal Article
Language: English
Published: Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2024
Online Access: https://ieeexplore.ieee.org/document/10586958
Abstract: The transformer architecture has been applied to many tasks, such as natural language processing and computer vision. The most common requirement for using a transformer-based architecture is that the model must be pretrained on a large-scale dataset before it can be fine-tuned for a specific task such as classification or object detection. In this paper, however, we find that the transformer architecture generalizes better than CNN-based architectures when capturing features for sleep stage classification, despite being trained on a small-scale dataset without pretraining on a large-scale dataset. This outcome contradicts the widely held belief that a transformer architecture is effective only when trained on large datasets. We investigate the attention behavior of a transformer model and demonstrate how global and local attention influence the attention map in a transformer architecture. Finally, through experiments on three different datasets, we show that restricting global attention using 'Masked Multi-Head Self-Attention (M-MHSA)' improves model generalization in sleep stage classification compared with previous methodologies and the original transformer-based architecture.
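The following sketch is an illustration, not the authors' implementation: it shows one plausible way to restrict global attention with a masked multi-head self-attention layer, assuming a simple banded (local-window) mask. The window size, the tensor shapes, and the use of PyTorch's torch.nn.MultiheadAttention are illustrative assumptions; the record does not specify the exact masking scheme used for M-MHSA.

# Illustrative sketch only (not the paper's code): masked multi-head self-attention
# that blocks attention between positions farther apart than a fixed window,
# one plausible way to limit "global" attention as described in the abstract.
import torch
import torch.nn as nn


def local_band_mask(seq_len: int, window: int) -> torch.Tensor:
    # Boolean mask, True where attention is BLOCKED (|i - j| > window).
    idx = torch.arange(seq_len)
    return (idx[None, :] - idx[:, None]).abs() > window


class MaskedMHSA(nn.Module):
    # Multi-head self-attention whose receptive field is limited by a local mask.
    def __init__(self, dim: int, num_heads: int, window: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.window = window

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); the same mask is shared across heads and batches.
        mask = local_band_mask(x.size(1), self.window).to(x.device)
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out


# Hypothetical usage: 30 feature vectors (e.g., one per sleep epoch), each 128-dim,
# with attention restricted to a +/-3 neighborhood.
x = torch.randn(2, 30, 128)
y = MaskedMHSA(dim=128, num_heads=4, window=3)(x)
print(y.shape)  # torch.Size([2, 30, 128])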
Author details:
– Kim, Dongyoung (ORCID: 0000-0003-1998-7784), Department of Computer Engineering, Hallym University, Chuncheon, South Korea
– Ko, Young-Woong, Department of Computer Engineering, Hallym University, Chuncheon, South Korea
– Kim, Dong-Kyu (ORCID: 0000-0003-4917-0177, email: doctordk@naver.com), Department of Otorhinolaryngology-Head and Neck Surgery, Chuncheon Sacred Heart Hospital, Chuncheon, South Korea
– Lee, Jeong-Gun (ORCID: 0000-0001-6218-4560, email: jeonggun.lee@hallym.ac.kr), Department of Computer Engineering, Hallym University, Chuncheon, South Korea
CODEN: IAECCG
Copyright: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2024
DOI: 10.1109/ACCESS.2024.3424236
EISSN: 2169-3536
Funding:
– University of Minnesota (U01HL53934)
– University of California, Davis (U01HL53916)
– New York University (U01HL53931)
– NRF through the Basic Science Research Program (2021R1F1A1047963)
– University of Arizona (U01HL53938)
– Johns Hopkins University (U01HL53937; U01HL64360)
– Case Western Reserve University (U01HL63463)
– National Heart, Lung, and Blood Institute (R24 HL114473; 75N92019R002)
– Bio and Medical Technology Development Program of the National Research Foundation (NRF) funded by the Korean Government [Ministry of Science and Information-Communication-Technology (MSIT)] (RS-2023-00223501)
– Boston University (U01HL53941)
Page count: 12
Subjects: Analytical models; Brain modeling; Classification; Classification algorithms; Computer architecture; Computer vision; Convolutional neural networks; Data models; Datasets; Deep learning; Electroencephalography; Electromyography; Feature extraction; Object recognition; Recurrent neural networks; Sleep; Sleep stage classification; Task analysis; Training; Transformer; Transformers