Analysis of Transformer's Attention Behavior in Sleep Stage Classification and Limiting It to Improve Performance

Bibliographic Details
Published in: IEEE Access, Vol. 12, pp. 95914-95925
Main Authors: Kim, Dongyoung; Ko, Young-Woong; Kim, Dong-Kyu; Lee, Jeong-Gun
Format: Journal Article
Language: English
Published: Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2024
Online Access: https://ieeexplore.ieee.org/document/10586958
Abstract: The transformer architecture has been applied to many tasks, such as natural language processing and computer vision. The most common requirement for using a transformer-based architecture is that the model must be pretrained on a large-scale dataset before it can be fine-tuned for a specific task such as classification or object detection. In this paper, however, we find that the transformer architecture generalizes better than CNN-based architectures when capturing features for sleep stage classification, despite being trained on a small-scale dataset without pretraining on a large-scale dataset. This outcome contradicts the widely held belief that a transformer architecture is effective only when trained on large datasets. We investigate the attention behavior of a transformer model and demonstrate how global and local attention influence the attention map in a transformer architecture. Finally, through experiments on three different datasets, we show that restricting global attention using 'Masked Multi-Head Self-Attention (M-MHSA)' improves model generalization in sleep stage classification compared with previous methodologies and the original transformer-based architecture.
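The following sketch is an illustration, not the authors' implementation: it shows one plausible way to restrict global attention with a masked multi-head self-attention layer, assuming a simple banded (local-window) mask. The window size, the tensor shapes, and the use of PyTorch's torch.nn.MultiheadAttention are illustrative assumptions; the record does not specify the exact masking scheme used for M-MHSA.

# Illustrative sketch only (not the paper's code): masked multi-head self-attention
# that blocks attention between positions farther apart than a fixed window,
# one plausible way to limit "global" attention as described in the abstract.
import torch
import torch.nn as nn


def local_band_mask(seq_len: int, window: int) -> torch.Tensor:
    # Boolean mask, True where attention is BLOCKED (|i - j| > window).
    idx = torch.arange(seq_len)
    return (idx[None, :] - idx[:, None]).abs() > window


class MaskedMHSA(nn.Module):
    # Multi-head self-attention whose receptive field is limited by a local mask.
    def __init__(self, dim: int, num_heads: int, window: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.window = window

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); the same mask is shared across heads and batches.
        mask = local_band_mask(x.size(1), self.window).to(x.device)
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out


# Hypothetical usage: 30 feature vectors (e.g., one per sleep epoch), each 128-dim,
# with attention restricted to a +/-3 neighborhood.
x = torch.randn(2, 30, 128)
y = MaskedMHSA(dim=128, num_heads=4, window=3)(x)
print(y.shape)  # torch.Size([2, 30, 128])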
Author details:
– Kim, Dongyoung (ORCID: 0000-0003-1998-7784), Department of Computer Engineering, Hallym University, Chuncheon, South Korea
– Ko, Young-Woong, Department of Computer Engineering, Hallym University, Chuncheon, South Korea
– Kim, Dong-Kyu (ORCID: 0000-0003-4917-0177, email: doctordk@naver.com), Department of Otorhinolaryngology-Head and Neck Surgery, Chuncheon Sacred Heart Hospital, Chuncheon, South Korea
– Lee, Jeong-Gun (ORCID: 0000-0001-6218-4560, email: jeonggun.lee@hallym.ac.kr), Department of Computer Engineering, Hallym University, Chuncheon, South Korea
CODEN: IAECCG
Copyright: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2024
DOI: 10.1109/ACCESS.2024.3424236
EISSN: 2169-3536
Funding:
– University of Minnesota (U01HL53934)
– University of California, Davis (U01HL53916)
– New York University (U01HL53931)
– NRF through the Basic Science Research Program (2021R1F1A1047963)
– University of Arizona (U01HL53938)
– Johns Hopkins University (U01HL53937; U01HL64360)
– Case Western Reserve University (U01HL63463)
– National Heart, Lung, and Blood Institute (R24 HL114473; 75N92019R002)
– Bio and Medical Technology Development Program of the National Research Foundation (NRF) funded by the Korean Government [Ministry of Science and Information-Communication-Technology (MSIT)] (RS-2023-00223501)
– Boston University (U01HL53941)
Page count: 12
Subjects: Analytical models; Brain modeling; Classification; Classification algorithms; Computer architecture; Computer vision; Convolutional neural networks; Data models; Datasets; Deep learning; Electroencephalography; Electromyography; Feature extraction; Object recognition; Recurrent neural networks; Sleep; Sleep stage classification; Task analysis; Training; Transformer; Transformers