Improving BERT-Based Text Classification With Auxiliary Sentence and Domain Knowledge

Bibliographic Details
Published in: IEEE Access, Vol. 7, pp. 176600-176612
Main Authors: Yu, Shanshan; Su, Jindian; Luo, Da
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2019
Abstract The general-purpose language model BERT, pre-trained on the cross-domain text corpora BookCorpus and Wikipedia, achieves excellent performance on a range of natural language processing tasks by fine-tuning on downstream tasks. However, it still lacks the task-specific and domain-related knowledge needed to further improve performance, and more detailed analyses of fine-tuning strategies are necessary. To address these problems, a BERT-based text classification model, BERT4TC, is proposed that constructs an auxiliary sentence to turn the classification task into a binary sentence-pair task, aiming to alleviate the limited-training-data and task-awareness problems. The architecture and implementation details of BERT4TC are presented, along with a post-training approach for addressing BERT's domain challenge. Finally, extensive experiments are conducted on seven widely studied public datasets to analyze fine-tuning strategies from the perspectives of learning rate, sequence length, and hidden-state vector selection. BERT4TC models with different auxiliary sentences and post-training objectives are then compared and analyzed in depth. The experimental results show that BERT4TC with a suitable auxiliary sentence significantly outperforms both typical feature-based and fine-tuning methods, achieving new state-of-the-art performance on multi-class classification datasets. On binary sentiment classification datasets, BERT4TC post-trained on a suitable domain-related corpus also achieves better results than the original BERT model.
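The auxiliary-sentence construction described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the auxiliary-sentence template, function name, and label strings are hypothetical. The idea is that each k-way classification example is expanded into k sentence pairs, one per candidate label, so the model is fine-tuned as a binary sentence-pair classifier.

```python
# Sketch of turning k-way text classification into binary sentence-pair
# classification via an auxiliary sentence (template is an assumption).

def build_sentence_pairs(text, gold_label, candidate_labels):
    """Expand one labeled example into binary sentence-pair examples.

    Each element is (text, auxiliary_sentence, target), where target is 1
    when the auxiliary sentence names the gold label and 0 otherwise.
    """
    pairs = []
    for label in candidate_labels:
        auxiliary = f"the category of this text is {label}"
        pairs.append((text, auxiliary, 1 if label == gold_label else 0))
    return pairs

pairs = build_sentence_pairs(
    "the movie was a delight from start to finish",
    "positive",
    ["positive", "negative"],
)
```

At prediction time, each candidate label's pair is scored and the label whose auxiliary sentence receives the highest "match" probability is chosen.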
Author Details:
– Yu, Shanshan (ORCID 0000-0002-0864-3970; susyu@139.com), College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, China
– Su, Jindian (ORCID 0000-0001-5878-1848; sujd@scut.edu.cn), College of Computer Science and Engineering, South China University of Technology, Guangzhou, China
– Luo, Da (ORCID 0000-0003-1183-8354), College of Computer Science and Engineering, South China University of Technology, Guangzhou, China
CODEN IAECCG
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019
DOI 10.1109/ACCESS.2019.2953990
Discipline Engineering
EISSN 2169-3536
EndPage 176612
Genre orig-research
GrantInformation:
– Guangdong Science and Technology Department (grant 20168010124010)
– Natural Science Foundation of Guangdong Province (grant 2015A030310318)
– Medical Scientific Research Foundation of Guangdong Province (grant A2015065)
– Guangdong Pharmaceutical University (grant 52159433)
– National Natural Science Foundation of China (grant 61936003)
– Research and Development Program in Key Areas of Guangdong Province (grant 2018B010109004)
ISSN 2169-3536
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
OpenAccessLink https://ieeexplore.ieee.org/document/8903313
PageCount 13
PublicationDate 2019
PublicationPlace Piscataway
PublicationTitle IEEE access
PublicationTitleAbbrev Access
PublicationYear 2019
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 176600
SubjectTerms bidirectional encoder representations from transformer
Bit error rate
Classification
Context modeling
Datasets
Domains
language model
Natural language processing
neural networks
Sentences
State vectors
Task analysis
Text categorization
text classification
Training
Training data
URI https://ieeexplore.ieee.org/document/8903313
https://www.proquest.com/docview/2455632696
https://doaj.org/article/588736ef393b4db4a59bc381fe893489
Volume 7