Improving BERT-Based Text Classification With Auxiliary Sentence and Domain Knowledge
Published in: IEEE Access, Vol. 7, pp. 176600-176612
Main Authors: Shanshan Yu; Jindian Su; Da Luo
Format: Journal Article
Language: English
Published: Piscataway: IEEE, 2019
Abstract | The general language model BERT, pre-trained on the cross-domain corpora BookCorpus and Wikipedia, achieves excellent performance on a range of natural language processing tasks when fine-tuned on downstream tasks. However, it still lacks the task-specific and domain-related knowledge needed to improve its performance further, and more detailed analyses of fine-tuning strategies are necessary. To address these problems, a BERT-based text classification model, BERT4TC, is proposed: it constructs an auxiliary sentence to turn the classification task into a binary sentence-pair task, targeting the limited-training-data and task-awareness problems. The architecture and implementation details of BERT4TC are presented, together with a post-training approach that addresses BERT's domain challenge. Finally, extensive experiments are conducted on seven widely studied public datasets to analyze fine-tuning strategies from the perspectives of learning rate, sequence length, and hidden-state vector selection. BERT4TC models with different auxiliary sentences and post-training objectives are then compared and analyzed in depth. The results show that BERT4TC with a suitable auxiliary sentence significantly outperforms both typical feature-based methods and fine-tuning methods, achieving new state-of-the-art performance on the multi-class classification datasets. On the binary sentiment classification datasets, BERT4TC post-trained on a suitable domain-related corpus also achieves better results than the original BERT model.
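The auxiliary-sentence construction described in the abstract can be sketched as follows: a k-class classification example is expanded into k binary sentence-pair examples, one per candidate label, which matches BERT's native sentence-pair input format. This is a minimal illustrative sketch, not the paper's exact implementation; the label template and class names are assumptions.

```python
# Sketch of the auxiliary-sentence idea: one (text, label) example from a
# k-class task is rewritten as k binary sentence-pair examples, one per
# candidate label. The template and label names below are illustrative.

def to_sentence_pairs(text, true_label, all_labels, template="the category is {}"):
    """Turn one (text, label) example into binary sentence-pair examples.

    Each pair is (text, auxiliary_sentence, y), with y = 1 when the
    auxiliary sentence names the true label and 0 otherwise, matching
    BERT's [CLS] text [SEP] auxiliary [SEP] sentence-pair input.
    """
    pairs = []
    for label in all_labels:
        aux = template.format(label)
        pairs.append((text, aux, 1 if label == true_label else 0))
    return pairs


if __name__ == "__main__":
    # Hypothetical 4-class example (labels resemble AG's News categories).
    labels = ["world", "sports", "business", "sci/tech"]
    for text, aux, y in to_sentence_pairs(
        "Stocks rallied after the earnings report.", "business", labels
    ):
        print(y, "|", aux)
```

At inference time the model scores all k pairs for a text and predicts the label whose auxiliary sentence receives the highest "match" probability, which is how the sentence-pair formulation recovers a multi-class decision.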
Author | Shanshan Yu; Jindian Su; Da Luo
Author details |
- Shanshan Yu (ORCID 0000-0002-0864-3970), College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou, China; email: susyu@139.com
- Jindian Su (ORCID 0000-0001-5878-1848), College of Computer Science and Engineering, South China University of Technology, Guangzhou, China; email: sujd@scut.edu.cn
- Da Luo (ORCID 0000-0003-1183-8354), College of Computer Science and Engineering, South China University of Technology, Guangzhou, China
CODEN | IAECCG |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019 |
DOI | 10.1109/ACCESS.2019.2953990 |
Discipline | Engineering |
EISSN | 2169-3536 |
EndPage | 176612 |
Genre | orig-research |
GrantInformation |
- Guangdong Science and Technology Department (grant 20168010124010)
- Natural Science Foundation of Guangdong Province (grant 2015A030310318)
- Medical Scientific Research Foundation of Guangdong Province (grant A2015065)
- Guangdong Pharmaceutical University (grant 52159433)
- National Natural Science Foundation of China (grant 61936003)
- Research and Development Program in Key Areas of Guangdong Province (grant 2018B010109004)
ISSN | 2169-3536 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
ORCID | 0000-0003-1183-8354 0000-0002-0864-3970 0000-0001-5878-1848 |
OpenAccessLink | https://ieeexplore.ieee.org/document/8903313 |
PageCount | 13 |
PublicationDate | 2019
PublicationPlace | Piscataway
PublicationTitle | IEEE Access
PublicationTitleAbbrev | Access |
PublicationYear | 2019 |
Publisher | IEEE (The Institute of Electrical and Electronics Engineers, Inc.)
StartPage | 176600 |
SubjectTerms | bidirectional encoder representations from transformer; Bit error rate; Classification; Context modeling; Datasets; Domains; language model; Natural language processing; neural networks; Sentences; State vectors; Task analysis; Text categorization; text classification; Training; Training data
URI | https://ieeexplore.ieee.org/document/8903313 https://www.proquest.com/docview/2455632696 https://doaj.org/article/588736ef393b4db4a59bc381fe893489 |
Volume | 7 |