Towards Faithful Model Explanation in NLP: A Survey
Main Authors: | Lyu, Qing; Apidianaki, Marianna; Callison-Burch, Chris |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 22-09-2022 |
Subjects: | Computer Science - Computation and Language |
Online Access: | Get full text: https://arxiv.org/abs/2209.11326 |
DOI: | 10.48550/arxiv.2209.11326 |
Copyright: | http://creativecommons.org/licenses/by-nc-nd/4.0 |
Abstract | End-to-end neural Natural Language Processing (NLP) models are notoriously difficult to understand. This has given rise to numerous efforts towards model explainability in recent years. One desideratum of model explanation is faithfulness, i.e. an explanation should accurately represent the reasoning process behind the model's prediction. In this survey, we review over 110 model explanation methods in NLP through the lens of faithfulness. We first discuss the definition and evaluation of faithfulness, as well as its significance for explainability. We then introduce recent advances in faithful explanation, grouping existing approaches into five categories: similarity-based methods, analysis of model-internal structures, backpropagation-based methods, counterfactual intervention, and self-explanatory models. For each category, we synthesize its representative studies, strengths, and weaknesses. Finally, we summarize their common virtues and remaining challenges, and reflect on future work directions towards faithful explainability in NLP. |
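As a concrete illustration of one of the five categories named in the abstract, the sketch below shows a backpropagation-based explanation: gradient x input saliency computed for a toy PyTorch classifier. The model, vocabulary, and input sentence are illustrative assumptions made for this record, not code from the surveyed paper.

```python
# Minimal sketch of a backpropagation-based explanation (gradient x input
# saliency), one of the five method categories named in the abstract.
# The toy model, vocabulary, and sentence are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy sentiment classifier: embed tokens, mean-pool, project to 2 classes.
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}
embed = nn.Embedding(len(vocab), 8)
classifier = nn.Linear(8, 2)

words = ["the", "movie", "was", "great"]
tokens = torch.tensor([[vocab[w] for w in words]])

emb = embed(tokens)   # shape (1, 4, 8); non-leaf tensor, so keep its grad
emb.retain_grad()

logits = classifier(emb.mean(dim=1))   # shape (1, 2)
pred = logits.argmax(dim=-1).item()

# Backpropagate the predicted-class logit down to the token embeddings.
logits[0, pred].backward()

# Gradient x input, summed over the embedding dimension, yields one
# saliency score per token: its estimated contribution to the prediction.
saliency = (emb.grad * emb.detach()).sum(dim=-1).squeeze(0)
for word, score in zip(words, saliency.tolist()):
    print(f"{word:>6}: {score:+.4f}")
```

Each printed score estimates how strongly a token pushed the model toward its predicted class; the faithfulness question the survey studies is whether such scores actually track the model's reasoning process.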