Towards Faithful Model Explanation in NLP: A Survey
Main Authors: | Lyu, Qing; Apidianaki, Marianna; Callison-Burch, Chris |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 22-09-2022 |
Subjects: | Computer Science - Computation and Language |
Online Access: | Get full text: https://arxiv.org/abs/2209.11326 |
DOI: | 10.48550/arxiv.2209.11326 |
Copyright: | http://creativecommons.org/licenses/by-nc-nd/4.0 |
Abstract | End-to-end neural Natural Language Processing (NLP) models are notoriously difficult to understand. This has given rise to numerous efforts towards model explainability in recent years. One desideratum of model explanation is faithfulness, i.e. an explanation should accurately represent the reasoning process behind the model's prediction. In this survey, we review over 110 model explanation methods in NLP through the lens of faithfulness. We first discuss the definition and evaluation of faithfulness, as well as its significance for explainability. We then introduce recent advances in faithful explanation, grouping existing approaches into five categories: similarity-based methods, analysis of model-internal structures, backpropagation-based methods, counterfactual intervention, and self-explanatory models. For each category, we synthesize its representative studies, strengths, and weaknesses. Finally, we summarize their common virtues and remaining challenges, and reflect on future work directions towards faithful explainability in NLP. |
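As a concrete illustration of one of the five categories named in the abstract, the sketch below shows a backpropagation-based explanation: gradient x input saliency computed for a toy PyTorch classifier. The model, vocabulary, and input sentence are illustrative assumptions made for this record, not code from the surveyed paper.

```python
# Minimal sketch of a backpropagation-based explanation (gradient x input
# saliency), one of the five method categories named in the abstract.
# The toy model, vocabulary, and sentence are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy sentiment classifier: embed tokens, mean-pool, project to 2 classes.
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}
embed = nn.Embedding(len(vocab), 8)
classifier = nn.Linear(8, 2)

words = ["the", "movie", "was", "great"]
tokens = torch.tensor([[vocab[w] for w in words]])

emb = embed(tokens)   # shape (1, 4, 8); non-leaf tensor, so keep its grad
emb.retain_grad()

logits = classifier(emb.mean(dim=1))   # shape (1, 2)
pred = logits.argmax(dim=-1).item()

# Backpropagate the predicted-class logit down to the token embeddings.
logits[0, pred].backward()

# Gradient x input, summed over the embedding dimension, yields one
# saliency score per token: its estimated contribution to the prediction.
saliency = (emb.grad * emb.detach()).sum(dim=-1).squeeze(0)
for word, score in zip(words, saliency.tolist()):
    print(f"{word:>6}: {score:+.4f}")
```

Each printed score estimates how strongly a token pushed the model toward its predicted class; the faithfulness question the survey studies is whether such scores actually track the model's reasoning process.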