Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments
Saved in:
Main Authors: | Domnich, Marharyta; Valja, Julius; Veski, Rasmus Moorits; Magnifico, Giacomo; Tulver, Kadi; Barbu, Eduard; Vicente, Raul
---|---
Format: | Journal Article (preprint)
Language: | English
Published: | 28-10-2024
Subjects: | Computer Science - Artificial Intelligence; Computer Science - Computation and Language
Online Access: | Get full text: https://arxiv.org/abs/2410.21131
Abstract | As machine learning models evolve, maintaining transparency demands more human-centric explainable AI techniques. Counterfactual explanations, with roots in human reasoning, identify the minimal input changes needed to obtain a given output and, hence, are crucial for supporting decision-making. Despite their importance, the evaluation of these explanations often lacks grounding in user studies and remains fragmented, with existing metrics not fully capturing human perspectives. To address this challenge, we developed a diverse set of 30 counterfactual scenarios and collected ratings across 8 evaluation metrics from 206 respondents. Subsequently, we fine-tuned different Large Language Models (LLMs) to predict average or individual human judgments across these metrics. Our methodology allowed LLMs to achieve an accuracy of up to 63% in zero-shot evaluations and 85% (on a 3-class prediction task) with fine-tuning across all metrics. The fine-tuned models predicting human ratings offer better comparability and scalability in evaluating different counterfactual explanation frameworks. |
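The abstract describes a pipeline in which an LLM rates counterfactual explanations on a 3-class scale and is scored against human judgments. The sketch below is purely illustrative and is not the authors' released code: the `call_llm` callable, the class labels, the prompt wording, and the example fields are all hypothetical placeholders for whatever client and rating scale one actually uses.

```python
# Illustrative sketch of zero-shot 3-class LLM rating of counterfactual
# explanations, scored against the majority human rating. All names here
# are assumptions, not the paper's implementation.
from collections import Counter
from typing import Callable, List

CLASSES = ["low", "medium", "high"]  # assumed 3-class rating labels


def build_prompt(scenario: str, counterfactual: str, metric: str) -> str:
    """Assemble a zero-shot rating prompt for one evaluation metric."""
    return (
        f"Scenario: {scenario}\n"
        f"Counterfactual explanation: {counterfactual}\n"
        f"Rate the explanation's {metric} as one of: {', '.join(CLASSES)}. "
        f"Answer with a single word."
    )


def parse_rating(raw: str) -> str:
    """Map the model's free-text answer onto one of the three classes."""
    text = raw.strip().lower()
    for c in CLASSES:
        if c in text:
            return c
    return "medium"  # fall back to the middle class on unparseable output


def zero_shot_accuracy(call_llm: Callable[[str], str],
                       examples: List[dict]) -> float:
    """Fraction of examples where the LLM matches the majority human rating.

    `call_llm` is a user-supplied function wrapping any chat-completion API;
    each example dict carries scenario, counterfactual, metric, human_ratings.
    """
    hits = 0
    for ex in examples:
        prompt = build_prompt(ex["scenario"], ex["counterfactual"], ex["metric"])
        pred = parse_rating(call_llm(prompt))
        majority = Counter(ex["human_ratings"]).most_common(1)[0][0]
        hits += pred == majority
    return hits / len(examples)
```

A fine-tuned variant of this setup would replace the generic `call_llm` with a model trained on the collected ratings; the scoring loop itself stays the same.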
Copyright | http://creativecommons.org/licenses/by/4.0 |
DOI | 10.48550/arxiv.2410.21131 |