Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments

Bibliographic Details
Main Authors: Domnich, Marharyta, Valja, Julius, Veski, Rasmus Moorits, Magnifico, Giacomo, Tulver, Kadi, Barbu, Eduard, Vicente, Raul
Format: Journal Article
Language: English
Published: 28-10-2024
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computation and Language
Online Access: https://arxiv.org/abs/2410.21131
DOI: 10.48550/arxiv.2410.21131
Copyright: http://creativecommons.org/licenses/by/4.0 (CC BY 4.0)

Abstract: As machine learning models evolve, maintaining transparency demands more human-centric explainable AI techniques. Counterfactual explanations, with roots in human reasoning, identify the minimal input changes needed to obtain a given output and are hence crucial for supporting decision-making. Despite their importance, the evaluation of these explanations often lacks grounding in user studies and remains fragmented, with existing metrics not fully capturing human perspectives. To address this challenge, we developed a diverse set of 30 counterfactual scenarios and collected ratings across 8 evaluation metrics from 206 respondents. We then fine-tuned different Large Language Models (LLMs) to predict average or individual human judgments across these metrics. Our methodology allowed LLMs to achieve accuracies of up to 63% in zero-shot evaluation and 85% (over a 3-class prediction) with fine-tuning across all metrics. The fine-tuned models predicting human ratings offer better comparability and scalability for evaluating different counterfactual explanation frameworks.
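The record gives no implementation details beyond the abstract, so the following is only a minimal sketch of the kind of setup the abstract describes: fine-tuning a language model to predict binned human ratings (a 3-class prediction) for counterfactual scenarios. The model name, the text encoding of scenario/counterfactual/metric, and the low/medium/high label binning are all illustrative assumptions, not the authors' actual choices.

    # Hedged sketch: fine-tune a small encoder to predict 3-class human
    # ratings for counterfactual explanations. Model, data format, and
    # label scheme are assumptions; the record does not specify them.
    from datasets import Dataset
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    # Hypothetical rows: one per (scenario, counterfactual, metric) triple,
    # with the human rating binned into 0 = low, 1 = medium, 2 = high.
    rows = [
        {"text": "Scenario: loan denied. Counterfactual: raise income by 5k. "
                 "Metric: feasibility.", "label": 2},
        {"text": "Scenario: loan denied. Counterfactual: change age 40 to 20. "
                 "Metric: feasibility.", "label": 0},
    ]
    ds = Dataset.from_list(rows)

    model_name = "distilbert-base-uncased"  # placeholder, not the paper's LLMs
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=3)

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True,
                         padding="max_length", max_length=128)

    ds = ds.map(tokenize, batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="cf-rating-clf",
                               per_device_train_batch_size=8,
                               num_train_epochs=3),
        train_dataset=ds,
    )
    trainer.train()

Under this reading, the abstract's 85% figure would correspond to evaluating such a fine-tuned model on held-out human ratings, while the 63% zero-shot figure would come from prompting an LLM without any training step.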