The FIX Benchmark: Extracting Features Interpretable to eXperts
Main Authors: 
Format: Journal Article
Language: English
Published: 20-09-2024
Subjects: 
Online Access: Get full text
Summary: Feature-based methods are commonly used to explain model predictions, but these methods often implicitly assume that interpretable features are readily available. However, this is often not the case for high-dimensional data, and it can be hard even for domain experts to mathematically specify which features are important. Can we instead automatically extract collections or groups of features that are aligned with expert knowledge? To address this gap, we present FIX (Features Interpretable to eXperts), a benchmark for measuring how well a collection of features aligns with expert knowledge. In collaboration with domain experts, we propose FIXScore, a unified expert alignment measure applicable to diverse real-world settings across cosmology, psychology, and medicine domains in vision, language, and time series data modalities. With FIXScore, we find that popular feature-based explanation methods have poor alignment with expert-specified knowledge, highlighting the need for new methods that can better identify features interpretable to experts.
DOI: 10.48550/arxiv.2409.13684
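
The summary describes FIXScore as a measure of how well extracted feature groups align with expert-annotated groups, without giving its definition here. Below is a minimal illustrative sketch, not the paper's actual FIXScore: it assumes feature groups are represented as boolean masks over low-level features and scores each extracted group by its best intersection-over-union with any expert group. The function names, masks, and IoU-based scoring are assumptions for illustration only; see the paper (DOI above) for the real measure.

```python
# Illustrative sketch only: a simple overlap-based alignment score between
# automatically extracted feature groups and expert-annotated groups.
# The IoU-based scoring and all names here are assumptions, not the
# paper's FIXScore definition.
import numpy as np

def group_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two boolean feature masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union > 0 else 0.0

def alignment_score(extracted: list[np.ndarray], expert: list[np.ndarray]) -> float:
    """Average, over extracted groups, of the best IoU with any expert group.

    Both arguments are lists of boolean masks over the same set of
    low-level features (e.g., pixels, tokens, or time steps).
    """
    if not extracted or not expert:
        return 0.0
    best = [max(group_iou(g, e) for e in expert) for g in extracted]
    return float(np.mean(best))

# Toy usage: 10 low-level features, two extracted groups, two expert groups.
expert_groups = [np.array([1] * 5 + [0] * 5, dtype=bool),
                 np.array([0] * 5 + [1] * 5, dtype=bool)]
extracted_groups = [np.array([1] * 4 + [0] * 6, dtype=bool),
                    np.array([0] * 6 + [1] * 4, dtype=bool)]
print(alignment_score(extracted_groups, expert_groups))  # 0.8
```

Under this toy scoring, a method whose extracted groups exactly match the expert groups scores 1.0, while groups that mix unrelated low-level features score lower, which is the qualitative behavior the benchmark's expert alignment measure is designed to capture.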