Interpretable semantic textual similarity: Finding and explaining differences between sentences
Published in: Knowledge-Based Systems, Vol. 119, pp. 186–199
Main Authors: , , , , ,
Format: Journal Article
Language: English
Published: Amsterdam: Elsevier B.V., 01-03-2017
Summary:
• We address interpretability, the ability of machines to explain their reasoning.
• We formalize it for textual similarity as graded, typed alignment between two sentences.
• We release an annotated dataset and build and evaluate a high-performance system.
• We show that the output of the system can be used to produce explanations.
• Two user studies show preliminary evidence that explanations help humans perform better.

User acceptance of artificial intelligence agents might depend on their ability to explain their reasoning to users. We focus on a specific text-processing task, the Semantic Textual Similarity (STS) task, in which systems measure the degree of semantic equivalence between two sentences. We propose adding an interpretability layer (iSTS for short), formalized as an alignment between pairs of segments across the two sentences, where each alignment is labeled with a relation type and a similarity score. A system performing STS can then use the interpretability layer to explain to users why it returned a specific score for a given sentence pair. We present a publicly available dataset of sentence pairs annotated following this formalization, and we develop an iSTS system trained on it which, given a sentence pair, finds what is similar and what is different in the form of graded and typed segment alignments. When evaluated on the dataset, the system performs better than an informed baseline, showing that the dataset and task are well defined and feasible. Most importantly, two user studies show how the iSTS system output can be used to automatically produce explanations in natural language. Users performed the two tasks better when they had access to the explanations, providing preliminary evidence that our dataset and our method for automatically producing explanations help users better understand the output of STS systems.

ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2016.12.013
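
The summary above formalizes interpretability as graded, typed alignments between segment pairs, from which explanations can be templated. Below is a minimal Python sketch of that data structure under stated assumptions: the relation labels (EQUI, SPE1, SPE2, OPPO, SIMI, REL, NOALI) and the 0–5 similarity scale follow the iSTS task description, while the class, the explanation templates, and the example pair are illustrative assumptions, not the authors' system.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical sketch of the iSTS interpretability layer: each alignment
# links a segment (chunk) from each sentence and carries a relation type
# plus a graded 0-5 similarity score. Names here are illustrative.

@dataclass
class Alignment:
    left: str       # segment from sentence 1
    right: str      # segment from sentence 2; empty if unaligned
    relation: str   # relation type, e.g. "EQUI", "SPE1", "OPPO", "NOALI"
    score: float    # graded similarity, 0 (unrelated) to 5 (equivalent)

def explain(alignments: List[Alignment]) -> str:
    """Render a simple natural-language explanation from the alignments
    (assumed templates, one line per aligned segment pair)."""
    templates = {
        "EQUI": '"{l}" and "{r}" mean the same',
        "SPE1": '"{l}" is more specific than "{r}"',
        "SPE2": '"{l}" is more general than "{r}"',
        "OPPO": '"{l}" and "{r}" are opposite in meaning',
        "SIMI": '"{l}" and "{r}" are similar but not equivalent',
        "REL":  '"{l}" and "{r}" are related',
        "NOALI": '"{l}" has no counterpart in the other sentence',
    }
    lines = [
        templates[a.relation].format(l=a.left, r=a.right)
        + f" (score {a.score:g})"
        for a in alignments
    ]
    return "\n".join(lines)

# Illustrative (invented) sentence pair with one equivalent pair,
# one specialization, and one unaligned segment.
pair = [
    Alignment("12 killed", "12 dead", "EQUI", 5.0),
    Alignment("in bus accident", "in accident", "SPE1", 4.0),
    Alignment("in Pakistan", "", "NOALI", 0.0),
]
print(explain(pair))
```

Run as-is, this prints one explanation line per alignment; a real iSTS system would first chunk both sentences and predict the alignments, labels, and scores that are hand-coded in the example.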