Pragmatically Appropriate Diversity for Dialogue Evaluation
Main Authors: | , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 05-04-2023 |
Subjects: | |
Summary: | Linguistic pragmatics states that a conversation's underlying speech acts can
constrain the type of response which is appropriate at each turn in the
conversation. When generating dialogue responses, neural dialogue agents
struggle to produce diverse responses. Currently, dialogue diversity is
assessed using automatic metrics, but the underlying speech acts do not inform
these metrics.
To remedy this, we propose the notion of Pragmatically Appropriate Diversity,
defined as the extent to which a conversation creates and constrains the
creation of multiple diverse responses. Using a human-created multi-response
dataset, we find significant support for the hypothesis that speech acts
provide a signal for the diversity of the set of next responses. Building on
this result, we propose a new human evaluation task where creative writers
predict the extent to which conversations inspire the creation of multiple
diverse responses. Our studies find that writers' judgments align with the
Pragmatically Appropriate Diversity of conversations. Our work suggests that
expectations for diversity metric scores should vary depending on the speech
act. |
---|---|
DOI: | 10.48550/arxiv.2304.02812 |