Human or Machine: Automating Human Likeliness Evaluation of NLG Texts
Format: Journal Article
Language: English
Published: 04-06-2020
Summary: Automatic evaluation of various text quality criteria produced by data-driven intelligent methods is very common and useful because it is cheap, fast, and usually yields repeatable results. In this paper, we present an attempt to automate the human likeliness evaluation of the output text samples coming from natural language generation methods used to solve several tasks. We propose to use a human likeliness score that shows the percentage of a method's output samples that look as if they were written by a human. Instead of having human participants label or rate those samples, we completely automate the process with a discrimination procedure based on large pretrained language models and their probability distributions. As a follow-up, we plan to perform an empirical analysis of human-written and machine-generated texts to find the optimal setup of this evaluation approach. A validation procedure involving human participants will also check how well the automatic evaluation correlates with human judgments.
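
The summary describes the discrimination procedure only at a high level. The sketch below shows one plausible instantiation, not necessarily the authors' exact method: a perplexity-threshold discriminator built on a pretrained GPT-2 from the Hugging Face transformers library. The model choice, the threshold `tau`, and perplexity as the decision statistic are all illustrative assumptions.

```python
# A minimal sketch of a human-likeliness score, assuming a simple
# perplexity-threshold discriminator under GPT-2. The threshold `tau`
# and perplexity as the decision statistic are assumptions for
# illustration, not the paper's confirmed procedure.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the pretrained language model."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    # out.loss is the mean per-token negative log-likelihood.
    return torch.exp(out.loss).item()

def human_likeliness_score(samples: list[str], tau: float = 40.0) -> float:
    """Fraction of samples the discriminator judges human-written.

    Assumption: machine-generated text tends to have lower perplexity
    under a large pretrained LM than human text, so samples scoring
    above `tau` are labeled human. `tau` would be tuned empirically.
    """
    looks_human = sum(perplexity(s) > tau for s in samples)
    return looks_human / len(samples)
```

Under this reading, the planned empirical analysis would amount to choosing the discriminator model and calibrating `tau` on held-out human-written and machine-generated texts, and the planned validation study would compare the resulting scores against human judgments.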
DOI: 10.48550/arxiv.2006.03189