Representing word meaning in context via lexical substitutes

Bibliographic Details
Published in:	Automatika Vol. 62; no. 2; pp. 239 - 248
Main Authors:	Alagić, Domagoj, Šnajder, Jan
Format:	Journal Article Paper
Language:	English
Published:	Ljubljana; Taylor & Francis; 03-04-2021; Taylor & Francis Ltd; KoREMA - Hrvatsko društvo za komunikacije, računarstvo, elektroniku, mjerenja i automatiku; Taylor & Francis Group
Subjects:	Context; Correlation analysis; Dolphins & porpoises; Empirical analysis; lexical substitution; machine learning; Natural language processing; Representations; Substitutes; word meaning in context; word sense induction; Words (language)

Description
Summary:	Representing the meaning of individual words is crucial for most natural language processing (NLP) tasks. This is challenging, however, because word meaning often depends on context. Recent approaches to representing word meaning in context rely on lexical substitution (LS), where a word is represented with a set of meaning-preserving substitutes. While this representation has face validity, it is not clear to what extent it corresponds to the more established sense-based representation required for many NLP tasks. We present an empirical study that addresses this question by quantifying the correspondence between substitute- and sense-based meaning representations. We compile a high-quality dataset annotated with lexical substitutes and sense labels from two well-established sense inventories, and conduct a correlation analysis using a number of substitute-based similarity measures. Furthermore, as recent work has demonstrated the efficacy of system-produced substitutes for word meaning representation, we compare human- and system-produced substitutes to determine the performance gap between the two. Lastly, we investigate to what extent the results translate to the fundamental semantic task of word sense induction (WSI). Our experiments show the validity of LS for representing word meaning in context and justify the use of system-produced substitutes for WSI.
Bibliography:	269830
ISSN:	0005-1144 1848-3380
DOI:	10.1080/00051144.2021.1928437
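
The correlation analysis mentioned in the summary can be illustrated with a minimal sketch. The record does not name the similarity measures the authors used, so Jaccard overlap between substitute sets is assumed here as one plausible choice, and Pearson correlation against a binary same-sense indicator is likewise an assumption. The instances of the word "bank", their substitutes, and their sense labels are hypothetical toy data, not taken from the paper's dataset.

```python
from itertools import combinations
from math import sqrt


def jaccard(subs_a, subs_b):
    """Jaccard overlap between two substitute sets (assumed measure)."""
    a, b = set(subs_a), set(subs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def pairwise_scores(instances):
    """For every pair of instances, return substitute similarity and
    a 1.0/0.0 indicator of whether the two share a sense label."""
    sims, same_sense = [], []
    for a, b in combinations(instances, 2):
        sims.append(jaccard(a["subs"], b["subs"]))
        same_sense.append(1.0 if a["sense"] == b["sense"] else 0.0)
    return sims, same_sense


# Hypothetical toy data: four occurrences of "bank" with made-up
# substitutes and sense labels.
instances = [
    {"sense": "finance", "subs": ["lender", "institution", "firm"]},
    {"sense": "finance", "subs": ["institution", "lender", "company"]},
    {"sense": "river", "subs": ["shore", "edge", "side"]},
    {"sense": "river", "subs": ["shore", "riverside", "edge"]},
]

sims, same_sense = pairwise_scores(instances)
r = pearson(sims, same_sense)
print(f"correlation between substitute similarity and sense agreement: {r:.3f}")
```

A high correlation on real data would indicate that instances with overlapping substitutes tend to share a sense label, which is the kind of correspondence the study sets out to quantify. The toy data above is deliberately clean and says nothing about the results reported in the article.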