Coreferential Relations in Basque: The Annotation Process

In this paper we present the coreferential tagging of part of the EPEC Corpus of Basque. Although coreference is a pragmatic linguistic phenomenon highly dependent on the situational context, it shows some language-specific patterns that vary according to the features of each language. Due to the fa...

Full description

Saved in:
Bibliographic Details
Published in:Journal of psycholinguistic research Vol. 47; no. 2; pp. 325 - 342
Main Authors: Ceberio, Klara, Aduriz, Itziar, Díaz de Ilarraza, Arantza, Garcia-Azkoaga, Ines
Format: Journal Article
Language:English
Published: New York Springer US 01-04-2018
Springer
Springer Nature B.V
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In this paper we present the coreferential tagging of part of the EPEC Corpus of Basque. Although coreference is a pragmatic linguistic phenomenon highly dependent on the situational context, it shows some language-specific patterns that vary according to the features of each language. Due to the fact that Basque is not an Indo-European language, it differs considerably in grammar from the languages spoken in surrounding areas. We will explain these features and the decisions made in each case. After describing the criteria defined for coreferential tagging in Basque, the annotation process will be explained. Our annotation is based on a morphologically and syntactically annotated corpus that provides us with a manageable environment, in which the specific structures that are part of a reference chain can be more easily identified. A part of the corpus was tagged by two annotators who marked up the same text independently, and by another annotator that acted as judge, solving problems in case of disagreement. All this process has been automatized as a result of previous studies carried out in this field. The automatic detection of mentions (Soraluze et al., in: Proceedings of Konvens,  2012 ) has provided us with a better working environment, and given us the possibility to build a first significant corpus for a later computational treatment of automatic coreferential resolution.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:0090-6905
1573-6555
DOI:10.1007/s10936-018-9559-6