Semi-automatic construction of word-formation networks

Bibliographic Details
Published in: Language Resources and Evaluation Vol. 55; no. 1; pp. 3-32
Main Authors: Lango, Mateusz, Žabokrtský, Zdeněk, Ševčíková, Magda
Format: Journal Article
Language: English
Published: Dordrecht Springer Netherlands 01-03-2021
Springer Nature B.V
Description
Summary: The article presents a semi-automatic method for constructing word-formation networks, focusing particularly on derivation. The proposed approach applies a sequential pattern mining technique to construct useful morphological features in an unsupervised manner. The features take the form of regular expressions and are later fed to a machine-learned ranking model. The network is constructed by applying the learned model to sort the lists of possible base words and selecting the most probable ones. Apart from a relatively small training set and a lexicon, this approach does not require any additional language resources, such as a list of vowel and consonant alternations or part-of-speech tags. The proposed approach is evaluated on lexeme sets of four languages: Polish, Spanish, Czech, and French. The experiments demonstrate that the proposed method can construct linguistically adequate word-formation networks from small training sets. Furthermore, a feasibility study shows that the method can benefit further from interaction with a human language expert within an active learning framework.
ISSN: 1574-020X
1574-0218
DOI: 10.1007/s10579-019-09484-2
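
The ranking step described in the Summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the article mines its regular-expression features with sequential pattern mining and learns the ranking model from training data, whereas the two patterns, their weights, the `base>derived` pair encoding, the function names, and the English toy lexicon below are all hypothetical and chosen by hand for illustration.

```python
import re

# Hypothetical regex features over a "base>derived" string, with hand-set
# weights. In the article, features are mined automatically and the ranking
# model is learned; both are replaced by fixed values in this sketch.
FEATURES = [
    (re.compile(r"^(.+)>\1er$"), 2.0),  # derived word = base word + "er"
    (re.compile(r"^(.)(.*)>\1"), 0.5),  # both words share the first letter
]


def score(base, derived):
    """Sum the weights of all features matching the candidate pair."""
    pair = f"{base}>{derived}"
    return sum(weight for pattern, weight in FEATURES if pattern.match(pair))


def rank_bases(derived, lexicon):
    """Sort every other lexicon entry by descending score as a base word."""
    candidates = [word for word in lexicon if word != derived]
    return sorted(candidates, key=lambda b: score(b, derived), reverse=True)


def best_base(derived, lexicon):
    """Pick the top-ranked candidate, or None if there are no candidates."""
    ranked = rank_bases(derived, lexicon)
    return ranked[0] if ranked else None


if __name__ == "__main__":
    toy_lexicon = ["teach", "teacher", "tea", "reach"]  # toy data, assumed
    print(rank_bases("teacher", toy_lexicon))
    print(best_base("teacher", toy_lexicon))
```

Linking each lexeme to its selected base word would yield the edges of a word-formation network. A fuller version would also need a way to leave underived words without a base, which this sketch omits.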