THINKER - Entity Linking System for Turkish Language
Entity linking is one of the problems to be handled in order to process natural language and to enrich the existing unstructured text with metadata. The generation of assignments between knowledge base entities and lexical units is called entity linking. Although a number of systems have been propos...
Saved in:
Published in: | IEEE transactions on knowledge and data engineering Vol. 30; no. 2; pp. 367 - 380 |
---|---|
Main Authors: | , |
Format: | Journal Article |
Language: | English |
Published: |
New York
IEEE
01-02-2018
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Entity linking is one of the problems to be handled in order to process natural language and to enrich the existing unstructured text with metadata. The generation of assignments between knowledge base entities and lexical units is called entity linking. Although a number of systems have been proposed for linking entity mentions in various languages, there is currently no publicly available entity linking system specific to the Turkish language. This paper presents a novel entity linking system-THINKER - for linking Turkish content with entities defined in the Turkish dictionary (tdk.gov.tr) or Turkish Wikipedia (tr.wikipedia.org). Specifically, we first propose a novel machine learning based entity detection algorithm for the Turkish language. Then, we propose a collective disambiguation algorithm which utilizes a set of metrics for the linking task and, which is optimized using a genetic algorithm. The effectiveness of THINKER is validated empirically over generated data sets. The experimental results show that THINKER outperformed the state-of-the-art cross-lingual and multilingual entity linking systems in the literature. High entity linking performance (74.81 percent F1 score) is achieved by extending previous methods with some features specific to Turkish language and by developing a novel method that can learn better representations of entity embeddings. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2017.2761743 |