Cross-domain NER in the data-poor scenarios for human mobility knowledge

Bibliographic Details
Published in: GeoInformatica Vol. 28; no. 4; pp. 535-557
Main Authors: Jiang, Yutong; Jin, Fusheng; Chen, Mengnan; Liu, Guoming; Pang, He; Yuan, Ye
Format: Journal Article
Language: English
Published: New York: Springer US, 01-10-2024
Springer Nature B.V.
Description
Summary: In recent years, the exploration of knowledge in large-scale human mobility has gained significant attention. Named Entity Recognition (NER) is a crucial technology for achieving a semantic understanding of human behavior and uncovering patterns in large-scale human mobility. Rapid advances in IoT and CPS technologies have led to the collection of massive human mobility data from various sources. There is therefore a need for cross-domain NER, which transfers entity information from a source domain to automatically identify and classify entities in texts from different target domains. How to transfer human mobility knowledge across time and space is particularly significant in data-poor scenarios. This paper therefore proposes an Adaptive Text Sequence Enhancement Module (at-SAM) to help the model strengthen the association between entities in sentences in data-poor target domains. It also proposes a Predicted Label-Guided Dual Sequence Aware Information Module (Dual-SAM) to improve the transferability of label information. Experiments were conducted in domains that contain hidden knowledge about human mobility. The results show that the method can transfer task knowledge between multiple different domains in data-poor scenarios and achieve state-of-the-art (SOTA) performance.
ISSN: 1384-6175, 1573-7624
DOI: 10.1007/s10707-024-00513-z