Record linkage for character-based surnames: Evidence from Chinese exclusion

Bibliographic Details
Published in: Explorations in Economic History, Vol. 87, p. 101493
Main Author: Postel, Hannah M.
Format: Journal Article
Language: English
Published: United States Elsevier Inc 01-01-2023
Description
Summary: This paper proposes a novel pre-processing technique to improve record linkage for historical Chinese populations. Current matching approaches are relatively ineffective due to Chinese-specific naming conventions and enumeration errors. This paper develops a three-step process that both triples the match rate over baseline and improves match accuracy. The procedures developed in this paper can be applied in part or in full to other sources of historical data, and/or modified for use with other character-based languages such as Japanese. More broadly, this approach suggests the promise of language-specific linkage procedures to boost match rates for ethnic minority groups.
ISSN: 0014-4983, 1090-2457
DOI: 10.1016/j.eeh.2022.101493