Exploiting sequence labeling framework to extract document-level relations from biomedical texts

Bibliographic Details
Published in:	BMC bioinformatics Vol. 21; no. 1; p. 125
Main Authors:	Li, Zhiheng, Yang, Zhihao, Xiang, Yang, Luo, Ling, Sun, Yuanyuan, Lin, Hongfei
Format:	Journal Article
Language:	English
Published:	England: BioMed Central Ltd (BMC), 27-03-2020
Subjects:	Annotations; Biomedical Research; Classification; Data mining; Data Mining - methods; Design; Document-level relation; Electronic health records; Feature extraction; Insulin resistance; Labeling; Medical research; Methodology; Methods; Relation extraction; Semantics; Sequence labeling; Texts

Description
Summary:	Both intra- and inter-sentential semantic relations in biomedical texts provide valuable information for biomedical research. However, most existing methods either focus on intra-sentential relations and ignore inter-sentential ones, or fail to extract inter-sentential relations accurately. They also treat the instances containing entity relations as independent, which neglects the interactions between relations. We propose a novel sequence labeling-based biomedical relation extraction method named Bio-Seq. In this method, the sequence labeling framework is extended with multiple specified feature extractors to facilitate feature extraction at different levels, especially the inter-sentential level. In addition, the sequence labeling framework enables Bio-Seq to take advantage of the interactions between relations, which further improves the precision of document-level relation extraction. The proposed method obtained an F1-score of 63.5% on the BioCreative V chemical disease relation corpus and an F1-score of 54.4% on inter-sentential relations, which was 10.5% better than the document-level classification baseline. It also achieved an F1-score of 85.1% on the n2c2-ADE sub-dataset. These results indicate that sequence labeling can be used successfully to extract document-level relations and is especially effective at boosting performance on inter-sentential relation extraction. This work can facilitate research on document-level biomedical text mining.
ISSN:	1471-2105
DOI:	10.1186/s12859-020-3457-2
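
Illustration (not part of the catalog record): the summary describes casting document-level relation extraction as sequence labeling but does not give the tagging scheme. The sketch below shows one way such a framing could look, under stated assumptions. The B-REL/I-REL/O tags, the function names encode and decode, the entity identifiers, and the toy sentence are all hypothetical and are not taken from the Bio-Seq paper. For one source entity, every token of the document receives a tag, so all relations of that entity, including those that cross sentence boundaries, sit in a single label sequence.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]  # token offsets of a mention, end exclusive


def encode(n_tokens: int, mentions: Dict[str, List[Span]], related: Set[str]) -> List[str]:
    """Build one tag sequence for a single source entity (hypothetical scheme).

    Every mention of a target entity related to the source is tagged
    B-REL/I-REL; all other tokens are tagged O.
    """
    tags = ["O"] * n_tokens
    for entity_id in related:
        for start, end in mentions[entity_id]:
            tags[start] = "B-REL"
            for i in range(start + 1, end):
                tags[i] = "I-REL"
    return tags


def decode(tags: List[str], mentions: Dict[str, List[Span]]) -> Set[str]:
    """Recover related entity ids: an entity counts if any of its mentions starts with B-REL."""
    return {
        entity_id
        for entity_id, spans in mentions.items()
        if any(tags[start] == "B-REL" for start, _ in spans)
    }


# Made-up two-sentence document; the source entity is the chemical "Aspirin".
tokens = "Aspirin reduced pain . It may cause gastric ulcers .".split()
mentions = {"D_pain": [(2, 3)], "D_ulcer": [(7, 9)]}  # candidate disease mentions
related = {"D_ulcer"}  # assumed gold relation; it crosses the sentence boundary

tags = encode(len(tokens), mentions, related)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
print(decode(tags, mentions))
```

Because a single sequence carries every relation of the source entity, a sequence model trained on such tags can, in principle, condition one relation decision on the others, which is the kind of interaction between relations the summary refers to.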