Hierarchical Text Classification As Sub-Hierarchy Sequence Generation
Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 12933-12941 (2023)
Main Authors: | , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 07-11-2023 |
Subjects: | |
Summary: | Hierarchical text classification (HTC) is essential for many
real-world applications. However, HTC models are challenging to develop because
they often require processing a large volume of documents and labels organized
in a hierarchical taxonomy. Recent HTC models based on deep learning have
attempted to
incorporate hierarchy information into the model structure. Consequently, these
models are difficult to scale to a large hierarchy: because the model structure
depends on the hierarchy size, the number of parameters grows with it. To solve
this problem, we formulate HTC as sub-hierarchy sequence generation, which
incorporates hierarchy information into a target label sequence
instead of the model structure. Subsequently, we propose the Hierarchy DECoder
(HiDEC), which decodes a text sequence into a sub-hierarchy sequence using
recursive hierarchy decoding, classifying all parents at the same level into
children at once. In addition, HiDEC is trained to use hierarchical path
information from a root to each leaf in a sub-hierarchy composed of the labels
of a target document via an attention mechanism and hierarchy-aware masking.
HiDEC achieved state-of-the-art performance with significantly fewer model
parameters than existing models on benchmark datasets, such as RCV1-v2, NYT,
and EURLEX57K. |
---|---|
DOI: | 10.48550/arxiv.2111.11104 |
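
The Summary's central idea, moving hierarchy information out of the model structure and into the target label sequence, can be illustrated with a small sketch. This is not the HiDEC implementation: the toy taxonomy `PARENT`, the bracket tokens `(` and `)`, and the helper names `sub_hierarchy`, `to_sequence`, `path_to_root`, and `path_mask` are all assumptions made for illustration, and the paper's actual sequence format and masking scheme may differ. The sketch builds the sub-hierarchy induced by a document's labels, serializes it into a token sequence, and computes a simple root-to-node path mask as a stand-in for hierarchy-aware masking.

```python
from typing import Dict, List, Set

# Hypothetical toy taxonomy (child -> parent); "root" is the top node.
PARENT: Dict[str, str] = {
    "sports": "root",
    "politics": "root",
    "soccer": "sports",
    "tennis": "sports",
    "elections": "politics",
}


def path_to_root(node: str, parent: Dict[str, str]) -> List[str]:
    """Return the node followed by its ancestors, ending at "root"."""
    path = [node]
    while node != "root":
        node = parent[node]
        path.append(node)
    return path


def sub_hierarchy(labels: Set[str], parent: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each parent to its sorted children, restricted to the
    target labels and all of their ancestors."""
    nodes: Set[str] = set()
    for label in labels:
        nodes.update(n for n in path_to_root(label, parent) if n != "root")
    children: Dict[str, List[str]] = {}
    for node in sorted(nodes):
        children.setdefault(parent[node], []).append(node)
    return children


def to_sequence(children: Dict[str, List[str]], node: str = "root") -> List[str]:
    """Serialize the sub-hierarchy depth-first; brackets (an assumed
    token convention) delimit each group of children."""
    tokens = [node]
    kids = children.get(node, [])
    if kids:
        tokens.append("(")
        for kid in kids:
            tokens.extend(to_sequence(children, kid))
        tokens.append(")")
    return tokens


def path_mask(nodes: List[str], parent: Dict[str, str]) -> List[List[bool]]:
    """mask[i][j] is True when nodes[j] lies on the path from nodes[i]
    up to the root (including nodes[i] itself)."""
    return [[other in path_to_root(node, parent) for other in nodes] for node in nodes]


# A document labeled with two leaves from different branches.
tree = sub_hierarchy({"soccer", "elections"}, PARENT)
sequence = to_sequence(tree)
label_nodes = [tok for tok in sequence if tok not in ("(", ")")]
mask = path_mask(label_nodes, PARENT)
print(" ".join(sequence))
# root ( politics ( elections ) sports ( soccer ) )
```

In this toy setup the size of the model output depends only on the labels attached to a document, not on the size of the full taxonomy: the unused label `tennis` never appears in the sequence. The mask lets `elections` see `root` and `politics` but not `sports` or `soccer`, which is one simple way to restrict attention to a single root-to-leaf path.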