Multi-Encoder Transformer for Korean Abstractive Text Summarization
Published in: IEEE Access, Vol. 11, p. 1
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01-01-2023
Summary: In this paper, we propose a Korean abstractive text summarization approach that uses a multi-encoder transformer. Recently, in many natural language processing (NLP) tasks, the use of pre-trained language models (PLMs) for transfer learning has achieved remarkable performance. In particular, transformer-based models such as Bidirectional Encoder Representations from Transformers (BERT) are pre-trained and then applied to downstream tasks, showing state-of-the-art performance on tasks including abstractive text summarization. However, existing text summarization models typically use a single pre-trained model per model architecture, so only one PLM can be chosen at a time. For Korean abstractive text summarization, several publicly available BERT-based pre-trained Korean models offer different advantages, such as Multilingual BERT, KoBERT, HanBERT, and KorBERT. We assume that if these PLMs could be leveraged simultaneously, better performance would be obtained. We propose a model that uses multiple encoders, capable of leveraging multiple pre-trained models, to create an abstractive summary. We evaluate our method on three benchmark Korean abstractive summarization datasets: Law (AI-Hub), News (AI-Hub), and News (NIKL). Experimental results show that the proposed multi-encoder model variations outperform single-encoder models. We identify the empirically best summarization model by determining the optimal input combination when leveraging multiple PLMs with the multi-encoder method.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3277754
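
The summary above describes a decoder that draws on several encoders at once, each standing in for a different pre-trained Korean BERT variant. The sketch below is a minimal illustration of that idea, not the paper's exact architecture: it assumes the encoders' outputs are concatenated along the sequence dimension before decoder cross-attention, uses generic `nn.TransformerEncoder` modules in place of real PLMs, and the class name, dimensions, and fusion strategy are all illustrative assumptions.

```python
# Minimal multi-encoder summarizer sketch (assumed fusion: concatenate
# encoder outputs along the sequence axis for decoder cross-attention).
import torch
import torch.nn as nn


class MultiEncoderSummarizer(nn.Module):
    def __init__(self, vocab_size=32000, d_model=256, n_encoders=3,
                 n_layers=2, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Stand-ins for distinct pre-trained encoders (e.g., Multilingual
        # BERT, KoBERT, KorBERT); in practice each would be a separately
        # loaded PLM with its own tokenizer.
        self.encoders = nn.ModuleList([
            nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
                num_layers=n_layers)
            for _ in range(n_encoders)
        ])
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True),
            num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # Each encoder reads the same source document here; real PLMs would
        # tokenize it independently, so their inputs could differ.
        src = self.embed(src_ids)
        memories = [enc(src) for enc in self.encoders]
        # Fuse by concatenating along the sequence dimension so the decoder
        # cross-attends over every encoder's representation.
        memory = torch.cat(memories, dim=1)
        tgt = self.embed(tgt_ids)
        t = tgt_ids.size(1)
        causal_mask = torch.triu(torch.ones(t, t, dtype=torch.bool), diagonal=1)
        out = self.decoder(tgt, memory, tgt_mask=causal_mask)
        return self.lm_head(out)


if __name__ == "__main__":
    model = MultiEncoderSummarizer()
    src = torch.randint(0, 32000, (2, 64))   # batch of source documents
    tgt = torch.randint(0, 32000, (2, 16))   # shifted summary tokens
    logits = model(src, tgt)                 # shape: (2, 16, 32000)
    print(logits.shape)
```

Sequence-level concatenation is only one possible fusion choice; the paper's reported variations explore different input combinations of the PLMs, which this sketch does not reproduce.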