Pre-Training-Based Grammatical Error Correction Model for the Written Language of Chinese Hearing Impaired Students

Bibliographic Details
Published in:	IEEE Access, Vol. 10, pp. 35061-35072
Main Authors:	Chen, Binbin; Zhang, Jingyu
Format:	Journal Article
Language:	English
Published:	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2022
Subjects:	Ablation; Coders; Decoding; encoder-decoder; Encoders-Decoders; Error analysis; Error correction; Error correction & detection; grammatical error correction; Hearing; Hearing impaired student; Hearing loss; Machine translation; pre-training; self-attention; Semantics; Sentences; Students; Task analysis; Training; Training data; Transformers; Words (language)

Description
Summary:	Grammatical error correction is an application closely tied to daily life and an important shared task in many prestigious competitions and workshops. Neural machine translation with an encoder-decoder architecture containing language models has been the fundamental solution for grammatical error correction. However, neither a grammatical error correction task on texts written by hearing impaired people nor a solution to it has been reported so far, and common grammatical error correction tasks face several challenges, such as insufficient training data and insufficient accuracy caused by a limited capacity to extract semantic and grammatical patterns. Under these circumstances, we propose a novel encoder-decoder architecture based on multi-head self-attention, combined with multiple strategies, which excels at extracting deep representations from the corrupted sentences of hearing impaired students and reconstructing them into grammatical sentences. With a re-ranking strategy, our model can correct various kinds of errors, including spelling errors and complex syntax errors. The ablation experiments show that semantic extraction by the self-attention mechanism, with position encoding excluded and a word order shuffle operation applied, can effectively learn the sentence patterns of hearing impaired students, whose word order differs considerably from that of hearing people, and can improve the correction scores. Pre-training enhances the efficiency of restoring sentence structure in the decoding process. Comparison experiments with baseline models show that our model achieves superior performance both on grammatical error correction for hearing impaired students and on a common grammatical error correction shared task.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2022.3159676
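
Note on the Summary: the claim about self-attention without position encoding rests on a general property of the mechanism that a small sketch can illustrate. The Python below is not the authors' model. It is a minimal single-head scaled dot-product self-attention with randomly chosen weights, and every name in it (`self_attention`, `random_matrix`, `perm`, the sizes) is an assumption made for illustration. It checks that, when no position encoding is added, shuffling the input tokens only shuffles the output rows in the same way, which is one plausible reading of why a word order shuffle operation pairs naturally with this setting.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention with no position encoding."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v)


def random_matrix(rng, rows, cols):
    """Hypothetical helper: a rows x cols matrix of uniform random values."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_tokens, dim = 5, 4                      # illustrative sizes, not from the paper
x = random_matrix(rng, n_tokens, dim)     # stand-in token embeddings
wq = random_matrix(rng, dim, dim)
wk = random_matrix(rng, dim, dim)
wv = random_matrix(rng, dim, dim)

# Shuffle the word order: position i of the shuffled input holds token perm[i].
perm = list(range(n_tokens))
rng.shuffle(perm)
x_shuffled = [x[i] for i in perm]

out = self_attention(x, wq, wk, wv)
out_shuffled = self_attention(x_shuffled, wq, wk, wv)

# Each shuffled output row should match the original row of the same token.
max_diff = max(
    abs(a - b)
    for i, source in enumerate(perm)
    for a, b in zip(out_shuffled[i], out[source])
)
print(max_diff < 1e-9)
```

Any remaining difference comes only from floating-point rounding. Adding a position encoding to `x` before the attention step would break this property, because the same token would then receive a different vector at each position.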