CoMod: An Abstractive Approach to Discourse Context Identification

Bibliographic Details
Published in: IEEE Access, Vol. 11, pp. 82744-82770
Main Authors: Guetari, Ramzi; Kraiem, Naoufel
Format: Standard
Language: English
Published: IEEE 2023
Description
Summary: Generative text summarization can condense large volumes of information into a concise summary, helping users grasp the main points of a text without reading the entire document. Machine learning (ML) plays a pivotal role in this domain, offering significant advantages in information processing and comprehension. In this paper, we present CoMod, an abstractive method for generating the context of a document from its content and from that of any documents it references. CoMod analyzes the patterns and relationships within a document's content in order to extract and infer the underlying context. The context generation process combines a word linearization step with a Markov model, specifically a bigram model, that predicts the likelihood of word sequences. The Markov model is trained on a corpus of text and used to generate coherent sentences based on the probabilities of transitioning from one word to another. Markov tables allow the generated context to be adapted to a specific domain, and CoMod can build them on the fly. In a comparison with other methods on the same datasets, CoMod outperformed the other approaches tested. This result supports the potential of generative methods for automatic text summarization and their ability to leverage machine learning for context generation, with benefits for information management, productivity, scalability, and knowledge discovery in various domains.
DOI: 10.1109/ACCESS.2023.3302179
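
The summary describes a bigram Markov model that is trained on a corpus and then used to score and generate word sequences. As a rough illustration of that general technique only, the sketch below builds a bigram transition table from a toy corpus, computes transition and sequence probabilities, and samples a sentence. It is not CoMod's implementation: the function names, the `<s>` and `</s>` boundary markers, the whitespace tokenization, and the toy corpus are all assumptions made for this example, and the word linearization step is not shown.

```python
import random
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary markers


def build_bigram_table(sentences):
    """Count word-to-word transitions over whitespace-tokenized sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def transition_prob(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev is unseen."""
    followers = counts.get(prev)
    if not followers:
        return 0.0
    return followers[nxt] / sum(followers.values())


def sequence_prob(counts, sentence):
    """Probability of a whole sentence as a product of bigram transitions."""
    tokens = [START] + sentence.lower().split() + [END]
    prob = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        prob *= transition_prob(counts, prev, nxt)
    return prob


def generate(counts, rng, max_len=20):
    """Sample words one at a time, each weighted by its transition count."""
    word, output = START, []
    for _ in range(max_len):
        followers = counts.get(word)
        if not followers:
            break
        candidates = list(followers)
        weights = [followers[w] for w in candidates]
        word = rng.choices(candidates, weights=weights, k=1)[0]
        if word == END:
            break
        output.append(word)
    return " ".join(output)


# Toy corpus, invented for illustration.
corpus = [
    "the model predicts the next word",
    "the model generates the next sentence",
    "the model predicts the context",
]
table = build_bigram_table(corpus)

print(transition_prob(table, "the", "model"))  # 0.5 (3 of 6 transitions from "the")
print(sequence_prob(table, "the model predicts the context"))
print(generate(table, random.Random(0)))
```

Because the table is just a set of counts, rebuilding it from a different corpus is cheap, which is one plausible reading of how a Markov table could be swapped in to adapt output to a specific domain. This sketch uses unsmoothed estimates, so any sentence containing an unseen bigram gets probability zero.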