Using minimal generators for composite isolated point extraction and conceptual binary relation coverage: Application for extracting relevant textual features

Bibliographic Details
Published in: Information Sciences, Vol. 336, pp. 129-144
Main Authors: Elloumi, S., Ferjani, F., Jaoua, A.
Format: Journal Article
Language: English
Published: Elsevier Inc 01-04-2016
Description
Summary:
• We present a new approach called “MinGenCoverage” for conceptual binary relation covering.
• We start by locating the isolated points and extract their corresponding mandatory concepts.
• When properties are composed, we consider only the minimal generators as candidates for isolated point extraction.
• We applied our approach to textual data and treated the concept labels associated with isolated points as the selected textual features.

In recent years, several mathematical concepts have been successfully explored in computer science as a basis for original solutions to complex problems in knowledge engineering, data mining, and information retrieval. In particular, relational algebra (RA) and formal concept analysis (FCA) can serve as mathematical foundations that unify data and knowledge in information retrieval systems. For example, certain elements of a fringe relation (a notion from RA), called isolated points, have been successfully used in FCA as formal concept labels or composite labels. Once associated with words in a textual document, these labels constitute relevant features of the text. This paper proposes the MinGenCoverage algorithm, which covers a formal context (a formal representation of a text) based on isolated labels and uses these labels (or text features) for categorization, corpus structuring, and micro–macro browsing as an advanced information retrieval functionality. The approach relies on the close connection between isolated points and minimal generators (MGs). MGs stand at the antipodes of the closures within their respective equivalence classes. Because minimal generators are the smallest elements of an equivalence class, their detection and traversal are greatly eased and the coverage can be built swiftly. Extensive experiments provide empirical evidence for the performance of the proposed approach.
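The relationship between closures and minimal generators mentioned in the summary can be illustrated with a small sketch. This is not the MinGenCoverage algorithm from the article; it is only the standard FCA definition applied by brute force to a toy formal context. The context `CONTEXT`, the document and word names, and the helper functions `extent`, `closure`, and `minimal_generators` are all assumptions introduced for illustration.

```python
from itertools import combinations

# Hypothetical toy formal context: documents (objects) -> words (attributes).
CONTEXT = {
    "d1": {"a", "b", "c"},
    "d2": {"a", "b"},
    "d3": {"b", "c"},
    "d4": {"c"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def extent(attrs):
    """Objects that own every attribute in attrs."""
    return frozenset(o for o, owned in CONTEXT.items() if attrs <= owned)


def closure(attrs):
    """Attributes shared by all objects in the extent of attrs."""
    objs = extent(attrs)
    if not objs:
        # No object owns attrs: by convention the closure is the full attribute set.
        return ATTRIBUTES
    return frozenset.intersection(*(frozenset(CONTEXT[o]) for o in objs))


def minimal_generators():
    """Map each closed attribute set to its minimal generators (brute force)."""
    result = {}
    for size in range(len(ATTRIBUTES) + 1):
        for combo in combinations(sorted(ATTRIBUTES), size):
            g = frozenset(combo)
            closed = closure(g)
            # g is minimal if dropping any single attribute changes the closure;
            # by monotonicity of closure this covers all proper subsets.
            if all(closure(g - {a}) != closed for a in g):
                result.setdefault(closed, []).append(g)
    return result


if __name__ == "__main__":
    for closed, gens in minimal_generators().items():
        print(sorted(closed), "<-", [sorted(g) for g in gens])
```

In this toy context, the attribute sets that share a closure form one equivalence class: the closure is its largest element and the minimal generators are its smallest ones. The exhaustive enumeration is exponential in the number of attributes and is only meant to make the definitions concrete.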
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2015.12.013