Hybridized term-weighting method for Dark Web classification
Published in: | Neurocomputing (Amsterdam) Vol. 173; pp. 1908 - 1926 |
---|---|
Main Authors: | , , , , |
Format: | Journal Article |
Language: | English |
Published: | Elsevier B.V. 15-01-2016 |
Subjects: | |
Summary: | Intelligence and security informatics based on statistical computation is playing an increasingly significant role in proactively detecting terrorist activities, as extremist groups misuse many of the facilities available on the Internet to incite violence and hatred. However, the performance of statistical methods is limited by inadequate accuracy, which stems from their inability to comprehend human-written text. In this paper, we propose a hybridized feature selection method based on basic term-weighting techniques for accurate detection of terrorist activities in textual contexts. The proposed method combines the feature sets selected by different individual feature selection methods into one feature space for effective web page classification. UNION and Symmetric Difference combination functions are proposed for dimensionality reduction of the combined feature space. The method is tested on a selected dataset from the Dark Web Forum Portal and benchmarked using several well-known text classifiers. Experimental results show that the hybridized method efficiently identifies terrorist activity content and outperforms the individual methods. Furthermore, the results reveal that the classification performance achieved by hybridizing a few feature sets is relatively competitive, in terms of the number of features used for classification, with higher hybridization levels. Moreover, the experiments on the hybridizing functions show that applying the Symmetric Difference function to combine feature sets significantly reduces their dimensionality.
• A hybrid text classification method with term-weighting techniques is proposed.
• The proposed method combines various feature sets for effective classification.
• We propose a method to reduce the dimension of feature sets for classification.
• The method is tested on a selected dataset from the Dark Web Forum Portal.
• Experimental results show that the proposed method outperforms other methods. |
---|---|
ISSN: | 0925-2312 1872-8286 |
DOI: | 10.1016/j.neucom.2015.09.063 |
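
The Summary describes combining feature sets chosen by different term-weighting methods using UNION and Symmetric Difference functions. The record gives no implementation details, so the following is only a minimal sketch of those two set operations. The function names, the top-k selection step, and the term scores are all hypothetical and are not taken from the paper.

```python
from functools import reduce


def top_k_terms(scores, k):
    """Return the k highest-scoring terms as a set (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return {term for term, _ in ranked[:k]}


def combine_union(feature_sets):
    """Keep every term selected by at least one method."""
    return set().union(*feature_sets)


def combine_symmetric_difference(feature_sets):
    """Fold the sets with pairwise symmetric difference.

    For two sets this keeps terms chosen by exactly one method. For more than
    two sets, folding keeps terms chosen by an odd number of methods; this is
    an assumption, and the paper may define the multi-set case differently.
    """
    return reduce(lambda left, right: left ^ right, feature_sets, set())


# Hypothetical term scores from two weighting schemes (illustrative values only).
scores_a = {"attack": 9, "forum": 7, "weapon": 5, "post": 2}
scores_b = {"weapon": 3.1, "recruit": 2.8, "attack": 2.5, "post": 0.4}

features_a = top_k_terms(scores_a, 3)
features_b = top_k_terms(scores_b, 3)

union_features = combine_union([features_a, features_b])
symdiff_features = combine_symmetric_difference([features_a, features_b])

print(sorted(union_features))
print(sorted(symdiff_features))
```

In this toy case the symmetric difference drops the terms both methods agree on, so it yields a smaller feature set than the union. That is in line with the dimensionality reduction the Summary reports, though the effect on classification accuracy is not something this sketch can show.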