Building an Effective Classifier for Phishing Web Pages Detection: A Quantum-Inspired Biomimetic Paradigm Suitable for Big Data Analytics of Cyber Attacks

To combat malicious domains, which serve as a key platform for a wide range of attacks, domain name service (DNS) data provide rich traces of Internet activities and are a powerful resource. This paper presents new research that proposes a model for finding malicious domains by passively analyzing D...

Full description

Saved in:
Bibliographic Details
Published in:Biomimetics (Basel, Switzerland) Vol. 8; no. 2; p. 197
Main Authors: Darwish, Saad M, Farhan, Dheyauldeen A, Elzoghabi, Adel A
Format: Journal Article
Language:English
Published: Switzerland MDPI AG 09-05-2023
MDPI
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:To combat malicious domains, which serve as a key platform for a wide range of attacks, domain name service (DNS) data provide rich traces of Internet activities and are a powerful resource. This paper presents new research that proposes a model for finding malicious domains by passively analyzing DNS data. The proposed model builds a real-time, accurate, middleweight, and fast classifier by combining a genetic algorithm for selecting DNS data features with a two-step quantum ant colony optimization (QABC) algorithm for classification. The modified two-step QABC classifier uses -means instead of random initialization to place food sources. In order to overcome ABCs poor exploitation abilities and its convergence speed, this paper utilizes the metaheuristic QABC algorithm for global optimization problems inspired by quantum physics concepts. The use of the Hadoop framework and a hybrid machine learning approach (K-mean and QABC) to deal with the large size of uniform resource locator (URL) data is one of the main contributions of this paper. The major point is that blacklists, heavyweight classifiers (those that use more features), and lightweight classifiers (those that use fewer features and consume the features from the browser) may all be improved with the use of the suggested machine learning method. The results showed that the suggested model could work with more than 96.6% accuracy for more than 10 million query-answer pairs.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:2313-7673
2313-7673
DOI:10.3390/biomimetics8020197