Malware detection based on string length histogram using machine learning

Bibliographic Details
Published in: 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), pp. 1836-1841
Main Authors: Sawaisarje, Snehalkumar K.; Pachghare, Vinod K.; Kshirsagar, Deepak D.
Format: Conference Proceeding
Language: English
Published: IEEE 01-05-2018
Description
Summary: In the past decade, the creation of new malicious programs has risen rapidly [1], with thousands of malware samples released each day. The challenge is to analyze this volume in a timely, efficient manner and to detect malicious programs accurately. Because the nature of malware keeps changing, any solution must be flexible enough to adapt easily to these changes. The paper proposes a malware detection approach based on the frequency distribution of the lengths of printable strings, aimed at efficient detection of malware such as worms and backdoors. The method is fast, simple, and easily adaptable to new types of malware. It consists of four stages: static scanning of binary files, feature extraction, machine learning classification, and malware detection. The implemented system was tested on the VX Heavens dataset, where the decision tree classifier achieved the highest accuracy, 89%, compared with 79.14% for KNN.
DOI: 10.1109/RTEICT42901.2018.9012242
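
The feature described in the summary, a histogram of printable-string lengths taken from a binary file, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the printable ASCII range, the minimum string length of 4, and the maximum bin of 32 are all assumptions chosen for illustration, since the record does not state the paper's actual parameters.

```python
import re
from collections import Counter


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII bytes at least min_len long.

    The range 0x20-0x7e and the default min_len are assumptions,
    similar in spirit to the Unix `strings` utility.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(data)


def length_histogram(data: bytes, min_len: int = 4, max_len: int = 32) -> list:
    """Count strings per length, one bin per length in [min_len, max_len].

    Strings longer than max_len are counted in the last bin, so every
    file yields a feature vector of the same size.
    """
    counts = Counter(
        min(len(s), max_len) for s in printable_strings(data, min_len)
    )
    return [counts.get(n, 0) for n in range(min_len, max_len + 1)]


if __name__ == "__main__":
    # Hypothetical byte sequence standing in for a scanned binary.
    sample = b"\x00\x01hello\x00abc\x00world!\x00\xffABCD\x00"
    print(printable_strings(sample))            # strings of length >= 4
    print(length_histogram(sample, max_len=8))  # bins for lengths 4..8
```

A fixed-length vector of this kind could then be passed to a classifier such as a decision tree or KNN, the two models the summary compares; that step is omitted here to keep the sketch free of third-party dependencies.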