DroidEnsemble: Detecting Android Malicious Applications With Ensemble of String and Structural Static Features


Bibliographic Details
Published in: IEEE Access, Vol. 6, pp. 31798-31807
Main Authors: Wang, Wei; Gao, Zhenzhen; Zhao, Meichen; Li, Yidong; Liu, Jiqiang; Zhang, Xiangliang
Format: Journal Article
Language: English
Published: Piscataway IEEE 01-01-2018
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Description
Summary: The Android platform has come to dominate mobile operating systems. However, the dramatic increase in Android malicious applications (malapps) has caused serious software failures in the Android system and poses a great threat to users. The effective detection of Android malapps has thus become an emerging yet crucial issue. Characterizing the behaviors of Android applications (apps) is essential to detecting malapps. Most existing work on detecting Android malapps is based on string static features, such as permissions and API usage extracted from apps. There is also work on detecting Android malapps with structural features, such as control flow graphs and data flow graphs. As Android malapps have become increasingly polymorphic and sophisticated, using only one type of static feature may result in false negatives. In this paper, we propose DroidEnsemble, which takes advantage of both string features and structural features to systematically and comprehensively characterize the static behaviors of Android apps and thus build a more accurate model for detecting Android malapps. We extract each app's string features, including permissions, hardware features, filter intents, restricted API calls, used permissions, and code patterns, as well as structural features such as the function call graph. We then use three machine learning algorithms, namely support vector machine, k-nearest neighbor, and random forest, to evaluate the performance of these two types of features and of their ensemble. In the experiments, we evaluate our methods and models with 1386 benign apps and 1296 malapps. Extensive experimental results demonstrate the effectiveness of DroidEnsemble. It achieves a detection accuracy of 95.8% with only string features and 90.68% with only structural features. DroidEnsemble reaches a detection accuracy of 98.4% with the ensemble of both types of features, with 9 fewer false positives and 12 fewer false negatives than with string features alone.
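The idea of combining string and structural features can be illustrated with a small sketch. The summary does not say how the two feature types are fused, so this example assumes the simplest option: concatenating two binary feature vectors per app and classifying with k-nearest neighbor, one of the three algorithms named above. All apps, feature names, call-graph edges, and helper functions here are hypothetical toy values, not data or code from the paper.

```python
from collections import Counter


def vectorize(feature_sets):
    """Turn each app's feature set into a binary vector over a shared vocabulary."""
    vocab = sorted(set().union(*feature_sets))
    return [[1 if f in s else 0 for f in vocab] for s in feature_sets]


def ensemble_vectors(string_sets, struct_sets):
    """Assumed fusion step: concatenate the string and structural vectors."""
    string_vecs = vectorize(string_sets)
    struct_vecs = vectorize(struct_sets)
    return [s + g for s, g in zip(string_vecs, struct_vecs)]


def hamming(a, b):
    """Count the positions where two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training vectors closest to x."""
    nearest = sorted(range(len(train_X)), key=lambda i: hamming(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical string features (permissions, API calls) for four labeled apps.
string_feats = [
    {"perm:SEND_SMS", "api:sendTextMessage"},
    {"perm:SEND_SMS", "perm:READ_CONTACTS"},
    {"perm:INTERNET"},
    {"perm:INTERNET", "perm:CAMERA"},
]
# Hypothetical structural features: caller -> callee edges of a function call graph.
struct_feats = [
    {("onCreate", "sendSms")},
    {("onReceive", "sendSms")},
    {("onCreate", "loadUrl")},
    {("onCreate", "openCamera")},
]
labels = ["malapp", "malapp", "benign", "benign"]

# An unlabeled app to classify.
unknown_string = {"perm:SEND_SMS"}
unknown_struct = {("onCreate", "sendSms")}

# Vectorize labeled and unlabeled apps together so they share one vocabulary.
X = ensemble_vectors(string_feats + [unknown_string], struct_feats + [unknown_struct])
train_X, query = X[:-1], X[-1]

prediction = knn_predict(train_X, labels, query, k=3)
print(prediction)
```

In this toy setup the structural edge pulls the unknown app closer to the first labeled malapp than the string features alone would, which is the intuition behind combining the two views. A real pipeline would extract the features from APK files and use a proper train/test split.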
ISSN:2169-3536
DOI:10.1109/ACCESS.2018.2835654