Automatic protocol reverse engineering for industrial control systems with dynamic taint analysis

Bibliographic Details
Published in:Frontiers of information technology & electronic engineering Vol. 23; no. 3; pp. 351 - 360
Main Authors: Ma, Rongkuan, Zheng, Hao, Wang, Jingyi, Wang, Mufeng, Wei, Qiang, Wang, Qingxian
Format: Journal Article
Language:English
Published: Hangzhou Zhejiang University Press 01-03-2022
Springer Nature B.V
State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China; Zhejiang University NGICS Platform, Hangzhou 310000, China
Description
Summary:Proprietary (or semi-proprietary) protocols are widely adopted in industrial control systems (ICSs). Inferring protocol formats by reverse engineering is important for many network security applications, e.g., program testing and intrusion detection. Conventional protocol reverse engineering methods are considered time-consuming, tedious, and error-prone. Recently, automatic protocol reverse engineering methods have been proposed which are, however, neither effective in handling binary-based ICS protocols through network traffic analysis nor accurate in extracting protocol fields from protocol implementations. In this paper, we present a framework called the industrial control system protocol reverse engineering framework (ICSPRF) that aims to extract ICS protocol fields with high accuracy. ICSPRF is based on the key insight that an individual field in a message is typically handled in the same execution context, e.g., a basic block (BBL) group. As a result, by monitoring program execution, we can collect the tainted data information processed in every BBL group in the execution trace and cluster it to derive the protocol format. We evaluate our approach with six open-source ICS protocol implementations. The results show that ICSPRF can identify individual protocol fields with high accuracy (on average a 94.3% match ratio). ICSPRF also has low coarse-grained and overly fine-grained match ratios. For the same metric, ICSPRF is more accurate than AutoFormat (88.5% for all evaluated protocols and 80.0% for binary-based protocols).
ISSN:2095-9184
2095-9230
DOI:10.1631/FITEE.2000709
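
The key insight in the summary (bytes of one field tend to be processed in the same BBL group) can be illustrated with a minimal Python sketch. This is not ICSPRF's actual algorithm: the trace format, the function name `infer_fields`, and the group labels are all hypothetical, and a simple merge of adjacent bytes with identical BBL-group sets stands in for the clustering step the paper describes.

```python
from collections import defaultdict


def infer_fields(trace, msg_len):
    """Guess field boundaries from a (hypothetical) taint trace.

    trace   -- iterable of (bbl_group_id, tainted_offsets) records, where
               tainted_offsets are message byte offsets touched in that group
    msg_len -- length of the input message in bytes
    Returns a list of inclusive (start, end) byte ranges, one per inferred field.
    """
    # For every message byte, remember which BBL groups processed it.
    groups_by_offset = defaultdict(set)
    for group_id, offsets in trace:
        for off in offsets:
            groups_by_offset[off].add(group_id)

    # Assumption: adjacent bytes handled by exactly the same set of
    # BBL groups belong to the same field.
    fields = []
    start = 0
    for off in range(1, msg_len + 1):
        if off == msg_len or groups_by_offset[off] != groups_by_offset[off - 1]:
            fields.append((start, off - 1))
            start = off
    return fields


# Hypothetical 8-byte message: three 2-byte fields followed by two 1-byte fields.
example_trace = [
    ("g1", [0, 1]),
    ("g2", [2, 3]),
    ("g3", [4, 5]),
    ("g4", [6]),
    ("g5", [7]),
]
print(infer_fields(example_trace, 8))
```

In a real setting the trace would come from a dynamic taint analysis tool that instruments the protocol implementation; here it is written by hand purely to show how per-group taint information can be turned into candidate field boundaries.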