A graph representation framework for encrypted network traffic classification

Network Traffic Classification (NTC) is crucial for ensuring internet security, but encryption presents significant challenges to this task. While Machine Learning (ML) and Deep Learning (DL) methods have shown promise, issues such as limited representativeness leading to sub-optimal generalizations...

Full description

Saved in:
Bibliographic Details
Published in:Computers & security Vol. 148; p. 104134
Main Authors: Okonkwo, Zulu, Foo, Ernest, Hou, Zhe, Li, Qinyi, Jadidi, Zahra
Format: Journal Article
Language:English
Published: Elsevier Ltd 01-01-2025
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Network Traffic Classification (NTC) is crucial for ensuring internet security, but encryption presents significant challenges to this task. While Machine Learning (ML) and Deep Learning (DL) methods have shown promise, issues such as limited representativeness leading to sub-optimal generalizations and performance remain prevalent. These problems become more pronounced with advanced obfuscation, network security, and privacy technologies, indicating a need for improved model robustness. To address these issues, we focus on feature extraction and representation in NTC by leveraging the expressive power of graphs to represent network traffic at various granularity levels. By modeling network traffic as interconnected graphs, we can analyze both flow-level and packet-level data. Our graph representation method for encrypted NTC effectively preserves crucial information despite encryption and obfuscation. We enhance the robustness of our approach by using cosine similarity to exploit correlations between encrypted network flows and packets, defining relationships between abstract entities. This graph structure enables the creation of structural embeddings that accurately define network traffic across different encryption levels. Our end-to-end process demonstrates significant improvements where traditional NTC methods struggle, such as in Tor classification, which employs anonymization to further obfuscate traffic. Our packet-level classification approach consistently outperforms existing methods, achieving accuracies exceeding 96%.
ISSN:0167-4048
DOI:10.1016/j.cose.2024.104134