PCA-based network-wide correlated anomaly event detection and diagnosis

Bibliographic Details
Published in: 2015 11th International Conference on the Design of Reliable Communication Networks (DRCN), pp. 149-156
Main Authors: Zhang, Yuanxun; Calyam, Prasad; Debroy, Saptarshi; Sridharan, Mukundan
Format: Conference Proceeding
Language: English
Published: IEEE 01-03-2015
Description
Summary: High-performance computing environments that support large-scale distributed computing applications rely on multi-domain network performance measurements from open frameworks such as perfSONAR. Operators need to be notified quickly and accurately of network-wide correlated anomaly events that can impact data throughput, so that the computing environment keeps operating smoothly. Because network topology is not always available alongside the measurement data, identifying and locating such events is challenging. In this paper, we present a novel PCA-based correlated anomaly event detection scheme that fuses multiple measurement time series and transforms them using principal component analysis. Using actual perfSONAR one-way delay measurement datasets, we demonstrate that our scheme can: (a) effectively distinguish between correlated and uncorrelated anomalies, (b) leverage a source-side vantage point to diagnose whether a correlated anomaly event is located in the local domain or in an external domain, and (c) act as a "black-box" correlation analysis tool that provides key insights for eventual root-cause identification.
DOI: 10.1109/DRCN.2015.7149006
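
The summary describes fusing multiple measurement time series and transforming them with principal component analysis. The following is a minimal, generic sketch of that idea and not the authors' scheme: it stacks per-path one-way delay series into a matrix, standardizes each column, and uses the first principal component as an indicator of behavior shared across paths. The function names (pca_scores, correlated_anomaly_steps), the median-absolute-deviation threshold with k=5, and the synthetic delay data are all assumptions introduced for illustration; none of them come from the paper.

```python
import numpy as np

def pca_scores(series):
    """PCA via SVD on fused series (rows = time steps, columns = paths).

    Returns the per-step component scores and the explained variance ratios.
    """
    X = np.asarray(series, dtype=float)
    X = X - X.mean(axis=0)            # center each path
    std = X.std(axis=0)
    std[std == 0] = 1.0               # guard against constant paths
    X = X / std                       # put all paths on a common scale
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U * S, S**2 / np.sum(S**2)

def correlated_anomaly_steps(series, k=5.0):
    """Flag time steps whose first-component score is a robust outlier.

    The MAD-based rule and the default k are illustrative assumptions.
    """
    scores, _ = pca_scores(series)
    pc1 = scores[:, 0]
    med = np.median(pc1)
    mad = max(np.median(np.abs(pc1 - med)), 1e-12)
    return np.flatnonzero(np.abs(pc1 - med) > k * 1.4826 * mad)

# Synthetic stand-in for one-way delay measurements (ms) on 6 paths.
rng = np.random.default_rng(0)
delays = rng.normal(20.0, 1.0, size=(200, 6))
delays[100:105, :] += 10.0   # correlated event: every path shifts together
delays[50, 0] += 10.0        # uncorrelated event: a single path spikes

scores, var_ratio = pca_scores(delays)
flagged = correlated_anomaly_steps(delays)
```

In this synthetic setup, the shift shared by all paths at steps 100 to 104 should load heavily on the first component, while the single-path spike at step 50 should contribute far less to it. That contrast is the intuition behind separating correlated from uncorrelated anomalies; a real deployment would need measured data and a threshold chosen for it.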