Fast outlier detection using a GPU

Bibliographic Details
Published in: 2013 International Conference on High Performance Computing & Simulation (HPCS), pp. 143-150
Main Authors: Angiulli, Fabrizio; Basta, Stefano; Lodi, Stefano; Sartori, Claudio
Format: Conference Proceeding
Language: English
Published: IEEE 01-07-2013
Description
Summary: The availability of cost-effective data collection and storage hardware has allowed organizations to accumulate very large data sets, which are a potential source of previously unknown, valuable information. The process of discovering interesting patterns in such large data sets is referred to as data mining. Outlier detection is a data mining task that consists of discovering observations which deviate substantially from the rest of the data, and it has many important practical applications. Outlier detection in very large data sets is, however, computationally very demanding and currently requires high-performance computing facilities. We propose a family of parallel algorithms for Graphics Processing Units (GPUs), derived from two distance-based outlier detection algorithms: BruteForce and SolvingSet. We analyze their performance with an extensive set of experiments, comparing the GPU implementations with the base CPU versions and obtaining significant speedups.
ISBN: 9781479908363, 1479908363
DOI: 10.1109/HPCSim.2013.6641405