Discovering Anomalies on Mixed-Type Data Using a Generalized Student-t Based Approach
Published in: | IEEE transactions on knowledge and data engineering Vol. 28; no. 10; pp. 2582 - 2595 |
---|---|
Main Authors: | , , , |
Format: | Journal Article |
Language: | English |
Published: | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01-10-2016 |
Summary: | Anomaly detection in mixed-type data is an important problem that has not been well addressed in the machine learning field. Existing approaches focus on computational efficiency, and their correlation modeling between mixed-type attributes is heuristically driven, lacking a statistical foundation. In this paper, we propose MIxed-Type Robust dEtection (MITRE), a robust error buffering approach for anomaly detection in mixed-type datasets. Because of its non-Gaussian design, the problem is analytically intractable. Two novel Bayesian inference approaches are utilized to solve the intractable inferences: Integrated-nested Laplace Approximation (INLA), and Expectation Propagation (EP) with Variational Expectation-Maximization (EM). A set of algorithmic optimizations is implemented to improve the computational efficiency. A comprehensive suite of experiments was conducted on both synthetic and real-world data to test the effectiveness and efficiency of MITRE. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2016.2583429 |