EncodKNN: Augmenting KNN with Autoencoder for Computational Cost Reduction

Bibliographic Details
Published in: 2024 Intelligent Methods, Systems, and Applications (IMSA), pp. 641-646
Main Authors: El-Feky, Shereen Fathy; Mohamed, Ali Khater; Ammar, Ammar Mohammed
Format: Conference Proceeding
Language: English
Published: IEEE, 13-07-2024
Description
Summary: The K Nearest Neighbor (kNN) algorithm is an easily understood classification method commonly used in statistical data mining. Typically, kNN classifies a data point by examining its nearest instances, relying on a distance metric for comparison. However, its efficacy diminishes notably in high-dimensional spaces as the number of input features increases; consequently, the computational cost of the algorithm becomes prohibitively high, posing a significant challenge in practical applications. To tackle this challenge, this paper proposes an unsupervised learning approach that integrates a deep autoencoder for dimensionality reduction. The method embeds the training data into a lower-dimensional latent feature space, reducing computational complexity while retaining the information essential for accurate classification. Furthermore, the paper proposes a differential evolution optimization technique to determine the best embedding dimension for the autoencoder's latent space. Experimental findings across diverse datasets demonstrate that the approach significantly reduces computational overhead while maintaining performance comparable to standard kNN. Additionally, the optimization method reduces the feature dimensionality by 69.2% to 84.2%.
DOI: 10.1109/IMSA61967.2024.10652805
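
A minimal sketch of the pipeline the summary describes: a dense autoencoder compresses the features into a latent space, kNN classifies in that space, and differential evolution searches for the latent dimension. The network sizes, training settings, dataset (scikit-learn's digits), neighbor count, and search bounds below are illustrative assumptions, not the paper's actual configuration.

import numpy as np
import torch
import torch.nn as nn
from scipy.optimize import differential_evolution
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def train_autoencoder(X, latent_dim, epochs=30):
    """Train a small dense autoencoder (full-batch) and return its encoder."""
    n_features = X.shape[1]
    encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                            nn.Linear(64, latent_dim))
    decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                            nn.Linear(64, n_features))
    model = nn.Sequential(encoder, decoder)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    Xt = torch.tensor(X, dtype=torch.float32)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(Xt), Xt)  # reconstruction error
        loss.backward()
        opt.step()
    return encoder


def knn_accuracy_at_dim(latent_dim, X_tr, y_tr, X_te, y_te):
    """Embed both splits into the latent space, then score kNN there."""
    encoder = train_autoencoder(X_tr, latent_dim)
    with torch.no_grad():
        Z_tr = encoder(torch.tensor(X_tr, dtype=torch.float32)).numpy()
        Z_te = encoder(torch.tensor(X_te, dtype=torch.float32)).numpy()
    knn = KNeighborsClassifier(n_neighbors=5).fit(Z_tr, y_tr)
    return knn.score(Z_te, y_te)


X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel intensities to [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Differential evolution over the latent dimension. SciPy's DE works on
# continuous variables and minimizes, so we round to an integer and
# negate the accuracy. Small maxiter/popsize keep the sketch cheap.
result = differential_evolution(
    lambda d: -knn_accuracy_at_dim(int(round(d[0])), X_tr, y_tr, X_te, y_te),
    bounds=[(2, 32)], maxiter=5, popsize=5, seed=0)
best_dim = int(round(result.x[0]))
print(f"best latent dimension: {best_dim}, kNN accuracy: {-result.fun:.3f}")

Rounding the continuous DE variable is one simple way to search a discrete dimension. The computational saving the abstract reports follows from kNN's distance computations scaling with the embedding width rather than with the original feature count.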