Parallel Implementation of Random Forest using MPI
Published in: 2023 4th International Conference on Data Analytics for Business and Industry (ICDABI), pp. 518-522
Main Authors: , , ,
Format: Conference Proceeding
Language: English
Published: IEEE, 25-10-2023
Summary: Computer vision is a rapidly growing field of machine learning, with use cases and implementations across many industries. Powerful classifiers can be trained to recognize visual patterns and to detect people and objects. However, because these models take images as input, they are often trained with very complex architectures, high epoch counts, and very large datasets. This often results in very long training times, especially on weaker machines. This study investigates the use of parallel processing to speed up machine learning model training. The Message Passing Interface (MPI) is used to parallelize the training of a Random Forest, achieving speedups of up to 5x and parallel efficiency of up to 1.1. The study concludes that when dealing with big, complex data and building complex models, parallel processing can significantly reduce training time with minimal loss in accuracy.
DOI: 10.1109/ICDABI60145.2023.10629534
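The record does not include the paper's code, but the usual tree-parallel scheme for Random Forest with MPI can be sketched: each of p ranks independently trains its share of the trees (Random Forest trees are embarrassingly parallel), and the root gathers them into one forest. Below is a hedged, self-contained Python sketch under assumed names (`trees_for_rank`, `parallel_forest`, etc. are illustrative, not from the paper): the MPI rank loop is simulated serially, a trivial bootstrap "stump" stands in for a real decision tree, and the reported metrics are defined as speedup S = T1/Tp and efficiency E = S/p, where E > 1 (as the abstract's 1.1 suggests) indicates superlinear scaling, often from cache or memory effects.

```python
# Hedged sketch (not the paper's code): tree-parallel Random Forest training.
# MPI ranks are simulated with a serial loop; a bootstrap-majority "stump"
# is a placeholder for a real decision tree. All names are illustrative.
import random
from collections import Counter

def trees_for_rank(n_trees, size, rank):
    """Number of trees assigned to `rank` out of `size` ranks (load-balanced)."""
    base, extra = divmod(n_trees, size)
    return base + (1 if rank < extra else 0)

def train_stump(data, rng):
    """Toy tree: majority label of a bootstrap sample of (features, label) pairs."""
    sample = [rng.choice(data) for _ in data]
    return Counter(label for _, label in sample).most_common(1)[0][0]

def parallel_forest(data, n_trees=8, size=4, seed=0):
    """Simulate `size` MPI ranks; the final extend stands in for comm.gather."""
    forest = []
    for rank in range(size):
        rng = random.Random(seed + rank)  # independent per-rank RNG stream
        forest.extend(train_stump(data, rng)
                      for _ in range(trees_for_rank(n_trees, size, rank)))
    return forest

def speedup(t_serial, t_parallel):
    """S = T1 / Tp."""
    return t_serial / t_parallel

def efficiency(s, p):
    """E = S / p; E > 1 indicates superlinear scaling."""
    return s / p

if __name__ == "__main__":
    data = [(x, 0) for x in range(3)] + [(x, 1) for x in range(7)]
    print(len(parallel_forest(data)))  # 8 trees gathered from 4 simulated ranks
```

In a real mpi4py version the rank loop disappears: each process trains `trees_for_rank(n_trees, comm.Get_size(), comm.Get_rank())` trees on its own, and `comm.gather(local_trees, root=0)` assembles the forest at the root, which is presumably close to what the paper's implementation does.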