Investigating the impact of algorithms and hardware on machine learning models in HPC systems
Abstract
The development and effectiveness of machine learning (ML) applications rely on support from the underlying computing systems. This project investigates the impact of algorithmic techniques and high-performance computing (HPC) system components on ML performance. Following standardized image data preprocessing, the Synthetic Minority Over-sampling Technique (SMOTE) is employed for class balancing, and Recursive Feature Elimination with Cross-Validation (RFECV) is employed for optimal feature selection. An HPC cluster featuring hundreds of central processing unit (CPU) cores, multiple graphics processing unit (GPU) accelerators, several terabytes of random-access memory (RAM), and running the CentOS Linux distribution is used to investigate the training time and prediction accuracy of various ML models, including Support Vector Machine (SVM), Convolutional Neural Network (CNN), Random Forests (RF), and Extreme Gradient Boosting (XGBoost). Under the cluster's fair-share scheduling policy, this study uses up to four CPU cores, two GPU accelerators, and 150 gigabytes of RAM. Simulation results show that the CNN model outperforms the other models. Using the top 50% of balanced features reduces the CNN model's training time by up to 90.39% with a slight increase in accuracy. Allocating four CPU cores and two GPU accelerators, rather than relying on a single CPU core without GPU support, cuts the training time by up to 56.18% while maintaining comparable accuracy. This investigation of hardware support for ML models can be extended to study how resource allocation affects ML inference time.
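The class-balancing and feature-selection steps described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: a small synthetic dataset stands in for the image features, naive random over-sampling stands in for SMOTE (which instead synthesizes interpolated minority samples), and the `min_features_to_select` value mirrors the "top 50% of features" setting.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

rng = np.random.default_rng(0)

# Stand-in for preprocessed image features: 20 features, imbalanced classes.
X, y = make_classification(n_samples=300, n_features=20, n_informative=6,
                           weights=[0.8, 0.2], random_state=0)

# Class balancing: random over-sampling of the minority class.
# (SMOTE, as used in the study, would synthesize interpolated samples instead.)
minority = np.flatnonzero(y == 1)
extra = rng.choice(minority, size=(y == 0).sum() - minority.size, replace=True)
X_bal = np.vstack([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])

# RFECV: recursively eliminate features, choosing how many to keep by
# cross-validation; min_features_to_select=10 mirrors keeping the top 50%.
selector = RFECV(RandomForestClassifier(n_estimators=50, random_state=0),
                 step=1, cv=3, min_features_to_select=10)
selector.fit(X_bal, y_bal)
print(selector.n_features_)  # number of features retained (at least 10)
```

The reduced feature matrix `selector.transform(X_bal)` would then be fed to the downstream SVM, CNN, RF, or XGBoost models for the training-time and accuracy comparisons.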
ISSN
2377-6943

