Analysis of Fat Big Data Using Factor Models and Penalization Techniques: A Monte Carlo Simulation and Application
Published in: | Axioms Vol. 13; no. 7; p. 418 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | Basel: MDPI AG, 01-07-2024 |
Summary: | This article assesses the predictive accuracy of factor models based on Partial Least Squares (PLS) and Principal Component Analysis (PCA) in comparison to Autometrics and penalization techniques. The simulation exercise examines three types of scenarios by introducing multicollinearity, heteroscedasticity, and autocorrelation. The number of predictors and the sample size are varied to observe their effects. Model accuracy is evaluated using the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE). In the presence of severe multicollinearity, the PLS-based factor approach performs best among the competing methods. Autometrics achieves the lowest RMSE and MAE values across all levels of heteroscedasticity. Autometrics also provides better forecasts under low and moderate autocorrelation, whereas Elastic Smoothly Clipped Absolute Deviation (E-SCAD) forecasts well under severe autocorrelation. In addition to the simulation, we employ a popular Pakistani macroeconomic dataset for empirical research. The dataset contains 79 monthly variables from January 2013 to December 2020. The competing approaches perform differently than on the simulated datasets: the PLS factor approach outperforms its competitors in forecasting, with lower RMSE and MAE. This suggests that the actual dataset likely exhibits a high degree of multicollinearity. |
---|---|
ISSN: | 2075-1680 |
DOI: | 10.3390/axioms13070418 |