Incidence in the forecast by applying variable reduction. A practical example

Authors

DOI:

https://doi.org/10.36825/RITI.08.15.006

Keywords:

Variable Reduction, Factor Analysis, Classification Methods, Neural Networks

Abstract

This paper shows how reducing the number of variables used for prediction affects the forecast. Factor analysis was applied to perform the reduction, and the variables were grouped into three factors. Three classification methods were used: discriminant analysis, logistic regression, and neural networks. Three groups of data were considered: the first contains the original variables; the second, the variables belonging to factors one, two, and three; and the third, only the variables of factor one. Confusion matrices and ROC curves were used to assess the accuracy of the forecast models. The results obtained for each group show that variable reduction is a convenient way to reach excellent prediction results while using fewer resources. For example, with logistic regression the difference in model accuracy between the first two groups is less than three percent.
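The evaluation step mentioned in the abstract (confusion matrices and ROC curves) can be illustrated with a minimal Python sketch. This is not the author's implementation: the function names, the 0.5 decision threshold, and the labels and scores below are all made up for illustration. It only shows how accuracy and the area under the ROC curve could be compared between a model fitted on all variables and one fitted on a reduced set.

```python
from typing import List, Tuple


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels coded as 0/1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Share of correctly classified cases (diagonal of the confusion matrix)."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / (tn + fp + fn + tp)


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classify(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Turn predicted probabilities into 0/1 labels (threshold is an assumption)."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical data: true classes and predicted probabilities from two models.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores_full = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.6]       # all variables
scores_reduced = [0.2, 0.5, 0.3, 0.7, 0.1, 0.8, 0.6, 0.65]    # reduced variables

for name, scores in [("full", scores_full), ("reduced", scores_reduced)]:
    pred = classify(scores)
    print(name, confusion_matrix(y_true, pred),
          accuracy(y_true, pred), roc_auc(y_true, scores))
```

Comparing the two printed rows side by side mirrors the kind of comparison the abstract describes between data groups, although the numbers here carry no empirical meaning.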

Published

2020-03-06

How to Cite

del Castillo Collazo, N. (2020). Incidence in the forecast by applying variable reduction. A practical example. Revista De Investigación En Tecnologías De La Información, 8(15), 50–69. https://doi.org/10.36825/RITI.08.15.006

Issue

Section

Articles