Incidence in the forecast by applying variable reduction. A practical example
DOI:
https://doi.org/10.36825/RITI.08.15.006
Keywords:
Variable Reduction, Factorial Analysis, Classification Methods, Neural Networks
Abstract
This paper shows how reducing the number of variables used for prediction affects the forecast. Factor analysis was applied to perform the reduction, and the variables were grouped into three factors. Three classification methods were used: discriminant analysis, logistic regression, and neural networks. We worked with three groups of data: the first includes the original variables, the second the variables belonging to factors one, two, and three, and the third only those of factor one. Confusion matrices and ROC curves were used to determine the accuracy of the forecast models. The results obtained for each group show that reducing the variables is a convenient way to reach excellent prediction results using fewer resources; for example, with logistic regression the difference in model accuracy between the first two groups is less than three percent.
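As a rough illustration of the evaluation step mentioned above, the following minimal sketch computes a binary confusion matrix, the resulting accuracy, and the area under the ROC curve from a classifier's scores. It is not the authors' code: the labels, the scores, the 0.5 threshold, and the function names are all hypothetical and chosen only for illustration, and the AUC is computed with the standard rank-based (Mann-Whitney) formulation rather than by plotting the curve.

```python
from typing import List, Tuple


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels coded as 0/1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of cases on the diagonal of the confusion matrix."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / (tn + fp + fn + tp)


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    positive case receives a higher score than a randomly chosen negative
    case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores (illustrative only, not from the paper).
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.6]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("confusion matrix (tn, fp, fn, tp):", confusion_matrix(y_true, y_pred))
print("accuracy:", accuracy(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, scores))
```

Running the same functions on the predictions of a model fitted to each group of variables would give the kind of side-by-side accuracy comparison the abstract describes.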
License
Copyright (c) 2020 Revista de Investigación en Tecnologías de la Información
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This journal provides open access to its content, based on the principle that offering the public free access to research supports a greater global exchange of knowledge.
Text published in the Revista de Investigación en Tecnologías de la Información (RITI) is distributed under the Creative Commons license (CC BY-NC), which allows third parties to use the published material provided they credit the authors of the work and RITI, but not for commercial purposes.