Study of preferences for white and red wine using binary classification methods
DOI:
https://doi.org/10.36825/RITI.08.16.003
Keywords:
Data Mining, Classification Methods, Neural Networks, Discriminant Analysis, Logistic Regression
Abstract
Data mining methods make it possible to detect patterns that may exist in the data under analysis but are not easy to see at a glance. In this work, we apply several techniques to predict taste preferences for wine from a set of physicochemical characteristics of its composition, for both red and white wine, drinks that have long been enjoyed by many people around the world. The data set used in this work comes from green wine (vinho verde) from the north of Portugal. It contains a group of variables that allowed classification methods to be applied to predict the taste rating of the wine according to the criteria given by customers. Three methods were used to achieve this objective: discriminant analysis, logistic regression, and neural networks. The results showed that, for both data sets, the three methods perform very similarly. The discriminant capacity of the models makes it possible to clearly separate the two groups used for classification.
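To illustrate the kind of comparison the abstract describes, the following is a minimal sketch that fits two of the three named methods, a two-class linear discriminant and a logistic regression, and compares their accuracy on held-out samples. It is not the authors' code: the data are synthetic (two Gaussian clusters standing in for two groups of wines described by two physicochemical variables), all function names are hypothetical, and the neural network is omitted for brevity.

```python
import math
import random

# Illustrative assumptions: synthetic data and hypothetical helper names,
# not the wine data set or the procedure used in the paper.


def make_data(n_per_class, seed):
    """Draw synthetic two-feature samples for two classes (not real wine data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            X.append([rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)])
            y.append(label)
    return X, y


def standardize(X, ref):
    """Scale X to zero mean and unit variance using the statistics of ref."""
    cols = list(zip(*ref))
    mean = [sum(c) / len(c) for c in cols]
    std = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, mean)]
    return [[(v - m) / s for v, m, s in zip(row, mean, std)] for row in X]


def fit_lda(X, y):
    """Two-class linear discriminant with pooled covariance and equal priors."""
    groups = [[x for x, t in zip(X, y) if t == k] for k in (0, 1)]
    mu = [[sum(c) / len(g) for c in zip(*g)] for g in groups]
    s = [[0.0, 0.0], [0.0, 0.0]]  # pooled within-class covariance
    for g, m in zip(groups, mu):
        for x in g:
            d = (x[0] - m[0], x[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j] / (len(X) - 2)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = (mu[1][0] - mu[0][0], mu[1][1] - mu[0][1])
    # w = S^{-1} (mu1 - mu0), written out for the 2x2 case
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (s[0][0] * diff[1] - s[1][0] * diff[0]) / det]
    # The boundary passes through the midpoint of the two class means
    b = -sum(w[i] * (mu[0][i] + mu[1][i]) / 2 for i in range(2))
    return w, b


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Logistic regression fitted by batch gradient descent on the log-loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, t in zip(X, y):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - t
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[i] - lr * gw[i] / len(X) for i in range(2)]
        b -= lr * gb / len(X)
    return w, b


def accuracy(model, X, y):
    """Share of samples whose predicted class (score > 0) matches the label."""
    w, b = model
    hits = sum((w[0] * x[0] + w[1] * x[1] + b > 0) == (t == 1) for x, t in zip(X, y))
    return hits / len(X)


X_train_raw, y_train = make_data(200, seed=0)
X_test_raw, y_test = make_data(200, seed=1)
X_train = standardize(X_train_raw, X_train_raw)
X_test = standardize(X_test_raw, X_train_raw)  # reuse training statistics

lda_acc = accuracy(fit_lda(X_train, y_train), X_test, y_test)
logit_acc = accuracy(fit_logistic(X_train, y_train), X_test, y_test)
print(f"LDA accuracy:      {lda_acc:.3f}")
print(f"Logistic accuracy: {logit_acc:.3f}")
```

Both models produce a linear decision boundary, so on well-separated groups they would be expected to score closely, which is in line with the similarity of results reported in the abstract. With real data, the synthetic generator would be replaced by the measured variables and the class labels.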
License
Copyright (c) 2020 Revista de Investigación en Tecnologías de la Información
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This journal provides open access to its content, based on the principle that making research freely available to the public supports a greater global exchange of knowledge.
Text published in Revista de Investigación en Tecnologías de la Información (RITI) is distributed under the Creative Commons license (CC BY-NC), which allows third parties to use the published material provided they credit the authors of the work and RITI, but not to use the material for commercial purposes.