Clustering of poems by suicidal and not suicidal authors using K-means and particle swarm optimization
DOI:
https://doi.org/10.36825/RITI.09.18.002
Keywords:
Clustering, Metaheuristics, K-Means, Particle Swarm Optimization, Suicidal Ideation Detection
Abstract
Suicide is considered a public health issue, and early detection and treatment of suicidal ideation may contribute to its prevention. Automatic detection of indicators of suicidal ideation in text can be a useful tool for that purpose. In this work, a corpus was compiled from poems written by twelve poets, six of whom died by suicide. Two vector representations were tested: one built from all the words and another restricted to words related to negative emotional concepts. The vectors were clustered with two algorithms: K-Means and a hybrid of K-Means and Particle Swarm Optimization. The vector representations and the algorithms were then compared. With the hybrid algorithm and the negative emotional concepts vocabulary, the groups of poets with and without suicidal ideation could be distinguished with an accuracy of 0.98.
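The abstract does not say how K-Means and Particle Swarm Optimization were combined, so the following is only a minimal sketch of one common hybridization, not the authors' implementation. It assumes that each particle encodes a full set of k centroids, that one particle is seeded with the K-Means result, and that fitness is the mean squared distance of each vector to its nearest centroid. The function names, the PSO coefficients (`w`, `c1`, `c2`), and the two-dimensional toy vectors standing in for word-count vectors are all illustrative assumptions.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def quantization_error(points, centroids):
    """Mean squared distance to the nearest centroid (lower is better)."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points) / len(points)

def kmeans(points, k, rng, iters=20):
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids

def pso_kmeans(points, k, n_particles=10, iters=50, w=0.72, c1=1.49, c2=1.49, seed=0):
    rng = random.Random(seed)
    dim = len(points[0])

    def unflatten(pos):
        return [pos[i * dim:(i + 1) * dim] for i in range(k)]

    def fitness(pos):
        return quantization_error(points, unflatten(pos))

    # Particle 0 starts from the K-Means solution (assumed hybridization);
    # the remaining particles start from randomly chosen data points.
    swarm = [[x for c in kmeans(points, k, rng) for x in c]]
    for _ in range(n_particles - 1):
        swarm.append([x for p in rng.sample(points, k) for x in p])
    vel = [[0.0] * (k * dim) for _ in swarm]
    pbest = [list(p) for p in swarm]
    pbest_fit = [fitness(p) for p in swarm]
    g = min(range(len(swarm)), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = list(pbest[g]), pbest_fit[g]

    for _ in range(iters):
        for i, pos in enumerate(swarm):
            for d in range(len(pos)):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[d])
                             + c2 * r2 * (gbest[d] - pos[d]))
                pos[d] += vel[i][d]
            f = fitness(pos)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = list(pos), f
                if f < gbest_fit:
                    gbest, gbest_fit = list(pos), f
    return unflatten(gbest), gbest_fit

def clustering_accuracy(labels, truth):
    """Accuracy for two clusters under the better cluster-to-class mapping."""
    hits = sum(l == t for l, t in zip(labels, truth))
    return max(hits, len(truth) - hits) / len(truth)

# Toy vectors (hypothetical data, not the poem corpus).
toy = [[5.0, 4.0], [6.0, 5.0], [5.0, 6.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
truth = [1, 1, 1, 0, 0, 0]
centroids, err = pso_kmeans(toy, k=2)
labels = assign(toy, centroids)
print(err, clustering_accuracy(labels, truth))
```

Because the swarm's best position is initialized from a set that includes the K-Means solution and is only replaced by strictly better positions, the hybrid's final error in this sketch can never exceed that of its K-Means seed. The accuracy helper takes the better of the two possible cluster-to-class mappings, since cluster indices carry no inherent meaning.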
License
Copyright (c) 2021 Revista de Investigación en Tecnologías de la Información

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This journal provides open access to its content, based on the principle that offering the public free access to research supports a greater global exchange of knowledge.
Text published in Revista de Investigación en Tecnologías de la Información (RITI) is distributed under the Creative Commons license (CC BY-NC), which allows third parties to use the published material provided they credit the authors of the work and RITI, but not for commercial purposes.