Property-based software testing automation through prompt engineering and artificial intelligence
DOI: https://doi.org/10.36825/RITI.13.31.005
Keywords: Prompt, Properties, Test, AI, LLM
Abstract
The increasing complexity of modern software demands testing methodologies that are more efficient, comprehensive, and adaptive: ones that ensure the reliability, robustness, and quality of applications while containing the cost and time of system maintenance and evolution. Traditional testing approaches remain widely used, but they are limited in coverage, scalability, and adaptability, especially for dynamic, constantly evolving systems. In this context, integrating large language models (LLMs) through prompt engineering emerges as a promising alternative for automating, enhancing, and expanding software testing. This work presents a tool that combines the generative capabilities of LLMs, accessed through specialized APIs, with the rigor of property-based testing (PBT). This synergy enables the automatic generation of test properties and the intelligent validation of code, facilitating early error detection and contributing to more robust and reliable software from the earliest stages of the development lifecycle. Through prompt engineering, the tool guides the precise formulation of test properties and orchestrates the generation of diverse, relevant test data. The approach aims to overcome the limitations of traditional methodologies by improving test coverage, reducing manual effort, and increasing scalability, yielding a leaner verification process that promotes higher standards of software quality and reliability. This proposal advances the intelligent automation of testing, integrating artificial intelligence with formal validation methodologies and opening new possibilities for its application in software engineering.
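To make the property-based testing idea concrete, the sketch below shows the kind of executable property such a tool might emit for a function under test. It is a minimal, self-contained illustration in Python: in practice the paper's references point to Hypothesis for property generation, but here a hand-rolled random-input loop keeps the example dependency-free. The choice of `sorted` as the function under test and the two properties checked are illustrative assumptions, not output of the authors' tool.

```python
import random
from collections import Counter

def check_sort_properties(trials=200, seed=0):
    """Property-based check of sorted(): random inputs, invariant assertions.

    Property 1: the output is ordered (each element <= its successor).
    Property 2: the output is a permutation of the input (same multiset).
    """
    rng = random.Random(seed)
    for _ in range(trials):
        # Generate a random test case: a list of 0-20 integers in [-100, 100].
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        ys = sorted(xs)
        assert all(a <= b for a, b in zip(ys, ys[1:])), f"not ordered: {ys}"
        assert Counter(ys) == Counter(xs), f"not a permutation: {xs} -> {ys}"
    return True
```

With a library such as Hypothesis, the random-generation loop is replaced by declarative strategies (e.g. `@given(st.lists(st.integers()))`), and failing inputs are automatically shrunk to minimal counterexamples; the assertions themselves play the role of the "test properties" that the tool generates via prompt engineering.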
References
Baresi, L., Pezzè, M. (2006). An introduction to software testing. Electronic Notes in Theoretical Computer Science, 148 (1), 89–111. https://doi.org/10.1016/j.entcs.2005.12.014
Fink, G., Bishop, M. (1997). Property-based testing: A new approach to testing for assurance. ACM SIGSOFT Software Engineering Notes, 22 (4), 74–80. https://doi.org/10.1145/263244.263267
Kaner, C., Bach, J., Pettichord, B. (2002). Testing computer software. Wiley.
Beizer, B. (1990). Software testing techniques. Van Nostrand Reinhold Company. https://dl.acm.org/doi/10.5555/79060
Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ..., Amodei, D. (2020). Language models are few-shot learners. 34th International Conference on Neural Information Processing Systems. Vancouver, BC, Canada. https://dl.acm.org/doi/abs/10.5555/3495724.3495883
Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., Neubig, G. (2023). Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 55 (9), 1–35. https://doi.org/10.1145/3560815
White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert, H., Elnashar, A., Spencer-Smith, J., Schmidt, D. C. (2023). A prompt pattern catalog to enhance prompt engineering with ChatGPT. arXiv preprint. https://arxiv.org/abs/2302.11382
Zhou, D., Schärli, N., Hou, L., Wei, J., Scales, N., Wang, X., Schuurmans, D., Cui, C., Bousquet, O., Le, Q., Chi, E. (2022). Least-to-most prompting enables complex reasoning in large language models. arXiv preprint. https://doi.org/10.48550/arXiv.2205.10625
Díaz Benito, G. (2024). Analysis of the use of transformers and generative models for advertisement generation (Undergraduate thesis). Universidad Politécnica de Madrid, Spain. https://oa.upm.es/82703/
Jason ZK. (2024). Prompt engineering and AI prompts: Concepts, design, and optimization. https://es.blog.jasonzk.com/ai/aipromptengineering/
Kippel01. (2024). The importance of prompt engineering for the quality of responses from artificial intelligence tools. https://www.kippel01.com/tecnologia/importancia-prompt-engineering-calidad-respuestas-herramientas-inteligencia-artificial
MacIver, D. R., Hatfield-Dodds, Z. (2019). Hypothesis: A new approach to property-based testing. Journal of Open Source Software, 4 (43). https://doi.org/10.21105/joss.01891
Goldstein, H., Cutler, J. W., Dickstein, D., Pierce, B. C., Head, A. (2024). Property-Based Testing in Practice. IEEE/ACM 46th International Conference on Software Engineering (ICSE). Lisbon, Portugal. https://doi.org/10.1145/3597503.3639581
Claessen, K., Hughes, J. (2000). QuickCheck: A lightweight tool for random testing of Haskell programs. Fifth ACM SIGPLAN international conference on Functional programming. Singapore. https://doi.org/10.1145/351240.351266
Papadakis, M., Sagonas, K. (2011). A PropEr integration of types and function specifications with property-based testing. 10th ACM SIGPLAN workshop on Erlang. Tokyo, Japan. https://doi.org/10.1145/2034654.2034663
Keploy. (2024). Property-based testing: A comprehensive guide. https://dev.to/keploy/property-based-testing-a-comprehensive-guide-lc2
Classen, A., Heymans, P., Schobbens, P.-Y., Legay, A., Raskin, J.-F. (2010). Model checking lots of systems: Efficient verification of temporal properties in software product lines. 32nd ACM/IEEE International Conference on Software Engineering - Volume 1. Cape Town, South Africa. https://doi.org/10.1145/1806799.1806850
Austin, J., Odena, A., Nye, M., Bosma, M., Michalewski, H., Dohan, D., Jiang, E., Cai, C., Terry, M., Le, Q., Sutton, C. (2021). Program synthesis with large language models. arXiv preprint. https://doi.org/10.48550/arXiv.2108.07732
Chen, M., Tworek, J., Jun, H., ... (2021). Evaluating large language models trained on code. arXiv preprint. https://doi.org/10.48550/arXiv.2107.03374
Kazemitabaar, M., Williams, J., Drosos, I., Grossman, T., Henley, A. Z., Negreanu, C., Sarkar, A. (2024). Improving steering and verification in AI-assisted data analysis with interactive task decomposition. 37th Annual ACM Symposium on User Interface Software and Technology (UIST). Pittsburgh PA, USA. https://doi.org/10.1145/3654777.3676345
Higginbotham, G. Z., Matthews, N. S. (2024). Prompting and in-context learning: Optimizing prompts for Mistral Large. https://doi.org/10.21203/rs.3.rs-4430993/v1
Mistral AI Team. (2024). Au large. https://mistral.ai/news/mistral-large
Zheng, Q., Guo, Y., ... (2023). CodeGeeX: A pre-trained model for code generation with multilingual benchmarking on HumanEval-X. arXiv preprint. https://doi.org/10.48550/arXiv.2303.17568
Yu, Z., Wang, Z., ... (2024). HumanEval Pro and MBPP Pro: Evaluating large language models on self-invoking code generation. arXiv preprint. https://doi.org/10.48550/arXiv.2412.21199
Dohmke, T. (2024). Introducing GitHub models: A new generation of AI engineers building on GitHub. https://github.blog/news-insights/product-news/introducing-github-models/
Pérula, R., Calleja, T., Hernández de la Cruz, J. M. (2024). Versus: OpenAI GPT-4 vs. Google Gemini Pro vs. Mistral AI Large. https://www.paradigmadigital.com/dev/versus-openai-gpt4-google-gemini-pro-mistral-ai-large/
Visual Studio Code. (2025). Your first extension. https://code.visualstudio.com/api/get-started/your-first-extension
Hou, X., Zhao, Y., Wang, S., Wang, H. (2025). Model Context Protocol (MCP): Landscape, security threats, and future research directions. arXiv preprint. https://doi.org/10.48550/arXiv.2503.23278
License
Copyright (c) 2025 Revista de Investigación en Tecnologías de la Información

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
This journal provides open access to its content, based on the principle that making research freely available to the public supports a greater global exchange of knowledge.
Text published in the Revista de Investigación en Tecnologías de la Información (RITI) is distributed under the Creative Commons license (CC BY-NC), which allows third parties to use the published material provided they cite the authors of the work and RITI, but not for commercial purposes.
