Comparative Study of AI Code Generation Tools: Quality Assessment and Performance Analysis

Authors

  • Michael Alexander Florez Muñoz Universidad del Quindío. Author
  • Juan Camilo Jaramillo De La Torre Universidad del Quindío. Author
  • Stefany Pareja López Universidad del Quindío. Author
  • Stiven Herrera Universidad del Quindío. Author
  • Christian Andrés Candela Uribe Universidad del Quindío. Author

DOI:

https://doi.org/10.62486/latia2024104

Keywords:

Artificial Intelligence, Coding Assistants, Code Generation

Abstract

Artificial intelligence (AI) code generation tools are crucial in software development, processing natural language to improve programming efficiency. Their increasing integration in various industries highlights their potential to transform the way programmers approach and execute software projects. The present research was conducted with the purpose of determining the accuracy and quality of code generated by artificial intelligence (AI) tools. The study began with a systematic mapping of the literature to identify applicable AI tools. Databases such as ACM, Engineering Source, Academic Search Ultimate, IEEE Xplore and Scopus were consulted, from which 621 papers were initially extracted. After applying inclusion criteria, such as English-language papers in computing areas published between 2020 and 2024, 113 resources were selected. A further screening process reduced this number to 44 papers, which identified 11 AI tools for code generation. The method used was a comparative study in which ten programming exercises of varying levels of difficulty were designed and the results obtained from 4 of them are presented. The identified tools generated code for these exercises in different programming languages. The quality of the generated code was evaluated using the SonarQube static analyzer, considering aspects such as safety, reliability and maintainability. The results showed significant variations in code quality among the AI tools. Bing as a code generation tool showed slightly superior performance compared to others, although none stood out as a noticeably superior AI. In conclusion, the research evidenced that, although AI tools for code generation are promising, they still require a pilot to reach their full potential, giving evidence that there is still a long way to go.

References

Alvarado Rojas, M. E. (2015). UNA MIRADA A LA INTELIGENCIA ARTIFICIAL.

Revista Ingeniería, Matemáticas y Ciencias de La Información, 2(3). http://ojs.urepublicana.edu.co/index.php/ingenieria/article/view/234

Azaiz, I., Deckarm, O., & Strickroth, S. (2023). AI-Enhanced Auto-Correction of Programming Exercises: How Effective is GPT-3.5? International Journal of Engineering Pedagogy (IJEP), 13(8), 67–83. https://doi.org/10.3991/ijep.v13i8.45621 DOI: https://doi.org/10.3991/ijep.v13i8.45621

Carrizo, D., & Moller, C. (2018). Estructuras metodológicas de revisiones sistemáticas de literatura en Ingeniería de Software: un estudio de mapeo sistemático. Ingeniare. DOI: https://doi.org/10.4067/S0718-33052018000500045

Revista Chilena de Ingeniería, 26, 45–54. https://doi.org/10.4067/S0718-33052018000500045 DOI: https://doi.org/10.4067/S0718-33052018000500045

Chemnitz, L., Reichenbach, D., Aldebes, H., Naveed, M., Narasimhan, K., & Mezini, M. (2023). Towards Code Generation from BDD Test Case Specifications: A Vision. 2023 IEEE/ACM 2nd International Conference on AI Engineering – Software Engineering for AI (CAIN), 139–144. https://doi.org/10.1109/CAIN58948.2023.00031 DOI: https://doi.org/10.1109/CAIN58948.2023.00031

De Giusti, L. C., Villarreal, G. L., Ibañez, E. J., & De Giusti, A. E. (2023). Aprendizaje y enseñanza de programación: El desafío de herramientas de Inteligencia Artificial como ChatGPT. Web

Finnie-Ansley, J., Denny, P., Becker, B. A., Luxton-Reilly, A., & Prather, J. (2022). The Robots Are Coming: Exploring the Implications of OpenAI Codex on Introductory Programming. Proceedings of the 24th Australasian Computing Education Conference, 10–19. https://doi.org/10.1145/3511861.3511863 DOI: https://doi.org/10.1145/3511861.3511863

Gruson, D. (2021). Big Data, inteligencia artificial y medicina de laboratorio: la hora de la integración. Advances in Laboratory Medicine / Avances En Medicina de Laboratorio, 2(1), 5–7. https://doi.org/10.1515/almed-2021-0014 DOI: https://doi.org/10.1515/almed-2021-0014

Hernández-Pinilla, D. V., & Mendoza-Moreno, J. F. (n.d.). La Inteligencia Artificial como una herramienta para potenciar el desarrollo de software.

Kitchenham, B., Pretorius, R., Budgen, D., Brereton, Pearl., Turner, M., Niazi, M., & Linkman, S. (2010). Systematic literature reviews in software engineering – A tertiary study. Information and Software Technology, 52(8), 792–805. https://doi.org/10.1016/j.infsof.2010.03.006 DOI: https://doi.org/10.1016/j.infsof.2010.03.006

Koziolek, H., Gruener, S., & Ashiwal, V. (2023). ChatGPT for PLC/DCS Control Logic Generation. 2023 IEEE 28th International Conference on Emerging Technologies and Factory Automation (ETFA), 1–8. https://doi.org/10.1109/ETFA54631.2023.10275411 DOI: https://doi.org/10.1109/ETFA54631.2023.10275411

Lee, R. S. T. (2020). Artificial Intelligence in Daily Life. Springer Singapore. https://doi.org/10.1007/978-981-15-7695-9 DOI: https://doi.org/10.1007/978-981-15-7695-9

Llanos Mosquera, J. M., Hidalgo Suarez, C. G., & Bucheli Guerrero, V. A. (2021). Una revisión sistemática sobre aula invertida y aprendizaje colaborativo apoyados en inteligencia artificial para el aprendizaje de programación. Tecnura, 25(69), 196–214. https://doi.org/10.14483/22487638.16934 DOI: https://doi.org/10.14483/22487638.16934

Macchi, D., & Solari, M. (2012). Mapeo sistemático de la literatura sobre la Adopción de Inspecciones de Software.

Marar, H. W. (2024). Advancements in software engineering using AI. Computer Software and Media Applications, 6(1), 3906. https://doi.org/10.24294/csma.v6i1.3906 DOI: https://doi.org/10.24294/csma.v6i1.3906

Pasquinelli, M., Cafassi, E., Monti, C., Peckaitis, H., & Zarauza, G. (2022). Cómo una máquina aprende y falla – Una gramática del error para la Inteligencia Artificial. Hipertextos, 10(17), 13–29. https://doi.org/10.24215/23143924e054 DOI: https://doi.org/10.24215/23143924e054

Ruiz Baquero, P. E. (2018). Avances en inteligencia artificial y su impacto en la sociedad. https://repository.upb.edu.co/handle/20.500.11912/4942

Tseng, A., Hahnemann, L., Humberto, M. E., Gaitan, P., Antonio, M., & Acosta, P. (2023). ChatGPT desarrollando código fuente de software para historias de usuarios en una consultora, Lima, 2023. Repositorio Institucional - UCV. https://repositorio.ucv.edu.pe/handle/20.500.12692/133838

Wolfschwenger, P., Sabitzer, B., & Lavicza, Z. (2023). Integrating Cloud-Based AI in Software Engineers’ Professional Training and Development. 2023 IEEE Frontiers in Education Conference (FIE), 1–5. https://doi.org/10.1109/FIE58773.2023.10343391 DOI: https://doi.org/10.1109/FIE58773.2023.10343391

Yadav, R. K., & Pandey, M. (2020). Artificial Intelligence and its Application in Various Fields. International Journal of Engineering and Advanced Technology, 9(5), 1336–1339. https://doi.org/10.35940/ijeat.D9144.069520 DOI: https://doi.org/10.35940/ijeat.D9144.069520

Yan, D., Gao, Z., & Liu, Z. (2023). A Closer Look at Different Difficulty Levels Code Generation Abilities of ChatGPT. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), 1887–1898. https://doi.org/10.1109/ASE56229.2023.00096 DOI: https://doi.org/10.1109/ASE56229.2023.00096

Published

2024-07-27

Issue

Section

Original

How to Cite

1.
Florez Muñoz MA, Jaramillo De La Torre JC, Pareja López S, Herrera S, Candela Uribe CA. Comparative Study of AI Code Generation Tools: Quality Assessment and Performance Analysis. LatIA [Internet]. 2024 Jul. 27 [cited 2025 Aug. 17];2:104. Available from: https://latia.ageditor.uy/index.php/latia/article/view/104