Evaluating LLMs for Arabic Code Summarization: Challenges and Insights from GPT-4

EasyChair Preprint 15569

6 pages • Date: December 13, 2024

Ahmed Aljohani, Raed Alharbi, Asma Alkhaldi and Wajdi Aljedaani

Abstract

GPT-4, the backbone of ChatGPT, has demonstrated remarkable performance in both natural language and source code tasks. Recently, Large Language Models (LLMs) like GPT-4 have significantly advanced software engineering tasks such as code summarization. These advancements boost developer productivity and help address often-neglected tasks like code documentation. While code summarization and commenting are essential for maintaining code quality and facilitating communication among developers, writing comments manually is time-consuming. Although several studies have proposed and evaluated deep learning-based approaches and LLMs to automate comment generation, these efforts focus primarily on English, leaving a gap for other languages, particularly Arabic. In this study, we evaluate the ability of GPT-4 to generate accurate Arabic comments. We support our evaluation with both manual and automatic analysis to measure the correctness and nature of the generated comments. Our findings reveal that while GPT-4 generally produces correct Arabic summaries, they often do not align with the developer's intent, as reflected in the BERT-Similarity, ROUGE, and BLEU scores. We also show that GPT-4's comments are more verbose, owing to the morphological richness of the Arabic language and a systematic approach that tends to describe each code component in detail. Finally, the readability of these comments is moderate, with scores ranging from 30.29 to 100.
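
To illustrate why a correct but verbose comment can still score low on overlap metrics, the sketch below computes a simplified ROUGE-1 (unigram overlap) between a short reference comment and a longer generated one. It is not the paper's evaluation pipeline: the function name `rouge1`, the whitespace tokenization, and the Arabic example pair are assumptions made for illustration, and standard ROUGE tooling typically applies further normalization.

```python
from collections import Counter


def rouge1(reference: str, candidate: str) -> dict:
    """Simplified ROUGE-1: unigram precision, recall, and F1 on whitespace tokens."""
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())
    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example pair, not taken from the paper's dataset.
# Reference: "returns the sum of two numbers"
reference = "ترجع مجموع عددين"
# Generated: "this function takes two numbers, then computes the sum
# of the two values, then returns the result"
verbose = "هذه الدالة تأخذ عددين ثم تحسب مجموع القيمتين ثم ترجع الناتج"

scores = rouge1(reference, verbose)
print(scores)
```

In this example the generated comment contains every reference word, so recall is perfect, but the extra descriptive words lower precision and therefore F1. Arabic's attached prefixes and suffixes would likely reduce exact-token matches further under this naive tokenization.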

Keyphrases: Arabic language, Code Summarization, GPT-4, LLMs

Links:

https://easychair.org/publications/preprint/2z1C

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:15569,
  author    = {Ahmed Aljohani and Raed Alharbi and Asma Alkhaldi and Wajdi Aljedaani},
  title     = {Evaluating LLMs for Arabic Code Summarization: Challenges and Insights from GPT-4},
  howpublished = {EasyChair Preprint 15569},
  year      = {EasyChair, 2024}}
