Title: A Step Forward for Medical LLMs in Brazilian Portuguese: Establishing a Benchmark and a Strong Baseline

Authors: Gabriel Lino Garcia, João Renato Ribeiro Manesco, Pedro Henrique Paiola, Pedro Henrique Crespan Ribeiro, Ana Lara Alves Garcia and João Paulo Papa

Conference: IEEE CBMS 2025

Tags: Benchmark, Large Language Models, Machine Learning and Medicine

Abstract: The application of large language models in healthcare presents unique challenges, particularly in non-English contexts where linguistic and cultural nuances significantly impact model effectiveness. Due to the lack of standardized evaluation protocols in less-represented languages, such as Brazilian Portuguese, model performance is often assessed through qualitative analysis, making systematic comparisons impossible. In this work, we introduce a novel benchmark for evaluating medical language models in Brazilian Portuguese, addressing a critical gap in AI assessment for healthcare applications. This benchmark is built upon Brazilian medical aptitude tests spanning 2011-2024, enabling extensive evaluation of both specialist and general large language models. Our findings demonstrate that despite advancements in language model capabilities, significant gaps remain in their ability to reason effectively about medical knowledge in Brazilian Portuguese. This benchmark establishes a proper foundation for evaluating and advancing medical language models in Portuguese, creating a standardized framework to guide development toward more effective, equitable, and culturally appropriate AI systems for healthcare in Brazil.