Potential of Large Language Models in Improving Health Literacy - Exploring Evaluation Metrics for Plain Language Medical Text Adaptations

Dec 2, 2024
Primož Kocbek

Abstract:

My research during the DAAD short-term research stay explores the potential of using LLMs to adapt and simplify complex medical texts, with a focus on evaluation metrics appropriate for a specific level of readability. For example, the NIH recommends that health materials be written below the eighth-grade level (< K8), i.e. the reading level of a 13–14-year-old student. Because such approaches can be precise and personalized, they could improve health literacy, which is crucial for informed health-related decisions. Recent advancements in AI, particularly generative Large Language Models (LLMs), offer promising avenues for creating customized, scalable health literacy interventions. The research involves selecting the most suitable LLMs and evaluation metrics. It will use a baseline dataset of plain-language adaptations of biomedical abstracts, on which prompt-engineered, retrieval-augmented (RAG), and/or fine-tuned (FT) LLMs will be selected according to appropriate evaluation metrics: readability measures such as the Flesch–Kincaid grade level and SMOG, semantic similarity metrics such as BERTScore, and LLM-based metrics such as GPTScore and G-Eval. The LLMs and evaluation metrics will be evaluated quantitatively and qualitatively in two small-sample studies, in which potential data contamination of the LLMs will also be explored and taken into consideration.
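To make the readability metrics named above concrete, here is a minimal Python sketch of the standard Flesch–Kincaid grade level and SMOG formulas. It is an illustration only and not the evaluation code used in the research: the helper names (`count_syllables`, `split_sentences`, `tokenize_words`) are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for simplicity. SMOG was also designed for samples of about 30 sentences, so values on shorter texts are only indicative.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count groups of consecutive vowels and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter on ., ! and ? (abbreviations are not handled)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize_words(text: str) -> list[str]:
    """Alphabetic tokens, keeping simple apostrophe contractions together."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)


def flesch_kincaid_grade(text: str) -> float:
    """FKGL = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59"""
    sentences = split_sentences(text)
    words = tokenize_words(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def smog_index(text: str) -> float:
    """SMOG = 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291,
    where polysyllables are words with three or more syllables."""
    sentences = split_sentences(text)
    words = tokenize_words(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


if __name__ == "__main__":
    original = "Cardiovascular abnormalities necessitate comprehensive pharmacological intervention."
    adapted = "The heart pumps blood. It beats all day."
    for label, text in [("original", original), ("adapted", adapted)]:
        print(label, round(flesch_kincaid_grade(text), 2), round(smog_index(text), 2))
```

In practice, an adapted text would be compared against the eighth-grade target mentioned above, alongside semantic similarity checks, since a low grade level alone does not show that the meaning was preserved.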

About Primož:

Primož Kocbek is a senior professional research assistant at the University of Maribor, Faculty of Health Sciences, and a PhD student in biostatistics at the University of Ljubljana. His research interests include statistical modeling and machine learning techniques with applications in healthcare. His specific areas of interest include Large Language Model adaptations in healthcare, temporal data analysis, interpretability of prediction models, and advanced machine learning methods on massive datasets in general. He is currently involved in two international projects involving AI: the Horizon Europe project Synergy for Healthy Longevity (SynHealth) and the joint Belgian-Slovenian (FWO-ARIS) project Enriched conversational XAI methods for healthcare. He is a DAAD AInet Fellow and an ASEF (American Slovenian Education Foundation) Junior Fellow. Additionally, he is an editorial board member at PLOS One, was proceedings co-chair at SIAM 2023 and an organizing committee member at AIME 2023, and is a PC member at multiple scientific conferences such as AIME, SIAM, PAKDD, AMIA AS/IS/CIC, and IEEE ICHI.