Natural Language Processing for Digital Health in the Era of Large Language Models. Review uri icon

Overview

abstract

  • OBJECTIVES: Large language models (LLMs) are revolutionizing the natural language pro-cessing (NLP) landscape within healthcare, prompting the need to synthesize the latest ad-vancements and their diverse medical applications. We attempt to summarize the current state of research in this rapidly evolving space. METHODS: We conducted a review of the most recent studies on biomedical NLP facilitated by LLMs, sourcing literature from PubMed, the Association for Computational Linguistics Anthology, IEEE Explore, and Google Scholar (the latter particularly for preprints). Given the ongoing exponential growth in LLM-related publications, our survey was inherently selective. We attempted to abstract key findings in terms of (i) LLMs customized for medical texts, and (ii) the type of medical text being leveraged by LLMs, namely medical literature, electronic health records (EHRs), and social media. In addition to technical details, we touch upon topics such as privacy, bias, interpretability, and equitability. RESULTS: We observed that while general-purpose LLMs (e.g., GPT-4) are most popular, there is a growing trend in training or customizing open-source LLMs for specific biomedi-cal texts and tasks. Several promising open-source LLMs are currently available, and appli-cations involving EHRs and biomedical literature are more prominent relative to noisier data sources such as social media. For supervised classification and named entity recogni-tion tasks, traditional (encoder only) transformer-based models still outperform new-age LLMs, and the latter are typically suited for few-shot settings and generative tasks such as summarization. There is still a paucity of research on evaluation, bias, privacy, reproduci-bility, and equitability of LLMs. CONCLUSIONS: LLMs have the potential to transform NLP tasks within the broader medical domain. While technical progress continues, biomedical application focused research must prioritize aspects not necessarily related to performance such as task-oriented evaluation, bias, and equitable use.

publication date

  • April 8, 2025

Research

keywords

  • Natural Language Processing

Identity

Digital Object Identifier (DOI)

  • 10.1055/s-0044-1800750

PubMed ID

  • 40199310

Additional Document Info

volume

  • 33

issue

  • 1