The Evolution of Radiology Image Annotation in the Era of Large Language Models. Review

Overview

abstract

  • Although there are relatively few diverse, high-quality medical imaging datasets on which to train computer vision artificial intelligence models, even fewer datasets contain expertly classified observations that can be repurposed to train or test such models. The traditional annotation process is laborious and time-consuming. Repurposing annotations and consolidating similar types of annotations from disparate sources has never been practical. Until recently, the use of natural language processing to convert a clinical radiology report into labels required custom training of a language model for each use case. Newer technologies such as large language models have made it possible to generate accurate and normalized labels at scale, using only clinical reports and specific prompt engineering. Labels automatically extracted and normalized from reports, used in conjunction with foundational image models, provide a means to create labels for model training. This article provides a short history and review of the annotation and labeling process of medical images, from traditional manual methods to the newest semiautomated methods, which offer a more scalable solution for creating useful models more efficiently. Keywords: Feature Detection, Diagnosis, Semi-supervised Learning. © RSNA, 2025.

publication date

  • July 1, 2025

Research

keywords

  • Artificial Intelligence
  • Data Curation
  • Natural Language Processing
  • Radiology
  • Radiology Information Systems

Identity

PubMed Central ID

  • PMC12319696

Scopus Document Identifier

  • 105013550572

Digital Object Identifier (DOI)

  • 10.1148/ryai.240631

PubMed ID

  • 40304582

Additional Document Info

volume

  • 7

issue

  • 4