A neural-symbolic AI agent system for biomedical concept mapping.

Concept mapping links free-text descriptions to standardized concepts in biomedical terminologies and ontologies. Existing approaches fall into two main categories: rule-based systems, which are interpretable but struggle with ambiguity and scalability, and learning-based methods, which leverage contextual signals but perform poorly on long-tail concepts because training data are scarce, and which offer limited explainability. To address these limitations, we propose Medical Concept Mapping (MCM), a novel agentic workflow that uses language models to rephrase ambiguous mentions into explicit, standardized descriptions prior to concept linking. This reformulation substantially improves mapping accuracy for underrepresented and abbreviated concepts. Across three benchmarks (MedMentions, ST21pv, and MCN), MCM outperforms state-of-the-art baselines, including KrissBERT, SciSpaCy, and USAGI, achieving Recall@1 scores of 63.3, 60.0, and 67.9, respectively. On zero-shot abbreviated mentions, MCM maintains strong performance, exceeding baseline Recall@1 by up to 24.8 points. Human evaluation confirms the quality of the LLM-generated expansions: 79.4% were rated as reasonable and useful, with GPT-OSS receiving the highest approval rate (85.1%). These results demonstrate that MCM enables more accurate, interpretable, and robust long-tail concept normalization for biomedical applications.
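The rephrase-then-link idea and the Recall@1 metric can be illustrated with a minimal sketch. This is not the MCM implementation: the language-model rewriting step is replaced by a hypothetical lookup table (`EXPANSIONS`), the linker is plain string similarity from Python's standard library, and the vocabulary, concept IDs, and function names are all invented for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy vocabulary: concept ID -> standardized name.
# The IDs are placeholders, not codes from any real terminology.
VOCAB = {
    "C001": "myocardial infarction",
    "C002": "chronic obstructive pulmonary disease",
    "C003": "magnetic resonance imaging",
}

# Assumption: a lookup table stands in for the language-model
# rewriting step described in the abstract.
EXPANSIONS = {
    "mi": "myocardial infarction",
    "copd": "chronic obstructive pulmonary disease",
    "mri": "magnetic resonance imaging",
}


def rephrase(mention):
    """Rewrite an ambiguous mention into an explicit description."""
    key = mention.strip().lower()
    return EXPANSIONS.get(key, key)


def link(description):
    """Return the concept ID whose name is most similar to the description."""
    def similarity(concept_id):
        return SequenceMatcher(None, description, VOCAB[concept_id]).ratio()
    return max(VOCAB, key=similarity)


def recall_at_1(pairs, use_rephrase=True):
    """Fraction of (mention, gold ID) pairs whose top-ranked concept is correct."""
    hits = 0
    for mention, gold_id in pairs:
        text = rephrase(mention) if use_rephrase else mention.strip().lower()
        hits += link(text) == gold_id
    return hits / len(pairs)


# Toy evaluation set of abbreviated mentions with their gold concept IDs.
PAIRS = [("MI", "C001"), ("COPD", "C002"), ("MRI", "C003")]
score_with_rephrase = recall_at_1(PAIRS, use_rephrase=True)
score_without_rephrase = recall_at_1(PAIRS, use_rephrase=False)
```

Comparing `score_with_rephrase` with `score_without_rephrase` on such a toy set shows the mechanism the abstract describes: an abbreviation shares little surface text with its concept name, so expanding it before linking gives the linker far more to match on.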

VIVO Weill Cornell Medical College