A multimodal approach for few-shot biomedical named entity recognition in low-resource languages.
Academic Article
abstract
In this study, we revisit named entity recognition (NER) in the biomedical domain from a multimodal perspective, with a particular focus on applications in low-resource languages. Existing research primarily relies on unimodal methods for NER, which limits the potential for capturing diverse information. To address this limitation, we propose a novel method that integrates a cross-modal generation module to transform unimodal data into multimodal data, thereby enabling the use of enriched multimodal information for NER. Additionally, we design a cross-modal filtering module to mitigate the adverse effects of text-image mismatches in multimodal NER. We validate our proposed method on two biomedical datasets specifically curated for low-resource languages. Experimental results demonstrate that our method significantly enhances the performance of NER, highlighting its effectiveness and potential for broader applications in biomedical research and low-resource language contexts.
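The abstract does not specify how the cross-modal filtering module works. As a purely illustrative sketch of the general idea (down-weighting image information when it does not match the text), the snippet below gates image features by their cosine similarity to the text features. The function names, the `threshold` parameter, and the assumption that text and image embeddings share one vector space are all hypothetical and are not taken from the article.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_image_features(
    text_vec: Sequence[float],
    image_vec: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, not a value from the article
) -> List[float]:
    """Hypothetical mismatch gate, not the authors' module.

    Image features are scaled by the text-image similarity when it reaches
    the threshold, and zeroed out otherwise, so a mismatched image
    contributes nothing to a downstream NER model.
    """
    if len(text_vec) != len(image_vec):
        raise ValueError("text and image vectors must have the same length")
    score = cosine_similarity(text_vec, image_vec)
    gate = score if score >= threshold else 0.0
    return [gate * v for v in image_vec]


# Toy vectors standing in for real encoder outputs (assumed, for illustration).
matched = filter_image_features([1.0, 0.0], [2.0, 0.0])
mismatched = filter_image_features([1.0, 0.0], [0.0, 3.0])
```

In this toy setting, an image vector aligned with the text vector passes through unchanged, while an orthogonal one is suppressed entirely. A learned gate or attention-based weighting would be an equally plausible reading of the abstract.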