mTREE: Multi-Level Text-Guided Representation End-to-End Learning for Whole Slide Image Analysis.

Overview

abstract

Multi-modal learning integrates visual and textual data, but its application to histopathology image and text analysis remains challenging, particularly for large, high-resolution images such as gigapixel Whole Slide Images (WSIs). Current methods typically rely on manual region labeling or multi-stage learning to assemble local representations (e.g., patch-level) into global features (e.g., slide-level). However, there is no effective way to integrate multi-scale image representations with text data in a seamless end-to-end process. In this study, we introduce Multi-Level Text-Guided Representation End-to-End Learning (mTREE). This text-guided approach captures multi-scale WSI representations by using the accompanying textual pathology information. mTREE combines two steps, the localization of key areas ("global-to-local") and the development of a WSI-level image-text representation ("local-to-global"), into a unified, end-to-end learning framework. In this model, textual information serves a dual purpose: first, it functions as an attention map that identifies key areas; second, it acts as a conduit for integrating textual features into the comprehensive representation of the image. We demonstrate the effectiveness of mTREE through quantitative analyses on two image-related tasks, classification and survival prediction, in which it outperforms the baselines. Code and trained models are available at https://github.com/hrlblab/mTREE.
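
The dual role of text described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it is a toy example under stated assumptions: patch and text embeddings already live in the same vector space, attention is scaled dot-product similarity followed by a softmax, key areas are simply the top-k patches, and fusion is plain concatenation. The names `text_guided_pool`, `patch_feats`, `text_feat`, and `top_k` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def text_guided_pool(patch_feats, text_feat, top_k):
    """Toy text-guided pooling (assumed design, not the published model).

    patch_feats: list of N patch embeddings, each a list of D floats.
    text_feat:   one text embedding, a list of D floats.
    Returns (key_idx, attn, fused).
    """
    dim = len(text_feat)

    # Text as an attention map: score each patch against the text embedding.
    scores = [
        sum(p * t for p, t in zip(patch, text_feat)) / math.sqrt(dim)
        for patch in patch_feats
    ]
    attn = softmax(scores)

    # "Global-to-local": keep the top_k most text-relevant patches.
    key_idx = sorted(range(len(attn)), key=lambda i: attn[i], reverse=True)[:top_k]

    # "Local-to-global": attention-weighted average of the selected patches.
    norm = sum(attn[i] for i in key_idx)
    slide_feat = [
        sum(attn[i] / norm * patch_feats[i][d] for i in key_idx)
        for d in range(dim)
    ]

    # Text as a conduit: append the text embedding to the slide-level feature.
    fused = slide_feat + list(text_feat)
    return key_idx, attn, fused


# Hypothetical 2-D embeddings for four patches and one text description.
patches = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
text = [1.0, 0.0]
key_idx, attn, fused = text_guided_pool(patches, text, top_k=2)
```

In this toy input, the patches most aligned with the text embedding receive the largest attention weights and are the ones pooled into the slide-level vector; a trained model would learn the embeddings and the selection jointly rather than use fixed vectors.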

authors

Liu, Quan
Deng, Ruining
Cui, Can
Yao, Tianyuan
Yang, Yuechen
Nath, Vishwesh
Li, Bingshan
Chen, You
Tang, Yucheng
Huo, Yuankai

publication date

January 1, 2025

published in

IS&T International Symposium on Electronic Imaging Journal

Identity

PubMed Central ID

PMC12662735

Scopus Document Identifier

105000833089

Digital Object Identifier (DOI)

10.2352/ei.2025.37.12.hpci-183

PubMed ID

41323017

Additional Document Info

volume

37

VIVO Weill Cornell Medical College

mTREE: Multi-Level Text-Guided Representation End-to-End Learning for Whole Slide Image Analysis. Academic Article
