Semi-Supervised, Attention-Based Deep Learning for Predicting TMPRSS2:ERG Fusion Status in Prostate Cancer Using Whole Slide Images.
Abstract
Prostate cancer (PCa) harbors several genetic alterations, the most prevalent of which is the TMPRSS2:ERG gene fusion, affecting nearly half of all cases. Capitalizing on the increasing availability of whole-slide images (WSIs), this study introduces a deep learning (DL) model designed to detect TMPRSS2:ERG fusion from H&E-stained WSIs of radical prostatectomy specimens. Leveraging the TCGA prostate adenocarcinoma cohort, which comprises 436 WSIs from 393 patients, we developed a robust DL model trained across ten different splits, each consisting of distinct training, validation, and testing sets. The model achieved a best Area Under the ROC Curve (AUC) of 0.84 during training and 0.72 on the TCGA test set. The model was subsequently validated on an independent cohort of 314 WSIs from a different institution, where it maintained robust performance in predicting TMPRSS2:ERG fusion with an AUC of 0.73. Importantly, the model identifies highly attended tissue regions associated with TMPRSS2:ERG fusion, characterized by higher neoplastic cell content and altered immune and stromal profiles compared with fusion-negative cases. Multivariate survival analysis revealed that these morphological features correlate with poorer survival outcomes, independent of Gleason grade and tumor stage. This study underscores the potential of DL in deducing genetic alterations from routine slides and in identifying their underlying morphological features, which may harbor prognostic information.

Implications: Our study illuminates the potential of deep learning to infer key prostate cancer genetic alterations from the tissue morphology depicted in routinely available histology slides, offering a cost-effective method that could revolutionize diagnostic strategies in oncology.
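The attention-based, slide-level prediction described in the abstract is commonly realized by aggregating pre-extracted patch embeddings with an attention-pooling layer, whose per-patch weights also provide the "highly attended region" maps. The sketch below is a minimal, illustrative PyTorch implementation of this general idea; the class name, layer sizes, feature dimension, and the assumption of pre-computed patch features are our own assumptions and are not taken from the paper's reported architecture.

```python
# Minimal sketch of attention-based multiple-instance pooling for WSI-level
# prediction (e.g., TMPRSS2:ERG fusion status). Layer sizes and names are
# illustrative assumptions, not the architecture reported in the paper.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim: int = 1024, hidden_dim: int = 256, n_classes: int = 2):
        super().__init__()
        # Scores each patch embedding; a softmax over patches yields attention weights.
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, patch_features: torch.Tensor):
        # patch_features: (num_patches, feat_dim) for a single slide
        scores = self.attention(patch_features)              # (num_patches, 1)
        weights = torch.softmax(scores, dim=0)                # attention over patches
        slide_embedding = (weights * patch_features).sum(0)   # attention-weighted average
        logits = self.classifier(slide_embedding)             # slide-level prediction
        return logits, weights.squeeze(-1)                    # weights locate attended regions

# Example: one slide represented by 500 pre-extracted patch embeddings.
model = AttentionMIL()
features = torch.randn(500, 1024)
logits, attn = model(features)
print(logits.shape, attn.shape)  # torch.Size([2]) torch.Size([500])
```

Mapping the returned attention weights back to patch coordinates produces the kind of slide-level heatmap that can then be inspected for the morphological correlates of fusion status (neoplastic cell content, immune and stromal composition) discussed above.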