Deep learning [¹⁸F]-FDG-PET/CT-based algorithm for tumor burden estimation in metastatic melanoma patients under immunotherapy.

Overview

Abstract

BACKGROUND AND PURPOSE: Artificial intelligence is increasingly used in radiation oncology, yet its application to tumor burden (TB) estimation remains limited. This study evaluated the performance of a [¹⁸F]-fluorodeoxyglucose positron emission tomography/computed tomography ([¹⁸F]-FDG-PET/CT)-based deep learning model, the PET-Assisted Reporting System ("PARS", Siemens Healthineers), for lesion detection, segmentation, and TB estimation in patients with metastatic melanoma undergoing immunotherapy.
MATERIALS AND METHODS: This retrospective study included 165 patients with stage IV melanoma who underwent [¹⁸F]-FDG-PET/CT imaging prior to immunotherapy. Gross tumor volumes were segmented using PARS and compared with manual delineations performed by radiation oncologists. Performance was assessed through lesion detection metrics (precision and recall), individual lesion volume agreement, and overall TB estimation accuracy.
RESULTS: PARS achieved an overall recall (sensitivity) of 68.9 % but only modest precision (46.8 %). Performance was location-dependent, with the highest precision observed for lung lesions (74.0 %) and the lowest for bone lesions (32.9 %). For lesions detected by both methods, PARS tended to underestimate lesion volume, by a median of 0.9 cc (median relative percentage difference (MRPD) = -34.3 %), with good agreement (intraclass correlation coefficient (ICC) = 0.77). Global TB across the whole cohort was overestimated by 28.3 %, whereas patient-level TB was underestimated by a median of 1.1 cc (MRPD = -18.4 %), with high variability (median absolute relative percentage difference (MARPD) = 68.6 %) and poor agreement (ICC = 0.28).
CONCLUSIONS: PARS shows potential for treatment decision support with moderate accuracy in lesion detection and lesion volume estimation, but demonstrates significant variability in TB estimation, highlighting the need for further model refinements before clinical adoption.
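
ILLUSTRATIVE SKETCH: The abstract names its detection and volume-agreement metrics without giving formulas. The Python sketch below shows one common way such quantities are computed. It assumes that the manual delineation is the reference, so that a relative percentage difference is 100 × (PARS − manual) / manual. The function names are hypothetical and the numbers are made-up illustrative values, not study data. It is not the authors' implementation.

```python
from statistics import median


def precision_recall(true_pos, false_pos, false_neg):
    """Lesion-level precision and recall from detection counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


def relative_percentage_differences(model_cc, manual_cc):
    """Signed difference (%) of each model volume relative to its manual reference.

    Assumption: the manual volume is the denominator (reference standard).
    """
    return [100.0 * (m - r) / r for m, r in zip(model_cc, manual_cc)]


def mrpd(model_cc, manual_cc):
    """Median relative percentage difference (negative means underestimation)."""
    return median(relative_percentage_differences(model_cc, manual_cc))


def marpd(model_cc, manual_cc):
    """Median absolute relative percentage difference (spread, sign ignored)."""
    return median(abs(d) for d in relative_percentage_differences(model_cc, manual_cc))


# Made-up volumes in cc for three matched lesions (illustrative only).
manual = [10.0, 4.0, 2.0]
model = [8.0, 5.0, 1.0]

print(precision_recall(true_pos=7, false_pos=3, false_neg=1))
print(mrpd(model, manual), marpd(model, manual))
```

Reporting both MRPD and MARPD is useful because signed errors can cancel: a median signed difference near zero can coexist with large per-patient deviations, which is the pattern the patient-level TB results describe.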