Nested block self-attention multiple resolution residual network for multiorgan segmentation from CT. Academic Article

Overview

abstract

  • BACKGROUND: Fast and accurate multiorgan segmentation from computed tomography (CT) scans is essential for radiation treatment planning. Self-attention (SA)-based deep learning methods provide higher accuracy than standard methods but require memory-intensive and computationally intensive calculations, which restricts their use to relatively shallow networks. PURPOSE: Our goal was to develop and test a new computationally fast and memory-efficient bidirectional SA method called nested block self-attention (NBSA), which is applicable to both shallow and deep multiorgan segmentation networks. METHODS: A new multiorgan segmentation method combining a deep multiple resolution residual network with a computationally efficient SA called nested block SA (MRRN-NBSA) was developed and evaluated to segment 18 different organs from the head and neck (HN) and the abdomen. MRRN-NBSA combines features from multiple image resolutions and feature levels with SA to extract organ-specific contextual features. Computational efficiency is achieved by using memory blocks of fixed spatial extent for SA calculation combined with bidirectional attention flow. Separate models were trained for HN (n = 238) and abdomen (n = 30). The HN model was tested on set-aside open-source grand challenge data (n = 10) from a public domain database of computational anatomy. The abdomen model underwent blinded testing on 20 cases from the Beyond the Cranial Vault data set, with overall accuracy provided by the grand challenge website. Robustness to two-rater segmentations was also evaluated for HN cases using the open-source data set. MRRN-NBSA was compared statistically against Unet, convolutional network-based SA using criss-cross attention (CCA), dual SA, and transformer-based (UNETR) methods by measuring the differences in the average Dice similarity coefficient (DSC) accuracy for all HN organs using the Kruskal-Wallis test, followed by individual method comparisons using paired, two-sided Wilcoxon signed-rank tests at the 95% confidence level, with Bonferroni correction for multiple comparisons. RESULTS: MRRN-NBSA produced a high average DSC of 0.88 for HN and 0.86 for the abdomen, exceeding current methods. MRRN-NBSA was more accurate than CCA, the most computationally efficient method (average DSC of 0.845 for HN, 0.727 for abdomen). The Kruskal-Wallis test showed a significant difference between the evaluated methods (p = 0.00025). Pairwise comparisons showed significant differences between MRRN-NBSA and Unet (p = 0.0003), CCA (p = 0.030), dual SA (p = 0.038), and UNETR (p = 0.012) after Bonferroni correction. MRRN-NBSA produced less variable segmentations for submandibular glands (0.82 ± 0.06) than the two raters (0.75 ± 0.31). CONCLUSIONS: MRRN-NBSA produced more accurate multiorgan segmentations than current methods on two different public data sets. Testing on larger institutional cohorts is required to establish feasibility for clinical use.
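The abstract reports accuracy as the Dice similarity coefficient (DSC) and p-values after Bonferroni correction. As a minimal sketch of those two quantities only, not the authors' evaluation code, the standard definitions can be written in plain Python. The function names, the toy masks, and the convention for two empty masks are assumptions made for illustration.

```python
from typing import List, Sequence


def dice_coefficient(pred: Sequence[int], ref: Sequence[int]) -> float:
    """DSC = 2 * |A and B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as organ in both masks.
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    # Total organ voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy flattened masks (hypothetical data, not taken from the study).
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 1, 0]
dsc = dice_coefficient(predicted, reference)

# Hypothetical raw p-values from four pairwise comparisons.
adjusted = bonferroni([0.001, 0.02, 0.3, 0.5])
```

A DSC of 1 means the two masks coincide exactly and 0 means they share no voxels. The Bonferroni step is the simplest correction for multiple comparisons; it is conservative, which is why the adjusted values are capped at 1.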

publication date

  • June 8, 2022

Research

keywords

  • Image Processing, Computer-Assisted
  • Tomography, X-Ray Computed

Identity

PubMed Central ID

  • PMC9908007

Scopus Document Identifier

  • 85131374186

Digital Object Identifier (DOI)

  • 10.1002/mp.15765

PubMed ID

  • 35598077

Additional Document Info

volume

  • 49

issue

  • 8