Artificial intelligence for automated identification of total shoulder arthroplasty implants.
Academic Article
Abstract
BACKGROUND: Accurate and rapid identification of implant manufacturer and model is critical in the evaluation and management of patients requiring revision total shoulder arthroplasty (TSA). Failure to correctly identify implant designs in these circumstances may lead to delays in care, unexpected intraoperative challenges, increased morbidity, and excess health care costs. Deep learning (DL) permits automated image processing and holds the potential to mitigate such challenges while improving the value of care rendered. The purpose of this study was to develop an automated DL algorithm to identify shoulder arthroplasty implants from plain radiographs.
METHODS: A total of 3060 postoperative images were included from patients who underwent TSA between 2011 and 2021, performed by 26 fellowship-trained surgeons at 2 independent tertiary academic hospitals in the Pacific Northwest and Mid-Atlantic Northeast. A DL algorithm was trained using transfer learning and data augmentation to classify 22 different reverse TSA and anatomic TSA prostheses from 8 implant manufacturers. Images were split into training and testing cohorts (2448 training and 612 testing images). Optimized model performance was assessed using standardized metrics, including the multiclass area under the receiver operating characteristic curve (AUROC), and compared with a reference standard of implant data from operative reports.
RESULTS: The algorithm classified implants in a mean of 0.079 seconds (±0.002 seconds) per image. The optimized model discriminated between 8 manufacturers (22 unique implants) with AUROCs of 0.994-1.000, accuracy of 97.1%, and sensitivities between 0.80 and 1.00 on the independent testing set. In the subset of single-institution implant predictions, a DL model identified 6 specific implants with AUROCs of 0.999-1.000, accuracy of 99.4%, and sensitivity >0.97 for all implants. Saliency maps revealed key differentiating features across implant manufacturers and designs that the algorithm recognized for classification.
CONCLUSION: A DL model demonstrated excellent accuracy in identifying 22 unique TSA implants from 8 manufacturers. This algorithm may provide a clinically meaningful adjunct in assisting with preoperative planning for the failed TSA and allows for scalable expansion with additional radiographic data and validation efforts.
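The evaluation described in the abstract (a held-out testing cohort, per-implant sensitivity, and multiclass AUROC) can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The abstract does not state how the split was drawn or how the multiclass AUROC was aggregated, so the stratified 80/20 split and the one-vs-rest formulation below are assumptions, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Hold out roughly test_fraction of each class (assumed split strategy)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def binary_auroc(scores, positives):
    """AUROC as the probability that a positive outscores a negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a sketch but slow for large test sets.
    """
    pos = [s for s, p in zip(scores, positives) if p]
    neg = [s for s, p in zip(scores, positives) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auroc(y_true, probs, classes):
    """Per-class AUROC: class c against all others, scored by P(c)."""
    return {
        c: binary_auroc([row[k] for row in probs], [y == c for y in y_true])
        for k, c in enumerate(classes)
    }


def sensitivity_per_class(y_true, y_pred, classes):
    """Sensitivity (recall) per class: correct predictions of c / true c."""
    out = {}
    for c in classes:
        total = sum(1 for y in y_true if y == c)
        hits = sum(1 for y, p in zip(y_true, y_pred) if y == c and p == c)
        out[c] = hits / total if total else float("nan")
    return out
```

In use, `probs` would hold one row of predicted class probabilities per test image, in the same order as `classes`, and `y_true` would come from the reference standard (here, implant data from operative reports). Reporting the minimum and maximum of the per-class values gives ranges of the kind quoted in the RESULTS.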