External validation of the Norwegian anterior cruciate ligament reconstruction revision prediction model using patients from the STABILITY 1 Trial.
Academic Article
Abstract
PURPOSE: A machine learning-based anterior cruciate ligament (ACL) revision prediction model has been developed using Norwegian Knee Ligament Register (NKLR) data, but it lacks external validation outside Scandinavia. This study aimed to assess the external validity of the NKLR model (https://swastvedt.shinyapps.io/calculator_rev/) using the STABILITY 1 randomized clinical trial (RCT) data set. The hypothesis was that model performance in this data set would be similar to that reported in the original NKLR study.

METHODS: The NKLR Cox Lasso model was selected for external validation owing to its superior performance in the original study. STABILITY 1 patients with data for all five predictors required by the Cox Lasso model were included. STABILITY 1 was a prospective RCT that randomized patients to receive either a hamstring tendon autograft (HT) alone or HT plus a lateral extra-articular tenodesis (LET). Because all patients in the trial received HT ± LET, three configurations were tested: (1) all patients coded as HT; (2) the HT + LET group coded as bone-patellar tendon-bone (BPTB) autograft; and (3) the HT + LET group coded as unknown/other graft choice. Model performance was assessed via concordance and calibration.

RESULTS: In total, 591/618 (95.6%) STABILITY 1 patients were eligible for inclusion, of whom 39 (6.6%) underwent revision within 2 years. Model performance was best when patients receiving HT + LET were coded as BPTB. Concordance was similar to that of the original NKLR prediction model for 1- and 2-year revision prediction (STABILITY: 0.71; NKLR: 0.68-0.69). The 95% confidence interval (CI) for concordance ranged from 0.63 to 0.79. The model was well calibrated for 1-year prediction, whereas the 2-year prediction showed evidence of miscalibration.

CONCLUSION: When patients in STABILITY 1 who received HT + LET were coded as BPTB in the NKLR prediction model, concordance was similar to the index study. However, because of the wide 95% CI, the true performance of the prediction model in this Canadian and European cohort is unclear, and a larger data set is required to definitively determine its external validity. Further, the better calibration of 1-year than 2-year predictions is consistent with the general difficulty of prediction modelling over longer periods. Although the sample was not large enough to establish the true accuracy and external validity of the prediction model when applied to North American patients, this analysis provides further support for the notion that HT plus LET performs similarly to BPTB reconstruction. In addition, despite the wide CI, this study gives grounds for optimism regarding the accuracy of the model when applied outside Scandinavia.

LEVEL OF EVIDENCE: Level 3, cohort study.
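
For readers unfamiliar with the two mechanics named in the abstract, the following is a minimal illustrative sketch in Python, not the study's code. It shows how the three graft-coding configurations could be expressed and how a concordance statistic (here Harrell's C, one common definition) is computed from follow-up times, revision indicators, and model risk scores. The function names `recode_graft` and `harrell_c`, the handling of tied times, and the toy values are all assumptions made for illustration; the abstract does not state which concordance estimator or software was used.

```python
from itertools import combinations


def recode_graft(received_let: bool, configuration: int) -> str:
    """Hypothetical helper: map a STABILITY 1 arm to a model graft category.

    configuration 1: all patients coded as HT
    configuration 2: HT + LET coded as BPTB
    configuration 3: HT + LET coded as unknown/other
    """
    if configuration not in (1, 2, 3):
        raise ValueError("configuration must be 1, 2, or 3")
    if not received_let or configuration == 1:
        return "HT"
    return "BPTB" if configuration == 2 else "unknown/other"


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair is comparable when the patient with the shorter follow-up
    had the event (revision). The pair is concordant when that patient
    also has the higher predicted risk; tied risks count as one half.
    Pairs with tied times are skipped (a simplifying assumption).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        early, late = (i, j) if times[i] < times[j] else (j, i)
        if not events[early]:
            continue  # earlier follow-up was censored: ordering unknown
        comparable += 1
        if risk_scores[early] > risk_scores[late]:
            concordant += 1.0
        elif risk_scores[early] == risk_scores[late]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented for illustration only, not trial data).
toy_times = [1, 2, 3, 4]           # follow-up time
toy_events = [1, 1, 0, 0]          # 1 = revision observed, 0 = censored
toy_risk = [0.9, 0.3, 0.5, 0.1]    # model-predicted risk
toy_c = harrell_c(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.71 (95% CI 0.63 to 0.79) should be read. Concordance measures discrimination only; calibration, the agreement between predicted and observed revision risk, is a separate assessment.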