Characterization of an artificial intelligence model for ranking static images of blastocyst stage embryos.
Academic Article
Abstract
OBJECTIVE: To perform a series of analyses characterizing an artificial intelligence (AI) model for ranking blastocyst-stage embryos. The primary objective was to evaluate the benefit of the model for predicting clinical pregnancy, whereas the secondary objective was to identify limitations that may impact clinical use.

DESIGN: Retrospective study.

SETTING: Consortium of 11 assisted reproductive technology centers in the United States.

PATIENT(S): Static images of 5,923 transferred blastocysts and 2,614 nontransferred aneuploid blastocysts.

INTERVENTION(S): None.

MAIN OUTCOME MEASURE(S): Prediction of clinical pregnancy (fetal heartbeat).

RESULT(S): The area under the curve of the AI model ranged from 0.6 to 0.7 and outperformed manual morphology grading overall and on a per-site basis. A bootstrapped study predicted improved pregnancy rates between +5% and +12% per site using AI compared with manual grading using an inverted microscope. One site that used a low-magnification stereo zoom microscope did not show predicted improvement with the AI. Visualization techniques and attribution algorithms revealed that the features learned by the AI model largely overlap with the features of manual grading systems. Two sources of bias relating to the type of microscope and presence of embryo holding micropipettes were identified and mitigated. The analysis of AI scores in relation to pregnancy rates showed that score differences of ≥0.1 (10%) correspond with improved pregnancy rates, whereas score differences of <0.1 may not be clinically meaningful.

CONCLUSION(S): This study demonstrates the potential of AI for ranking blastocyst-stage embryos and highlights potential limitations related to image quality, bias, and granularity of scores.
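For readers unfamiliar with the area under the curve (AUC) reported above, the following is a minimal sketch of how such a value can be computed from model scores and binary pregnancy outcomes, together with a simple bootstrap interval for it. It is not the authors' code: the function names `auc` and `bootstrap_auc_interval`, the resampling settings, and the toy numbers are all hypothetical, and the bootstrap shown here estimates uncertainty in the AUC only. It does not reproduce the study's bootstrapped comparison of pregnancy rates.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case; ties count as half.
    Labels are assumed to be 1 (pregnancy) or 0 (no pregnancy).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_interval(scores, labels, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (illustrative settings)."""
    auc(scores, labels)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the classes
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Toy, made-up values for illustration only (not study data).
toy_scores = [0.9, 0.4, 0.6, 0.2]
toy_labels = [1, 1, 0, 0]
print(auc(toy_scores, toy_labels))  # 0.75
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which places the reported range of 0.6 to 0.7 in context. The pairwise loop is quadratic in the number of embryos; for datasets of the size described above, a rank-based implementation from a statistics library would be the more practical choice.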