Predicting Cumulative Live Birth Rate for Patients Undergoing In Vitro Fertilization (IVF)/Intracytoplasmic Sperm Injection (ICSI) for Tubal and Male Infertility: A Machine Learning Approach Using XGBoost
Infertility is a growing global concern, and the use of assisted reproductive technology (ART) has surged in recent years. Among ART methods, in vitro fertilization (IVF) and intracytoplasmic sperm injection (ICSI) are widely used to treat infertility. However, the success rates of these treatments are influenced by numerous factors, and patients often face high costs and potential risks, such as ovarian hyperstimulation syndrome, infections, and multiple pregnancies. Accurate prediction of ART outcomes is crucial to optimize treatment strategies and improve patient counseling. Traditional statistical models, such as logistic regression, have been employed to predict outcomes such as ovarian response to stimulation, pregnancy, and adverse obstetric events. Despite their widespread use, these models often show limited predictive performance, highlighting the need for more advanced approaches.
The rapid advancement of computer technology has facilitated the integration of artificial intelligence (AI) and machine learning (ML) into medical research. In many prediction tasks, these methods have outperformed conventional statistical techniques. Among ML algorithms, eXtreme Gradient Boosting (XGBoost) has gained recognition for its ability to analyze complex datasets. XGBoost, a decision-tree-based algorithm, has been successfully applied in various medical prediction tasks, including disease diagnosis and prognosis. Its robustness in handling missing data and its capacity to combine many weak predictive models into a strong classifier make it particularly suitable for complex medical datasets.
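The idea of combining many weak predictive models into a strong classifier can be sketched in a few lines. The example below is a minimal illustration of plain gradient boosting with one-split trees ("stumps") and logistic loss. It is not XGBoost itself, which adds second-order gradient information, regularization, and missing-value handling, and it is not the study's model. The ages and outcomes are invented toy values, and every function name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# A stump is (split threshold, leaf value for x <= threshold, leaf value for x > threshold).
Stump = Tuple[float, float, float]

# Invented toy data for illustration only (not from the study).
TOY_AGE = [25.0, 28.0, 30.0, 33.0, 36.0, 39.0, 41.0, 44.0]
TOY_LIVE_BIRTH = [1, 1, 1, 0, 1, 0, 0, 0]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(x: Sequence[float], residual: Sequence[float]) -> Stump:
    """Least-squares fit of a one-split tree to the current residuals."""
    best_sse, best_stump = math.inf, None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residual) if xi <= thr]
        right = [r for xi, r in zip(x, residual) if xi > thr]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if sse < best_sse:
            best_sse, best_stump = sse, (thr, lv, rv)
    return best_stump


def raw_score(stumps: Sequence[Stump], lr: float, xi: float) -> float:
    """Sum of the shrunken stump outputs (a log-odds score)."""
    return sum(lr * (lv if xi <= thr else rv) for thr, lv, rv in stumps)


def boost(x: Sequence[float], y: Sequence[int],
          rounds: int = 50, lr: float = 0.5) -> List[Stump]:
    """Each round fits a weak stump to the negative gradient of the log loss."""
    stumps: List[Stump] = []
    for _ in range(rounds):
        residual = [yi - sigmoid(raw_score(stumps, lr, xi)) for xi, yi in zip(x, y)]
        stumps.append(fit_stump(x, residual))
    return stumps


def mean_log_loss(stumps: Sequence[Stump], lr: float,
                  x: Sequence[float], y: Sequence[int]) -> float:
    total = 0.0
    for xi, yi in zip(x, y):
        p = sigmoid(raw_score(stumps, lr, xi))
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(x)
```

On the toy data no single stump separates the two classes, but the training loss falls as more stumps are added, which is the sense in which weak learners accumulate into a stronger one.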
This study aimed to develop a prediction model using XGBoost to estimate the cumulative live birth rate (CLBR) for patients undergoing IVF/ICSI treatment for tubal or male infertility. The study also sought to compare the performance of the XGBoost model with a conventional logistic regression model to evaluate its clinical utility.
The study retrospectively analyzed data from 3,012 patients who underwent IVF/ICSI treatment at Peking Union Medical College Hospital in China between July 2014 and March 2018. Patients with donor oocyte or sperm use, endometriosis, endocrine diseases (e.g., hyperandrogenism, diabetes, or thyroid diseases), or missing data were excluded. The dataset included clinical characteristics, sex hormone levels, and controlled ovarian hyperstimulation (COH) features. Key variables included age, body mass index (BMI), infertility type, infertility duration, and COH protocol (e.g., gonadotropin-releasing hormone agonist [GnRH-a] long protocol, GnRH-a ultra-long protocol, GnRH-a short protocol, GnRH antagonist protocol, and mini-stimulation protocol). Sex hormone levels, including follicle-stimulating hormone (FSH), estradiol (E2), luteinizing hormone (LH), prolactin (PRL), and testosterone (T), were measured at two time points: baseline (day 0) and the second day after trigger (day 1). The primary outcome was live birth, defined as the delivery of a live newborn after 28 weeks of gestation. The cumulative outcome included the first fresh cycle and all subsequent freeze-thaw cycles from the same ovarian stimulation.
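The cumulative outcome definition can be made concrete with a short sketch. The record layout (`Transfer`) and function names below are hypothetical, and treating exactly 28 weeks as qualifying is an assumption about the boundary; the study's own data schema is not described in this summary.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transfer:
    """One embryo transfer from a single ovarian stimulation (hypothetical layout)."""
    kind: str                 # "fresh" or "frozen"
    gestational_weeks: float  # 0.0 if no pregnancy resulted
    live_newborn: bool


def is_live_birth(t: Transfer) -> bool:
    # Definition from the text: a live newborn delivered after 28 weeks.
    # Counting exactly 28 weeks as qualifying is an assumption.
    return t.live_newborn and t.gestational_weeks >= 28


def cumulative_live_birth(transfers: List[Transfer]) -> bool:
    # Success if the fresh cycle or any freeze-thaw cycle from the
    # same stimulation ends in a live birth.
    return any(is_live_birth(t) for t in transfers)


def cumulative_live_birth_rate(patients: List[List[Transfer]]) -> float:
    # Share of patients with at least one live birth from their stimulation.
    return sum(cumulative_live_birth(p) for p in patients) / len(patients)


# Invented example records, for illustration only.
EXAMPLE_PATIENTS = [
    [Transfer("fresh", 0.0, False), Transfer("frozen", 39.0, True)],
    [Transfer("fresh", 0.0, False)],
    [Transfer("fresh", 26.0, True)],
    [Transfer("fresh", 38.0, True)],
]
```

In this invented example, two of the four patients count as successes: one through a freeze-thaw cycle after a failed fresh cycle, and one through the fresh cycle directly.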
Statistical analyses were performed using R and EmpowerStats software. A conventional logistic regression model was developed using backward stepwise variable selection with bootstrap resampling. The XGBoost model was constructed using the open-source XGBoost package, which was used to rank feature importance and to determine the probability threshold for classifying live births. The predictive performance of both models was evaluated using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the receiver operating characteristic curve (AUC). Calibration curves and decision curve analysis (DCA) were used to assess calibration and clinical utility.
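For readers less familiar with these metrics, the sketch below shows how they follow from predicted probabilities and observed outcomes. It is a generic illustration in Python rather than the study's R code; the function names are hypothetical and the six data points are invented.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                           threshold: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV at a given probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positives among actual positives
        "specificity": ratio(tn, tn + fp),  # true negatives among actual negatives
        "ppv": ratio(tp, tp + fp),          # true positives among predicted positives
        "npv": ratio(tn, tn + fn),          # true negatives among predicted negatives
    }


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Invented outcomes and predicted probabilities, for illustration only.
Y_TRUE = [1, 1, 1, 0, 0, 0]
Y_PROB = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
```

The first four metrics depend on the chosen probability threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the two models are compared primarily on AUC.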
The dataset included 2,101 IVF cases and 911 ICSI cases. The XGBoost model identified age, estradiol levels on the second day after trigger (E21), PRL levels on the second day after trigger (PRL1), basal LH levels (LH0), LH levels on the second day after trigger (LH1), basal estradiol levels (E20), basal PRL levels (PRL0), and total FSH consumption as the most important features for predicting live birth. The conventional logistic regression model selected age, secondary infertility, ICSI, number of previous IVF cycles, total FSH consumption, basal FSH levels (FSH0), basal testosterone levels (T0), PRL1, LH1, E21, progesterone levels on the second day after trigger (P1), and testosterone levels on the second day after trigger (T1) as significant predictors.
The XGBoost model demonstrated superior predictive performance compared to the conventional logistic regression model. The AUC for the XGBoost model was 0.901 (95% confidence interval [CI]: 0.890–0.912), significantly higher than the AUC of 0.724 (95% CI: 0.708–0.741) for the conventional model (P < 0.001). Both models showed good calibration for the probability of live birth. The DCA indicated that the XGBoost model provided a larger net benefit than the conventional model, underscoring its potential clinical value.
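The net benefit reported by DCA has a simple closed form: at a threshold probability pt, it is the true-positive fraction minus the false-positive fraction weighted by the odds pt / (1 - pt). The sketch below illustrates this standard formula on invented data. The function names are hypothetical, and it does not reproduce the study's analysis.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], pt: float) -> float:
    """Net benefit of acting on predictions >= pt (requires 0 < pt < 1)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    # False positives are penalized by the odds of the threshold probability.
    return tp / n - (fp / n) * (pt / (1.0 - pt))


def net_benefit_treat_all(y_true: Sequence[int], pt: float) -> float:
    """Reference strategy: classify every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (pt / (1.0 - pt))


# Invented outcomes and predicted probabilities, for illustration only.
DCA_Y_TRUE = [1, 1, 1, 0, 0, 0]
DCA_Y_PROB = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
```

A decision curve plots these values over a range of pt. A model is considered clinically useful at the thresholds where its curve lies above both the classify-all reference and the classify-none reference, which has a net benefit of zero.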
The findings of this study highlight the advantages of using machine learning, particularly XGBoost, for predicting IVF/ICSI outcomes. The XGBoost model’s higher discriminative ability and net benefit suggest that it could be a valuable tool for personalized patient counseling and treatment planning. Accurate prediction of live birth rates can help patients make informed decisions and optimize their chances of success while minimizing risks and costs.
Previous studies have explored various models to estimate the chances of live birth following ART. The McLernon model, based on UK national data, is one of the most widely used models, offering a personalized estimate of cumulative live birth rates with a C-index of 0.72–0.73. However, this model does not include certain factors, such as anti-Müllerian hormone (AMH) and BMI, which are potential predictors of live birth. Other studies have employed machine learning algorithms, such as XGBoost and random forests, to predict ART outcomes. For example, Qiu et al. used XGBoost to predict live birth rates with an AUC of 0.73, while Amini et al. reported that random forests achieved the best performance with an AUC of 0.81. However, none of these studies directly compared the performance of machine learning models with conventional prediction models, as done in this study.
Despite its promising results, this study has several limitations. Its retrospective design introduces potential biases, and the single-center recruitment limits the generalizability of the findings. Additionally, the lack of external validation restricts the model’s immediate clinical application. Future research should focus on validating the model in larger, multi-center cohorts to confirm its robustness and applicability across diverse populations.
In conclusion, this study developed a prediction model using XGBoost to estimate the cumulative live birth rate for patients undergoing IVF/ICSI treatment for tubal or male infertility. The XGBoost model demonstrated higher discriminative ability and net benefit than a conventional logistic regression model, highlighting its potential clinical value. Pending external validation, the model could support personalized patient counseling and treatment planning, helping patients weigh their chances of a live birth against the risks and costs of treatment.
doi.org/10.1097/CM9.0000000000001874