Prediction of Fatal Adverse Prognosis in Patients with Fever-Related Diseases Based on Machine Learning: A Retrospective Study
Fever is one of the most common chief complaints among emergency department (ED) patients and is a significant pathophysiological process associated with numerous febrile diseases. It can be a symptom of both infectious and non-infectious conditions, including sepsis, malignancy, tissue ischemia, cerebrovascular accidents, and autoimmune diseases. Early identification of patients at an increased risk of death due to fever-related illnesses is crucial to prevent adverse outcomes. However, the complexity of fever-related diseases makes it challenging to diagnose and predict clinical outcomes using traditional methods. This study aimed to establish an early prediction model for fatal adverse prognosis in fever patients by leveraging big data technology and machine learning algorithms.
Background and Significance
Fever is a common symptom that can indicate a wide range of underlying conditions. In recent years, public health events such as severe acute respiratory syndrome (SARS) and other infectious diseases have highlighted the importance of early detection and management of fever-related illnesses. However, no single biomarker can definitively diagnose sepsis or predict its clinical outcome. Traditional illness severity scoring systems, such as the Acute Physiology and Chronic Health Evaluation II (APACHE II), are often too complex and not specific to fever patients. Machine learning, with its ability to analyze large and complex datasets, has shown promise in outperforming traditional clinical decision rules for predicting in-hospital mortality in emergency department patients with sepsis. This study aimed to explore key factors associated with adverse prognosis in fever patients and develop an effective prediction model using machine learning techniques.
Methods
Study Design and Data Collection
This retrospective study analyzed clinical data from 28,400 patients admitted to the emergency room of the Chinese People’s Liberation Army General Hospital between November 2014 and March 2018. The inclusion criteria were patients with a body temperature of 37.3°C or higher and aged 12 years or older. Patients who died within four hours of admission or failed to complete laboratory examinations were excluded. The patients were divided into two groups: the fatal adverse prognosis group (those who experienced cardiopulmonary resuscitation or died during emergency treatment) and the good prognosis group (those who did not die or require advanced interventions during treatment).
Data were extracted from the Emergency Rescue Database of the hospital, including demographic information, vital signs, laboratory results, and other clinical indicators. Only the first set of data obtained within 24 hours of the ED visit was used for analysis. Variables with more than 30% missing data were removed, resulting in a final dataset of 39 variables for 3,682 patients (3,474 in the good prognosis group and 208 in the adverse prognosis group).
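The inclusion thresholds and the 30% missing-data rule can be sketched in a few lines. This is a minimal illustration, not the study's code: the field names (temperature_c, age_years, ctnt, lip) and the toy records are hypothetical, the exclusion criteria (death within four hours, incomplete laboratory examinations) are omitted, and applying the missing-data rule after the eligibility filter is an assumption.

```python
# Hypothetical field names; the source does not describe the database schema.
MIN_TEMP_C = 37.3            # inclusion: body temperature of 37.3 C or higher
MIN_AGE_YEARS = 12           # inclusion: aged 12 years or older
MAX_MISSING_FRACTION = 0.30  # variables with more than 30% missing are removed


def is_eligible(record):
    """Apply the stated inclusion criteria to one patient record."""
    temp = record.get("temperature_c")
    age = record.get("age_years")
    if temp is None or age is None:
        return False
    return temp >= MIN_TEMP_C and age >= MIN_AGE_YEARS


def drop_sparse_variables(records, variables, max_missing=MAX_MISSING_FRACTION):
    """Return the variables whose missing fraction does not exceed max_missing."""
    kept = []
    for name in variables:
        missing = sum(1 for r in records if r.get(name) is None)
        if missing / len(records) <= max_missing:
            kept.append(name)
    return kept


# Toy records for illustration only; these are not study data.
records = [
    {"temperature_c": 38.5, "age_years": 64, "ctnt": 0.02, "lip": None},
    {"temperature_c": 37.1, "age_years": 30, "ctnt": 0.01, "lip": None},
    {"temperature_c": 39.0, "age_years": 10, "ctnt": None, "lip": 40.0},
    {"temperature_c": 37.3, "age_years": 12, "ctnt": 0.05, "lip": None},
    {"temperature_c": 38.0, "age_years": 45, "ctnt": 0.03, "lip": 55.0},
]

eligible = [r for r in records if is_eligible(r)]
kept = drop_sparse_variables(eligible, ["ctnt", "lip"])
```

In this toy example three records pass the inclusion filter, and the hypothetical lip variable is dropped because it is missing in two of the three eligible records.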
Data Analysis and Feature Selection
Data cleaning and preprocessing were performed to handle missing values and erroneous entries. Baseline descriptive analysis was conducted, with continuous variables expressed as mean ± standard deviation or median (interquartile range), depending on their distribution. Differences between the two groups were tested using t-tests or Mann-Whitney U tests for numerical variables.
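One way to choose between the two tests for a numerical variable is sketched below with SciPy. The Shapiro-Wilk normality check, the 0.05 threshold, and the use of Welch's unequal-variance t-test are assumptions; the source says only that the choice depended on each variable's distribution. The LDH-like values are synthetic.

```python
import numpy as np
from scipy import stats


def compare_groups(good, adverse, alpha=0.05):
    """Pick a t-test or a Mann-Whitney U test based on a normality check.

    The Shapiro-Wilk check and alpha are assumptions, not the study's rule.
    """
    good = np.asarray(good, dtype=float)
    adverse = np.asarray(adverse, dtype=float)
    _, p_good = stats.shapiro(good)
    _, p_adverse = stats.shapiro(adverse)
    if p_good > alpha and p_adverse > alpha:
        # Both groups look approximately normal: Welch's t-test.
        _, p_value = stats.ttest_ind(good, adverse, equal_var=False)
        return "t-test", p_value
    # Otherwise fall back to the rank-based test.
    _, p_value = stats.mannwhitneyu(good, adverse, alternative="two-sided")
    return "Mann-Whitney U", p_value


# Synthetic, strongly right-skewed values standing in for a laboratory variable.
rng = np.random.default_rng(0)
good_ldh = rng.lognormal(mean=5.3, sigma=1.0, size=300)
adverse_ldh = rng.lognormal(mean=5.8, sigma=1.0, size=60)

test_name, p_value = compare_groups(good_ldh, adverse_ldh)
```

Skewed laboratory values such as these fail the normality check, so the sketch falls back to the Mann-Whitney U test and would report them as median (interquartile range).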
Recursive feature elimination (RFE) was used to determine the optimal number of variables for the prediction model. The decision tree algorithm was employed to calculate the prediction accuracy of all subsets of variable combinations, and the subset with the highest accuracy was selected. Pearson correlation coefficients were used to analyze the relationships between variables, and a heatmap was generated to visualize the results.
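The selection step can be approximated with scikit-learn's RFECV, which removes one variable at a time and keeps the subset size with the best cross-validated accuracy. This is a sketch under assumptions: RFECV does not evaluate every possible subset, the five-fold stratified split is a choice made here, and the data are synthetic with 39 columns to mirror the number of variables in the final dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the 39-variable dataset (not study data).
X, y = make_classification(
    n_samples=600,
    n_features=39,
    n_informative=8,
    n_redundant=4,
    weights=[0.94, 0.06],
    random_state=0,
)

# Recursive feature elimination driven by a decision tree, scored by accuracy.
selector = RFECV(
    estimator=DecisionTreeClassifier(random_state=0),
    step=1,  # drop one variable per iteration
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="accuracy",
)
selector.fit(X, y)

n_selected = selector.n_features_  # subset size with the best CV accuracy
selected_mask = selector.support_  # boolean mask over the 39 columns

# Pearson correlation matrix between variables, the input to a heatmap.
corr = np.corrcoef(X, rowvar=False)
```

The matrix corr could then be passed to a plotting library to draw the heatmap; plotting is left out to keep the sketch self-contained.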
Machine Learning Models
Four machine learning algorithms were selected for model training: logistic regression, random forest, AdaBoost, and bagging. The dataset was split into training and testing sets in a 7:3 ratio. Model performance was evaluated using accuracy, F1-score, precision, sensitivity, and the area under the receiver operating characteristic curve (ROC-AUC). Ten-fold cross-validation was performed to validate the models. Additionally, external validation was conducted using emergency room data from December 2018 to December 2019, following the same inclusion and exclusion criteria.
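The training and evaluation procedure described above can be sketched with scikit-learn as follows. The data are synthetic, all hyperparameters are library defaults, and the stratified split, the random seeds, and running cross-validation on the training portion are assumptions. The source does not state how F1-score and precision were averaged across classes, so the positive-class (binary) definitions are used here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import (StratifiedKFold, cross_val_score,
                                     train_test_split)

# Synthetic, imbalanced data with 15 columns (not study data).
X, y = make_classification(n_samples=1000, n_features=15, n_informative=6,
                           weights=[0.9, 0.1], random_state=0)

# 7:3 split; stratification is an assumption that preserves the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "bagging": BaggingClassifier(random_state=0),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    prob = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred, zero_division=0),
        "precision": precision_score(y_test, pred, zero_division=0),
        "sensitivity": recall_score(y_test, pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, prob),
        # Ten-fold cross-validated ROC-AUC on the training portion.
        "cv_roc_auc": cross_val_score(model, X_train, y_train, cv=cv,
                                      scoring="roc_auc").mean(),
    }
```

With a rare positive class, accuracy can stay high even when sensitivity is low, which is why the metrics are reported side by side.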
Results
Baseline Analysis
The baseline analysis revealed significant differences in several variables between the adverse prognosis and good prognosis groups. Key variables included heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), diastolic blood pressure (DBP), pulse oxygen saturation (SPO2), temperature (T), creatine kinase myocardial isoenzyme (CK-MB), lactate dehydrogenase (LDH), serum amylase (AMY), cardiac troponin T (CTnT), serum potassium (K), total protein (TP), and albumin (ALB). Patients in the adverse prognosis group had higher levels of CK-MB, LDH, AMY, CTnT, and K, and lower levels of TP and ALB compared to the good prognosis group.
Feature Selection
The RFE method identified 15 key variables for the prediction model: HR, RR, SBP, DBP, SPO2, T, CK-MB, total bilirubin (TBIL), LDH, AMY, serum lipase (LIP), CTnT, K, TP, and ALB. The top six variables with the highest correlation to adverse prognosis were CTnT, LDH, SPO2, CK-MB, HR, and T. When the number of variables was reduced to 11, the selected variables were RR, SPO2, T, sodium (Na), chloride (Cl), CTnT, K, ALB, calcium (Ca), phosphorus (P), and red blood cell count (RBC).
Model Performance
In the training phase, the logistic regression model achieved the highest accuracy (0.951), while the bagging model had the highest ROC-AUC (0.885). The F1-scores for logistic regression, decision tree, AdaBoost, and bagging were 0.938, 0.933, 0.930, and 0.930, respectively. The precision values were 0.943, 0.938, 0.937, and 0.937, and the sensitivity values were 0.571, 0.524, 0.524, and 0.762, respectively. Ten-fold cross-validation confirmed the robustness of the logistic regression and bagging models, with ROC-AUC values of 0.80 and 0.87, respectively.
In the external validation phase, the decision tree model achieved the highest accuracy (0.901) and F1-score (0.915), while the bagging model had the highest sensitivity (0.605) and ROC-AUC (0.863).
Key Variables and Their Significance
The logistic regression model identified CTnT, T, RR, K, SPO2, and ALB as the top six variables with the highest coefficients and odds ratios (OR). CTnT had the highest coefficient (0.346) and OR (1.413), indicating its strong association with adverse prognosis. In the bagging model, the top six variables by weight were CTnT, RR, LDH, AMY, HR, and SBP.
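The reported odds ratio for CTnT follows directly from its coefficient, because in logistic regression the odds ratio is the exponential of the coefficient. A minimal check using only the value given in the text:

```python
import math

ctnt_coefficient = 0.346  # logistic regression coefficient reported in the text
ctnt_odds_ratio = math.exp(ctnt_coefficient)  # OR = exp(coefficient)

print(round(ctnt_odds_ratio, 3))  # 1.413
```

Under the usual interpretation, a one-unit increase in the predictor multiplies the odds of an adverse prognosis by about 1.41, with the other variables held fixed. Whether the predictors were standardized before fitting is not stated in the source, so the unit is left unspecified here.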
Discussion
This study demonstrated the potential of machine learning in predicting fatal adverse prognosis in fever patients. The models identified key clinical indicators, including CTnT, RR, SPO2, T, ALB, and K, which should be closely monitored in clinical practice. CTnT, a marker of myocardial injury, was the most significant predictor of adverse prognosis, highlighting the importance of cardiac function in fever-related illnesses. Hypoproteinemia, as indicated by low ALB levels, was also a critical factor, emphasizing the need for timely correction of protein levels in fever patients.
Vital signs such as HR, RR, SBP, and DBP were found to have significant clinical value in predicting adverse outcomes. SPO2 was an independent protective factor, which underscores the importance of maintaining adequate oxygenation in fever patients. Additionally, serum sodium and potassium levels were identified as independent risk factors, suggesting the need for careful electrolyte management in critically ill patients.
The logistic regression and bagging models showed the best overall performance, with high accuracy and ROC-AUC values. The bagging model, in particular, demonstrated superior sensitivity, making it a valuable tool for early identification of high-risk patients. The reduction of variables to 11 did not significantly compromise model performance, making it more practical for clinical application.
Limitations
This study has two main limitations. First, the cohort included patients with diverse causes of fever, and the accuracy of the model for specific fever etiologies was not evaluated. Future research should focus on subgroup analysis based on the underlying causes of fever. Second, the retrospective nature of the study and the relatively small dataset may limit the generalizability of the findings. Larger, prospective studies are needed to validate the model and improve its predictive power.
Conclusion
This study successfully established a machine learning-based prediction model for fatal adverse prognosis in fever patients. The model identified key clinical indicators, including CTnT, RR, SPO2, T, ALB, and K, and demonstrated high diagnostic accuracy and reliability. The logistic regression and bagging models were particularly effective, offering valuable tools for early identification of critical patients. The application of big data analysis and machine learning in medical research has the potential to improve the diagnosis and treatment of fever-related illnesses, ultimately enhancing patient outcomes.
doi.org/10.1097/CM9.0000000000000675