Evaluation of Artificial Intelligence-Based Risk Scoring for Early Sepsis Detection in Emergency Department Admissions

Research Article | Volume 30 Issue 7 (July, 2025) | Pages 166 - 169

Mahimn Amit Joshi

Parthkumar Harsukhlal Kaneriya

Janvi Bhanjibhai Panchotiya

MBBS, University of Northern Philippines, Vigan City, Philippines

MBBS, University of Visayas Gullas College of Medicine, Cebu City, Philippines

Junior Resident, GMERS Medical College, Morbi, Gujarat, India

Under a Creative Commons license

Open Access

Received: June 21, 2025

Revised: June 29, 2025

Accepted: July 8, 2025

Published: July 19, 2025

Abstract

Background: Early identification of sepsis in emergency department (ED) patients remains a clinical challenge due to its nonspecific presentation. Traditional scoring systems often lack sensitivity or are time-consuming. Artificial intelligence (AI)-based risk scoring tools offer a promising alternative for real-time prediction by processing vast clinical data rapidly and accurately.

Materials and Methods: A retrospective observational study was conducted on 2,000 patients admitted to the ED of a tertiary care hospital over 12 months. An AI-based sepsis risk scoring model was developed using machine learning algorithms trained on vital signs, laboratory results, and demographic data. The AI model was evaluated against the conventional Systemic Inflammatory Response Syndrome (SIRS) criteria and the Quick Sequential Organ Failure Assessment (qSOFA) score. Performance was assessed based on sensitivity, specificity, positive predictive value (PPV), and area under the receiver operating characteristic curve (AUROC).

Results: Out of 2,000 patients, 300 (15%) were diagnosed with sepsis within 48 hours of admission. The AI model demonstrated superior performance with an AUROC of 0.92, compared to 0.78 for SIRS and 0.74 for qSOFA. Sensitivity and specificity for the AI model were 88% and 85%, respectively, while PPV was 68%. In contrast, SIRS showed a sensitivity of 72%, specificity of 62%, and PPV of 45%; qSOFA achieved 66% sensitivity, 70% specificity, and 49% PPV.

Conclusion: AI-based risk scoring significantly improves early sepsis detection in the ED setting compared to traditional scoring methods. Its implementation could support timely clinical decisions, potentially improving patient outcomes and reducing mortality. Further prospective validation is recommended.

Keywords

Sepsis

Artificial Intelligence

Risk Scoring

Emergency Department

Early Detection

Machine Learning

AUROC

Predictive Model

INTRODUCTION

Sepsis is a life-threatening organ dysfunction resulting from a dysregulated host response to infection and remains one of the leading causes of mortality in emergency departments (EDs) worldwide (1). Early identification and prompt initiation of therapy are crucial to improving outcomes, as delays in diagnosis are strongly associated with increased morbidity and mortality (2). However, the early clinical signs of sepsis are often subtle and overlap with other acute conditions, posing a significant diagnostic challenge, especially in high-pressure ED environments (3).

Traditional clinical tools such as the Systemic Inflammatory Response Syndrome (SIRS) criteria, the Sequential Organ Failure Assessment (SOFA), and its simplified version, the quick SOFA (qSOFA), have been widely used for early sepsis detection. Nevertheless, these tools are limited in their predictive accuracy and may fail to identify sepsis at an early stage, particularly in non-ICU settings (4,5). Additionally, reliance on laboratory-based scoring systems can delay diagnosis and treatment initiation, which is detrimental in time-sensitive conditions like sepsis (6).

With the increasing availability of electronic health records and advancements in computational capabilities, artificial intelligence (AI) and machine learning (ML) models have emerged as promising alternatives for sepsis prediction. These models can analyze vast amounts of patient data in real time, identifying complex patterns that may be imperceptible to clinicians (7). Recent studies have demonstrated that AI-based models outperform traditional scoring systems in predicting sepsis onset, especially when integrated into clinical workflows (8,9). Furthermore, AI tools have the potential to provide continuous, dynamic risk assessments that adapt to changes in the patient's condition, offering a more responsive and personalized approach to sepsis detection (10).

The present study aims to evaluate the effectiveness of an AI-based risk scoring model for early detection of sepsis among patients admitted to the ED. The AI model's predictive performance was compared to that of established clinical tools, including SIRS and qSOFA, to assess its potential utility in real-world emergency care settings.

MATERIALS AND METHODS

This retrospective observational study included 2,000 adult patients (aged ≥18 years) who presented to the emergency department of a tertiary care hospital between January and December. Patients with incomplete clinical records or those discharged within 6 hours of admission were excluded. Sepsis was diagnosed by attending physicians according to Sepsis-3 criteria within 48 hours of admission, using the available clinical, laboratory, and radiological data.

Data Collection

Demographic details, vital signs (heart rate, respiratory rate, temperature, blood pressure, oxygen saturation), and initial laboratory results (complete blood count, serum lactate, C-reactive protein, and creatinine) were extracted from the hospital’s electronic medical records. Additionally, clinical parameters needed to calculate SIRS and qSOFA scores were recorded for each patient.
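As an illustration of how the two comparator scores can be derived from the recorded parameters, the sketch below applies the commonly published SIRS and qSOFA thresholds. These cut-offs are the standard definitions rather than values stated in this article, and the `Vitals` class, its field names, and the example patient are hypothetical. The PaCO2 and band-form components of SIRS are left out because the article does not list them among the extracted variables.

```python
from dataclasses import dataclass


@dataclass
class Vitals:
    """Hypothetical container for the parameters used by SIRS and qSOFA."""
    temperature_c: float
    heart_rate: float        # beats/min
    respiratory_rate: float  # breaths/min
    systolic_bp: float       # mmHg
    wbc_per_ul: float        # white blood cells per microlitre
    gcs: int                 # Glasgow Coma Scale, 3 to 15


def sirs_score(v: Vitals) -> int:
    """Count SIRS criteria met (standard thresholds; PaCO2 and bands omitted)."""
    return sum([
        v.temperature_c > 38.0 or v.temperature_c < 36.0,
        v.heart_rate > 90,
        v.respiratory_rate > 20,
        v.wbc_per_ul > 12000 or v.wbc_per_ul < 4000,
    ])


def qsofa_score(v: Vitals) -> int:
    """Count qSOFA criteria met (standard thresholds)."""
    return sum([
        v.respiratory_rate >= 22,
        v.systolic_bp <= 100,
        v.gcs < 15,  # any alteration in mentation
    ])


# Illustrative patient only; not taken from the study dataset.
patient = Vitals(temperature_c=38.3, heart_rate=106, respiratory_rate=24,
                 systolic_bp=96, wbc_per_ul=14000, gcs=15)
sirs = sirs_score(patient)
qsofa = qsofa_score(patient)
sirs_positive = sirs >= 2    # conventional positivity threshold
qsofa_positive = qsofa >= 2  # conventional positivity threshold
```

A score of 2 or more is the conventional positivity threshold for both tools; the article does not state which threshold was applied.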

AI Model Development

A supervised machine learning model was developed using the scikit-learn library in Python. The dataset was randomly divided into a training set (70%) and a testing set (30%). Several algorithms, including logistic regression, random forest, and gradient boosting, were evaluated. The model with the highest area under the receiver operating characteristic curve (AUROC) on the validation dataset was selected for final testing. Features were normalized, and missing values were imputed using median values.
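The workflow described above could be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code: the feature matrix, the 5% missingness rate, the hyperparameters, the stratified split, and the use of 5-fold cross-validation on the training set as the "validation" step are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort: 2,000 rows, roughly 15% positives.
X, y = make_classification(n_samples=2000, n_features=12, n_informative=6,
                           weights=[0.85, 0.15], random_state=0)

# Blank out about 5% of values to mimic missing vitals and labs (assumed rate).
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# 70/30 split; stratification is an assumption, the article only says "randomly".
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}


def build(estimator):
    """Median imputation and normalization, fitted inside the pipeline."""
    return make_pipeline(SimpleImputer(strategy="median"),
                         StandardScaler(), estimator)


# Compare candidates by mean cross-validated AUROC on the training set only.
cv_auroc = {
    name: cross_val_score(build(est), X_train, y_train,
                          cv=5, scoring="roc_auc").mean()
    for name, est in candidates.items()
}
best_name = max(cv_auroc, key=cv_auroc.get)

# Refit the selected pipeline on all training data and score the held-out test set.
final_model = build(candidates[best_name]).fit(X_train, y_train)
test_auroc = roc_auc_score(y_test, final_model.predict_proba(X_test)[:, 1])
```

Placing imputation and scaling inside the pipeline means their statistics are learned from training folds only, which avoids leaking information from the test set into preprocessing.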

Outcome Measures

The primary outcome was the predictive performance of the AI model in identifying patients at risk of developing sepsis within 48 hours of ED admission. The model’s performance was compared with SIRS and qSOFA scores using AUROC, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
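For reference, the four threshold-based measures named above follow directly from the counts of a 2×2 confusion matrix. The sketch below uses made-up counts purely to show the arithmetic; they are not the study's data, and the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # septic patients correctly flagged
        "specificity": tn / (tn + fp),  # non-septic patients correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who are septic
        "npv": tn / (tn + fn),          # cleared patients who are not septic
    }


# Hypothetical counts for illustration only.
example = diagnostic_metrics(tp=80, fp=60, tn=340, fn=20)
```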

Statistical Analysis

Statistical analyses were conducted using SPSS version 26.0 (IBM Corp., Armonk, NY). Continuous variables were expressed as mean ± standard deviation, and categorical variables as frequencies and percentages. Model performance was assessed using receiver operating characteristic (ROC) curves and the corresponding AUROC. A p-value of <0.05 was considered statistically significant.
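To make the AUROC concrete: it equals the probability that a randomly chosen septic patient receives a higher risk score than a randomly chosen non-septic patient, with ties counted as one half. The pure-Python sketch below implements that pairwise definition. It is an illustration only, since the analyses in this study were run in SPSS, and the labels and scores shown are invented.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # A positive outranking a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 2 septic (label 1) and 3 non-septic (label 0) patients.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
example_auroc = auroc(labels, scores)
```

The same function accepts integer scores such as a SIRS or qSOFA criteria count, where ties are common; whether the comparator AUROCs in this study were computed from criteria counts or from a binary cut-off is not stated.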

RESULTS

Patient Demographics and Clinical Characteristics

Out of 2,000 patients included in the study, 300 (15%) developed sepsis within 48 hours of emergency department admission. The mean age of the cohort was 54.6 ± 17.2 years, with 1,120 (56%) males and 880 (44%) females. Common presenting clinical features among sepsis-positive patients included fever (78%), hypotension (61%), and altered mental status (25%). Table 1 summarizes the baseline characteristics of patients in the sepsis and non-sepsis groups.

Table 1. Baseline Demographics and Clinical Characteristics of Patients

Variable	Sepsis Group (n=300)	Non-Sepsis Group (n=1700)	p-value
Mean age (years)	59.8 ± 16.4	53.7 ± 17.3	0.031
Male, n (%)	180 (60.0%)	940 (55.3%)	0.144
Heart rate (beats/min)	106 ± 22	92 ± 18	<0.001
Respiratory rate (/min)	24 ± 6	20 ± 5	<0.001
Temperature (°C)	38.3 ± 1.2	37.2 ± 0.9	<0.001
Systolic BP (mmHg)	96 ± 18	114 ± 20	<0.001
Serum lactate (mmol/L)	3.4 ± 1.1	1.8 ± 0.6	<0.001

Model Performance

The AI-based model demonstrated superior diagnostic accuracy compared to SIRS and qSOFA scores. The AI model achieved an AUROC of 0.92 (95% CI: 0.90–0.94), with a sensitivity of 88%, specificity of 85%, PPV of 68%, and NPV of 96%. In comparison, the SIRS criteria showed an AUROC of 0.78, while qSOFA yielded an AUROC of 0.74. Full comparative performance metrics are shown in Table 2.

Table 2. Diagnostic Performance of AI Model vs. Traditional Scoring Systems

Scoring System	AUROC	Sensitivity (%)	Specificity (%)	PPV (%)	NPV (%)
AI Model	0.92	88	85	68	96
SIRS	0.78	72	62	45	89
qSOFA	0.74	66	70	49	87

These results suggest that, compared with existing clinical scoring systems, the AI tool more accurately identifies patients at high risk of sepsis early in their ED stay (Tables 1 and 2).
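Because PPV and NPV depend on disease prevalence as well as on sensitivity and specificity, it can help to see how the three quantities relate. The sketch below applies Bayes' rule to the sensitivity and specificity values in Table 2, using the whole-cohort prevalence of 15% as an assumed input. The predictive values in Table 2 were presumably calculated on the test split, whose prevalence is not reported, so the implied values need not coincide with the tabulated ones. The function name is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) implied by sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Sensitivity and specificity as reported in Table 2.
table2 = {
    "AI Model": (0.88, 0.85),
    "SIRS": (0.72, 0.62),
    "qSOFA": (0.66, 0.70),
}

# Assumed prevalence: 300 of 2,000 patients in the full cohort.
cohort_prevalence = 300 / 2000

implied = {
    name: predictive_values(se, sp, cohort_prevalence)
    for name, (se, sp) in table2.items()
}

for name, (ppv, npv) in implied.items():
    print(f"{name}: implied PPV {ppv:.1%}, implied NPV {npv:.1%}")
```

Re-running the function with a higher assumed prevalence raises the implied PPV and lowers the implied NPV, which is why predictive values from one setting do not transfer directly to another.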

DISCUSSION

This study evaluated the performance of an artificial intelligence (AI)-based risk scoring model for early sepsis detection in emergency department (ED) admissions and found it to be significantly more accurate than traditional clinical scoring systems such as SIRS and qSOFA. These findings support the growing evidence that machine learning (ML) tools can enhance clinical decision-making in time-critical conditions like sepsis (1–3).

Early identification of sepsis is crucial, as delays in diagnosis are strongly associated with worse outcomes, including increased morbidity, prolonged hospitalization, and higher mortality rates (4,5). Although SIRS and qSOFA have been widely used in clinical practice, their predictive accuracy, particularly in ED settings, is often suboptimal (6,7). In our study, both tools demonstrated relatively modest sensitivity and specificity, consistent with previous literature (8,9). In contrast, the AI model achieved higher AUROC, sensitivity, and negative predictive value, indicating superior early risk stratification capability.

AI models offer several advantages over conventional scoring systems. They can process a wide range of patient data—including vital signs, laboratory values, and demographic characteristics—in real time, enabling dynamic and individualized predictions (10). Moreover, they can identify complex non-linear relationships among variables, which may be difficult for clinicians to detect using traditional rule-based approaches (11). In our model, features such as respiratory rate, serum lactate, and systolic blood pressure contributed significantly to early risk estimation, which aligns with existing sepsis pathophysiology (12,13).

Previous studies have similarly demonstrated the effectiveness of ML models in sepsis prediction. For instance, Desautels et al. developed an AI-based early warning system that showed higher AUROC than MEWS and qSOFA (14). Likewise, Taylor et al. reported that their deep learning model could predict sepsis up to six hours in advance with high accuracy (15). Our findings are consistent with these reports and further emphasize the clinical utility of AI-driven tools when embedded into emergency care pathways.

CONCLUSION

In conclusion, the AI-based risk scoring model demonstrated higher accuracy than SIRS and qSOFA for early sepsis detection in the ED. This supports the potential of AI tools to augment clinical decision-making and enable timely interventions. Further prospective studies are warranted to validate these findings and explore the impact of AI implementation on clinical outcomes.

REFERENCES

1. Aygun U, Yagin FH, Yagin B, Yasar S, Colak C, Ozkan AS, Ardigò LP. Assessment of Sepsis Risk at Admission to the Emergency Department: Clinical Interpretable Prediction Model. Diagnostics (Basel). 2024;14(5):457. doi: 10.3390/diagnostics14050457. PMID: 38472930.
2. Yagin FH, Aygun U, Algarni A, Colak C, Al-Hashem F, Ardigò LP. Platelet Metabolites as Candidate Biomarkers in Sepsis Diagnosis and Management Using the Proposed Explainable Artificial Intelligence Approach. J Clin Med. 2024;13(17):5002. doi: 10.3390/jcm13175002. PMID: 39274215.
3. Yagin B, Yagin FH, Colak C, Inceoglu F, Kadry S, Kim J. Cancer Metastasis Prediction and Genomic Biomarker Identification through Machine Learning and eXplainable Artificial Intelligence in Breast Cancer Research. Diagnostics (Basel). 2023;13(21):3314. doi: 10.3390/diagnostics13213314. PMID: 37958210.
4. Park SW, Yeo NY, Kang S, Ha T, Kim TH, Lee D, et al. Early Prediction of Mortality for Septic Patients Visiting Emergency Room Based on Explainable Machine Learning: A Real-World Multicenter Study. J Korean Med Sci. 2024;39(5):e53. doi: 10.3346/jkms.2024.39.e53. PMID: 38317451.
5. Zhou S, Lu Z, Liu Y, Wang M, Zhou W, Cui X, et al. Interpretable machine learning model for early prediction of 28-day mortality in ICU patients with sepsis-induced coagulopathy: development and validation. Eur J Med Res. 2024;29(1):14. doi: 10.1186/s40001-023-01593-7. PMID: 38172962.
6. Zhang G, Shao F, Yuan W, Wu J, Qi X, Gao J, et al. Predicting sepsis in-hospital mortality with machine learning: a multi-center study using clinical and inflammatory biomarkers. Eur J Med Res. 2024;29(1):156. doi: 10.1186/s40001-024-01756-0. PMID: 38448999.
7. Lee S, Kang WS, Kim DW, Seo SH, Kim J, Jeong ST, et al. An Artificial Intelligence Model for Predicting Trauma Mortality Among Emergency Department Patients in South Korea: Retrospective Cohort Study. J Med Internet Res. 2023;25:e49283. doi: 10.2196/49283. PMID: 37642984.
8. Zhang Q, Xu L, He W, Lai X, Huang X. Survival prediction for heart failure complicated by sepsis: based on machine learning methods. Front Med (Lausanne). 2024;11:1410702. doi: 10.3389/fmed.2024.1410702. PMID: 39421876.
9. Xiong W, Zhang L, She K, Xu G, Bai S, Liu X. [Comparison of machine learning and Logistic regression model in predicting acute kidney injury after cardiac surgery: data analysis based on MIMIC-III database]. Zhonghua Wei Zhong Bing Ji Jiu Yi Xue. 2022;34(11):1188–93. doi: 10.3760/cma.j.cn121430-20210223-00279. PMID: 36567564. Chinese.
10. Ghosh SK, Khandoker AH. Investigation on explainable machine learning models to predict chronic kidney diseases. Sci Rep. 2024;14(1):3687. doi: 10.1038/s41598-024-54375-4. PMID: 38355876.
11. Zuo D, Yang L, Jin Y, Qi H, Liu Y, Ren L. Machine learning-based models for the prediction of breast cancer recurrence risk. BMC Med Inform Decis Mak. 2023;23(1):276. doi: 10.1186/s12911-023-02377-z. PMID: 38031071.
12. Yilmaz R, Yagin FH, Colak C, Toprak K, Abdel Samee N, Mahmoud NF, et al. Analysis of hematological indicators via explainable artificial intelligence in the diagnosis of acute heart failure: a retrospective study. Front Med (Lausanne). 2024;11:1285067. doi: 10.3389/fmed.2024.1285067. PMID: 38633310.
13. Khan IU, Aslam N, AlShedayed R, AlFrayan D, AlEssa R, AlShuail NA, et al. A Proactive Attack Detection for Heating, Ventilation, and Air Conditioning (HVAC) System Using Explainable Extreme Gradient Boosting Model (XGBoost). Sensors (Basel). 2022;22(23):9235. doi: 10.3390/s22239235. PMID: 36501938.
14. Gong H, Wang M, Zhang H, Elahe MF, Jin M. An Explainable AI Approach for the Rapid Diagnosis of COVID-19 Using Ensemble Learning Algorithms. Front Public Health. 2022;10:874455. doi: 10.3389/fpubh.2022.874455. PMID: 35801239.
15. Yagin FH, Cicek İB, Alkhateeb A, Yagin B, Colak C, Azzeh M, et al. Explainable artificial intelligence model for identifying COVID-19 gene biomarkers. Comput Biol Med. 2023;154:106619. doi: 10.1016/j.compbiomed.2023.106619. PMID: 36738712.
