2024-MBS-623

Predicting Antibiotic Resistance in E. coli Using Machine Learning Models

Author: Melika Teimouri

Faculty Supervisor: Pleuni Pennings

Department: Biology

Antibiotic resistance in Escherichia coli (E. coli) poses a significant public health challenge: drug-resistant strains must be identified quickly and accurately to optimize patient treatment and curb the spread of resistance. This study aims to predict antibiotic resistance in E. coli from genomic data and evaluates machine learning techniques for the task. Genomes from a one-year dataset collected at a US hospital were annotated and subjected to pan-genome analysis to identify relevant features. Complementary metadata on antibiotic susceptibility and epidemiology were integrated using Python. Random Forest (RF) and Extreme Gradient Boosted Tree (XGBT) algorithms were trained to predict resistance, and performance was assessed using accuracy, precision, and recall. Both models performed well, with XGBT slightly outperforming RF across antibiotics. Results focus on penicillins (ampicillin), tetracycline, and fluoroquinolones (ciprofloxacin). Identifying the key features that influence resistance can inform medical decision-making and highlights the potential of machine learning to predict E. coli antibiotic resistance accurately. Future work includes expanding the scope of the study, improving accuracy, and exploring deep neural networks and other machine learning models, with implications for patient care and resistance mitigation.
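As an illustration of the evaluation step, the three metrics named above can be computed from a model's predictions as in the minimal sketch below. This is not the study's code: the `evaluate` helper, the `BinaryMetrics` container, the "R"/"S" label encoding, and the eight example labels are all hypothetical, and resistance is assumed to be the positive class.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float


def evaluate(y_true, y_pred, positive="R"):
    """Score predictions, treating `positive` (assumed: resistant) as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one isolate is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # resistant, called resistant
    fp = sum(t != positive and p == positive for t, p in pairs)  # susceptible, called resistant
    fn = sum(t == positive and p != positive for t, p in pairs)  # resistant, called susceptible
    tn = len(pairs) - tp - fp - fn                               # susceptible, called susceptible

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is empty (no positive calls or no positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return BinaryMetrics(accuracy, precision, recall)


# Made-up labels for eight isolates (R = resistant, S = susceptible); not study data.
y_true = ["R", "R", "R", "S", "S", "S", "S", "R"]
y_pred = ["R", "R", "S", "S", "S", "R", "S", "R"]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

In this setting recall is the share of truly resistant isolates that the model flags, and precision is the share of isolates flagged as resistant that truly are. Reporting both alongside accuracy matters when resistant and susceptible isolates are unevenly represented, since accuracy alone can look strong while resistant strains are missed.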