Database Credentialed Access

Antibiotic Resistance Microbiology Dataset Mass General Brigham (ARMD-MGB)

Ziming Wei, Sanjat Kanjilal

Published: Dec. 5, 2025. Version: 1.0.0


When using this resource, please cite:
Wei, Z., & Kanjilal, S. (2025). Antibiotic Resistance Microbiology Dataset Mass General Brigham (ARMD-MGB) (version 1.0.0). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/2r5k-b955

Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220. RRID:SCR_007345.

Abstract

The Antibiotic Resistance Microbiology Dataset – MGB (ARMD-MGB) is a de-identified resource derived from electronic health records (EHR) that facilitates research in antimicrobial resistance (AMR). ARMD-MGB encompasses data collected from over 225,000 adult patients over 10 years from hospitals in the Mass General Brigham healthcare system. The focus of the data is on over 970,000 microbiological cultures, with associated antibiotic susceptibilities, and clinical and demographic features of the patients who submitted the samples. Key attributes include organism identity, semi-quantitative antibiotic susceptibility results, susceptibility phenotypes, and de-identified clinical metadata.

The ARMD-MGB dataset is designed to complement the ARMD-Stanford dataset. Cohort inclusion and exclusion criteria, feature descriptions, and encoding are similar across sites. These datasets support studies on antimicrobial stewardship, causal inference, and clinical decision-making, and are designed to be reusable and interoperable, promoting collaboration and innovation in combating AMR.


Background

Antimicrobial resistance (AMR) arises from the multiscale interaction between evolutionary forces, microbiology, the built environment, and human behavior, making it one of the great challenges of the 21st century [1-3]. In 2019, AMR was linked to nearly 5 million deaths worldwide, including at least 1.27 million directly due to resistant infections [4]. In the U.S., over 2.8 million AMR infections occur annually, resulting in more than 35,000 deaths [5] and costs of more than $2 billion [6].

Machine learning (ML) is a subfield of artificial intelligence (AI) that has emerged as one potential avenue by which to address this complex phenomenon. ML is primarily concerned with the development of algorithms that are able to build a predictive model using a training data set, with little to no human input [7]. Interest in applying ML to electronic health record (EHR) data to understand AMR has intensified over the past decade, reflecting the exponential increase in biological and medical data availability, massive improvements in computational power, and critical breakthroughs in algorithm development [8].

Prior ML studies with EHR data have led to increased insight into AMR epidemiology [9,10], optimal treatment selection [11], and the building of AI-supported decision support tools [12,13], among many other topics. However, wider adoption of these algorithms is limited by reliance on single-center datasets, which restricts generalizability, by the use of relatively few model types, and by sparse clinical metadata. To truly harness the power of ML, there is an important need for broadly generalizable datasets that contain diverse clinical practice patterns, patient populations, and patterns of AMR.

The ARMD-MGB and ARMD-Stanford [14] projects represent a pioneering effort to overcome this fundamental gap through the release of fully de-identified, population-based, multi-center, harmonized EHR datasets that contain rich metadata for hundreds of thousands of patients, and are specifically engineered for the study of AMR. Both ARMD datasets differ from previously publicly available datasets in this space, which focus only on genetic and genomic determinants of resistance [15,16], or lack deep clinical metadata [17], and/or lack sufficient power for individual-level prediction across a broad population [13].

ARMD-MGB uses the same inclusion and exclusion criteria and the same feature engineering pipeline as ARMD-Stanford, but there are some important differences between the two.

  • ARMD-MGB is specific to patients who received care within the Mass General Brigham healthcare system, a network of 12 academic and community hospitals located in the New England area, and contains data from 2015 through 2024. ARMD-Stanford is specific to Stanford Health Care, located in California, and contains data from 1999 through 2024.
  • ARMD-MGB has granular microbiology data, including fields such as the susceptibility test method, minimum inhibitory concentrations and disk diameters, presence of beta-lactamase enzymes, and uniform breakpoint interpretations (from the CLSI M-100 document), which are not present in the ARMD-Stanford dataset.
  • ARMD-MGB categorizes ward type into inpatient / outpatient / emergency room / urgent care / day surgery while ARMD-Stanford categorizes ward types into inpatient / ICU / outpatient / emergency room.

Methods

Cohort Selection

  • Inclusion: Adult patients (age >= 18) who submitted a urine, blood, or respiratory culture for clinical and surveillance purposes to the clinical microbiology laboratory.
  • Exclusion: Microbiology cultures with no previous culture within the last 14 days.

Data de-identification

  • Patient ages are binned into categories (defined below in the data dictionary).
  • Patient ID, Patient Encounter ID, and Microbiology Culture Order ID are mapped to a random integer.
  • All dates are randomly shifted for each patient, encounter, and microbiology culture. They remain internally consistent for a given patient and across datasets.

Derivation of susceptibility phenotype labels

The medical record contains microbiological testing results for any specimens sent to the clinical microbiology labs of any hospital within the Mass General Brigham healthcare system, which encompasses 12 hospitals in the New England region. The data includes the body site of collection, the identity of the pathogen, and its susceptibility testing results against standard panels of antibiotics. The dataset contains the metric used for each test (e.g., minimum inhibitory concentration (MIC) vs. disk diameter (DD)) and the numeric value of the corresponding test result, as well as the date and location category of specimen collection (both anonymized but internally consistent).

The microbiology data contains the susceptibility phenotypic labels reported by the laboratory based on their internal protocols, as well as phenotypes derived from clinical breakpoints published by the Clinical and Laboratory Standards Institute (CLSI) in the 2022 edition of the M-100 document [18]. This facilitates comparison across time where changes in clinical breakpoints may have otherwise led to discrepancies in reporting.
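
As a rough illustration of how a clinical breakpoint turns a semi-quantitative MIC into a phenotype label, the sketch below applies a pair of placeholder thresholds. The function name `interpret_mic` and the breakpoint values are hypothetical; they are not taken from CLSI M-100, and this is not a reproduction of the pipeline used to build the dataset. Censored results whose true value cannot be placed relative to the breakpoints (for example `<=8` when the susceptible breakpoint is 1) are left unlabeled.

```python
from typing import Optional


def interpret_mic(op: str, mic: float, s_max: float, r_min: float) -> Optional[str]:
    """Label an MIC given a susceptible ceiling (s_max) and a resistant floor (r_min)."""
    if op in ("<", "<="):
        # The true MIC is at or below the reported value.
        return "Susceptible" if mic <= s_max else None
    if op in (">", ">="):
        # The true MIC is at or above the reported value.
        return "Resistant" if mic >= r_min else None
    # Exact value: compare against both thresholds.
    if mic <= s_max:
        return "Susceptible"
    if mic >= r_min:
        return "Resistant"
    return "Intermediate"


# Placeholder breakpoints (hypothetical, for illustration only).
S_MAX, R_MIN = 1.0, 4.0
examples = [("=", 2.0), ("<=", 0.5), (">=", 8.0), ("<=", 8.0)]
labels = [interpret_mic(op, mic, S_MAX, R_MIN) for op, mic in examples]
print(labels)
```

Disk diameters run in the opposite direction (larger zones indicate susceptibility), so the comparisons would need to be reversed for DD results.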


Data Description

File descriptions

Identifying feature descriptions (common to all datasets)

  • anon_id
    • Description: De-identified patient ID
    • Type: Integer
    • Encoding: random number
    • Units: N/A
    • Range: 1 - 226,659
  • pat_enc_csn_id_coded
    • Description: De-identified patient encounter ID, assigned every time a patient visits the hospital for any reason
    • Type: Integer
    • Encoding: random ID de-identification
    • Units: N/A
    • Range: 1 - 497,096
  • order_proc_id_coded
    • Description: De-identified microbiology culture ID, assigned for every unique sample received by the lab
    • Type: Integer
    • Encoding: random ID de-identification
    • Units: N/A
    • Range: 1 - 970,165
  • order_time_jittered_utc_shifted
    • Description: Microbiology culture order timestamp (anonymized but internally consistent)
    • Type: DateTime
    • Encoding: N/A
    • Units: UTC
    • Range: Anonymized

microbiology_cohort_deid_tj.csv

Description: Microbiology culture data for the study cohort.

  • Empty values indicate field is N/A for that culture (i.e., culture was negative)
  • culture_description
    • Description: Body site of culture
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • organism
    • Description: Organism name assigned by the clinical microbiology laboratory
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • neg_cx
    • Description: Flag if the culture was negative
    • Type: Binary
    • Encoding: N/A
    • Units: N/A
    • Range: X / NA
  • mult_org_ast
    • Description: Flag if multiple organisms were present in the sample AND had antimicrobial susceptibility testing (AST) performed
    • Type: Binary
    • Encoding: N/A
    • Units: N/A
    • Range: X / NA
  • has_AST
    • Description: Flag if AST was performed
    • Type: Binary
    • Encoding: N/A
    • Units: N/A
    • Range: X / NA
  • prelim_AST
    • Description: Flag if reported results were preliminary
    • Type: Binary
    • Encoding: N/A
    • Units: N/A
    • Range: X / NA
  • AST_panel
    • Description: Type of AST method used (broth microdilution, disk diameter, gradient diffusion, PCR, etc.)
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • enzyme_class
    • Description: Class of enzyme or mutation tested for (ESBL, PBP2a, carbapenemase, etc.)
    • Type: Categorical
    • Encoding: ESBL (extended spectrum beta-lactamase); PBP (penicillin binding protein); beta_lactamase; carbapenemase
    • Units: N/A
    • Range: N/A
  • enzyme
    • Description: Specific enzyme or mutant tested for
    • Type: Categorical
    • Encoding: blaZ; mecA; NDM; KPC; IMP; OXA; VIM
    • Units: N/A
    • Range: N/A
  • AST_code
    • Description: Three-letter code derived from the American Society for Microbiology abbreviation list for manuscripts [19] for the antibiotic tested in the AST panel
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • antibiotic
    • Description: Full name of the antibiotic tested in AST panel
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • AST_inequality
    • Description: Operator used in AST result
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: < / <= / > / >= / =
  • AST_val1
    • Description: Semi-quantitative value determined by AST panel
    • Type: Numerical
    • Encoding: N/A
    • Units: varies
    • Range: N/A
  • AST_val2
    • Description: Second value if drug tested had two components (e.g., trimethoprim-sulfamethoxazole)
    • Type: Numerical
    • Encoding: N/A
    • Units: varies
    • Range: N/A
  • AST_pheno
    • Description: Phenotypic label (Susceptible, Intermediate, Resistant, etc.) determined by lab internal protocols
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: Susceptible / Intermediate / Resistant / Other
  • CLSI_2022_pheno
    • Description: Phenotypic label determined by the 2022 version of CLSI M-100 [18]
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: Susceptible / Intermediate / Resistant / Other
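
The AST fields can be combined in a few lines. The sketch below, which uses invented toy rows with the column names from the dictionary above, renders each result as text (operator, first value, optional second value) and lists results where the laboratory label and the CLSI 2022 label disagree. The helper `format_ast_result` and the column `AST_result_text` are hypothetical names, not part of the dataset.

```python
import pandas as pd

# Toy rows with the documented column names; all values are invented.
micro_toy = pd.DataFrame({
    "order_proc_id_coded": [1, 1, 2, 3],
    "AST_code": ["CIP", "SXT", "CIP", "VAN"],
    "AST_inequality": ["=", "<=", ">=", "="],
    "AST_val1": [0.5, 2.0, 4.0, 1.0],
    "AST_val2": [None, 38.0, None, None],
    "AST_pheno": ["Susceptible", "Susceptible", "Resistant", "Susceptible"],
    "CLSI_2022_pheno": ["Intermediate", "Susceptible", "Resistant", "Susceptible"],
})


def format_ast_result(row):
    """Render one AST result as text, e.g. '<=2/38' for a two-component drug."""
    if pd.isna(row["AST_val1"]):
        return None
    op = row["AST_inequality"] if pd.notna(row["AST_inequality"]) else ""
    text = f"{op}{row['AST_val1']:g}"
    if pd.notna(row["AST_val2"]):
        text += f"/{row['AST_val2']:g}"
    return text


micro_toy["AST_result_text"] = micro_toy.apply(format_ast_result, axis=1)

# Compare the two phenotype columns only where both are present,
# because empty values would otherwise count as disagreements.
both = micro_toy.dropna(subset=["AST_pheno", "CLSI_2022_pheno"])
discordant = both[both["AST_pheno"] != both["CLSI_2022_pheno"]]
print(discordant[["AST_code", "AST_result_text", "AST_pheno", "CLSI_2022_pheno"]])
```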

ADI_deid_tj.csv

Description: Area Deprivation Index (ADI) scores for the patient based on the ZIP code [20, 21].

  • Empty values indicate field is missing
  • adi_score
    • Description: Actual ADI score
    • Type: Numerical
    • Encoding: N/A
    • Units: N/A
    • Range: 1.32 - 78.7
  • adi_state_rank
    • Description: Ranking of the ADI score within the state
    • Type: Integer
    • Encoding: N/A
    • Units: N/A
    • Range: 1–10

comorbidity_deid_tj.csv

Description: Patient comorbidities

  • Empty values indicate field is missing
  • ICD10
    • Description: ICD-10 code of diagnosis
    • Type: String
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • category
    • Description: Elixhauser comorbidity category of diagnosis [22]
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
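
Because each row of this file is a single diagnosis, many analyses will first want one row per patient with one indicator column per comorbidity category. A minimal sketch follows, assuming the file carries `anon_id` alongside the two fields above; the category labels in the toy data are invented and may not match the labels in the file.

```python
import pandas as pd

# Toy long-format comorbidity rows; category labels are placeholders.
comorb_toy = pd.DataFrame({
    "anon_id": [1, 1, 1, 2],
    "ICD10": ["E11.9", "E11.65", "I50.9", "N18.3"],
    "category": ["diabetes", "diabetes", "heart_failure", "renal_failure"],
})

# Count diagnoses per patient and category, then cap at 1 to get 0/1 flags.
comorb_flags = pd.crosstab(comorb_toy["anon_id"], comorb_toy["category"]).clip(upper=1)
print(comorb_flags)
```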

demographics_deid_tj.csv

Description: Patient demographics

  • Empty values indicate field is missing
  • age
    • Description: Age category of patient at time of microbiology culture collection
    • Type: Categorical
    • Encoding: N/A
    • Units: years
    • Range: 18-24 / 25-34 / 35-44 / 45-54 / 55-64 / 65-74 / 75-84 / 85-89 / ≥90
  • gender
    • Description: Gender identity of the patient at the time of microbiology culture collection
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: Male / Female / Other / Unknown
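
Since age is released as ordered bins rather than a number, it can help to declare the ordering explicitly before modeling. The sketch below assumes the bin labels appear in the file exactly as listed in the Range above; `AGE_LEVELS` and `age_rank` are hypothetical names.

```python
import pandas as pd

# Bin labels in ascending order, copied from the data dictionary.
AGE_LEVELS = ["18-24", "25-34", "35-44", "45-54", "55-64",
              "65-74", "75-84", "85-89", "≥90"]

demo_toy = pd.DataFrame({"anon_id": [1, 2, 3], "age": ["65-74", "18-24", None]})

# An ordered categorical allows comparisons and a stable integer encoding.
demo_toy["age"] = pd.Categorical(demo_toy["age"], categories=AGE_LEVELS, ordered=True)
demo_toy["age_rank"] = demo_toy["age"].cat.codes  # -1 marks a missing age

older = demo_toy[demo_toy["age"] >= "65-74"]
print(demo_toy)
```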

nursing_home_visits_deid_tj.csv

Description: Nursing home visits relative to culture time

  • Empty values indicate field is missing
  • nursing_home_visit_culture
    • Description: Time interval between the last nursing home visit and the current microbiology culture
    • Type: Numerical
    • Encoding: N/A
    • Units: days
    • Range: 0 - 160
  • visit_date_shifted
    • Description: Date of last nursing home visit relative to culture (anonymized but internally consistent)
    • Type: DateTime
    • Encoding: N/A
    • Units: UTC
    • Range: Anonymized

prior_abx_deid_tj.csv

Description: Prior exposures to antimicrobials relative to the current microbiology culture

  • Empty values indicate field is missing
  • last_dose_to_culture
    • Description: Time interval between last antibiotic dose and current microbiology culture
    • Type: Numerical
    • Encoding: N/A
    • Units: days
    • Range: 0 - 3,402
  • drug_code
    • Description: Three-letter code derived from the American Society for Microbiology abbreviation list for manuscripts [19] for the prior antibiotic exposure
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • medication_name
    • Description: Full name of the antibiotic taken or ordered
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • drug_class
    • Description: Antibiotic class (anti-staph beta-lactam, fluoroquinolone, etc.)
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • last_dose_DT_shifted
    • Description: Timestamp of prior antibiotic (anonymized but internally consistent)
    • Type: DateTime
    • Encoding: N/A
    • Units: UTC
    • Range: Anonymized
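
One common way to use this file is to summarize exposure within a lookback window before each culture. The sketch below builds per-culture 0/1 flags by drug class for a 90-day window; the window length, the toy values, and the drug class labels are illustrative assumptions rather than recommendations from the dataset authors.

```python
import pandas as pd

# Toy prior-exposure rows with the documented column names; values are invented.
prior_abx_toy = pd.DataFrame({
    "anon_id": [1, 1, 2, 3],
    "order_proc_id_coded": [10, 10, 20, 30],
    "drug_class": ["fluoroquinolone", "fluoroquinolone", "cephalosporin", "cephalosporin"],
    "last_dose_to_culture": [12, 200, 45, 400],
})

WINDOW_DAYS = 90  # hypothetical lookback window
recent = prior_abx_toy[prior_abx_toy["last_dose_to_culture"] <= WINDOW_DAYS]

# One row per culture, one 0/1 column per drug class seen within the window.
exposure = (
    recent.groupby(["order_proc_id_coded", "drug_class"])
    .size()
    .unstack(fill_value=0)
    .gt(0)
    .astype(int)
)

# Cultures with no exposure inside the window would otherwise drop out of the table.
exposure = exposure.reindex(prior_abx_toy["order_proc_id_coded"].unique(), fill_value=0)
print(exposure)
```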

prior_micro_deid_tj.csv

Description: Full prior microbiology history for patient relative to current microbiology culture

  • Empty values
    • AST_pheno: N/A for that culture (i.e., no breakpoints exist or were reported)
    • All other fields (organism, prior_AST_DTS_shifted, etc.): Patient had no prior microbiology data
  • organism
    • Description: Name of the organism identified in a prior culture
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • drug_code
    • Description: Three-letter code derived from the American Society for Microbiology abbreviation list for manuscripts [19] for the antibiotic tested in the AST panel for the prior organism
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • antibiotic
    • Description: Full name of the antibiotic tested in AST panel for the prior organism
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • prior_AST_time_to_culture
    • Description: Time interval between the prior AST result and the current microbiology culture
    • Type: Numerical
    • Encoding: N/A
    • Units: days
    • Range: 1 - 3,332
  • prior_AST_DTS_shifted
    • Description: Timestamp of prior AST result (anonymized but internally consistent)
    • Type: DateTime
    • Encoding: N/A
    • Units: UTC
    • Range: Anonymized

prior_org_deid_tj.csv

Description: Prior organism history for patient relative to current microbiology culture

  • Empty values indicate field is missing
  • prior_org_days_to_culture
    • Description: Time interval between prior organism and current microbiology culture
    • Type: Numerical
    • Encoding: N/A
    • Units: days
    • Range: 1 - 3,417
  • prior_org
    • Description: Name of the organism in prior microbiology culture
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • prior_org_specific
    • Description: Organism category for the organism in prior microbiology culture (categorization schema shown below)
    • Type: Categorical
    • Encoding: N/A
    • Units: N/A
    • Range: N/A
  • prior_org_recorded_time_shifted
    • Description: Recorded time of prior microbiology culture (anonymized but internally consistent)
    • Type: DateTime
    • Encoding: N/A
    • Units: UTC
    • Range: Anonymized

prior_procedures_deid_tj.csv

Description: Prior procedures relative to current microbiology culture. Each procedure has a unique timestamp (not included). Multiple rows with the same procedure indicate a duration. For example, the following snippet indicates the patient was mechanically ventilated for 3 days, 62 to 64 days prior to a culture ordered on 2075-01-19:

anon_id  pat_enc_csn_id_coded  order_proc_id_coded  procedure_description  procedure_days_culture  order_time_jittered_utc_shifted
54816    239397                156168               mech_vent              64                      2075-01-19
54816    239397                156168               mech_vent              63                      2075-01-19
54816    239397                156168               mech_vent              62                      2075-01-19

  • Empty values indicate field is missing
  • procedure_description
    • Description: Type of procedure (surgical, central line, dialysis, ventilation, etc.), inferred from CPT codes
    • Type: Categorical
    • Encoding: mechvent (mechanical ventilation); surgical_procedure (of any type); cvc (central venous catheter); dialysis; urethral_catheter
    • Units: N/A
    • Range: N/A
  • procedure_days_culture
    • Description: Time interval between procedure and current microbiology culture
    • Type: Numerical
    • Encoding: N/A
    • Units: days
    • Range: 1 - 4,663
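
Following the mechanical ventilation example above, runs of consecutive daily rows can be collapsed into episodes with a start, an end, and a length. This is a minimal sketch under the assumption that consecutive values of `procedure_days_culture` for the same culture and procedure belong to one continuous episode; the first three toy rows repeat the snippet above, the last two are invented, and the output column names are hypothetical.

```python
import pandas as pd

proc_toy = pd.DataFrame({
    "anon_id": [54816] * 5,
    "order_proc_id_coded": [156168] * 5,
    "procedure_description": ["mech_vent"] * 5,
    "procedure_days_culture": [64, 63, 62, 10, 9],
})

keys = ["anon_id", "order_proc_id_coded", "procedure_description"]
proc_toy = proc_toy.sort_values(keys + ["procedure_days_culture"])

# A new episode starts at the first row of each group or after a gap of more than one day.
gap = proc_toy.groupby(keys)["procedure_days_culture"].diff()
proc_toy["episode"] = (gap.isna() | (gap > 1)).cumsum()

episodes = (
    proc_toy.groupby(keys + ["episode"])["procedure_days_culture"]
    .agg(nearest_day="min", farthest_day="max", n_days="nunique")
    .reset_index()
)
print(episodes)
```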

ward_type_deid_tj.csv

Description: Setting for the clinical encounter when the microbiology culture was taken.

  • Empty values indicate field is missing
  • hosp_ward_IP
    • Description: Flag for inpatient encounter
    • Type: Binary
    • Encoding: 1: yes / 0: no
    • Units: N/A
    • Range: 0 / 1
  • hosp_ward_OP
    • Description: Flag for outpatient encounter
    • Type: Binary
    • Encoding: 1: yes / 0: no
    • Units: N/A
    • Range: 0 / 1
  • hosp_ward_ER
    • Description: Flag for emergency room encounter
    • Type: Binary
    • Encoding: 1: yes / 0: no
    • Units: N/A
    • Range: 0 / 1
  • hosp_ward_UC
    • Description: Flag for urgent care encounter
    • Type: Binary
    • Encoding: 1: yes / 0: no
    • Units: N/A
    • Range: 0 / 1
  • hosp_ward_day_surg
    • Description: Flag for day surgery encounter
    • Type: Binary
    • Encoding: 1: yes / 0: no
    • Units: N/A
    • Range: 0 / 1
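
For models that expect a single categorical setting, the five flags can be folded into one column. The sketch below assumes at most one flag is set per culture, which the description does not guarantee, so rows with zero or several flags are left missing rather than guessed. The column name `ward_type` is hypothetical.

```python
import pandas as pd

ward_toy = pd.DataFrame({
    "order_proc_id_coded": [1, 2, 3, 4],
    "hosp_ward_IP": [1, 0, 0, 0],
    "hosp_ward_OP": [0, 1, 0, 0],
    "hosp_ward_ER": [0, 0, 1, 0],
    "hosp_ward_UC": [0, 0, 0, 0],
    "hosp_ward_day_surg": [0, 0, 0, 0],
})

flag_cols = ["hosp_ward_IP", "hosp_ward_OP", "hosp_ward_ER",
             "hosp_ward_UC", "hosp_ward_day_surg"]

n_set = ward_toy[flag_cols].sum(axis=1)
label = ward_toy[flag_cols].idxmax(axis=1).str.replace("hosp_ward_", "", regex=False)

# Keep the label only when exactly one flag is set.
ward_toy["ward_type"] = label.where(n_set == 1)
print(ward_toy[["order_proc_id_coded", "ward_type"]])
```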

Definition of organism categories:

prior_org_specific    genus             species       resistance_profile
Burkholderia sp       Burkholderia
C_diff                Clostridioides    difficile
Candida_non_albicans  Candida           non albicans
DR_A_baumannii        Acinetobacter     baumannii     SAM
DR_P_aeruginosa       Pseudomonas       aeruginosa    CAZ | FEP | TZP
DR_S_maltophilia      Stenotrophomonas  maltophilia   SXT | MIN
DS_A_baumannii        Acinetobacter     baumannii
DS_C_freundii         Citrobacter       freundii
DS_C_koseri           Citrobacter       koseri
DS_E_cloacae          Enterobacter      cloacae
DS_E_coli             Escherichia       coli
DS_K_aerogenes        Klebsiella        aerogenes
DS_K_oxytoca          Klebsiella        oxytoca
DS_K_pneumoniae       Klebsiella        pneumoniae
DS_M_morganii         Morganella        morganii
DS_P_aeruginosa       Pseudomonas       aeruginosa
DS_P_mirabilis        Proteus           mirabilis
DS_Providencia        Providencia
DS_S_maltophilia      Stenotrophomonas  maltophilia
DS_S_marcescens       Serratia          marcescens
ESBL_C_freundii       Citrobacter       freundii      CRO | CAZ | FEP | TZP
ESBL_C_koseri         Citrobacter       koseri        CRO | CAZ | FEP | TZP
ESBL_E_cloacae        Enterobacter      cloacae       CRO | CAZ | FEP | TZP
ESBL_E_coli           Escherichia       coli          CRO | CAZ | FEP | TZP
ESBL_K_aerogenes      Klebsiella        aerogenes     CRO | CAZ | FEP | TZP
ESBL_K_oxytoca        Klebsiella        oxytoca       CRO | CAZ | FEP | TZP
ESBL_K_pneumoniae     Klebsiella        pneumoniae    CRO | CAZ | FEP | TZP
ESBL_M_morganii       Morganella        morganii      CRO | CAZ | FEP | TZP
ESBL_P_mirabilis      Proteus           mirabilis     CRO | CAZ | FEP | TZP
ESBL_Providencia      Providencia                     CRO | CAZ | FEP | TZP
ESBL_S_marcescens     Serratia          marcescens    CRO | CAZ | FEP | TZP
MRSA                  Staphylococcus    aureus        OXA | FOX
MSSA                  Staphylococcus    aureus
P_vulgaris            Proteus           vulgaris
S_mitis_oralis        Streptococcus     mitis / oral
S_pneumoniae          Streptococcus     pneumoniae
Salmonella sp         Salmonella
Shigella sp           Shigella
VRE_faecalis          Enterococcus      faecalis      VAN
VRE_faecium           Enterococcus      faecium       VAN
VSE_faecalis          Enterococcus      faecalis
VSE_faecium           Enterococcus      faecium

Drug codes used for the categorization:

  • CAZ - Ceftazidime
  • CRO - Ceftriaxone
  • FEP - Cefepime
  • FOX - Cefoxitin
  • MIN - Minocycline
  • OXA - Oxacillin
  • SAM - Ampicillin-sulbactam
  • SXT - Trimethoprim-sulfamethoxazole
  • TZP - Piperacillin-tazobactam
  • VAN - Vancomycin
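
The categorization table can be read as a lookup from organism plus resistance markers to a label. The sketch below encodes four of the table's rows and assumes that resistance to any one of the listed drugs places an isolate in the resistant category; the source does not state whether "|" means any or all, so this reading is an assumption. The names `RESISTANCE_RULES` and `categorize` are hypothetical.

```python
from typing import Iterable, Optional

# (genus, species) -> (resistant label, susceptible label, marker drug codes)
RESISTANCE_RULES = {
    ("Escherichia", "coli"): ("ESBL_E_coli", "DS_E_coli", {"CRO", "CAZ", "FEP", "TZP"}),
    ("Pseudomonas", "aeruginosa"): ("DR_P_aeruginosa", "DS_P_aeruginosa", {"CAZ", "FEP", "TZP"}),
    ("Staphylococcus", "aureus"): ("MRSA", "MSSA", {"OXA", "FOX"}),
    ("Enterococcus", "faecium"): ("VRE_faecium", "VSE_faecium", {"VAN"}),
}


def categorize(genus: str, species: str, resistant_to: Iterable[str]) -> Optional[str]:
    """Return the organism category, or None if the organism is not in the subset above."""
    rule = RESISTANCE_RULES.get((genus, species))
    if rule is None:
        return None
    resistant_label, susceptible_label, marker_drugs = rule
    # Assumption: resistance to ANY marker drug triggers the resistant label.
    if marker_drugs & set(resistant_to):
        return resistant_label
    return susceptible_label


print(categorize("Escherichia", "coli", ["CRO"]))
print(categorize("Staphylococcus", "aureus", []))
```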

Usage Notes

Getting Started

The ARMD-MGB dataset is provided as a collection of de-identified CSV files. Each table can be linked using the anon_id (de-identified patient identifier) and, where applicable, the pat_enc_csn_id_coded (encounter identifier) or order_proc_id_coded (microbiology accession identifier).

Below is a minimal example demonstrating how to load and interpret key variables:

Load microbiology culture data

import pandas as pd

micro = pd.read_csv("microbiology_cohort_deid_tj.csv")
print(micro.head())

Interpret antibiotic susceptibility phenotype

phenotype_counts = micro["CLSI_2022_pheno"].value_counts()
print(phenotype_counts)
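
Link tables on the shared identifiers

The following sketch shows one way to attach demographics to culture results using the shared keys. It uses small in-memory frames with invented values so that it runs without the data files, and it assumes the demographics table holds one row per culture (keyed by anon_id and order_proc_id_coded); check this against the actual files before relying on the `validate` argument.

```python
import pandas as pd

# Toy stand-ins for microbiology_cohort_deid_tj.csv and demographics_deid_tj.csv.
micro_toy = pd.DataFrame({
    "anon_id": [1, 1, 2],
    "order_proc_id_coded": [10, 10, 20],
    "antibiotic": ["Ceftriaxone", "Cefepime", "Vancomycin"],
    "CLSI_2022_pheno": ["Resistant", "Susceptible", "Susceptible"],
})
demo_toy = pd.DataFrame({
    "anon_id": [1, 2],
    "order_proc_id_coded": [10, 20],
    "age": ["65-74", "25-34"],
    "gender": ["Female", "Male"],
})

# Left join keeps every AST row; validate raises if demographics has duplicate keys.
linked = micro_toy.merge(
    demo_toy,
    on=["anon_id", "order_proc_id_coded"],
    how="left",
    validate="many_to_one",
)
print(linked)
```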

Intended Uses

ARMD-MGB is intended to support research on:

  • Antimicrobial resistance (AMR) epidemiology, including longitudinal and organism-specific resistance trends.
  • Antibiotic stewardship and prescribing practices, by linking prior antimicrobial exposure to subsequent resistance outcomes.
  • Causal inference and predictive modeling, through integration of microbiological, clinical, and environmental covariates.
  • Cross-site and temporal benchmarking, when used in conjunction with ARMD-Stanford or other harmonized datasets.

Because ARMD-MGB adheres to a harmonized schema shared with ARMD-Stanford, analyses can be conducted across both datasets to assess reproducibility and generalizability across health systems.

Limitations

  • Geographic and institutional context: Data originate from the Mass General Brigham (MGB) healthcare system in New England; therefore, resistance patterns may differ from other regions.
  • Temporal heterogeneity: Changes in clinical practice and diagnostic platforms over the 10-year period may influence detection rates.

Release Notes

Version 1.0.0: Initial public release of the dataset.


Ethics

This study was approved by the Institutional Review Board (IRB) of the Mass General Brigham healthcare system, protocol 2017P000682.


Conflicts of Interest

The authors declare no conflicts of interest.


References

  1. Murray CJ, Ikuta KS, Sharara F, Swetschinski L, Aguilar GR, et al. Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis. Lancet. 2022;399(10325):629–55.
  2. CDC. Antibiotic Resistance Threats in the United States, 2019 [Internet]. 2019 Nov p. 1–148. Available from: https://www.cdc.gov/antimicrobial-resistance/media/pdfs/2019-ar-threats-report-508.pdf?CDC_AAref_Val=https://www.cdc.gov/drugresistance/pdf/threats-report/2019-ar-threats-report-508.pdf
  3. Haredasht FN, Amrollahi F, Maddali MV, Marshall N, Ma SP, Cooper LN, et al. Antibiotic Resistance Microbiology Dataset (ARMD): A Resource for Antimicrobial Resistance from EHRs. Sci Data. 2025;12(1):1299.
  4. Schechner V, Temkin E, Harbarth S, Carmeli Y, Schwaber MJ. Epidemiological Interpretation of Studies Examining the Effect of Antibiotic Usage on Resistance. Clinical Microbiology Reviews. 2013 Apr 3;26(2):289–307.
  5. Darby EM, Trampari E, Siasat P, Gaya MS, Alav I, Webber MA, et al. Molecular mechanisms of antibiotic resistance revisited. Nat Rev Microbiol. 2023;21(5):280–95.
  6. Gontjes KJ, Gibson KE, Lansing BJ, Mantey J, Jones KM, Cassone M, et al. Association of Exposure to High-risk Antibiotics in Acute Care Hospitals With Multidrug-Resistant Organism Burden in Nursing Homes. JAMA Netw Open. 2022;5(2):e2144959.
  7. Zhu YG, Zhao Y, Li B, Huang CL, Zhang SY, Yu S, et al. Continental-scale pollution of estuaries with antibiotic resistance genes. Nat Microbiol. 2017 Jan 30;2(4):16270.
  8. Anahtar MN, Yang JH, Kanjilal S. Applications of Machine Learning to the Problem of Antimicrobial Resistance: an Emerging Model for Translational Research. J Clin Microbiol. 2021;59(7):e01260-20.
  9. Stracy M, Snitser O, Yelin I, Amer Y, Parizade M, Katz R, et al. Minimizing treatment-induced emergence of antibiotic resistance in bacterial infections. Science. 2022;375(6583):889–94.
  10. Yelin I, Snitser O, Novich G, Katz R, Tal O, Parizade M, et al. Personal clinical history predicts antibiotic resistance of urinary tract infections. Nat Med. 2019 Jul 5;25(7):1143–52.
  11. Corbin CK, Sung L, Chattopadhyay A, Noshad M, Chang A, Deresinski S, et al. Personalized antibiograms for machine learning driven antibiotic selection. Commun Medicine. 2022;2(1):38.
  12. Kanjilal S, Oberst M, Boominathan S, Zhou H, Hooper DC, Sontag D. A decision algorithm to promote outpatient antimicrobial stewardship for uncomplicated urinary tract infection. Sci Transl Med. 2020;12(568):eaay5067.
  13. Jones N, Shih MC, Healey E, Zhai CW, Advani S, Smith-McLallen A, et al. Use of Machine Learning to Assess the Management of Uncomplicated Urinary Tract Infection. JAMA Netw Open. 2025;8(1):e2456950.
  14. Burkov A. The Hundred-Page Machine Learning Book. 1st ed. Andriy Burkov; 2019.
  15. National Database of Antibiotic Resistant Organisms (NDARO) - Pathogen Detection - NCBI [Internet]. [cited 2025 Nov 3]. Available from: https://www.ncbi.nlm.nih.gov/pathogens/antimicrobial-resistance/
  16. Alcock BP, Raphenya AR, Lau TTY, Tsang KK, Bouchard M, Edalatmand A, et al. CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database. Nucleic Acids Res. 2019;48(D1):D517–25.
  17. ResistanceMap [Internet]. [cited 2025 Nov 3]. Available from: https://resistancemap.onehealthtrust.org/
  18. CLSI. M-100 document. 2023.
  19. American Society for Microbiology. Writing your Paper: Abbreviations and Conventions. Available from: https://journals.asm.org/writing-your-paper#abbreviations
  20. Kind AJH, Buckingham W. Making Neighborhood Disadvantage Metrics Accessible: The Neighborhood Atlas. New England Journal of Medicine, 2018. 378: 2456-2458. DOI: 10.1056/NEJMp1802313.
  21. University of Wisconsin School of Medicine and Public Health. 2022 Area Deprivation Index v4.0.1. Downloaded from https://www.neighborhoodatlas.medicine.wisc.edu/ 05/15/2025
  22. Moore BJ, White S, Washington R, Coenen N, Elixhauser A. Identifying Increased Risk of Readmission and In-hospital Mortality Using Hospital Administrative Data. Med Care. 2017;55(7):698–705.

Access

Access Policy:
Only credentialed users who sign the DUA can access the files.

License (for files):
PhysioNet Credentialed Health Data License 1.5.0

Data Use Agreement:
PhysioNet Credentialed Health Data Use Agreement 1.5.0

Required training:
CITI Data or Specimens Only Research
