AI outperforms doctors in Harvard trial of emergency triage diagnoses

https://www.theguardian.com/technology/2026/apr/30/ai-outperforms-doctors-in-harvard-trial-of-emergency-triage-diagnoses

Publish Date: 2026-04-30 14:00:00

Source Domain: www.theguardian.com

From George Clooney in ER to Noah Wyle in The Pitt, emergency department doctors have long been popular heroes. But will it soon be time to hang up the scrubs?

A groundbreaking Harvard study has found that AI systems outperformed human doctors in high-pressure emergency medicine triage, diagnosing more accurately in the potentially life-and-death moments when people are first rushed to hospital.

The results were described by independent experts as showing “a genuine step forward” in the clinical reasoning of AIs and came as part of trials that tested the responses of hundreds of doctors against an AI.

The authors said the results, published in the journal Science, showed large language models (LLMs) “have eclipsed most benchmarks of clinical reasoning”.

One experiment focused on 76 patients who arrived at the emergency room of a Boston hospital. An AI and a pair of human doctors were each given the same standard electronic health record to read – typically including vital sign data, demographic information and a few sentences from a nurse about why the patient was there. The AI identified the exact or very close diagnosis in 67% of cases, beating the human doctors, who were right only 50%-55% of the time.

The experiment showed the AI’s advantage was particularly pronounced in triage circumstances requiring rapid decisions with minimal information. The diagnostic accuracy of the AI – OpenAI’s o1 reasoning model – rose to 82% when more detail was available, compared with the 70%-79% accuracy achieved by the expert humans, though this difference was not statistically significant.

The AI also outperformed a larger cohort of human doctors when asked to provide longer-term treatment plans, such as prescribing antibiotic regimens or planning end-of-life processes. The AI and 46 doctors were asked to examine five clinical case studies and the computer made significantly better plans, scoring 89% compared with 34% for humans using conventional resources, such as search engines.
