ChatGPT Health 'Under-Triaged' Half of Medical Emergencies in a New Study
The study found ChatGPT Health under-triaged 51.6% of emergency cases and over-triaged 64.8% of nonurgent cases, raising concerns about its safety and reliability.
- Last week, an independent safety review published in Nature Medicine found that ChatGPT Health, OpenAI's health-focused chatbot, under-triaged 51.6% of emergency cases requiring immediate hospital care. The review used 60 patient simulations and compared the chatbot's recommendations against the consensus of three physicians.
- OpenAI says more than 40 million people use ChatGPT for health questions, including nearly 2 million weekly messages about insurance, heightening safety concerns as the company promotes tools for uploading medical records.
- Clinical vignettes showed that ChatGPT Health downplayed respiratory failure and diabetic ketoacidosis symptoms about 50% of the time, told a mock asthma crisis to wait, and responded inconsistently to suicidal ideation, including uneven referrals to the 988 crisis line.
- OpenAI responded that the study doesn't mirror typical use and that ChatGPT Health is continually updated, while researchers warned that missed emergencies such as diabetic ketoacidosis pose patient-safety risks, and clinicians such as John Mafi urged stronger safeguards.
- Experts recommend controlled trials and closer tech-health collaboration, stressing model training transparency and benchmarks, while clinicians see potential in rural and global health settings despite current risks.
13 Articles
ChatGPT Health Tool Isn't So Great in a Crisis
A tool billed as a way to plug your medical records into ChatGPT and receive health advice is drawing sharp warnings from researchers. In the first independent safety review of ChatGPT Health, published in Nature Medicine, the system underestimated the urgency of care in just over half of cases where...
A study shows that although the AI chatbot detects strokes well, it misjudges other emergencies roughly half the time. The tool also fails to detect suicide risks.
How well did ChatGPT Health triage emergencies?
Key findings from recent testing A structured evaluation found the health‑focused version of the large language model frequently underestimated how serious some emergency scenarios were, under‑triaging roughly half of the cases in the test. That means the tool routinely recommended a lower level…
NEWS: Study Finds ChatGPT Health Misses Half of Serious Medical Emergencies
As reported by Digital Health, more than half of serious medical emergencies were incorrectly assessed by ChatGPT Health in a new independent safety study.
Coverage Details
Bias Distribution
- 100% of the sources lean Left