Together AI’s Instant Clusters Enable Latent Health to Build Clinical AI That Outperforms GPT-4

7x lower training cost
97% clinical QA accuracy
2–3x faster experimentation
Executive Summary
Latent Health needed affordable, flexible training to build high-accuracy clinical AI for major health systems without slowing iteration.
They adopted Together AI’s Instant GPU Clusters (bare-metal H100s with SSH access, fast provisioning, and InfiniBand), which enabled multi-node RL and long-context training.
As a result, they achieved roughly 7x lower training cost ($14/hr vs. $98/hr), 97% clinical QA accuracy (beating GPT-4 and o3), and 2–3x faster experimentation, which supports review cycles that take minutes rather than hours across partner hospitals.
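The cost figure above is simple arithmetic on the two hourly rates quoted in this summary. A minimal sketch of that calculation follows; the function names are illustrative, and it assumes both rates cover comparable hardware over the same billing unit, which the summary does not spell out.

```python
def cost_ratio(baseline_per_hour: float, new_per_hour: float) -> float:
    """Return how many times cheaper the new hourly rate is than the baseline."""
    return baseline_per_hour / new_per_hour


def percent_saved(baseline_per_hour: float, new_per_hour: float) -> float:
    """Return the hourly saving as a percentage of the baseline rate."""
    return (baseline_per_hour - new_per_hour) / baseline_per_hour * 100


# Hourly rates quoted in the summary: $98/hr baseline vs. $14/hr.
BASELINE_RATE = 98.0
NEW_RATE = 14.0

ratio = cost_ratio(BASELINE_RATE, NEW_RATE)
saved = percent_saved(BASELINE_RATE, NEW_RATE)
print(f"{ratio:.1f}x lower hourly cost, {saved:.1f}% saved per hour")
```

Under those assumptions, the same ratio applies to any training run of fixed length, since both totals scale linearly with hours.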
About Latent Health
Latent Health automates critical healthcare workflows for 25 major health systems including UCSF, Northwestern, Yale, and Vanderbilt University Medical Center. Their AI platform analyzes patient charts and surfaces the clinical information pharmacists need for medication approvals, reducing review cycles from hours to minutes.
Founded by experts in ML research and healthcare operations with a mission to create "a provider for every patient," the 35-person company processes workflows for hundreds of pharmacists across major health networks. To compete against commercial foundation models on healthcare-specific tasks, Latent Health needed infrastructure that could support rapid, cost-effective model training.
The Challenge
Building highly accurate clinical question-answering models required overcoming critical infrastructure challenges.
The Solution
Latent Health chose Together Instant Clusters for clinical AI training, unlocking the performance and flexibility needed to outcompete commercial foundation models.
Results
Together AI's infrastructure enabled Latent Health to achieve breakthrough clinical AI performance while maintaining cost efficiency.
Real-World Clinical Impact
These technical improvements translate directly into measurable outcomes across Latent Health's health system partners. MetroHealth achieved an 80% reduction in review time (from 25 minutes to 5 minutes per prior authorization) alongside a 45% increase in submission capacity. Ochsner Health saw 75% faster review times and a 96% increase in monthly throughput per pharmacist, impacting over 20,000 specialty patients.
"Together Instant Clusters enabled us to build the intelligence backbone that's core to our business. Our partners will accept nothing less than state of the art so it is critical that we distill clinical nuance and the latest compendia into our models. Together AI's infrastructure gives us both the performance and experimentation freedom to make that happen." — Allan Bishop, Head of Engineering, Latent Health
Use case details
Products used
Highlights
- H100 clusters via SSH
- 7x lower training cost
- 97% clinical QA accuracy
- 2–3x faster experimentation
Use case
Rapid training for clinical QA models
Company segment
AI-native startup