PRISM-Health
2 Benchmarks
Clinical & Healthcare AI Evaluation
Rigorous evaluation of AI as a clinical agent — execution-grounded EHR workflows and medical audio reasoning, validated against board-certified clinician judgement.
Model Rankings
Success Rate · higher = better
Scatter Plot
Models above the diagonal are stronger at understanding than at executing. The closer a model sits to the diagonal, the better balanced it is as an agent.
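The diagonal reading can be made concrete with a small calculation. The sketch below assumes that "understanding" corresponds to the Query SR column and "executing" to the Action SR column of the Full Model Comparison table; the page does not state the axis mapping, so treat that as an assumption. The helper names (`balance_gap`, `side_of_diagonal`, `rank_by_balance`) are hypothetical and exist only for illustration.

```python
# Query SR and Action SR (percent) copied from the Full Model Comparison table.
# Assumption: the scatter plot compares these two columns.
RESULTS = {
    "claude-3.5-sonnet-v2": (80.0, 55.3),
    "gpt-4o-mini": (66.0, 58.0),
    "gemini-2.5-pro": (72.0, 50.7),
    "llama-4-maverick": (83.3, 33.3),
    "gemini-3-flash-preview": (58.0, 50.7),
    "llama-3.3-nemotron-49b": (78.7, 29.3),
    "deepseek-v3.2": (41.3, 55.3),
    "nova-premier-v1": (17.3, 1.3),
    "phi-4": (0.0, 0.0),
}


def balance_gap(query_sr, action_sr):
    """Signed gap in percentage points; positive means query-leaning."""
    return round(query_sr - action_sr, 1)


def side_of_diagonal(query_sr, action_sr):
    """Classify a model by which side of the query = action line it falls on."""
    gap = balance_gap(query_sr, action_sr)
    if gap > 0:
        return "query-leaning"
    if gap < 0:
        return "action-leaning"
    return "on the diagonal"


def rank_by_balance(results):
    """Return model names ordered from smallest to largest absolute gap."""
    return sorted(results, key=lambda m: abs(balance_gap(*results[m])))


if __name__ == "__main__":
    for model in rank_by_balance(RESULTS):
        q, a = RESULTS[model]
        print(f"{model:26s} gap={balance_gap(q, a):+6.1f}  {side_of_diagonal(q, a)}")
```

A small gap only indicates balance, not quality: phi-4 sits exactly on the diagonal because it scores 0 on both columns, so the gap should be read alongside the overall success rate.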
Full Model Comparison
| PROVIDER | MODEL | SUCCESS RATE | QUERY SR | ACTION SR | CORRECT | INCORRECT | AVG HISTORY | INVALID ACTION % |
|---|---|---|---|---|---|---|---|---|
| Anthropic | claude-3.5-sonnet-v2 | 67.7 | 80 | 55.3 | 203 | 76 | 4.19 | 7 |
| OpenAI | gpt-4o-mini | 62 | 66 | 58 | 186 | 114 | 4.14 | 0 |
| Google | gemini-2.5-pro | 61.3 | 72 | 50.7 | 184 | 114 | 4.4 | 0.7 |
| Meta | llama-4-maverick | 58.3 | 83.3 | 33.3 | 175 | 89 | 5.76 | 11 |
| Google | gemini-3-flash-preview | 54.3 | 58 | 50.7 | 163 | 137 | 4.35 | 0 |
| Nvidia | llama-3.3-nemotron-49b | 54 | 78.7 | 29.3 | 162 | 114 | 4.19 | 7.7 |
| DeepSeek | deepseek-v3.2 | 48.3 | 41.3 | 55.3 | 145 | 148 | 4.76 | 2.3 |
| Amazon | nova-premier-v1 | 9.3 | 17.3 | 1.3 | 28 | 79 | 4.89 | 64.3 |
| Microsoft | phi-4 | 0 | 0 | 0 | 0 | 0 | 2 | 100 |
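The columns in the table appear to be related by simple arithmetic, which the following sketch checks. It rests on two assumptions that the page does not state: that each model was scored on 300 tasks (inferred from the ratio of CORRECT to SUCCESS RATE), and that the overall success rate is the unweighted mean of Query SR and Action SR. The constant `ASSUMED_TASKS` and the function names are hypothetical.

```python
# Rows copied from the table: (model, success rate, query SR, action SR, correct).
ROWS = [
    ("claude-3.5-sonnet-v2", 67.7, 80.0, 55.3, 203),
    ("gpt-4o-mini", 62.0, 66.0, 58.0, 186),
    ("gemini-2.5-pro", 61.3, 72.0, 50.7, 184),
    ("llama-4-maverick", 58.3, 83.3, 33.3, 175),
    ("gemini-3-flash-preview", 54.3, 58.0, 50.7, 163),
    ("llama-3.3-nemotron-49b", 54.0, 78.7, 29.3, 162),
    ("deepseek-v3.2", 48.3, 41.3, 55.3, 145),
    ("nova-premier-v1", 9.3, 17.3, 1.3, 28),
    ("phi-4", 0.0, 0.0, 0.0, 0),
]

# Assumption: total task count per model, inferred from the table, not stated on the page.
ASSUMED_TASKS = 300


def rate_from_correct(correct, total=ASSUMED_TASKS):
    """Success rate in percent, rounded to one decimal like the table."""
    return round(100 * correct / total, 1)


def mean_of_splits(query_sr, action_sr):
    """Unweighted mean of the two per-category success rates."""
    return (query_sr + action_sr) / 2


def check_rows(rows, tolerance=0.1):
    """Return the models whose reported rate disagrees with either derivation."""
    mismatches = []
    for model, rate, query_sr, action_sr, correct in rows:
        from_count_ok = rate_from_correct(correct) == rate
        from_mean_ok = abs(mean_of_splits(query_sr, action_sr) - rate) <= tolerance
        if not (from_count_ok and from_mean_ok):
            mismatches.append(model)
    return mismatches


if __name__ == "__main__":
    print("rows that disagree with the assumptions:", check_rows(ROWS))
```

If both assumptions hold, every reported success rate can be reproduced from the CORRECT count, and a reader can recompute any single cell from the others when cross-checking the leaderboard.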
MED-ART · Sample Tasks