A red team study on CBRN capabilities among frontier models

Models featured: Anthropic, OpenAI, Meta, Cohere, Mistral

The first systematic evaluation revealing alarming vulnerabilities in leading AI models' CBRN safety measures

Our red team study evaluated 10 frontier AI models against a novel dataset spanning the Chemical, Biological, Radiological, and Nuclear (CBRN) domains. The findings expose critical safety gaps that pose immediate risks to global security.

Key findings:
  • Safety mechanisms are brittle: persona-based attacks achieve an 81.7% success rate, versus 38.2% for direct queries
  • Performance varies widely across the industry: attack success rates range from 18.9% to 84.3% between leading models
  • Direct queries succeed alarmingly often: some models provide dangerous CBRN information 83% of the time when asked outright
  • Enhancement queries are especially effective: 8 of 10 models show attack success rates above 70%, reaching 92.9% in the worst case
  • Clear industry leaders and laggards, identified through a methodology grounded in the NIST AI Risk Management Framework

Get the report now!