A red team study on CBRN capabilities among frontier models

Models featured: Anthropic, OpenAI, Meta, Cohere, Mistral

The first systematic evaluation revealing alarming vulnerabilities in leading AI models' CBRN safety measures

Our red team study evaluated 10 frontier AI models against a novel dataset spanning the Chemical, Biological, Radiological, and Nuclear (CBRN) domains. The findings expose critical safety gaps that pose immediate risks to global security.

Key findings:
  • Safety mechanisms are brittle: persona-based attacks achieve an 81.7% success rate, versus 38.2% for direct queries
  • Performance varies widely across the industry: attack success rates range from 18.9% to 84.3% between leading models
  • Direct queries succeed alarmingly often: some models provide dangerous CBRN information 83% of the time when asked outright
  • Enhancement queries are especially effective: 8 of 10 models show attack success rates above 70%, reaching 92.9% in the worst case
  • Clear industry leaders and laggards, identified through a methodology grounded in the NIST AI Risk Management Framework

Get the report now!