Publications

No Free Lunch with Guardrails
Benchmarks show that stronger guardrails improve safety but can reduce usability. The paper proposes a framework for balancing this trade-off, supporting practical, secure LLM deployment.
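As a rough illustration of the kind of trade-off studied here (not the paper's actual framework), the hypothetical sketch below scores a guardrail by how many harmful prompts it blocks versus how many benign prompts it still allows; the function name and weighting are assumptions for illustration only.

```python
# Hypothetical illustration of the safety/usability trade-off a guardrail
# introduces; the scoring scheme is an assumption, not the paper's framework.

def evaluate_guardrail(blocked_harmful, allowed_benign, alpha=0.5):
    """Score a guardrail from its block/allow decisions.

    blocked_harmful: list of bools, True if a harmful prompt was blocked.
    allowed_benign:  list of bools, True if a benign prompt was allowed.
    alpha: assumed weight balancing safety against usability.
    """
    safety = sum(blocked_harmful) / len(blocked_harmful)
    usability = sum(allowed_benign) / len(allowed_benign)
    combined = alpha * safety + (1 - alpha) * usability
    return {"safety": safety, "usability": usability, "combined": combined}


# A stricter guardrail blocks more harmful prompts but also over-refuses
# benign ones, so its combined score need not improve.
strict = evaluate_guardrail([True] * 95 + [False] * 5, [True] * 70 + [False] * 30)
lenient = evaluate_guardrail([True] * 70 + [False] * 30, [True] * 95 + [False] * 5)
print(strict, lenient)
```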

Investigating Implicit Bias in LLMs
A study of 50+ models reveals that bias persists, and sometimes worsens, in newer models. The work calls for standardized benchmarks to prevent discrimination in real-world AI use.

VERA: Validation & Enhancement for RAG
VERA improves Retrieval-Augmented Generation by refining both the retrieved context and the generated output, reducing hallucinations and improving response quality across open-source and commercial models.
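A minimal sketch of the validate-then-refine idea described above, not VERA's actual implementation: `call_llm`, the prompt wording, and the three-step structure are placeholders assumed for illustration.

```python
# Sketch of a validate-and-refine RAG wrapper in the spirit of VERA; not the
# paper's method. `call_llm` stands in for any open-source or commercial model.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in a chat-completion call here")


def answer_with_refinement(question: str, retrieved_chunks: list[str]) -> str:
    context = "\n".join(retrieved_chunks)
    # Step 1 (assumed): keep only retrieved passages relevant to the question.
    refined_context = call_llm(
        "Keep only the passages relevant to the question.\n"
        f"Question: {question}\nPassages:\n{context}"
    )
    # Step 2: draft an answer grounded in the refined context.
    draft = call_llm(
        f"Answer using only this context:\n{refined_context}\n\nQuestion: {question}"
    )
    # Step 3 (assumed): check the draft against the context and revise
    # any unsupported claims before returning the final answer.
    return call_llm(
        f"Context:\n{refined_context}\n\nDraft answer:\n{draft}\n\n"
        "Remove or correct any claim not supported by the context, "
        "then return the revised answer."
    )
```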

Fine-Tuning, Quantization & Safety
Fine-tuning increases jailbreak vulnerability, while quantization has mixed effects on safety. Our analysis underscores the importance of strong guardrails in deployment.

SAGE-RT: Synthetic Red Teaming
SAGE enables scalable synthetic red-teaming across 1,500+ harmfulness categories, achieving a 100% jailbreak success rate against GPT-4o and GPT-3.5 in key scenarios.
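To make the category-driven generation concrete, here is a hypothetical sketch of expanding a harmfulness taxonomy into synthetic red-team prompts; the example categories, prompt template, and `call_llm` placeholder are assumptions, not SAGE's actual pipeline.

```python
# Hypothetical sketch of category-driven synthetic red-teaming in the spirit
# of SAGE-RT; taxonomy, template, and generator call are illustrative only.

EXAMPLE_CATEGORIES = ["malware authoring", "phishing", "self-harm encouragement"]


def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in the prompt-generator model here")


def generate_red_team_prompts(categories, per_category=3):
    """Expand each harmfulness category into synthetic adversarial test prompts."""
    suite = {}
    for category in categories:
        suite[category] = [
            call_llm(
                f"Write adversarial test prompt #{i + 1} probing the category: "
                f"{category}. Return only the prompt."
            )
            for i in range(per_category)
        ]
    return suite
```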