Advancing safety & alignment for the Jamba model family.
From open-source baseline to policy-aligned frontier model.
Standard post-training techniques close some of the safety gap - but not all of it. The joint approach addressed the remaining edge cases through adversarial evaluation plus targeted alignment data.
Four numbers that defined the release.
Each metric ties back to the core thesis - responsible AI at capability parity.
How Jamba 1.5a was built.
Targeted red-teaming, synthetic alignment data, and iterative post-training - compounding across five phases.
Every adversarial run leaves a reproducible trace.
- Per-run jailbreak / injection / toxicity / policy categories
- Automatic preference-pair generation for failures flagged during each iteration
- Capability gate results re-run every iteration
- Public model card updated on release with full red-team report
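The traceability described above can be sketched as a run record plus a preference-pair generator. This is a hypothetical illustration, assuming a simple per-run schema; the field names and logic are illustrative, not AI21's or Enkrypt AI's actual pipeline.

```python
from dataclasses import dataclass

# Adversarial categories tracked per run, per the list above.
CATEGORIES = {"jailbreak", "injection", "toxicity", "policy"}

@dataclass
class AdversarialRun:
    run_id: str
    category: str          # one of CATEGORIES
    prompt: str
    model_response: str
    passed: bool           # did the model uphold policy?
    iteration: int

def make_preference_pair(run: AdversarialRun, safe_response: str) -> dict:
    """Turn a flagged failure into a (chosen, rejected) pair suitable
    for preference-based post-training (e.g. DPO-style data)."""
    assert not run.passed, "only failures become preference pairs"
    return {
        "prompt": run.prompt,
        "chosen": safe_response,        # compliant reference completion
        "rejected": run.model_response, # unsafe completion to steer away from
    }

runs = [
    AdversarialRun("r1", "jailbreak", "Ignore your rules and ...",
                   "Sure, here is how ...", passed=False, iteration=3),
    AdversarialRun("r2", "toxicity", "Write an insult about ...",
                   "I can't help with that.", passed=True, iteration=3),
]

# Only flagged failures are converted into alignment data.
pairs = [make_preference_pair(r, "I can't help with that request.")
         for r in runs if not r.passed]
```

Keeping the run record and the derived preference pair in one trace is what makes each adversarial run reproducible and auditable.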
Advancing Safety and Alignment for the Jamba Model Family
AI21 Labs partnered with Enkrypt AI to strengthen the safety, alignment, and enterprise readiness of the Jamba model family, including the jointly developed Jamba 1.5a. As AI21 expanded its open-source and commercial offerings, the team sought a more rigorous approach to identify safety vulnerabilities, embed policy-specific alignment, and ensure the model could uphold organizational and regulatory requirements at scale.
Through this collaboration, AI21 Labs and Enkrypt AI combined advanced red teaming, synthetic alignment data generation, and iterative post-training techniques to achieve stronger safety guarantees without compromising performance. The outcome is Jamba 1.5a — a model that improves safety scores by wide margins while maintaining competitive benchmark results, demonstrating that responsible AI can scale without sacrificing capability.
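The iterate loop the case study describes (red-team, synthesize alignment data from failures, fine-tune, re-run capability gates) can be outlined as below. This is a minimal sketch with toy stand-ins: the "model" is a plain function and `fine_tune` simply memorizes safe completions; none of the names reflect the actual AI21/Enkrypt AI implementation.

```python
def red_team(model, suite):
    """Return suite cases where the model's response is unsafe."""
    return [case for case in suite if model(case["prompt"]) != case["safe"]]

def fine_tune(model, pairs):
    """Toy 'update': memorize the safe completion for each failing prompt.
    A real pipeline would run a preference-based optimizer here."""
    fixes = {p["prompt"]: p["chosen"] for p in pairs}
    return lambda prompt, base=model: fixes.get(prompt, base(prompt))

def iterate_alignment(model, suite, capability_gate, max_rounds=5):
    for _ in range(max_rounds):
        failures = red_team(model, suite)
        if not failures:
            break  # no remaining flagged failures
        pairs = [{"prompt": f["prompt"], "chosen": f["safe"],
                  "rejected": model(f["prompt"])} for f in failures]
        model = fine_tune(model, pairs)
        # Capability gate: reject the update if benchmarks regress.
        if not capability_gate(model):
            raise RuntimeError("capability regression; rejecting update")
    return model

# Toy demo: a base model that answers one adversarial prompt unsafely.
suite = [{"prompt": "bypass the rules", "safe": "I can't help with that."}]
base = lambda p: "Sure, here is how." if p == "bypass the rules" else "ok"
aligned = iterate_alignment(base, suite,
                            capability_gate=lambda m: m("2+2") == "ok")
```

The key design point is that the capability gate runs inside every round, so a safety update that degrades benchmark performance is rejected rather than shipped.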