Generative AI Security: Why Shared Responsibility Matters

Published on

June 5, 2025

Introduction

‍

As generative AI (Gen AI) continues its rapid ascent, enterprises grapple with new layers of complexity around safety, security, and compliance. Much like traditional cloud computing, where responsibilities are divided between “cloud providers” and “cloud customers,” Gen AI demands its own shared‐responsibility model — one that accounts for everything from the foundational pre‐trained model all the way to production‐ready AI agents interacting with users. In this article, we’ll unpack a multi‐layered framework that clarifies who “owns” which aspects of AI security and risk, from base‐model alignment to real‐world deployment.

‍

Why a Shared‐Responsibility Model Matters for Gen AI

‍

In classic cloud environments, the division of labor is relatively straightforward:

Cloud Providers ensure the physical data centers, hypervisors, host operating systems, and underlying network security are rock‐solid.
Cloud Customers take it from there: they configure guest OS security, manage applications, encrypt data, and set up identity‐and‐access controls.

But generative AI introduces new dimensions: large language models (LLMs) and agentic systems can both generate and act on information in ways that traditional applications never could. A misconfigured prompt or an agent that inadvertently calls an external API can lead to reputational damage, compliance violations, or even regulatory penalties.

‍

By mapping Gen AI responsibilities onto a layered structure—much like the classic cloud model—we can clearly delineate which party handles each security, safety, and compliance task. This alignment not only reduces risk but also helps get new AI capabilities into production faster, with fewer surprises.

‍

The Four Layers of Gen AI Responsibility

‍

Below is a high‐level breakdown of the four key layers in Gen AI, along with the corresponding parties responsible for each.

Layer	Primary Actors	Key Responsibilities
1. Foundation Model Development	Model Provider (e.g., OpenAI, Anthropic, Mistral)	Curate and vet massive training datasets, ensuring illegal or harmful content (e.g., hate speech, CSAM) is filtered out. Build base‐model architecture with robustness against adversarial data poisoning. Embed initial “alignment” techniques (e.g., RLHF) to reduce overtly toxic or undesirable outputs. Continuously monitor version updates, patch biases, and resolve known vulnerabilities.
2. Model‐as‐a‐Service (API) Layer	API Provider (could be same as Model Provider or a third party)	Host inference endpoints on secure infrastructure (network firewalls, DDoS protection, rate limiting). Provide baseline content filters to block disallowed queries (e.g., explicit content, known PII attacks). Maintain clear versioning, deprecation policies, and SLAs for availability. Offer an API‐level abuse‐detection system to throttle or block anomalous traffic patterns.
3. Application & Agent Integration	App/Agent Developer (Enterprise DevOps, Solution Teams)	Fine‐tune or prompt‐tune the base LLM on proprietary data—ensuring no leakage of sensitive PII or intellectual property. Implement domain‐specific safety guardrails: input sanitization, custom output filters, and red‐teaming scenarios tailored to the industry (e.g., finance, healthcare). Build “human‐in‐the‐loop” checkpoints for high‐risk AI actions (e.g., sending a transaction). Secure all third‐party tool integrations (e.g., if an agent can call an external CRM API, lock down API keys and enforce strict RBAC). Instrument comprehensive logging and monitoring—for both prompt/response pairs and any downstream tool calls.
4. Deployment & End‐User Governance	End Organization (Security, Compliance, Business Teams)	Define acceptable‐use policies for employees or customers (e.g., “No customer PII in free‐form prompts”). Conduct regular user training on “prompt hygiene,” phishing risks, and how to escalate suspected AI misuse. Continuously monitor production outputs with automated classifiers (e.g., bias, toxicity, PII leakage). Maintain audit trails and records of data flows (GDPR, HIPAA, CCPA compliance). Establish incident‐response playbooks (e.g., “What if an agent starts sending out SPAM?” or “What if the LLM leaks protected health information?”).

‍

Key takeaway:

‍

Layers 1 and 2 (Foundation Model + API) roughly correspond to the “provider side”—analogous to “physical infrastructure” in the cloud model.
Layers 3 and 4 (Application/Agent Integration + Deployment/Governance) align with the “consumer side”—analogous to the “guest OS, applications, and data” in the cloud model.
‍

Provider‐Side Responsibilities (Layers 1 & 2)

‍

Even before an enterprise writes a single line of code, much of the heavy lifting around AI safety and compliance falls on the model and API providers:

‍

Foundation Model Development

Training Data Curation
- Filter out illicit or harmful sources (hate speech, extremist content, CSAM).
- Vet for data poisoning attempts (malicious actors slipping adversarial examples into the dataset).
Initial Alignment & Bias Mitigation
- Use techniques such as Reinforcement Learning from Human Feedback (RLHF) to minimize overtly disallowed outputs.
- Regularly retrain or fine‐tune base models to patch emergent biases or vulnerabilities discovered in the wild.
Model Hardening
- Embed defense mechanisms against known adversarial attacks (e.g., prompt injections, jailbreaking).
- Stress‐test the model internally, simulating malicious queries to identify blind spots.

Model‐as‐a‐Service (API) Layer

Infrastructure Security
- Operate inference endpoints on hardened servers—firewalls, DDoS protection, network isolation.
- Implement rate limits and anomaly detection to block abusive or high‐volume query bursts.
Baseline Content Filtering
- Provide a “first line of defense” that automatically blocks blatantly disallowed prompts/outputs (e.g., explicit instructions to commit wrongdoing).
- Issue clear error codes and logs when a query is rejected, so integrators can understand why.
Versioning & Patch Management
- Publish change logs whenever safety filters are updated or a known vulnerability is patched.
- Communicate deprecation schedules years in advance, giving customers time to migrate to newer, more secure model variants.

Why it matters:

‍

Even if you’re building a highly specialized frontline application, your base model (Layer 1) and API (Layer 2) must already be free from egregious security and safety gaps. If the provider cuts corners on content filtering or ignores data hygiene, downstream integrations will struggle to remain compliant.

‍

Consumer‐Side Responsibilities (Layers 3 & 4)

‍

Once your organization obtains access to an LLM or agent framework, the baton passes to application developers and business teams to ensure domain‐specific safety and governance:

‍

Application & Agent Integration (Layer 3)

Data & Prompt Hygiene
- Scrub proprietary or regulated information from prompts. For example, avoid sending raw customer PII into the LLM without encryption or explicit masking.
- Verify that any fine‐tuning dataset has the necessary consent and contractual rights (e.g., GDPR or CCPA-compliant data processing).
Domain‐Specific Guardrails
- Implement filters that address your industry’s unique risks:
  - Finance: Block unlicensed “financial advice,” suspicious transaction prompts, or regulatory terms that could trigger an SEC audit.
  - Healthcare: Filter out direct “diagnosis” requests to avoid violating HIPAA or medical-practice regulations.
- Conduct systematic red‐teaming—simulate worst‐case prompt injections, reverse‐prompting, or chain‐of‐thought leaks. Build automated test suites that hammer these scenarios repeatedly.
Agent‐Specific Security (if building agents)
- Every time an agent calls an external API (CRM, payment gateway, email service), enforce strict API‐key management and role‐based access control (RBAC).
- Lock down intermediate reasoning: if your agent logs internal “thoughts” for debugging, ensure these logs are encrypted and cannot be exfiltrated.
- Add explicit kill switches or fallback conditions before irreversible actions (e.g., “If transaction > $10,000, require human approval”).
Monitoring & Alerting
- Instrument runtime logs that record prompt/response pairs (with sensitive data masked).
- Build dashboards with automated classifiers to surface potential policy violations—bias, toxicity, PII leaks, or misuse of regulated terminology.
- Set up real‐time alerts to security and compliance teams if suspicious behavior is detected (e.g., a sudden spike in disallowed‐content triggers).

‍

Deployment & End‐User Governance (Layer 4)

Policy & Governance
- Publish a clear “Responsible AI Use” policy for everyone: “Allowed: internal report summarization. Not allowed: generating customer credit-scoring predictions.”
- Define clear ownership for compliance audits: which teams will review logs, triage incidents, and update guardrails.
Training & Awareness
- Conduct regular training sessions. Teach employees “prompt hygiene” best practices, how to spot phishing attempts that leverage AI, and how to report suspicious AI outputs.
- Create quick‐reference guides (intranet wikis or playbooks) that clarify do’s and don’ts for interacting with AI tools.
Regulatory Compliance
- Maintain detailed audit trails. If your model provides medical or financial advice, store the input/output records for the mandated retention period (e.g., 3–7 years, depending on jurisdiction).
- Ensure proper data‐subject consent when storing or processing personal data. If customers’ data is used in fine‐tuning, you may need documented opt-ins.
Incident Response & Continuous Improvement
- Have a documented playbook: “Who to notify if an AI agent sends an email to unintended recipients” or “What to do if an LLM starts outputting disallowed content.”
- Regularly review flagged incidents, update your domain-specific filters, and feed learnings back to both your development team and, when applicable, to the model provider.
  ‍

Why it matters:

‍

Even if you trust that your LLM vendor has done everything right, a poorly configured prompt or lack of domain-specific guardrails can still lead to serious issues—a data breach, reputational harm, or regulatory fines. By treating Layers 3 and 4 with the same rigor as traditional application security, you get ahead of problems before they scale.

‍

Putting It All Together: A Simplified Responsibility Matrix

‍

Below is a concise mapping of who “owns” each core task, from data curation through user training:

Responsibility	Model/API Provider	App/Agent Developer	Organization / Policy Team
Training Data Curation	✔ (filter out harmful content)	—	—
Base Model Alignment (RLHF, etc.)	✔	—	—
Inference Infrastructure Security	✔ (rate limits, firewalls)	—	—
Fine‐Tuning & Prompt Tuning	—	✔ (ensure no PII/IP leaks)	—
Custom Input/Output Filtering	✔ (baseline filters)	✔ (domain‐specific filters)	—
Agent Tool‐Call Authorization	—	✔ (secure APIs, RBAC)	—
Runtime Monitoring & Logging	—	✔ (instrument logs & alerts)	—
Compliance Policy (GDPR, HIPAA, etc.)	—	—	✔ (define policy, audit owners)
User Training & Awareness	—	—	✔ (train staff, enforce policy)
Incident Response Playbook	—	—	✔ (document escalation procedures)

‍

The Role of Agents: Extra Complexity, Extra Care

‍

Unlike a simple text‐in/text‐out LLM, agentic systems can take actions: calling external APIs, interacting with databases, or even initiating transactions. This “agency” layer introduces additional responsibilities:

Tool‐Call Security
- Every external API call must be authenticated and authorized. For example, if an agent can issue a “funds transfer” request, you must enforce multi-factor checks or human approval for transfers above a certain threshold.
Internal Reasoning Logs
- Agents often keep a “chain of thought” to explain why they chose a particular action. Those logs must be encrypted and access-controlled to prevent privileged information from leaking.
Kill Switches & Fallbacks
- Embed a “stop‐gap” mechanism: if the agent encounters an ambiguous or potentially harmful request (e.g., “Send unauthorized emails to customers”), it should default to “Request human approval.”
Simulation & Sandbox Testing
- Before deploying any agent capable of real‐world actions, run it in a sandbox environment that mimics production. Simulate malicious prompts (e.g., “Buy Bitcoin with stolen credit card”) to ensure your guardrails hold.

Bottom line:

‍

Agents close the gap between “suggest” and “act.” That’s powerful, but it also raises the stakes. If your agent can execute trades, send invoices, or provision new cloud resources, then each of those actions needs its own security and compliance posture.

‍

Continuous Feedback Loops: Keeping Everything Aligned

‍

A true shared‐responsibility model isn’t static. As new vulnerabilities emerge—whether it’s a novel prompt injection technique or a regulatory change—you need a robust feedback mechanism:

‍

Provider ↔ Developer
- If your red team uncovers a new way to bypass the base model’s content filter, report it back to the LLM provider. They can update their safety layers, pushing patches to all clients.
- Conversely, when providers release new safety enhancements, you must test and integrate those updates into your application/agent pipelines.
Developer ↔ Organization
- If your monitoring system flags an unusual spike in disallowed‐content requests, your security team needs to work with developers to immediately adjust filters or temporarily shut down affected endpoints.
- When the compliance/legal team updates policies (e.g., new GDPR guidance), developers must revise prompt‐engineering guidelines and update audit‐logging configurations.
Organization ↔ Users
- Regularly gather user feedback—do employees feel confident that their prompts won’t leak sensitive data? Are customers noticing any inappropriate outcomes? This input helps refine training programs and policy clarity.
- If a compliance audit uncovers gaps (e.g., missing consent for data used in model training), update both policy and developer practices to close those gaps.

Key Takeaways & Best Practices

Recognize the Multi‐Layered Nature of Gen AI Risk
- Unlike traditional cloud apps, Gen AI requires distinct treatment at the foundation, API, application, and governance layers.
Divide and Conquer: Define Ownership Clearly
- Model/API providers handle data‐curation, initial alignment, and infrastructure security.
- Application/agent developers focus on domain‐specific guardrails, fine‐tuning hygiene, and securing downstream tool calls.
- Policy teams set organizational rules, train end users, and maintain compliance auditable records.
Agents Demand Extra Rigor
- Every action “move” your AI agent can make must be explicitly authorized and monitored. Build kill switches, sandbox tests, and encrypted reasoning logs as core requirements.
Adopt Continuous Monitoring & Feedback Loops
- Set up real‐time alerts for policy violations. Conduct periodic red‐team exercises. Feed findings back to both the LLM provider and internal teams to iteratively strengthen defenses.
Stay Ahead of the Regulatory Curve
- Keep an eye on evolving AI regulations (e.g., EU AI Act, proposed U.S. guidelines). Design your audit logs, data‐retention policies, and user training programs so that you can pivot quickly when new requirements emerge.

‍

Conclusion

‍

Generative AI unlocks unprecedented innovation but also multiplies security, compliance, and safety challenges across multiple layers. By adopting a shared‐responsibility model—one that mirrors the spirit of traditional cloud but accounts for LLM alignment, domain‐specific guardrails, and agentic actions—enterprises can confidently accelerate AI adoption while minimizing risk. Whether you’re a model vendor, an application developer, or part of an organization’s compliance team, understanding your slice of the shared‐responsibility pie is the first step toward unlocking AI’s transformative potential—safely and responsibly.

‍

Meet the Writer

Sahil Agarwal

Generative AI Security: Why Shared Responsibility Matters

Introduction

Why a Shared‐Responsibility Model Matters for Gen AI

The Four Layers of Gen AI Responsibility

Key takeaway:

Provider‐Side Responsibilities (Layers 1 & 2)

Foundation Model Development

Model‐as‐a‐Service (API) Layer

Why it matters:

Consumer‐Side Responsibilities (Layers 3 & 4)

Application & Agent Integration (Layer 3)

Deployment & End‐User Governance (Layer 4)

Why it matters:

Putting It All Together: A Simplified Responsibility Matrix

The Role of Agents: Extra Complexity, Extra Care

Bottom line:

Continuous Feedback Loops: Keeping Everything Aligned

Key Takeaways & Best Practices

Conclusion

More articles

Red Team Base and Instruct Models: Two Faces of the Same Threat

America’s AI Action Plan: Racing to Stay Ahead

A Partnership for Responsible AI: Truefoundry and Enkrypt AI