Surfing in the dark — Hidden Dangers Lurking on Every Web Page


AI agents aren’t sci-fi anymore. They research topics, fill forms, compare prices, and handle entire workflows without supervision. OpenAI’s ChatGPT Agent can think, browse, and act on your behalf. Perplexity’s Comet automates routine tasks like reading emails and organizing research. Book a flight, renew a visa, cross-check receipts, then summarize the best option? All in one flow.
But here’s the problem: the web isn’t a controlled lab. Every page contains text, code, and markup that modern agents read as potential instructions. This makes them vulnerable to indirect prompt injection, where malicious sites hide “advice” that hijacks the agent’s mission.
Security experts like OWASP rank this as the top AI risk. Unlike traditional software, AI models process data and instructions together, not as separate channels. Defenses exist, but no perfect fix.
Why These Attacks Work
Instruction confusion is the core issue. Agents struggle to distinguish between “what my user asked” and “what this webpage suggests.” This ambiguity is the root of prompt injection vulnerabilities.
Cascading actions create a massive blast radius. Once agents can click, type, upload, and access private accounts, one bad instruction can spiral into data theft, unauthorized changes, or worse. The damage isn’t limited to a single webpage or action; it can spread across all your connected accounts and services. Microsoft treats this as unavoidable and recommends layered defenses.
Small model risks matter too. Smaller Language Models are cheaper and faster, but studies show they’re easier to trick than larger models, especially as autonomous agents.
Browser Agents vs Agentic Browsers
These sound similar but work differently:
- Browser agent: AI software that controls a regular browser (like an AI pilot)
- Agentic browser: A browser with AI features built in (like Comet, where AI is part of the dashboard)
Both face the same risk: they read untrusted pages and can be influenced by hidden instructions. Companies add guardrails, but attackers adapt quickly.
For the purpose of this blog, we’ll demonstrate just how vulnerable browser agents can be through our experiments.
Demo 1: The Email Heist

We built a fake vacuum review site and buried a malicious instruction in one review. The agent’s task was simple: read reviews and recommend the best vacuum.
The hidden instruction wasn’t visible to normal users, but our agent found it. Instead of reviewing vacuums, it opened webmail, searched for sensitive data, forwarded information to our test inbox, then tried to delete the evidence.


Demo 2: The Biased Recommendation

We also tested a subtle version that told the agent to recommend a one-star product over a five-star one. It did, confidently.

This was a controlled experiment with fake data and permission. The point: there’s little friction between “read webpage” and “execute risky actions” once agents trust page content over user intent.
What This Means for Builders
Treat every webpage as hostile.
If your agent reads it, it can influence behavior. Handle third-party content like untrusted code.
Control capabilities, not just prompts.
Limit what agents can do by default. Require user approval for risky actions like sending emails or making purchases.
Use least privilege.
Limit permissions, isolate data, and log everything. Build strict boundaries and filter outputs carefully.
Test against evolving attacks.
Static filters lag behind new techniques. Red-team your agent like a web app, not a toy.
Be careful with smaller models.
If switching to save costs, retest security. Smaller models show higher jailbreak success rates.
Consider proactive scanning tools.
Solutions like Enkrypt AI’s Asset Risk Scanner can scan webpages for policy violations and injection attacks before your agent interacts with them. These tools flag and block malicious content, adding another layer of protection.

The Bottom Line
Agentic browsing feels magical because it collapses hours of work into one instruction. But every webpage becomes a potential boss for your agent. If your system can’t tell who’s really in charge, it will follow the loudest voice, not the user.
Treat indirect injection as inevitable, not possible. Design for damage control, built with human oversight for critical decisions, and keep defenses adaptive and layered.
The future belongs to agents that are useful, not gullible.
References
- https://www.enkryptai.com/
- https://guardrails-ingestion-demo.vercel.app/
- https://medium.com/enkrypt-ai/small-models-big-problems-why-your-ai-agents-might-be-sitting-ducks-4cb9c03860a6
- https://cheatsheetseries.owasp.org/cheatsheets/LLM_Prompt_Injection_Prevention_Cheat_Sheet.html
- https://www.microsoft.com/en-us/msrc/blog/2025/07/how-microsoft-defends-against-indirect-prompt-injection-attacks/