AI Security in 2026: Prompt Injection, the Lethal Trifecta, and How to Defend

Introduction

Last year saw progress in agentic AI systems unimaginable only a short while ago. Along with these advances, however, have come new classes of security vulnerabilities, and examples of exploits in the wild continue to grow by the day. As the new year kicks off, let’s review what the core issue is, how it’s been exploited, and what you should be doing to protect yourself.

Prompt Injections – The Core Issue

The core issue remains the same as it has since the first LLMs – models have no ability to reliably distinguish between instructions and data. There is no notion of untrusted content – any content they process is subject to being interpreted as an instruction. To give you a sense of the magnitude of this problem and why it’s not going away anytime soon, OpenAI themselves even published a post recently admitting as much and calling it a frontier security challenge. Their research attempting to solve it goes back several years now, so it’s unlikely this problem goes away anytime soon.

The Lethal Trifecta

The problem was only magnified in 2025 as we saw more complex agentic systems being put into production that have access to tools like web search. Any ingested content can be interpreted by the model as an instruction. Simon Willison coined the term The Lethal Trifecta to describe the issue:

Access to private data — The agent can read your emails, documents, and databases
Exposure to untrusted tokens — The agent processes input from external sources (emails, shared docs, web content)
Exfiltration vector — The agent can make external requests (render images, call APIs, generate links)

If your agentic system has all three, it’s vulnerable. Period.

2025's Notable Attacks

Two of the more high-profile attacks of 2025 fell within this framework.

EchoLeak — Microsoft 365 Copilot

The first major zero-click agentic vulnerability to hit a production enterprise system. An attacker sends a crafted email to anyone in your organization. When any user later asks Copilot a question, it retrieves the poisoned email, executes the embedded instructions, and exfiltrates sensitive data via an image URL—all without a single click.

How it worked:

Attacker sends an email with a hidden prompt injection
Victim asks Copilot an unrelated question
Copilot’s RAG system retrieves the malicious email as context
Embedded instructions tell Copilot to search for sensitive data
Results are encoded in an image URL request to the attacker’s server
Browser “loads the image,” sending data to the attacker

GeminiJack — Google Gemini Enterprise

An attacker shares a Google Doc, sends a calendar invite, or emails someone in your organization. Hidden instructions get indexed by Gemini Enterprise’s RAG system. When any employee runs a routine search, the agent executes those instructions, searches across Gmail/Calendar/Docs for sensitive data, and exfiltrates via—you guessed it—an image URL. This was essentially the same attack as EchoLeak, but applied to Google’s stack instead of Microsoft’s.

Looking Ahead – A Framework for 2026

Agentic systems aren’t going away, so what can you do to protect yourself and your organization?

1. Map Your Blast Radius

Before you can secure agentic systems, you need to know:

What data sources can your agents access?

What’s the maximum damage if one is compromised?

Who can send content that gets indexed into RAG systems?

2. Implement the Principle of Least Privilege

Your agent probably doesn’t need access to all of Gmail, all of SharePoint, all of Slack, and all your databases simultaneously. Segment access:

Limit data sources to what’s actually needed

Implement per-user or per-role permissions

Avoid “helpful” defaults that grant broad access

3. Control Exfiltration Vectors

The “lethal” part of the lethal trifecta is often the easiest to address:

Block or heavily restrict external image loading in AI-generated responses

Implement Content Security Policy (CSP) controls

Monitor for unusual patterns of external requests

Consider sandboxing AI-generated output before rendering

4. Treat Agentic Systems Like Privileged Infrastructure

Agents with data access are effectively privileged users in your environment. Apply the same rigor you’d use for service accounts:

Audit access patterns

Log all queries and responses

Alert on anomalous behavior

Regular security assessments

5. Monitor the MCP Ecosystem

If you’re using MCP-connected tools:

Audit which MCP servers you’re connecting to

Never expose MCP servers to untrusted networks

Keep mcp-remote and related tooling updated

Review tool descriptions for hidden instructions (tool poisoning)

Final Thoughts

2025 established that AI security isn’t a theoretical concern—it’s an operational reality. 2026 will bring more of the same, plus new attack surfaces as agentic AI systems gain more autonomy, more tool access, and more integration into critical workflows. Organizations that understand the fundamental patterns—rather than chasing individual branded vulnerabilities—will be better positioned to defend against whatever comes next.

AI Platform Overview

AI Security in 2026: Prompt Injection, the Lethal Trifecta, and How to Defend

Introduction

Prompt Injections – The Core Issue

The Lethal Trifecta

2025's Notable Attacks

EchoLeak — Microsoft 365 Copilot

GeminiJack — Google Gemini Enterprise

Looking Ahead – A Framework for 2026

1. Map Your Blast Radius

2. Implement the Principle of Least Privilege

3. Control Exfiltration Vectors

4. Treat Agentic Systems Like Privileged Infrastructure

5. Monitor the MCP Ecosystem

Final Thoughts

Recommended resources

26 AI Resolutions for CIOs in 2026

The Rise of Clawdbot and the Shadow AI Problem Security Teams Aren’t Ready For

Airia Launches Agent Constraints: Industry’s First Policy Engine to Enable Centralized AI Agent Governance

AI Platform Overview

Introduction

Prompt Injections – The Core Issue

The Lethal Trifecta

2025's Notable Attacks

EchoLeak — Microsoft 365 Copilot

GeminiJack — Google Gemini Enterprise

Looking Ahead – A Framework for 2026

1. Map Your Blast Radius

2. Implement the Principle of Least Privilege

3. Control Exfiltration Vectors

4. Treat Agentic Systems Like Privileged Infrastructure

5. Monitor the MCP Ecosystem

Final Thoughts

Recommended resources

26 AI Resolutions for CIOs in 2026

The Rise of Clawdbot and the Shadow AI Problem Security Teams Aren’t Ready For

Airia Launches Agent Constraints: Industry’s First Policy Engine to Enable Centralized AI Agent Governance

Prompt Injections – The Core Issue

Looking Ahead – A Framework for 2026