Contributing Authors
Table of Contents
Introduced only in November 2024, model context protocol (MCP) has unleashed the potential of AI, as reflected by the explosive growth and exploration of AI agent ecosystems. Now implemented by many vendors and organizations, MCP provides a standardized way for AI agents, models, and tools to interact – but it also introduces new vulnerabilities that must be addressed to scale AI responsibly.
One of the most pressing vulnerabilities is indirect prompt injection. Attackers can embed malicious instructions in external resources (e.g., web pages and documents) that AI agents access. When the MCP server facilitates an interaction, malicious instructions embedded in external sources may alter or hijack agent behavior, exposing sensitive data or triggering unintended actions.
Identity is another frequently heard concern. MCP servers connect AI agents to downstream tools and models. These interactions can be with a single agent or multiple agents. However, without strong identity and access controls, MCP servers risk becoming overly permissive gateways, connecting users to the wrong tools and to the wrong data sets. Misconfigured permissions can grant agents excessive authority, violating the principle of least privilege.
Finally, MCP servers can also lack centralized policy enforcement. The absence of centralized policy enforcement creates not only inconsistencies, but headaches. One team may apply security filters diligently, while another bypasses them entirely, creating uneven levels of protection across the enterprise, resulting in a whack-a-mole scenario that can quickly deteriorate trust.
For enterprises deploying their own MCP servers, these security vulnerabilities must be addressed to deploy secure and private AI and agentic AI for the enterprise. To counter prompt injection, organizations need guardrails that inspect both inputs and outputs. Filters must detect anomalous tokens, malicious instructions, and unsafe outputs without degrading user experience. Ideally, redaction and replacement should occur in memory, so data never leaves the secure execution environment.
MCP servers must be bound into enterprise identity providers through OAuth or equivalent protocols. Access should be governed by universal tokens that enforce permissions consistently across all models and frameworks, confining that access to a single session to prevent permission drift. This creates a zero-trust environment where every single request is authenticated and authorized.
One method of securely running AI is to use ephemeral sandboxes. Rather than keeping MCP servers running indefinitely, low-latency containers spin up on demand for each request, connect to the downstream model or tool, and then terminate afterward. This limits attack persistence and ensures every session begins with a clean slate.
In addition to these efforts, agentic AI vendors should also provide immutable prompt monitoring with a clear way to send these logs directly to the organization’s security operations center. Coupled with detailed observability dashboards for latency, accuracy, cost, and drift, this approach enables proactive detection of bias or misuse. Finally, AI red-teaming against MCP servers and connected agents help to test defenses and continually monitor for weaknesses. In AI red-teaming, attack libraries and even agent-driven attacks can simulate adversarial conditions and reveal performance under realistic scenarios. AI red-teaming should not be a one-off exercise but a continuous exercise in any organization implementing AI.
MCP is essential for scaling AI agents, but its vulnerabilities are real: prompt injection, weak permissions, persistence risks, governance gaps, and inadequate monitoring. The blueprint for addressing these risks is clear: employ guardrails, enable zero-trust identity, use ephemeral sandboxing or similar, centralize policies, use immutable prompt logging, and employ continuous red-teaming.
For enterprises navigating the next wave of agent adoption, MCP is both powerful and secure – but only when managed correctly.
Message from the Sponsor
Airia provides enterprises with a secure way to deploy and govern AI at scale. The Airia MCP Gateway is designed to address emerging vulnerabilities in model context protocol (MCP) by enforcing centralized policies, zero-trust identity, and enterprise-grade security controls. With built-in guardrails and observability, organizations can connect AI agents, models, and tools while maintaining compliance and minimizing risk. To learn more about how Airia helps enterprises orchestrate and secure agentic AI, visit: https://airia.com/airia-launches-mcp-gateway/