One compromised agent in a cooperating network can steer the rest, and a single bad decision can cascade across the whole system. Multi-agent architectures add two new risk classes that single-agent thinking misses. Here is how to contain them.
Single-agent systems are giving way to teams of agents: a planner that delegates to specialists, a researcher that hands findings to a writer, a swarm of workers coordinated by an orchestrator. The architecture is powerful because it decomposes hard problems. It is also a larger attack surface, because now the agents talk to each other, and the messages between them are a channel an attacker can target. The OWASP Top 10 for Agentic Applications names two of these risks directly, Insecure Inter-Agent Communication and Cascading Failures, and they are the ones single-agent thinking tends to miss.
The trust assumption that breaks
Most multi-agent systems are built on an implicit assumption: agents trust each other. The planner trusts the worker's result. The writer trusts the researcher's summary. That trust is what makes the collaboration efficient, and it is exactly the assumption an attacker wants. Compromise one agent, through an injected instruction in the content it processes, and you do not just own that agent. You own its outputs, which the other agents accept as trusted input. One poisoned node can steer the whole graph, because the other nodes were never designed to doubt it.
This is the multi-agent version of a lesson that keeps recurring in AI security: content that arrives through a "trusted" channel still has to be treated as untrusted. Between agents, the trusted channel is the message bus, and a message from a peer agent deserves the same scrutiny as a document from the open web, because a compromised peer is now an attacker.
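One way to apply that scrutiny is to accept a peer's output only as structured data that matches an expected shape. The sketch below assumes a researcher agent that returns JSON findings; the field names, the limits, and the `parse_findings` helper are illustrative assumptions, not part of any particular framework.

```python
import json
from dataclasses import dataclass

# Assumed limits for illustration; tune them for a real system.
MAX_FINDINGS = 20
MAX_CLAIM_CHARS = 500


@dataclass(frozen=True)
class Finding:
    claim: str
    source_url: str


class RejectedMessage(ValueError):
    """Raised when a peer message does not match the expected shape."""


def parse_findings(raw: str) -> list[Finding]:
    """Parse a peer agent's output as data, rejecting anything unexpected."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise RejectedMessage("not valid JSON") from exc

    # Extra top-level fields (for example an "instructions" key) are refused.
    if not isinstance(payload, dict) or set(payload) != {"findings"}:
        raise RejectedMessage("unexpected top-level fields")

    items = payload["findings"]
    if not isinstance(items, list) or len(items) > MAX_FINDINGS:
        raise RejectedMessage("findings must be a short list")

    findings = []
    for item in items:
        if not isinstance(item, dict) or set(item) != {"claim", "source_url"}:
            raise RejectedMessage("unexpected finding fields")
        claim, url = item["claim"], item["source_url"]
        if not isinstance(claim, str) or not isinstance(url, str):
            raise RejectedMessage("fields must be strings")
        if len(claim) > MAX_CLAIM_CHARS or not url.startswith("https://"):
            raise RejectedMessage("claim too long or URL not https")
        findings.append(Finding(claim, url))
    return findings
```

A shape check like this narrows what a compromised peer can send, but it does not prove the text inside a field is benign. The receiving agent still has to treat each `claim` as data to evaluate, not as an instruction to follow.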
Insecure inter-agent communication
The first concrete risk is the messages themselves. If communication between agents is unauthenticated, unvalidated, or unencrypted, several attacks open up. An attacker who can forge or inject a message can impersonate a legitimate agent and issue instructions the others will follow. An attacker who can read messages can harvest whatever sensitive data flows between agents. And an attacker who can tamper with a message in transit can quietly redirect the system's behavior without compromising any single agent at all.
The defenses are recognisable from distributed-systems security, applied to a new context: authenticate the sender of every inter-agent message so a forged one is rejected; carry identity with each request so an agent cannot be impersonated; encrypt and integrity-protect traffic between agents so it cannot be read or silently altered in transit; and validate message content rather than executing it on faith. The agents may trust each other's roles, but the system should verify each message.
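Sender authentication can be sketched with an HMAC tag over each message, using only Python's standard library. The agent names, the `AGENT_KEYS` table, and the envelope layout are assumptions made for illustration; a real deployment would load keys from a secret store and might prefer asymmetric signatures or mutual TLS.

```python
import hashlib
import hmac
import json

# Assumption: each agent has its own secret, provisioned out of band.
# Hard-coded demo values only; never embed real keys in source.
AGENT_KEYS = {
    "researcher": b"researcher-demo-key",
    "planner": b"planner-demo-key",
}


def _canonical(sender: str, body) -> bytes:
    """Serialize sender and body deterministically so both sides sign the same bytes."""
    return json.dumps({"sender": sender, "body": body}, sort_keys=True).encode()


def sign_message(sender: str, body: dict) -> dict:
    """Wrap a body in an envelope carrying the sender identity and an HMAC tag."""
    tag = hmac.new(AGENT_KEYS[sender], _canonical(sender, body), hashlib.sha256)
    return {"sender": sender, "body": body, "tag": tag.hexdigest()}


def verify_message(envelope: dict) -> dict:
    """Return the body if the tag matches the claimed sender, otherwise raise."""
    sender = envelope.get("sender")
    tag = envelope.get("tag")
    if not isinstance(sender, str) or sender not in AGENT_KEYS:
        raise PermissionError("unknown sender")
    if not isinstance(tag, str):
        raise PermissionError("missing signature")

    expected = hmac.new(
        AGENT_KEYS[sender], _canonical(sender, envelope.get("body")), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    if not hmac.compare_digest(expected.encode(), tag.encode("utf-8", "replace")):
        raise PermissionError("bad signature")
    return envelope["body"]
```

Because the sender name is part of the signed bytes, changing either the body or the claimed sender invalidates the tag. The sketch leaves out replay protection (a nonce or timestamp) and confidentiality: an HMAC shows who sent a message and that it was not altered, but it does not hide the contents.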
Cascading failures
The second risk is what happens after something goes wrong in one place. In a chain or graph of agents, a single bad decision (an error, a hallucination, a hijacked goal) does not stay local. It feeds the next agent, which acts on it and feeds the one after that. A small fault at the input becomes a large, coordinated failure at the output, with each agent faithfully amplifying the mistake of the last. The same structure that lets multi-agent systems break a problem into steps also lets a failure compound across those steps.
Containing cascades is an architecture problem more than a detection one. Put validation between stages so a bad result is caught before it propagates. Use circuit breakers that halt the chain when a step produces something anomalous, rather than passing it along. Avoid blind trust in upstream output for high-stakes actions. The goal is that one agent's failure is a contained incident, not a system-wide one.
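A minimal sketch of validation between stages with a breaker that halts the chain might look like this. The `GuardedPipeline` class, the stage tuples, and the validity checks are hypothetical names for illustration; what counts as "anomalous" is specific to each system and is passed in by the caller.

```python
from typing import Callable, List, Optional, Tuple

Stage = Callable[[str], str]   # an agent step: takes input, returns output
Check = Callable[[str], bool]  # returns True if the output looks acceptable


class CircuitOpen(RuntimeError):
    """Raised when the pipeline halts instead of passing a bad result on."""


class GuardedPipeline:
    def __init__(self, stages: List[Tuple[str, Stage, Check]]):
        self.stages = stages
        self.tripped_by: Optional[str] = None  # stage that opened the breaker

    def run(self, task: str) -> str:
        # Once open, the breaker stays open until someone reviews and resets it.
        if self.tripped_by is not None:
            raise CircuitOpen(f"breaker open since stage {self.tripped_by!r}")

        value = task
        for name, stage, is_valid in self.stages:
            value = stage(value)
            if not is_valid(value):
                # Stop here so later stages never see the bad result.
                self.tripped_by = name
                raise CircuitOpen(f"stage {name!r} failed validation")
        return value

    def reset(self) -> None:
        """Close the breaker again, for example after human review."""
        self.tripped_by = None
```

The design choice is that a failed check stops the run rather than logging and continuing, so downstream agents never act on the rejected output, and the pipeline refuses new work until `reset()` is called.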
A short checklist
| Risk | Control |
|---|---|
| Forged or injected inter-agent messages | Authenticate senders; carry and validate identity per message |
| Eavesdropping on agent traffic | Encrypt inter-agent communication |
| A compromised agent steering peers | Treat peer messages as untrusted input; validate, do not execute on faith |
| Cascading failure across the chain | Validation and circuit breakers between stages; no blind trust in upstream output |
| Over-broad reach of any single agent | Least privilege per agent, so one compromise is contained |
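The last row, least privilege per agent, can be sketched as a deny-by-default allowlist checked on every tool call. The tool names, agent roles, and `call_tool` helper below are hypothetical stand-ins, not a real framework's API.

```python
from typing import Callable, Dict, FrozenSet

# Hypothetical tools, stubbed out for illustration.
TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": lambda query: f"results for {query}",
    "read_file": lambda path: f"contents of {path}",
    "send_email": lambda body: f"sent: {body}",
}

# Each agent gets only the tools its role needs.
AGENT_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "researcher": frozenset({"web_search"}),
    "writer": frozenset({"read_file"}),
}


def call_tool(agent: str, tool: str, arg: str) -> str:
    """Run a tool on behalf of an agent, refusing anything not explicitly granted."""
    allowed = AGENT_PERMISSIONS.get(agent, frozenset())  # unknown agents get nothing
    if tool not in allowed:
        raise PermissionError(f"{agent!r} may not call {tool!r}")
    return TOOLS[tool](arg)
```

With this in place, a hijacked researcher can still run searches, but it cannot send email or read files, which bounds what a single compromise can reach.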
Frequently asked questions
If each agent is individually hardened, is the system secure?
Not necessarily. Multi-agent risk lives in the connections, not just the nodes. You can have well-defended agents and still be compromised through forged messages, eavesdropping, or a cascade that turns one agent's honest mistake into a system-wide failure. The communication layer needs its own controls.
Should agents trust each other at all?
They can trust roles while verifying messages. "This is the research agent" can be assumed if identity is authenticated; "this research result is safe to act on" should be validated. The distinction keeps the collaboration efficient without making it gullible.
How does this relate to single-agent prompt injection?
It is the same root cause, untrusted content reaching an agent that acts, applied to a new channel. In a multi-agent system, a compromised peer becomes the source of the untrusted content, so inter-agent messages join documents and web pages on the list of things that must be scrutinised.
Where Promptention fits
The principle we apply to a single agent (scan untrusted content before it can act) extends naturally to the spaces between agents, where one compromised node becomes the source of untrusted input for the rest. Pairing that with least-privilege identity per agent means a compromise stays contained instead of cascading. The collaboration stays fast; the trust between agents stops being a free pass.
Promptention's agentic defense covers the input, tool, and communication surfaces of multi-agent systems, aligned to OWASP ASI07 and ASI08.
Further reading: OWASP Top 10 for Agentic Applications (2026), "Insecure Inter-Agent Communication" and "Cascading Failures."
