RAG Is Not Safe by Default: Vector and Embedding Weaknesses

Retrieval-augmented generation grounds your model in your data, and quietly turns your knowledge base into an attack surface. We cover retrieval poisoning, embedding attacks, and cross-tenant leakage, and how to defend the pipeline.

Retrieval-augmented generation is the architecture most teams reach for when they want a model to answer from their own data instead of inventing things. It works, and we recommend it. But somewhere in the rush to ground the model in a knowledge base, a quiet assumption takes hold: that because the data is yours, the retrieval step is safe. It is not. The moment your model answers from retrieved content, that content becomes part of its instructions, and your knowledge base becomes an attack surface. OWASP recognises this directly as Vector and Embedding Weaknesses, and we see the consequences in real deployments more than almost any other agentic-era risk.

Retrieval poisoning: the injection you invited in

The defining RAG risk is that retrieved content reaches the model's context and is treated as trusted, which is exactly the condition for indirect prompt injection. If an attacker can get malicious content into your knowledge base, into a document you ingest, a page you crawl, a record a user can submit, then when that content is later retrieved to answer a question, its hidden instructions go straight into the model's context. The model reads "ignore prior guidance and do X" as just another part of the grounding material and may well comply.

This is what makes RAG poisoning so effective: the malicious instruction does not arrive through the user's prompt, where you might be watching. It arrives through the "trusted" retrieval channel, the one you built specifically to feed the model authoritative content. The trust is the vulnerability.

The embedding and vector-store layer

Below the obvious poisoning risk are subtler ones in the vector layer itself.

Cross-tenant and access leakage. If your vector store does not enforce per-user or per-tenant access at retrieval time, a query can pull back embeddings, and therefore content, that the requesting user was never entitled to see. The model then includes it in the answer. This is one of the most common ways RAG systems leak, and it is an access-control failure dressed as a model behaviour.

Embedding manipulation and inversion. Embeddings are not opaque. Crafted content can be engineered to sit close to many queries in vector space, so that poisoned documents get retrieved far more often than they should. And in some setups, embeddings can be partially inverted to recover information about the source text, a confidentiality risk if your vectors encode sensitive data.

Stale and conflicting context. Retrieval can surface outdated or contradictory documents that lead the model to confident, wrong answers, a reliability problem that becomes a security problem when those answers drive decisions.

Why this is hard to lock down

We will be straight with you about the difficulty. RAG systems are dynamic by design, you want them to ingest new content continuously, which means the attack surface is constantly refreshed. You cannot manually vet every document that enters a living knowledge base. And the retrieval step is supposed to be trusted; the whole architecture depends on the model treating retrieved content as authoritative. Telling the model to distrust its own grounding material defeats the purpose. So the defense cannot live in the model. It has to live in the pipeline, at ingestion and at retrieval.

What to do about it

  • Scan content at ingestion. Treat everything entering the knowledge base as untrusted, and check it for injected instructions before it can ever be retrieved. The cheapest place to stop a poisoned document is before it is indexed.
  • Enforce access control at retrieval. The requesting user's entitlements must constrain what the vector store can return. Never rely on the model to withhold what retrieval handed it.
  • Scan retrieved context before it reaches the model. As a second line, evaluate the retrieved chunks for injection in the moment, catching anything that slipped past ingestion.
  • Mind provenance. Keep track of where retrieved content came from, so a "fact" sourced from untrusted user-submitted material is not weighted like one from a vetted internal document.

Frequently asked questions

If the data is internal, do we still need to worry? Yes. "Internal" is not the same as "clean." Documents enter knowledge bases from many sources, including user submissions, third-party feeds, and crawled pages, and any of those can carry an injection. And cross-tenant leakage is an internal-data problem by definition.

Isn't this just prompt injection again? RAG poisoning is a delivery mechanism for indirect prompt injection, yes. What is RAG-specific is that the delivery channel is your own retrieval pipeline, and the additional vector-layer risks, access leakage, embedding manipulation, are unique to this architecture.

Where is the highest-value control? Ingestion scanning plus retrieval-time access control. Together they stop the poisoned document from being stored and stop the unauthorised document from being returned, the two failures behind most RAG incidents.

How Promptention helps

We secure RAG where it is actually vulnerable: the pipeline. Our indirect-injection scanning evaluates content for hidden instructions, both as it enters your knowledge base and as it is retrieved to answer a query, so a poisoned document does not become a hijacked response. Paired with the retrieval-side access-control guidance we give teams, that lets you keep the grounding benefit of RAG without inheriting its quiet attack surface. Ground your model in your data, just not in an attacker's.

Promptention Guard scans content at ingestion and retrieval for indirect prompt injection, aligned to OWASP LLM08: Vector and Embedding Weaknesses.