Securing Retrieval-Augmented Generation (RAG) Applications

What is RAG?

Retrieval-Augmented Generation (RAG) is a technique that lets Large Language Models (LLMs) access and use external data they were not originally trained on. In a RAG system, documents are chunked (broken into blocks of text of a designated size), and each chunk is passed through an embedding model to produce a vector. Some content is lost in the embedding process, but, when it works as intended, the content crucial for effective search is retained. When a query is made, the prompt is embedded into a vector by the same means, and the prompt vector is compared to the stored vectors to find the closest matches.
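The chunk-embed-retrieve flow above can be sketched in a few lines. This is a toy illustration only: real systems use a learned embedding model and an approximate-nearest-neighbor index, while here a bag-of-words count vector stands in for the embedding.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector.
    Real RAG systems use a learned embedding model; this stand-in
    only illustrates the chunk -> vector -> nearest-match flow."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str]) -> str:
    """Return the chunk whose vector is closest to the query vector."""
    qv = embed(query)
    return max(chunks, key=lambda c: cosine(qv, embed(c)))

chunks = [
    "Employees accrue vacation days monthly.",
    "The data center uses AES-256 encryption at rest.",
]
print(retrieve("How is data encrypted?", chunks))
```

In production the retrieved chunk is then appended to the prompt as context, which is exactly where the security questions discussed below arise.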

Unlike fine-tuning, which involves retraining an LLM on sensitive or proprietary data, RAG systems retrieve relevant information dynamically from external sources, ensuring a clear separation between the model and the knowledge it accesses. This separation makes RAG inherently more secure, as it reduces the risk of exposing sensitive data through model training.

It’s important to realize that while this post is about RAG as it is currently conceived, the principles of RAG security can be applied to any case where a language model is accessing an external data source.

State of the Art in RAG Security

The primary security considerations in RAG applications include securing object stores and vector databases from unauthorized access, implementing robust guardrails, and ensuring compliance with emerging AI security frameworks.

  • Prevent Object Store and Vector Database Breaches: Secure storage solutions must include strong encryption at rest and in transit, stringent IAM policies, and continuous monitoring for unauthorized access attempts.
  • Use AI Security Guardrails: Implement systems and policies that control what data can be retrieved and prevent sensitive information from leaking into model responses.
  • Compliance: Compliance for AI is a new and rapidly changing environment. In the Resources section below, there are links provided to current frameworks.
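A minimal guardrail of the kind the second bullet describes can be placed between retrieval and the model: scrub known sensitive patterns from retrieved text before it is used. The patterns below are illustrative, not a complete DLP rule set.

```python
import re

# Illustrative patterns only -- a real deployment would use a
# maintained DLP/classification service, not two regexes.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def apply_guardrail(text: str) -> str:
    """Redact sensitive patterns from retrieved text before it
    reaches the model or the user."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label}]", text)
    return text

print(apply_guardrail("Contact jane@example.com, SSN 123-45-6789."))
```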

Limitations of these Approaches: Current approaches control access at the level of entire systems: they ensure that only authenticated users in a single class can reach the application, and that the databases are accessed only by approved systems. They do not support granular controls on the individual pieces of data being accessed.

Key Resources for RAG Security

Why Hasn't More Been Done?

AI is evolving so quickly that it is easy to forget real RAG applications are only around two years old as of this writing. So far, RAG has primarily been applied to less sensitive data sources, reducing its attractiveness as a target for attackers; a typical use case is helping employees access HR documentation. As the use of RAG expands into sectors that handle proprietary and regulated data, security concerns will become more pressing, demanding stronger protections.

What is Needed for More Secure RAG Applications?

Current approaches are insufficient for RAG applications that include genuinely sensitive data.

IAM for Text

Each chunk of retrieved text should be properly tagged with ownership and sensitivity metadata. The best practice is to inherit security tags from the source document rather than attempting to determine which specific chunk contains what type of data. This ensures consistency and simplifies permission management.
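A sketch of tag inheritance, assuming a simple document model (the field names here are illustrative): every chunk copies its ownership and sensitivity tags from the source document at chunking time, rather than trying to classify each chunk individually.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    owner: str
    sensitivity: str  # e.g. "public", "internal", "restricted"

@dataclass(frozen=True)
class Chunk:
    text: str
    doc_id: str
    owner: str
    sensitivity: str

def chunk_document(doc: Document, size: int = 50) -> list[Chunk]:
    """Split a document into fixed-size chunks, each inheriting the
    ownership and sensitivity tags of its source document."""
    return [
        Chunk(doc.text[i:i + size], doc.doc_id, doc.owner, doc.sensitivity)
        for i in range(0, len(doc.text), size)
    ]

doc = Document("hr-001", "Salary bands for 2025..." * 5, "hr-team", "restricted")
for c in chunk_document(doc):
    assert c.sensitivity == "restricted"  # inherited, not re-derived per chunk
```

Because every chunk carries the same tags as its source, a permission change on the document can be propagated to all of its chunks mechanically.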

IAM for Vectors

Although vector representations of data are lossy, they still pose a significant security risk. The vector database used in a RAG system must be both searchable and capable of storing security metadata to enforce access controls. A RAG system that contains sensitive content should therefore use a vector database that supports IAM at the level of individual vectors.
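A sketch of what per-vector IAM looks like in practice, assuming a hypothetical index layout (the "vec"/"tags" field names are illustrative, not a specific product's API): the caller's clearances are checked against each vector's security metadata before similarity scoring, so unauthorized content never enters the candidate set.

```python
# Pre-filtered vector search: access control happens before ranking.
index = [
    {"vec": (0.9, 0.1), "tags": {"public"},     "text": "benefits overview"},
    {"vec": (0.8, 0.2), "tags": {"restricted"}, "text": "salary bands"},
]

def search(query_vec, clearances, k=1):
    """Rank only the vectors whose tags are covered by the caller's
    clearances; everything else is invisible to this caller."""
    candidates = [e for e in index if e["tags"] <= clearances]
    candidates.sort(
        key=lambda e: sum(q * v for q, v in zip(query_vec, e["vec"])),
        reverse=True,
    )
    return [e["text"] for e in candidates[:k]]

print(search((0.0, 1.0), clearances={"public"}))
print(search((0.0, 1.0), clearances={"public", "restricted"}))
```

The same query returns different results depending on the caller's clearances, which is exactly the granular control the system-level approaches above cannot provide.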

Prevent Model and Vendor Data Retention

Prompts, retrieved text, and vectors should not be retained or stored by the LLM provider or by the RAG system itself beyond the necessary session duration. If the data is sensitive, make sure that neither your model setup nor your LLM provider retains any prompt data beyond the current session. While models and LLM providers put guardrails on model output to try to limit the exposure of sensitive data, you should assume that anyone with access to a language model has access to all the data the model was trained on.
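On the application side, session-scoped handling can be made explicit. This is a minimal sketch, not a substitute for verifying your provider's actual retention policy: prompts and retrieved text live only in an in-memory buffer that is cleared when the session ends.

```python
from contextlib import contextmanager

@contextmanager
def ephemeral_session():
    """Hold prompts and retrieved chunks in memory for one session
    only, and discard them when the session closes."""
    buffer = []
    try:
        yield buffer
    finally:
        buffer.clear()  # nothing persists past the session

with ephemeral_session() as session:
    session.append("prompt: quarterly revenue figures?")
    session.append("retrieved: Q3 revenue was ...")
    # ... call the model with the session contents here ...
assert session == []  # nothing retained after the session closes
```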

The Path Forward

One doesn’t need to invent wholly new principles of cybersecurity for AI, but one does need to develop carefully constructed new ways to apply the existing ones. Zero-Trust is the most important of these. If “identity is the new perimeter,” as I’ve heard it expressed, then we are mainly talking about combining Non-Human Identity (NHI) with human identity, in the form of transitive identity with more granular access control. More work must be done to determine how to operationalize granular access controls; the IETF’s WIMSE (Workload Identity in Multi System Environments) working group is beginning to address this.
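The transitive-identity idea can be sketched as an access decision over a chain of identities. The ACL format and names here are illustrative: a retrieval request carries both the human user's identity and the workload's non-human identity, and a chunk is released only when every identity in the chain is authorized.

```python
# Illustrative per-chunk ACL keyed by both human and non-human identity.
chunk_acl = {
    "chunk-17": {"users": {"alice"}, "workloads": {"rag-service"}},
}

def authorize(chunk_id: str, user: str, workload: str) -> bool:
    """Zero-trust check: both the human identity and the workload
    (NHI) must be authorized for this specific chunk."""
    acl = chunk_acl.get(chunk_id)
    if acl is None:
        return False  # default deny for unknown chunks
    return user in acl["users"] and workload in acl["workloads"]

assert authorize("chunk-17", "alice", "rag-service")
assert not authorize("chunk-17", "alice", "batch-job")   # NHI not authorized
assert not authorize("chunk-17", "bob", "rag-service")   # user not authorized
```

The point of the sketch is the conjunction: a trusted service acting for an untrusted user, or an untrusted service acting for a trusted user, are both denied.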

As RAG adoption grows, so will the need for enhanced security measures. Organizations must proactively implement robust security frameworks to safeguard their data while leveraging the benefits of AI-driven retrieval systems.

More articles by Eugene Weiss
