Retrieval Augmented Generation

Eeswar C.

Published Sep 1, 2023

Anyone in the industry who has used ChatGPT for business purposes has probably thought, "This is truly impressive! I appreciate how GPT can effectively address my inquiries. Now, the question is, how can I implement this for my own use? Can I train it using my specific data?"

Looking into this, one quickly runs into the costs and complexities of training, which raises the question of whether such an endeavor is feasible or even advisable. Most of us are not prepared to become direct competitors of OpenAI at this time.

Figure source: Lewis et al. (2021)

A group of Meta AI researchers introduced a methodology known as Retrieval Augmented Generation (RAG) to tackle tasks that require substantial knowledge. RAG merges an information retrieval component with a text generation model. This allows RAG to be fine-tuned and its internal knowledge to be adjusted efficiently without requiring a complete retraining of the entire model.

RAG takes an input and retrieves a set of relevant supporting documents from a given source. These documents are concatenated with the original input as context, and the result is fed to the text generation component to produce the final output. This makes RAG valuable in scenarios where factual information changes over time, which addresses a limitation of a language model's static knowledge. Instead of being completely retrained, the language model can access the most up-to-date information at generation time through retrieval.

The process of implementing RAG involves several steps:

Candidate Selection: The retrieval system identifies a set of text snippets that are potential candidates due to their relevance to the input context or query.

Scoring and Ranking: Each candidate snippet is assigned a score based on factors such as relevance and accuracy. The retrieval system arranges the candidate snippets in order of their scores.

Input Combination: The top-rated candidate snippets are combined with the original input context or query, creating an extended input that encompasses both retrieved text and the original input.

Generation Process: The extended input is fed into the generative model, which utilizes both the retrieved text snippets and the original input to generate the final text output.
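The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method from the paper or from any vendor: the word-overlap score stands in for a real relevance model, the function names (`retrieve`, `build_prompt`, `generate`) are hypothetical, and `generate` is a placeholder where a call to an actual language model would go.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query: str, snippet: str) -> float:
    """Jaccard word overlap, an assumed stand-in for a real relevance score."""
    q, s = tokenize(query), tokenize(snippet)
    return len(q & s) / len(q | s) if q | s else 0.0


def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    # Candidate selection: keep snippets that share at least one word with the query.
    candidates = [doc for doc in corpus if tokenize(query) & tokenize(doc)]
    # Scoring and ranking: order the candidates from highest to lowest score.
    ranked = sorted(candidates, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, snippets: list[str]) -> str:
    # Input combination: prepend the retrieved snippets to the original query.
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def generate(prompt: str) -> str:
    # Generation: hypothetical placeholder; a real system would call a language model here.
    return f"[model output for a prompt of {len(prompt)} characters]"


corpus = [
    "RAG combines a retriever with a text generation model.",
    "Vector stores hold document embeddings for similarity search.",
    "The retriever selects documents relevant to the query.",
]
query = "How does the retriever pick documents?"
snippets = retrieve(query, corpus)
prompt = build_prompt(query, snippets)
print(generate(prompt))
```

Keeping retrieval, prompt construction, and generation as separate functions means the scoring method or the model can be swapped without touching the other steps.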

Is it possible to construct such a system?

Leading cloud service providers like Microsoft and Amazon offer RAG solutions.

RAG with Azure Machine Learning:

In Azure Machine Learning, RAG is facilitated through integration with Azure OpenAI Service, making use of large language models and vectorization. This integration supports tools like Faiss and Azure Cognitive Search as vector stores, along with open-source offerings like LangChain for data chunking. Implementing RAG involves formatting data to enable efficient searchability before sending it to the Language Model, ultimately optimizing token consumption. Regularly updating the data is also crucial for maintaining RAG's effectiveness.
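To make the chunking step concrete, here is a small plain-Python sketch. It does not use LangChain or any Azure API; the function name `chunk_words` and its parameters are assumptions for illustration, and word counts are used only as a rough proxy for model tokens.

```python
def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears with some of its surrounding context.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(str(i) for i in range(12))
for chunk in chunk_words(sample, chunk_size=5, overlap=2):
    print(chunk)
```

Smaller chunks mean fewer tokens are sent to the model per retrieved passage, at the cost of less context in each one, so the two parameters are usually tuned together.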

RAG with Amazon SageMaker:

External data that enhances prompts can come from various sources like document repositories, databases, or APIs. The process involves converting documents and user queries into a compatible format for relevance searches. Embedding language models are used to transform the data into numerical representations, allowing comparisons. RAG models leverage these embeddings to combine user queries and relevant context, which is then fed to the foundation model. Knowledge libraries and their embeddings can be updated asynchronously.
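The embedding and comparison step can be illustrated as follows. This is a toy sketch and not SageMaker code: the `embed` function hashes words into a fixed-size count vector purely to mimic the interface of a learned embedding model, and the `KnowledgeLibrary` class, its methods, and the `DIM` constant are hypothetical names introduced for this example.

```python
import math
import re
import zlib

DIM = 64  # length of the toy embedding vector (an arbitrary assumption)


def embed(text: str) -> list[float]:
    """Toy embedding: hash each word into one of DIM buckets and count hits.

    A real system would call an embedding language model here instead.
    """
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class KnowledgeLibrary:
    """In-memory store of documents and their embeddings."""

    def __init__(self) -> None:
        self.docs: list[str] = []
        self.vectors: list[list[float]] = []

    def add(self, doc: str) -> None:
        # New documents can be embedded and added at any time,
        # independently of the queries being served.
        self.docs.append(doc)
        self.vectors.append(embed(doc))

    def search(self, query: str, top_k: int = 1) -> list[str]:
        # Embed the query the same way as the documents, then rank by similarity.
        q = embed(query)
        ranked = sorted(
            zip(self.docs, self.vectors),
            key=lambda pair: cosine(q, pair[1]),
            reverse=True,
        )
        return [doc for doc, _ in ranked[:top_k]]


library = KnowledgeLibrary()
library.add("Invoices are archived after ninety days.")
library.add("Support tickets are answered within one business day.")
print(library.search("How quickly are support tickets answered?"))
```

Because documents and queries pass through the same `embed` function, their vectors are directly comparable, which is the property the paragraph above relies on.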

The process is similar across platforms like AWS, Azure, and IBM, and open-source tools like Haystack can also achieve similar results.

The era of generative AI has unlocked numerous capabilities for existing systems. Two notable advancements are vector databases and retrieval augmented generation. This overview only scratches the surface of their potential, which includes building AI agents capable of processing various data types like text, images, videos, or audio. RAG and vector databases address the context-window limits of language models, bringing reasoning over historical knowledge to the forefront.
