RAG Against the Machine: How to get the most out of AI models


Large language models (LLMs) have taken the world by storm. That's unsurprising given their ability to answer questions on a broad range of topics and generate content at speed. However, it's becoming clear that the most valuable models to enterprises are not those that can recite the works of Shakespeare, but those that can provide accurate, domain-specific expertise.

In most cases, that means using industry- or company-specific data, something most organizations will be wary of plugging into a model. This is exactly where Retrieval Augmented Generation (RAG) frameworks come in.

Shane McAllister

Lead Developer Advocacy (Global) at MongoDB

Getting under the hood

RAG is a process that improves the accuracy, currency and context of LLMs such as GPT-4. It works by combining a pre-trained LLM with a retrieval component connected to readily accessible information. The retrieval system finds relevant information in a knowledge library, such as a database. This, in turn, is passed to the LLM, or foundation model, to provide a more informed and accurate natural language answer with the most current and relevant information for the task.
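The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a toy illustration rather than a production pattern: real RAG systems typically score relevance with vector embeddings in a dedicated database, whereas plain word overlap stands in here, and the documents and function names are invented for the example.

```python
def retrieve(query, documents, top_k=2):
    """Rank documents by how many words they share with the query (a naive scorer)."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_augmented_prompt(query, documents):
    """Combine retrieved context with the question before it reaches the LLM."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

# A tiny stand-in for a company's knowledge library.
knowledge_library = [
    "Policy 12 covers remote work expense reimbursement.",
    "The 2024 product roadmap prioritises the analytics dashboard.",
    "Office opening hours are 8am to 6pm on weekdays.",
]

prompt = build_augmented_prompt(
    "What does the product roadmap prioritise?", knowledge_library
)
```

The augmented prompt, not the bare question, is what gets sent to the foundation model, which is why the answer can reflect data the model never saw during training.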

RAG systems allow an LLM to refer to an authoritative source of knowledge outside the data set it was trained on, such as a company's proprietary data, without needing to be retrained or compromising the security of that data.

It is this information retrieval component that is at the heart of how RAG works, and how it's differentiated from general LLMs. Chatbots and other technologies that use natural language processing can massively benefit from RAG. And a variety of industries, especially those handling sensitive or specialized data, can begin to maximize the full potential of data-driven LLMs with RAG in their corner.

The best of both worlds

Using a RAG approach brings several benefits. One of the most important is the ability to make large language models more agile. Most language models have a defined training window that can go out of date quickly, but RAG allows volatile and time-sensitive data to be used in an LLM, such as developments in the news. As a result, RAG allows an LLM to be updated at the point of the user’s request, rather than requiring it to be entirely retrained with new data regularly.

RAG can also allow the model to be supplemented with sensitive data that cannot (and should not!) be used for the initial training of the LLM. RAG is particularly useful for any generative AI applications that work within highly domain-specific contexts: healthcare, financial services, and science and engineering, for example. Data in these domains tends to be sensitive, and there are various frameworks and regulations in place to safeguard its privacy, meaning training data is often sparse. In turn, RAG is essential to building useful generative AI tools in these industries.

As an example, consider electronic health records and medical histories. These contain sensitive information protected by privacy laws. While such records would never be included in the initial LLM training, RAG can integrate this data during runtime, allowing a healthcare professional to make queries about patients without compromising their data. This enables RAG applications to offer more precise and relevant responses to patient queries, enhancing personalized care and decision-making while maintaining data privacy and security.

Limitations to note

While RAG is a powerful approach, it's not a silver bullet. Its effectiveness depends on the quality of the retrieval system and the data being used. If the retrieval system fails to find accurate or relevant documents, the generated output can be incorrect. Similarly, the retrieval database must contain accurate, up-to-date, and high-quality documents to ensure responses are useful. RAG can substantially improve an LLM's accuracy, but it does not entirely eliminate the risk of AI hallucinations, or inaccurate responses.

And while RAG systems can draw on more up-to-date sources of information, they do not access the internet in real time. Instead, RAG requires pre-indexed datasets or specific databases that must be regularly updated as the underlying data evolves. However, it is usually still much easier to update this additional database than to retrain the foundational LLM.
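The maintenance point above can be made concrete: because the retrieval index is a separate, mutable store, new or corrected documents can be added at any time without touching the LLM itself. The simple inverted index below is purely illustrative, with made-up document IDs and text; production systems would instead re-embed updated documents into a vector database.

```python
from collections import defaultdict

class DocumentIndex:
    """A minimal keyword index standing in for a RAG retrieval database."""

    def __init__(self):
        self.documents = {}               # doc_id -> text
        self.inverted = defaultdict(set)  # word -> doc_ids containing it

    def upsert(self, doc_id, text):
        """Add a new document, or replace a stale one, without any retraining."""
        if doc_id in self.documents:      # drop the old document's postings first
            for word in self.documents[doc_id].lower().split():
                self.inverted[word].discard(doc_id)
        self.documents[doc_id] = text
        for word in text.lower().split():
            self.inverted[word].add(doc_id)

    def lookup(self, word):
        """Return the current text of every document containing the word."""
        return {self.documents[i] for i in self.inverted[word.lower()]}

index = DocumentIndex()
index.upsert("pricing", "Enterprise tier costs 50 per seat")
index.upsert("pricing", "Enterprise tier costs 60 per seat")  # data changed; index updated in place
```

After the second `upsert`, any query that retrieves the "pricing" document sees the new figure, which is exactly the agility the article contrasts with full retraining.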

A new frontier of generative AI applications

Given the use cases of RAG, we’re likely to see further research into hybrid models that combine retrieval and generation in AI and NLP. This could inspire innovations in model architectures leading to the development of generative AI capable of taking actions based on contextual information and user prompts, known as agentic applications.

RAG agentic applications have the potential to deliver personalized experiences, such as negotiating and booking the best deals for a vacation. The coming years will likely see advancements in allowing RAG models to handle more complex queries and understand subtle nuances in the data they retrieve.


This article was produced as part of TechRadarPro's Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro

Shane McAllister, Lead Developer Advocacy (Global) at MongoDB.
