Feb 19, 2025
LLM basics
What is Agentic RAG?
8 min

Discover how combining agents with RAG can make your AI workflows more context-aware and proactive.



Commonly referred to as RAG, Retrieval-Augmented Generation enhances LLM applications by adding context to prompts. RAG helps reduce hallucinations and produce more relevant outputs.  

But what if we could take it a step further?

Enter Agentic RAG: a powerful evolution in which AI agents are introduced into the retrieval and generation process. These agents can reason, plan, and utilize external tools, enabling them to dynamically manage retrieval strategies, refine context understanding, and adapt workflows for complex tasks.

In this post, we’ll break down the differences between naive RAG and agentic RAG, explore a real-world use case, and examine its benefits and limitations.

Taking context one step further

Augmenting a prompt with context is what makes LLMs actually useful—it’s the difference between a chatbot that just responds and one that understands. Imagine you’re building an internal knowledge base chatbot. With the right context, it can do more than just surface docs—it can guide employees to the right answers, freeing up support teams for bigger challenges.

[Figure: The RAG process]

For instance, a new employee could ask such a chatbot how often employee training occurs. Given the context of the company’s internal policies and training schedule, the chatbot can respond accordingly.

So let’s say the chatbot responds with how often employee training occurs, but now what? For a “naive” RAG chatbot, that might be it. While the knowledge of how often training occurs might be helpful to a new employee, it doesn’t necessarily assist them in actually signing up for and completing the training – if they even need that specific training in the first place.
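At its core, naive RAG is just "retrieve, then augment the prompt." The sketch below shows that flow with a toy keyword-overlap retriever and a stubbed `llm()` call; the document store, scoring function, and `llm()` are illustrative stand-ins, not any particular library's API (real systems would use embeddings and an actual model call).

```python
# Minimal naive RAG: retrieve relevant snippets, prepend them to the
# prompt, generate. Everything here is a hypothetical stand-in.

DOCS = [
    "Employee training occurs on the last Monday of every month.",
    "The company holiday calendar is published each January.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Toy relevance score: count of shared words.
    # Production systems would use embedding similarity instead.
    def score(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(docs, key=score, reverse=True)[:k]

def llm(prompt: str) -> str:
    # Stub so the example runs without an API key.
    return "Training occurs on the last Monday of every month."

def answer(query: str) -> str:
    context = "\n".join(retrieve(query, DOCS))
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)
```

Note that the pipeline always ends at `answer()`: the chatbot generates a response and stops, which is exactly the limitation described above.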

Agentic RAG goes beyond adding context—it brings AI agents into the workflow, enabling the chatbot to take action, make decisions, and adapt dynamically.

In the case of our internal knowledge base chatbot example, an AI agent could determine if the employee needs the training based on context such as the employee’s role, responsibilities, and access permissions. Then, based on this context, the agent can perform tasks such as providing the most updated training schedule or even signing up the employee for the next session. 

In other words, the agent can use context to decide what to do and has access to tools that enable it to execute on that decision.
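That "decide, then execute" loop can be sketched in a few lines. Here a simple rule stands in for the LLM's reasoning step, and the two tool functions (`get_schedule`, `enroll`) are hypothetical examples of the APIs an agent might call:

```python
# The agent uses context (the employee record) to decide which tool
# to run, then executes that decision. Tool functions are illustrative.

def get_schedule(employee: dict) -> str:
    return "Next session: last Monday of the month, 10 AM PST."

def enroll(employee: dict) -> str:
    return f"Enrolled {employee['name']} in the next session."

TOOLS = {"get_schedule": get_schedule, "enroll": enroll}

def decide(employee: dict) -> str:
    # In a real agent, an LLM would choose the tool from the context;
    # this rule is a stand-in for that reasoning step.
    return "enroll" if not employee["training_complete"] else "get_schedule"

def act(employee: dict) -> str:
    return TOOLS[decide(employee)](employee)
```

Swapping the hard-coded rule in `decide()` for an LLM call is what turns this from a rule-based system into an agent.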

Why Agentic RAG?

You might be asking yourself, “How is this different from traditional non-AI rule-based systems that use conditionals to determine next steps?” It’s an important question, and the answer comes down to three things:

  • Agentic RAG frees developers from having to write code for every possible “if-then.”
  • Agentic RAG is not limited to just the scenarios specified by developers – it can autonomously handle cases it has never encountered before by drawing on learned data.
  • Agentic RAG has the ability to learn from interactions, adapting and refining its performance over time.

While RAG improves LLM responses by reducing hallucinations and enhancing factual accuracy, it is limited to simply generating responses. Agentic RAG enables more intelligent automation that becomes more sophisticated and refined as it learns more. Agentic RAG is modular and scalable, reducing the need for human oversight.

Now that you have an understanding of the differences between naive and agentic RAG and how those differences can impact an AI-powered application, let’s dig a little deeper into these concepts and look at how we can implement them.

What is an AI agent?

While this concept and architecture are still rather new, the basic principle is this: An AI agent is a system that uses an LLM to dynamically direct its own processes and tool usage, maintaining control over how it accomplishes tasks within a given environment.

This is one working definition, and the field is developing too quickly for any definition to be final. The common thread is that these agents should do more on their own, at different levels and with varying degrees of autonomy. Some simply follow predefined steps, while others adapt, learn, and make decisions based on new information. The key idea is that they reduce the need for human intervention, whether by automating tasks, reasoning through problems, or collaborating with other systems to get things done.

In general, AI agents have the following characteristics:

  • They are goal-oriented, working towards a predefined objective and making decisions and executing on them depending on their assigned role.
  • They have tools such as access to APIs, upstream code functions, other applications, or databases that they can use to complete their objectives.
  • They can be adaptable and proactive with the ability to evolve as context changes, initiating or suggesting solutions without the need for explicit instructions.
  • They can be autonomous, operating independently with little to no human intervention.
  • They can have contextual awareness, meaning they can perceive and interpret their environment and evolve and adapt accordingly.
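These characteristics suggest a simple shape for an agent: a role, an objective, and a set of callable tools. The sketch below captures that shape as a small data structure; the names and fields are illustrative, not any framework's API.

```python
# A minimal agent structure: role, objective, and tools it can call.
# This is a conceptual sketch, not a production framework.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    role: str
    objective: str
    tools: dict[str, Callable] = field(default_factory=dict)

    def run(self, tool_name: str, *args):
        # A real agent would let an LLM pick the tool based on context;
        # here the caller selects it explicitly for clarity.
        return self.tools[tool_name](*args)

training_agent = Agent(
    role="Employee Training Agent",
    objective="Ensure employees complete required training",
    tools={"schedule": lambda: "Last Monday of the month, 10 AM PST."},
)
```

Frameworks differ in how they implement this loop, but most reduce to some version of "an objective plus a toolbox, with an LLM steering between them."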

In our example of an internal company knowledge base chatbot, you could have an AI agent with the role of “Employee Training Agent.” 

Its objective could be to make sure employees complete all necessary training and it could be equipped with tools such as access to employee databases, the company wiki, the employee training calendar, and an API that enables the agent to sign employees up for training sessions. 

The Employee Training Agent could leverage an LLM to evaluate an employee’s training query, determine if the employee needs training, provide a training schedule, and assist the employee in completing any outstanding training.

Putting it all together

So how can we combine the concepts behind RAG and AI agents to create a chatbot workflow that empowers employees and frees up human resources to deal with more complex issues? 

Let’s take a look.

1. Employee Initiates a Query

An employee queries an internal knowledge base chatbot: “How often does employee training occur?”

2. Router AI Agent Evaluates the Query

This query is evaluated by a Router AI Agent. The objective of the Router Agent is to evaluate the query and determine the appropriate agent to respond to the query. The Router Agent could be equipped with tools like access to an HR database and standard LLM prompting. Using context like the employee’s role, responsibilities, access, tenure, and chat history, the Router Agent decides what to do next and executes on its decision.

3. Router Agent Selects the Next Step
Let’s say that in this case, the Router Agent determines that given the employee’s role within the marketing organization and their recent start date, the Employee Training Agent is the most appropriate next step.

4. Employee Training Agent Takes Action

The Employee Training Agent has tools like access to the HR database, an employee calendar API, and a training calendar API. Without any human intervention, the Employee Training Agent not only enrolls the employee in the next brand guide training session, but also interacts with an Internal Knowledge Base Agent.

5. Internal Knowledge Base Agent Generates a Response

The Internal Knowledge Base Agent combines the employee’s original query, the outcomes of the Employee Training Agent, and context from the company’s internal knowledge base to prompt an LLM and generate a response.

6. Employee Receives a Personalized Answer

The employee receives the response: “Brand guide training is required for all employees in the marketing organization. The next training session is on the last Monday of the month at 10 AM PST. The training is now on your calendar. Let me know if this works for you. Is there anything else I can help you with?”
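The six steps above can be sketched as a pipeline of three agents: the router chooses the next agent, the training agent takes action, and the knowledge base agent composes the final answer. All functions here are illustrative stubs; a real system would back each with LLM calls and live HR/calendar APIs.

```python
# A conceptual sketch of the router -> training -> knowledge base flow.

def router_agent(query: str, employee: dict) -> str:
    # Steps 2-3: pick the next agent from the query and employee context.
    # A production router would use an LLM plus HR-database context.
    if "training" in query.lower():
        return "training"
    return "knowledge_base"

def training_agent(employee: dict) -> str:
    # Step 4: enroll the employee and report the outcome.
    return f"Enrolled {employee['name']} in the next brand guide session."

def knowledge_base_agent(query: str, outcome: str) -> str:
    # Step 5: combine the query, agent outcomes, and KB context.
    schedule = "last Monday of the month at 10 AM PST"  # from the KB
    return f"Training is on the {schedule}. {outcome}"

def handle(query: str, employee: dict) -> str:
    # Steps 1 and 6: take the query in, return a personalized answer.
    outcome = ""
    if router_agent(query, employee) == "training":
        outcome = training_agent(employee)
    return knowledge_base_agent(query, outcome)
```

The key structural point is that the router's output determines which agents run at all, so the workflow adapts per query rather than following one fixed path.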

Is agentic RAG always better than naive RAG?

While Agentic RAG enables more complex workflows, improves accuracy, and can adapt over time, it comes with trade-offs. The more AI involved, the more expensive it gets—each retrieval and generation step means paying for additional tokens. More retrieval also adds latency, so while the final output might be more refined than a naive RAG workflow, it could be too slow to be useful.

On top of that, implementing Agentic RAG isn’t always straightforward—it can quickly become complex, requiring careful planning to justify the added cost and engineering effort.

When considering whether or not to implement agentic RAG, it is important to carefully consider your use case, objectives, and available resources. The example with the knowledge base chatbot might be better suited for a larger company in which many new hires who require different training may be starting each week. In that case, it could make sense to invest in engineering resources for this particular project. For a smaller company with fewer and less regular new hires, it might just make more sense for an employee to email HR for the training schedule.

Regardless of the use case and its context, you will need to evaluate the models you use so you can adapt their performance to your needs (and not end up wasting money!). It is important to choose an AI framework that enables you to easily make testing part of your strategy.
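Making evaluation part of the workflow can start very simply: run a fixed set of test queries through the pipeline and check each answer against an expectation. The `pipeline()` function below is a hypothetical stand-in for your chatbot, and the pass/fail check is a bare substring match; real evaluation suites use richer metrics.

```python
# A minimal evaluation harness: test queries with expected substrings.

def pipeline(query: str) -> str:
    # Stand-in for the full agentic RAG pipeline.
    return "Training occurs on the last Monday of the month."

EVAL_SET = [
    ("How often does training occur?", "last Monday"),
]

def evaluate() -> float:
    # Fraction of eval cases where the expected text appears.
    passed = sum(expected in pipeline(q) for q, expected in EVAL_SET)
    return passed / len(EVAL_SET)
```

Running a harness like this after every prompt or model change makes regressions visible before they reach users.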

Agentic RAG Frameworks

Agentic RAG is still in its early days—there’s no standard playbook yet, but that doesn’t mean you can’t start experimenting.

As LLMs evolve, open-source frameworks are making agentic workflows more accessible, with better composability, integrations, and modular tooling. There are a few agentic RAG frameworks that help with this:

  • Vellum AI is designed to help developers build, evaluate, and deploy AI products with a focus on modularity and extensibility. It enables seamless integration of LLMs, retrieval mechanisms, and agentic workflows, making it easier to create and refine AI products for production.
  • Crew AI is best known for building AI agent teams that can collaborate to complete complex tasks and share tools among each other.
  • LangGraph focuses on building AI agent workflows with a graph-based approach, enabling structured, multi-agent collaboration.
  • LlamaIndex connects LLMs with external data sources, enabling efficient retrieval and structured querying.
  • Swarm is OpenAI’s experimental agentic framework that emphasizes a lightweight nature in order to be more flexible.

The future of AI-powered workflows

Agentic RAG represents a significant shift in the role of AI from a passive responder to an active problem-solver. By intelligently leveraging context, executing tasks, and continuously refining its approach, agentic RAG has the potential to transform how businesses and individuals interact with artificial intelligence.

However, implementation comes with trade-offs. Latency, cost, and complexity must be carefully considered. While agentic RAG isn’t the right fit for every use case, its ability to automate repetitive processes and enhance decision-making makes it an invaluable tool for those looking to push the boundaries of AI-driven efficiency.

As AI frameworks evolve, agentic RAG will only become more powerful and accessible. Now is the time to explore how it can fit into your workflows.

The future of AI isn’t just about retrieving information; it’s about taking meaningful action.

ABOUT THE AUTHOR
Liz Acosta
Technical writer

Liz Acosta is a film student turned social media manager and content creator turned engineer turned developer advocate. She loves pizza, plants, pugs, and Python. She is particularly interested in the intersection of tech and community, and one of her favorite things to do is find ways to turn complex concepts into casual conversation.
