Search...

Index

Inline evaluation / Guardrails: Ensure good system performance at run-time

This is some text inside of a div block.

How can I use LLMs to classify user intents for my chatbot?

Learn how to build and evaluate intent handler logic in your chatbot workflow

Author

Anita Kirkovska

Jan 11, 2024

Creating an AI chatbot involves more than just using a model API and adding context with your data.

You should also consider the various intents users might have and how to manage them.

In this blog post, we'll outline the steps to set up and evaluate intent detection in your chatbot workflow.

‍

What Is Intent Detection?

Intent detection is a task that identifies a user's intention or desired outcome from their query. This task is essential for AI chatbots to provide accurate and relevant responses.

A very standard intent detection workflow uses a pre-trained LLM model, a prompt with instructions and context, and a handler logic for each of the intents.

‍

Why do you need it?

Let’s look at a real example to illustrate the need for this step in a chatbot workflow.

Imagine that you’re building a customer service chatbot for an e-commerce business. Your visitors can ask the chatbot any question. Before the chatbot responds or takes any action, it needs to accurately understand the visitor's intent.

To keep this example simple, lets say that you have defined 5 intent categories: "Order Status", "Product Information", "Payments", "Returns" and "Feedback”.

For each category, there should be a distinct step where the LLM powered chatbot, figures out the user's intent. It does this by placing the user's question into the right category. After identifying the intent, the chatbot can then take the next appropriate actions for that particular category.

Having separate steps for the prompts and intent handlers is useful because each of your intents might need to do different actions. For example: “Returns” might need to be handled by an external service/API that a handler action should call, and the handler for “Product information” might just call an LLM and a context doc to answer with text response. Also, adding too many instructions in one prompt can also influence the performance.

Identifying these intents accurately allows the chatbot to respond better, call an external API or route the query to the correct personnel for further assistance.

In the next section we show you how to implement it.

‍

How can you build intent detection for your chatbot?

To build a reliable intent detection for your chatbot you need to cover 4 critical steps:

Defining intents
Setting up the intent detection prompt
Setting up handler logic prompts
Testing and evaluating prompts/models

We give more details on these in the following sections.

Define intents

Identify the main reasons users interact with your chatbot, like asking for help or making a purchase.

Group these reasons into categories, or 'intents', using insights from customer interactions and FAQs.

‍

Write the intent detection prompt

After you have your intents, you should start drafting the system prompt that will be used to classify the user’s query. Make sure to give clear directions, and follow best prompt engineering practices. Here’s a simple example for the system prompt:



🤖 System Prompt:

You’re a LLM that detects intent from user queries. Your task is to classify the user's intent based on their query. Below are the possible intents with brief descriptions. Use these to accurately determine the user's goal, and output only the intent topic.

- Order Status: Inquiries about the current status of an order, including delivery tracking and estimated arrival times.

- Product Information: Questions regarding product details, specifications, availability, or compatibility.

- Payments: Queries related to making payments, payment methods, billing issues, or transaction problems.

- Returns: Requests or questions about returning a product, including return policies and procedures.

- Feedback: User comments, reviews, or general feedback about products, services, or experiences.

- Other: Choose this if the query doesn’t fall into any of the other intents.

💬 User Query:
I would like to check my last order.

🤖 Response:
Order status.

* For more reliable outputs you can also consider using function calling with your models, so that you always get a structured response from the model, one that can be used to run specific functions in your code. Learn how to set it up here.

Don’t forget to add a fallback option!

Did you notice that we added a fallback intent “Other” in the system prompt?

Adding fallback prompts is essential for handling situations where the chatbot fails to understand or correctly classify a user's intent.

Fallback prompts act as a safety net to keep users engaged, even when their query isn't a clear match. They can involve clarifying questions, rephrasing the query, or offering human assistance.

‍

Set up handler logic

For each intent, you need to develop a response mechanism, and decide if the chatbot should perform an action like calling an API to another tool/service, or to just provide a text response.

To implement the handler logic you’ll need to build a more complex LLM chain, and use other prompts to provide text responses or to call an API to perform a specific action with external tools and services.

‍

Test and evaluate prompts accross test cases

Testing and evaluating prompts across test cases is crucial for building reliable chatbots.

Before you push it to production, you need to be sure that your intent classifiers and handlers are working properly. You should test every intent path with various user queries and evaluate the performance of different prompt and model combinations.

For example, we recently did an experiment where we used four models and few-shot prompts to classify if a customer support ticket has been resolved or not. We had around 200 test cases, and used Vellum to evaluate our configuration at scale. Below you can see how that looked like in the product.

Let us help!

Building an intent classifier is not just a one-time setup; it’s a continuous process that requires extensive evaluation and monitoring once in production.

If you’re planning to build your custom chatbot and need assistance with the setup and evaluation, we can help.

Vellum’s platform for building production LLM apps can help you build a reliable chatbot. We provide the tooling layer to experiment with prompts and models, evaluate at scale, monitor them in production, and make changes with confidence if needed.

If you’re interested, you can book a call here. You can also subscribe to our blog and stay tuned for updates from us.

Creating an AI chatbot involves more than just using a model API and adding context with your data.

You should also consider the various intents users might have and how to manage them.

In this blog post, we'll outline the steps to set up and evaluate intent detection in your chatbot workflow.

‍

What Is Intent Detection?

Intent detection is a task that identifies a user's intention or desired outcome from their query. This task is essential for AI chatbots to provide accurate and relevant responses.

A very standard intent detection workflow uses a pre-trained LLM model, a prompt with instructions and context, and a handler logic for each of the intents.

‍

Why do you need it?

Let’s look at a real example to illustrate the need for this step in a chatbot workflow.

To keep this example simple, lets say that you have defined 5 intent categories: "Order Status", "Product Information", "Payments", "Returns" and "Feedback”.

Identifying these intents accurately allows the chatbot to respond better, call an external API or route the query to the correct personnel for further assistance.

In the next section we show you how to implement it.

‍

How can you build intent detection for your chatbot?

To build a reliable intent detection for your chatbot you need to cover 4 critical steps:

Defining intents
Setting up the intent detection prompt
Setting up handler logic prompts
Testing and evaluating prompts/models

We give more details on these in the following sections.

Define intents

Identify the main reasons users interact with your chatbot, like asking for help or making a purchase.

Group these reasons into categories, or 'intents', using insights from customer interactions and FAQs.

‍

Write the intent detection prompt



🤖 System Prompt:

You’re a LLM that detects intent from user queries. Your task is to classify the user's intent based on their query. Below are the possible intents with brief descriptions. Use these to accurately determine the user's goal, and output only the intent topic.

- Order Status: Inquiries about the current status of an order, including delivery tracking and estimated arrival times.

- Product Information: Questions regarding product details, specifications, availability, or compatibility.

- Payments: Queries related to making payments, payment methods, billing issues, or transaction problems.

- Returns: Requests or questions about returning a product, including return policies and procedures.

- Feedback: User comments, reviews, or general feedback about products, services, or experiences.

- Other: Choose this if the query doesn’t fall into any of the other intents.

💬 User Query:
I would like to check my last order.

🤖 Response:
Order status.

Don’t forget to add a fallback option!

Did you notice that we added a fallback intent “Other” in the system prompt?

Adding fallback prompts is essential for handling situations where the chatbot fails to understand or correctly classify a user's intent.

Fallback prompts act as a safety net to keep users engaged, even when their query isn't a clear match. They can involve clarifying questions, rephrasing the query, or offering human assistance.

‍

Set up handler logic

For each intent, you need to develop a response mechanism, and decide if the chatbot should perform an action like calling an API to another tool/service, or to just provide a text response.

‍

Test and evaluate prompts accross test cases

Testing and evaluating prompts across test cases is crucial for building reliable chatbots.

Let us help!

Building an intent classifier is not just a one-time setup; it’s a continuous process that requires extensive evaluation and monitoring once in production.

If you’re planning to build your custom chatbot and need assistance with the setup and evaluation, we can help.

If you’re interested, you can book a call here. You can also subscribe to our blog and stay tuned for updates from us.

ABOUT THE AUTHOR

Anita Kirkovska

Founding Growth Lead

An AI expert with a strong ML background, specializing in GenAI and LLM education. A former Fulbright scholar, she leads Growth and Education at Vellum, helping companies build and scale AI products. She conducts LLM evaluations and writes extensively on AI best practices, empowering business leaders to drive effective AI adoption.

No items found.

talk with an AI Expert

Product Updates

July 1, 2025

•

6 min

Vellum Product Update | May & June

LLM basics

June 8, 2025

•

5 min

Big Ideas from the AI Engineer World’s Fair

LLM basics

June 1, 2025

•

8 min

Build AI Products Faster: Top Development Platforms Compared

Customer Stories

May 30, 2025

•

5 min

How GravityStack Cut Credit Agreement Review Time by 200% with Agentic AI

Guides

May 28, 2025

•

7 min

How the Best Product and Engineering Teams Ship AI Solutions

Model Comparisons

May 23, 2025

•

8 min

Evaluation: Claude 4 Sonnet vs OpenAI o4-mini vs Gemini 2.5 Pro

The Best AI Tips — Direct To Your Inbox

Latest AI news, tips, and techniques

Specific tips for Your AI use cases

No spam

Oops! Something went wrong while submitting the form.

Each issue is packed with valuable resources, tools, and insights that help us stay ahead in AI development. We've discovered strategies and frameworks that boosted our efficiency by 30%, making it a must-read for anyone in the field.

Marina Trajkovska

Head of Engineering