Vellum is coming to the AI Engineering World's Fair in SF. Come visit our booth and get a live demo!

How to craft effective prompts

A curated list of best practices, techniques and practical advice on how to get better at prompt engineering.

Written by
Reviewed by
No items found.

Prompt engineering is one of those job functions that suddenly became a thing.

Two years ago, the term was scattered across a few white-papers, just relevant to niche research circles. But today, it’s a hot sought-after skill — just look at all the open positions on this job board. Given the recent explosion of business-ready AI, it’s to little surprise.

Admittedly, the definition of prompt engineer is vague; some might argue that anyone using an LLM is a prompt engineer. But the term tends to refer to those that are building something instrumental with LLMs, like an AI-powered application or a business-critical functionality.

For those problems, messy prompts with tacked-on clarifications aren’t forgivable. Instead, prompts need to render accurate, correctly-formatted, and affordable results.

At Vellum, we’ve partner with hundreds of companies who are at the forefront of building reliable AI systems in production. We worked with product, engineering, and marketing teams who must adopt good prompt engineering skills in order to succeed in their job.

From designing prompts and architectures to scaling evaluations, we’ve figured out what makes a great prompt engineer—and great prompts. Now, we’re excited to share these insights with you. Let’s get into the details.

Why prompt engineering

These days, knowing how to do prompt engineering can make you significantly better and more productive at your daily job—whether you're a Product Manager, Developer, or in any other role. It’s a skill that helps you get the most out of AI tools and makes solving problems faster and more effective.

It’s also a skill that can land you a high-paying technical role. AI startups have raised $48.4 billion so far this year and are hiring across various functions. If you have even a basic understanding of AI, you’ll have a strong advantage in the job market.

However, to own the prompt engineering process, you must remain educated on emerging prompt engineering techniques. The space is quickly changing as new AI models evolve and more complex tasks become conquerable. Without constantly up-skilling yourself on new techniques, your prompts will fall behind industry standards.

Sign up to our newsletter to stay in the loop, and keep reading for more tips and insights that will help you become the best prompt engineer.

Getting started

In this section, we won’t cover specifically how to prompt ChatGPT to do some basic tasks, there are many resources for that.

Instead, we’ll focus on how to approach prompt engineering for scenarios where you're working on AI functionality or building an AI-powered app from the ground up.

No matter your experience, there are three key rules that you need keep in mind as you learn to write great prompts:

1. Know exactly what you want to achieve

In coding, using pseudo code to outline your logic before writing the actual code is a common practice. Similarly, with prompt engineering, you should first clearly outline all your goals for your prompt.

Here is a list of questions you should answer:

  • Will I need both static and dynamic content for my prompt? How will I pass dynamic variables?
  • What should the output be? (JSON, String, XML)
  • Do you need to add external context or is the LLM familiar with the task?
  • Are there any limits or constraints on the response (length/format)
  • Do you need to tweak some of the LLM parameters? Which ones?

The more precisely you understand your objectives, the easier it becomes to construct a prompt that delivers the desired results. Sometimes, collaborating with your co-workers is the best way to iron-out these requirements, especially if multiple departments or app functions are at stake.

2. You can’t afford to skip experimentation

LLMs aren’t perfect, and neither is prompt engineering.

Every rule in this document should be taken with a grain of salt. The best way to really vet if your prompt engineering is optimal is to test different things. Try different techniques, provide examples, split your prompts for more control, and tweak the little details. We’ll give you all the tools you need to do this properly in the next sections.

But make sure you’re aware that an LLM is never equally capable for all the possible tasks; there is a fair amount of variance, and only by scrutinizing your prompts will you be able to determine if your strategy is optimal.

3. LLMs are not humans

Whether you are a developer or not, the first rule to remember when thinking about prompt engineering is that LLMs aren’t humans. For lack of a better word, LLMs are obedient in nature. They thrive with strict directions. They aren’t designed to freestyle.

They just are also different. Some things that are obvious to a human aren’t clear to an LLM; likewise, complex instructions that a human would stumble over, an LLM can accomplish flawlessly.

Finally, remember this: LLMs will never say, “I don’t know”, out of the box. They’re programmed to always provide an answer, even when they don’t actually know it. This can lead to what we call "model hallucinations," so it’s crucial to manage and be aware of this tendency.

Do these rules differ for developers and non-developers?

Both developers and non-developers can interact with LLMs in the same way. While developers might have an edge with logical reasoning, prompt engineering is a skill that anyone can pick up with practice.

The outcome, however, for non-technical folks versus developers can vary:

For non-developers, prompt engineering can strictly be prompt structure and performance. It’s like writing a good email to a co-worker that you don’t know very well. Clarity is the north star. Vagueness is a curse. Detail helps. Too much detail could hurt. A good prompt is simply striking the right balance between these things in order to achieve high performance.

For developers, the focus is more on how to integrate LLM outputs into their existing code. They see prompts as part of their AI functionality and think about how to make these outputs work smoothly in their systems. JSON outputs are often ideal for this, but getting valid JSON responses can be tricky and requires a bit of trial and error to get right.

Let’s look at some best practices.

Best Practices

First things first: Do prompt engineering techniques depend on the model?

Different model, Different prompting design?

The tough thing about prompt engineering is that good advice for GPT-4o may not apply directly to Claude 3.5 Sonnet (in fact, Claude 3.5 Sonnet might require different prompts from GPT-4). Different tier models introduce more challenges — stronger models are better at inference (e.g. GPT-4); while others like GPT-3.5 need more spelled out commands. The key learning here is there are generally good practices surrounding prompt engineering, but you should also check out model-specific advice.

There are some excellent resources on specific model prompt engineering:

Also, we’ve created two free tools that can help you write better prompts for specific models:

You can dive into the specific guides linked above, or bookmark them for later and keep reading for more tips on creating great prompts.

Six tips for great prompt design

Tips for Great Prompt Design

We’ve already figured out that some models need specific prompt structures, but there are a few general rules that apply across the board. Let’s take a quick look at what those are.

1) Specific input = better response

When designing prompts, it's important to be crystal clear about what you want. Vague prompts will lead to vague or incorrect responses. Break down complex tasks into smaller, more manageable pieces if needed and chain your prompts.

Be as specific as you can about things like the desired format, level of detail in the response, tone and style.

2) Provide context to avoid hallucinations

No LLM can know everything, especially when it comes to specific details about your industry or domain. If you want the LLM to incorporate your unique insights, you’ll need to provide that context during inference.

In some cases, static context will do. Just write all the information you’d want the LLM to have when it’s answering a user query. In other cases, you might need to pass dynamic context, as the context lengths of various models can be limiting. Here’s more info on context length.

In that case, you can do follow RAG based approach, where you can dynamically add context in your prompts based on the user query. Read more about it here.

3) Separate your Instructions from Context

At this point, you can imagine that your prompts can become very lengthy — in such case you need to explicitly tell the model where to find specific information.

OpenAI suggests that you always start with the instruction, and then separate it with the rest of the elements using ### or “”” . See below for how to do it:

### Instructions ###
Summarize the main ideas for the given text.

### Text ###
{Context here}

4) Tell the model what to DO instead what NOT to do

In most cases, you'll notice the model adding extra information or unnecessary characters to your responses. Your first instinct might be to tell the model what not to do. But models don’t follow that logic — clear positive instructions works more reliably than telling the model what not to do.

For example, if you only want the final answer to a math problem, say, “Only output the final answer,” instead of, “Don’t include explanations.”

5) Assign a role

Models perform much better when they know their role. You should assign this role at the start of your prompt. For example: "You’re the world’s best content writer. You’ll write highly engaging articles on a given topic provided by the user." This helps guide the model to produce the desired results more effectively.

6) Add examples

Even with the best prompt design and context, the model may still struggle to replicate the level of reasoning you need for certain answers.

This usually happens for more complex tasks where the model actually need to see an example or think about the problem before it provides the answer.

In such cases we use specific prompt engineering techniques like one-shot, few shot, chain of thought prompting and others that we’ll cover in the next section.

Techniques

There are a few universal prompt engineering techniques that’ll prove useful for a grand majority of LLMs. Mastering these techniques will make you a better prompt engineer.

One-Shot

One-shot is a technique where an example response is provided to guide the model’s response. This enables the model to deliver the correct output format via mimicry, but also without extensive training.

Generate a recipe for banana pudding.

Example recipe: "Ingredients: Bread, Ham, Cheese, Butter, Sesame Seeds. Step 1: Heat the bread at medium heat in a toaster for 4 minutes ..."

One-shot contrasts with zero-shot, the default case where no example is provided, forcing the model to use intuition. In our experience, one-shot dramatically improves a model’s efficacy over zero-shot queries.

Few-Shot

Few-shot is a technique that’s similar to one-shot, but where multiple examples are provided to aid the model. The additional examples increases the likelihood that the model will understand the desired format of the output. It also prevents the model from overly mimicking a single example.

Generate a recipe for banana pudding.

Example recipe: "Ingredients: Bread, Ham, Cheese, Butter, Sesame Seeds. Step 1: Heat the bread at medium heat in a toaster for 4 minutes ..."

Another recipe: "Ingredients: Rice, Spring Onion, Garlic, Chicken. Step 1: Wash the rice thoroughly. Step 2: Dice the spring onion ..."

Similarly, we’ve found that few-shot queries deliver more reliable results — see more examples of how one-shot and few-shot compare.

CoT (Chain of Thought)

CoT, or Chain of Thought, is a prompting technique that encourages the model to break down complex problems into step-by-step processes. CoT is similar to One-Shot prompting; however, instead of just providing an example response, it also includes the intermediary reasoning.

Q: A train travels 20 miles east. Then, the train travels 
10 miles north. Finally, the train travels 10 miles east. 
How far east has the train traveled? 

A: 
(step) travels 20 miles east. 
(step) travels 10 miles north. 
(total) has traveled 20 miles east and 10 miles north. 
(step) travels 10 miles east. 
(total) has traveled 20 miles east, 10 miles east, and 10 miles north. 
(combine) 20 miles east + 10 miles east is 30 miles east. 
(total) has traveled 30 miles east and 10 miles north. 

Q: A train travels 15 miles west. Then, the train travels 
5 miles south. Finally, the train travels 5 miles west. 
How far west has the train traveled?

Our users have dramatically improved the efficacy of their prompts by using CoT for any logical query. It’s particularly helpful for math or math-adjacent problems.

We wrote a whole guide on chain of thought prompting — read it here.

Prompt Caching

Prompt caching is a technique that stores previously generated responses to avoid redundant API calls and reduce latency. Prompt caching is particularly useful for frequently asked questions or similar queries. It helps optimize costs and minimizes latency, dramatically improving response times.

Currently the only model that supports this option is Anthropic — read how to use it here.

Meta-Prompting

Meta-prompting is a prompt engineering technique where LLMs are entrusted to write a detailed prompt from a seed prompt. Meta-prompting is useful when writing a complete prompt is difficult, but the overall objective is clear (e.g. writing an easy-to-read recipe).

Meta-prompting is akin to a teacher asking a student to write an outline for their essay before writing the actual essay; it encourages LLMs to plan-ahead when producing content. It also enables humans to intervene if the generated prompt isn’t satisfactory.

Write a good prompt for writing an easy-to-follow recipe 
for creating a vanilla cake.

Meta-prompting might sound like AI is coming for prompt engineering jobs (with an incredibly twist of irony). But meta-prompting is more about leveraging LLMs to dynamically generate good prompts; it creates its own set of challenges. It’s akin to use an ORM to manage SQL—a layer of abstraction doesn’t eliminate challenges.

What we do know is that meta-prompting can work but it might require a lot of work on the initial prompt. Out of the box, LLMs aren’t really great prompt engineers — so we can’t 100% rely on their capabilities to produce good prompts for our AI systems in production.

RAG

RAG, however is something that powers most of the AI systems in production today. This is a technique that tackles the limited context windows of LLMs and takes advantage of LLMs ability to determine similarity.

High level overview of RAG
High level overview of RAG

With RAG, the LLM is first used to generate embeddings—vectorized representations of your own domain data. These embeddings are stored in a vector database (e.g. pg_vector, Pinecone, Chroma, or OpenAI’s native contextual memory). Then, when a query is made, vector search is run against the database to retrieve contextual data to the user query. That data is then merged with the prompt to generate a context-aware response.

You don’t need to hear our insights to know about the importance of RAG. It is, arguably, the second-biggest innovation (after LLMs) in the world of AI. And we do believe that mastering RAG is an essential for any prompt engineer — learn how to do it here.

Engineering the model parameters

Now, let’s assume you have the strongest prompts and all the right context, let’s take you further by adjusting model parameters like temperature, logit bias, top_p and others to refine how the model responds to user queries.

These LLM parameters can be tweaked in every API request that you send to the model. Let’s quickly cover the most used ones and how they can help:

  • JSON_mode: Ensures the model will always output valid JSON.
  • Presence Penalty: Prevents the model from repeating a word.
  • Frequency Penalty: Prevents the model to repeat a word that was previously used.
  • Structured Outputs: Ensures the model will always output valid JSON that adheres to a provided JSON Schema.
  • Function Calling: Ensures the model will create arguments for functions you’ve defined in the prompt.
  • Logprobs: Show how likely each word (token) is to appear in a sentence based on the words that came before it.
  • Max_tokens: Specifies the maximum number of tokens that can be generated in the output.
  • Logit Bias: Control whether the model should output specific tokens or not.
  • Seed: Receive (mostly) consistent outputs.
  • Stop Sequence: Stop generating tokens at a desired point, such as the end of a sentence or a list.
  • Streaming: Send tokens to the UI in batches so that you can give the appearance of generation like ChatGPT.
  • Temperature: Control how creative the LLM is with the outputs.
  • Top_k: Set the value of “k” for the model to consider only the k most likely tokens.
  • Top_p (Nucleus Sampling): Generate tokens until the cumulative probability exceeds the chosen threshold.

To learn how to use each of these and the use-cases where they're most effective - take a look at our LLM Parameter Guide.

Now that we've covered the basics, let's dive into some engineering courses that can help you take this knowledge to the next level and sharpen your skills even further.

Educational courses

There are a few good online courses that you can take to become a better prompt engineer. These courses cover the necessary techniques for writing effective prompts that generate accurate results. Many of them are beginner friendly, with no prior-development experience needed.

There are some courses that we recommend:

How to Get a Prompt Engineering Job?

Getting better at your job with a prompt engineering skill under your belt is very useful. But if you want to get a job as a prompt engineer — you need to build up your practical portfolio and have a good understanding of common prompt design techniques. Some of the prompt engineering roles will require more technical capabilities than others — but it’s usually good to satisfy the following parameters:

  1. Understand the basics of AI (e.g. How do they work, Prompt Design, Available models)
  2. Know how to prompt LLM models (format, techniques — everything we covered above)
  3. Practical portfolio - Show what you’ve built and document how you did it
  4. Have a creative hacker spirit and love solving puzzles
  5. Master Python — having coding knowledge will give you better chances to get any prompt engineering role. There are some frameworks that allow for more visual coding — like Vellum.

Here are some job boards and open prompt engineering positions:

Conclusion

As a closing thought, let’s remind ourselves that LLMs are a complex thing. They might be user-friendly and easy to try, but mastering them is a complex science. Admittedly, there is a wishy-wooshy nature to it—LLMs are black boxes, so techniques to conquer them are based on trial-and-error.

But it’s similar to other fields in science, where empirical data is king. And, most importantly, it shows no signs of going away anytime soon.

ABOUT THE AUTHOR
Anita Kirkovska
Founding Growth Lead

An AI expert with a strong ML background, specializing in GenAI and LLM education. A former Fulbright scholar, she leads Growth and Education at Vellum, helping companies build and scale AI products. She conducts LLM evaluations and writes extensively on AI best practices, empowering business leaders to drive effective AI adoption.

ABOUT THE reviewer

No items found.
lAST UPDATED
Aug 5, 2025
share post
Expert verified
Related Posts
Guides
October 21, 2025
15 min
AI transformation playbook
LLM basics
October 20, 2025
8 min
The Top Enterprise AI Automation Platforms (Guide)
LLM basics
October 10, 2025
7 min
The Best AI Workflow Builders for Automating Business Processes
LLM basics
October 7, 2025
8 min
The Complete Guide to No‑Code AI Workflow Automation Tools
All
October 6, 2025
6 min
OpenAI's Agent Builder Explained
Product Updates
October 1, 2025
7
Vellum Product Update | September
The Best AI Tips — Direct To Your Inbox

Latest AI news, tips, and techniques

Specific tips for Your AI use cases

No spam

Oops! Something went wrong while submitting the form.

Each issue is packed with valuable resources, tools, and insights that help us stay ahead in AI development. We've discovered strategies and frameworks that boosted our efficiency by 30%, making it a must-read for anyone in the field.

Marina Trajkovska
Head of Engineering

This is just a great newsletter. The content is so helpful, even when I’m busy I read them.

Jeremy Hicks
Solutions Architect

Experiment, Evaluate, Deploy, Repeat.

AI development doesn’t end once you've defined your system. Learn how Vellum helps you manage the entire AI development lifecycle.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Build AI agents in minutes with Vellum
Build agents that take on the busywork and free up hundreds of hours. No coding needed, just start creating.

General CTA component, Use {{general-cta}}

Build AI agents in minutes with Vellum
Build agents that take on the busywork and free up hundreds of hours. No coding needed, just start creating.

General CTA component  [For enterprise], Use {{general-cta-enterprise}}

The best AI agent platform for enterprises
Production-grade rigor in one platform: prompt builder, agent sandbox, and built-in evals and monitoring so your whole org can go AI native.

[Dynamic] Ebook CTA component using the Ebook CMS filtered by name of ebook.
Use {{ebook-cta}} and add a Ebook reference in the article

Thank you!
Your submission has been received!
Oops! Something went wrong while submitting the form.
Button Text

LLM leaderboard CTA component. Use {{llm-cta}}

Check our LLM leaderboard
Compare all open-source and proprietary model across different tasks like coding, math, reasoning and others.

Case study CTA component (ROI)

40% cost reduction on AI investment
Learn how Drata’s team uses Vellum and moves fast with AI initiatives, without sacrificing accuracy and security.

Case study CTA component (cutting eng overhead) = {{coursemojo-cta}}

6+ months on engineering time saved
Learn how CourseMojo uses Vellum to enable their domain experts to collaborate on AI initiatives, reaching 10x of business growth without expanding the engineering team.

Case study CTA component (Time to value) = {{time-cta}}

100x faster time to deployment for AI agents
See how RelyHealth uses Vellum to deliver hundreds of custom healthcare agents with the speed customers expect and the reliability healthcare demands.

[Dynamic] Guide CTA component using Blog Post CMS, filtering on Guides’ names

100x faster time to deployment for AI agents
See how RelyHealth uses Vellum to deliver hundreds of custom healthcare agents with the speed customers expect and the reliability healthcare demands.
New CTA
Sorts the trigger and email categories

Dynamic template box for healthcare, Use {{healthcare}}

Start with some of these healthcare examples

Prior authorization navigator
Automate the prior authorization process for medical claims.
Healthcare explanations of a patient-doctor match
Summarize why a patient was matched with a specific provider.

Dynamic template box for insurance, Use {{insurance}}

Start with some of these insurance examples

AI agent for claims review
Review healthcare claims, detect anomalies and benchmark pricing.
Agent that summarizes lengthy reports (PDF -> Summary)
Summarize all kinds of PDFs into easily digestible summaries.
Insurance claims automation agent
Collect and analyze claim information, assess risk and verify policy details.

Dynamic template box for eCommerce, Use {{ecommerce}}

Start with some of these eCommerce examples

E-commerce shopping agent
Check order status, manage shopping carts and process returns.

Dynamic template box for Marketing, Use {{marketing}}

Start with some of these marketing examples

Competitor research agent
Scrape relevant case studies from competitors and extract ICP details.
LinkedIn Content Planning Agent
Create a 30-day Linkedin content plan based on your goals and target audience.

Dynamic template box for Sales, Use {{sales}}

Start with some of these sales examples

Research agent for sales demos
Company research based on Linkedin and public data as a prep for sales demo.

Dynamic template box for Legal, Use {{legal}}

Start with some of these legal examples

Legal contract review AI agent
Asses legal contracts and check for required classes, asses risk and generate report.
AI legal research agent
Comprehensive legal research memo based on research question, jurisdiction and date range.

Dynamic template box for Supply Chain/Logistics, Use {{supply}}

Start with some of these supply chain examples

Risk assessment agent for supply chain operations
Comprehensive risk assessment for suppliers based on various data inputs.

Dynamic template box for Edtech, Use {{edtech}}

Start with some of these edtech examples

Turn LinkedIn Posts into Articles and Push to Notion
Convert your best Linkedin posts into long form content.

Dynamic template box for Compliance, Use {{compliance}}

Start with some of these compliance examples

No items found.

Dynamic template box for Customer Support, Use {{customer}}

Start with some of these customer support examples

Q&A RAG Chatbot with Cohere reranking
Trust Center RAG Chatbot
Read from a vector database, and instantly answer questions about your security policies.

Template box, 2 random templates, Use {{templates}}

Start with some of these agents

Financial Statement Review Workflow
Extract and review financial statements and their corresponding footnotes from SEC 10-K filings.
Q&A RAG Chatbot with Cohere reranking

Template box, 6 random templates, Use {{templates-plus}}

Build AI agents in minutes

Clinical trial matchmaker
Match patients to relevant clinical trials based on EHR.
Retail pricing optimizer agent
Analyze product data and market conditions and recommend pricing strategies.
Agent that summarizes lengthy reports (PDF -> Summary)
Summarize all kinds of PDFs into easily digestible summaries.
Healthcare explanations of a patient-doctor match
Summarize why a patient was matched with a specific provider.
E-commerce shopping agent
Check order status, manage shopping carts and process returns.
Insurance claims automation agent
Collect and analyze claim information, assess risk and verify policy details.

Build AI agents in minutes for

{{industry_name}}

Clinical trial matchmaker
Match patients to relevant clinical trials based on EHR.
Prior authorization navigator
Automate the prior authorization process for medical claims.
Population health insights reporter
Combine healthcare sources and structure data for population health management.
Legal document processing agent
Process long and complex legal documents and generate legal research memorandum.
Legal contract review AI agent
Asses legal contracts and check for required classes, asses risk and generate report.
Legal RAG chatbot
Chatbot that provides answers based on user queries and legal documents.

Case study results overview (usually added at top of case study)

What we did:

1-click

This is some text inside of a div block.

28,000+

Separate vector databases managed per tenant.

100+

Real-world eval tests run before every release.