Memory & Context

Your assistant remembers you. Not just within a single conversation, but across days, weeks, and months. Here's how.

Two types of memory

Your assistant has two ways of remembering things:

1. Workspace files (structured, persistent)

These are the files in ~/.vellum/workspace/ that store organized information:

  • USER.md — Facts about you (name, location, preferences, projects)
  • SOUL.md — Behavioral rules and personality
  • IDENTITY.md — Your assistant's own identity

Your assistant reads these files at the start of every conversation. They're the baseline context that makes it feel like it knows you. Your assistant also updates them as it learns new things about you.
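
To make that concrete, here's a minimal sketch of what reading those files could look like. The file names and the ~/.vellum/workspace/ location come from the list above. The function name load_workspace and the choice to skip missing files are assumptions made for illustration, not a description of your assistant's actual code.

```python
from pathlib import Path

# File names are taken from the list above; everything else is illustrative.
WORKSPACE_FILES = ("SOUL.md", "USER.md", "IDENTITY.md")


def load_workspace(workspace_dir: Path) -> dict[str, str]:
    """Return {file name: contents} for each workspace file that exists."""
    loaded = {}
    for name in WORKSPACE_FILES:
        path = workspace_dir / name
        # Assumption: a missing file is skipped rather than treated as an error.
        if path.is_file():
            loaded[name] = path.read_text(encoding="utf-8")
    return loaded
```

Calling load_workspace(Path.home() / ".vellum" / "workspace") would return whichever of the three files exist on your machine.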

2. Long-term memory (searchable, associative)

Beyond workspace files, your assistant has a memory system that works more like... well, memory. It can:

  • Save facts — “Marina is working on Project Moonshot”
  • Save preferences — “Marina hates morning meetings”
  • Save decisions — “We decided to use TypeScript for the new skill”
  • Save learnings — “The Gmail API paginates at 100 results”

These memories are searchable. When you mention Project Moonshot three weeks later, your assistant can recall everything it saved about it. You don't have to re-explain context.
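
If it helps to picture this, here's a toy version of a searchable memory store. The four kinds (fact, preference, decision, learning) mirror the list above. The class names Memory and MemoryStore, and the plain keyword-overlap search, are assumptions for illustration only; this page doesn't describe how the real search works, and it may be considerably smarter.

```python
import re
from dataclasses import dataclass

# Small stop-word list so filler words alone never count as a match.
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "is", "of", "on", "the", "to", "we"}


def keywords(text: str) -> set[str]:
    """Lowercase the text and return its words, minus stop words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOP_WORDS


@dataclass(frozen=True)
class Memory:
    kind: str  # "fact", "preference", "decision", or "learning"
    text: str


class MemoryStore:
    """Hypothetical in-memory store; the real system persists to your workspace."""

    def __init__(self) -> None:
        self._memories: list[Memory] = []

    def save(self, kind: str, text: str) -> None:
        self._memories.append(Memory(kind, text))

    def search(self, query: str) -> list[Memory]:
        """Return memories sharing at least one keyword with the query, best match first."""
        wanted = keywords(query)
        scored = [(len(wanted & keywords(m.text)), m) for m in self._memories]
        matches = [pair for pair in scored if pair[0] > 0]
        # Stable sort: ties keep the order in which the memories were saved.
        matches.sort(key=lambda pair: pair[0], reverse=True)
        return [memory for _, memory in matches]
```

In this sketch, saving the four example memories above and then searching for "What do you know about Project Moonshot?" brings back only the Project Moonshot fact.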

How it decides what to remember

Your assistant doesn't save everything. That would be noisy and useless. It saves things when:

  • You share a personal fact or preference (“I'm a vegetarian”)
  • You make a decision worth tracking (“Let's go with option B”)
  • It learns something non-obvious from a task (“This API requires a specific header format”)
  • You correct its behavior (“Don't be so formal in emails”)
  • Something seems important for future interactions

It's designed to err on the side of remembering too little rather than too much. If you want it to remember something specific, just say so:

“Remember that my dentist appointment is on March 15th.”
“Save this: the project deadline is end of Q2.”
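
As a rough illustration of explicit requests like the two above, here's a sketch that pulls the thing to remember out of a message. It's an assumption that this could be done with simple pattern matching; your assistant most likely relies on the AI model's judgment rather than a fixed pattern, and the function name extract_explicit_memory is made up for this example.

```python
import re
from typing import Optional

# Hypothetical pattern covering only the two phrasings shown above.
EXPLICIT_REQUEST = re.compile(
    r"^\s*(?:remember that|save this:)\s+(.+?)\s*$",
    re.IGNORECASE,
)


def extract_explicit_memory(message: str) -> Optional[str]:
    """Return the text the user asked to have remembered, or None if there is no such request."""
    match = EXPLICIT_REQUEST.match(message)
    return match.group(1) if match else None
```

A message that doesn't start with one of those phrases simply returns None, which fits the "remember too little rather than too much" default.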

How context works in a conversation

Every time you send a message, your assistant assembles a bundle of context:

  1. The current conversation — Everything you and your assistant have said in this session
  2. Workspace files — SOUL.md, USER.md, IDENTITY.md (read fresh each turn)
  3. Relevant memories — It searches its long-term memory for anything related to your message
  4. Skill instructions — If a skill is loaded, its instructions are included
  5. Your message — What you just said

All of this gets sent to the AI model together. That's how your assistant can respond with awareness of who you are, what you've discussed before, and what's relevant right now.
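
Here's a minimal sketch of that assembly step, following the five items in the order listed above. The function name assemble_context, the section headings, and the single-string output are all assumptions for illustration; the actual format sent to the AI model isn't specified on this page.

```python
from typing import Optional


def assemble_context(
    conversation: list[str],
    workspace_files: dict[str, str],
    relevant_memories: list[str],
    skill_instructions: Optional[str],
    message: str,
) -> str:
    """Join the five context sources into one text bundle, in the order listed above."""
    sections = []
    if conversation:
        sections.append("## Conversation so far\n" + "\n".join(conversation))
    for name, text in workspace_files.items():
        sections.append(f"## {name}\n{text}")
    if relevant_memories:
        bullets = "\n".join(f"- {memory}" for memory in relevant_memories)
        sections.append("## Relevant memories\n" + bullets)
    # Skill instructions are only included when a skill is loaded.
    if skill_instructions:
        sections.append("## Skill instructions\n" + skill_instructions)
    sections.append("## Your message\n" + message)
    return "\n\n".join(sections)
```

Sources that are empty are left out of the bundle, so a brand-new conversation with no loaded skill would contain only the workspace files, any matching memories, and your message.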

Managing your memories

You have full control over what your assistant remembers:

  • Ask what it knows: “What do you remember about me?” or “What do you know about Project Moonshot?”
  • Correct mistakes: “Actually, I prefer tea, not coffee.” (It'll update the memory.)
  • Delete memories: “Forget what I told you about my dentist appointment.”
  • Edit the files directly: Open USER.md or any workspace file in a text editor and change whatever you want.
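
The first three operations above can be pictured as simple functions over a list of saved memories. This is only a sketch: the names recall, correct, and forget are hypothetical, memories are plain strings here, and matching is by substring, none of which is confirmed as how your assistant actually does it.

```python
def recall(memories: list[str], topic: str) -> list[str]:
    """Return the memories that mention the topic (case-insensitive)."""
    return [m for m in memories if topic.lower() in m.lower()]


def correct(memories: list[str], old: str, new: str) -> list[str]:
    """Return a copy of the list with one memory replaced by its corrected version."""
    return [new if m == old else m for m in memories]


def forget(memories: list[str], topic: str) -> list[str]:
    """Return a copy of the list without any memory that mentions the topic."""
    return [m for m in memories if topic.lower() not in m.lower()]
```

Each function returns a new list instead of changing the original, which keeps the sketch easy to reason about; a real store would also need to write the result back to disk.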

Privacy note

Memories are stored locally on your machine. They're part of your workspace. They don't get synced to a cloud, shared with other users, or used to train AI models.

However, memories are included in the context sent to the AI model when they're relevant to a conversation. This is how your assistant “thinks” with your context. It's the same trade-off we keep mentioning: local storage, cloud thinking.

🫣 Practically speaking: If you tell your assistant something sensitive, it may save it as a memory and include it in future AI model calls when relevant. If that concerns you, you can ask it to forget specific things, or edit your workspace files directly to remove anything you don't want persisted.