Here's the architecture in plain English. No whiteboards required.
Your assistant is built in layers, and each one has a clear job:
┌─────────────────────────────────────────────┐
│  YOU (the human)                            │
├─────────────────────────────────────────────┤
│  CHANNELS                                   │
│  Desktop App · Voice · (future)             │
├─────────────────────────────────────────────┤
│  ASSISTANT CORE                             │
│  Personality · Memory · Decision-making     │
├─────────────────────────────────────────────┤
│  SKILLS                                     │
│  Email · Calendar · Weather · Browser ...   │
├─────────────────────────────────────────────┤
│  TOOLS                                      │
│  file_read · bash · web_search · browser    │
│  navigate · memory_save · ui_show ...       │
├─────────────────────────────────────────────┤
│  WORKSPACE                                  │
│  SOUL.md · USER.md · IDENTITY.md · skills/  │
└─────────────────────────────────────────────┘
When you type something, here's what happens:
The key insight: your assistant isn't one monolithic thing. It's a conversation layer, a thinking layer, an action layer, and a memory layer, all working together. That's what makes it feel like more than a chatbot.
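If you think in code, here is a toy sketch of those four layers handing a message to each other. It is an illustration only, not the assistant's real implementation: `Workspace`, `think`, `run_tool`, and `handle_message` are hypothetical names, the "thinking" step is a simple rule standing in for the cloud model, and the only name borrowed from the diagram is the `memory_save` tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Memory layer: a stand-in for the local workspace files (hypothetical)."""
    notes: list[str] = field(default_factory=list)


def think(message: str) -> tuple[str, str] | None:
    """Thinking layer: a toy rule standing in for the cloud model call.

    Returns a (tool name, argument) pair, or None when no action is needed.
    """
    prefix = "remember "
    if message.lower().startswith(prefix):
        return ("memory_save", message[len(prefix):])
    return None


def run_tool(name: str, argument: str, workspace: Workspace) -> str:
    """Action layer: a toy dispatcher, not the real tool interface."""
    if name == "memory_save":
        workspace.notes.append(argument)
        return f"saved: {argument}"
    return f"unknown tool: {name}"


def handle_message(message: str, workspace: Workspace) -> str:
    """Conversation layer: receives the message and coordinates the rest."""
    decision = think(message)
    if decision is None:
        return "No action needed."
    tool_name, argument = decision
    return run_tool(tool_name, argument, workspace)


if __name__ == "__main__":
    workspace = Workspace()
    print(handle_message("Remember buy milk", workspace))
    print(workspace.notes)
```

Each function has exactly one job, and none of them knows how the others work. That separation is the point of the layering.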
| Component | Where it runs | Why it matters |
|---|---|---|
| Desktop app | Your machine | What you see and interact with |
| Assistant core | Your machine (Docker container) | Processes messages, manages state, coordinates tools |
| AI model | Cloud (Anthropic, etc.) | The “thinking” part. Your prompts are sent here. |
| Workspace files | Your machine (~/.vellum/) | Your assistant's persistent brain. Local, readable, yours. |
| Tool execution | Your machine (sandbox or host) | Where actions actually happen |
| Connected services | Cloud (Gmail, Slack, etc.) | External services your assistant talks to on your behalf |
🫣 The trade-off, again: The workspace and tools are fully local. The thinking happens in the cloud. Your prompts, context, and conversation history are sent to the AI model provider. We keep coming back to this because transparency is one of our principles, not because we're trying to scare you.
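The same split can be written down as data. This is a minimal sketch that only restates the table above. The `COMPONENTS` mapping and the `cloud_components` helper are made up for illustration and are not part of the product.

```python
# Where each component runs, copied from the table above.
# "local" means your machine; "cloud" means a remote service.
COMPONENTS = {
    "Desktop app": "local",
    "Assistant core": "local",
    "AI model": "cloud",
    "Workspace files": "local",
    "Tool execution": "local",
    "Connected services": "cloud",
}


def cloud_components(components: dict) -> list:
    """Return the names of components that run off your machine."""
    return [name for name, location in components.items() if location == "cloud"]


if __name__ == "__main__":
    print(cloud_components(COMPONENTS))
```

Anything the helper returns is a place where data leaves your machine, which is exactly the trade-off described above.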