mnemonic is a local MCP server that stores memories as plain markdown in git. Project-scoped semantic search. No database. No always-on service.
Your AI assistant forgets everything between sessions. mnemonic fixes that — without a database or a SaaS subscription.
Every memory is a markdown file with YAML frontmatter — designed for removability from day one. Give it a try; we think you'll like it enough to stay. And if you ever do leave, all the knowledge you've gathered is yours: plain markdown, independent of mnemonic, no strings attached.
Memories belong to projects. When you're in a repo, relevant notes surface first with a similarity boost. Global memories stay accessible but don't crowd out context.
No Postgres to babysit. Every remember, update, and consolidate creates a semantic git commit — your decision log and implementation plans travel with the code in the same history. Pushes are controlled so project-vault writes do not fail on unpublished branches by default. History is inspectable, conflicts are isolated to individual files, and sync works across machines.
Embeddings are generated by Ollama entirely on your machine, so no data leaves your network. recall finds the right note even when you don't remember the exact words; grep and file tools only match strings you already know. Embeddings are gitignored and recomputed on each machine.
Project memories live in .mnemonic/ inside your repo. Commit them alongside code so the whole team benefits from captured decisions.
The MCP server starts on demand via stdio. Claude Code, Cursor, and other MCP clients invoke it per-session. Zero background processes when you're not using it.
Pending migrations are visible per vault, dry-runs are built in, and failed runs roll staged note writes back instead of leaving half-migrated memories behind.
Mark notes temporary for plans and WIP, permanent for decisions worth keeping. Decision and summary notes rank higher automatically as they accumulate references and relationships — no manual tagging required. Pin any note as a session anchor to always bring it to the top.
Opt-in temporal mode adds compact git-backed history to recall results. Each change is described in plain language — "Expanded the note with additional detail", "Connected this note to related work" — and the overall arc is summarised: "The core decision remained stable while rationale expanded." See how a decision formed without wading through raw diffs.
mnemonic routes memories between a private main vault and a shareable project vault based on where you are and what you're storing.
Private global memories are stored in ~/mnemonic-vault, which is its own git repo. It holds cross-project knowledge, user preferences, early brainstorming before a repo exists, and anything you don't want committed to a project repo.
Project memories are committed into <git-root>/.mnemonic/ alongside your source code. Architecture decisions, bug fix context, and tribal knowledge travel with the repo.
No project yet? Capture ideas in the main vault first, then move only project-specific notes into your repo once it exists.
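The routing rule described above could be sketched roughly like this. It is a minimal illustration under stated assumptions, not mnemonic's actual code: `resolveVault`, `MemoryScope`, and `VaultTarget` are hypothetical names, and only the two locations (~/mnemonic-vault and <git-root>/.mnemonic/) come from the text.

```typescript
// Hypothetical sketch of vault routing; names are illustrative, not mnemonic's API.
type MemoryScope = "global" | "project";

interface VaultTarget {
  kind: "main" | "project";
  path: string;
}

// `gitRoot` is the repository root for the current working directory,
// or undefined when the caller is not inside a git repository.
function resolveVault(
  scope: MemoryScope,
  gitRoot: string | undefined,
  mainVaultPath: string = "~/mnemonic-vault",
): VaultTarget {
  if (scope === "project" && gitRoot !== undefined) {
    return { kind: "project", path: `${gitRoot}/.mnemonic` };
  }
  // Global memories, and project ideas captured before a repo exists,
  // land in the private main vault.
  return { kind: "main", path: mainVaultPath };
}
```

With no repository yet, a project-scoped note falls back to the main vault, which matches the "capture first, move later" flow.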
Before embedding, notes are projected into clean structured text: title, lifecycle, tags, summary, and h1–h3 headings, with no prose noise. Semantic search finds the right note even when you don't remember exact words. When semantic results are weak, lexical matching rescues them by scanning projections for keyword overlap. Inferred role and importance further boost scores so summary and decision notes surface naturally.
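A projection and a lexical fallback along these lines might look as follows. This is a sketch only: the `NoteInput` shape, the output layout, and the tokenization rule are assumptions made for illustration, not the format mnemonic actually embeds.

```typescript
// Hypothetical note shape; field names are assumptions for illustration.
interface NoteInput {
  title: string;
  lifecycle: "temporary" | "permanent";
  tags: string[];
  summary: string;
  body: string; // markdown content
}

// Build compact structured text: metadata plus h1-h3 headings, no prose.
function projectNote(note: NoteInput): string {
  const headings = note.body
    .split("\n")
    .map((line) => /^(#{1,3})\s+(.+?)\s*$/.exec(line))
    .filter((m): m is RegExpExecArray => m !== null)
    .map((m) => m[2]);
  return [
    `title: ${note.title}`,
    `lifecycle: ${note.lifecycle}`,
    `tags: ${note.tags.join(", ")}`,
    `summary: ${note.summary}`,
    `headings: ${headings.join(" | ")}`,
  ].join("\n");
}

// Split text into lowercase keywords longer than two characters.
function keywords(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 2);
}

// Lexical fallback: fraction of query keywords that appear in the projection.
function keywordOverlap(query: string, projection: string): number {
  const queryTokens = keywords(query);
  if (queryTokens.length === 0) return 0;
  const projectionTokens = keywords(projection);
  const hits = queryTokens.filter((t) => projectionTokens.indexOf(t) >= 0).length;
  return hits / queryTokens.length;
}
```

Headings deeper than h3 and all body prose are dropped, so the embedded text stays short and structurally comparable across notes.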
All tools are text-first and optimized for LLM consumption — compact, semantically explicit, no unnecessary structure. On protected branches, write actions pause and ask before committing unless you choose a different policy.
mnemonic requires Node.js 18+ and Ollama for local embeddings. No other services needed.
ollama pull nomic-embed-text-v2-moe
qwen3-embedding:0.6b is also a viable alternative for longer-context notes:
ollama pull qwen3-embedding:0.6b
No code changes are required; set EMBED_MODEL=qwen3-embedding:0.6b.
Pick one:
npm install @danielmarbach/mnemonic-mcp
brew tap danielmarbach/mnemonic-mcp https://github.com/danielmarbach/mnemonic
brew install mnemonic-mcp
docker pull danielmarbach/mnemonic-mcp:latest
Pre-built for linux/amd64 and linux/arm64. Tagged with the release version and latest.
npm install
npm run build
{
  "mcpServers": {
    "mnemonic": {
      "command": "npx",
      "args": ["@danielmarbach/mnemonic-mcp"],
      "env": {
        "VAULT_PATH": "/Users/you/mnemonic-vault"
      }
    }
  }
}
{
  "mcpServers": {
    "mnemonic": {
      "command": "docker",
      "args": [
        "compose", "-f",
        "/path/to/mnemonic/compose.yaml",
        "run", "--rm", "mnemonic"
      ]
    }
  }
}
Add to ~/.config/opencode/opencode.json or project-local opencode.json.
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "mnemonic": {
      "type": "local",
      "command": ["npx", "@danielmarbach/mnemonic-mcp"],
      "environment": {
        "VAULT_PATH": "/Users/you/mnemonic-vault"
      }
    }
  }
}
Add to ~/.codex/config.toml or project-local .codex/config.toml.
[mcp_servers.mnemonic]
command = "npx"
args = ["@danielmarbach/mnemonic-mcp"]

[mcp_servers.mnemonic.env]
VAULT_PATH = "/Users/you/mnemonic-vault"
Open the command palette and search for MCP: Open User Configuration, or edit mcp.json directly, then add the snippet below.
{
  "servers": {
    "mnemonic": {
      "command": "npx",
      "args": ["@danielmarbach/mnemonic-mcp"],
      "env": {
        "VAULT_PATH": "/Users/YOU/mnemonic-vault"
      }
    }
  }
}
| Variable | Default | Description |
|---|---|---|
| VAULT_PATH | ~/mnemonic-vault | Path to your markdown vault |
| OLLAMA_URL | http://localhost:11434 | Ollama server URL |
| EMBED_MODEL | nomic-embed-text-v2-moe | Ollama embedding model |
| DISABLE_GIT | false | Set true to skip all git operations |
The runtime is compatible with other Ollama embedding models that support /api/embed. For example, qwen3-embedding:0.6b works as a drop-in EMBED_MODEL override and may be preferable for longer-context notes.
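For orientation, a request against Ollama's /api/embed endpoint could be assembled as in the sketch below. The endpoint accepts a `model` and an `input` (one string or a list) and responds with an `embeddings` array. The helper names `buildEmbedRequest` and `parseEmbedResponse` are hypothetical, and how mnemonic itself issues the call is not shown here.

```typescript
// Hypothetical helpers; only the /api/embed payload shape follows Ollama's API.
interface EmbedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the HTTP request from the same variables listed in the table above.
function buildEmbedRequest(
  texts: string[],
  env: Record<string, string | undefined> = {},
): EmbedRequest {
  const baseUrl = (env["OLLAMA_URL"] ?? "http://localhost:11434").replace(/\/+$/, "");
  const model = env["EMBED_MODEL"] ?? "nomic-embed-text-v2-moe";
  return {
    url: `${baseUrl}/api/embed`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, input: texts }),
  };
}

// Extracts one vector per input text from the parsed JSON response.
function parseEmbedResponse(json: unknown): number[][] {
  if (typeof json !== "object" || json === null) {
    throw new Error("Unexpected /api/embed response: not an object");
  }
  const embeddings = (json as { embeddings?: unknown }).embeddings;
  if (!Array.isArray(embeddings)) {
    throw new Error("Unexpected /api/embed response: missing embeddings array");
  }
  return embeddings as number[][];
}
```

Switching models then only changes the `model` field in the payload, which is why an EMBED_MODEL override needs no code changes.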
The main vault's ~/mnemonic-vault/config.json holds machine-local settings you can edit by hand. Two fields are user-tunable:
| Field | Default | Description |
|---|---|---|
| reindexEmbedConcurrency | 4 | Parallel embedding requests during sync (capped 1–16) |
| mutationPushMode | main-only | all, main-only, or none; controls when writes auto-push to git |
projectMemoryPolicies and projectIdentityOverrides are written automatically by MCP tools — no need to edit them by hand.
{
  "reindexEmbedConcurrency": 8,
  "mutationPushMode": "none"
}
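The documented bounds and defaults for these two fields could be enforced with a small normalizer like the one below. It is a sketch under assumptions: `normalizeConfig` is a hypothetical name, and falling back to the default on an invalid value is a guess at reasonable behavior rather than a description of what mnemonic does.

```typescript
type MutationPushMode = "all" | "main-only" | "none";

interface TunableConfig {
  reindexEmbedConcurrency: number;
  mutationPushMode: MutationPushMode;
}

const PUSH_MODES: MutationPushMode[] = ["all", "main-only", "none"];

// Hypothetical normalizer: clamps concurrency to 1-16 and
// falls back to the documented defaults for missing or invalid values.
function normalizeConfig(raw: {
  reindexEmbedConcurrency?: unknown;
  mutationPushMode?: unknown;
}): TunableConfig {
  const value = raw.reindexEmbedConcurrency;
  const requested = typeof value === "number" && isFinite(value) ? Math.round(value) : 4;
  const mode = raw.mutationPushMode;
  const isKnownMode = typeof mode === "string" && PUSH_MODES.indexOf(mode as MutationPushMode) >= 0;
  return {
    reindexEmbedConcurrency: Math.min(16, Math.max(1, requested)),
    mutationPushMode: isKnownMode ? (mode as MutationPushMode) : "main-only",
  };
}
```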
Mnemonic's tools are self-describing — each includes "use when" / "do not use when" guidance, behavioral annotations, and typed schemas. Most models will use them correctly from tool metadata alone.
For on-demand workflow guidance, use the mnemonic-workflow-hint MCP prompt. It stays compact for weaker models: start by checking what is already known, inspect before writing, prefer updating existing memory, capture new memory only when nothing matches, and clean up related notes when work is done. It also keeps resumed work grounded in project orientation first, with short-lived work-in-progress context recovered only afterward.
The CLI commands below cover vault maintenance and onboarding without an MCP client. Run them directly from the shell.
Apply pending schema migrations to your vaults. Always preview with --dry-run first — failed runs roll staged writes back automatically.
# Preview what would change
mnemonic migrate --dry-run

# Apply and auto-commit
mnemonic migrate

# List available migrations
mnemonic migrate --list

# Limit to one project vault
mnemonic migrate --cwd=/path/to/project
Import Claude Code auto-memory into your vault. Each ## heading in ~/.claude/projects/<project>/memory/*.md becomes a separate mnemonic note — independently searchable via recall. Duplicate titles are skipped, so it's safe to re-run.
# Preview what would be imported
mnemonic import-claude-memory --dry-run

# Import for the current project
mnemonic import-claude-memory

# Specific project path
mnemonic import-claude-memory --cwd=/path/to/project

# Then embed and push
mnemonic sync
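The heading-based split that the import performs can be pictured with the sketch below. It is an illustration, not the importer's code: `splitByH2` and `ImportedNote` are hypothetical names, and treating titles as duplicates regardless of case is an assumption.

```typescript
interface ImportedNote {
  title: string;
  content: string;
}

// Hypothetical sketch: every "## " heading starts a new note, and titles that
// already exist (or repeat within the file) are skipped so re-runs are safe.
function splitByH2(markdown: string, existingTitles: string[] = []): ImportedNote[] {
  const seen: string[] = existingTitles.map((t) => t.trim().toLowerCase());
  const notes: ImportedNote[] = [];
  const addNote = (title: string, lines: string[]): void => {
    const key = title.toLowerCase();
    if (seen.indexOf(key) >= 0) return; // duplicate title: skip
    seen.push(key);
    notes.push({ title, content: lines.join("\n").trim() });
  };

  let title: string | undefined;
  let lines: string[] = [];
  for (const line of markdown.split("\n")) {
    const match = /^##\s+(.+?)\s*$/.exec(line);
    if (match !== null) {
      if (title !== undefined) addNote(title, lines);
      title = match[1];
      lines = [];
    } else if (title !== undefined) {
      lines.push(line); // deeper headings (###) stay inside the current note
    }
  }
  if (title !== undefined) addNote(title, lines);
  return notes;
}
```

Text before the first ## heading is ignored in this sketch; whether the real importer keeps it is not stated above.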
The project stores its design log in its own .mnemonic/ vault. Every architectural call and deliberate trade-off lives there as a real note, captured through the MCP tools while building the project.
Similarity boost, not hard filter. recall gives project notes +0.15 cosine similarity rather than excluding global notes. Global memories (user prefs, cross-project patterns) remain accessible in project context.
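As a rough sketch of this decision, the boost can be applied as a score adjustment before sorting. The +0.15 value is the one quoted above; `ScoredNote` and `rankWithProjectBoost` are hypothetical names, and the real ranking involves more signals than this.

```typescript
interface ScoredNote {
  id: string;
  projectId?: string; // absent for global notes
  similarity: number; // raw cosine similarity to the query
}

const PROJECT_BOOST = 0.15; // value quoted in the design note above

// Hypothetical sketch: boost notes from the current project, keep everything else.
function rankWithProjectBoost(
  notes: ScoredNote[],
  currentProjectId: string | undefined,
): ScoredNote[] {
  return notes
    .map((note) => {
      const inProject = currentProjectId !== undefined && note.projectId === currentProjectId;
      return { ...note, similarity: inProject ? note.similarity + PROJECT_BOOST : note.similarity };
    })
    .sort((x, y) => y.similarity - x.similarity);
}
```

A global note with a much stronger raw match can still outrank a project note, which is the point of boosting instead of filtering.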
No auto-relationship via LLM. Decided against using a local model to auto-build relationships. Small models lack session context, produce spurious edges, and corrupt the graph silently. Instead: agent instructions prompt relate immediately after remember while session context is warm.
Embeddings gitignored. Derived data — always recomputable. Committing them causes unresolvable merge conflicts (can't merge float arrays).
Do not implement the full runtime project-context loading/unloading architecture yet. Implement a lightweight recall heuristic instead: when scope is all, prefer current-project matches first and widen to global matches only if needed to fill the limit.
Rationale: this captures most of the practical benefit without introducing cache lifecycle, invalidation, active-project state, or long-lived runtime complexity. Keep the broader dynamic-loading plan as a future scaling option.
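The heuristic in this note might be sketched as follows, assuming candidates have already been scored. `Candidate` and `recallPreferProject` are hypothetical names used only for illustration.

```typescript
interface Candidate {
  id: string;
  projectId?: string; // absent for global notes
  score: number;
}

// Hypothetical sketch for scope "all": fill the result list from the current
// project first, then widen to other notes only if slots remain.
function recallPreferProject(
  candidates: Candidate[],
  currentProjectId: string,
  limit: number,
): Candidate[] {
  const byScore = (x: Candidate, y: Candidate): number => y.score - x.score;
  const projectHits = candidates
    .filter((c) => c.projectId === currentProjectId)
    .sort(byScore);
  if (projectHits.length >= limit) return projectHits.slice(0, limit);

  const others = candidates
    .filter((c) => c.projectId !== currentProjectId)
    .sort(byScore);
  return projectHits.concat(others.slice(0, limit - projectHits.length));
}
```

This needs no cache or active-project state: each call is a pure function of the candidate list, which is the simplicity the rationale argues for.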
A module-level singleton in src/cache.ts caches notes, embeddings, and projections per vault for the duration of the MCP session. Notes and embeddings are co-loaded in a single Promise.all() pass on first access, so both are always warm after one I/O round trip.
The cache is keyed by project ID and invalidated on every write-path tool. All cache functions are fail-soft: errors return undefined and callers fall back to direct storage. Instrumented with [cache:hit/miss/build/invalidate/fallback] events and per-tool timing.
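A stripped-down version of such a fail-soft cache is sketched below. It is synchronous for brevity, whereas the text above describes co-loading notes and embeddings with Promise.all(). All names (`VaultSnapshot`, `getSnapshot`, `invalidateSnapshot`, `cacheEvents`) are hypothetical and do not mirror src/cache.ts.

```typescript
interface VaultSnapshot {
  notes: string[];
  embeddings: Record<string, number[]>;
}

type VaultCacheEvent = "hit" | "miss" | "build" | "invalidate" | "fallback";

// Module-level state: lives as long as the process, i.e. one MCP session.
const snapshots: Record<string, VaultSnapshot> = Object.create(null);
const cacheEvents: VaultCacheEvent[] = [];

// Returns the cached snapshot for a project, building it on first access.
// Fail-soft: any error yields undefined so the caller can read storage directly.
function getSnapshot(projectId: string, load: () => VaultSnapshot): VaultSnapshot | undefined {
  try {
    const cached = snapshots[projectId];
    if (cached !== undefined) {
      cacheEvents.push("hit");
      return cached;
    }
    cacheEvents.push("miss");
    const built = load();
    snapshots[projectId] = built;
    cacheEvents.push("build");
    return built;
  } catch {
    cacheEvents.push("fallback");
    return undefined;
  }
}

// Called by every write-path tool so later reads rebuild from disk.
function invalidateSnapshot(projectId: string): void {
  delete snapshots[projectId];
  cacheEvents.push("invalidate");
}
```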
Four post-processing layers added on top of semantic recall, each additive and fail-soft: provenance and confidence metadata from git; opt-in temporal history enriched after semantic selection; projection-based embedding from structured note representations; bounded 1-hop relationship previews scored by project affinity, anchor status, and recency.
Core recall ranking is unaffected by any layer. Failures degrade gracefully to basic results.
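One way to picture "additive and fail-soft" is a pipeline in which each layer may only attach extra fields and a failing layer is skipped. The sketch below is an assumption-laden illustration: `RecallResult`, `EnrichmentLayer`, and `applyLayers` are hypothetical names.

```typescript
interface RecallResult {
  id: string;
  score: number;
  extras: Record<string, string>;
}

// A layer can only contribute extra fields; it cannot reorder or rescore.
type EnrichmentLayer = (result: RecallResult) => Record<string, string>;

// Hypothetical sketch: apply layers in order and skip any layer that throws.
function applyLayers(results: RecallResult[], layers: EnrichmentLayer[]): RecallResult[] {
  return results.map((result) => {
    let extras = result.extras;
    for (const layer of layers) {
      try {
        extras = { ...extras, ...layer(result) };
      } catch {
        // Fail-soft: keep what we have and continue with the next layer.
      }
    }
    return { id: result.id, score: result.score, extras };
  });
}
```

Because `id`, `score`, and result order are copied through untouched, a broken layer can at worst leave a result without its extra metadata.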
Temporal mode now explains what kind of change happened, not just that one occurred. Each history entry is classified into one of eight semantic categories — create, refine, expand, clarify, connect, restructure, reverse, unknown — using structural signals: additions/deletions ratio, churn, relationship changes, and commit message prefixes.
Classification is language-independent. Conservative thresholds prefer unknown over misclassification. Raw diffs are intentionally excluded from default output.
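To make the idea of structural classification concrete, here is a deliberately simplified sketch. Every threshold, the `revert` message prefix, and the names `ChangeStats` and `classifyChange` are assumptions invented for illustration; the sketch also covers only some of the eight categories and is not the project's classifier.

```typescript
type ChangeKind =
  | "create" | "refine" | "expand" | "clarify"
  | "connect" | "restructure" | "reverse" | "unknown";

interface ChangeStats {
  isFirstCommit: boolean;
  additions: number; // lines added
  deletions: number; // lines removed
  relationshipsChanged: boolean;
  commitMessage: string;
}

// Hypothetical thresholds; when no rule clearly applies the result is "unknown".
function classifyChange(stats: ChangeStats): ChangeKind {
  if (stats.isFirstCommit) return "create";
  if (/^revert\b/i.test(stats.commitMessage)) return "reverse";

  const churn = stats.additions + stats.deletions;
  if (stats.relationshipsChanged && churn <= 4) return "connect";
  if (churn === 0) return "unknown";

  const addRatio = stats.additions / churn;
  if (addRatio >= 0.8 && stats.additions >= 10) return "expand";
  if (churn >= 40 && addRatio > 0.4 && addRatio < 0.6) return "restructure";
  if (churn <= 10 && addRatio > 0.3 && addRatio < 0.7) return "refine";
  return "unknown"; // conservative default
}
```

None of the rules inspect the words of the note itself, only counts and flags, which is what keeps an approach like this independent of the note's language.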
Every architectural call, trade-off, and lesson — committed to the repo inside .mnemonic/notes/ as plain markdown. Human-readable. Open the folder and read them directly.