MrMemory Documentation
Everything you need to give your AI agents persistent memory.
Quickstart
Install the SDK for your language:
Python
pip install mrmemory
# With LangChain/LangGraph support:
pip install "mrmemory[langchain]"
TypeScript / Node
npm install memorymr
Authentication
All API requests require an API key passed via the Authorization header:
Authorization: Bearer amr_sk_your_key_here
Get your API key from the dashboard after signing up.
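As a quick illustration, here is how the header can be attached with Python's stdlib `urllib` (a sketch; any HTTP client works the same way, and `authed_request` is a hypothetical helper, not part of the SDK):

```python
# Minimal sketch: build the Authorization header once and reuse it on
# every request. Only the Bearer token header matters to the API.
from urllib.request import Request

API_KEY = "amr_sk_your_key_here"  # placeholder key

def authed_request(url: str, method: str = "GET") -> Request:
    # Attach the API key as a Bearer token on the outgoing request.
    return Request(url, method=method,
                   headers={"Authorization": f"Bearer {API_KEY}"})

req = authed_request("https://amr-memory-api.fly.dev/v1/memories")
```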
API Endpoints
Base URL: https://amr-memory-api.fly.dev
POST /v1/memories
Store a new memory. Content is automatically embedded and indexed for semantic search.
curl -X POST https://amr-memory-api.fly.dev/v1/memories \
-H "Authorization: Bearer amr_sk_..." \
-H "Content-Type: application/json" \
-d '{"content": "User prefers dark mode", "tags": ["preferences"], "namespace": "settings", "agent_id": "my-bot", "ttl_seconds": 86400}'
ttl_seconds is optional; memories with a TTL expire automatically and are removed during pruning. The Starter plan allows up to 10,000 memories per namespace.
POST /v1/memories/prune
Manually trigger cleanup of expired memories.
curl -X POST https://amr-memory-api.fly.dev/v1/memories/prune \
-H "Authorization: Bearer amr_sk_..."
GET /v1/memories/recall
Semantic search across stored memories. Returns results ranked by cosine similarity.
curl "https://amr-memory-api.fly.dev/v1/memories/recall?query=user+preferences&namespace=settings&agent_id=my-bot&limit=5&threshold=0.7" \
-H "Authorization: Bearer amr_sk_..."
DELETE /v1/memories/{id}
Delete a specific memory by ID.
curl -X DELETE https://amr-memory-api.fly.dev/v1/memories/mem_abc123 \
-H "Authorization: Bearer amr_sk_..."
GET /v1/memories
List all stored memories (paginated).
curl "https://amr-memory-api.fly.dev/v1/memories?namespace=settings&limit=20&offset=0" \
-H "Authorization: Bearer amr_sk_..."
WebSocket /v1/ws
Real-time WebSocket for live memory events between agents. Subscribe to namespaces/agents and receive instant notifications.
// Connect
const ws = new WebSocket("wss://amr-memory-api.fly.dev/v1/ws?token=amr_sk_...")

// Subscribe to events once the connection opens
ws.onopen = () => {
  ws.send(JSON.stringify({ type: "subscribe", namespace: "weather-trading" }))
}

// Receive real-time events
ws.onmessage = (event) => {
  const msg = JSON.parse(event.data)
  // {"type": "memory.created", "memory": {...}, "namespace": "...", "agent_id": "..."}
}
POST /v1/memories/auto
LLM-powered auto-remember. Send conversation messages and the server extracts facts, deduplicates, and stores them automatically. Supports async (fire-and-forget) and sync modes. BYOK supported.
curl -X POST https://amr-memory-api.fly.dev/v1/memories/auto \
-H "Authorization: Bearer amr_sk_..." \
-H "Content-Type: application/json" \
-d '{
  "messages": [
    {"role": "user", "content": "I love hiking and my favorite language is Rust"},
    {"role": "assistant", "content": "Great choices! Rust is fantastic."}
  ],
  "namespace": "default",
  "sync": true
}'
Set "sync": false (default) for fire-and-forget — returns a job_id immediately. Pass "llm_api_key" for BYOK.
POST /v1/memories/compress
Compress related memories in a namespace. Groups semantically similar memories and merges each group into a single, denser memory using LLM summarization.
curl -X POST https://amr-memory-api.fly.dev/v1/memories/compress \
-H "Authorization: Bearer amr_sk_..." \
-H "Content-Type: application/json" \
-d '{
  "namespace": "default",
  "threshold": 10,
  "similarity_threshold": 0.75,
  "sync": true,
  "dry_run": false
}'
Set "dry_run": true to preview what would be compressed. threshold sets minimum memory count before compression triggers.
Python SDK
from mrmemory import AMR
client = AMR("amr_sk_...", agent_id="my-bot", namespace="user-prefs")
# Store a memory
client.remember("User prefers dark mode and vim keybindings", tags=["preferences"])
# Semantic recall
memories = client.recall("What are the user's preferences?")
for m in memories:
    print(m.content, m.similarity)
# Forget a memory
client.forget(memories[0].id)
Auto-Remember
# Extract memories from conversations automatically
result = client.auto_remember([
    {"role": "user", "content": "I love hiking and my favorite language is Rust"},
    {"role": "assistant", "content": "Great choices!"},
], sync=True)
# → {"extracted": 2, "created": 2, "duplicates_skipped": 0, ...}
Compression
# Compress related memories into denser representations
result = client.compress(namespace="default", sync=True, dry_run=True) # preview
result = client.compress(namespace="default", sync=True) # actually compress
# → {"before_count": 50, "after_count": 28, "groups_compressed": 12, ...}
LangChain / LangGraph Integration
pip install "mrmemory[langchain]"
from mrmemory.langchain import MrMemoryCheckpointer, MrMemoryStore
from langgraph.graph import StateGraph
checkpointer = MrMemoryCheckpointer(api_key="amr_sk_...")
store = MrMemoryStore(api_key="amr_sk_...")
# Drop-in replacement for any LangGraph checkpointer
graph = StateGraph(...).compile(checkpointer=checkpointer, store=store)
Async
from mrmemory import AsyncAMR
async with AsyncAMR("amr_sk_...", agent_id="my-bot") as client:
    await client.remember("User prefers dark mode")
    memories = await client.recall("preferences")
TypeScript SDK
import { AMR } from 'memorymr'
const amr = new AMR('amr_sk_...')
// Store a memory
await amr.remember('User prefers dark mode and vim keybindings')
// Semantic recall
const memories = await amr.recall('What are the user preferences?')
for (const m of memories) {
  console.log(m.content, m.score)
}
// Forget a memory
await amr.forget(memories[0].id)
Self-Edit Tools
Let your agents manage their own memory — update, prune, and merge.
Update a Memory
PATCH /v1/memories/{id}
Content-Type: application/json
Authorization: Bearer amr_sk_...
{
  "content": "Updated preference: user now prefers light mode",
  "tags": ["preference", "ui"]
}
Re-embeds automatically when content changes. Partial updates are supported: send only the fields you want to change.
Bulk Delete Outdated
DELETE /v1/memories/outdated
Content-Type: application/json
{
  "older_than_seconds": 2592000,
  "tags": ["ephemeral"],
  "namespace": "chat-session",
  "dry_run": true
}
Supports dry_run: true to preview what would be deleted without actually removing anything.
Merge Memories
POST /v1/memories/merge
Content-Type: application/json
{
  "memory_ids": ["mem_abc123", "mem_def456", "mem_ghi789"],
  "namespace": "default"
}
Merges 2-50 memories into one using LLM summarization. Source memories are deleted. The merged memory gets is_compressed: true and merged_from tracking. Pass content to override the LLM summary.
Python SDK
# Update
client.update("mem_abc123", content="New content", tags=["updated"])
# Bulk delete old memories
result = client.delete_outdated(older_than_seconds=86400*30, dry_run=True)
print(f"Would delete {result['deleted']} memories")
# Merge
merged = client.merge(["mem_abc123", "mem_def456"])
print(merged.content) # LLM-summarized
TypeScript SDK
// Update
await amr.update('mem_abc123', { content: 'New content', tags: ['updated'] })
// Bulk delete
const result = await amr.deleteOutdated({ olderThanSeconds: 86400 * 30, dryRun: true })
// Merge
const merged = await amr.merge(['mem_abc123', 'mem_def456'])
Self-Hosting
Run MrMemory on your own infrastructure with Docker Compose. Full stack: Rust API + PostgreSQL + Qdrant vector DB.
Quick Start
# Clone the repo
git clone https://github.com/masterdarren23/mrmemory.git
cd mrmemory/amr-project
# Configure
cp .env.example .env
# Edit .env — add your OPENAI_API_KEY
# Launch
docker compose up -d
# API is now at http://localhost:8080
curl http://localhost:8080/v1/health
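The stack can take a moment to come up after `docker compose up -d`. A hedged Python sketch of a readiness wait, with the health probe injected so the retry logic stands on its own (in practice the probe would GET http://localhost:8080/v1/health):

```python
# Hedged sketch: poll a health probe until it succeeds or a timeout
# elapses. Returns True once the probe reports healthy.
import time
from typing import Callable

def wait_healthy(probe: Callable[[], bool],
                 timeout: float = 60.0, interval: float = 2.0) -> bool:
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval)
    return False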
Architecture
┌─────────────┐ ┌──────────────┐ ┌─────────────┐
│ Your Agent │────▶│ MrMemory │────▶│ Qdrant │
│ (SDK/REST) │ │ API (Rust) │ │ (Vectors) │
└─────────────┘ └──────┬───────┘ └─────────────┘
│
┌──────▼───────┐
│ PostgreSQL │
│ (Metadata) │
└──────────────┘
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| OPENAI_API_KEY | Yes | — | For embeddings (text-embedding-3-small) |
| DATABASE_URL | Yes | — | PostgreSQL connection string |
| QDRANT_URL | No | http://localhost:6334 | Qdrant REST endpoint |
| EMBEDDING_MODEL | No | text-embedding-3-small | OpenAI embedding model |
| LISTEN_ADDR | No | 0.0.0.0:8080 | API listen address |
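The `.env.example` shipped in the repo is the authoritative template; a minimal local-development `.env` might look like this (all values are placeholders):

```shell
# Placeholder values — copy .env.example and fill in your own.
OPENAI_API_KEY=sk-your-openai-key
DATABASE_URL=postgres://amr:amr@localhost:5432/amr
QDRANT_URL=http://localhost:6334
EMBEDDING_MODEL=text-embedding-3-small
LISTEN_ADDR=0.0.0.0:8080
```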
Rust SDK
The MrMemory API is a Rust/Axum server. For Rust agents, use the REST API directly with reqwest:
use reqwest::Client;
use serde_json::json;

// Requires the `json` feature of reqwest and a tokio runtime.
#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = Client::new();

    // Remember
    client.post("https://amr-memory-api.fly.dev/v1/memories")
        .bearer_auth("amr_sk_...")
        .json(&json!({
            "content": "User prefers Rust over Python",
            "tags": ["preference", "language"],
            "namespace": "default"
        }))
        .send()
        .await?;

    // Recall
    let res = client.get("https://amr-memory-api.fly.dev/v1/memories/recall")
        .bearer_auth("amr_sk_...")
        .query(&[("query", "programming preferences"), ("limit", "5")])
        .send()
        .await?
        .json::<serde_json::Value>()
        .await?;
    println!("{}", res);
    Ok(())
}
CrewAI Integration
Add persistent memory to your CrewAI agents:
from crewai import Agent, Task, Crew
from mrmemory import AMR
memory = AMR("amr_sk_...", agent_id="researcher", namespace="research")
# Store context before task
memory.remember("Project deadline is March 30th", tags=["project", "deadline"])
# Create agent with memory-aware instructions
researcher = Agent(
    role="Research Analyst",
    goal="Research and report findings",
    backstory="You have access to long-term memory via MrMemory.",
    tools=[],
)
# Recall relevant context in your tool or callback
def recall_context(query: str) -> str:
    results = memory.recall(query, limit=5)
    return "\n".join(m.content for m in results)
# Use in task description
context = recall_context("project deadlines")
task = Task(
    description=f"Given this context:\n{context}\n\nAnalyze the timeline.",
    agent=researcher,
)
crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
# Save results back to memory
memory.remember(f"Research findings: {result}", tags=["findings"])
Pricing
| Plan | Price | Memories | API Calls/mo |
|---|---|---|---|
| Starter | $5/mo | 10,000 | 50,000 |
| Pro | $25/mo | 100,000 | 500,000 |