MrMemory Documentation

Everything you need to give your AI agents persistent memory.

Quickstart

Install the SDK for your language:

Python

pip install mrmemory

# With LangChain/LangGraph support:
pip install "mrmemory[langchain]"

TypeScript / Node

npm install memorymr

Authentication

All API requests require an API key passed via the Authorization header:

Authorization: Bearer amr_sk_your_key_here

Get your API key from the dashboard after signing up.
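
Attaching the header in Python can be done with the standard library alone. A minimal sketch (the `build_request` helper name is ours, not part of any SDK):

```python
import json
import urllib.request

API_KEY = "amr_sk_your_key_here"  # from the dashboard after signing up
BASE_URL = "https://amr-memory-api.fly.dev"

def build_request(path, payload=None, method="GET"):
    """Build an authenticated request for the MrMemory API."""
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

req = build_request("/v1/memories", {"content": "User prefers dark mode"}, method="POST")
```

Every request carries the same `Authorization: Bearer ...` header; the official SDKs do this for you.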

API Endpoints

Base URL: https://amr-memory-api.fly.dev

POST /v1/memories

Store a new memory. Content is automatically embedded and indexed for semantic search.

curl -X POST https://amr-memory-api.fly.dev/v1/memories \
  -H "Authorization: Bearer amr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers dark mode", "tags": ["preferences"], "namespace": "settings", "agent_id": "my-bot", "ttl_seconds": 86400}'

ttl_seconds is optional; memories with a TTL auto-expire and are pruned. The Starter plan allows up to 10,000 memories per namespace.
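
The expiry arithmetic is simple: creation time plus the TTL. A sketch (pruning itself happens server-side):

```python
from datetime import datetime, timedelta, timezone

def expires_at(created_at, ttl_seconds):
    """A memory with ttl_seconds expires this long after creation."""
    return created_at + timedelta(seconds=ttl_seconds)

created = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(expires_at(created, 86400))  # 86400 s = 24 h after creation
```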

POST /v1/memories/prune

Manually trigger cleanup of expired memories.

curl -X POST https://amr-memory-api.fly.dev/v1/memories/prune \
  -H "Authorization: Bearer amr_sk_..."

GET /v1/memories/recall?query=...

Semantic search across stored memories. Returns results ranked by cosine similarity.

curl "https://amr-memory-api.fly.dev/v1/memories/recall?query=user+preferences&namespace=settings&agent_id=my-bot&limit=5&threshold=0.7" \
  -H "Authorization: Bearer amr_sk_..."
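
The threshold parameter drops matches whose cosine similarity falls below it. A toy sketch of that ranking logic (the server does this over real embeddings; the vectors here are illustrative):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, candidates, threshold=0.7, limit=5):
    """Score candidates, keep those at/above threshold, best-first."""
    scored = [(cosine_similarity(query_vec, vec), mem_id) for mem_id, vec in candidates]
    kept = [(s, m) for s, m in scored if s >= threshold]
    return sorted(kept, reverse=True)[:limit]

print(rank([1.0, 0.0], [("mem_a", [1.0, 0.0]), ("mem_b", [0.0, 1.0])]))
# → [(1.0, 'mem_a')]  (mem_b scores 0.0, below the 0.7 threshold)
```
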

DELETE /v1/memories/:id

Delete a specific memory by ID.

curl -X DELETE https://amr-memory-api.fly.dev/v1/memories/mem_abc123 \
  -H "Authorization: Bearer amr_sk_..."

GET /v1/memories

List all stored memories (paginated).

curl "https://amr-memory-api.fly.dev/v1/memories?namespace=settings&limit=20&offset=0" \
  -H "Authorization: Bearer amr_sk_..."

WS /v1/ws?token=amr_sk_...

Real-time WebSocket for live memory events between agents. Subscribe to namespaces/agents and receive instant notifications.

// Connect
const ws = new WebSocket("wss://amr-memory-api.fly.dev/v1/ws?token=amr_sk_...")

// Subscribe to events
ws.send(JSON.stringify({"type": "subscribe", "namespace": "weather-trading"}))

// Receive real-time events
// {"type": "memory.created", "memory": {...}, "namespace": "...", "agent_id": "..."}
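
Each frame is JSON with a type field, so a client can route events with a small dispatcher. A Python sketch (only memory.created appears in the docs; other event types are assumptions to handle defensively):

```python
import json

def handle_event(raw):
    """Parse a WebSocket frame and route it by event type."""
    event = json.loads(raw)
    if event["type"] == "memory.created":
        mem = event["memory"]
        return f"new memory in {event['namespace']}: {mem['content']}"
    # Unknown event types: log rather than crash.
    return f"unhandled event: {event['type']}"

frame = '{"type": "memory.created", "memory": {"content": "hi"}, "namespace": "settings", "agent_id": "my-bot"}'
print(handle_event(frame))  # new memory in settings: hi
```
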

POST /v1/memories/auto

LLM-powered auto-remember. Send conversation messages and the server extracts facts, deduplicates, and stores them automatically. Supports async (fire-and-forget) and sync modes. BYOK supported.

curl -X POST https://amr-memory-api.fly.dev/v1/memories/auto \
  -H "Authorization: Bearer amr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "I love hiking and my favorite language is Rust"},
      {"role": "assistant", "content": "Great choices! Rust is fantastic."}
    ],
    "namespace": "default",
    "sync": true
  }'

Set "sync": false (the default) for fire-and-forget; the call returns a job_id immediately. Pass "llm_api_key" to bring your own key (BYOK).
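
The dedup step skips facts already in the store. A rough client-side sketch of the idea (string normalization here is a stand-in; the real service compares embeddings):

```python
def dedupe_facts(facts):
    """Keep each fact once, ignoring case and whitespace differences
    (a sketch of server-side dedup, which actually uses embeddings)."""
    seen = set()
    kept = []
    for fact in facts:
        key = " ".join(fact.lower().split())
        if key not in seen:
            seen.add(key)
            kept.append(fact)
    return kept

print(dedupe_facts(["User loves hiking", "user loves  hiking", "Favorite language is Rust"]))
# → ['User loves hiking', 'Favorite language is Rust']
```
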

POST /v1/memories/compress

Compress related memories in a namespace. Groups semantically similar memories and merges each group into a single, denser memory using LLM summarization.

curl -X POST https://amr-memory-api.fly.dev/v1/memories/compress \
  -H "Authorization: Bearer amr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "namespace": "default",
    "threshold": 10,
    "similarity_threshold": 0.75,
    "sync": true,
    "dry_run": false
  }'

Set "dry_run": true to preview what would be compressed without changing anything. threshold is the minimum number of memories in the namespace before compression triggers.
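
Conceptually, compression greedily groups memories whose similarity clears similarity_threshold, then each group is summarized into one memory. A toy sketch of the grouping step (the real clustering runs server-side on embeddings; the similarity function here is a stand-in):

```python
def group_similar(ids, similarity, threshold=0.75):
    """Greedy grouping: each memory joins the first group whose seed
    it resembles at/above the threshold, else starts a new group."""
    groups = []
    for mem_id in ids:
        for group in groups:
            if similarity(group[0], mem_id) >= threshold:
                group.append(mem_id)
                break
        else:
            groups.append([mem_id])
    return groups

# Toy similarity: memories sharing the prefix before "_" count as alike.
sim = lambda a, b: 1.0 if a.split("_")[0] == b.split("_")[0] else 0.0
print(group_similar(["ui_1", "ui_2", "lang_1"], sim))
# → [['ui_1', 'ui_2'], ['lang_1']]
```

Each resulting group with more than one member would then be merged by LLM summarization.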

Python SDK

from mrmemory import AMR

client = AMR("amr_sk_...", agent_id="my-bot", namespace="user-prefs")

# Store a memory
client.remember("User prefers dark mode and vim keybindings", tags=["preferences"])

# Semantic recall
memories = client.recall("What are the user's preferences?")
for m in memories:
    print(m.content, m.similarity)

# Forget a memory
client.forget(memories[0].id)

Auto-Remember

# Extract memories from conversations automatically
result = client.auto_remember([
    {"role": "user", "content": "I love hiking and my favorite language is Rust"},
    {"role": "assistant", "content": "Great choices!"},
], sync=True)
# → {"extracted": 2, "created": 2, "duplicates_skipped": 0, ...}

Compression

# Compress related memories into denser representations
result = client.compress(namespace="default", sync=True, dry_run=True)  # preview
result = client.compress(namespace="default", sync=True)  # actually compress
# → {"before_count": 50, "after_count": 28, "groups_compressed": 12, ...}

LangChain / LangGraph Integration

pip install "mrmemory[langchain]"

from mrmemory.langchain import MrMemoryCheckpointer, MrMemoryStore
from langgraph.graph import StateGraph

checkpointer = MrMemoryCheckpointer(api_key="amr_sk_...")
store = MrMemoryStore(api_key="amr_sk_...")

# Drop-in replacement for any LangGraph checkpointer
graph = StateGraph(...).compile(checkpointer=checkpointer, store=store)

Async

from mrmemory import AsyncAMR

async with AsyncAMR("amr_sk_...", agent_id="my-bot") as client:
    await client.remember("User prefers dark mode")
    memories = await client.recall("preferences")

TypeScript SDK

import { AMR } from 'memorymr'

const amr = new AMR('amr_sk_...')

// Store a memory
await amr.remember('User prefers dark mode and vim keybindings')

// Semantic recall
const memories = await amr.recall('What are the user preferences?')
for (const m of memories) {
  console.log(m.content, m.score)
}

// Forget a memory
await amr.forget(memories[0].id)

Self-Edit Tools

Let your agents manage their own memory — update, prune, and merge.

Update a Memory

PATCH /v1/memories/{id}
Content-Type: application/json
Authorization: Bearer amr_sk_...

{
  "content": "Updated preference: user now prefers light mode",
  "tags": ["preference", "ui"]
}

Re-embeds automatically when content changes. Partial updates supported (send only fields you want to change).
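
Because updates are partial, a client should serialize only the changed fields. A sketch of building such a payload (the helper is ours, not part of the SDK):

```python
import json

def build_patch(content=None, tags=None):
    """Include only the fields being changed; omitted fields stay untouched."""
    payload = {}
    if content is not None:
        payload["content"] = content
    if tags is not None:
        payload["tags"] = tags
    return json.dumps(payload)

print(build_patch(tags=["preference", "ui"]))  # content is left untouched
```
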

Bulk Delete Outdated

DELETE /v1/memories/outdated
Content-Type: application/json

{
  "older_than_seconds": 2592000,
  "tags": ["ephemeral"],
  "namespace": "chat-session",
  "dry_run": true
}

Supports dry_run: true to preview what would be deleted without actually removing anything.
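
The selection logic is a cutoff plus optional tag filter. A sketch of what the server evaluates (with dry_run, it stops after selecting; field names mirror the request above):

```python
from datetime import datetime, timedelta, timezone

def select_outdated(memories, older_than_seconds, tags=None, now=None):
    """Pick memories created before the cutoff, optionally matching any given tag."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=older_than_seconds)
    return [
        m for m in memories
        if m["created_at"] < cutoff
        and (tags is None or set(tags) & set(m["tags"]))
    ]

now = datetime(2025, 2, 1, tzinfo=timezone.utc)
mems = [
    {"id": "mem_old", "created_at": datetime(2024, 12, 1, tzinfo=timezone.utc), "tags": ["ephemeral"]},
    {"id": "mem_new", "created_at": datetime(2025, 1, 31, tzinfo=timezone.utc), "tags": ["ephemeral"]},
]
# 2592000 s = 30 days: only mem_old predates the cutoff.
print([m["id"] for m in select_outdated(mems, 2592000, tags=["ephemeral"], now=now)])
# → ['mem_old']
```
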

Merge Memories

POST /v1/memories/merge
Content-Type: application/json

{
  "memory_ids": ["mem_abc123", "mem_def456", "mem_ghi789"],
  "namespace": "default"
}

Merges 2-50 memories into one using LLM summarization. Source memories are deleted. The merged memory gets is_compressed: true and merged_from tracking. Pass content to override the LLM summary.
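
Since the endpoint accepts 2-50 IDs, a client can validate before calling. A small sketch (the validator is ours; the server enforces the same limits):

```python
def validate_merge_ids(memory_ids):
    """The merge endpoint accepts between 2 and 50 memory IDs."""
    if not 2 <= len(memory_ids) <= 50:
        raise ValueError(f"merge needs 2-50 memory IDs, got {len(memory_ids)}")
    if len(set(memory_ids)) != len(memory_ids):
        raise ValueError("duplicate memory IDs in merge request")
    return memory_ids

print(validate_merge_ids(["mem_abc123", "mem_def456"]))
# → ['mem_abc123', 'mem_def456']
```
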

Python SDK

# Update
client.update("mem_abc123", content="New content", tags=["updated"])

# Bulk delete old memories
result = client.delete_outdated(older_than_seconds=86400*30, dry_run=True)
print(f"Would delete {result['deleted']} memories")

# Merge
merged = client.merge(["mem_abc123", "mem_def456"])
print(merged.content)  # LLM-summarized

TypeScript SDK

// Update
await amr.update('mem_abc123', { content: 'New content', tags: ['updated'] })

// Bulk delete
const result = await amr.deleteOutdated({ olderThanSeconds: 86400 * 30, dryRun: true })

// Merge
const merged = await amr.merge(['mem_abc123', 'mem_def456'])

Self-Hosting

Run MrMemory on your own infrastructure with Docker Compose. Full stack: Rust API + PostgreSQL + Qdrant vector DB.

Quick Start

# Clone the repo
git clone https://github.com/masterdarren23/mrmemory.git
cd mrmemory/amr-project

# Configure
cp .env.example .env
# Edit .env — add your OPENAI_API_KEY

# Launch
docker compose up -d

# API is now at http://localhost:8080
curl http://localhost:8080/v1/health

Architecture

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Your Agent  │────▶│  MrMemory    │────▶│  Qdrant     │
│  (SDK/REST)  │     │  API (Rust)  │     │  (Vectors)  │
└─────────────┘     └──────┬───────┘     └─────────────┘
                           │
                    ┌──────▼───────┐
                    │  PostgreSQL  │
                    │  (Metadata)  │
                    └──────────────┘

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OPENAI_API_KEY | Yes | - | For embeddings (text-embedding-3-small) |
| DATABASE_URL | Yes | - | PostgreSQL connection string |
| QDRANT_URL | No | http://localhost:6334 | Qdrant REST endpoint |
| EMBEDDING_MODEL | No | text-embedding-3-small | OpenAI embedding model |
| LISTEN_ADDR | No | 0.0.0.0:8080 | API listen address |
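
The defaults above can be mirrored when reading configuration. A Python sketch of that pattern (the Rust server's actual parsing may differ):

```python
DEFAULTS = {
    "QDRANT_URL": "http://localhost:6334",
    "EMBEDDING_MODEL": "text-embedding-3-small",
    "LISTEN_ADDR": "0.0.0.0:8080",
}
REQUIRED = ("OPENAI_API_KEY", "DATABASE_URL")

def load_config(env):
    """Required vars must be present; optional ones fall back to defaults."""
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")
    cfg = {k: env.get(k, default) for k, default in DEFAULTS.items()}
    cfg.update({k: env[k] for k in REQUIRED})
    return cfg

cfg = load_config({"OPENAI_API_KEY": "sk-...", "DATABASE_URL": "postgres://localhost/amr"})
print(cfg["QDRANT_URL"])  # http://localhost:6334
```
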

Rust SDK

The MrMemory API is a Rust/Axum server. For Rust agents, use the REST API directly with reqwest:

use reqwest::Client;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = Client::new();

    // Remember
    client.post("https://amr-memory-api.fly.dev/v1/memories")
        .bearer_auth("amr_sk_...")
        .json(&json!({
            "content": "User prefers Rust over Python",
            "tags": ["preference", "language"],
            "namespace": "default"
        }))
        .send().await?;

    // Recall
    let res = client.get("https://amr-memory-api.fly.dev/v1/memories/recall")
        .bearer_auth("amr_sk_...")
        .query(&[("query", "programming preferences"), ("limit", "5")])
        .send().await?
        .json::<serde_json::Value>().await?;

    println!("{}", res);
    Ok(())
}

CrewAI Integration

Add persistent memory to your CrewAI agents:

from crewai import Agent, Task, Crew
from mrmemory import AMR

memory = AMR("amr_sk_...", agent_id="researcher", namespace="research")

# Store context before task
memory.remember("Project deadline is March 30th", tags=["project", "deadline"])

# Create agent with memory-aware instructions
researcher = Agent(
    role="Research Analyst",
    goal="Research and report findings",
    backstory="You have access to long-term memory via MrMemory.",
    tools=[],
)

# Recall relevant context in your tool or callback
def recall_context(query: str) -> str:
    results = memory.recall(query, limit=5)
    return "\n".join(m.content for m in results)

# Use in task description
context = recall_context("project deadlines")
task = Task(
    description=f"Given this context:\n{context}\n\nAnalyze the timeline.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()

# Save results back to memory
memory.remember(f"Research findings: {result}", tags=["findings"])

Pricing

| Plan | Price | Memories | API Calls/mo |
| --- | --- | --- | --- |
| Starter | $5/mo | 10,000 | 50,000 |
| Pro | $25/mo | 100,000 | 500,000 |
