Your LLMs forget everything between sessions. Remember Ninja gives them structured, versioned, searchable memory — via REST, WebSocket, or MCP. No LLM on the backend. Pure heuristics. Sub-2ms reads.

Memory-as-a-service

Your AI agents deserve real memory

Latest assertion
decisions.auth.provider
"Clerk" ← superseded "Auth0"
Active
Problem 01

Context windows are not memory

LLMs lose everything when the session ends. Chat summaries are lossy, unstructured, and unsearchable. Your agents make the same mistakes twice because they can't remember what was decided — or why.

Problem 02

Intent without reason is useless

Most memory stores what was decided. Not why. Without reasoning, memory is a flat lookup table, not experience. That second layer, the reasoning, is what turns memory into judgment.

Problem 03

Key-value stores weren't built for this

Redis or JSON files give you no history, no provenance, no search. When a fact changes, the old one vanishes. You can't ask "what did we use to believe?"

Problem 04

N tools × M backends

Claude, Cursor, ChatGPT, OpenClaw, custom agents — each needs memory, each talks a different protocol. You're rebuilding the same persistence layer over and over.

What it is

A memory layer built for AI agents

Every stored fact is a versioned assertion with a canonical keypath, status lifecycle, provenance, and optional vector embedding. Memory isn't gospel — it's experience. Assertions are the accumulated output of previous thinking. Was the last decision good? Did the context change? Your agent can compare, supersede, and evolve what it knows.

No LLM on the backend. Retrieval is pure heuristics — deterministic lookups, full-text search, vector similarity, reciprocal rank fusion. No inference in the hot path. Sub-2ms reads. No GPU costs. Predictable latency at any scale.
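The rank fusion step mentioned above can be sketched in a few lines. This is an illustrative reciprocal rank fusion over two ranked result lists, not the service's actual implementation; the constant k = 60 is the commonly used default.

```typescript
// Reciprocal rank fusion: merge two ranked lists of document ids.
// Each document scores sum(1 / (k + rank)) across the lists it appears in,
// so a hit ranked well in both lists outranks a top hit in only one.
function rrf(lexical: string[], semantic: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of [lexical, semantic]) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

console.log(rrf(["a", "b", "c"], ["b", "c", "d"])); // → ["b", "c", "a", "d"]
```

Because no model is consulted, fusion costs a few map operations and a sort: the same inputs always produce the same ranking, at deterministic latency.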

How it works
1

Store with reason

{
  "keypath": "decisions.auth.provider",
  "value": "Auth0",
  "context": "Evaluated Auth0, Clerk, and Okta. Chose Auth0 for SSO support."
}

Not just the decision — the reasoning behind it. Canonical keypath, timestamp, actor trail, and the why.

2

Retrieve — exact or fuzzy

# Exact keypath — O(1), sub-2ms
GET /v1/assertions/key/decisions.auth.provider

# Hybrid search — lexical + semantic
POST /v1/assertions/search
{ "query": "why did we choose this auth provider" }

Keypath lookups for known facts. Hybrid search when the agent needs to explore past reasoning.

3

Evolve — supersede, never delete

// PUT /v1/assertions/:id/supersede
{
  "value": "Clerk",
  "context": "Auth0 raised prices 3x. Migrating to Clerk."
}

Old fact → superseded. New fact → active. Full chain preserved. That's experience, not just storage.
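The lifecycle above can be modeled in a few lines. This is a hypothetical in-memory sketch of the supersede semantics (the actual service persists assertions in Postgres with full provenance); the field names are illustrative.

```typescript
type Status = "active" | "superseded" | "retracted";

interface Assertion {
  keypath: string;
  value: string;
  context: string;
  status: Status;
  supersededBy?: number; // index of the newer assertion, if any
}

const log: Assertion[] = [];

// Store a new fact; mark any active fact at the same keypath as superseded.
// Nothing is ever deleted, so the full chain of reasoning is preserved.
function put(keypath: string, value: string, context: string): number {
  const prev = log.findIndex(a => a.keypath === keypath && a.status === "active");
  const id = log.push({ keypath, value, context, status: "active" }) - 1;
  if (prev !== -1) {
    log[prev].status = "superseded";
    log[prev].supersededBy = id;
  }
  return id;
}

put("decisions.auth.provider", "Auth0", "Chose Auth0 for SSO support.");
put("decisions.auth.provider", "Clerk", "Auth0 raised prices 3x. Migrating to Clerk.");

console.log(log.map(a => `${a.value}: ${a.status}`)); // → ["Auth0: superseded", "Clerk: active"]
```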

Scoping

Memory that mirrors your structure

Per project

Each codebase gets isolated memory. Frontend decisions don't bleed into backend.

Per product

Each product in your SaaS suite has its own memory space.

Per user

Every end user gets scoped memory. Isolated, searchable, GDPR-erasable in one call.

Per domain

Share memory across products in a group. Internal tooling patterns, company-wide decisions.

Nested

Scopes compose naturally. User within project within product group. Memory resolves at the right level.
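Nested resolution can be sketched as a walk from the most specific scope outward. This is a hypothetical illustration of the fallback behavior, not the service's actual lookup code.

```typescript
type Scope = { name: string; facts: Record<string, string>; parent?: Scope };

// Resolve a keypath at the nearest scope that defines it,
// falling back through parents (user → project → product group).
function resolve(scope: Scope | undefined, keypath: string): string | undefined {
  for (let s = scope; s; s = s.parent) {
    if (keypath in s.facts) return s.facts[keypath];
  }
  return undefined;
}

const group: Scope = { name: "product-group", facts: { "standards.lang": "TypeScript" } };
const project: Scope = { name: "project", facts: { "decisions.db": "Postgres" }, parent: group };
const user: Scope = { name: "user", facts: { "prefs.theme": "dark" }, parent: project };

console.log(resolve(user, "prefs.theme"));    // "dark" (found at user scope)
console.log(resolve(user, "decisions.db"));   // "Postgres" (inherited from project)
console.log(resolve(user, "standards.lang")); // "TypeScript" (inherited from group)
```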

Capabilities

What's under the hood

Memory as experience

Assertions capture what was decided and why. Agents use memory as accumulated judgment — compare outcomes, detect shifts, evolve understanding.

No LLM in the loop

Pure heuristics. Deterministic lookups, FTS, vector similarity, rank fusion. Sub-2ms reads. No GPU costs. Predictable latency at any scale.

Always-on WebSocket

Agents stay connected. Persistent, bidirectional memory channel. Store and retrieve mid-conversation. Sub-5ms reads.

Versioned assertions

Active, superseded, retracted. Full history. Nothing silently overwritten. "I changed my mind" is a first-class operation.

Hybrid search

Four modes: exact keypath, lexical FTS, semantic vector, hybrid with reciprocal rank fusion. Conflicts surfaced automatically.

MCP-native

12 tools via Model Context Protocol. Claude Code, Cursor, and any MCP client discover memory automatically. Zero glue code.

GDPR by design

Hard-delete per scope or keypath pattern. Compliance receipts with zero PII. Built in, not bolted on.

Audit trail

Append-only event log. Every mutation tracked: who, what, when, why. Cursor-based pagination. Enterprise-grade provenance.
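Cursor-based pagination over an append-only log is stable because entries never move or disappear. A minimal sketch, assuming numeric cursors ("last id seen"); the real API's cursor format may differ.

```typescript
interface Event { id: number; action: string }

// Ids are monotonically increasing and entries are immutable, so a numeric
// cursor pages stably even while new events are being appended.
function page(log: Event[], cursor: number, limit: number) {
  const items = log.filter(e => e.id > cursor).slice(0, limit);
  const nextCursor = items.length ? items[items.length - 1].id : cursor;
  return { items, nextCursor };
}

const events: Event[] = [1, 2, 3, 4, 5].map(id => ({ id, action: `event-${id}` }));
const first = page(events, 0, 2);                  // events 1–2
const second = page(events, first.nextCursor, 2);  // events 3–4
console.log(second.items.map(e => e.id)); // → [3, 4]
```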

Admin panel

See inside the brain

The memory backend is pure heuristics. No LLM. But the people managing it need intelligence. The admin panel uses AI for analytics and steering — pattern detection, drift monitoring, anomaly surfacing — while the memory layer stays fast and deterministic.

AI-powered analytics

Drift detection across scopes. Memory health monitoring. Usage patterns. Anomaly surfacing when assertion patterns change unexpectedly.

Pre-seed memory

Inject baseline knowledge before agents start. Global standards, per-project context, per-account preferences. Template scopes for new projects.

Direct management

Browse, search, and edit assertions across all scopes. View supersession chains. Retract or correct facts when agents get it wrong.

Operational controls

API key rotation and revocation. Webhook configuration. GDPR erasure with dry-run preview. Full event log. Per-service usage metrics.

Performance

Fast because there's no LLM in the way

Operation        Cloud API      CLI (local)
Keypath read     < 2ms          < 1ms
Write            < 5ms          < 10ms
Lexical search   10–20ms        < 15ms
Hybrid search    60–120ms       60–220ms
MCP tool call    1–10ms
Auth overhead    < 1ms cached

p50 < 20ms · p99 < 100ms · ~100K WebSocket connections/instance
Open source

Free CLI. Run it locally. No account needed.

The Remember Ninja CLI is free and open source. Install it, run it on your own machine with a local SQLite database. No server, no sign-up, no network calls. Full assertion model, full search, full MCP server — completely offline.

Use it standalone with OpenClaw or any MCP-capable tool for free local memory. Or connect the same CLI to the remember.ninja cloud service when you need shared, multi-user, cross-device memory.

Install
npm install -g @remember-ninja/cli
remember init
Use
# Store a decision with reasoning
remember put "decisions.stack.framework" "Fastify" \
  --context "Benchmarked against Express and Hono"

# Search your memory
remember search "framework decision"

# View decision history
remember history "decisions.stack.framework"
Add to Claude Code / Cursor
{
  "mcpServers": {
    "remember-ninja": {
      "command": "remember",
      "args": ["mcp", "start"]
    }
  }
}
Optional: connect to cloud
# Same CLI, now syncing to remember.ninja
remember config set api.endpoint "https://api.remember.ninja"
remember config set api.key "your-api-key"
Cloud integration

Three transports. One memory layer.

REST API

Any HTTP client. No SDK required. TypeScript SDK available for convenience.

npm install @remember-ninja/client

WebSocket — always connected

Persistent bidirectional memory channel. Sub-5ms reads. The recommended path for agents that need constant memory access.

{
  "transport": "websocket",
  "endpoint": "wss://api.remember.ninja/v1/ws"
}

MCP — Claude Code, Cursor, any MCP client

12 tools auto-discovered. Zero glue code. Your LLM assistant gets persistent memory in one config block.

{
  "mcpServers": {
    "remember-ninja-cloud": {
      "url": "https://api.remember.ninja/v1/mcp/sse"
    }
  }
}

OpenClaw — multi-agent shared memory

Plug Remember Ninja into OpenClaw as the shared memory layer across sessions and agents. Every agent in the swarm reads and writes to the same scoped memory. WebSocket keeps all agents in sync. Works with the free local CLI or the cloud service.

Use cases

Built for these problems

AI coding assistants

Claude Code and Cursor forget your project decisions every session. With Remember Ninja, every session recalls not just "we use Postgres" but why — and whether that decision worked out.

Agents that improve each turn

Each turn, your agent stores what it learned and retrieves what it knows. New information supersedes old, with reason captured. Over time: real experience, at heuristic speed.

Multi-agent + OpenClaw

Every agent in the swarm shares consistent context. Canonical keypaths as source of truth. When one agent updates a fact, others see the change with reasoning attached.

Customer-facing AI products

Thousands of users, each with scoped memory. Preferences, history, context — searchable, versioned, GDPR-erasable per user in one API call.

Decision audit trails

What did your AI believe at any point in time? Append-only event log and supersession history. Complete, immutable record with full provenance.

Pre-seeded knowledge bases

Inject company policies, architectural standards, and compliance rules before agents start. Starting points they can evolve — not locked-in rules.

Who it's for

Developers building AI agents

You're wiring LLMs into products. You need memory that's structured, searchable, and doesn't lose history. Remember Ninja gives your agents a real memory layer — not a chat log dump.

Developers using AI coding tools

Claude Code and Cursor are powerful, but they forget your project decisions every session. The free CLI gives them persistent, project-scoped memory that survives context window limits.

Development managers

Your team builds with LLMs. You need auditability, compliance, and a memory layer that doesn't become another integration to maintain. One service, three transports, full provenance. Admin panel with AI analytics.

Architecture

What's running underneath

Cloud infrastructure

  • PostgreSQL 16 — pgvector, FTS, JSONB, HNSW indexes
  • Cloudflare (CDN, WAF, DDoS, SSL Full Strict)
  • Google Cloud Run (auto-scaling, zero cold starts)
  • Cloud SQL (managed Postgres, HA, PITR)
  • Three transports on a single endpoint

CLI (local)

  • SQLite with WAL mode
  • Native C++ bindings for speed
  • FTS5 for full-text search
  • Sub-50ms cold start
  • Open source, free to use

Security

  • API keys: 32-byte crypto random, SHA-256 hashed
  • JWT authentication (HS256)
  • Parameterized queries (no string interpolation)
  • Compiled input validation schemas (< 0.5ms)
  • Rate limiting: 100 rps/key, 1000 req/min/IP
Enterprise

Deploy on your infrastructure

Full on-premise or private cloud installation. Same API, same performance, same MCP integration — running inside your security perimeter.

Contact sales

Full on-site or private cloud installation

Same REST + WebSocket + MCP API surface

Your database, your network, your compliance

Dedicated onboarding and deployment support

SSO/SCIM integration for your identity provider

SLA, support agreements, audit trail included

FAQ

How long does integration take?

REST: any HTTP client, minutes. MCP: add a config block, zero code. CLI: npm install -g, one command.

Is the CLI really free?

Yes. The CLI is open source and runs entirely on your local machine with SQLite. No account, no server, no limits. Use it standalone with OpenClaw or any MCP client. Optionally connect it to the cloud when you need shared memory.

Do I need an embedding provider?

No. Lexical search works out of the box. Semantic search is optional — plug in OpenAI, Voyage, or a local model if you want vector retrieval.
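When an embedding provider is plugged in, semantic mode ranks candidates by vector similarity. Cosine similarity is the usual measure; a minimal sketch (illustrative, independent of any particular provider):

```typescript
// Cosine similarity: dot product divided by the product of magnitudes.
// 1 means same direction, 0 means orthogonal (unrelated).
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

console.log(cosine([1, 0], [1, 0])); // 1 (identical direction)
console.log(cosine([1, 0], [0, 1])); // 0 (orthogonal)
```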

What happens to my data if I stop?

Export your full state as JSON or YAML at any time. The CLI stores everything in a local SQLite file you own.

Is there vendor lock-in?

MIT licensed. Standard protocols (HTTP, WebSocket, MCP). Your data is structured JSON with keypaths — portable by design.

Does it work with OpenClaw / multi-agent frameworks?

Yes. Any framework that supports MCP or HTTP can use Remember Ninja as shared memory. Works locally (free CLI) or via the cloud. WebSocket for real-time sync.

Can admins pre-load memory before agents start?

Yes. Pre-seed assertions at any scope — global standards, per-project context, per-account preferences. Agents can supersede pre-seeded facts as they learn.

What about GDPR?

Hard-delete per scope or keypath pattern. Compliance receipts generated automatically. No PII in receipts.
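Erasure by keypath pattern implies glob-style matching over dot-separated keypaths. The sketch below is a hypothetical illustration of how such a pattern might select assertions; the `*` (one segment) and `**` (any suffix) syntax is an assumption, not the documented grammar.

```typescript
// Match a dot-separated keypath against a pattern where "*" matches exactly
// one segment and "**" matches the rest of the path. Illustrative only.
function matches(pattern: string, keypath: string): boolean {
  const p = pattern.split(".");
  const k = keypath.split(".");
  for (let i = 0; i < p.length; i++) {
    if (p[i] === "**") return true;
    if (i >= k.length) return false;
    if (p[i] !== "*" && p[i] !== k[i]) return false;
  }
  return p.length === k.length;
}

console.log(matches("users.alice.**", "users.alice.prefs.theme")); // true
console.log(matches("users.*.email", "users.bob.email"));          // true
console.log(matches("users.alice.**", "users.bob.prefs"));         // false
```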

Pricing

Start free. Scale when ready.

Open source CLI

Free forever. Local SQLite. Full assertion model, search, and MCP server. No account needed.

Get the CLI

Cloud

Free tier available. Usage-based pricing. Shared memory, WebSocket, admin panel, GDPR tools.

Start free

Enterprise

Full installation on your infrastructure. On-premise or private cloud. Dedicated support and SLA.

Contact sales

Your agents deserve experience, not just storage

Stop losing decisions and their reasoning between sessions. Give your AI a memory layer that's fast, structured, and gets smarter over time — without an LLM in the loop.