Memory Infrastructure

Persistent memory
for AI agents.

One API to give any AI agent, on any model, access to shared context that survives across sessions, handoffs, and providers.

Read the docs
memory_save
memory_search
list_memory
0.0%
LongMemEval-S Score
#1
Among Production Systems
10
MCP Tools
<50ms
Retrieval Latency

Quick Start

Integrate in minutes

TypeScript, Python, or MCP: pick your path and add persistent memory to any AI agent.

memory.ts
import { NeutrallyMemory } from '@neutrally/sdk';

const memory = new NeutrallyMemory({
  apiKey: process.env.NEUTRALLY_API_KEY,
});

// Store a memory
await memory.store({
  fact: "User prefers TypeScript over JavaScript",
  type: "preference",
});

// Search memories semantically
const results = await memory.search("programming language preferences");

// Retrieve context for an AI prompt
const context = await memory.getContext({
  query: "What tech stack does the user prefer?",
  limit: 20,
});

Why Neutrally

Built for multi-model AI

Model agnostic

One memory layer across Claude, GPT, Gemini, Llama, and any future model. Switch providers without losing context.

Hybrid retrieval

Vector embeddings + full-text search with Reciprocal Rank Fusion. Semantic understanding meets keyword precision.

Smart extraction

Structured fact extraction with semantic deduplication. New information automatically supersedes outdated facts.

Privacy first

Connect your own API keys. Keys stay server-side and are encrypted with AES-256-GCM. LLM providers do not train on API data.

MCP native

10 MCP tools out of the box. Connect from Claude Code, VS Code, Cursor, or any MCP-compatible client.

Production ready

Row-level security, rate limiting, token management. Running in production with real users today.

API

Simple, powerful endpoints

RESTful API with Bearer token authentication. Store, search, and retrieve memories with a few lines of code.

Memory

GET
/api/v1/memory

List all memories with optional type filtering

POST
/api/v1/memory

Store a new memory with type, title, and content

PATCH
/api/v1/memory

Update an existing memory

DELETE
/api/v1/memory

Delete a memory by ID

Search and Context

GET
/api/memory/search

Semantic search across all conversation memory

GET
/api/memory/context

Get aggregated user context (interests, projects, stack)

GET
/api/memory/fetch

Fetch complete conversation with all messages

GET
/api/memory/recent

Get recent messages across all conversations

Example

search-memories.ts
const response = await fetch(
  'https://neutrally.app/api/memory/search?query=typescript',
  {
    headers: {
      'Authorization': 'Bearer your-token',
    },
  }
);

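// Surface HTTP errors instead of parsing an error body as results
if (!response.ok) {
  throw new Error(`Search failed with status ${response.status}`);
}

const memories = await response.json();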
// Returns:
// [
//   {
//     "conversation_id": "abc-123",
//     "title": "Tech Stack Discussion",
//     "summary": "User prefers TypeScript...",
//     "keywords": ["typescript", "react"]
//   }
// ]

store-memory.ts
await fetch('https://neutrally.app/api/v1/memory', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer your-token',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    type: 'preference',
    title: 'Language preference',
    content: 'User prefers TypeScript over JavaScript',
  }),
});
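
The search example above places the query directly in the URL, which only works for simple terms. The following is a minimal sketch of a helper that URL-encodes the query first. The helper names `buildSearchUrl` and `authHeaders` are hypothetical and not part of any Neutrally SDK; the endpoint path and the `query` parameter are taken from the example above.

```typescript
// Hypothetical helper: builds the search URL with a safely encoded query.
function buildSearchUrl(query: string, baseUrl = "https://neutrally.app"): string {
  const url = new URL("/api/memory/search", baseUrl);
  // URLSearchParams encodes spaces, '&', '#', and other reserved characters.
  url.searchParams.set("query", query);
  return url.toString();
}

// Hypothetical helper: Bearer token header, as used in the examples above.
function authHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}

// A query with reserved characters survives the round trip intact.
const searchUrl = buildSearchUrl("C# & .NET preferences");
const roundTripped = new URL(searchUrl).searchParams.get("query");
```

The resulting `searchUrl` and `authHeaders(token)` can be passed to `fetch` exactly as in search-memories.ts.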

MCP Server

10 tools for your AI workflow

Connect Neutrally to Claude Code, VS Code, Cursor, or any MCP client. Your AI tools get persistent memory instantly.

terminal
npx neutrally@latest    # Connects to your Neutrally account via OAuth

search

Semantic search across all conversations and memories

list_conversations

List recent conversations with summaries and keywords

fetch_conversation

Get complete conversation with every message

get_context

Retrieve aggregated user context and preferences

recent

Get the most recent messages across all conversations

extract_code

Extract code snippets from conversation history

summarize_activity

Summarize recent activity and conversation topics

save_memory

Store a new structured memory item

list_memory

List all stored memory items with filtering

delete_memory

Remove a memory item by ID

Architecture

How the memory layer works

01

Conversations flow through Neutrally

Your AI agent chats with users through any model. Every message is stored and indexed in real time.

02

Facts are extracted and embedded

Structured extraction captures personal facts, dates, numbers, preferences, and decisions. Each fact is embedded as a vector for semantic retrieval.

03

Semantic deduplication resolves conflicts

When new information contradicts old information, the system detects the conflict and keeps the most recent version. "I live in London" gets superseded by "I moved to Berlin."
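
To illustrate the idea, here is a minimal sketch of recency-based supersession. It is not Neutrally's implementation. It assumes that two facts are "about the same thing" when the cosine similarity of their embeddings exceeds a threshold; the `Fact` shape, the `upsertFact` name, the 0.85 threshold, and the hand-made three-dimensional embeddings are all assumptions for illustration. A production system would likely need more than raw similarity to detect contradictions.

```typescript
interface Fact {
  text: string;
  embedding: number[]; // toy vectors here; real embeddings have many more dimensions
  storedAt: number;    // Unix timestamp in milliseconds
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical upsert: the most recent fact on a topic wins.
function upsertFact(store: Fact[], incoming: Fact, threshold = 0.85): Fact[] {
  const isSameTopic = (f: Fact) =>
    cosineSimilarity(f.embedding, incoming.embedding) >= threshold;
  // If an equally recent or newer fact on the same topic exists, ignore the incoming one.
  if (store.some((f) => isSameTopic(f) && f.storedAt >= incoming.storedAt)) {
    return store;
  }
  // Otherwise drop older same-topic facts and append the new one.
  return [...store.filter((f) => !isSameTopic(f)), incoming];
}

const existing: Fact[] = [
  { text: "I live in London", embedding: [0.9, 0.1, 0.0], storedAt: 1 },
  { text: "User prefers TypeScript", embedding: [0.0, 0.1, 0.9], storedAt: 2 },
];
const updated = upsertFact(existing, {
  text: "I moved to Berlin",
  embedding: [0.88, 0.15, 0.0],
  storedAt: 3,
});
// updated keeps the TypeScript preference and replaces the London fact with the Berlin one.
```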

04

Hybrid retrieval surfaces the right context

Vector similarity + full-text search with Reciprocal Rank Fusion. The most relevant memories surface regardless of which model stored them or when.
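
As a rough illustration of Reciprocal Rank Fusion, the sketch below merges two ranked lists of memory IDs by summing 1 / (k + rank) for each ID across the lists. The function name, the memory IDs, and the constant k = 60 (a commonly used default) are assumptions for this example, not details confirmed by Neutrally.

```typescript
// Reciprocal Rank Fusion: score(id) = sum over lists of 1 / (k + rank), rank starting at 1.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical result lists from the two retrievers.
const vectorHits = ["m3", "m1", "m7"];  // ranked by vector similarity
const keywordHits = ["m1", "m9", "m3"]; // ranked by full-text match
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
// fused is ["m1", "m3", "m9", "m7"]: items found by both retrievers rise to the top.
```

Because the fusion uses only ranks, the two retrievers do not need comparable score scales.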

Give your AI persistent memory

Start with the free tier. One memory layer, every model, every agent.

Read documentation