Skip to main content
Every time a new conversation starts, most AI agents forget everything — the user’s name, their past decisions, their preferences, their frustrations. MemLayer solves this blank-slate problem by giving your agent a persistent memory layer that lives outside the context window. You store memories when meaningful information surfaces, retrieve the most relevant ones on demand, and inject them directly into your system prompt so every session feels like a continuation, not a cold start.

How MemLayer Works

MemLayer works in three stages that map naturally onto your agent’s request lifecycle. Store — Call client.remember() whenever your agent encounters information worth keeping: a user preference, a completed task, a key decision, or a session summary. MemLayer assigns the memory a type, an importance score, and an optional TTL, then persists it against a user_id and agent_id so it’s always scoped correctly. Retrieve — Call client.recall() with a natural-language query. MemLayer scores every candidate memory using a hybrid of semantic similarity and importance weighting, then returns a ranked list. You get the most relevant memories first, regardless of when they were stored. Inject — Call client.context() to fetch a pre-formatted block of the most relevant memories for a given user and agent. Paste the result directly into your system prompt and your agent instantly has the context it needs to respond as if it has known the user for months.

Key Features

Semantic Search

Query memories with natural language. MemLayer matches meaning, not just keywords, so “what does this user like?” surfaces preferences even if the exact words never appear.

Hybrid Scoring

Every recall ranks results by a blend of semantic similarity and importance score, so high-signal memories rise to the top even when recency or exact wording would bury them.

Memory Types

Store episodic, semantic, and summary memories to match the nature of the information — raw events, distilled facts, or condensed session overviews.

Auto-Deduplication

MemLayer detects near-duplicate memories before writing. The is_duplicate flag on every remember() response tells you whether the memory was new or already existed.

TTL Expiry

Set a time-to-live on any memory so short-lived context — like a one-time promo code or a temporary preference — expires automatically without manual cleanup.

Async Support

Use AsyncMemLayerClient to call every method with await in fully async agent frameworks like LangChain, AutoGen, or custom asyncio pipelines.

Memory Types

Choose the memory type that matches the nature of the information you are storing. Each type is retrieved and scored independently, giving you fine-grained control over what surfaces in context.
TypeDescriptionExample
episodicA specific event or interaction that happened at a point in time.”User completed onboarding on 2024-03-15.”
semanticA general fact, preference, or piece of knowledge about the user or domain.”User prefers dark mode and concise replies.”
summaryA compressed overview of a past session or a series of interactions.”In the last session, user troubleshot a login issue and resolved it by resetting their password.”

Next Steps

Quickstart

Follow the step-by-step guide to store your first memory and retrieve it with a semantic query in under 5 minutes.

API Reference

Dig into every method, parameter, and response field in the full API reference.