id is your handle for future updates or deletions.
Request
Headers
| Name | Required | Description |
|---|---|---|
| X-API-Key | Yes | Your MemLayer API key (ml_live_xxx) |
| Content-Type | Yes | application/json |
Body
The memory text to embed and store. Must be between 1 and 10,000 characters. Leading and trailing whitespace is stripped automatically.
Identifier for the end user this memory belongs to. Max 255 characters. Use a stable, unique ID from your own system (e.g. your database user UUID).
Identifier for the agent storing the memory. Max 255 characters. Use this to namespace memories per bot or assistant (e.g. "support_bot", "sales_agent").
Categorises the memory for filtering and retrieval. One of episodic, semantic, or summary.
- episodic: something that happened ("User complained about late delivery on 2 June").
- semantic: a stable fact ("User is based in Lagos, Nigeria").
- summary: a condensed recap of a prior session.
Priority weight from 0.0 (low) to 1.0 (high). Importance contributes 10% to the hybrid retrieval score, so memories marked 0.9 will surface more reliably than those at 0.1. The default of 0.5 is appropriate for most memories.
Arbitrary key-value data to store alongside the memory, such as session IDs, tags, or source URLs. Not used in retrieval scoring but returned in every response.
Number of days until this memory auto-expires. Range: 1–3650. Omit (or pass null) for a permanent memory. Useful for session-scoped context you never want to retrieve again after a few days.
Response
Success — 201
UUID of the newly stored memory. Save this if you need to update or delete it later.
Human-readable confirmation. Always "Memory stored successfully" on a 201.
Errors
| Code | Meaning |
|---|---|
| 401 | Invalid or missing X-API-Key |
| 402 | Plan memory limit reached; upgrade your plan |
| 403 | Account suspended |
| 409 | Duplicate detected: a near-identical memory already exists; nothing was stored |
| 422 | Validation error; check field types and constraints |
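The field constraints and status codes above can be checked client-side before and after a request. The following is a minimal sketch, not official client code. The helper names `build_payload` and `describe_status` are hypothetical, and the JSON key names `content`, `importance`, and `metadata` are assumptions, since this section only names `user_id`, `agent_id`, `memory_type`, and `ttl_days` explicitly. No network call is made; sending the payload is left to whichever HTTP client you use.

```python
import json

MEMORY_TYPES = {"episodic", "semantic", "summary"}

# Status codes and meanings copied from the Errors table.
ERROR_MEANINGS = {
    401: "Invalid or missing X-API-Key",
    402: "Plan memory limit reached",
    403: "Account suspended",
    409: "Duplicate detected; nothing was stored",
    422: "Validation error",
}


def build_payload(content, user_id, agent_id, memory_type,
                  importance=0.5, metadata=None, ttl_days=None):
    """Check the documented constraints and return a JSON-ready dict.

    The keys "content", "importance" and "metadata" are assumed names.
    """
    content = content.strip()  # the API also strips surrounding whitespace
    if not 1 <= len(content) <= 10_000:
        raise ValueError("content must be 1 to 10,000 characters")
    for name, value in (("user_id", user_id), ("agent_id", agent_id)):
        if len(value) > 255:
            raise ValueError(f"{name} must be at most 255 characters")
    if memory_type not in MEMORY_TYPES:
        raise ValueError("memory_type must be episodic, semantic, or summary")
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must be between 0.0 and 1.0")
    if ttl_days is not None and not 1 <= ttl_days <= 3650:
        raise ValueError("ttl_days must be between 1 and 3650")

    payload = {
        "content": content,
        "user_id": user_id,
        "agent_id": agent_id,
        "memory_type": memory_type,
        "importance": importance,
        "metadata": metadata or {},
    }
    if ttl_days is not None:  # omit the key for a permanent memory
        payload["ttl_days"] = ttl_days
    return payload


def describe_status(status_code):
    """Map a response status code to the meaning documented above."""
    if status_code == 201:
        return "Memory stored successfully"
    return ERROR_MEANINGS.get(status_code, f"Unexpected status {status_code}")


payload = build_payload("  User is based in Lagos, Nigeria  ",
                        "user-123", "support_bot", "semantic")
print(json.dumps(payload, indent=2))
```

Validating locally avoids a round trip that would end in a 422. A 409 is worth treating as a normal outcome and not a failure, since it means an equivalent memory is already stored.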
Notes
How duplicate detection works
Before writing, MemLayer embeds the new content and runs a cosine-similarity search against all existing memories for the same user_id and agent_id. If any stored memory scores ≥ 0.95, the request returns a 409 and nothing is written. This prevents semantically identical facts from accumulating across sessions, for example storing "User is in Lagos" fifty times. Use POST /memories/duplicate-check if you want to probe for duplicates without committing.
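To make the threshold concrete, here is a minimal sketch of a cosine-similarity duplicate check. It illustrates the idea only and is not MemLayer's implementation: the function names are hypothetical, the embeddings are toy two-dimensional vectors, and a real service would use an embedding model and a vector index in place of a linear scan.

```python
import math

DUPLICATE_THRESHOLD = 0.95  # threshold stated in the text above


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def is_duplicate(new_embedding, stored_embeddings):
    """True if any stored embedding scores at or above the threshold.

    stored_embeddings is assumed to hold only memories for the same
    user_id and agent_id, matching the scoping described above.
    """
    return any(
        cosine_similarity(new_embedding, stored) >= DUPLICATE_THRESHOLD
        for stored in stored_embeddings
    )


# Toy vectors: the first stored vector points almost the same way as the new one.
print(is_duplicate([1.0, 0.0], [[0.99, 0.05], [0.0, 1.0]]))  # True
print(is_duplicate([1.0, 0.0], [[0.0, 1.0]]))                # False
```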
Choosing memory_type
Use episodic for time-bound events: things that happened during a specific interaction. Use semantic for enduring facts that remain true across all sessions. Use summary to store compressed recaps you generate yourself at the end of a long conversation. Typing memories correctly lets you filter with memory_type in search and list calls.
Setting importance
Importance feeds into the hybrid retrieval score at a 10% weight. In practice:
- 0.8–1.0: critical facts (user’s name, hard constraints, billing status). Always surfaced.
- 0.4–0.7: general context. Surfaced when relevant.
- 0.0–0.3: low-signal observations. Retrieved only when very similar to the query.
0.5 is a safe choice if you’re unsure.
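A small worked example shows how much a 10% weight can move a score. This is a sketch under a stated assumption: only the 10% importance weight comes from this page, and the remaining 90% is lumped into one hypothetical `other_signals` value between 0 and 1, because the rest of the hybrid formula is not described here.

```python
IMPORTANCE_WEIGHT = 0.10  # the 10% weight stated in the text


def hybrid_score(other_signals, importance):
    """Blend importance with everything else.

    other_signals is a single assumed 0..1 value standing in for the
    undocumented 90% of the score (similarity and any other factors).
    """
    return (1 - IMPORTANCE_WEIGHT) * other_signals + IMPORTANCE_WEIGHT * importance


# Two memories that match the query equally well, differing only in importance.
high = hybrid_score(0.70, importance=0.9)
low = hybrid_score(0.70, importance=0.1)
print(round(high, 2), round(low, 2))  # 0.72 0.64
```

Under this assumption, importance shifts the final score by at most 0.10, so it mainly acts as a tie-breaker between memories of similar relevance.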
Using TTL for temporary context
Pass ttl_days=1 for within-session notes that shouldn’t outlive the conversation. MemLayer expires them automatically, so no cron job or manual deletion is required. A memory created with ttl_days=7 will stop appearing in search and context results after seven days.
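If you want to predict on your side when a memory will drop out of results, the expiry time can be estimated from the creation time. The sketch below assumes expiry falls exactly `ttl_days` × 24 hours after creation; the precise cutoff MemLayer uses is not stated here. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expires_at(created_at, ttl_days):
    """Return the estimated expiry time, or None for a permanent memory."""
    if ttl_days is None:
        return None  # omitted or null ttl_days means no expiry
    if not 1 <= ttl_days <= 3650:
        raise ValueError("ttl_days must be between 1 and 3650")
    return created_at + timedelta(days=ttl_days)


def is_expired(created_at, ttl_days, now):
    """True once `now` has reached the estimated expiry time."""
    deadline = expires_at(created_at, ttl_days)
    return deadline is not None and now >= deadline


created = datetime(2024, 6, 2, 12, 0, tzinfo=timezone.utc)
print(expires_at(created, 7))  # 2024-06-09 12:00:00+00:00
print(is_expired(created, 7, datetime(2024, 6, 10, tzinfo=timezone.utc)))  # True
```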