At the start of every conversation, your agent has no idea who it’s talking to — unless you tell it. MemLayer’s context() method fetches the most relevant memories for a user before any message is exchanged, so you can inject them directly into your system prompt and give your agent instant, accurate personalization from the first response.

Basic Context Loading

Call client.context() before building your system prompt. The returned ContextResult contains a memories list you can iterate over:
from memlayer import MemLayerClient

client = MemLayerClient(api_key="ml_live_xxx")

context = client.context(
    user_id="user_123",
    agent_id="support_bot",
    top_k=10,  # number of memories to load (1-50, default 10)
)

# Inject into the system prompt
system_prompt = "You are a helpful support assistant.\n\nWhat you know about this user:\n"
for m in context.memories:
    system_prompt += f"- {m.content}\n"

print(system_prompt)
# You are a helpful support assistant.
#
# What you know about this user:
# - User is based in Lagos, Nigeria
# - User prefers concise bullet points
# - User complained about slow delivery in November
Without current_message, memories are ranked by a combination of importance and recency — the most significant and recently accessed memories surface first.
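A new user may have no stored memories yet, in which case the loop above leaves a dangling "What you know about this user:" header in the prompt. The sketch below shows one way to handle that. build_system_prompt is a hypothetical helper, not part of MemLayer, and it assumes only that each memory's text is available as a string (as m.content is above).

```python
from typing import Iterable


def build_system_prompt(base_prompt: str, memory_texts: Iterable[str]) -> str:
    """Append a bullet list of memories to base_prompt.

    Hypothetical helper, not part of the MemLayer SDK. If there are no
    usable memories, the base prompt is returned unchanged so the model
    never sees an empty "What you know about this user:" section.
    """
    # Drop blank entries and strip surrounding whitespace before formatting.
    bullets = [f"- {text.strip()}" for text in memory_texts if text and text.strip()]
    if not bullets:
        return base_prompt
    return base_prompt + "\n\nWhat you know about this user:\n" + "\n".join(bullets)
```

With the context object from the example above, this could be called as build_system_prompt("You are a helpful support assistant.", (m.content for m in context.memories)).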

Smart Context with current_message

Pass the user’s first message as current_message to switch the ranking from importance/recency to semantic relevance. MemLayer uses it as a search query and returns memories most related to what the user is actually asking about.
Without current_message:
# Returns memories ranked by importance + recency
context = client.context(
    user_id="user_123",
    agent_id="support_bot",
    top_k=10,
)
# May surface general facts: location, plan type, language preference, etc.
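For comparison, a call that includes current_message might look like the sketch below. load_context is a hypothetical wrapper, not a MemLayer function. It passes only the context() parameters shown on this page (user_id, agent_id, top_k, current_message) and leaves current_message out when no first message is available, so one function covers both ranking modes.

```python
from typing import Any, Optional


def load_context(
    client: Any,
    user_id: str,
    agent_id: str,
    first_message: Optional[str] = None,
    top_k: int = 10,
) -> Any:
    """Call client.context(), adding current_message only when one exists.

    Hypothetical wrapper: `client` is expected to be a MemLayerClient, and
    only the parameters documented on this page are passed through.
    """
    kwargs = {"user_id": user_id, "agent_id": agent_id, "top_k": top_k}
    if first_message and first_message.strip():
        # With current_message, ranking switches to semantic relevance.
        kwargs["current_message"] = first_message
    # Without it, ranking falls back to importance + recency.
    return client.context(**kwargs)
```

For example, load_context(client, "user_123", "support_bot", first_message=user_message) would request memories ranked against whatever the user just typed.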
The difference is significant when users have many memories. Without current_message, you might load 10 general facts. With it, you load the 10 most relevant facts for the specific conversation that’s about to happen.
If your UI captures the user’s first message before calling the LLM, always pass it as current_message. It’s the single highest-impact improvement you can make to context quality.

Tuning top_k

The top_k parameter controls how many memories are returned (range: 1–50, default: 10).
Value    When to use
5–10     Most applications; keeps system prompts short and focused
15–25    Agents that rely heavily on history (e.g., long-term coaching bots)
30–50    When you have large context windows and want maximum recall
Start at 10. If your system prompts are growing too long or LLM latency increases, reduce top_k or raise min_score in recall() to filter lower-relevance results.
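One way to keep prompt size in check without guessing a fixed top_k is to trim the returned memories to a character budget. The sketch below is an illustration only: clamp_top_k and fit_to_budget are hypothetical helpers, the 1–50 bounds come from the range stated above, and the trimming assumes context.memories arrives best-first, as the ranking descriptions on this page suggest.

```python
from typing import List, Sequence

TOP_K_MIN, TOP_K_MAX = 1, 50  # range documented for top_k


def clamp_top_k(requested: int) -> int:
    """Force a requested top_k into the documented 1-50 range."""
    return max(TOP_K_MIN, min(TOP_K_MAX, requested))


def fit_to_budget(memory_texts: Sequence[str], max_chars: int) -> List[str]:
    """Keep memories in their ranked order until max_chars is used up.

    Assumes the input is ordered best-first, so stopping early drops the
    lowest-ranked memories. Each memory is counted as it would appear in
    the prompt: "- <text>" plus a newline.
    """
    kept: List[str] = []
    used = 0
    for text in memory_texts:
        cost = len(text) + 3  # "- " prefix and trailing "\n"
        if used + cost > max_chars:
            break
        kept.append(text)
        used += cost
    return kept
```

A character count is a rough stand-in for tokens; if your model's tokenizer is available, the same loop works with a token count in place of len(text).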

Full Conversation Pattern

Here is a complete pattern for a memory-enabled conversation turn: load context, build the system prompt, call the LLM, then save what was learned.
import os
from memlayer import MemLayerClient

client = MemLayerClient(api_key=os.environ["MEMLAYER_API_KEY"])

def chat(user_id: str, user_message: str) -> str:
    # 1. Load the most relevant memories for this user
    context = client.context(
        user_id=user_id,
        agent_id="support_bot",
        top_k=10,
        current_message=user_message,
    )

    # 2. Build the system prompt with injected context
    memory_block = "\n".join(f"- {m.content}" for m in context.memories)
    system_prompt = (
        "You are a helpful support assistant.\n\n"
        f"What you know about this user:\n{memory_block}\n\n"
        "Use this context to personalize your response."
    )

    # 3. Call your LLM (your_llm is a placeholder for your own model client)
    response = your_llm.chat(
        system=system_prompt,
        user=user_message,
    )

    # 4. Save what was learned from this exchange
    client.remember(
        content=f"User asked: {user_message}",
        user_id=user_id,
        agent_id="support_bot",
        memory_type="episodic",
        importance=0.5,
    )

    return response.content
1. Load context
Call context() with current_message to retrieve the most relevant memories before the LLM sees anything.

2. Build the system prompt
Format memories as a bullet list and prepend them to your existing system prompt. Keep it concise.

3. Call the LLM
Pass the enriched system prompt and the user’s message to your model as normal.

4. Save new memories
After the turn, call remember() to persist anything worth keeping for future sessions.

context() and recall() serve different purposes — use the right one at the right time.
  • context() is for session start. It loads a broad snapshot of the most important or relevant memories to prime your system prompt before any conversation begins.
  • recall() is for mid-conversation search. When the user asks something specific (“what did I tell you about my shipping address?”), use recall() with a targeted query to fetch the precise memories that answer it.
Mixing them up — running recall() at session start with a generic query, or using context() mid-conversation — reduces relevance and wastes tokens.
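The split can be sketched as a small router. The names here are assumptions for illustration: memories_for_turn is a hypothetical helper, and the recall() keyword arguments (query, user_id, agent_id) are guesses based on the description above rather than a confirmed signature, so check the recall() reference before relying on them.

```python
from typing import Any


def memories_for_turn(
    client: Any,
    user_id: str,
    agent_id: str,
    message: str,
    is_first_turn: bool,
) -> Any:
    """Pick context() at session start and recall() afterwards.

    Hypothetical helper. `client` is expected to be a MemLayerClient.
    The recall() keyword names below are ASSUMED, not confirmed.
    """
    if is_first_turn:
        # Session start: broad snapshot to prime the system prompt.
        return client.context(
            user_id=user_id,
            agent_id=agent_id,
            current_message=message,
        )
    # Mid-conversation: targeted search for this specific question.
    return client.recall(query=message, user_id=user_id, agent_id=agent_id)
```

Keeping the decision in one place makes it harder to call context() mid-conversation by accident.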