Load Session Context at the Start of Every Conversation

At the start of every conversation, your agent has no idea who it’s talking to — unless you tell it. MemLayer’s context() method fetches the most relevant memories for a user before any message is exchanged, so you can inject them directly into your system prompt and give your agent instant, accurate personalization from the first response.

Basic Context Loading

Call client.context() before building your system prompt. The returned ContextResult contains a memories list you can iterate over:

from memlayer import MemLayerClient

client = MemLayerClient(api_key="ml_live_xxx")

context = client.context(
    user_id="user_123",
    agent_id="support_bot",
    top_k=10,  # number of memories to load (1-50, default 10)
)

# Inject into the system prompt
system_prompt = "You are a helpful support assistant.\n\nWhat you know about this user:\n"
for m in context.memories:
    system_prompt += f"- {m.content}\n"

print(system_prompt)
# You are a helpful support assistant.
#
# What you know about this user:
# - User is based in Lagos, Nigeria
# - User prefers concise bullet points
# - User complained about slow delivery in November

Without current_message, memories are ranked by a combination of importance and recency — the most significant and recently accessed memories surface first.
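The prompt-building loop in the basic example can be wrapped in a small helper that also covers users with no stored memories yet. This is a minimal sketch: `Memory` is a hypothetical stand-in for the objects in `context.memories` (only the `content` attribute shown above is assumed), and `build_system_prompt` is an illustrative name, not part of MemLayer.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Memory:
    """Stand-in for the objects in context.memories; only .content is used."""
    content: str


def build_system_prompt(base_prompt: str, memories: Iterable[Memory]) -> str:
    """Append a bulleted memory section to base_prompt, or return it unchanged."""
    lines = [f"- {m.content.strip()}" for m in memories if m.content.strip()]
    if not lines:
        # No memories yet (for example, a brand-new user): skip the section.
        return base_prompt
    return base_prompt + "\n\nWhat you know about this user:\n" + "\n".join(lines)


prompt = build_system_prompt(
    "You are a helpful support assistant.",
    [Memory("User is based in Lagos, Nigeria"), Memory("User prefers concise bullet points")],
)
print(prompt)
```

Skipping the section for an empty list avoids sending the model a dangling "What you know about this user:" header with nothing under it.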

Smart Context with `current_message`

Pass the user’s first message as current_message to switch the ranking from importance/recency to semantic relevance. MemLayer uses it as a search query and returns memories most related to what the user is actually asking about.

Without current_message:

# Returns memories ranked by importance + recency
context = client.context(
    user_id="user_123",
    agent_id="support_bot",
    top_k=10,
)
# May surface general facts: location, plan type, language preference, etc.

With current_message:

# Returns memories semantically relevant to the opening message
context = client.context(
    user_id="user_123",
    agent_id="support_bot",
    top_k=10,
    current_message="I need help tracking my order",
)
# Surfaces shipping preferences, past delivery complaints, order history — not unrelated facts

The difference is significant when users have many memories. Without current_message, you might load 10 general facts. With it, you load the 10 most relevant facts for the specific conversation that’s about to happen.
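To make the difference concrete, here is a toy comparison of the two ranking styles. It does not reproduce MemLayer's actual scoring, which this page does not describe: the `ToyMemory` fields, the 50/50 blend, and the word-overlap measure (a crude stand-in for semantic similarity) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyMemory:
    content: str
    importance: float        # 0.0 to 1.0, hypothetical field
    days_since_access: int   # hypothetical field


def general_score(m: ToyMemory) -> float:
    """Toy blend of importance and recency (recent access scores higher)."""
    recency = 1.0 / (1.0 + m.days_since_access)
    return 0.5 * m.importance + 0.5 * recency


def relevance_score(m: ToyMemory, message: str) -> float:
    """Toy relevance: fraction of message words that appear in the memory."""
    words = set(message.lower().split())
    overlap = words & set(m.content.lower().split())
    return len(overlap) / len(words) if words else 0.0


memories = [
    ToyMemory("User is based in Lagos", importance=0.9, days_since_access=1),
    ToyMemory("User prefers concise bullet points", importance=0.8, days_since_access=2),
    ToyMemory("User complained that an order arrived late", importance=0.4, days_since_access=60),
]
message = "help tracking my order"

by_general = sorted(memories, key=general_score, reverse=True)
by_relevance = sorted(memories, key=lambda m: relevance_score(m, message), reverse=True)

print(by_general[0].content)    # User is based in Lagos
print(by_relevance[0].content)  # User complained that an order arrived late
```

The old delivery complaint ranks last on importance and recency, yet it is the only memory that bears on the order-tracking question, so a relevance-based ranking puts it first.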

If your UI captures the user’s first message before calling the LLM, always pass it as current_message. It’s the single highest-impact improvement you can make to context quality.
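One way to follow this advice without branching at every call site is a thin wrapper that adds `current_message` only when a first message exists. This is a sketch under assumptions: `load_context` and `RecordingClient` are hypothetical names, and the recording client is a test double standing in for a real `MemLayerClient`.

```python
from typing import Any, Optional


def load_context(client: Any, user_id: str, first_message: Optional[str] = None):
    """Call client.context(), adding current_message only when one is available."""
    kwargs = {"user_id": user_id, "agent_id": "support_bot", "top_k": 10}
    if first_message and first_message.strip():
        kwargs["current_message"] = first_message.strip()
    return client.context(**kwargs)


class RecordingClient:
    """Test double that records the keyword arguments it receives."""

    def __init__(self) -> None:
        self.calls = []

    def context(self, **kwargs):
        self.calls.append(kwargs)
        return kwargs


fake = RecordingClient()
load_context(fake, "user_123")                                   # no message captured yet
load_context(fake, "user_123", "I need help tracking my order")  # message captured
print(fake.calls)
```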

Tuning `top_k`

The top_k parameter controls how many memories are returned (range: 1–50, default: 10).

| `top_k` value | When to use |
| --- | --- |
| `5–10` | Most applications; keeps system prompts short and focused |
| `15–25` | Agents that rely heavily on history (e.g., long-term coaching bots) |
| `30–50` | When you have large context windows and want maximum recall |

Start at 10. If your system prompts grow too long or LLM latency increases, reduce top_k. For mid-conversation lookups with recall(), raise min_score to filter out lower-relevance results.
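The two helpers below sketch how a caller might guard these limits on the client side. `clamp_top_k` keeps a requested value inside the documented 1 to 50 range, and `fit_to_budget` trims the memory block to a character budget. Both names and the character budget (a rough proxy for prompt length) are assumptions for illustration, not MemLayer features.

```python
from typing import List


def clamp_top_k(requested: int, low: int = 1, high: int = 50) -> int:
    """Keep top_k inside the documented 1-50 range."""
    return max(low, min(high, requested))


def fit_to_budget(memory_lines: List[str], max_chars: int) -> List[str]:
    """Keep memories in ranked order until the character budget is used up."""
    kept: List[str] = []
    used = 0
    for line in memory_lines:
        cost = len(line) + 1  # +1 for the newline that joins the lines
        if used + cost > max_chars:
            break
        kept.append(line)
        used += cost
    return kept


lines = ["- User is based in Lagos, Nigeria", "- User prefers concise bullet points"]
print(clamp_top_k(80))           # 50
print(fit_to_budget(lines, 40))  # only the first line fits
```

Because the list arrives already ranked, cutting from the end drops the lowest-ranked memories first.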

Full Conversation Pattern

Here is a complete pattern for a memory-enabled conversation turn: load context, build the system prompt, call the LLM, then save what was learned.

import os
from memlayer import MemLayerClient

client = MemLayerClient(api_key=os.environ["MEMLAYER_API_KEY"])

def chat(user_id: str, user_message: str) -> str:
    # 1. Load the most relevant memories for this user
    context = client.context(
        user_id=user_id,
        agent_id="support_bot",
        top_k=10,
        current_message=user_message,
    )

    # 2. Build the system prompt with injected context
    memory_block = "\n".join(f"- {m.content}" for m in context.memories)
    system_prompt = (
        "You are a helpful support assistant.\n\n"
        f"What you know about this user:\n{memory_block}\n\n"
        "Use this context to personalize your response."
    )

    # 3. Call your LLM (`your_llm` is a placeholder for whatever LLM client you use)
    response = your_llm.chat(
        system=system_prompt,
        user=user_message,
    )

    # 4. Save what was learned from this exchange
    client.remember(
        content=f"User asked: {user_message}",
        user_id=user_id,
        agent_id="support_bot",
        memory_type="episodic",
        importance=0.5,
    )

    return response.content

1. Load context. Call context() with current_message to retrieve the most relevant memories before the LLM sees anything.

2. Build the system prompt. Format the memories as a bullet list and add them to your existing system prompt. Keep it concise.

3. Call the LLM. Pass the enriched system prompt and the user’s message to your model as normal.

4. Save new memories. After the turn, call remember() to persist anything worth keeping for future sessions.
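The same four steps can be exercised without network access by injecting the memory client and the LLM as parameters. The sketch below assumes only the `context()` and `remember()` calls shown in the pattern above; `FakeMemoryClient`, `run_turn`, and `echo_llm` are hypothetical test doubles, not MemLayer classes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FakeMemory:
    content: str


@dataclass
class FakeContext:
    memories: List[FakeMemory]


@dataclass
class FakeMemoryClient:
    """In-memory stand-in exposing only context() and remember()."""
    stored: List[str] = field(default_factory=list)

    def context(self, **kwargs) -> FakeContext:
        return FakeContext([FakeMemory(text) for text in self.stored])

    def remember(self, content: str, **kwargs) -> None:
        self.stored.append(content)


def run_turn(client, llm: Callable[[str, str], str], user_id: str, message: str) -> str:
    # 1. Load context
    context = client.context(user_id=user_id, agent_id="support_bot",
                             top_k=10, current_message=message)
    # 2. Build the system prompt
    memory_block = "\n".join(f"- {m.content}" for m in context.memories)
    system_prompt = ("You are a helpful support assistant.\n\n"
                     f"What you know about this user:\n{memory_block}")
    # 3. Call the LLM
    reply = llm(system_prompt, message)
    # 4. Save new memories
    client.remember(content=f"User asked: {message}", user_id=user_id,
                    agent_id="support_bot", memory_type="episodic", importance=0.5)
    return reply


def echo_llm(system_prompt: str, user_message: str) -> str:
    """Hypothetical LLM stub: reports how many memory bullets it was given."""
    seen = system_prompt.count("\n- ")
    return f"{seen} memories seen"


client = FakeMemoryClient()
print(run_turn(client, echo_llm, "user_123", "Where is my order?"))  # nothing stored yet
print(run_turn(client, echo_llm, "user_123", "Any update?"))         # sees the first turn
```

Passing the client and the LLM in as arguments keeps the turn logic independent of any one provider, which also makes it easy to swap either side in tests.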

Context vs. Search

context() and recall() serve different purposes — use the right one at the right time.

- context() is for session start. It loads a broad snapshot of the most important or relevant memories to prime your system prompt before any conversation begins.
- recall() is for mid-conversation search. When the user asks something specific (“what did I tell you about my shipping address?”), use recall() with a targeted query to fetch the precise memories that answer it.

Mixing them up — running recall() at session start with a generic query, or using context() mid-conversation — reduces relevance and wastes tokens.
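A small router can make this split explicit in application code. This is a sketch under stated assumptions: the page names recall() and its min_score option but does not show the full signature, so the `query` parameter name and the `min_score=0.5` value below are guesses, and `fetch_memories` and `StubClient` are hypothetical.

```python
from typing import Any, Optional


def fetch_memories(client: Any, user_id: str, message: Optional[str], session_start: bool):
    """Use context() to prime a new session and recall() for targeted lookups."""
    if session_start:
        kwargs = {"user_id": user_id, "agent_id": "support_bot", "top_k": 10}
        if message:
            kwargs["current_message"] = message
        return client.context(**kwargs)
    if not message:
        raise ValueError("a mid-conversation lookup needs a query")
    # `query` and `min_score=0.5` are assumed names/values for recall().
    return client.recall(user_id=user_id, agent_id="support_bot",
                         query=message, min_score=0.5)


class StubClient:
    """Records which method was called instead of contacting a service."""

    def context(self, **kwargs):
        return ("context", kwargs)

    def recall(self, **kwargs):
        return ("recall", kwargs)


stub = StubClient()
print(fetch_memories(stub, "user_123", None, session_start=True)[0])
print(fetch_memories(stub, "user_123", "what is my shipping address?", session_start=False)[0])
```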
