How MemLayer Ranks Memories with Hybrid Scoring

When you call recall() or context(), MemLayer doesn’t just match keywords — it ranks every candidate memory using a hybrid score that combines three independent signals into a single float between 0.0 and 1.0. The memory most likely to be useful to your agent right now rises to the top, even if it wasn’t the closest semantic match in isolation.

The Formula

MemLayer computes a final score for every memory before returning results:

final_score = 0.70 × cosine + 0.20 × recency + 0.10 × importance

Each component contributes a value between 0.0 and 1.0, weighted as above:

Cosine (70%) — how semantically close the memory is to your query
Recency (20%) — how recently the memory was stored
Importance (10%) — how important you rated the memory when storing it

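The weights are chosen so that semantic relevance dominates: cosine similarity alone accounts for up to 0.70 of the final score, while recency and importance together contribute at most 0.30. In practice the two smaller signals act as tie-breakers that surface fresh, high-signal memories over stale or low-priority ones, and they can lift a slightly weaker semantic match above a closer one that is old or low-priority.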
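To make the arithmetic concrete, here is a minimal sketch of the weighted sum in plain Python. The function name `hybrid_score` and the sample component values are illustrative assumptions, not part of the MemLayer client; only the three weights come from the formula above.

```python
# Weights copied from the formula above.
W_COSINE, W_RECENCY, W_IMPORTANCE = 0.70, 0.20, 0.10


def hybrid_score(cosine: float, recency: float, importance: float) -> float:
    """Combine three component scores, each in [0.0, 1.0], into one float."""
    for name, value in (("cosine", cosine), ("recency", recency), ("importance", importance)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return W_COSINE * cosine + W_RECENCY * recency + W_IMPORTANCE * importance


# A close semantic match that is old and of default importance (made-up values).
old_close = hybrid_score(cosine=0.90, recency=0.10, importance=0.50)
# A weaker semantic match that is fresh and rated highly important (made-up values).
fresh_weaker = hybrid_score(cosine=0.70, recency=1.00, importance=0.90)

print(f"old, close match:    {old_close:.3f}")
print(f"fresh, weaker match: {fresh_weaker:.3f}")
```

With these made-up inputs the fresh memory scores 0.78 against 0.70 for the closer but older one, which is the kind of reordering the hybrid score is meant to produce.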

Component Breakdown

Component	Weight	Range	What drives it
Cosine similarity	70%	0.0 – 1.0	Embedding distance between query and memory
Recency	20%	0.0 – 1.0	Exponential decay based on time since storage
Importance	10%	0.0 – 1.0	The `importance` float you set when storing

Cosine Similarity (70%)

Cosine similarity measures how semantically close your query embedding is to the stored memory embedding. MemLayer uses dense vector representations, so matching is concept-aware rather than keyword-dependent. A query like "where does the user live?" will match a memory containing "User is based in Lagos, Nigeria" even though none of the query words appear in the memory text. This makes recall robust across paraphrases, different languages, and varying levels of detail. Memories that fall below the min_score threshold are excluded from results entirely; see Tuning min_score.
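For readers who want to see the underlying calculation, the sketch below computes cosine similarity between two small vectors using only the standard library. The three-dimensional vectors are toy stand-ins for real embeddings, and clamping negative values to 0.0 is an assumption made here so the result fits the 0.0 to 1.0 range; how MemLayer maps raw cosine values into that range is not stated in this page.

```python
import math
from typing import Sequence


def cosine_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors, clamped to [0.0, 1.0] (clamping is an assumption)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    raw = dot / (norm_a * norm_b)
    return min(1.0, max(0.0, raw))


# Toy 3-dimensional "embeddings"; real embeddings have hundreds of dimensions.
query = [0.9, 0.1, 0.3]
location_memory = [0.8, 0.2, 0.4]   # points in nearly the same direction as the query
unrelated_memory = [0.0, 1.0, 0.0]  # points in a very different direction

print(f"location memory:  {cosine_score(query, location_memory):.3f}")
print(f"unrelated memory: {cosine_score(query, unrelated_memory):.3f}")
```

The score depends only on the angle between the vectors, not on their length, which is why two texts with no words in common can still score highly when their embeddings point the same way.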

Recency (20%)

Recency is computed using exponential decay based on how many days have passed since the memory was stored. A memory stored right now scores 1.0, and the score decreases smoothly as the memory ages. This means your agent naturally surfaces recent context without you having to filter by date manually.

# Memories from today will rank higher than memories from last month,
# all else being equal — no extra configuration needed.
results = client.recall(
    query="What did the user say about pricing?",
    user_id="user_abc",
    agent_id="sales_agent",
)
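The shape of an exponential decay curve can be sketched in a few lines. The 30-day half-life below is a hypothetical value chosen only for illustration, as is the function name `recency_score`; this page does not state the decay rate MemLayer actually uses.

```python
import math

HALF_LIFE_DAYS = 30.0  # hypothetical: MemLayer's real decay rate is not documented here


def recency_score(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Exponential decay: 1.0 for a brand-new memory, halving every half_life_days."""
    if age_days < 0:
        raise ValueError("age_days must not be negative")
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return math.exp(-math.log(2) * age_days / half_life_days)


for age in (0, 7, 30, 90):
    print(f"{age:>2} days old -> recency {recency_score(age):.3f}")
```

Under this assumption a memory scores 1.0 when stored and 0.5 after one half-life, and the score approaches but never reaches 0.0, so old memories are demoted rather than dropped.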

Importance (10%)

The importance field is a float between 0.0 and 1.0 that you set when storing a memory. It defaults to 0.5. Use higher values for memories that carry lasting strategic weight — confirmed user intent, key preferences, or critical constraints — so they continue to surface even as they age.

# Mark a high-signal memory as high-importance so it persists in recall
client.remember(
    "User explicitly said they will never switch from the annual plan.",
    user_id="user_abc",
    agent_id="sales_agent",
    memory_type="semantic",
    importance=0.95,  # Contributes 0.095 to the final score, helping it keep surfacing as it ages
)

# Routine log entry — leave importance at the default
client.remember(
    "User viewed the pricing page.",
    user_id="user_abc",
    agent_id="sales_agent",
    memory_type="episodic",
    # importance defaults to 0.5
)

Reserve importance >= 0.85 for memories that represent confirmed facts or strong user intent. Inflating importance on every memory dilutes its effect as a tie-breaker.
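The tie-breaking effect can be checked by hand. The sketch below ranks two candidates that are identical in cosine and recency and differ only in importance; the dictionaries and the `final_score` helper are illustrative stand-ins, not MemLayer objects, and the cosine and recency values are made up.

```python
W_COSINE, W_RECENCY, W_IMPORTANCE = 0.70, 0.20, 0.10

# Two hypothetical candidates with equal cosine and recency scores.
candidates = [
    {"content": "User viewed the pricing page.",
     "cosine": 0.80, "recency": 0.60, "importance": 0.50},
    {"content": "User explicitly said they will never switch from the annual plan.",
     "cosine": 0.80, "recency": 0.60, "importance": 0.95},
]


def final_score(memory: dict) -> float:
    """Weighted sum of the three components, per the formula in this page."""
    return (W_COSINE * memory["cosine"]
            + W_RECENCY * memory["recency"]
            + W_IMPORTANCE * memory["importance"])


# Highest final score first.
ranked = sorted(candidates, key=final_score, reverse=True)
for memory in ranked:
    print(f"{final_score(memory):.3f}  {memory['content']}")

gap = final_score(ranked[0]) - final_score(ranked[1])
print(f"gap from importance alone: {gap:.3f}")
```

An importance difference of 0.45 moves the final score by only 0.045, which is enough to break a tie but not enough to override a clearly better semantic match. That small budget is why inflating importance everywhere cancels itself out.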

Inspecting score_detail

Every memory returned by recall() includes a score_detail object with the individual component scores alongside the final hybrid score. Use it to understand exactly why a memory ranked where it did.

results = client.recall(
    query="Has the user had billing issues?",
    user_id="user_abc",
    agent_id="support_bot",
)

for m in results:
    print(m.content)
    print(f"  cosine:     {m.score_detail.cosine:.3f}")
    print(f"  recency:    {m.score_detail.recency:.3f}")
    print(f"  importance: {m.score_detail.importance:.3f}")
    print(f"  final:      {m.score_detail.final:.3f}")
    print()

Example output:

User opened a billing dispute on 2024-11-03.
  cosine:     0.912
  recency:    0.634
  importance: 0.800
  final:      0.845

score_detail is invaluable when tuning min_score or debugging unexpected recall results — you can see immediately which component is dragging a memory down.
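One way to see which component costs a memory the most is to compare each component's weighted contribution with the maximum it could have contributed. The sketch below does this with a hypothetical `ScoreDetail` dataclass that only mirrors the field names shown above; it is not the object returned by the client, and the component values are made up.

```python
from dataclasses import dataclass

WEIGHTS = {"cosine": 0.70, "recency": 0.20, "importance": 0.10}


@dataclass
class ScoreDetail:
    """Hypothetical stand-in mirroring the score_detail field names."""
    cosine: float
    recency: float
    importance: float


def explain(detail: ScoreDetail) -> tuple[float, dict[str, float]]:
    """Return the final score and the points each component left on the table."""
    final = sum(WEIGHTS[name] * getattr(detail, name) for name in WEIGHTS)
    lost = {name: WEIGHTS[name] * (1.0 - getattr(detail, name)) for name in WEIGHTS}
    return final, lost


detail = ScoreDetail(cosine=0.85, recency=0.20, importance=0.50)  # made-up values
final, lost = explain(detail)

print(f"final: {final:.3f}")
for name, points in sorted(lost.items(), key=lambda item: item[1], reverse=True):
    print(f"  {name:<10} lost {points:.3f}")

biggest_drag = max(lost, key=lost.get)
print(f"biggest drag: {biggest_drag}")
```

In this made-up case the memory is a strong semantic match, yet its age costs it more points than anything else, which suggests the memory is stale rather than irrelevant.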

Tuning min_score

The min_score parameter sets the minimum final_score a memory must achieve to appear in results. Adjust it based on how strict you need retrieval to be:

# Default — balanced precision and recall (recommended starting point)
results = client.recall(query=query, user_id=uid, agent_id=aid, min_score=0.70)

# Strict — only highly relevant, recent, or important memories
results = client.recall(query=query, user_id=uid, agent_id=aid, min_score=0.85)

# Open — return everything above noise floor (useful for debugging)
results = client.recall(query=query, user_id=uid, agent_id=aid, min_score=0.0)

Setting min_score=0.0 during development lets you inspect all candidate memories and their score_detail values, making it easy to calibrate the right threshold for your use case before going to production.
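The effect of different thresholds can be previewed offline before changing production settings. The sketch below assumes, as this section states, that min_score is compared against the final score, and it further assumes an inclusive comparison (>=); the `Candidate` type, the `apply_min_score` helper, and the scores are all hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record holding a memory's text and its final hybrid score."""
    content: str
    final: float


def apply_min_score(candidates: list[Candidate], min_score: float = 0.70) -> list[Candidate]:
    """Keep candidates at or above min_score, best first (inclusive cutoff is an assumption)."""
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0.0 and 1.0")
    kept = [c for c in candidates if c.final >= min_score]
    return sorted(kept, key=lambda c: c.final, reverse=True)


# Made-up final scores for three candidate memories.
candidates = [
    Candidate("User asked about invoice dates.", 0.74),
    Candidate("User opened a billing dispute.", 0.88),
    Candidate("User viewed the pricing page.", 0.41),
]

for threshold in (0.0, 0.70, 0.85):
    kept = apply_min_score(candidates, threshold)
    print(f"min_score={threshold:.2f} -> {len(kept)} result(s)")
```

Running the same candidate set through several thresholds shows how many memories each setting would drop, which is a quick way to choose a value between the open and strict extremes.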
