Auto memory (Hybrid)
A customer comes back after a week and asks “what did we settle on?” Your agent needs three things at once: the last few messages of THIS conversation (recent), the extracted facts about THIS customer (semantic), and the decision evidence from past loan reviews (causal). One memory type can’t do all three. The hybrid pattern stacks them.
What hybrid memory is
The defineMemory factory returns ONE memory definition. Real production agents need MULTIPLE definitions, with different types for different time horizons:
| Layer | Type × Strategy | Time horizon | Cost |
|---|---|---|---|
| Recent window | EPISODIC × WINDOW | Last N turns of THIS conversation | Free |
| Extracted facts | SEMANTIC × EXTRACT | All known facts about this user | One LLM call per write |
| Causal snapshots | CAUSAL × TOP_K | Decision evidence from past runs | Embedding call per query |
Stack them via multiple .memory(...) calls on the same agent. Each layer’s read subflow runs independently and contributes to the messages slot:
```ts
// 1. Short-term: last 10 turns (cheap, fast)
const recent = defineMemory({
  id: 'recent',
  type: MEMORY_TYPES.EPISODIC,
  strategy: { kind: MEMORY_STRATEGIES.WINDOW, size: 10 },
  store: recentStore,
});

// 2. Semantic facts: pattern-extracted, recency-loaded
const facts = defineMemory({
  id: 'facts',
  type: MEMORY_TYPES.SEMANTIC,
  strategy: {
    kind: MEMORY_STRATEGIES.EXTRACT,
    extractor: 'pattern',
    maxPerTurn: 5,
  },
  store: factsStore,
});

// 3. Causal: snapshots of past runs, retrieved by semantic match
const causal = defineMemory({
  id: 'causal',
  type: MEMORY_TYPES.CAUSAL,
  strategy: {
    kind: MEMORY_STRATEGIES.TOP_K,
    topK: 1,
    threshold: 0.5,
    embedder,
  },
  store: causalStore,
});
```

The agent sees a layered context: short-term turns (most recent), extracted facts (always-relevant), and the matching past snapshot (when cosine clears the threshold). Each layer can use its OWN store (recentStore could be Redis-hot, factsStore Postgres, causalStore S3+pgvector), each tuned to the layer’s read-frequency vs durability needs.
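To make the layering concrete, here is a minimal, self-contained sketch of three independent reads feeding one message list. It is a toy model under stated assumptions, not the library's implementation: `Layer`, `windowLayer`, `factsLayer`, `topKLayer`, and `buildContext` are hypothetical names, vectors are passed in directly instead of coming from an embedder, and the injection order (registration order) is assumed.

```typescript
// Toy model of layered memory reads. Every name here is hypothetical;
// this is not the library's implementation.
type Message = { role: 'user' | 'assistant' | 'system'; content: string };
type Snapshot = { summary: string; vector: number[] };

// Each layer reads independently and returns the messages it wants injected.
interface Layer {
  id: string;
  read(queryVector: number[]): Message[];
}

// Recent window: only the last `size` turns.
function windowLayer(id: string, turns: Message[], size: number): Layer {
  return { id, read: (): Message[] => (size > 0 ? turns.slice(-size) : []) };
}

// Extracted facts: always injected, folded into one system message.
function factsLayer(id: string, facts: string[]): Layer {
  return {
    id,
    read: (): Message[] =>
      facts.length > 0
        ? [{ role: 'system', content: `Known facts: ${facts.join('; ')}` }]
        : [],
  };
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return normA === 0 || normB === 0 ? 0 : dot / (normA * normB);
}

// Causal top-k: keep snapshots at or above the threshold, best first.
function topKLayer(id: string, snapshots: Snapshot[], topK: number, threshold: number): Layer {
  return {
    id,
    read: (queryVector: number[]): Message[] =>
      snapshots
        .map((snapshot) => ({ snapshot, score: cosine(queryVector, snapshot.vector) }))
        .filter((scored) => scored.score >= threshold)
        .sort((x, y) => y.score - x.score)
        .slice(0, topK)
        .map((scored): Message => ({
          role: 'system',
          content: `Past decision: ${scored.snapshot.summary}`,
        })),
  };
}

// Concatenate every layer's contribution in registration order (an assumption).
function buildContext(layers: Layer[], queryVector: number[]): Message[] {
  return layers.flatMap((layer) => layer.read(queryVector));
}
```

In this model a causal layer that finds nothing above the threshold simply contributes no messages, so the other two layers are unaffected.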
Why three layers, not one
Different parts of “what the agent knows” have different update rates and retention needs:
- Recent window turns over every minute (every conversation turn).
- Extracted facts turn over every day (new facts learned about the user).
- Causal snapshots turn over every quarter (every meaningful decision the agent made).
A single store optimized for one cadence is wrong for the other two. RedisStore is great for recent (sub-ms latency, TTL). Postgres is great for facts (queryable, joinable). pgvector / Pinecone is great for causal (vector search). The hybrid pattern lets each layer use the right backend.
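One way to picture per-layer backends is a shared store contract with implementations that differ in retention. The sketch below is illustrative only: `LayerStore`, `TtlStore`, and `DurableStore` are hypothetical in-memory stand-ins (for a Redis-like hot store and a durable database), and the library's actual adapter interface may look different. Time is passed in explicitly so the expiry behavior is easy to follow.

```typescript
// Hypothetical store contract; the library's real adapter interface may differ.
interface LayerStore<T> {
  put(key: string, value: T, nowMs: number): void;
  get(key: string, nowMs: number): T | undefined;
}

// Stand-in for a hot, Redis-like store: entries expire after ttlMs.
class TtlStore<T> implements LayerStore<T> {
  private readonly ttlMs: number;
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  put(key: string, value: T, nowMs: number): void {
    this.entries.set(key, { value, expiresAt: nowMs + this.ttlMs });
  }

  get(key: string, nowMs: number): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (nowMs >= entry.expiresAt) {
      // Expired: evict lazily on read.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

// Stand-in for a durable store: entries never expire.
class DurableStore<T> implements LayerStore<T> {
  private readonly entries = new Map<string, T>();

  put(key: string, value: T): void {
    this.entries.set(key, value);
  }

  get(key: string): T | undefined {
    return this.entries.get(key);
  }
}
```

Because both classes satisfy the same contract, a layer can be pointed at whichever one matches its cadence without changing the layer's own logic.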
Per-layer scope keys prevent collision
Multiple memories on the same agent each get their own scope key, `memoryInjection_${id}`, so they layer cleanly. The unique IDs (`recent`, `facts`, `causal` above) become observability event identifiers; you can filter `agentfootprint.context.injected` events by `e.payload.source === 'memory:facts'` to see exactly which layer fired when.
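The scope-key and filtering ideas can be sketched as follows. The key format and the `memory:<id>` source convention come from the text above; the `InjectedEvent` shape (including the `turn` field) and the helper names `scopeKey`, `eventsForLayer`, and `assertUniqueIds` are assumptions made for illustration, not the library's published types.

```typescript
// Hypothetical event shape, modeled on the names used in the text above.
type InjectedEvent = {
  type: 'agentfootprint.context.injected';
  payload: { source: string; turn: number };
};

// Scope key format as described in the text: memoryInjection_${id}.
function scopeKey(id: string): string {
  return `memoryInjection_${id}`;
}

// Keep only the events emitted by one memory layer.
function eventsForLayer(events: InjectedEvent[], id: string): InjectedEvent[] {
  return events.filter((e) => e.payload.source === `memory:${id}`);
}

// Guard against two layers sharing an id, which would make their keys collide.
function assertUniqueIds(ids: string[]): void {
  const seen = new Set<string>();
  for (const id of ids) {
    const key = scopeKey(id);
    if (seen.has(key)) throw new Error(`Duplicate memory id: ${id}`);
    seen.add(key);
  }
}
```

A check like `assertUniqueIds` is cheap to run once at startup, before any layer writes under its key.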
Anti-patterns
- Don’t share one store across layers. It defeats the per-layer backend tuning. Use one store per layer, each adapter chosen for that layer’s access pattern.
- Don’t mix incompatible strategies on one type. `EXTRACT` writes structured facts; `WINDOW` reads raw messages. Mixing them on one definition produces nonsense.
- Don’t forget identity. All three layers scope by the same `MemoryIdentity` tuple; a single `agent.run({ identity })` propagates to all of them.
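The identity point can be illustrated with a small sketch in which one identity value namespaces every layer's entries. The source does not spell out the fields of `MemoryIdentity`, so the `Identity` type below (with `tenantId` and `userId`) is a hypothetical stand-in, as are `identityKey` and `ScopedMemory`.

```typescript
// Hypothetical identity shape; the real MemoryIdentity tuple may have other fields.
type Identity = { tenantId: string; userId: string };

// Build a per-identity, per-layer key. Parts are URI-encoded so that a "/"
// inside an id cannot blur the boundary between parts.
function identityKey(identity: Identity, layerId: string): string {
  return [identity.tenantId, identity.userId, layerId]
    .map((part) => encodeURIComponent(part))
    .join('/');
}

// Minimal in-memory store in which every read and write is identity-scoped.
class ScopedMemory {
  private readonly data = new Map<string, string[]>();

  write(identity: Identity, layerId: string, entry: string): void {
    const key = identityKey(identity, layerId);
    const existing = this.data.get(key) ?? [];
    this.data.set(key, [...existing, entry]);
  }

  read(identity: Identity, layerId: string): string[] {
    return this.data.get(identityKey(identity, layerId)) ?? [];
  }
}
```

With this shape, passing the same identity to every layer is enough to keep one customer's facts out of another customer's context.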
Next steps
- Memory guide: single-layer fundamentals + the 4-types × 7-strategies matrix
- Causal deep-dive: snapshot shape, projection, replay (coming v2.4)
- Memory store adapters: Redis · AgentCore · planned backends