Auto memory (Hybrid)

A customer comes back after a week and asks “what did we settle on?” Your agent needs three things at once: the last few messages of THIS conversation (recent), the extracted facts about THIS customer (semantic), and the decision evidence from past loan reviews (causal). One memory type can’t do all three. The hybrid pattern stacks them.

The defineMemory factory returns ONE memory definition. Real production agents need MULTIPLE — different types for different time horizons:

| Layer | Type × Strategy | Time horizon | Cost |
| --- | --- | --- | --- |
| Recent window | EPISODIC × WINDOW | Last N turns of THIS conversation | Free |
| Extracted facts | SEMANTIC × EXTRACT | All known facts about this user | One LLM call per write |
| Causal snapshots | CAUSAL × TOP_K | Decision evidence from past runs | Embedding call per query |

Stack them via multiple .memory(...) calls on the same agent. Each layer’s read subflow runs independently and contributes to the messages slot:

examples/memory/07-hybrid-auto.ts (region: hybrid-stack)
// 1. Short-term: last 10 turns (cheap, fast)
const recent = defineMemory({
  id: 'recent',
  type: MEMORY_TYPES.EPISODIC,
  strategy: { kind: MEMORY_STRATEGIES.WINDOW, size: 10 },
  store: recentStore,
});

// 2. Semantic facts: pattern-extracted, recency-loaded
const facts = defineMemory({
  id: 'facts',
  type: MEMORY_TYPES.SEMANTIC,
  strategy: {
    kind: MEMORY_STRATEGIES.EXTRACT,
    extractor: 'pattern',
    maxPerTurn: 5,
  },
  store: factsStore,
});

// 3. Causal: snapshots of past runs, retrieved by semantic match
const causal = defineMemory({
  id: 'causal',
  type: MEMORY_TYPES.CAUSAL,
  strategy: {
    kind: MEMORY_STRATEGIES.TOP_K,
    topK: 1,
    threshold: 0.5,
    embedder,
  },
  store: causalStore,
});

The agent sees a layered context: short-term turns (most recent), extracted facts (always relevant), and the matching past snapshot (when cosine similarity clears the threshold). Each layer can use its OWN store: recentStore could be hot Redis, factsStore Postgres, causalStore S3 + pgvector, each tuned to that layer's read-frequency vs. durability needs.

Different parts of “what the agent knows” have different update rates and retention needs:

  • Recent window turns over every minute (every conversation turn).
  • Extracted facts turn over every day (new facts learned about the user).
  • Causal snapshots turn over every quarter (every meaningful decision the agent made).

A single store optimized for one cadence is wrong for the other two. RedisStore is great for recent (sub-ms latency, TTL). Postgres is great for facts (queryable, joinable). pgvector / Pinecone is great for causal (vector search). The hybrid pattern lets each layer use the right backend.

Multiple memories on the same agent each get their own scope key — memoryInjection_${id} — so they layer cleanly. The unique IDs (recent, facts, causal above) become observability event identifiers; you can filter agentfootprint.context.injected by e.payload.source === 'memory:facts' to see exactly which layer fired when.
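The scope-key and filtering mechanics can be shown self-contained. The event shape below is an assumption inferred from the `e.payload.source` filter quoted above; only the `memoryInjection_${id}` key format and the `memory:facts` source value come from this doc:

```typescript
// Scope key convention from the docs: memoryInjection_${id}.
const scopeKey = (memoryId: string): string => `memoryInjection_${memoryId}`;

// Hypothetical event shape, inferred from the filter expression in the text.
interface InjectionEvent {
  payload: { source: string; content: string };
}

const events: InjectionEvent[] = [
  { payload: { source: 'memory:recent', content: 'last 10 turns' } },
  { payload: { source: 'memory:facts', content: 'customer prefers email' } },
  { payload: { source: 'memory:causal', content: 'loan review snapshot' } },
];

// Which layer fired? Filter the injection events by the layer's id.
const factsOnly = events.filter((e) => e.payload.source === 'memory:facts');

console.log(scopeKey('facts')); // logs "memoryInjection_facts"
console.log(factsOnly.length); // logs 1
```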

  • Don’t share one store across layers — defeats the per-layer backend tuning. Use one store per layer, each adapter chosen for that layer’s access pattern.
  • Don’t mix incompatible strategies on one type — EXTRACT writes structured facts; WINDOW reads raw messages. Mixing them on one definition produces nonsense.
  • Don’t forget identity. All three layers scope by the same MemoryIdentity tuple — a single agent.run({ identity }) propagates to all of them.
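A sketch of the last bullet's identity rule: one tuple, passed once, scopes every layer's storage keys. The `MemoryIdentity` fields and the key format here are assumptions for illustration; the propagation behavior is what the doc describes:

```typescript
// Assumed shape for the identity tuple; the real MemoryIdentity may differ.
interface MemoryIdentity {
  userId: string;
  sessionId: string;
}

// Every layer derives its storage key from the SAME identity,
// so a single agent.run({ identity }) reaches all three layers.
const layerKey = (layerId: string, identity: MemoryIdentity): string =>
  `${layerId}:${identity.userId}:${identity.sessionId}`;

const identity: MemoryIdentity = { userId: 'cust-42', sessionId: 'sess-7' };
const keys = ['recent', 'facts', 'causal'].map((id) => layerKey(id, identity));

console.log(keys);
// logs ['recent:cust-42:sess-7', 'facts:cust-42:sess-7', 'causal:cust-42:sess-7']
```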