Key Concepts
This page is the model. 2 primitives, 4 compositions, N named patterns, the Injection cross-cut, the infrastructure layer underneath. Once these are in your head, every agent paper, every framework, every system you encounter can be located on the grid — including yours.
If you haven't yet, read Why agentfootprint? first. It's the why. This page is the what.
Key terms
| Term | Meaning |
|---|---|
| LLM | One prompt in, one response out. The atomic invocation. |
| Agent | An LLM that loops: calls tools, reads results, calls more tools, responds. Agent = ReAct — if it doesn't loop-with-tools, it isn't an Agent. |
| ReAct | "Reasoning + Acting" — the loop where an LLM thinks, acts (calls a tool), observes the result, and repeats. (Yao et al. 2022) |
| Composition | How primitives arrange: Sequence (one after another), Parallel (at the same time), Conditional (branch on a decider), Loop (iterate until quality bar). |
| Pattern | A named configuration — Reflexion, Tree-of-Thoughts, Hierarchy. Every named paper is a recipe of primitives + compositions, not a new class. |
| Context engineering | What you inject into an Agent's three slots (system / messages / tools). Skills, Steering, Guardrails, RAG, Memory, Tool APIs — all live here. |
| Injection | The unified primitive behind every flavor of context engineering — slot × trigger × cache. Skills, Steering, Instructions, Facts, Memory, RAG all reduce to this. |
| Slot | One of three fixed regions of every LLM call: system, messages, tools. Fixed by the LLM API surface. |
| Trigger | When an Injection fires: always (build-time), rule (predicate), on-tool-return (after a tool result), llm-activated (LLM calls read_skill). |
| Cache | How stable the injection's content is across iterations. Determines where the framework places provider cache markers (80–90% cheaper prefixes for stable content). |
| Narrative | A structured execution trace — what happened, in what order, with what data. Not logs — connected entries with provenance. |
| Recorder | A passive observer that collects data (tokens, cost, tool usage) during execution without affecting behavior. |
| Provider | The LLM backend — Claude, GPT, Bedrock, Ollama, or your own. Swap with one line. |
The taxonomy
agentfootprint organizes AI work into five layers. Each layer is built from the one above it.
PRIMITIVES (2 — atomic invocation units)
1. LLMCall — one call
2. Agent — a loop of calls + tools + decisions (= ReAct)
COMPOSITIONS (4 — how primitives arrange)
1. Sequence — one after another
2. Parallel — at the same time, with merge
3. Conditional — branch on a decider
4. Loop — iterate until quality bar
PATTERNS (N — named configurations; every named paper is a recipe)
ReAct = Agent (default)
Reflexion = Sequence(Agent, LLM-critique, Agent)
Tree-of-Thoughts = Parallel(Agent × N) + LLM-rank
Self-Consistency = Parallel(Agent × N) + majority-vote
Debate = Loop(Agent × 2 + judge)
Map-Reduce = Parallel(Agent × N) + LLM-merge
Swarm = Agent whose tools are other Agents
CONTEXT ENGINEERING (cross-cutting — Injection primitive into Agent slots)
Steering → system-prompt (always-on)
Instruction → system-prompt|messages (rule-gated)
Skill → system-prompt + tools (LLM-activated)
Fact → system-prompt|messages (always-on data)
Memory → messages (cross-run)
RAG → messages (retrieve + score-threshold)
FEATURES (infrastructure)
Providers · Mocks-first · Observability · Pause/Resume · Resilience · Reliability gate

Two theses to hold in your head:
- Agent = ReAct. Not "Agent with ReAct as default." The Agent primitive IS the ReAct loop. If it doesn't loop-with-tools, it isn't an Agent — it's an LLM call.
- Every named pattern = a composition of the 2 primitives + 4 compositions. Don't invent new Agent classes for every paper (that's LangChain's mistake). Express papers as recipes.
All runners share the same shape — create → configure → build → run → observe.
PRIMITIVES
1. LLMCall — one call
The atom. One prompt in, one response out. No tools, no loop.
    const llm = LLMCall.create({
      provider: provider ?? exampleProvider('core', { reply: "It's sunny in San Francisco." }),
      model: 'mock-weather',
      temperature: 0.2,
    })
      .system('You are a terse weather assistant. One sentence answers.')
      .build();

Use for: summarization, classification, translation, extraction.
2. Agent — a loop of calls + tools + decisions (= ReAct)
Agent = ReAct. This is the definition. The Agent primitive IS the ReAct loop — thinks, acts (tool call), observes, repeats.
    const agent = Agent.create({
      provider: provider ?? exampleProvider('feature', { respond: weatherRespond }),
      model: 'mock',
      maxIterations: 5,
      // reactMode: 'dynamic-grouped' wraps the LLM turn in an sf-llm-call subflow,
      // so Lens renders the agent's reasoning as an LLM group with its context
      // slots (system-prompt / messages / tools) nested inside (the SAME shape
      // the LLMCall primitive shows) instead of a bare "Final · RUNNER" card.
      reactMode: 'dynamic-grouped',
    })
      .system('You answer weather questions using the `weather` tool.')
      .tool({
        schema: {
          name: 'weather',
          description: 'Get current weather for a city.',
          inputSchema: {
            type: 'object',
            properties: { city: { type: 'string' } },
            required: ['city'],
          },
        },
        execute: async (args) => `${(args as { city: string }).city}: sunny, 72°F`,
      })
      .build();

Use for: research, code generation, customer support, data analysis, or anywhere else you need tools in a loop.
Dynamic vs Classic ReAct
agentfootprint's Agent loops back to SystemPrompt every iteration, not just CallLLM. Slots recompose every pass — injections that fired on the previous tool result get a chance to recompose the next prompt. Classic ReAct freezes the slots at the top of the turn.
| Iteration | Classic ReAct | Dynamic ReAct (agentfootprint) |
|---|---|---|
| 1 | 12 tools shown | 1 tool (read_skill) |
| 2 | 12 tools shown | 5 tools (skill activated) |
| 3 | 12 tools shown | 5 tools |
📖 See the Dynamic ReAct guide for the full taxonomy of what this unlocks.
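The Dynamic ReAct column of the table above can be reproduced with a small standalone simulation. This is only a sketch in plain TypeScript, not agentfootprint's implementation: `ToolSpec`, `ALL_TOOLS`, `composeTools`, and the billing tool names are hypothetical, and the tool counts are chosen to match the table.

```typescript
// Hypothetical model of a tools slot that is recomposed on every iteration.
type ToolSpec = { name: string; skill?: string };

// One always-visible tool plus four tools gated behind a "billing" skill.
// All names here are illustrative, not part of any real API.
const ALL_TOOLS: ToolSpec[] = [
  { name: 'read_skill' },
  { name: 'lookup_invoice', skill: 'billing' },
  { name: 'issue_refund', skill: 'billing' },
  { name: 'update_card', skill: 'billing' },
  { name: 'cancel_plan', skill: 'billing' },
];

// Rebuild the tools slot from whichever skills are active right now.
function composeTools(activeSkills: Set<string>): ToolSpec[] {
  return ALL_TOOLS.filter((t) => t.skill === undefined || activeSkills.has(t.skill));
}

const activeSkills = new Set<string>();
const toolsPerIteration: number[] = [];

for (let iteration = 1; iteration <= 3; iteration++) {
  // The slot is recomposed at the top of every pass, not frozen once per turn.
  const tools = composeTools(activeSkills);
  toolsPerIteration.push(tools.length);
  // Pretend the model calls read_skill('billing') during the first iteration.
  if (iteration === 1) activeSkills.add('billing');
}

console.log(toolsPerIteration); // counts per iteration: 1, 5, 5
```

A classic loop would call `composeTools` once before the loop and reuse the result, which is why its tool count never changes.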
COMPOSITIONS
The four ways primitives arrange. Every named pattern is built from these.
1. Sequence — one after another
Chains multiple runners into a sequential pipeline. Each step's output flows into the next step; use pipeVia to adapt the output when the shapes don't match:
    const pipeline = Sequence.create({ name: 'IntakePipeline' })
      .step('classify', classify)
      .pipeVia((label) => ({ message: `Intent: ${label.trim()}` }))
      .step('respond', respond)
      .build();

Use for: multi-step workflows, classify-then-respond, data (ETL) pipelines.
2. Parallel — at the same time
Run multiple runners simultaneously and merge their results. Merge can be a function (deterministic) or an LLM (synthesis):
    if (mode === 'strict') {
      // STRICT (default), 2 BRANCHES: any branch failure → whole Parallel throws.
      // The smaller committee is the most common shape: two specialists vote.
      // Both votes are REQUIRED (losing 1 of 2 is not fine), so each branch
      // is marked `{ required: true }`: with every branch required, the
      // fan-out runs fail-fast. The first failure rejects the whole run
      // immediately, naming the branch, without waiting on the sibling.
      const committee = Parallel.create({ name: 'Committee' })
        .branch('legal', brief('legal'), { required: true })
        .branch('ethics', brief('ethics'), { required: true })
        .mergeWithFn((results) =>
          Object.entries(results)
            .map(([id, r]) => ` ${id}: ${r}`)
            .join('\n'),
        )
        .build();
      console.log('--- strict mode (2 agents) ---');
      const strict = await committee.run({ message: cleanInput });
      console.log(strict);
      return { mode, strict };
    }

    // TOLERANT, 3 BRANCHES: the merge fn receives the full outcomes map so
    // it can decide how to handle partial failure. Larger committees are
    // also where tolerant mode pays off: losing 1 of 3 voices is fine,
    // losing 1 of 2 is not.
    const tolerantCommittee = Parallel.create({ name: 'TolerantCommittee' })
      .branch('legal', brief('legal'))
      .branch('ethics', brief('ethics'))
      .branch('cost', brief('cost'))
      .mergeOutcomesWithFn((outcomes) => {
        const lines = Object.entries(outcomes).map(([id, o]) =>
          o.ok ? ` ${id}: ${o.value}` : ` ${id}: [FAILED] ${o.error}`,
        );
        return lines.join('\n');
      })
      .build();

Use for: multi-perspective analysis, A/B comparison, ensemble approaches.
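The difference between strict and tolerant fan-out can be sketched with plain promises. This is an illustration of the semantics described above, not how Parallel is implemented: `NamedBranch`, `Outcome`, `runStrict`, and `runTolerant` are hypothetical names, and the `{ ok, value | error }` shape is assumed from the merge callback in the example.

```typescript
type NamedBranch = { id: string; run: () => Promise<string> };
type Outcome = { ok: true; value: string } | { ok: false; error: string };

// Strict: Promise.all rejects as soon as any branch rejects (fail-fast).
async function runStrict(branches: NamedBranch[]): Promise<Record<string, string>> {
  const pairs = await Promise.all(
    branches.map(async (b): Promise<[string, string]> => [b.id, await b.run()]),
  );
  const results: Record<string, string> = {};
  for (const [id, value] of pairs) results[id] = value;
  return results;
}

// Tolerant: every branch settles into an Outcome, so the merge step
// sees successes and failures side by side.
async function runTolerant(branches: NamedBranch[]): Promise<Record<string, Outcome>> {
  const pairs = await Promise.all(
    branches.map((b) =>
      b.run().then(
        (value): [string, Outcome] => [b.id, { ok: true, value }],
        (err: unknown): [string, Outcome] => [b.id, { ok: false, error: String(err) }],
      ),
    ),
  );
  const outcomes: Record<string, Outcome> = {};
  for (const [id, outcome] of pairs) outcomes[id] = outcome;
  return outcomes;
}

// Two illustrative branches: one succeeds, one fails.
const demoBranches: NamedBranch[] = [
  { id: 'legal', run: async () => 'approve' },
  { id: 'ethics', run: async () => { throw new Error('timeout'); } },
];

runTolerant(demoBranches).then((outcomes) => console.log(outcomes));
runStrict(demoBranches).catch((err) => console.log('strict rejected:', String(err)));
```

With the same two branches, the tolerant run resolves with one failed outcome while the strict run rejects, which is the trade-off the comments in the example describe.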
3. Conditional — branch on a decider
If/else routing between runners. The first matching predicate wins; if none match, the .otherwise() branch runs.
    const triage = Conditional.create({ name: 'Triage' })
      .when(
        'urgent',
        (input) => /\b(urgent|asap|outage|down|critical)\b/i.test(input.message),
        urgent,
      )
      .otherwise('normal', normal)
      .build();

Use for: triage, content classification, intent-based routing.
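First-match-wins routing is easy to get subtly wrong when predicates overlap, so a minimal standalone sketch may help. `RouteBranch`, `pickBranch`, and the second `billing` branch are hypothetical and exist only to show the ordering rule; they are not part of the Conditional API.

```typescript
type RouteBranch<I> = { id: string; when: (input: I) => boolean };

// Walk the branches in declaration order and return the first match.
// Later predicates are never evaluated once one matches.
function pickBranch<I>(branches: RouteBranch<I>[], fallbackId: string, input: I): string {
  for (const branch of branches) {
    if (branch.when(input)) return branch.id;
  }
  return fallbackId;
}

const routeBranches: RouteBranch<{ message: string }>[] = [
  { id: 'urgent', when: (i) => /\b(urgent|asap|outage|down|critical)\b/i.test(i.message) },
  { id: 'billing', when: (i) => /\b(refund|invoice)\b/i.test(i.message) },
];

// Matches both predicates, but 'urgent' is declared first.
const overlapping = pickBranch(routeBranches, 'normal', { message: 'URGENT: refund failed' });
// Matches only the second predicate.
const billingOnly = pickBranch(routeBranches, 'normal', { message: 'Where is my invoice?' });
// Matches nothing, so the fallback is used.
const unmatched = pickBranch(routeBranches, 'normal', { message: 'Hello there' });

console.log(overlapping, billingOnly, unmatched); // urgent billing normal
```

Because order decides overlaps, the most specific or highest-priority predicate belongs first.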
4. Loop — iterate until quality bar
Repeat a runner until a condition is met or a budget is exhausted. The Reflexion recipe (below) uses Loop under the hood.
See examples/core-flow/04-loop.ts for the full pattern.
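As a rough illustration of "repeat until a condition is met or a budget is exhausted", here is a standalone sketch. It is not the Loop API (see the example file for that): `loopUntil`, its parameters, and the toy sentence-count quality bar are all assumptions, and it is synchronous where a real runner would presumably be async.

```typescript
// Hypothetical generic loop: repeat `step` until `done` passes
// or `maxIterations` steps have been spent.
function loopUntil<T>(
  initial: T,
  step: (current: T, iteration: number) => T,
  done: (current: T) => boolean,
  maxIterations: number,
): { value: T; iterations: number; exhausted: boolean } {
  let value = initial;
  let iterations = 0;
  while (iterations < maxIterations && !done(value)) {
    value = step(value, iterations);
    iterations++;
  }
  // `exhausted` is true when the budget ran out before the bar was met.
  return { value, iterations, exhausted: !done(value) };
}

// Toy quality bar: the draft must contain at least three sentences.
const countSentences = (text: string): number =>
  text.split('.').filter((s) => s.trim().length > 0).length;

const loopResult = loopUntil(
  'Night falls.',
  (draft, i) => `${draft} Revision ${i + 1}.`,
  (draft) => countSentences(draft) >= 3,
  5,
);

console.log(loopResult);
```

Returning `exhausted` separately from `value` lets the caller tell "good enough" apart from "ran out of budget", which matter differently downstream.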
PATTERNS — named compositions
Every paper in the agent literature is a composition of 2 primitives + 4 compositions. Same alphabet, different word.
| Pattern | Built from | Paper |
|---|---|---|
| ReAct | Agent (default) | Yao 2022 |
| Reflexion | Sequence(Agent, LLM-critique, Agent), wrapped in a Loop | Shinn 2023 |
| Tree-of-Thoughts | Parallel(Agent × N) + LLM-rank | Yao 2023 |
| Self-Consistency | Parallel(Agent × N) + majority-vote | Wang 2022 |
| Debate | Loop(Agent × 2 + judge) | Du 2023 |
| Map-Reduce | Parallel(Agent × N) + LLM-merge | Dean 2004 |
| Swarm | Agent whose tools are other Agents | OpenAI 2024 |
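To make one row concrete: the "+ majority-vote" half of Self-Consistency is ordinary code, not another LLM call. The sketch below assumes the N parallel answers have already been collected as strings; `majorityVote` and its normalization (trim plus lowercase) are illustrative choices, not the library's merge function.

```typescript
// Majority vote over N sampled answers.
// Ties resolve to whichever answer was seen first.
function majorityVote(answers: string[]): string | undefined {
  const counts = new Map<string, number>();
  for (const raw of answers) {
    const answer = raw.trim().toLowerCase(); // normalize before counting
    counts.set(answer, (counts.get(answer) ?? 0) + 1);
  }
  let best: string | undefined = undefined;
  let bestCount = 0;
  // Map iteration follows insertion order, so `>` keeps the earliest on ties.
  counts.forEach((count, answer) => {
    if (count > bestCount) {
      best = answer;
      bestCount = count;
    }
  });
  return best;
}

// Pretend these came back from five parallel agent runs.
const sampledAnswers = ['Paris', 'paris ', 'Lyon', 'Paris', 'Marseille'];
const votedAnswer = majorityVote(sampledAnswers);

console.log(votedAnswer); // paris
```

Swapping this function for an LLM-based ranker or merger is what turns the same Parallel fan-out into Tree-of-Thoughts or Map-Reduce in the table.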
Reflexion — worked example
Iterative propose/critique/revise. The recipe is Sequence(Agent, critique-LLM, Agent) wrapped in a Loop that exits when the critic says DONE:
    const runner = reflection({
      provider: provider ?? exampleProvider('pattern', { respond: () => replies[i++ % replies.length]! }),
      model: 'mock',
      proposerPrompt: 'Write or revise a short poem about night.',
      criticPrompt: 'Critique the poem. When it is good enough include the marker DONE.',
      maxIterations: 5,
    });

Swarm — worked example
Routing between LLM-driven specialist runners. A route function (sync, pure) decides which agent handles each turn; handoffs continue until route returns undefined, with maxHandoffs as the cap:
    const router = swarm({
      agents: [
        { id: 'triage', runner: triage },
        { id: 'billing', runner: billing },
        { id: 'tech', runner: tech },
      ],
      // Route function: pure sync over the current message. First turn goes
      // to triage, then to billing or tech based on content, then halts.
      route: (input) => {
        const msg = input.message.toLowerCase();
        if (msg.includes('[billing]')) return undefined; // billing done → halt
        if (msg.includes('[tech]')) return undefined; // tech done → halt
        if (msg.includes('refund') || msg.includes('bill')) return 'billing';
        if (msg.includes('status') || msg.includes('error')) return 'tech';
        return 'triage'; // first turn
      },
      maxHandoffs: 5,
    });

Use for: customer support triage, multi-domain assistants, expert routing.
See examples/patterns/ for the rest — Tree-of-Thoughts, Self-Consistency, Debate, Map-Reduce all run end-to-end with npm run example examples/patterns/<file>.ts.
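The handoff loop behind the Swarm example can be approximated in a few lines. This is a sketch under assumptions, not the `swarm` implementation: `runSwarm`, the synchronous `SwarmRunner` type, and the canned replies are hypothetical stand-ins for real agents, while the route function mirrors the one shown above.

```typescript
type SwarmRunner = (message: string) => string; // real runners would be async agents
type RouteFn = (message: string) => string | undefined;

// Keep handing the latest message to whichever agent `route` names,
// until `route` returns undefined or the handoff budget is spent.
function runSwarm(
  agents: Record<string, SwarmRunner>,
  route: RouteFn,
  firstMessage: string,
  maxHandoffs: number,
): { message: string; path: string[] } {
  let message = firstMessage;
  const path: string[] = [];
  for (let handoff = 0; handoff < maxHandoffs; handoff++) {
    const next = route(message);
    if (next === undefined) break; // routing says we are done
    const runner = agents[next];
    if (!runner) throw new Error(`unknown agent: ${next}`);
    path.push(next);
    message = runner(message);
  }
  return { message, path };
}

// Canned stand-ins for the three specialists.
const swarmAgents: Record<string, SwarmRunner> = {
  triage: () => 'Customer wants a refund for order 42.',
  billing: () => '[billing] Refund issued.',
  tech: () => '[tech] Service restored.',
};

const routeByContent: RouteFn = (message) => {
  const msg = message.toLowerCase();
  if (msg.includes('[billing]')) return undefined;
  if (msg.includes('[tech]')) return undefined;
  if (msg.includes('refund') || msg.includes('bill')) return 'billing';
  if (msg.includes('status') || msg.includes('error')) return 'tech';
  return 'triage';
};

const swarmResult = runSwarm(swarmAgents, routeByContent, 'Hi, I need help', 5);
console.log(swarmResult.path, swarmResult.message);
```

Keeping the route function pure means the same message always yields the same next agent, so a handoff path can be replayed from the trace alone.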
CONTEXT ENGINEERING (cross-cutting)
These are not primitives. They're flavors of the Injection primitive — content that lands in one of the agent's three slots (system / messages / tools) under one of four triggers.
    Injection = slot × trigger × cache

| Flavor | Slot | Trigger | Purpose |
|---|---|---|---|
| Steering | system | always | Persona, tone, output format, safety rules |
| Instruction | system or messages | rule | Conditional behavior (activeWhen predicate) |
| Skill | system + tools | llm-activated | LLM calls read_skill('billing') to unlock body + tools |
| Fact | system or messages | always | Developer-supplied data (user profile, env, current time) |
| Memory | messages | rule | Persist conversation / facts across turns |
| RAG | messages | rule + score | Retrieve relevant chunks at query time |
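The grid above can be modeled as a plain data structure. The following is a minimal sketch, assuming one slot per injection for simplicity (a Skill in the table spans two); the `Cache` labels, `TurnState`, `fires`, and `composeSlots` are hypothetical names and do not come from the library.

```typescript
type Slot = 'system' | 'messages' | 'tools';
type Trigger = 'always' | 'rule' | 'on-tool-return' | 'llm-activated';
type Cache = 'stable' | 'volatile'; // hypothetical labels for content stability

interface TurnState {
  message: string;
  lastToolReturned?: string;
  activatedSkills: string[];
}

interface Injection {
  id: string;
  slot: Slot;
  trigger: Trigger;
  cache: Cache;
  activeWhen?: (state: TurnState) => boolean; // only consulted for 'rule'
}

// Decide whether an injection fires for the current turn state.
function fires(inj: Injection, state: TurnState): boolean {
  switch (inj.trigger) {
    case 'always':
      return true;
    case 'rule':
      return inj.activeWhen ? inj.activeWhen(state) : false;
    case 'on-tool-return':
      return state.lastToolReturned !== undefined;
    case 'llm-activated':
      return state.activatedSkills.some((skill) => skill === inj.id);
  }
}

// Group the ids of every firing injection by the slot it lands in.
function composeSlots(injections: Injection[], state: TurnState): Record<Slot, string[]> {
  const slots: Record<Slot, string[]> = { system: [], messages: [], tools: [] };
  for (const inj of injections) {
    if (fires(inj, state)) slots[inj.slot].push(inj.id);
  }
  return slots;
}

const demoInjections: Injection[] = [
  { id: 'tone', slot: 'system', trigger: 'always', cache: 'stable' },
  {
    id: 'refund-policy',
    slot: 'messages',
    trigger: 'rule',
    cache: 'volatile',
    activeWhen: (s) => s.message.toLowerCase().includes('refund'),
  },
  { id: 'billing', slot: 'tools', trigger: 'llm-activated', cache: 'stable' },
];

const composed = composeSlots(demoInjections, { message: 'I want a refund', activatedSkills: [] });
console.log(composed);
```

Every flavor in the table is one row of this shape with different values, which is the sense in which they all reduce to a single primitive.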
Slot is a default, not a coupling
The same Skill can live in tools (schema only, discovered via read_skill), messages (body injected on activation), or system (baked into the prompt as steering). Strategy is config; flavor is independent of (slot, trigger).
Why this grid matters — four backtrack questions
Because the framework owns the injection, every LLM call backtracks to four typed answers:
| Question | What the trace tells you |
|---|---|
| What was injected? | Every flavor of content the LLM saw on this iteration |
| Who triggered it? | Which trigger fired (always / rule / on-tool-return / llm-activated) |
| When did it fire? | Which iteration of the ReAct loop, after which event |
| How did it land? | Which slot, what position, what cache strategy |
Those four answers are why contextual errors stop being invisible. See Why agentfootprint? for the full bug-class argument.
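One way to picture the four answers is as fields on a trace record. The shape below is purely illustrative: `InjectionTrace`, its field names, the event labels, and `explainIteration` are assumptions made for this sketch, not the recorder's actual schema.

```typescript
// Hypothetical trace record: one entry per injection that landed in a call.
interface InjectionTrace {
  what: string; // id of the injected content
  who: 'always' | 'rule' | 'on-tool-return' | 'llm-activated'; // trigger that fired
  when: { iteration: number; afterEvent: string };
  how: { slot: 'system' | 'messages' | 'tools'; position: number; cache: string };
}

const demoTrace: InjectionTrace[] = [
  { what: 'tone', who: 'always', when: { iteration: 1, afterEvent: 'run-start' }, how: { slot: 'system', position: 0, cache: 'stable' } },
  { what: 'tone', who: 'always', when: { iteration: 2, afterEvent: 'tool-return' }, how: { slot: 'system', position: 0, cache: 'stable' } },
  { what: 'billing', who: 'llm-activated', when: { iteration: 2, afterEvent: 'tool-return' }, how: { slot: 'tools', position: 1, cache: 'volatile' } },
];

// Answer "what did the model see on iteration N, and why?" in one line per entry.
function explainIteration(entries: InjectionTrace[], iteration: number): string[] {
  return entries
    .filter((e) => e.when.iteration === iteration)
    .map((e) => `${e.what} <- ${e.who} after ${e.when.afterEvent} -> ${e.how.slot}[${e.how.position}] (${e.how.cache})`);
}

const explanation = explainIteration(demoTrace, 2);
console.log(explanation.join('\n'));
```

A wrong answer on iteration 2 can then be traced to a specific entry rather than to "the prompt" as a whole.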
RAG (worked example)
defineRAG is sugar over defineMemory({ type: SEMANTIC, strategy: TOP_K }) with RAG-friendly defaults — same plumbing, different intent label:
    const docs = defineRAG({
      id: 'product-docs',
      description: 'Product documentation chunks',
      store,
      embedder,
      topK: 2, // up to 2 most-relevant docs per query
      threshold: 0.5, // strict: drop weak matches
      asRole: 'user', // chunks land as user-role context (RAG default)
    });

    const agent = Agent.create({
      provider: provider ?? mock({ reply: 'Refunds are processed within 3 business days.' }),
      model: 'mock',
      maxIterations: 1,
    })
      .system('You answer support questions using the retrieved docs.')
      .rag(docs)
      .build();

See dedicated guides: Memory, Skills, Instructions, RAG, Tools.
Common patterns
Every primitive and composition supports:
| Feature | Method | Description |
|---|---|---|
| Recorders | .attach(recorder) | Passive observation (tokens, cost, narrative) |
| Event listeners | .on('agentfootprint.<domain>.<event>', fn) | Typed event subscription |
| Streaming | provider.stream(...) | Token-by-token output |
| Pause / Resume | askHuman / pauseHere | JSON-checkpointed, cross-server resumable |
| Memory | .memory(defineMemory({...})) | Cross-run state with multi-tenant identity |
| Reliability gate | .reliability({ preCheck, postDecide, ... }) | Rules-based retry / fallback / fail-fast (v2.11.5+) |
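The "passive observation" in the Recorders row means a recorder sees events but cannot change the result. A minimal standalone sketch of that contract follows; `RunEvent`, `Recorder`, `TokenRecorder`, `runWithRecorders`, and the token numbers are hypothetical and stand in for the real event taxonomy.

```typescript
type RunEvent =
  | { type: 'llm-call'; inputTokens: number; outputTokens: number }
  | { type: 'tool-call'; name: string };

// A recorder only receives events; it returns nothing the run depends on.
interface Recorder {
  record(event: RunEvent): void;
}

class TokenRecorder implements Recorder {
  total = 0;
  record(event: RunEvent): void {
    if (event.type === 'llm-call') this.total += event.inputTokens + event.outputTokens;
  }
}

// Stand-in for a run: emits a fixed event sequence and returns a fixed answer.
function runWithRecorders(recorders: Recorder[]): string {
  const events: RunEvent[] = [
    { type: 'llm-call', inputTokens: 120, outputTokens: 30 },
    { type: 'tool-call', name: 'weather' },
    { type: 'llm-call', inputTokens: 180, outputTokens: 20 },
  ];
  for (const event of events) {
    for (const recorder of recorders) recorder.record(event);
  }
  return 'San Francisco: sunny, 72°F';
}

const tokenRecorder = new TokenRecorder();
const observedAnswer = runWithRecorders([tokenRecorder]);
const unobservedAnswer = runWithRecorders([]);

console.log(tokenRecorder.total, observedAnswer === unobservedAnswer); // 350 true
```

Because the answer is identical with and without the recorder attached, observers can be added or removed without retesting behavior.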
Next steps
- Quick Start — first agent runs in 5 minutes
- Memory guide — defineMemory across 4 types × 7 strategies
- Skills guide — context engineering for instructions
- Reliability gate — rules-based retry / fallback / fail-fast
- Observability guide — recorders, event taxonomy
