Key Concepts

This page is the model. 2 primitives, 4 compositions, N named patterns, the Injection cross-cut, the infrastructure layer underneath. Once these are in your head, every agent paper, every framework, every system you encounter can be located on the grid — including yours.

If you haven’t yet, read Why agentfootprint? first. It’s the why. This page is the what.

Key terms

Term	Meaning
LLM	One prompt in, one response out. The atomic invocation.
Agent	An LLM that loops: calls tools, reads results, calls more tools, responds. Agent = ReAct — if it doesn’t loop-with-tools, it isn’t an Agent.
ReAct	“Reasoning + Acting”: the loop where an LLM thinks, acts (calls a tool), observes the result, and repeats. (Yao et al. 2022)
Composition	How primitives arrange: Sequence (one after another), Parallel (at the same time), Conditional (branch on a decider), Loop (iterate until quality bar).
Pattern	A named configuration — Reflexion, Tree-of-Thoughts, Hierarchy. Every named paper is a recipe of primitives + compositions, not a new class.
Context engineering	What you inject into an Agent’s three slots (`system` / `messages` / `tools`). Skills, Steering, Guardrails, RAG, Memory, Tool APIs — all live here.
Injection	The unified primitive behind every flavor of context engineering — `slot × trigger × cache`. Skills, Steering, Instructions, Facts, Memory, RAG all reduce to this.
Slot	One of three fixed regions of every LLM call: `system`, `messages`, `tools`. Fixed by the LLM API surface.
Trigger	When an Injection fires: `always` (build-time), `rule` (predicate), `on-tool-return` (after a tool result), `llm-activated` (LLM calls `read_skill`).
Cache	How stable the injection’s content is across iterations. Determines where the framework places provider cache markers (80–90% cheaper prefixes for stable content).
Narrative	A structured execution trace — what happened, in what order, with what data. Not logs — connected entries with provenance.
Recorder	A passive observer that collects data (tokens, cost, tool usage) during execution without affecting behavior.
Provider	The LLM backend — Claude, GPT, Bedrock, Ollama, or your own. Swap with one line.

The taxonomy

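agentfootprint organizes AI work into five layers. Primitives are the atoms, compositions arrange them, patterns are named recipes over both, context engineering cuts across every Agent, and features are the infrastructure underneath.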

PRIMITIVES (2 — atomic invocation units)
  1. LLMCall  — one call
  2. Agent    — a loop of calls + tools + decisions (= ReAct)

COMPOSITIONS (4 — how primitives arrange)
  1. Sequence     — one after another
  2. Parallel     — at the same time, with merge
  3. Conditional  — branch on a decider
  4. Loop         — iterate until quality bar

PATTERNS (N — named configurations; every named paper is a recipe)
  ReAct              = Agent (default)
  Reflexion          = Sequence(Agent, LLM-critique, Agent)
  Tree-of-Thoughts   = Parallel(Agent × N) + LLM-rank
  Self-Consistency   = Parallel(Agent × N) + majority-vote
  Debate             = Loop(Agent × 2 + judge)
  Map-Reduce         = Parallel(Agent × N) + LLM-merge
  Swarm              = Agent whose tools are other Agents

CONTEXT ENGINEERING (cross-cutting — Injection primitive into Agent slots)
  Steering     → system-prompt           (always-on)
  Instruction  → system-prompt|messages  (rule-gated)
  Skill        → system-prompt + tools   (LLM-activated)
  Fact         → system-prompt|messages  (always-on data)
  Memory       → messages                (cross-run)
  RAG          → messages                (retrieve + score-threshold)

FEATURES (infrastructure)
  Providers · Mocks-first · Observability · Pause/Resume · Resilience · Reliability gate

Two theses to hold in your head:

Agent = ReAct. Not “Agent with ReAct as default.” The Agent primitive IS the ReAct loop. If it doesn’t loop-with-tools, it isn’t an Agent — it’s an LLM call.
Every named pattern = a composition of the 2 primitives + 4 compositions. Don’t invent new Agent classes for every paper (that’s LangChain’s mistake). Express papers as recipes.

All runners share the same shape — create → configure → build → run → observe.

PRIMITIVES

1. LLMCall — one call

The atom. One prompt in, one response out. No tools, no loop.

const llm = LLMCall.create({
  provider: provider ?? exampleProvider('core', { reply: "It's sunny in San Francisco." }),
  model: 'mock-weather',
  temperature: 0.2,
})
  .system('You are a terse weather assistant. One sentence answers.')
  .build();

Use for: summarization, classification, translation, extraction.

2. Agent — a loop of calls + tools + decisions (= ReAct)

Agent = ReAct. This is the definition. The Agent primitive IS the ReAct loop — thinks, acts (tool call), observes, repeats.

const agent = Agent.create({
  provider: provider ?? exampleProvider('feature', { respond: weatherRespond }),
  model: 'mock',
  maxIterations: 5,
})
  .system('You answer weather questions using the `weather` tool.')
  .tool({
    schema: {
      name: 'weather',
      description: 'Get current weather for a city.',
      inputSchema: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
    execute: async (args) => `${(args as { city: string }).city}: sunny, 72°F`,
  })
  .build();

Use for: research, code generation, customer support, data analysis — anywhere you need tools-in-a-loop.
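To make the think / act / observe cycle concrete, here is a minimal sketch of a ReAct loop in plain TypeScript. It is not agentfootprint's implementation: `runReAct`, `Model`, `ModelTurn`, and the scripted model are hypothetical names introduced for illustration, and the "LLM" is a deterministic stand-in so the loop can be read without a provider.

```typescript
// Hypothetical sketch of a ReAct loop. None of these names are agentfootprint APIs.
type ToolCall = { tool: string; args: Record<string, string> };
type ModelTurn =
  | { kind: 'tool'; call: ToolCall } // the model wants to act
  | { kind: 'final'; text: string }; // the model is ready to respond

type Model = (transcript: string[]) => ModelTurn;
type Tool = (args: Record<string, string>) => string;

function runReAct(
  model: Model,
  tools: Record<string, Tool>,
  question: string,
  maxIterations = 5,
): { answer: string; iterations: number } {
  const transcript: string[] = [`user: ${question}`];
  for (let i = 1; i <= maxIterations; i++) {
    const turn = model(transcript); // think
    if (turn.kind === 'final') return { answer: turn.text, iterations: i };
    const tool = tools[turn.call.tool]; // act
    const observation = tool
      ? tool(turn.call.args)
      : `error: unknown tool ${turn.call.tool}`;
    transcript.push(`tool(${turn.call.tool}): ${observation}`); // observe
  }
  return { answer: 'Stopped: iteration budget exhausted.', iterations: maxIterations };
}

// Scripted stand-in for an LLM: call the weather tool once, then answer with its result.
const scriptedModel: Model = (transcript) => {
  const prefix = 'tool(weather): ';
  const observed = transcript.find((line) => line.startsWith(prefix));
  return observed
    ? { kind: 'final', text: observed.slice(prefix.length) }
    : { kind: 'tool', call: { tool: 'weather', args: { city: 'San Francisco' } } };
};

const weatherTools: Record<string, Tool> = {
  weather: (args) => `${args.city ?? 'unknown'}: sunny, 72°F`,
};

const reactResult = runReAct(scriptedModel, weatherTools, 'Weather in San Francisco?');
```

The `maxIterations` budget plays the same role as the option of the same name in the Agent example: it bounds the loop when the model never produces a final answer.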

Dynamic vs Classic ReAct

agentfootprint’s Agent loops back to SystemPrompt every iteration, not just CallLLM. Slots recompose every pass — injections that fired on the previous tool result get a chance to recompose the next prompt. Classic ReAct freezes the slots at the top of the turn.

Classic ReAct loops back to CallLLM (slots frozen, the same 12 tools every iteration). Dynamic ReAct loops back to SystemPrompt (slots recompose; the tool list starts at 1 and grows to 5 as skills activate).

Iteration	Classic ReAct	Dynamic ReAct (agentfootprint)
1	12 tools shown	1 tool (`read_skill`)
2	12 tools shown	5 tools (skill activated)
3	12 tools shown	5 tools
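The difference in the table can be modelled in a few lines. The sketch below is a hypothetical simulation, not agentfootprint's code: it assumes the 5 tools in the Dynamic column are `read_skill` plus four tools unlocked by one skill, and the skill and tool names are invented for illustration.

```typescript
// Hypothetical model of slot recomposition. Skill and tool names are invented.
type Skill = { id: string; tools: string[] };

const billingSkill: Skill = {
  id: 'billing',
  tools: ['lookup_invoice', 'issue_refund', 'update_card', 'list_charges'],
};

// Classic ReAct: the tools slot is composed once, before the loop starts.
function classicToolCounts(allTools: string[], iterations: number): number[] {
  const frozen = [...allTools];
  return Array.from({ length: iterations }, () => frozen.length);
}

// Dynamic ReAct: the tools slot is recomposed at the top of every iteration,
// so a skill activated during iteration k is visible from iteration k + 1.
function dynamicToolCounts(
  skills: Skill[],
  activateOnIteration: Partial<Record<number, string>>, // iteration -> skill id the LLM reads
  iterations: number,
): number[] {
  const active = new Set<string>();
  const counts: number[] = [];
  for (let i = 1; i <= iterations; i++) {
    const tools = ['read_skill'];
    for (const skill of skills) {
      if (active.has(skill.id)) tools.push(...skill.tools);
    }
    counts.push(tools.length);
    const requested = activateOnIteration[i];
    if (requested !== undefined) active.add(requested);
  }
  return counts;
}

const twelveTools = Array.from({ length: 12 }, (_, n) => `tool_${n + 1}`);
const classicCounts = classicToolCounts(twelveTools, 3);
const dynamicCounts = dynamicToolCounts([billingSkill], { 1: 'billing' }, 3);
```

Under these assumptions `classicCounts` and `dynamicCounts` reproduce the two columns of the table: a constant 12 versus 1, then 5, then 5.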

📖 See the Dynamic ReAct guide for the full taxonomy of what this unlocks.

COMPOSITIONS

The four ways primitives arrange. Every named pattern is built from these.

The 4 control flows: Sequence (linear chain A → B → C), Parallel (fan-out + fan-in), Conditional (diamond branch), Loop (cycle back).

1. Sequence — one after another

Chains multiple runners into a sequential pipeline. Each step’s output flows into the next; when the shapes don’t match, pipeVia adapts one step’s output into the next step’s input:

const pipeline = Sequence.create({ name: 'IntakePipeline' })
  .step('classify', classify)
  .pipeVia((label) => ({ message: `Intent: ${label.trim()}` }))
  .step('respond', respond)
  .build();

Use for: multi-step workflows, classify-then-respond, ETL pipelines.
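Conceptually, a Sequence with an adapter is function composition. The sketch below shows that idea with plain functions standing in for runners. `sequence`, `Step`, `classifyIntent`, and `respondTo` are hypothetical names, and real runners would be asynchronous and LLM-backed.

```typescript
// Hypothetical sketch: a two-step pipeline with an adapter between the steps.
type Step<I, O> = (input: I) => O;

function sequence<A, B, C, D>(
  first: Step<A, B>,
  adapt: Step<B, C>, // plays the role of pipeVia: reshape B into what `second` expects
  second: Step<C, D>,
): Step<A, D> {
  return (input) => second(adapt(first(input)));
}

// Stand-ins for LLM-backed runners. The classifier returns a padded label on purpose.
const classifyIntent: Step<{ message: string }, string> = ({ message }) =>
  /refund/i.test(message) ? ' billing \n' : ' general \n';
const respondTo: Step<{ message: string }, string> = ({ message }) =>
  `Handling -> ${message}`;

const intake = sequence(
  classifyIntent,
  (label) => ({ message: `Intent: ${label.trim()}` }), // string -> { message }
  respondTo,
);

const intakeReply = intake({ message: 'I want a refund' });
```

The adapter is the only place that knows about both shapes, which keeps each step reusable in other pipelines.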

2. Parallel — at the same time

Run multiple runners simultaneously and merge their results. Merge can be a function (deterministic) or an LLM (synthesis):

// STRICT (default): any branch failure → whole Parallel throws
const committee = Parallel.create({ name: 'Committee' })
  .branch('legal', brief('legal'))
  .branch('ethics', brief('ethics'))
  .branch('cost', brief('cost'))
  .mergeWithFn((results) =>
    Object.entries(results)
      .map(([id, r]) => `  ${id}: ${r}`)
      .join('\n'),
  )
  .build();

Use for: multi-perspective analysis, A/B comparison, ensemble approaches.

3. Conditional — branch on a decider

if/else routing between runners. The first matching predicate wins; if none match, the runner passed to .otherwise() handles the input.

const triage = Conditional.create({ name: 'Triage' })
  .when(
    'urgent',
    (input) => /\b(urgent|asap|outage|down|critical)\b/i.test(input.message),
    urgent,
  )
  .otherwise('normal', normal)
  .build();

Use for: triage, content classification, intent-based routing.

4. Loop — iterate until quality bar

Repeat a runner until a condition is met or a budget is exhausted. The Reflexion recipe (below) uses Loop under the hood.

See examples/core-flow/04-loop.ts for the full pattern.
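As a rough illustration of the two exit conditions (quality bar met, or budget exhausted), here is a minimal sketch. `loopUntil` and its result shape are hypothetical and are not the Loop builder's API; the example file above is the authoritative reference.

```typescript
// Hypothetical loop helper. agentfootprint's Loop builder may look quite different.
type LoopResult<T> = {
  value: T;
  iterations: number;
  reason: 'condition-met' | 'budget-exhausted';
};

function loopUntil<T>(
  step: (previous: T) => T,
  done: (value: T) => boolean,
  initial: T,
  maxIterations: number,
): LoopResult<T> {
  let value = initial;
  for (let i = 1; i <= maxIterations; i++) {
    value = step(value);
    if (done(value)) return { value, iterations: i, reason: 'condition-met' };
  }
  return { value, iterations: maxIterations, reason: 'budget-exhausted' };
}

// Stand-in for a runner that improves a draft by one quality point per pass.
const improve = (draft: { text: string; score: number }) => ({
  text: `${draft.text}+`,
  score: draft.score + 1,
});

const polished = loopUntil(improve, (d) => d.score >= 3, { text: 'draft', score: 0 }, 5);
const gaveUp = loopUntil(improve, (d) => d.score >= 10, { text: 'draft', score: 0 }, 2);
```

Returning the exit reason alongside the value lets the caller tell a result that met the quality bar from one that merely ran out of budget.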

PATTERNS — named compositions

Every paper in the agent literature is a composition of 2 primitives + 4 compositions. Same alphabet, different word.

Pattern	Built from	Paper
ReAct	Agent (default)	Yao 2022
Reflexion	Sequence(Agent, LLM-critique, Agent), wrapped in a Loop	Shinn 2023
Tree-of-Thoughts	Parallel(Agent × N) + LLM-rank	Yao 2023
Self-Consistency	Parallel(Agent × N) + majority-vote	Wang 2022
Debate	Loop(Agent × 2 + judge)	Du 2023
Map-Reduce	Parallel(Agent × N) + LLM-merge	Dean 2004
Swarm	Agent whose tools are other Agents	OpenAI 2024
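The "+ majority-vote" in the Self-Consistency row is a deterministic merge over the parallel branches' answers. A minimal sketch of such a merge follows. `majorityVote` is a hypothetical helper, and the normalization (trim plus lowercase) and tie-breaking rule are assumptions rather than the library's behavior.

```typescript
// Hypothetical merge function for Self-Consistency: pick the most common answer.
function majorityVote(answers: string[]): { winner: string; votes: number } | undefined {
  const tally = new Map<string, number>();
  for (const raw of answers) {
    const key = raw.trim().toLowerCase(); // normalize so "Paris" and "paris " agree
    tally.set(key, (tally.get(key) ?? 0) + 1);
  }
  let best: { winner: string; votes: number } | undefined;
  for (const [winner, votes] of Array.from(tally.entries())) {
    // Strict ">" keeps the earliest-seen answer on ties (Map preserves insertion order).
    if (!best || votes > best.votes) best = { winner, votes };
  }
  return best; // undefined when there are no answers
}

// Five sampled answers, as if produced by Parallel(Agent × 5).
const sampled = ['Paris', 'paris ', 'Lyon', 'Paris', 'Marseille'];
const consensus = majorityVote(sampled);
```

Swapping this function for an LLM call that ranks or merges the same list gives the Tree-of-Thoughts and Map-Reduce rows respectively; the Parallel fan-out stays the same.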

Reflexion — worked example

Iterative propose/critique/revise. The recipe is Sequence(Agent, critique-LLM, Agent) wrapped in a Loop that exits when the critic says DONE:

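// `replies` (canned model outputs) and `i` (reply cursor) are declared elsewhere in the full example.
const runner = reflection({
  provider: provider ?? exampleProvider('pattern', { respond: () => replies[i++ % replies.length]! }),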
  model: 'mock',
  proposerPrompt: 'Write or revise a short poem about night.',
  criticPrompt:
    'Critique the poem. When it is good enough include the marker DONE.',
  maxIterations: 5,
});

Swarm — worked example

Routes each turn to a specialist runner. A route function (sync, pure) inspects the current message and decides which agent handles the turn; handoffs continue until route returns undefined or maxHandoffs is reached:

const router = swarm({
  agents: [
    { id: 'triage', runner: triage },
    { id: 'billing', runner: billing },
    { id: 'tech', runner: tech },
  ],
  // Route function: pure and synchronous over the current message. Messages with no
  // billing/tech keyword go to triage; a [billing] or [tech] tag in a reply halts.
  route: (input) => {
    const msg = input.message.toLowerCase();
    if (msg.includes('[billing]')) return undefined; // billing done → halt
    if (msg.includes('[tech]')) return undefined; // tech done → halt
    if (msg.includes('refund') || msg.includes('bill')) return 'billing';
    if (msg.includes('status') || msg.includes('error')) return 'tech';
    return 'triage'; // first turn
  },
  maxHandoffs: 5,
});

Use for: customer support triage, multi-domain assistants, expert routing.
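The handoff rule described above (call `route`, run the chosen agent, repeat until `route` returns `undefined` or the budget runs out) can be sketched as a small driver. This is a hypothetical illustration, not the internals of `swarm()`: `runSwarm`, `Turn`, and the three stand-in specialists are invented, and real runners would be asynchronous.

```typescript
// Hypothetical handoff driver. agentfootprint's swarm() internals may differ.
type Turn = { message: string };
type Specialist = (input: Turn) => Turn;
type Route = (input: Turn) => string | undefined;

function runSwarm(
  agents: Record<string, Specialist>,
  route: Route,
  input: Turn,
  maxHandoffs: number,
): { output: Turn; path: string[] } {
  let current = input;
  const path: string[] = [];
  for (let handoff = 0; handoff < maxHandoffs; handoff++) {
    const next = route(current);
    if (next === undefined) break; // route says halt
    const agent = agents[next];
    if (!agent) throw new Error(`route returned unknown agent id: ${next}`);
    current = agent(current);
    path.push(next);
  }
  return { output: current, path };
}

// Stand-in specialists with canned replies.
const specialists: Record<string, Specialist> = {
  triage: () => ({ message: 'Customer is asking about a refund.' }),
  billing: () => ({ message: '[billing] Refund issued.' }),
  tech: () => ({ message: '[tech] Service restarted.' }),
};

// Same routing rules as the example above.
const routeByContent: Route = (input) => {
  const msg = input.message.toLowerCase();
  if (msg.includes('[billing]') || msg.includes('[tech]')) return undefined;
  if (msg.includes('refund') || msg.includes('bill')) return 'billing';
  if (msg.includes('status') || msg.includes('error')) return 'tech';
  return 'triage';
};

const swarmRun = runSwarm(
  specialists,
  routeByContent,
  { message: 'Hi, something is wrong with my account' },
  5,
);
```

In this run the opening message matches no keyword, so it goes to triage, triage's reply mentions a refund, and billing's tagged reply halts the swarm. `maxHandoffs` is the safety net when no agent ever produces a halting reply.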

See examples/patterns/ for the rest — Tree-of-Thoughts, Self-Consistency, Debate, Map-Reduce all run end-to-end with npm run example examples/patterns/<file>.ts.

CONTEXT ENGINEERING (cross-cutting)

These are not additional invocation primitives alongside LLMCall and Agent. They’re flavors of the Injection primitive: content that lands in one of the agent’s three slots (system / messages / tools) under one of four triggers.

Injection = slot × trigger × cache

Every LLM call has 3 fixed slots (system, messages, tools); every flavor lands in one of them (Skill in two) under one of 4 fixed triggers. The grid is the entire context-engineering surface.

Flavor	Slot	Trigger	Purpose
Steering	`system`	always	Persona, tone, output format, safety rules
Instruction	`system` or `messages`	rule	Conditional behavior (`activeWhen` predicate)
Skill	`system` + `tools`	llm-activated	LLM calls `read_skill('billing')` to unlock body + tools
Fact	`system` or `messages`	always	Developer-supplied data (user profile, env, current time)
Memory	`messages`	rule	Persist conversation / facts across runs
RAG	`messages`	rule + score	Retrieve relevant chunks at query time
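One way to picture the grid is as a data type plus a function that decides, per iteration, which injections fire and which slot they land in. The sketch below is a hypothetical model: the `Injection` shape, the two-value `Cache` label, `LoopState`, and `composeSlots` are assumptions made for illustration, not agentfootprint's definitions.

```typescript
// Hypothetical types modelling the grid above. Not agentfootprint's real definitions.
type Slot = 'system' | 'messages' | 'tools';
type Trigger = 'always' | 'rule' | 'on-tool-return' | 'llm-activated';
type Cache = 'stable' | 'volatile'; // assumed two-level stability label

type LoopState = {
  userMessage: string;
  lastToolReturned?: string;
  activatedSkills: string[];
};

type Injection = {
  id: string;
  slot: Slot;
  trigger: Trigger;
  cache: Cache;
  when?: (state: LoopState) => boolean; // predicate for 'rule' and 'on-tool-return'
};

function fires(injection: Injection, state: LoopState): boolean {
  switch (injection.trigger) {
    case 'always':
      return true;
    case 'rule':
      return injection.when ? injection.when(state) : false;
    case 'on-tool-return':
      return state.lastToolReturned !== undefined && (injection.when?.(state) ?? true);
    case 'llm-activated':
      return state.activatedSkills.includes(injection.id);
  }
}

// Compose the three slots for one iteration from whichever injections fire.
function composeSlots(injections: Injection[], state: LoopState): Record<Slot, string[]> {
  const slots: Record<Slot, string[]> = { system: [], messages: [], tools: [] };
  for (const injection of injections) {
    if (fires(injection, state)) slots[injection.slot].push(injection.id);
  }
  return slots;
}

const injections: Injection[] = [
  { id: 'persona', slot: 'system', trigger: 'always', cache: 'stable' },
  {
    id: 'refund-policy',
    slot: 'system',
    trigger: 'rule',
    cache: 'volatile',
    when: (s) => /refund/i.test(s.userMessage),
  },
  { id: 'billing', slot: 'tools', trigger: 'llm-activated', cache: 'volatile' },
];

const beforeSkill = composeSlots(injections, {
  userMessage: 'Where is my refund?',
  activatedSkills: [],
});
const afterSkill = composeSlots(injections, {
  userMessage: 'Where is my refund?',
  activatedSkills: ['billing'],
});
```

Calling `composeSlots` again on every iteration is what lets a skill activated in one pass show up in the tools slot on the next.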

Why this grid matters — four backtrack questions

Because the framework owns the injection, every LLM call can be traced back to four typed answers:

Question	What the trace tells you
What was injected?	Every flavor of content the LLM saw on this iteration
What triggered it?	Which trigger fired (`always` / `rule` / `on-tool-return` / `llm-activated`)
When did it fire?	Which iteration of the ReAct loop, after which event
How did it land?	Which slot, what position, what cache strategy

Those four answers are why contextual errors stop being invisible. See Why agentfootprint? for the full bug-class argument.

RAG (worked example)

defineRAG is sugar over defineMemory({ type: SEMANTIC, strategy: TOP_K }) with RAG-friendly defaults — same plumbing, different intent label:

const docs = defineRAG({
  id: 'product-docs',
  description: 'Product documentation chunks',
  store,
  embedder,
  topK: 2,           // up to 2 most-relevant docs per query
  threshold: 0.5,    // strict — drop weak matches
  asRole: 'user',    // chunks land as user-role context (RAG default)
});

const agent = Agent.create({
  provider: provider ?? mock({ reply: 'Refunds are processed within 3 business days.' }),
  model: 'mock',
  maxIterations: 1,
})
  .system('You answer support questions using the retrieved docs.')
  .rag(docs)
  .build();
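To show what `topK` and `threshold` mean in practice, here is a minimal retrieval sketch: score every chunk against the query, drop those below the threshold, and keep the best `topK`. It is not `defineRAG`'s internals. The `Chunk` type, the use of cosine similarity, and the toy two-dimensional embeddings are all assumptions made for illustration.

```typescript
// Hypothetical retrieval step. Cosine similarity is an assumed scoring function.
type Chunk = { id: string; text: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function retrieve(query: number[], chunks: Chunk[], topK: number, threshold: number) {
  return chunks
    .map((chunk) => ({ chunk, score: cosine(query, chunk.embedding) }))
    .filter((hit) => hit.score >= threshold) // drop weak matches first
    .sort((a, b) => b.score - a.score) // best match first
    .slice(0, topK); // then cap the count
}

// Toy 2-D embeddings: the first axis loosely means "about money back".
const corpus: Chunk[] = [
  { id: 'shipping', text: 'Orders ship in 2 days.', embedding: [0.6, 0.8] },
  { id: 'refunds', text: 'Refunds take 3 business days.', embedding: [0.9, 0.1] },
  { id: 'careers', text: 'We are hiring.', embedding: [0, 1] },
  { id: 'returns', text: 'Returns are free.', embedding: [0.8, 0.6] },
];

const hits = retrieve([1, 0], corpus, 2, 0.5);
const hitIds = hits.map((hit) => hit.chunk.id);
```

With `topK: 2` and `threshold: 0.5`, three chunks clear the threshold but only the two best survive. Raising the threshold can return fewer than `topK` chunks, which is why the comment in the example above reads "up to 2".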

See the dedicated guides: Memory, Skills, Instructions, RAG, Tools.

Common patterns

Every primitive and composition supports:

Feature	Method	Description
Recorders	`.attach(recorder)`	Passive observation (tokens, cost, narrative)
Event listeners	`.on('agentfootprint.<domain>.<event>', fn)`	Typed event subscription
Streaming	`provider.stream(...)`	Token-by-token output
Pause / Resume	`askHuman` / `pauseHere`	JSON-checkpointed, cross-server resumable
Memory	`.memory(defineMemory({...}))`	Cross-run state with multi-tenant identity
Reliability gate	`.reliability({ preCheck, postDecide, ... })`	Rules-based retry / fallback / fail-fast (v2.11.5+)
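The "passive observation" row is easiest to see with a toy recorder: it receives events, accumulates totals, and never changes what the runner returns. The sketch below is hypothetical. `Recorder`, `UsageEvent`, `TokenRecorder`, and `runWithRecorders` are invented for illustration and do not reflect agentfootprint's recorder interface or event names.

```typescript
// Hypothetical recorder interface. Real recorder and event shapes may differ.
type UsageEvent =
  | { kind: 'llm-call'; inputTokens: number; outputTokens: number }
  | { kind: 'tool-call'; tool: string };

interface Recorder {
  onEvent(event: UsageEvent): void;
}

class TokenRecorder implements Recorder {
  inputTokens = 0;
  outputTokens = 0;
  toolCalls: Record<string, number> = {};

  onEvent(event: UsageEvent): void {
    if (event.kind === 'llm-call') {
      this.inputTokens += event.inputTokens;
      this.outputTokens += event.outputTokens;
    } else {
      this.toolCalls[event.tool] = (this.toolCalls[event.tool] ?? 0) + 1;
    }
  }
}

// Stand-in runner that emits the same events to every attached recorder.
function runWithRecorders(recorders: Recorder[]): string {
  const emit = (event: UsageEvent) => recorders.forEach((r) => r.onEvent(event));
  emit({ kind: 'llm-call', inputTokens: 120, outputTokens: 15 });
  emit({ kind: 'tool-call', tool: 'weather' });
  emit({ kind: 'llm-call', inputTokens: 160, outputTokens: 22 });
  return 'San Francisco: sunny, 72°F';
}

const usage = new TokenRecorder();
const withRecorder = runWithRecorders([usage]);
const withoutRecorder = runWithRecorders([]);
```

The run returns the same string with or without a recorder attached, which is the property that makes recorders safe to add or remove in production.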

Next steps

Quick Start — first agent runs in 5 minutes
Memory guide — defineMemory across 4 types × 7 strategies
Skills guide — context engineering for instructions
Reliability gate — rules-based retry / fallback / fail-fast
Observability guide — recorders, event taxonomy