How agentfootprint thinks
A framework’s first job is to decide what it won’t be. agentfootprint is a substrate for context engineering — not an LLM SDK, not an agent platform, not a workflow orchestrator. This page is the perspective behind those rejections.
Why we exist — agents have a new class of bug
For fifty years, software bugs have been logic errors. A wrong condition, a missed edge case, an off-by-one. You step through the code with a debugger until you find the bad branch.
LLM-powered apps add a second class of bug: contextual errors. The code is correct. The model is correct. The answer is wrong because the LLM’s decision rests on context that was ambiguous, missing, or invalidated at the moment of inference.
Tracking which content the model actually saw, and why, is the entire debugging job. Without it, the failure mode is invisible:
| What got injected wrong | What the model did |
|---|---|
| Wrong instruction landed in the system slot | Followed the wrong rule |
| Predicate fired one iteration too early | Reasoned with stale assumptions |
| Skill body missing when the LLM called read_skill | Invented its own |
| Cache prefix invalidated mid-iteration | Saw a silently rewritten stale version |
| Tool returned but the on-tool-return injection didn’t fire | Couldn’t interpret the result |
That’s the gap agentfootprint exists to close. A framework that owns the control flow can debug logic errors. A framework that owns the injection can debug contextual errors — because every injection is a typed event with a where, when, why, and how-it-cached.
What we are
We are an abstraction over context engineering — the discipline of deciding what content lands in which slot of an LLM call, when, and why. Every LLM call has three slots: system, messages, tools. Every agent feature — Skills, Steering, Guardrails, RAG, Tool APIs, Memory — is content flowing into one of those slots, decided by some rule, at some moment in the iteration loop. agentfootprint models all of them as one primitive:
Injection = slot × trigger × cache

Three slots are fixed by the LLM API surface. Four triggers are fixed by when the framework can fire (always at build time, rule on a predicate, on-tool-return after a tool result, llm-activated when the LLM calls read_skill). Cache strategy is per-injection. Every flavor — present and future — fits this grid.
You describe injections declaratively. The framework evaluates every trigger every iteration, composes the slots, observes every decision as a typed event, and persists checkpoints you can replay six months later.
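The grid above can be sketched as a TypeScript type. This is an illustrative model of the slot × trigger × cache primitive, not agentfootprint's actual exports; all names here are assumptions.

```typescript
// Illustrative sketch of the Injection grid. Names are hypothetical,
// not the framework's real API surface.
type Slot = "system" | "messages" | "tools";

type Trigger =
  | { kind: "always" }                                     // fires at build time
  | { kind: "rule"; when: (s: AgentState) => boolean }     // predicate, re-evaluated per iteration
  | { kind: "on-tool-return"; tool: string }               // fires after a tool result
  | { kind: "llm-activated"; skill: string };              // fires when the LLM calls read_skill

type CacheStrategy = "none" | "prefix" | "per-iteration";

interface AgentState {
  iteration: number;
  lastToolResult?: unknown;
}

interface Injection {
  slot: Slot;
  trigger: Trigger;
  cache: CacheStrategy;
  content: (s: AgentState) => string;
}

// Example: a steering rule that lands in the system slot once a predicate fires.
const costGuardrail: Injection = {
  slot: "system",
  trigger: { kind: "rule", when: (s) => s.iteration > 5 },
  cache: "none",
  content: () => "You are over budget; prefer cheap tools.",
};
```

Every feature named above is some point in this grid: a Skill is a `llm-activated` trigger into messages, a Guardrail is a `rule` trigger into system, and so on.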
What that buys you
Because we own the injection, every LLM call backtracks to four typed answers:
- What was injected (which flavor, which content)
- Who triggered it (which rule fired)
- When it fired (which iteration, after which event)
- How it landed (which slot, with what cache strategy)
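One hypothetical shape for such a record, and the backtrack it enables: filter the trace to the injections that composed a given call. Field names are illustrative, not the framework's actual event schema.

```typescript
// Hypothetical shape of one injection event in the trace — field names
// are assumptions, not the real schema.
interface InjectionEvent {
  what: { flavor: string; content: string };        // what was injected
  who: { rule: string };                            // which rule fired
  when: { iteration: number; afterEvent: string };  // when it fired
  how: { slot: "system" | "messages" | "tools"; cache: string }; // how it landed
}

// Backtracking an LLM call: recover exactly which injections composed
// the context for a given iteration.
function backtrack(trace: InjectionEvent[], iteration: number): InjectionEvent[] {
  return trace.filter((e) => e.when.iteration === iteration);
}
```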
Same trace, three workflows:
| Mode | What you do | What the trace gives you |
|---|---|---|
| Live | Debug as you build | Exactly which injection produced which token; which predicate fired this iteration; which prefix actually got cached |
| Offline | Monitor what shipped | Replay any past run from its trace. Alert on drift. Attribute cost per injection. |
| Detailed | Improve via export | Every successful trajectory is labeled training data for SFT, DPO, or process-RL — no separate data-collection phase |
And a fourth, novel: the agent can read its own trace. Six months after the agent rejected loan #42, “why did you reject it?” answers from the recorded evidence (creditScore=580, threshold=600), not a rerun. Causal memory turns the trace into the agent’s working memory.
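A minimal sketch of that loan example, assuming a decision record with attached evidence (the shapes are invented for illustration): the answer is read from the trace, never recomputed.

```typescript
// Causal-memory sketch: "why?" is answered from recorded evidence,
// not from a rerun. Record shape is an assumption for illustration.
interface DecisionRecord {
  id: string;
  decision: string;
  evidence: Record<string, number>;
}

const trace: DecisionRecord[] = [
  { id: "loan-42", decision: "rejected", evidence: { creditScore: 580, threshold: 600 } },
];

function explain(id: string): string {
  const rec = trace.find((r) => r.id === id);
  if (!rec) return "no record";
  const facts = Object.entries(rec.evidence)
    .map(([k, v]) => `${k}=${v}`)
    .join(", ");
  // Six months later, this still answers from evidence captured at decision time.
  return `${rec.decision} because ${facts}`;
}
```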
What footprintjs gives us for free
The agent space has many credible primary abstractions — pipelines (LangChain), graphs (LangGraph), crews (CrewAI · AutoGen), typed bundles (Mastra · Genkit · Pydantic AI), compiled prompts (DSPy), durable workflows (Inngest AgentKit). We didn’t have to choose between them.
agentfootprint is built on footprintjs — the flowchart pattern for backend code. footprintjs gives us every one of those abstractions out of the box:
| Capability | What footprintjs hands us |
|---|---|
| Composition | Sequence · Parallel · Conditional · Loop |
| State machines | The ReAct loop is a flowchart |
| Multi-agent crews | Compose Agents through control flow — no special class needed |
| Durable workflows | pauseHere() plus JSON-portable resume() |
| Typed observation | 57+ events for free, because the framework owns the loop |
So we used the budget those abstractions would have cost us to invest deeply in something they all leave to the developer: the injection loop.
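The four compositions in the table reduce to ordinary typed higher-order functions. A minimal sketch of the pattern, not footprintjs's real API:

```typescript
// The four compositions as plain typed functions. Signatures are
// illustrative, not footprintjs's actual exports.
type Step<S> = (state: S) => Promise<S>;

// Sequence: thread state through steps in order.
const sequence = <S>(...steps: Step<S>[]): Step<S> =>
  async (s) => {
    for (const step of steps) s = await step(s);
    return s;
  };

// Conditional: branch on a predicate over the current state.
const conditional = <S>(pred: (s: S) => boolean, ifTrue: Step<S>, ifFalse: Step<S>): Step<S> =>
  async (s) => (pred(s) ? ifTrue(s) : ifFalse(s));

// Loop: repeat the body while the condition holds (the ReAct skeleton).
const loop = <S>(cond: (s: S) => boolean, body: Step<S>): Step<S> =>
  async (s) => {
    while (cond(s)) s = await body(s);
    return s;
  };

// Parallel: fan out over the same state, merge the results.
const parallel = <S>(merge: (results: S[]) => S, ...steps: Step<S>[]): Step<S> =>
  async (s) => merge(await Promise.all(steps.map((step) => step(s))));
```

Because a composed step is itself a `Step<S>`, crews and nested workflows fall out of ordinary function composition, which is the table's point.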
What we are not
- We are not an LLM SDK wrapper. Provider adapters are 100-line shims around vendor SDKs. They translate LLMRequest ↔ vendor format. They do not invent abstractions.
- We are not an agent platform. No deployment dashboard, no managed runtime, no saved-agent registry. You bring your own infrastructure.
- We are not a workflow orchestrator. Sequence / Parallel / Conditional / Loop are 4 compositions, not 40 step types. You will not find scheduled triggers, queue management, or retry-with-DLQ semantics here.
- We are not multi-modal. LLMMessage.content is a string. Images and video would multiply the surface; we said no.
- We are not a graph DSL. Compositions are typed functions, not a separate graph language. You will not write JSON or YAML to describe an agent.
These rejections are deliberate. Each thing we are not is a thing we do not have to maintain, document, observe, or version.
What we believe
The framework should own the loop
The biggest lesson from React, autograd, Prisma, Kubernetes — every load-bearing dev tool of the last decade — is that the framework owns the traversal. When the framework owns it, the framework can record everything that happens inside it. Without a single round-trip on your part. Without an instrumentation layer.
agentfootprint’s flowchart-pattern substrate (footprintjs) is what makes our typed-event stream + replayable traces automatic. You do not write agent.observe(...). You write .steering(rule) and the framework already knows when that rule fired, on which iteration, against which context.
Owning the loop means recomposing context every iteration — Dynamic ReAct
The corollary that makes “context engineering” worth the name. Static prompt assembly is what every framework does. Per-iteration recomposition — re-running every Injection trigger, recomputing the system prompt, recomputing the tool list, all based on the latest tool result + accumulated state — is what makes context engineering compositional instead of static.
This is structurally distinct from LangChain (assembles prompts once per turn), LangGraph (composes state per node, not per loop iteration), or CrewAI (tool-aware but not iteration-aware). It’s the closest a framework comes to “executive-function-like” behavior — context that adapts to what the agent just observed, not just what it was originally told.
The use cases that emerge are real, not theoretical: tool-by-tool LLM steering, adaptive tool exposure (per-skill gating), cost guardrails, iterative format refinement, failure adaptation, few-shot evolution, long-context skill body refresh. See the Dynamic ReAct guide for the full taxonomy.
If “context engineering” is the discipline, Dynamic ReAct is what makes the discipline expressive. Without it, the bar drops to static prompt assembly and we’d have nothing distinctive to say.
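The per-iteration recomposition described above can be sketched in a few lines. Everything here (the `State` and `Rule` shapes, `composeSystem`) is illustrative, not the real API; the point is that every predicate re-runs on every pass.

```typescript
// Per-iteration recomposition sketch. Nothing is assembled once and
// frozen: every predicate is re-evaluated against the latest state.
// All names are assumptions for illustration.
interface State { iteration: number; lastToolResult?: string }
interface Rule { content: string; when: (s: State) => boolean }

function composeSystem(rules: Rule[], s: State): string {
  // Runs fresh every loop pass — the system prompt tracks what the
  // agent just observed, not what it was originally told.
  return rules.filter((r) => r.when(s)).map((r) => r.content).join("\n");
}

const rules: Rule[] = [
  { content: "Base instructions.", when: () => true },
  { content: "The last tool call failed; try a different approach.",
    when: (s) => s.lastToolResult === "error" },
];

// Iteration 1: no tool result yet, only the base instruction lands.
const sys1 = composeSystem(rules, { iteration: 1 });
// Iteration 2: a tool just errored, so the steering line appears.
const sys2 = composeSystem(rules, { iteration: 2, lastToolResult: "error" });
```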
Every named pattern is a recipe, not a class
Reflexion, Tree-of-Thoughts, Self-Consistency, Debate, Map-Reduce, Swarm — every named pattern in the agent literature reduces to a composition of our 2 primitives + 4 compositions. We were one paper away from shipping a thirteenth Agent class when we stopped and asked: what if every new pattern is just a composition of what we already have? That question is why our core stays small as the field grows. New paper drops? New recipe in examples/patterns/. No new engine code.
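For instance, Self-Consistency is just parallel sampling plus a majority vote. A recipe-style sketch under assumed names; `sample` stands in for an LLM call:

```typescript
// Self-Consistency as a recipe: fan out N samples, majority-vote the
// answers. A composition sketch, not shipped code; names are assumptions.
async function selfConsistency(
  sample: () => Promise<string>,
  n: number,
): Promise<string> {
  // Parallel: N independent samples of the same question.
  const answers = await Promise.all(Array.from({ length: n }, () => sample()));
  // Reduce: count votes and return the most frequent answer.
  const counts = new Map<string, number>();
  for (const a of answers) counts.set(a, (counts.get(a) ?? 0) + 1);
  return [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
}
```

No new engine code is needed: the fan-out is a Parallel, the vote is an ordinary reduce over its results.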
Mocks are first-class, not an afterthought
Generative AI development is expensive when every iteration hits a paid API. We treat $0 development as a first-class workflow — the entire app (agent, context, memory, RAG, MCP) builds against in-memory mocks. Real infrastructure swaps in one boundary at a time. The flowchart, narrative, recorders, and tests do not change between dev and prod.
This is structurally different from “you can use mocks if you want.” We assume you will, and design for it.
Multi-tenant isolation is enforced at the storage boundary
Every memory call takes a MemoryIdentity tuple — { tenant?, principal?, conversationId }. Adapters MUST namespace internal keys by the full tuple. A bug passing the wrong tenant surfaces as “no data” — never as a cross-tenant leak. This is a non-negotiable property for any framework that wants to be deployed in regulated multi-tenant environments. We chose it as a default, not an option.
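A minimal in-memory adapter showing the property, with the MemoryIdentity shape taken from the prose above (the adapter class itself is hypothetical): keys are namespaced by the full tuple, so a wrong tenant reads as nothing.

```typescript
// Storage-boundary isolation sketch. MemoryIdentity matches the tuple
// described above; the adapter is an illustrative stand-in.
interface MemoryIdentity { tenant?: string; principal?: string; conversationId: string }

class InMemoryAdapter {
  private store = new Map<string, string>();

  // Every internal key embeds the full identity tuple.
  private key(id: MemoryIdentity, k: string): string {
    return [id.tenant ?? "-", id.principal ?? "-", id.conversationId, k].join("::");
  }

  set(id: MemoryIdentity, k: string, v: string): void {
    this.store.set(this.key(id, k), v);
  }

  get(id: MemoryIdentity, k: string): string | undefined {
    return this.store.get(this.key(id, k));
  }
}

const db = new InMemoryAdapter();
db.set({ tenant: "acme", conversationId: "c1" }, "note", "secret");

// A bug passing the wrong tenant surfaces as "no data" (undefined),
// never as another tenant's value.
const leaked = db.get({ tenant: "other", conversationId: "c1" }, "note");
```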
We will say “coming in vN.x” instead of pretending
When a feature isn’t shipped yet, we say so explicitly — even when prose would read smoother by glossing over it. CHANGELOG is the source of truth for what’s actually released. Code blocks for unshipped features are clearly marked as illustrative pseudo-code. This rule cost us some prose polish in the short term and bought us trust in the long term.
What we ask of you
If you adopt this framework, you adopt the discipline that comes with it:
- Pass per-tenant identity at every agent.run() in production. The default is for prototypes only.
- Use the typed event stream, not console.log inside tools. The framework already knows; subscribe.
- Build with mocks first. Run the agent end-to-end against mock() before you wire the real provider. Catch bugs in CI for $0; pay for tokens once you ship.
- Express new patterns as compositions of what exists. If you find yourself wanting a new primitive, ask first whether your idea is a Sequence-of-existing-stuff in disguise. It usually is.
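The mock-first point can be made concrete with a one-boundary swap. Everything below (the provider interface, `mockProvider`, `runAgent`) is a hedged sketch, not the framework's actual API:

```typescript
// Mock-first sketch: the agent depends only on a provider interface,
// so a $0 in-memory mock swaps for the real provider at one boundary.
// Names are assumptions for illustration.
interface LLMProvider {
  complete(prompt: string): Promise<string>;
}

// Deterministic, free, CI-friendly stand-in for a paid API.
const mockProvider = (canned: string): LLMProvider => ({
  complete: async () => canned,
});

async function runAgent(provider: LLMProvider, task: string): Promise<string> {
  // The agent body never changes between dev and prod — only `provider` does.
  return provider.complete(`Task: ${task}`);
}
```

In CI you run `runAgent(mockProvider("…"), task)` end-to-end; in production you pass the real adapter and nothing else changes.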
The unbuilt future
We have rejected building several things that other frameworks ship, some of them vigorously requested. Examples:
- A graph DSL — JavaScript is the DSL.
- A managed runtime — bring your own Lambda / container / cron.
- Per-step retry queues — that’s your queue’s job.
- A saved-agent registry — your filesystem + git already do this.
- Model-output validation built into the agent — use a tool for that and call it.
Each of these would expand the surface. None of them would make the abstraction sharper. We will continue saying no to additions that do not earn a place in the substrate.