agentfootprint

Context engineering, abstracted. PyTorch abstracted differentiation with autograd, Express abstracted HTTP, and Prisma abstracted SQL; agentfootprint makes the same kind of move for the agent loop. Because the framework owns the loop, it can do the work other frameworks leave to you: per-iteration recomposition, prompt caching, and replayable typed-event traces.

Every other agent framework ships agents you debug with logs. agentfootprint ships agents whose every decision, tool call, memory write, and context injection is captured as a typed event during one DFS traversal — no instrumentation, no post-processing.

The framework owns the loop, so the framework can record everything that happens inside it. Same way autograd’s forward-pass traversal makes gradient inspection automatic, agentfootprint’s flowchart traversal makes the typed-event stream + replayable traces automatic.

The mental model — three slots, four triggers, one Injection

Every agentfootprint Agent has three context slots: system, messages, tools. Everything you’d want to put into an agent’s context (Skills, Steering, Instructions, Facts, Memory, RAG, tool definitions) is an Injection, and an Agent can carry any number of them. Each Injection has a trigger that decides when it activates.

┌─────────────────────────────────────────────────────────────────┐
│                        Agent (one loop)                         │
│                                                                 │
│  Iteration N:   ┌──────────┐  ┌──────────┐  ┌──────────┐        │
│                 │  system  │  │ messages │  │  tools   │        │
│                 └────▲─────┘  └────▲─────┘  └────▲─────┘        │
│                      │             │             │              │
│            Injections (re-evaluated every iteration)            │
│          ┌──────────────────────────────┐                       │
│          │ Skill / Steering / Instruct  │ → system              │
│          │ Fact / Memory / RAG          │ → messages            │
│          │ ToolProvider / gatedTools    │ → tools               │
│          └──────────────────────────────┘                       │
│                                                                 │
│  Trigger taxonomy: 4 kinds                                      │
│                                                                 │
│    always-on       every iteration                              │
│    turn-start      first iteration of a turn                    │
│    on-tool-return  after each tool call's result                │
│    llm-activated   model picks via Skills tool                  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

| Trigger | Fires | Use for |
| --- | --- | --- |
| always-on | Every iteration | Steering (“respond JSON”), constant facts |
| turn-start | First iteration of a turn | User-specific Skills, session memory |
| on-tool-return | After every tool call | Context based on what the tool just returned |
| llm-activated | Model decides via listSkills/readSkill tools | Heavy / costly Skills the model loads on demand |

You don’t compose context; you declare what activates when, and the framework recomposes the three slots per iteration.
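
The model above can be sketched in plain TypeScript. This is a minimal illustration, not agentfootprint's API: the `Injection`, `LoopState`, `isActive`, and `recompose` names are hypothetical, and the sketch simplifies turn-start content to appear only on iteration 1.

```typescript
// Hypothetical types for illustration only; not agentfootprint's real API.
type Slot = 'system' | 'messages' | 'tools';
type Trigger = 'always-on' | 'turn-start' | 'on-tool-return' | 'llm-activated';

interface Injection {
  id: string;
  slot: Slot;
  trigger: Trigger;
  content: string;
}

interface LoopState {
  iteration: number;        // 1-based iteration within the current turn
  lastToolResult?: string;  // set when the previous iteration ran a tool
  activatedIds: string[];   // ids the model asked to load (llm-activated)
}

// Decide whether an Injection is active for this iteration.
function isActive(inj: Injection, state: LoopState): boolean {
  switch (inj.trigger) {
    case 'always-on':
      return true;
    case 'turn-start':
      return state.iteration === 1;
    case 'on-tool-return':
      return state.lastToolResult !== undefined;
    case 'llm-activated':
      return state.activatedIds.indexOf(inj.id) !== -1;
  }
}

// Rebuild all three slots from scratch on every iteration.
function recompose(injections: Injection[], state: LoopState): Record<Slot, string[]> {
  const slots: Record<Slot, string[]> = { system: [], messages: [], tools: [] };
  for (const inj of injections) {
    if (isActive(inj, state)) slots[inj.slot].push(inj.content);
  }
  return slots;
}

const injections: Injection[] = [
  { id: 'json', slot: 'system', trigger: 'always-on', content: 'Respond in JSON.' },
  { id: 'session', slot: 'messages', trigger: 'turn-start', content: 'Session memory' },
  { id: 'admin', slot: 'system', trigger: 'on-tool-return', content: 'Tool context' },
  { id: 'billing', slot: 'system', trigger: 'llm-activated', content: 'Billing Skill' },
];

const first = recompose(injections, { iteration: 1, activatedIds: [] });
const second = recompose(injections, {
  iteration: 2,
  lastToolResult: 'ok',
  activatedIds: ['billing'],
});
```

Here `first` carries only the always-on and turn-start content, while `second` picks up the tool-driven and model-activated Injections without any code composing the prompt by hand.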

In 30 seconds — runs offline, no API key

examples/memory/01-window-strategy.ts (region: define)
// Excerpt of the example file: `store` must already be in scope.
const memory = defineMemory({
  id: 'last-10',
  description: 'Keep the last 10 turns of conversation.',
  type: MEMORY_TYPES.EPISODIC,
  // Window strategy: retain only the 10 most recent turns.
  strategy: { kind: MEMORY_STRATEGIES.WINDOW, size: 10 },
  store,
});

Install the packages and run the example:

npm install agentfootprint footprintjs
npm run example examples/memory/01-window-strategy.ts

Outputs “You just asked about your previous message.” in under 100ms. Deterministic, free, no API key. Swap mock(...) for anthropic(...) / openai(...) / bedrock(...) / ollama(...) for production. Nothing else changes.
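
Conceptually, a window strategy with size 10 keeps only the most recent ten turns. The sketch below shows that idea in isolation; `Turn` and `windowTurns` are hypothetical names, and agentfootprint's own WINDOW strategy may differ in detail.

```typescript
// Hypothetical helper; agentfootprint's WINDOW strategy may differ in detail.
interface Turn {
  role: 'user' | 'assistant';
  text: string;
}

// Keep only the most recent `size` turns, preserving their order.
function windowTurns(history: Turn[], size: number): Turn[] {
  // Guard: slice(-0) would return the whole array rather than nothing.
  if (size <= 0) return [];
  return history.slice(-size);
}

// Build 12 alternating turns, then keep the last 10.
const history: Turn[] = [];
for (let i = 1; i <= 12; i++) {
  history.push({ role: i % 2 === 1 ? 'user' : 'assistant', text: `turn ${i}` });
}
const recent = windowTurns(history, 10);
```

With 12 turns in `history`, `recent` holds turns 3 through 12, so the two oldest turns never reach the model.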

The three features below are each useful on their own, and together they multiply: the cache amplifies Dynamic ReAct’s per-iteration recomposition, and the trace export captures the result of every recomposition and every cache hit. Strip any one and you’re back to a hand-rolled framework.

1. Dynamic ReAct — per-iteration recomposition

Every other ReAct framework runs prompt → llm → tool in a loop where the prompt is frozen at iteration 1. agentfootprint recomposes all three slots every iteration based on the latest tool result. That makes the following possible:

  • Tool-output-driven steering: an on-tool-return injection adds “the user has admin privileges” the moment a permission-check tool returns yes
  • Per-iteration tool gating: gatedTools(inner, scope.role === 'admin') exposes a different tool set per iteration based on what the agent has learned
  • Skills that load themselves: the model uses listSkills to discover, then readSkill to load only the Skills it needs (zero token cost for unused Skills)
  • Output-schema validation per iteration: re-prompt with the schema-violation message until structured output validates
  • Loop-target redirection: loopTo() an earlier stage when validation fails, with a fresh injection cycle each retry

This is what “the framework owns the loop” unlocks — see Dynamic ReAct guide.
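
To make per-iteration tool gating concrete, here is a small sketch under stated assumptions: the `Tool`, `Scope`, and `visibleTools` names are hypothetical and stand in for whatever gatedTools does internally. The point is only that the tool list is recomputed at the top of each iteration, so a tool result from one iteration changes what the next iteration sees.

```typescript
// Hypothetical sketch; names are illustrative, not agentfootprint's API.
interface Tool {
  name: string;
  adminOnly: boolean;
  run: () => string;
}

interface Scope {
  role: 'guest' | 'admin';
}

const checkPermission: Tool = { name: 'checkPermission', adminOnly: false, run: () => 'admin' };
const deleteUser: Tool = { name: 'deleteUser', adminOnly: true, run: () => 'deleted' };
const allTools: Tool[] = [checkPermission, deleteUser];

// Recomputed every iteration, so it reflects what the previous
// tool call taught the agent.
function visibleTools(scope: Scope): string[] {
  return allTools
    .filter((t) => !t.adminOnly || scope.role === 'admin')
    .map((t) => t.name);
}

const scope: Scope = { role: 'guest' };
const toolsPerIteration: string[][] = [];

// Iteration 1: only ungated tools are exposed.
toolsPerIteration.push(visibleTools(scope));

// The "model" calls checkPermission; its result updates the scope.
if (checkPermission.run() === 'admin') scope.role = 'admin';

// Iteration 2: the tools slot is recomposed and now includes deleteUser.
toolsPerIteration.push(visibleTools(scope));
```

A loop with a frozen prompt would have to expose deleteUser from the start or never; recomposition lets the exposure follow the evidence.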

2. The cache layer — provider-agnostic prompt caching

Per-iteration recomposition would burn tokens (everything re-builds every iteration) without caching. v2.6 ships a provider-agnostic cache layer — declare cache: 'persistent' on a Skill / Steering / Instruction and the framework picks the right cache mechanism per provider:

| Provider | Mechanism | Activated by cache: 'persistent' |
| --- | --- | --- |
| Anthropic | cache_control: { type: 'ephemeral' } blocks | |
| OpenAI | Automatic (cached prefix discount) | ✅ (no-op; OpenAI handles it internally) |
| Bedrock | Same as Anthropic via Claude on Bedrock | |
| All | noOp fallback for providers without prompt caching | |

Real Sonnet result with the v2.6 cache layer: a Dynamic ReAct workload dropped from 36,322 to 6,535 input tokens (−82%) end-to-end. The cache compounds with Dynamic ReAct: from iteration 2 onward the cached system prompt and Skills are replayed, and only the new tool result is uncached.
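
One way to picture the per-provider mapping is a function that turns declared context into system blocks and adds a cache marker only where the provider needs one. This is a sketch, not the framework's cache layer: `Declared` and `toSystemBlocks` are hypothetical, the Bedrock branch simply follows the table above, and only the `cache_control: { type: 'ephemeral' }` block shape comes from Anthropic's public Messages API. The last line re-derives the quoted savings from the two token counts.

```typescript
// Hypothetical strategy function; only the `cache_control` block shape
// is taken from Anthropic's public Messages API.
type Provider = 'anthropic' | 'openai' | 'bedrock' | 'other';

interface SystemBlock {
  type: 'text';
  text: string;
  cache_control?: { type: 'ephemeral' };
}

interface Declared {
  text: string;
  cache?: 'persistent';
}

function toSystemBlocks(provider: Provider, declared: Declared[]): SystemBlock[] {
  // Assumption from the table above: Bedrock follows the Anthropic mechanism.
  const needsMarker = provider === 'anthropic' || provider === 'bedrock';
  return declared.map((d): SystemBlock => {
    const block: SystemBlock = { type: 'text', text: d.text };
    // OpenAI caches prefixes automatically, so no marker is emitted for it.
    if (needsMarker && d.cache === 'persistent') {
      block.cache_control = { type: 'ephemeral' };
    }
    return block;
  });
}

const declared: Declared[] = [
  { text: 'You are a support agent.', cache: 'persistent' },
  { text: 'Per-iteration steering note.' },
];

const anthropicBlocks = toSystemBlocks('anthropic', declared);
const openaiBlocks = toSystemBlocks('openai', declared);

// Savings arithmetic for the quoted figures: (36322 - 6535) / 36322 ≈ 0.82.
const savedFraction = (36322 - 6535) / 36322;
```

The declaration stays the same across providers; only the emitted request shape changes.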

3. Replayable causal traces — debug why, six months later

Every run exports as one JSON checkpoint. Persist it (Redis, Postgres, S3); resume on a different process, day, or server. defineMemory({ type: CAUSAL }) stores the agent’s decision evidence — every decide() value the flowchart captured.

New questions cosine-match past queries; matching snapshots inject the prior decision evidence; the LLM answers from EXACT past facts, not reconstruction. See Memory guide § Causal.
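
The cosine-match step can be sketched as follows. Everything here is an assumption for illustration: the `Snapshot` shape, the `bestMatch` helper, the 0.8 threshold, and the toy 3-dimensional embeddings are hypothetical, and the real causal memory may rank and filter differently.

```typescript
// Hypothetical sketch of cosine matching over stored decision snapshots.
interface Snapshot {
  query: string;
  embedding: number[];
  evidence: string;
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Cosine similarity: dot(a, b) / (|a| * |b|), or 0 for a zero vector.
function cosine(a: number[], b: number[]): number {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}

// Return the most similar snapshot at or above the threshold, if any.
function bestMatch(query: number[], snapshots: Snapshot[], threshold: number): Snapshot | undefined {
  let best: Snapshot | undefined;
  let bestScore = threshold;
  for (const s of snapshots) {
    const score = cosine(query, s.embedding);
    if (score >= bestScore) {
      best = s;
      bestScore = score;
    }
  }
  return best;
}

const snapshots: Snapshot[] = [
  { query: 'Why was the refund approved?', embedding: [1, 0, 0], evidence: 'Refund approved: order under 30 days old' },
  { query: 'Why was shipping delayed?', embedding: [0, 1, 0], evidence: 'Carrier reported a weather hold' },
];

// A new question whose (toy) embedding is close to the refund query.
const hit = bestMatch([0.9, 0.1, 0], snapshots, 0.8);
// An unrelated question matches nothing, so no evidence is injected.
const miss = bestMatch([0, 0, 1], snapshots, 0.8);
```

When a snapshot matches, its stored evidence is what gets injected, which is why the answer comes from recorded facts rather than from the model's reconstruction.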

The same JSON shape feeds SFT / DPO / process-RL training pipelines (exportForTraining({ format }) on the roadmap).

🎓 New to agents

Start with the 5-min Quick Start. First agent runs offline against mock(). No API key needed.

🛠️ LangChain / CrewAI / LangGraph user

Start with vs other frameworks — honest comparison, migration sketch.

🏗️ Architecting an enterprise rollout

Start with Deployment — multi-tenant identity, audit trails, peer-deps, observability stack.

🏛️ Doing production due diligence

Start with Architecture — dependency graph, subsystem boundaries, scope semantics, the one DFS traversal that powers everything.

🔬 Researcher / extending the framework

Start with Skills, explained — context-engineering essay; the strongest writing in the docs.

Three example shapes, all runnable end-to-end with npm run example examples/<file>.ts:

  • Customer support agent with skills, memory, and an audit trail — examples/context-engineering/06-mixed-flavors.ts
  • Research pipeline with multi-agent fan-out + LLM merge — examples/patterns/05-tot.ts (Tree-of-Thoughts) or 01-self-consistency.ts
  • Streaming chat agent with token-by-token output to a browser — see streaming guide for the SSE pattern

| Boundary | Dev (mock) | Prod (one-line swap) |
| --- | --- | --- |
| LLM provider | mock({ replies }) | anthropic() · openai() · bedrock() · ollama() |
| Embedder | mockEmbedder() | OpenAI / Cohere / Bedrock embedder (planned) |
| Memory store | InMemoryStore | RedisStore · AgentCoreStore (planned: pgvector / Pinecone) |
| Cache strategy | noOpCache() | anthropicCache() · openaiCache() · bedrockCache() (v2.6) |
| Observability | sync inline | agentcoreObservability() + detach: { driver } (v2.8.1) |
| MCP server | mockMcpClient({ tools }) | mcpClient({ transport }) |
| Tool execute | inline closure | real implementation |

Ship the patterns first; pay for tokens last.
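
The reason a one-line swap is possible is that agent code depends on a boundary interface rather than on a vendor. The sketch below illustrates that idea only: `LlmProvider`, `scriptedProvider`, and `runTurn` are hypothetical names, calls are synchronous for brevity (real providers are asynchronous), and none of this is the signature of agentfootprint's mock().

```typescript
// Hypothetical boundary interface; agentfootprint's real provider type may
// differ, and real providers are asynchronous.
interface LlmProvider {
  complete(prompt: string): string;
}

// Dev: scripted replies, no network, deterministic.
function scriptedProvider(replies: string[]): LlmProvider {
  let next = 0;
  return {
    complete(): string {
      // Repeat the last reply once the script runs out.
      const reply = replies[Math.min(next, replies.length - 1)];
      next += 1;
      return reply ?? '';
    },
  };
}

// Agent code sees only the interface, so swapping the provider is a
// change at the construction site and nowhere else.
function runTurn(provider: LlmProvider, userMessage: string): string {
  return provider.complete(`User: ${userMessage}`);
}

const devProvider = scriptedProvider(['You just asked about your previous message.']);
const reply = runTurn(devProvider, 'What did I just ask?');
```

Replacing `devProvider` with a network-backed implementation of the same interface would leave `runTurn` untouched.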

  • 47 typed events across 13 domains — context, stream, agent, cost, permission, eval, memory, skill, composition, cache, and more
  • 6 detach drivers for non-blocking observability (microtask, immediate, setImmediate, setTimeout, sendBeacon, worker-thread)
  • Provider-agnostic prompt cache — Anthropic + OpenAI + Bedrock + NoOp, with a strategy registry for vendor-specific subpaths
  • 4 observability vendor adapters (v2.8.1 → v2.9.0) — agentcoreObservability, cloudwatchObservability, xrayObservability (hierarchical AWS X-Ray traces), otelObservability (OpenTelemetry — unlocks Honeycomb / Datadog / Splunk / etc. via OTLP). All under agentfootprint/observability-providers.
  • Strategy-pattern grouped enablers (v2.8) — enable.observability/cost/liveStatus accept typed strategies + optional detach drivers
  • Reliability subsystem (v2.10.x) — withCircuitBreaker (vendor outage detection, sub-µs fail-fast), outputFallback (3-tier degradation on schema failure), resumeOnError (mid-run failure recovery via JSON-serializable checkpoint). Guide →
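
For readers new to the pattern, a circuit breaker can be sketched in a few lines. This is a generic illustration, not the implementation behind withCircuitBreaker: `createBreaker`, its parameters, and the injected clock are hypothetical, and calls are synchronous to keep the example short.

```typescript
// Hypothetical circuit breaker; not the implementation behind withCircuitBreaker.
function createBreaker<T>(
  call: () => T,
  maxFailures: number,
  cooldownMs: number,
  now: () => number,
) {
  let failures = 0;
  let openedAt = -1; // -1 means the circuit has not been opened

  function isOpen(): boolean {
    return openedAt >= 0 && now() - openedAt < cooldownMs;
  }

  function run(): T {
    // Fail fast while open: the vendor is not called at all.
    if (isOpen()) throw new Error('circuit open');
    try {
      const result = call();
      failures = 0;
      openedAt = -1;
      return result;
    } catch (err) {
      failures += 1;
      if (failures >= maxFailures) openedAt = now();
      throw err;
    }
  }

  return { run, isOpen };
}

// Simulated vendor and clock so the example is deterministic.
let clock = 0;
let healthy = false;
let vendorCalls = 0;
const callVendor = (): string => {
  vendorCalls += 1;
  if (!healthy) throw new Error('vendor down');
  return 'ok';
};

const breaker = createBreaker(callVendor, 2, 1000, () => clock);

function attempt(): string {
  try {
    return breaker.run();
  } catch (e) {
    return (e as Error).message;
  }
}

const outcomes: string[] = [];
outcomes.push(attempt()); // first failure
outcomes.push(attempt()); // second failure opens the circuit
outcomes.push(attempt()); // rejected without calling the vendor
clock = 1500;             // cooldown has elapsed
healthy = true;
outcomes.push(attempt()); // trial call succeeds and closes the circuit
```

The third attempt never reaches the vendor, which is the fail-fast behavior the bullet above describes; the fourth succeeds once the cooldown has passed.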

| Version | What | Status |
| --- | --- | --- |
| v2.8.1 | agentcoreObservability (CloudWatch Logs) | |
| v2.8.2 | cloudwatchObservability | |
| v2.8.3 | xrayObservability (AWS distributed tracing) | |
| v2.9.0 | otelObservability (the industry-standard unlock; covers ~every backend via OTLP) | |
| v2.10.0 | Reliability: withCircuitBreaker | |
| v2.10.1 | Reliability: outputFallback | |
| v2.10.2 | Reliability: resumeOnError + RunCheckpointError | |
| v2.11.0 | Reliability guide + runnable example + integration test (this release) | |
| v2.12.x | cost-providers subpath + first cost adapter (TBD: Stripe metered billing OR pricing-data refresh) | planned |
| v2.13.x | lens-browser / lens-cli (visual debugger backends) | planned |