🎓 New to agents
Start with the 5-min Quick Start. First agent runs offline against mock(). No API key needed.
Every other agent framework ships agents you debug with logs. agentfootprint ships agents whose every decision, tool call, memory write, and context injection is captured as a typed event during one DFS traversal — no instrumentation, no post-processing.
The framework owns the loop, so it can record everything that happens inside it. Just as autograd’s forward-pass traversal makes gradient inspection automatic, agentfootprint’s flowchart traversal makes the typed-event stream and replayable traces automatic.
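To make the "loop owner records everything" idea concrete, here is a minimal sketch. It assumes a plain `for` loop in place of the flowchart traversal, and the event shapes (`AgentEvent`, `runLoop`) are hypothetical names invented for illustration; the real agentfootprint event types may differ.

```typescript
// Hypothetical event shapes for illustration only.
type AgentEvent =
  | { kind: "decision"; iteration: number; choice: string }
  | { kind: "tool-call"; iteration: number; tool: string; args: unknown }
  | { kind: "memory-write"; iteration: number; key: string }
  | { kind: "context-injection"; iteration: number; slot: "system" | "messages" | "tools"; id: string };

// A step reports what it did through `emit` and returns true to keep looping.
type Step = (iteration: number, emit: (e: AgentEvent) => void) => boolean;

// Because the loop owner hands out `emit`, every step is recorded
// without the step author adding any logging of their own.
function runLoop(step: Step, maxIterations: number): AgentEvent[] {
  const trace: AgentEvent[] = [];
  for (let i = 1; i <= maxIterations; i++) {
    const keepGoing = step(i, (e) => { trace.push(e); });
    if (!keepGoing) break;
  }
  return trace;
}

const trace = runLoop((i, emit) => {
  emit({ kind: "context-injection", iteration: i, slot: "system", id: "steering-json" });
  if (i === 1) {
    emit({ kind: "tool-call", iteration: i, tool: "lookupOrder", args: { id: 42 } });
    return true;
  }
  emit({ kind: "decision", iteration: i, choice: "final-answer" });
  return false;
}, 5);

// The discriminated union lets consumers filter with full type narrowing.
const toolCalls = trace.filter(
  (e): e is Extract<AgentEvent, { kind: "tool-call" }> => e.kind === "tool-call",
);
```

Since the trace is plain data, it can be serialized, replayed, or filtered by `kind` after the run.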
Every agentfootprint Agent has three context slots: system, messages, tools. Everything you’d want to put into an agent’s context — Skills, Steering, Instructions, Facts, Memory, RAG, tool definitions — is one of N Injections. Each Injection has a trigger that decides when it activates.
    ┌─────────────────────────────────────────────────────────────────┐
    │ Agent (one loop)                                                │
    │                                                                 │
    │  Iteration N:   ┌──────────┐   ┌──────────┐   ┌──────────┐      │
    │                 │  system  │   │ messages │   │  tools   │      │
    │                 └────▲─────┘   └────▲─────┘   └────▲─────┘      │
    │                      │              │              │            │
    │  Injections (re-evaluated every iter)                           │
    │  ┌──────────────────────────────┐                               │
    │  │ Skill / Steering / Instruct  │ → system                      │
    │  │ Fact / Memory / RAG          │ → messages                    │
    │  │ ToolProvider / gatedTools    │ → tools                       │
    │  └──────────────────────────────┘                               │
    │                                                                 │
    │  Trigger taxonomy: 4 kinds                                      │
    │                                                                 │
    │    always-on       every iteration                              │
    │    turn-start      first iteration of a turn                    │
    │    on-tool-return  after each tool call's result                │
    │    llm-activated   model picks via Skills tool                  │
    │                                                                 │
    └─────────────────────────────────────────────────────────────────┘

| Trigger | Fires | Use for |
|---|---|---|
| always-on | Every iteration | Steering (“respond JSON”), constant facts |
| turn-start | First iteration of a turn | User-specific Skills, session memory |
| on-tool-return | After every tool call | Context based on what the tool just returned |
| llm-activated | Model decides via listSkills/readSkill tools | Heavy / costly Skills the model loads on demand |
You don’t compose context; you declare what activates when, and the framework recomposes the three slots per iteration.
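The trigger taxonomy above can be sketched as a small pure function. This is an illustration under stated assumptions, not the framework's implementation: `InjectionSketch`, `IterationState`, `isActive`, and `composeSlots` are hypothetical names, and the activation rules are a literal reading of the table.

```typescript
type Slot = "system" | "messages" | "tools";
type TriggerKind = "always-on" | "turn-start" | "on-tool-return" | "llm-activated";

// Hypothetical per-iteration state the triggers are evaluated against.
interface IterationState {
  iteration: number;            // 1-based index within the current turn
  lastToolResult?: string;      // set when the previous iteration called a tool
  activatedSkills: Set<string>; // ids the model asked to load
}

interface InjectionSketch {
  id: string;
  slot: Slot;
  trigger: TriggerKind;
  content: string;
}

// One rule per trigger kind, mirroring the "Fires" column of the table.
function isActive(inj: InjectionSketch, s: IterationState): boolean {
  switch (inj.trigger) {
    case "always-on": return true;
    case "turn-start": return s.iteration === 1;
    case "on-tool-return": return s.lastToolResult !== undefined;
    case "llm-activated": return s.activatedSkills.has(inj.id);
  }
}

// Rebuild all three slots from scratch for the given iteration.
function composeSlots(injections: InjectionSketch[], s: IterationState): Record<Slot, string[]> {
  const slots: Record<Slot, string[]> = { system: [], messages: [], tools: [] };
  for (const inj of injections) {
    if (isActive(inj, s)) slots[inj.slot].push(inj.content);
  }
  return slots;
}

const injections: InjectionSketch[] = [
  { id: "json-steering", slot: "system", trigger: "always-on", content: "Respond in JSON." },
  { id: "session-memory", slot: "messages", trigger: "turn-start", content: "User prefers metric units." },
  { id: "tool-context", slot: "messages", trigger: "on-tool-return", content: "Interpret the latest tool result." },
  { id: "billing-skill", slot: "system", trigger: "llm-activated", content: "Billing skill body" },
];

const iter1 = composeSlots(injections, { iteration: 1, activatedSkills: new Set() });
const iter2 = composeSlots(injections, {
  iteration: 2,
  lastToolResult: "order shipped",
  activatedSkills: new Set(["billing-skill"]),
});
```

In this sketch `iter1` carries the steering line plus the turn-start memory, while `iter2` swaps in the tool-return context and the Skill the model activated. The declarations stay the same; only the state changes.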
    const memory = defineMemory({
      id: 'last-10',
      description: 'Keep the last 10 turns of conversation.',
      type: MEMORY_TYPES.EPISODIC,
      strategy: { kind: MEMORY_STRATEGIES.WINDOW, size: 10 },
      store,
    });

Install and run:

    npm install agentfootprint footprintjs
    npm run example examples/memory/01-window-strategy.ts

Outputs “You just asked about your previous message.” in under 100ms. Deterministic, free, no API key. Swap mock(...) for anthropic(...) / openai(...) / bedrock(...) / ollama(...) for production. Nothing else changes.
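Three capabilities set agentfootprint apart: Dynamic ReAct, the provider-agnostic cache layer, and replayable trace export. Each one alone is good. Together they multiply: the cache amplifies Dynamic ReAct’s per-iteration recomposition, and the trace export captures the result of every recomposition and every cache hit. Strip any one and you’re back to a hand-rolled framework.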
Every other ReAct framework runs prompt → llm → tool in a loop where the prompt is frozen at iteration 1. agentfootprint recomposes all three slots every iteration based on the latest tool result. That makes possible:
- An on-tool-return injection adds “the user has admin privileges” the moment a permission-check tool returns yes.
- gatedTools(inner, scope.role === 'admin') exposes a different tool set per iteration based on what the agent has learned.
- The model calls listSkills to discover, then readSkill to load, only the Skills it needs (zero token cost for unused Skills).
- The flow can loopTo() an earlier stage when validation fails, with a fresh injection cycle on each retry.

This is what “the framework owns the loop” unlocks. See the Dynamic ReAct guide.
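A minimal sketch of the gated-tools idea, assuming nothing about the real `gatedTools` signature beyond what is shown above: `gate`, `ToolSketch`, `Scope`, and the two toy tools are hypothetical stand-ins. The point is that the tool list is recomputed each iteration from state that a previous tool result changed.

```typescript
interface ToolSketch {
  name: string;
  run: (input: string) => string;
}

interface Scope {
  role: "guest" | "admin";
}

// Hypothetical stand-in for the gating idea: expose `inner` only while `allowed` holds.
function gate(inner: ToolSketch[], allowed: boolean): ToolSketch[] {
  return allowed ? inner : [];
}

const checkPermission: ToolSketch = { name: "checkPermission", run: () => "admin" };
const deleteUser: ToolSketch = { name: "deleteUser", run: (id) => `deleted ${id}` };

// Called once per iteration, so the answer can differ between iterations.
function toolsFor(scope: Scope): string[] {
  return [checkPermission, ...gate([deleteUser], scope.role === "admin")].map((t) => t.name);
}

const scope: Scope = { role: "guest" };

// Iteration 1: the agent only sees the permission check.
const toolsIter1 = toolsFor(scope);

// The tool result updates what the agent has learned.
if (checkPermission.run("") === "admin") scope.role = "admin";

// Iteration 2: the gated tool is now exposed.
const toolsIter2 = toolsFor(scope);
```

A loop that freezes its prompt and tool list at iteration 1 cannot express this; recomputing per iteration makes it a one-line condition.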
Per-iteration recomposition would burn tokens (everything re-builds every iteration) without caching. v2.6 ships a provider-agnostic cache layer — declare cache: 'persistent' on a Skill / Steering / Instruction and the framework picks the right cache mechanism per provider:
| Provider | Mechanism | Activated by cache: 'persistent' |
|---|---|---|
| Anthropic | cache_control: { type: 'ephemeral' } blocks | ✅ |
| OpenAI | Automatic (cached prefix discount) | ✅ (no-op — OpenAI handles internally) |
| Bedrock | Same as Anthropic via Claude on Bedrock | ✅ |
| All | noOp fallback for providers without prompt caching | ✅ |
Real Sonnet result with v2.6 cache layer: a Dynamic ReAct workload dropped 36,322 → 6,535 input tokens (−82%) end-to-end. The cache compounds with Dynamic ReAct: iteration 2+ replays the cached system prompt + Skills, only the new tool result is uncached.
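The per-provider mapping in the table can be sketched as follows. This is an assumption-laden illustration, not the v2.6 cache layer itself: `applyCacheStrategy`, `PromptBlock`, and `ProviderName` are hypothetical names. The only external detail relied on is the Anthropic-style `cache_control: { type: 'ephemeral' }` marker named in the table. Bedrock is treated like Anthropic because the table says so.

```typescript
type ProviderName = "anthropic" | "openai" | "bedrock" | "other";

interface PromptBlock {
  type: "text";
  text: string;
  cache?: "persistent";                  // declared by the Skill / Steering / Instruction author
  cache_control?: { type: "ephemeral" }; // Anthropic-style marker, added by the strategy
}

// Translate the provider-agnostic `cache` declaration into what each provider expects.
// The `cache` field itself is stripped so it never reaches the wire.
function applyCacheStrategy(provider: ProviderName, blocks: PromptBlock[]): PromptBlock[] {
  const usesMarkers = provider === "anthropic" || provider === "bedrock";
  return blocks.map(({ cache, ...rest }): PromptBlock =>
    usesMarkers && cache === "persistent"
      ? { ...rest, cache_control: { type: "ephemeral" } }
      : rest, // OpenAI caches prefixes automatically; others fall back to a no-op
  );
}

const blocks: PromptBlock[] = [
  { type: "text", text: "You are a support agent.", cache: "persistent" },
  { type: "text", text: "Latest tool result: order shipped." },
];

const forAnthropic = applyCacheStrategy("anthropic", blocks);
const forOpenAI = applyCacheStrategy("openai", blocks);

// Arithmetic behind the quoted result: (36,322 - 6,535) / 36,322.
const savedPercent = Math.round(((36322 - 6535) / 36322) * 100);
```

A production strategy would likely mark only the end of the stable prefix rather than every persistent block, since Anthropic limits the number of cache breakpoints per request.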
Every run exports as one JSON checkpoint. Persist it (Redis, Postgres, S3); resume on a different process, day, or server. defineMemory({ type: CAUSAL }) stores the agent’s decision evidence — every decide() value the flowchart captured.
New questions cosine-match past queries; matching snapshots inject the prior decision evidence; the LLM answers from EXACT past facts, not reconstruction. See Memory guide § Causal.
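The cosine-match step can be sketched with toy vectors. Everything here is an assumption for illustration: `CausalSnapshot`, `recall`, the 0.8 threshold, and the three-dimensional embeddings are hypothetical, and a real setup would get its vectors from an embedder.

```typescript
// Hypothetical shape of a stored snapshot: the past query, its embedding,
// and the decision evidence captured during that run.
interface CausalSnapshot {
  query: string;
  embedding: number[];
  evidence: string;
}

// Standard cosine similarity; returns 0 for zero-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return snapshots whose past query is similar enough, best match first.
// The 0.8 default is an arbitrary illustrative threshold.
function recall(queryEmbedding: number[], snapshots: CausalSnapshot[], threshold = 0.8): CausalSnapshot[] {
  return snapshots
    .map((snapshot) => ({ snapshot, score: cosine(queryEmbedding, snapshot.embedding) }))
    .filter((scored) => scored.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .map((scored) => scored.snapshot);
}

const snapshots: CausalSnapshot[] = [
  { query: "Can I get a refund?", embedding: [1, 0, 0], evidence: "refundEligible = true" },
  { query: "Where is my parcel?", embedding: [0, 1, 0], evidence: "carrier = DHL" },
];

// A new question whose toy embedding is close to the refund query.
const matches = recall([0.9, 0.1, 0], snapshots);
```

The matched snapshot's `evidence` string is what would be injected into the next prompt, so the model answers from the recorded value rather than re-deriving it.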
The same JSON shape feeds SFT / DPO / process-RL training pipelines (exportForTraining({ format }) on the roadmap).
🛠️ LangChain / CrewAI / LangGraph user
Start with vs other frameworks — honest comparison, migration sketch.
🏗️ Architecting an enterprise rollout
Start with Deployment — multi-tenant identity, audit trails, peer-deps, observability stack.
🏛️ Doing production due diligence
Start with Architecture — dependency graph, subsystem boundaries, scope semantics, the one DFS traversal that powers everything.
🔬 Researcher / extending the framework
Start with Skills, explained — context-engineering essay; the strongest writing in the docs.
Three example shapes, all runnable end-to-end with npm run example examples/<file>.ts:
- examples/context-engineering/06-mixed-flavors.ts
- examples/patterns/05-tot.ts (Tree-of-Thoughts) or 01-self-consistency.ts

| Boundary | Dev (mock) | Prod (one-line swap) |
|---|---|---|
| LLM provider | mock({ replies }) | anthropic() · openai() · bedrock() · ollama() |
| Embedder | mockEmbedder() | OpenAI / Cohere / Bedrock embedder (planned) |
| Memory store | InMemoryStore | RedisStore · AgentCoreStore (planned: pgvector / Pinecone) |
| Cache strategy | noOpCache() | anthropicCache() · openaiCache() · bedrockCache() (v2.6) |
| Observability | sync inline | agentcoreObservability() + detach: { driver } (v2.8.1) |
| MCP server | mockMcpClient({ tools }) | mcpClient({ transport }) |
| Tool execute | inline closure | real implementation |
Ship the patterns first; pay for tokens last.
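The dev/prod swap in the table works because agent code depends on a boundary, not a vendor. Here is a minimal sketch of that pattern. `LlmProvider`, `scriptedProvider`, and `answer` are hypothetical names written in the spirit of mock({ replies }); they are not the library's actual types.

```typescript
// The boundary: agent code only ever sees this interface.
interface LlmProvider {
  complete(prompt: string): Promise<string>;
}

// Hypothetical scripted provider: returns canned replies in order,
// so runs are deterministic, offline, and free.
function scriptedProvider(replies: string[]): LlmProvider {
  let next = 0;
  return {
    async complete(): Promise<string> {
      if (next >= replies.length) throw new Error("scriptedProvider: no replies left");
      return replies[next++];
    },
  };
}

// Agent logic is written once against the interface.
async function answer(provider: LlmProvider, question: string): Promise<string> {
  return provider.complete(`Q: ${question}`);
}

// Dev: scripted replies. Prod: pass a real provider that implements LlmProvider.
const devProvider = scriptedProvider(["You just asked about your previous message."]);
const pendingReply = answer(devProvider, "What did I just ask?");
```

Because `answer` never names a vendor, moving to production means changing only the value passed as `provider`.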
- agentcoreObservability, cloudwatchObservability, xrayObservability (hierarchical AWS X-Ray traces), otelObservability (OpenTelemetry; unlocks Honeycomb / Datadog / Splunk / etc. via OTLP). All under agentfootprint/observability-providers.
- enable.observability/cost/liveStatus accept typed strategies + optional detach drivers.
- withCircuitBreaker (vendor outage detection, sub-µs fail-fast), outputFallback (3-tier degradation on schema failure), resumeOnError (mid-run failure recovery via JSON-serializable checkpoint). See the Reliability guide.
- enable.* API, vendor adapters, detach for non-blocking telemetry.

| Version | What | Status |
|---|---|---|
| v2.8.1 | agentcoreObservability (CloudWatch Logs) | ✅ |
| v2.8.2 | cloudwatchObservability | ✅ |
| v2.8.3 | xrayObservability (AWS distributed tracing) | ✅ |
| v2.9.0 | otelObservability (the industry-standard unlock — covers ~every backend via OTLP) | ✅ |
| v2.10.0 | Reliability — withCircuitBreaker | ✅ |
| v2.10.1 | Reliability — outputFallback | ✅ |
| v2.10.2 | Reliability — resumeOnError + RunCheckpointError | ✅ |
| v2.11.0 | Reliability guide + runnable example + integration test (this release) | ✅ |
| v2.12.x | cost-providers subpath + first cost adapter (TBD: Stripe metered billing OR pricing-data refresh) | planned |
| v2.13.x | lens-browser / lens-cli (visual debugger backends) | planned |
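For readers unfamiliar with the pattern behind withCircuitBreaker (v2.10.0 above), here is a generic circuit-breaker sketch. It is not the library's implementation: `CircuitBreakerSketch`, its constructor parameters, and the synchronous `call` are assumptions chosen to keep the example short. A real vendor call would be asynchronous.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreakerSketch {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  // `now` is injectable so the example can use a fake clock.
  constructor(threshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  getState(): BreakerState {
    return this.state;
  }

  call<T>(fn: () => T): T {
    if (this.state === "open") {
      // Fail fast without touching the vendor while the cooldown is running.
      if (this.now() - this.openedAt < this.cooldownMs) {
        throw new Error("circuit open: failing fast");
      }
      this.state = "half-open"; // cooldown elapsed: allow one trial call
    }
    try {
      const result = fn();
      this.state = "closed";
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed trial call, or too many consecutive failures, opens the circuit.
      if (this.state === "half-open" || this.failures >= this.threshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}

let clock = 0;
const breaker = new CircuitBreakerSketch(2, 1000, () => clock);
const failing = (): string => {
  throw new Error("vendor down");
};

// Two consecutive failures reach the threshold and open the circuit.
for (let i = 0; i < 2; i++) {
  try {
    breaker.call(failing);
  } catch {
    // expected: the vendor call failed
  }
}
```

While the circuit is open, `call` throws before invoking `fn`, which is what makes fail-fast cheap. After the cooldown, a single successful trial call closes it again.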