Deployment
Friday afternoon, 4:50 PM. You’re about to push your agent service to production for the first time. What’s the checklist? Not “did the tests pass” (CI proved that), but the higher-stakes things: tenant isolation, peer dependencies installed, observability hooks wired, secrets out of the codebase. This guide is the checklist.
The mocks-first → prod-swap workflow
If you developed against the mocks-first stack, the deployment is mechanical:
| Boundary | Dev (mock) | Prod (one-line swap) |
|---|---|---|
| LLM provider | mock({ reply }) | anthropic({...}) · openai({...}) · bedrock({...}) |
| Embedder | mockEmbedder() | OpenAI / Cohere / Bedrock embedder factory (planned v2.6) |
| Memory store | InMemoryStore | RedisStore (agentfootprint/memory-redis) · AgentCoreStore (agentfootprint/memory-agentcore) |
| MCP server | mockMcpClient({ tools }) | mcpClient({ transport }) |
| Tool execute | inline closure | real implementation |
The flowchart, recorders, narrative, and tests don’t change. Ship the patterns first; pay for tokens last.
Production checklist
1. Multi-tenant identity at every agent.run()
Every memory call namespaces by MemoryIdentity. The default { conversationId: '_global' } is fine for prototypes but DANGEROUS in production multi-tenant apps. Pass per-tenant identity at every call site:

    const identity = {
      tenant: req.tenantId,
      principal: req.userId,
      conversationId: req.threadId,
    };
    await agent.run({ message: req.body.message, identity });

A bug that omits tenant surfaces as “no data found”, never as a cross-tenant leak. Adapters refuse cross-tenant reads at the storage boundary.
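
One way to keep a forgotten field from quietly falling back to the default identity is to build the identity in a single helper that fails loudly. The sketch below is illustrative and not part of agentfootprint: `RequestLike`, `Identity`, and `identityFromRequest` are hypothetical names, and the field names simply mirror the snippet above.

```typescript
// Hypothetical request shape; adapt it to your framework's request type.
interface RequestLike {
  tenantId?: string;
  userId?: string;
  threadId?: string;
}

// Field names follow the identity object shown in this guide (assumed shape).
interface Identity {
  tenant: string;
  principal: string;
  conversationId: string;
}

// Build a per-tenant identity, throwing if any part is missing or blank,
// so a request cannot fall through to a shared default.
function identityFromRequest(req: RequestLike): Identity {
  const fields: Array<[string, string | undefined]> = [
    ["tenantId", req.tenantId],
    ["userId", req.userId],
    ["threadId", req.threadId],
  ];
  for (const [field, value] of fields) {
    if (value === undefined || value.trim() === "") {
      throw new Error(`Missing ${field}: refusing to build an identity`);
    }
  }
  return {
    tenant: req.tenantId as string,
    principal: req.userId as string,
    conversationId: req.threadId as string,
  };
}
```

Calling the helper at the top of every request handler turns a missing tenant into an immediate error instead of an empty result that someone has to debug later.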
2. Peer-dep SDKs installed
agentfootprint declares optional SDKs in peerDependenciesMeta. Install only what you use:

    npm install agentfootprint footprintjs              # core (always)
    npm install @anthropic-ai/sdk                       # if using anthropic()
    npm install openai                                  # if using openai()
    npm install @aws-sdk/client-bedrock-runtime         # if using bedrock()
    npm install ioredis                                 # if using RedisStore
    npm install @aws-sdk/client-bedrock-agent-runtime   # if using AgentCoreStore
    npm install @modelcontextprotocol/sdk               # if using mcpClient()

Each SDK is lazy-required at first call, with a friendly install hint if it is missing. npm install agentfootprint on its own works for mocks-first dev.
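
To make the lazy-require behavior concrete, here is a minimal sketch of the general pattern. It is not agentfootprint's actual loader: `makePeerLoader`, `loadPeer`, and `RequireFn` are hypothetical names, and the require function is injected so the sketch stays self-contained.

```typescript
// Hypothetical stand-in for Node's `require`, injected by the caller.
type RequireFn = (name: string) => unknown;

// Returns a loader that resolves an optional package on first use and
// caches it. On failure it rethrows with an install hint.
function makePeerLoader(requireFn: RequireFn) {
  const cache = new Map<string, unknown>();
  return function loadPeer(pkg: string, usedBy: string): unknown {
    if (cache.has(pkg)) return cache.get(pkg);
    let mod: unknown;
    try {
      mod = requireFn(pkg);
    } catch {
      // Simplification: any load error is reported as a missing package.
      // A stricter version would inspect the error code first.
      throw new Error(
        `${usedBy} needs the optional package "${pkg}". Run: npm install ${pkg}`,
      );
    }
    cache.set(pkg, mod);
    return mod;
  };
}
```

Because nothing is loaded until `loadPeer` runs, code paths that never touch a given backend never need its SDK installed.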
3. Secrets in env, not code
Provider factories accept apiKey as a constructor arg. Pass it via process.env:

    const provider = anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });

Never commit keys. Use a secrets manager (AWS Secrets Manager, HashiCorp Vault, Doppler, etc.) for production.
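
The `!` non-null assertion only silences the TypeScript compiler. If the variable is unset, `undefined` still reaches the provider at runtime. A small startup check catches that earlier. `requireEnv` and `Env` below are hypothetical names for this sketch, not library exports.

```typescript
// A plain record so the helper can be exercised without a real environment.
type Env = Record<string, string | undefined>;

// Return the named variable, or throw if it is unset or empty.
function requireEnv(name: string, env: Env): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable ${name}`);
  }
  return value;
}
```

In a Node service, this could be used as `anthropic({ apiKey: requireEnv('ANTHROPIC_API_KEY', process.env) })`, so a misconfigured deploy fails at boot and not on the first user request.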
4. Observability hooks wired
Pick one per concern:
- Status line: agent.enable.thinking({ onStatus }) (terminal UIs, chat typing indicators)
- Structured logs: agent.enable.logging({ domains, logger }) (pino, winston, console)
- Cost tracking: pass pricingTable + costBudget to Agent.create()
- Custom recorders: agent.attach(myRecorder) for aggregation
See Observability guide for the full surface.
5. Resilience decorators wrapping the provider
Production providers should be wrapped:

    const provider = withRetry(
      withFallback(
        anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! }),
        openai({ apiKey: process.env.OPENAI_API_KEY! }),
      ),
      { maxAttempts: 5 },
    );

See Resilience guide for withRetry / withFallback / resilientProvider decorators.
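
The nesting order matters. With retry on the outside, each attempt runs the whole fallback chain again. The sketch below shows that call order using synchronous stand-ins. `Call`, `fallingBack`, and `retrying` are hypothetical and are not agentfootprint's decorators, and the sketch has no backoff. Whether the library composes the same way is an assumption to check against the Resilience guide.

```typescript
// Synchronous stand-in for a provider call; real providers are async.
type Call = (prompt: string) => string;

// Hypothetical fallback wrapper: try `first`, and on error try `second`.
function fallingBack(first: Call, second: Call): Call {
  return (prompt) => {
    try {
      return first(prompt);
    } catch {
      return second(prompt);
    }
  };
}

// Hypothetical retry wrapper: call `inner` up to `maxAttempts` times and
// rethrow the last error if every attempt fails.
function retrying(inner: Call, maxAttempts: number): Call {
  return (prompt) => {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return inner(prompt);
      } catch (err) {
        lastError = err;
      }
    }
    throw lastError;
  };
}

// Record the order in which the two stand-in providers are reached.
const order: string[] = [];
const primary: Call = () => {
  order.push("primary");
  throw new Error("primary down");
};
let secondaryCalls = 0;
const secondary: Call = (prompt) => {
  order.push("secondary");
  secondaryCalls++;
  if (secondaryCalls < 2) throw new Error("secondary flaky");
  return `echo: ${prompt}`;
};

// Same nesting as the guide's snippet: retry wraps fallback.
const provider = retrying(fallingBack(primary, secondary), 5);
const reply = provider("hi");
```

In this sketch the second attempt succeeds, and both providers are tried on every attempt. Swapping the nesting would instead exhaust all retries on the primary before touching the secondary.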
6. Pause/resume infrastructure
If your agent uses pauseHere or askHuman, wire the checkpoint persistence + resume path:

    // On pause
    const result = await agent.run({...});
    if (isPaused(result)) {
      await db.save('pause:' + sessionId, JSON.stringify(result.checkpoint));
      triggerHumanWorkflow(result.pauseData);
      return; // request done
    }

    // On human reply (different process / day)
    const checkpoint = JSON.parse(await db.get('pause:' + sessionId));
    const finalResult = await agent.resume(checkpoint, humanAnswer);

See Pause/Resume guide.
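
One detail the snippet above leaves open is what happens when a human reply arrives twice or after the checkpoint is gone. A minimal sketch of a checkpoint store that handles both cases follows. `KeyValueStore`, `saveCheckpoint`, and `takeCheckpoint` are hypothetical names, and an in-memory `Map` stands in for `db`.

```typescript
// Hypothetical key-value interface standing in for `db`.
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
  delete(key: string): void;
}

// Same key scheme as the guide's snippet.
const pauseKey = (sessionId: string): string => "pause:" + sessionId;

// Serialize the checkpoint as JSON under the session's pause key.
function saveCheckpoint(
  store: KeyValueStore,
  sessionId: string,
  checkpoint: unknown,
): void {
  store.set(pauseKey(sessionId), JSON.stringify(checkpoint));
}

// Read and remove the checkpoint so a duplicate reply cannot resume the
// same pause twice. Returns undefined when nothing is pending.
function takeCheckpoint(store: KeyValueStore, sessionId: string): unknown {
  const key = pauseKey(sessionId);
  const raw = store.get(key);
  if (raw === undefined) return undefined;
  store.delete(key);
  return JSON.parse(raw);
}
```

The read-then-delete here is only safe within a single process. With a shared store, an atomic operation such as Redis GETDEL would be the natural replacement.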
Multi-instance considerations
- CircuitBreaker state is per-process today. Multi-instance deploys won’t share circuit state. Acceptable for v2.4; v2.5 adds shared-state options. (Reliability subsystem.)
- Memory stores are inherently multi-instance friendly when backed by Redis / AgentCore / external DBs.
- Pause/resume checkpoints are JSON; any instance can resume any pause given the checkpoint.
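
To see what per-process circuit state means in practice, consider the sketch below. `SketchBreaker` is a deliberately minimal, hypothetical breaker and not agentfootprint's CircuitBreaker. It opens after a fixed number of consecutive failures and has no half-open state. Two objects stand in for two server processes.

```typescript
// Minimal illustrative breaker: opens after `threshold` consecutive failures.
class SketchBreaker {
  private failures = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  get isOpen(): boolean {
    return this.failures >= this.threshold;
  }

  recordFailure(): void {
    this.failures++;
  }

  recordSuccess(): void {
    this.failures = 0;
  }
}

// Two objects stand in for two server processes behind a load balancer.
const instanceA = new SketchBreaker(3);
const instanceB = new SketchBreaker(3);

// Instance A sees three consecutive provider failures and opens.
for (let i = 0; i < 3; i++) instanceA.recordFailure();

// Instance B knows nothing about A's failures and keeps sending traffic.
const states = { a: instanceA.isOpen, b: instanceB.isOpen };
```

Here `states.a` is true while `states.b` is false. During a provider outage, each instance has to observe its own run of failures before it stops calling the provider.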
What’s NOT here yet
- Rate-limit budget enforcement → Reliability subsystem v2.5
- 3-tier output fallback → Reliability subsystem v2.5
- Per-agent IAM-style policy → Governance subsystem v2.6
- DynamoDB / Postgres / Pinecone memory adapters → v2.6
Anti-patterns
- Don’t wire production keys into your test suite. Use mock(...) for tests; reserve real keys for staged env.
- Don’t omit identity in production. Default-global is a footgun.
- Don’t cache the agent instance across requests when memory is involved. Build fresh per request; the memory layer handles state.
Next steps
- Quick Start: the mocks-first workflow
- Memory store adapters: production backend matrix
- Resilience guide: production-grade decorators