Strict output (Instructor-style schema retry)
Wire outputSchema validation INTO the reliability gate so failed validations re-prompt the model within the current turn — without burning a full ReAct loop iteration. New helpers, ephemeral-message handling, and stuck-loop detection.
The Instructor pattern, on agentfootprint primitives. When the LLM emits valid JSON that fails your outputSchema (e.g. amount came back as "USD 50" instead of 50), v2.13 re-prompts the same model with the validation error, within the SAME turn, for up to N retries. Each retry's feedback is an ephemeral message: visible to the model, never persisted to memory or audit logs. It composes on top of the existing v2.11.5 reliability gate; no new factory.
What v2.13 added (small primitive change)
// ReliabilityScope (extended)
interface ReliabilityScope {
  // existing
  attempt, providerIdx, response?, error?, errorKind, latencyMs, ...
  // NEW in v2.13
  validationError?: { message: string; path?: string; rawOutput?: string };
  validationErrorHistory: readonly string[]; // accumulates across retries
}
// ReliabilityRule (extended)
interface ReliabilityRule {
  // existing
  when, then, kind, label?
  // NEW in v2.13: content delivered as ephemeral user message before retry
  feedbackForLLM?: string | ((s: ReliabilityScope) => string | Promise<string>);
}
// LLMMessage (extended)
interface LLMMessage {
  // existing
  role, content, toolCallId?, toolName?, toolCalls?
  // NEW in v2.13: persistence flag (NOT a visibility flag)
  ephemeral?: boolean;
}
// New typed event
'agentfootprint.agent.output_schema_validation_failed' {
  message, stage, path?, rawOutput?, attempt, cumulativeRetries
}
// New helpers (agentfootprint/reliability subpath)
ValidationFailure            // sentinel error class
defaultStuckLoopRule         // drop-in PostDecide rule
lastNValidationErrorsMatch   // helper for custom stuck-loop predicates
How the validation flows through the gate
When you wire BOTH .outputSchema(parser) AND .reliability({...}):
LLM call returns response
↓
toolCalls.length === 0? ← validation only fires on TERMINAL turns
↓ yes
outputSchema parser tries to parse content
↓ throws
emit agentfootprint.agent.output_schema_validation_failed
↓
ReliabilityScope.validationError = { message, path, rawOutput }
ReliabilityScope.validationErrorHistory.push(message)
ReliabilityScope.errorKind = 'schema-fail'
↓
PostDecide rules evaluate
↓ matched rule with then: 'retry' AND feedbackForLLM
applyFeedback: append { role: 'user', content: feedbackForLLM(scope), ephemeral: true }
↓
Loop — re-call LLM with the appended ephemeral message
↓
(repeat OR fail-fast OR succeed)
Critical guarantees:
- Validation fires ONLY on terminal turns. Tool-call turns aren't final answers; validating them would be premature. (Fixes a v2.13 7-panel review concern from OpenAI's reviewer.)
- The event fires BEFORE PostDecide. Observability sees every validation failure even if a buggy rule routes to fail-fast or swallows it.
- Ephemeral messages NEVER persist to scope.history. They live only in the gate's closure-local request, are sent to the LLM, and disappear when the gate exits. Memory writes (via prepareFinal.newMessages) only see the final accepted exchange.
- A throwing feedbackForLLM callback is caught. The gate falls back to a generic message and never aborts the agent run.
- Stuck-loop detection is a built-in rule. defaultStuckLoopRule fail-fasts after 2 identical validation errors, before another wasted retry.
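The guarantees above can be made concrete with a minimal, synchronous simulation of the gate loop. This is a sketch under stated assumptions, not the library's implementation: GateMsg, GateScope, runGate, GENERIC_FEEDBACK and the maxAttempts parameter are hypothetical local names, the model call is a plain function, and only synchronous feedback callbacks are handled.

```typescript
// Hypothetical stand-ins for the library's types; NOT the real agentfootprint API.
interface GateMsg { role: 'system' | 'user' | 'assistant'; content: string; ephemeral?: boolean }
interface GateScope { attempt: number; validationError?: { message: string }; validationErrorHistory: string[] }

const GENERIC_FEEDBACK = 'Previous output failed validation. Return valid JSON conforming to the schema.';

function runGate<T>(
  history: readonly GateMsg[],
  callModel: (request: readonly GateMsg[]) => string,
  parse: (raw: unknown) => T,
  feedback: (s: GateScope) => string,
  maxAttempts: number,
): { value: T; persisted: GateMsg[]; attempts: number } {
  // Closure-local request: ephemeral messages only ever live in this array.
  const request: GateMsg[] = [...history];
  const scope: GateScope = { attempt: 0, validationErrorHistory: [] };

  for (;;) {
    scope.attempt += 1;
    const content = callModel(request);
    try {
      const value = parse(JSON.parse(content));
      // Only the accepted exchange is handed back for persistence.
      const persisted: GateMsg[] = [...history, { role: 'assistant', content }];
      return { value, persisted, attempts: scope.attempt };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      scope.validationError = { message };
      scope.validationErrorHistory.push(message);
      if (scope.attempt >= maxAttempts) throw new Error(`fail-fast: ${message}`);

      let text: string;
      try {
        text = feedback(scope);
      } catch {
        text = GENERIC_FEEDBACK; // a throwing callback never aborts the run
      }
      request.push({ role: 'user', content: text, ephemeral: true });
    }
  }
}
```

In this sketch the caller's history array is never mutated, which is one simple way to get the "never persisted" property: the feedback message exists only in the copy that is sent to the model.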
The recipe — strictOutputRules(maxRetries) in user-land
The full runnable file is examples/features/12-strict-output.ts. The 30-LOC core:
/**
 * PostDecide rule template that retries on schema-fail with feedback,
 * then fail-fasts after maxRetries. Stuck-loop rule goes BEFORE so
 * it short-circuits before another wasted attempt.
 */
function strictOutputRules(maxRetries: number): ReliabilityRule[] {
  return [
    defaultStuckLoopRule, // fail-fast on 2 identical errors in a row
    {
      when: (s: ReliabilityScope) =>
        s.validationError !== undefined && s.attempt < maxRetries,
      then: 'retry',
      kind: 'schema-retry',
      feedbackForLLM: (s: ReliabilityScope) =>
        `Previous output failed validation: ${s.validationError!.message}. Return valid JSON conforming to the schema.`,
    },
    {
      when: (s: ReliabilityScope) => s.validationError !== undefined,
      then: 'fail-fast',
      kind: 'schema-retry-exhausted',
    },
  ];
}
Wire it like any reliability config:
import { Agent } from 'agentfootprint';

const agent = Agent.create({ provider, model: 'claude-sonnet-4-5-20250929' })
  .system('You decide refund requests. Output JSON.')
  .outputSchema(refundParser)
  .reliability({ postDecide: strictOutputRules(3) })
  .build();

const result = await agent.runTyped<Refund>({ message: 'refund order #42 for $50' });
When the model emits badly shaped JSON, the gate appends an ephemeral feedback message and re-prompts. The call returns the parsed value once validation passes.
The parser shape
Any object with parse(value: unknown): T works. Zod schemas, TypeBox, hand-written validators:
/**
 * Toy parser: accepts JSON of shape `{action, amount}` with amount as
 * a number. The first version of the model often emits amount as a
 * string (`"USD 50"`); this parser rejects that.
 */
interface Refund {
  action: 'refund' | 'reject';
  amount: number;
}
const refundParser = {
  parse: (raw: unknown): Refund => {
    if (typeof raw !== 'object' || raw === null) {
      throw new Error('expected object');
    }
    const r = raw as { action?: unknown; amount?: unknown };
    if (r.action !== 'refund' && r.action !== 'reject') {
      throw new Error(`action must be 'refund' or 'reject' (got ${JSON.stringify(r.action)})`);
    }
    if (typeof r.amount !== 'number') {
      throw new Error(`amount must be a number (got ${JSON.stringify(r.amount)})`);
    }
    return { action: r.action, amount: r.amount };
  },
  description: 'Refund decision: { action: "refund" | "reject", amount: number }',
};
When parser.parse() throws, the framework wraps the error in ValidationFailure, captures the message, the stage (json-parse vs schema-validate), and the raw output, and routes the failure through the reliability loop.
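To illustrate the two stages named above, here is a small sketch of how a raw model string might be classified as a json-parse failure versus a schema-validate failure. It is an assumption-laden illustration: ParserLike, ParseOutcome and classifyModelOutput are hypothetical names, and the sketch returns a result object where the library (per the text above) throws ValidationFailure.

```typescript
// Hypothetical sketch; the real framework wraps failures in ValidationFailure instead.
interface ParserLike<T> { parse(value: unknown): T }

type ParseOutcome<T> =
  | { ok: true; value: T }
  | { ok: false; stage: 'json-parse' | 'schema-validate'; message: string; rawOutput: string };

function describeError(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

function classifyModelOutput<T>(rawOutput: string, parser: ParserLike<T>): ParseOutcome<T> {
  let json: unknown;
  try {
    json = JSON.parse(rawOutput); // stage 1: is it JSON at all?
  } catch (err) {
    return { ok: false, stage: 'json-parse', message: describeError(err), rawOutput };
  }
  try {
    return { ok: true, value: parser.parse(json) }; // stage 2: does it match the schema?
  } catch (err) {
    return { ok: false, stage: 'schema-validate', message: describeError(err), rawOutput };
  }
}
```

Keeping the raw output next to the stage makes it possible to tell "the model wrapped its answer in prose" apart from "the model returned JSON with the wrong field types" when reading traces.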
Composition — stacks cleanly with everything else
Three reliability surfaces compose in this order around every CallLLM:
agent.run()
↓
ReAct loop → CallLLM stage
↓
┌─ Reliability gate ─────────────────────────────────────────┐
│ PreCheck rules → continue / fail-fast                      │
│ ↓                                                          │
│ Provider call → response                                   │
│ ↓                                                          │
│ Schema validation (NEW) → throws ValidationFailure on fail │
│ ↓                                                          │
│ PostDecide rules → ok / retry+feedback / fail-fast         │
│ ↓                                                          │
│ loop OR commit                                             │
└────────────────────────────────────────────────────────────┘
↓ on fail-fast
ReliabilityFailFastError thrown
↓ caller catches
outputFallback chain (existing v2.10.x) catches the throw
↓ tier 2 model attempted with simpler schema
↓ tier 3 canned response if even tier 2 fails
Three primitives, one composition story. No new architectural concept.
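The tail of the diagram (fail-fast, then tier 2, then a canned response) can be sketched as a plain loop over tiers. This is only an illustration of the control flow: FallbackTier and runWithFallback are hypothetical names rather than the outputFallback API, and the tiers are synchronous functions where real model calls would be async.

```typescript
// Hypothetical sketch of a tiered fallback chain; not the library's outputFallback API.
interface FallbackTier<T> { label: string; run: () => T }

function runWithFallback<T>(
  tiers: readonly FallbackTier<T>[],
  canned: T,
): { value: T; servedBy: string; failures: string[] } {
  const failures: string[] = [];
  for (const tier of tiers) {
    try {
      return { value: tier.run(), servedBy: tier.label, failures };
    } catch (err) {
      // A fail-fast from the reliability gate lands here; record it and try the next tier.
      failures.push(`${tier.label}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  // Every tier threw: serve the canned response.
  return { value: canned, servedBy: 'canned', failures };
}
```

Returning the list of failures alongside the value keeps the degradation visible to the caller, so a canned answer is never mistaken for a tier 1 success.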
Stuck-loop detection — defaultStuckLoopRule
A model that fails the same way twice in a row will very likely fail the same way a third time. Burning more retries is wasteful and can also be a security signal (intentional probing). Drop in the built-in rule BEFORE your retry rules:
import { defaultStuckLoopRule, lastNValidationErrorsMatch } from 'agentfootprint/reliability';

postDecide: [
  defaultStuckLoopRule, // ← FIRST: short-circuit stuck loops
  { when: (s) => s.validationError !== undefined && s.attempt < 3,
    then: 'retry', kind: 'schema-retry', feedbackForLLM: ... },
  { when: (s) => s.validationError !== undefined,
    then: 'fail-fast', kind: 'schema-retry-exhausted' },
]
The rule's when is lastNValidationErrorsMatch(scope, 2). For custom stuck-loop predicates (e.g. last 3 must match), call the helper directly:
{
  when: (s) => lastNValidationErrorsMatch(s, 3),
  then: 'fail-fast',
  kind: 'schema-stuck-loop-3',
}
When the stuck-loop rule fires, ReliabilityFailFastError.kind === 'schema-stuck-loop', so callers can distinguish it from regular retry exhaustion.
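For intuition, a predicate like lastNValidationErrorsMatch could be written as a tail comparison over validationErrorHistory. The version below is a hypothetical re-implementation for illustration only (lastNMatchSketch and HistoryScope are made-up names); the library's helper may treat edge cases such as n < 2 differently.

```typescript
// Hypothetical re-implementation for illustration; the library's helper may differ.
interface HistoryScope { validationErrorHistory: readonly string[] }

function lastNMatchSketch(scope: HistoryScope, n: number): boolean {
  const history = scope.validationErrorHistory;
  // Fewer than n recorded errors (or a degenerate n) can never count as a stuck loop.
  if (n < 2 || history.length < n) return false;
  const tail = history.slice(-n);
  return tail.every((message) => message === tail[0]);
}
```

Exact string equality is the simplest possible notion of "identical". If parser messages embed the offending value, two failures with different bad values will not match under this sketch, which may or may not be the behavior you want.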
Observability — the typed event
agent.on('agentfootprint.agent.output_schema_validation_failed', (e) => {
  metrics.histogram('schema_validation_failed', 1, {
    stage: e.payload.stage,     // 'json-parse' | 'schema-validate'
    attempt: e.payload.attempt, // 1, 2, 3...
  });
  if (e.payload.cumulativeRetries > 5) {
    alerts.flag(`Model drift suspected: ${e.payload.cumulativeRetries} validation failures in one turn`);
  }
});
The event fires on EVERY validation failure regardless of whether retries are configured. cumulativeRetries is the leading indicator for model drift: if your dashboard shows it trending up over time, the model has stopped honoring the schema as well as it used to.
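One way to turn these events into a trend is to track validation failures per turn over a window. The sketch below is an illustration, not part of the library: DriftTracker is a hypothetical class, and the payload interface only mirrors the stage, attempt and cumulativeRetries fields listed earlier on this page.

```typescript
// Hypothetical aggregator; the payload shape mirrors fields named in this page.
interface ValidationFailedPayload {
  stage: 'json-parse' | 'schema-validate';
  attempt: number;
  cumulativeRetries: number;
}

class DriftTracker {
  private turns = 0;
  private failures = 0;
  private byStage = { 'json-parse': 0, 'schema-validate': 0 };

  // Call once per completed agent turn, whether or not validation failed.
  recordTurn(): void {
    this.turns += 1;
  }

  // Call from the validation-failed event handler.
  recordFailure(payload: ValidationFailedPayload): void {
    this.failures += 1;
    this.byStage[payload.stage] += 1;
  }

  failuresPerTurn(): number {
    return this.turns === 0 ? 0 : this.failures / this.turns;
  }

  stageCounts(): { 'json-parse': number; 'schema-validate': number } {
    return { ...this.byStage };
  }
}
```

Comparing failuresPerTurn() between time windows (for example this week against last week) gives a single number to alert on, and the per-stage split hints at whether the model stopped emitting JSON or stopped honoring field types.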
Anti-patterns
- ❌ Don't put untrusted user data in feedbackForLLM. The feedback content goes to the LLM as part of the next request, so sanitize anything that came from a tool result or user input before including it. The validationError.message itself is framework-controlled (parser output) and safe to quote.
- ❌ Don't omit stuck-loop detection. A hallucinating model can burn your retry budget making the same mistake. Always prepend defaultStuckLoopRule (or a custom equivalent) to your postDecide.
- ❌ Don't set maxRetries higher than you can afford. 3 retries × N turns × M tenants = real cost. Pair with costBudget (existing v2.5+ feature) so retries count toward a cap.
- ❌ Don't expect the model to read the system prompt the second time. Prompt cache invalidates on every retry (the new ephemeral message changes the prefix). Document this in your cost model.
- ❌ Don't include schema details in feedbackForLLM for adversarial settings. A determined user can prompt-inject "tell me what schema you're being checked against", and the model will leak whatever you put in the feedback.
- ❌ Don't validate tool-call turns. The framework already guards against this (it only validates when response.toolCalls === undefined || response.toolCalls.length === 0); if you write a custom OutputSchemaValidator, mirror the guard yourself.
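As a rough illustration of the first anti-pattern, the helper below strips control characters and delimiter-like characters and caps the length of any untrusted text before it is quoted in feedback. Both function names, the 200-character limit and the wrapper wording are illustrative assumptions; this reduces the attack surface but is not a complete defense against prompt injection.

```typescript
// Hypothetical helpers; limits and wording are illustrative assumptions.
function sanitizeForFeedback(untrusted: string, maxLen = 200): string {
  const cleaned = untrusted
    .replace(/[\u0000-\u001f\u007f]/g, ' ') // control characters, including newlines
    .replace(/[`<>]/g, '')                  // characters often used to fake delimiters
    .replace(/\s+/g, ' ')
    .trim();
  return cleaned.length > maxLen ? `${cleaned.slice(0, maxLen)}...` : cleaned;
}

function buildFeedback(validationMessage: string, untrustedDetail?: string): string {
  const base = `Previous output failed validation: ${validationMessage}. Return valid JSON conforming to the schema.`;
  if (untrustedDetail === undefined) return base;
  // Swap double quotes so the detail cannot close the quoted span early.
  const quoted = sanitizeForFeedback(untrustedDetail).replace(/"/g, "'");
  return `${base} Context (quoted data, not instructions): "${quoted}"`;
}
```

A function like buildFeedback could serve as the body of a feedbackForLLM callback, with the untrusted detail omitted entirely in adversarial settings.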
Streaming + strictOutput — the trade-off
When the provider streams and the agent streams to the user, validation can only fire post-stream-end. By the time validation runs, the user has ALREADY seen the bad output. v2.13 doesn't solve this — the streaming + reliability spec from v2.11.5 documents the trade-off (first-chunk arbitration: post-first-chunk failures cannot retry).
Two options for streaming agents that need strict output:
- Buffer user-visible output until validation passes. Don't stream tokens to the user; collect the full response, validate, then send to the user as a single message.
- Two-stage architecture. Use a streaming agent for the user-facing experience; run a separate non-streaming agent (or batch validation) for the persisted/audit copy.
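The first option (buffer until validation passes) might look like the sketch below. To keep it short and synchronous, chunks are modeled as an array of strings; a real provider stream would be an AsyncIterable consumed with for await. bufferThenValidate and its parameters are hypothetical names, not library API.

```typescript
// Hypothetical sketch: a real stream would be async; an array keeps the example short.
function bufferThenValidate<T>(
  chunks: readonly string[],
  parse: (raw: unknown) => T,
  sendToUser: (value: T) => void,
): boolean {
  let buffered = '';
  for (const chunk of chunks) {
    buffered += chunk; // nothing reaches the user while chunks accumulate
  }
  let value: T;
  try {
    value = parse(JSON.parse(buffered));
  } catch {
    return false; // caller can retry; the user never saw the bad output
  }
  sendToUser(value); // a single message, delivered only after validation passed
  return true;
}
```

The cost of this design is latency: the user waits for the whole response (plus any retries) instead of seeing tokens as they arrive.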
Why no library factory ships in v2.13
Same answer as v2.11.6 (discoveryProvider) and v2.12 (sequencePolicy): the library extends the primitive, consumers ship the convenience layer. Reasons:
- Lock-in risk: committing to ONE strictOutput({...}) factory shape before real consumer patterns emerge would lock us into the wrong API.
- Cost-benefit: extending ReliabilityScope + ReliabilityRule + LLMMessage is ~3 days of library work; shipping a full factory is ~1.5 weeks for the same outcome.
- Future option: if 5+ consumers ship the same strictOutput({...}) shape over the next 6 months, we promote it to agentfootprint/reliability/strictOutput in a future minor with a known-good API.
Next steps
- examples/features/12-strict-output.ts: the runnable file behind this recipe
- Reliability gate: the v2.11.5 foundation this builds on
- Output schema: the outputSchema parser primitive
- Output fallback: the v2.10.x 3-tier degradation chain that catches ReliabilityFailFastError
- Sequence governance: the v2.12 sibling recipe (same primitive-extension + recipe pattern)
