Skip to content

Custom provider

Your team uses an internal model serving cluster, or a provider not yet shipped as a built-in adapter. You don’t need agentfootprint to support it explicitly — you just implement two methods. The library treats your custom provider identically to anthropic() or openai().

import type { LLMProvider, LLMRequest, LLMResponse, LLMChunk } from 'agentfootprint';
interface LLMProvider {
readonly name: string;
complete(req: LLMRequest): Promise<LLMResponse>;
stream?(req: LLMRequest): AsyncIterable<LLMChunk>;
}

That’s the whole interface. complete is required; stream is optional (the library falls back to complete if absent).

class MyCustomProvider implements LLMProvider {
readonly name = 'my-custom';
private apiKey: string;
constructor(opts: { apiKey: string }) {
this.apiKey = opts.apiKey;
}
async complete(req: LLMRequest): Promise<LLMResponse> {
const res = await fetch('https://my-llm-api.example.com/v1/chat', {
method: 'POST',
headers: { 'Authorization': `Bearer ${this.apiKey}` },
body: JSON.stringify(toMyApiFormat(req)),
});
const json = await res.json();
return fromMyApiFormat(json);
}
}

The two converters (toMyApiFormat, fromMyApiFormat) are the only API-specific code. The agent loop, recorders, memory, skills — all work unchanged.

For UI-streaming support:

async *stream(req: LLMRequest): AsyncIterable<LLMChunk> {
const res = await fetch('https://my-llm-api.example.com/v1/stream', { ... });
const reader = res.body!.getReader();
let tokenIndex = 0;
let accumulated = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
const text = new TextDecoder().decode(value);
accumulated += text;
yield { tokenIndex: tokenIndex++, content: text, done: false };
}
// Final chunk MUST carry the authoritative LLMResponse
yield {
tokenIndex,
content: '',
done: true,
response: {
content: accumulated,
toolCalls: [], // populate from API's terminal payload
usage: { input: 0, output: 0 },
stopReason: 'stop',
},
};
}

The agent reads response off the final chunk to drive ReAct decisioning. UI consumers read intermediate content chunks for token streaming. One round-trip, two consumers.

Honor req.signal so consumers can cancel mid-stream:

async complete(req: LLMRequest): Promise<LLMResponse> {
const res = await fetch(url, {
signal: req.signal,
// ...
});
// ...
}

The library’s resilience decorators (withRetry, withFallback) wrap your custom provider transparently — no special integration needed.

Drop into the standard composition:

import { withRetry } from 'agentfootprint';
const provider = withRetry(new MyCustomProvider({ apiKey: '...' }), {
maxAttempts: 5,
});

If your provider throws errors that should be retryable, wrap them in LLMError with retryable: true so the standard retry policy catches them:

import { LLMError } from 'agentfootprint';
throw new LLMError({
code: 'rate_limit',
provider: 'my-custom',
message: '429 from upstream',
statusCode: 429,
retryable: true,
});
  • Don’t bake retry logic into complete. Let withRetry handle it; your provider should be a thin SDK wrapper.
  • Don’t return raw API errors. Wrap in LLMError so the framework’s classification works (retryable vs terminal).
  • Don’t omit usage from the response — recorders depend on it for cost tracking.
  • Anthropic — reference adapter implementation
  • OpenAI — reference adapter implementation
  • Resilience — your custom provider gets retry + fallback for free