Ollama

Ollama runs models locally on your machine. agentfootprint’s ollama() factory wraps Ollama’s OpenAI-compatible endpoint as an LLMProvider, so the same agent code runs against local Ollama in development and a cloud provider such as Anthropic in production.

Install Ollama itself from ollama.com/download, then pull a model:

ollama pull llama3.1
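
Before wiring up an agent, it can help to confirm that the Ollama server is running and that the model has actually been pulled. The sketch below queries Ollama's GET /api/tags endpoint, which lists locally available models. The names hasModel, isModelPulled, and FetchLike are illustrative and not part of agentfootprint; the fetch function is passed in as an argument so the check does not depend on any particular runtime.

```typescript
// Shape of the relevant part of Ollama's GET /api/tags response.
interface TagsResponse {
  models: { name: string }[];
}

// Minimal fetch signature, so the helper does not depend on global typings.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Ollama reports names with a tag suffix (e.g. "llama3.1:latest"),
// so match either the exact name or "<name>:<tag>".
function hasModel(tags: TagsResponse, model: string): boolean {
  return tags.models.some((m) => m.name === model || m.name.startsWith(`${model}:`));
}

// Returns false when the server is unreachable or the model is missing.
async function isModelPulled(
  model: string,
  fetchImpl: FetchLike,
  host = 'http://localhost:11434', // Ollama's default address
): Promise<boolean> {
  try {
    const res = await fetchImpl(`${host}/api/tags`);
    if (!res.ok) return false;
    return hasModel((await res.json()) as TagsResponse, model);
  } catch {
    return false; // request failed: Ollama is probably not running
  }
}
```

On Node 18 or later, where fetch is global, a call such as `await isModelPulled('llama3.1', fetch)` should be enough to decide whether to prompt the user to run `ollama pull` first.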

On the agentfootprint side, ollama() talks to Ollama’s OpenAI-compatible endpoint through the openai SDK, so install openai as a peer dependency:

npm install openai

Then create the provider and pass it to an agent:

import { Agent, ollama } from 'agentfootprint';

const provider = ollama({
  model: 'llama3.1',
  baseURL: 'http://localhost:11434/v1', // default
});

const agent = Agent.create({
  provider,
  model: 'llama3.1',
}).build();
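
For context, "OpenAI-compatible" means that Ollama accepts the same chat completions request shape as the OpenAI API under the /v1 path. The sketch below builds such a request by hand. It only illustrates the wire format and is not agentfootprint's actual implementation; buildChatRequest, ChatMessage, and ChatRequest are hypothetical names.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Builds a POST to the chat completions route under the given base URL.
function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  baseURL = 'http://localhost:11434/v1', // same default as in the example above
): ChatRequest {
  return {
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: `${baseURL.replace(/\/+$/, '')}/chat/completions`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

Passing the result to fetch, as in `const { url, init } = buildChatRequest('llama3.1', [{ role: 'user', content: 'Hello' }]); await fetch(url, init);`, should return a JSON response with the reply text at choices[0].message.content, the same place as in an OpenAI response.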

When to use Ollama

  • Local development: no API key, no cost, no network.
  • Privacy-sensitive workloads: the model and data never leave the machine.
  • Cost-sensitive batch jobs: local inference for high-volume background processing.
  • Edge / offline deployments: bundle the model with your app.

Limitations

  • Local models are smaller and less capable than frontier models. Tool-calling reliability varies by model: llama3.1, qwen2.5, and deepseek work; smaller models may struggle.
  • Some local models do not stream (depends on Ollama config).
  • Multi-modal support is not exposed.
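
Because tool-calling reliability varies, it may be worth validating a model's tool-call arguments before executing anything. In the OpenAI-compatible response format, arguments arrive as a JSON string, which a smaller model may emit malformed. The sketch below shows one defensive way to parse them; parseToolArguments and ParsedArgs are hypothetical names, not agentfootprint APIs.

```typescript
type ParsedArgs =
  | { ok: true; args: Record<string, unknown> }
  | { ok: false; error: string };

// Parses the JSON string a model supplies as tool-call arguments.
// Rejects invalid JSON and anything that is not a plain object.
function parseToolArguments(raw: string): ParsedArgs {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return { ok: false, error: 'arguments are not valid JSON' };
  }
  if (typeof value !== 'object' || value === null || Array.isArray(value)) {
    return { ok: false, error: 'arguments must be a JSON object' };
  }
  return { ok: true, args: value as Record<string, unknown> };
}
```

Returning a result object instead of throwing keeps the decision with the caller: retry the model, fall back to a larger one, or surface the error.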

Switching to Anthropic / OpenAI for production

The agent code stays the same; swap the provider in one line:

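// Assumes anthropic() is exported from 'agentfootprint' alongside ollama().
import { anthropic, ollama } from 'agentfootprint';

// Cloud provider in production, local Ollama everywhere else.
const provider = process.env.NODE_ENV === 'production'
  ? anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! })
  : ollama({ model: 'llama3.1' });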
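
The non-null assertion (!) on ANTHROPIC_API_KEY satisfies the type checker but does not guard against a missing key at runtime. A slightly more defensive variant, sketched below, resolves the choice from the environment first and fails fast if the key is absent. ProviderChoice, Env, and chooseProvider are hypothetical names introduced for illustration.

```typescript
type ProviderChoice =
  | { kind: 'anthropic'; apiKey: string }
  | { kind: 'ollama'; model: string };

type Env = Record<string, string | undefined>;

// Decides which provider to use from environment variables.
// Throws early in production if ANTHROPIC_API_KEY is not set.
function chooseProvider(env: Env, localModel = 'llama3.1'): ProviderChoice {
  if (env.NODE_ENV === 'production') {
    const apiKey = env.ANTHROPIC_API_KEY;
    if (!apiKey) {
      throw new Error('ANTHROPIC_API_KEY must be set in production');
    }
    return { kind: 'anthropic', apiKey };
  }
  return { kind: 'ollama', model: localModel };
}
```

With `const choice = chooseProvider(process.env)`, the result maps directly onto the factories: `choice.kind === 'anthropic' ? anthropic({ apiKey: choice.apiKey }) : ollama({ model: choice.model })`.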