Skills

defineSkill: an LLM-activated body plus tools. The LLM calls read_skill('billing') to load a body of guidance and unlock a set of tools for the rest of the turn. This page covers the Skills surface as it ships today; the full conceptual essay is in Skills, explained.

A support agent handles tech questions 80% of the time, refunds 15% of the time, and billing 5% of the time. If all three playbooks sit in the system prompt, the billing playbook wastes tokens on the 95% of turns where billing isn't relevant. defineSkill instead packages each playbook as something the LLM activates on demand: the body and tools land only when the LLM asks for them.

What a Skill is

A Skill is the LLM-activated flavor of the Injection primitive. It bundles:

  • A body — the playbook text the LLM follows when the skill is active
  • A set of tools — capabilities the LLM can call only after activating
  • A description — what the LLM sees BEFORE activating (used to decide whether to read the skill)

The library auto-attaches a read_skill tool to the agent. The LLM activates by calling read_skill('billing'); the framework looks up the skill, formats the body as the tool result, and lets the LLM use it on the next iteration.

Define and attach a skill

const billingSkill = defineSkill({
  id: 'billing',
  description: 'Read for refund / charge / billing questions. Unlocks process_refund.',
  body: 'When handling billing: confirm the order id, then call process_refund. Always state the amount + payment method in the final reply.',
  tools: [refundTool],
});

The body reads the way you'd brief a junior employee. The tools array holds capabilities only available once the skill is active — process_refund is locked away until the LLM has explicitly chosen to handle billing.

Why progressive disclosure matters

Three reasons:

  1. Token cost: skill bodies are LARGE (200–2000 tokens of playbook text). Loading all of them on every turn is wasteful; loading on demand means you pay for a body only on the turns that need it.
  2. Context discipline: the LLM sees ONLY the playbook for what it's currently doing. No cross-domain confusion.
  3. Capability scoping: process_refund exists in the agent's tool registry only when billing is active. The LLM literally cannot call it during an unrelated turn, which gives defense-in-depth against tool-call mistakes.
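
To make the token-cost point concrete, here is a back-of-the-envelope sketch using the 80/15/5 traffic split from the support-agent example. The playbook sizes are hypothetical values picked from the 200–2000 token range, and the sketch assumes exactly one playbook is relevant per turn. It ignores the cost of skill descriptions and of the read_skill call itself, so treat the numbers as illustrative only.

```typescript
// Hypothetical playbook sizes; the traffic split comes from the support-agent example.
interface Playbook {
  id: string;
  tokens: number;   // assumed body size in tokens, not a measured value
  sharePct: number; // percentage of turns where this playbook is relevant
}

const playbooks: Playbook[] = [
  { id: 'tech', tokens: 1200, sharePct: 80 },
  { id: 'refund', tokens: 800, sharePct: 15 },
  { id: 'billing', tokens: 600, sharePct: 5 },
];

// Always-on: every body sits in the system prompt on every turn.
const alwaysOnTokens = playbooks.reduce((sum, p) => sum + p.tokens, 0);

// On demand: only the relevant body is loaded, weighted by how often it is needed.
// Integer percentages keep the arithmetic exact.
const onDemandTokens =
  playbooks.reduce((sum, p) => sum + p.tokens * p.sharePct, 0) / 100;

console.log({ alwaysOnTokens, onDemandTokens }); // 2600 vs 1110 playbook tokens per turn
```

Under these assumed sizes, on-demand loading spends well under half the playbook tokens per turn on average; the gap widens as more rarely used playbooks are added.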

This is the pattern Anthropic shipped as "Agent SDK Skills" — agentfootprint reimplements it with cross-provider correctness via the Injection primitive so it works on mock(), OpenAI, Bedrock, Ollama identically.

When a skill activates

The LLM's tool-call to read_skill({ id: 'billing' }) is the activation event. The framework:

  1. Looks up billing in the agent's registered skills
  2. Formats the body as a tool result the LLM sees on the next iteration
  3. Adds billing.tools to the agent's tool registry for the rest of the turn
  4. Emits agentfootprint.skill.activated for observability

The skill stays active until the agent run ends. The next agent.run() starts fresh.
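
The four activation steps can be modeled in a few lines. This is a self-contained sketch, not agentfootprint's implementation: the types, the activateSkill function, and the "Unknown skill" message are all hypothetical stand-ins; only the event name and the step order come from the list above.

```typescript
// Hypothetical, simplified shapes for illustration; not agentfootprint's real types.
interface SketchTool { name: string }
interface SketchSkill { id: string; body: string; tools: SketchTool[] }

interface RunState {
  activeTools: Record<string, SketchTool>; // tools callable in the current run
  events: string[];                        // stand-in for an observability sink
}

function activateSkill(
  skills: Record<string, SketchSkill>,
  state: RunState,
  id: string,
): string {
  // Step 1: look up the skill among the registered skills.
  const skill: SketchSkill | undefined =
    Object.prototype.hasOwnProperty.call(skills, id) ? skills[id] : undefined;
  if (skill === undefined) {
    return `Unknown skill: ${id}`; // assumed error shape
  }

  // Step 3: unlock the skill's tools for the rest of the run.
  for (const tool of skill.tools) {
    state.activeTools[tool.name] = tool;
  }

  // Step 4: record the activation event.
  state.events.push('agentfootprint.skill.activated');

  // Step 2: the returned body is the tool result the LLM reads on the next iteration.
  return skill.body;
}

const registered: Record<string, SketchSkill> = {
  billing: {
    id: 'billing',
    body: 'Confirm the order id, then call process_refund.',
    tools: [{ name: 'process_refund' }],
  },
};
const run: RunState = { activeTools: {}, events: [] };
const toolResult = activateSkill(registered, run, 'billing');
```

Creating a fresh RunState for each run mirrors the rule that the next agent.run() starts with no active skills.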

Per-provider surface selection (surfaceMode)

import { defineSkill, resolveSurfaceMode } from 'agentfootprint';

const billingSkill = defineSkill({
  id: 'billing',
  description: 'Refund / charge / billing.',
  body: 'When handling billing: confirm the order id...',
  tools: [refundTool],
  surfaceMode: 'auto',  // resolves to 'both' on Claude ≥ 3.5; 'tool-only' elsewhere
});

// Pure inspector — see what 'auto' will resolve to in your stack:
resolveSurfaceMode('anthropic', 'claude-sonnet-4-5-20250929');  // → 'both'
resolveSurfaceMode('openai', 'gpt-4o');                         // → 'tool-only'

Four modes: 'system-prompt', 'tool-only', 'both', 'auto'. See Skills, explained for the full per-provider attention argument.

Per-mode runtime dispatch

What each mode actually does at runtime when the LLM activates the skill via read_skill('id'):

surfaceMode         System slot (next iteration)    read_skill tool result
'system-prompt'     body lands here                 confirmation only
'tool-only'         body SUPPRESSED                 body delivered verbatim
'both'              body lands here                 body delivered verbatim
'auto' (default)    body lands here                 confirmation only

Two consequences worth knowing:

  1. 'tool-only' is recency-first by protocol. The LLM sees the body as the most recent tool result on the next iteration — providers' attention to the latest message is consistently strong. No reliance on system-prompt training adherence.
  2. 'auto' keeps the body in the system slot so existing consumers see no surprises. To get provider-aware resolution (Claude ≥ 3.5 → 'both'; everything else → 'tool-only'), call resolveSurfaceMode(provider, model) yourself and pass the concrete mode, or set a registry-level default via new SkillRegistry({ surfaceMode }).
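
The dispatch table can also be read as a lookup from mode to delivery behavior. The sketch below restates the table row by row in TypeScript; the Dispatch shape and the dispatchFor helper are hypothetical names used for illustration, not part of the library's API.

```typescript
type SurfaceMode = 'system-prompt' | 'tool-only' | 'both' | 'auto';

// Hypothetical shape describing where the body goes after read_skill('id').
interface Dispatch {
  bodyInSystemSlot: boolean;           // body injected into the system slot next iteration
  toolResult: 'body' | 'confirmation'; // what the read_skill tool result carries
}

// One entry per table row.
const dispatchByMode: Record<SurfaceMode, Dispatch> = {
  'system-prompt': { bodyInSystemSlot: true, toolResult: 'confirmation' },
  'tool-only': { bodyInSystemSlot: false, toolResult: 'body' },
  'both': { bodyInSystemSlot: true, toolResult: 'body' },
  'auto': { bodyInSystemSlot: true, toolResult: 'confirmation' },
};

function dispatchFor(mode: SurfaceMode): Dispatch {
  return dispatchByMode[mode];
}

console.log(dispatchFor('tool-only'));
```

Written this way, it is easy to see that 'auto' and 'system-prompt' behave identically at runtime, which is why provider-aware behavior requires passing a concrete mode.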

Long-context refresh (refreshPolicy)

defineSkill({
  id: 'critical-rule',
  description: 'Critical reasoning rule for long-context runs',
  body: 'When the value is ambiguous, ask for clarification before acting.',
  refreshPolicy: { afterTokens: 50_000, via: 'tool-result' },
});

refreshPolicy is designed to re-inject the body via a tool result once the run passes a token threshold. Today the field is reserved and typed only: the engine ignores it until the long-context refresh hook is implemented, so specifying refreshPolicy is non-breaking.

SkillRegistry — centralized governance

For shared skill catalogs across multiple agents:

import { Agent, SkillRegistry } from 'agentfootprint';

const registry = new SkillRegistry();
registry.register(billingSkill).register(refundSkill).register(complianceSkill);

const supportAgent = Agent.create({ provider }).skills(registry).build();
const escalationAgent = Agent.create({ provider }).skills(registry).build();

// Add a skill — every consumer Agent picks it up at next build.
registry.register(newSkill);

agent.skills(registry) is the bulk-register companion to .skill(skill). Use the registry pattern when two or more agents share overlapping skills; use .skill(...) directly when one agent has its own catalog.

SkillRegistry methods: register(skill) · replace(id, skill) · unregister(id) · get(id) · has(id) · list() · clear() · size · toTools() · resolveForSkill(skillOrId, provider?, model?). register throws on a duplicate id (use replace for explicit overwrites) and on non-Skill flavor inputs.
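
The register / replace contract is easiest to see in a toy model. The class below is a hypothetical in-memory stand-in, not the real SkillRegistry: it only illustrates "throw on duplicate register, overwrite on replace". How the real replace behaves when the id is not yet registered is not stated here, so this sketch simply assumes it stores the skill.

```typescript
// Hypothetical in-memory model of the register / replace contract.
interface CatalogSkill { id: string; body: string }

class SketchRegistry {
  private readonly byId: Record<string, CatalogSkill> = {};

  register(skill: CatalogSkill): this {
    if (this.has(skill.id)) {
      throw new Error(`Skill '${skill.id}' is already registered; use replace()`);
    }
    this.byId[skill.id] = skill;
    return this; // returning `this` allows chained register() calls
  }

  // Assumption: replace stores the skill whether or not the id already exists.
  replace(id: string, skill: CatalogSkill): this {
    this.byId[id] = skill;
    return this;
  }

  has(id: string): boolean {
    return Object.prototype.hasOwnProperty.call(this.byId, id);
  }

  get(id: string): CatalogSkill | undefined {
    return this.has(id) ? this.byId[id] : undefined;
  }
}

const catalog = new SketchRegistry().register({ id: 'billing', body: 'v1' });

let duplicateRejected = false;
try {
  catalog.register({ id: 'billing', body: 'v2' }); // same id again
} catch (_err) {
  duplicateRejected = true;
}

catalog.replace('billing', { id: 'billing', body: 'v2' }); // explicit overwrite
```

Throwing on duplicates makes accidental overwrites in a shared catalog loud, while replace keeps intentional updates a one-liner.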

Registry-level defaults — new SkillRegistry({ surfaceMode, providerHint }) (v2.5)

When every skill in a registry should share the same surfaceMode, set it once on the constructor instead of repeating it on every defineSkill:

import { SkillRegistry } from 'agentfootprint';

// All skills here default to 'tool-only' (overrides defineSkill's 'auto')
const registry = new SkillRegistry({ surfaceMode: 'tool-only' });
registry.register(billingSkill);   // billingSkill.surfaceMode 'auto' → resolves to 'tool-only'
registry.register(refundSkill);

// providerHint helps when the registry is composed far from the agent
// (test fixtures, design-time inspectors, multi-provider routing).
const registry2 = new SkillRegistry({ providerHint: 'anthropic' });

The cascade for surfaceMode resolution is:

  1. Per-skill explicit surfaceMode wins. defineSkill({ surfaceMode: 'both' }) is honored regardless of registry default.
  2. The registry's surfaceMode constructor option, if it is set and is not 'auto'.
  3. Global resolveSurfaceMode(provider, model) — Claude ≥ 3.5 → 'both', everything else → 'tool-only'.
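
The cascade reads naturally as a three-step fallback. The sketch below is hypothetical: resolveCascade is not a library function, and the provider-aware step is passed in as a callback rather than reimplemented. It also assumes that a per-skill 'auto' counts as "not explicit", which matches the registry example where billingSkill's 'auto' resolves to the registry's 'tool-only'.

```typescript
type Mode = 'system-prompt' | 'tool-only' | 'both' | 'auto';
type ConcreteMode = Exclude<Mode, 'auto'>;

// Hypothetical helper mirroring the three cascade steps.
function resolveCascade(
  skillMode: Mode | undefined,
  registryMode: Mode | undefined,
  globalResolve: () => ConcreteMode, // stand-in for the provider/model-aware step
): ConcreteMode {
  // 1. A per-skill explicit mode wins (assumption: 'auto' is not explicit).
  if (skillMode !== undefined && skillMode !== 'auto') return skillMode;
  // 2. Otherwise use the registry default, if set and not 'auto'.
  if (registryMode !== undefined && registryMode !== 'auto') return registryMode;
  // 3. Otherwise fall back to the global resolution.
  return globalResolve();
}

// Stand-in for a non-Claude provider, where the global rule yields 'tool-only'.
const nonClaude = (): ConcreteMode => 'tool-only';

console.log(resolveCascade('both', 'tool-only', nonClaude));   // per-skill wins
console.log(resolveCascade('auto', 'system-prompt', nonClaude)); // registry default
console.log(resolveCascade(undefined, undefined, nonClaude));   // global fallback
```

Whatever path is taken, the result is always a concrete mode and never 'auto'.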

Inspect the resolved mode for any registered skill:

registry.resolveForSkill('billing', 'anthropic', 'claude-sonnet-4-5');
// → returns 'system-prompt' | 'tool-only' | 'both' (never 'auto')

Per-mode routing is live at runtime — 'tool-only' suppresses the body from the system slot and delivers it via the read_skill tool result, 'both' does both, and 'system-prompt' / 'auto' keep the body in the system slot (see Per-mode runtime dispatch). resolveForSkill(...) lets you inspect the resolved mode at design time.

registry.toTools() — explicit composition (v2.5)

When you want to wire skill discovery into a custom tool chain (e.g., a gatedTools layer that filters by role) instead of the Agent's auto-attached read_skill, use toTools():

import { SkillRegistry } from 'agentfootprint';
import { gatedTools, staticTools } from 'agentfootprint/tool-providers';
import { PermissionPolicy } from 'agentfootprint/security';

const registry = new SkillRegistry();
registry.register(billingSkill).register(refundSkill);

const { listSkills, readSkill } = registry.toTools();
// listSkills: Tool — no-arg discovery (LLM calls to enumerate skills)
// readSkill:  Tool — same as the auto-attached one (activation by id)

const policy = PermissionPolicy.fromRoles({...}, 'support');
const allTools = [listSkills!, readSkill!, lookupTool, refundTool];
const provider = gatedTools(staticTools(allTools), (n) => policy.isAllowed(n));

Two reasons to choose toTools() over the auto-attach:

  1. Token-efficient discovery — the auto-attached read_skill embeds the catalog in its description (every iteration's tool list pays the cost). list_skills lets the LLM browse on demand; read_skill's description can stay terse. For ~20+ skill registries, this matters.
  2. Permission gating — pass read_skill through gatedTools like any other tool, so a readonly role can see list_skills but not read_skill, or vice versa.

toTools() returns { listSkills: undefined, readSkill: undefined } for an empty registry — filter with .filter(Boolean) before adding to a tool list.
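
One caveat if you write this in TypeScript: .filter(Boolean) removes the undefined entries at runtime, but in many TypeScript versions it does not narrow the element type, so the result may still be typed as possibly undefined. A small type-guard helper avoids that. The sketch below uses hypothetical stand-in types (SketchTool, DiscoveryTools, definedOnly) and a hand-built object in place of a real toTools() result.

```typescript
// Hypothetical stand-ins for the library's Tool type and toTools() result.
interface SketchTool { name: string }
interface DiscoveryTools {
  listSkills: SketchTool | undefined;
  readSkill: SketchTool | undefined;
}

// Type-guard filter: drops undefined entries and narrows the element type to T.
function definedOnly<T>(items: Array<T | undefined>): T[] {
  return items.filter((item): item is T => item !== undefined);
}

// What an empty registry is described as returning.
const fromEmptyRegistry: DiscoveryTools = { listSkills: undefined, readSkill: undefined };
const lookupTool: SketchTool = { name: 'lookup_order' };

const toolList: SketchTool[] = definedOnly([
  fromEmptyRegistry.listSkills,
  fromEmptyRegistry.readSkill,
  lookupTool,
]);

console.log(toolList.map((t) => t.name)); // only the always-present tool remains
```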

Per-skill tool gating — autoActivate

By default, a Skill's tools array is ADDED to the agent's tool registry on activation. With ~3 skills and ~5 tools each, that's fine. With 20 skills and 100+ tools, the LLM's choice space gets noisy — every iteration's tool list pays the cost.

autoActivate: 'currentSkill' narrows the choice space: when you set it, the skill's tools are EXCLUDED from the static tool list and surface to the LLM ONLY on iterations after the skill is activated by read_skill('id'). Skills WITHOUT autoActivate keep the additive behavior (their tools are always visible).

import { Agent, defineSkill } from 'agentfootprint';

const billingSkill = defineSkill({
  id: 'billing',
  description: 'Billing assistance',
  body: '...',
  tools: [refundTool, chargeTool],
  autoActivate: 'currentSkill',
});

// The Agent reads skill.metadata.autoActivate and wires the gate for you —
// billing's tools stay hidden until the LLM calls read_skill('billing').
const agent = Agent.create({ provider })
  .tool(lookupOrderTool)   // always-visible baseline
  .skill(billingSkill)
  .skill(refundSkill)      // also autoActivate: 'currentSkill'
  .build();

For a tool chain you compose yourself (outside the Agent's auto-attach — e.g., a gatedTools permission layer), materialize the same gate with skillScopedTools(id, tools), reading ctx.activeSkillId per iteration:

import { skillScopedTools, staticTools, type ToolProvider } from 'agentfootprint/tool-providers';

const baseline = staticTools([lookupOrderTool, listSkills, readSkill]);
const billingScope = skillScopedTools('billing', [refundTool, chargeTool]);
const refundScope = skillScopedTools('refund', [reverseTool]);

const provider: ToolProvider = {
  id: 'composite',
  list: (ctx) => [
    ...baseline.list(ctx),
    ...billingScope.list(ctx),
    ...refundScope.list(ctx),
  ],
};

What the LLM sees per iteration:

ctx.activeSkillId       Visible tools
undefined (no skill)    lookup_order, list_skills, read_skill
'billing'               baseline + refund, charge
'refund'                baseline + reverse

This is a Dynamic ReAct payoff: the next iteration's tool list reshapes based on what just happened. In large catalogs, that means roughly a 3× context-budget reduction and sharper LLM tool choice.
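
The gating behavior itself is small enough to sketch. The code below is a hypothetical re-creation of the idea behind skillScopedTools, not the library's implementation: scopedToSkill, GateContext, and GateProvider are illustrative names, and the tool names follow the per-iteration breakdown above.

```typescript
// Hypothetical stand-ins for the library's Tool, context, and ToolProvider types.
interface GateTool { name: string }
interface GateContext { activeSkillId?: string }
interface GateProvider { id: string; list(ctx: GateContext): GateTool[] }

// Exposes `tools` only on iterations where `skillId` is the active skill.
function scopedToSkill(skillId: string, tools: GateTool[]): GateProvider {
  return {
    id: `skill-scoped:${skillId}`,
    list: (ctx) => (ctx.activeSkillId === skillId ? tools : []),
  };
}

const baselineTools: GateTool[] = [
  { name: 'lookup_order' },
  { name: 'list_skills' },
  { name: 'read_skill' },
];
const scopes: GateProvider[] = [
  scopedToSkill('billing', [{ name: 'refund' }, { name: 'charge' }]),
  scopedToSkill('refund', [{ name: 'reverse' }]),
];

// Baseline tools are always visible; scoped tools are appended per iteration.
function visibleToolNames(ctx: GateContext): string[] {
  const names = baselineTools.map((t) => t.name);
  for (const scope of scopes) {
    for (const tool of scope.list(ctx)) {
      names.push(tool.name);
    }
  }
  return names;
}

console.log(visibleToolNames({}));
console.log(visibleToolNames({ activeSkillId: 'billing' }));
```

Because each scope is evaluated against the context on every call, the tool list changes as soon as the active skill changes, with no rebuild of the provider.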

The autoActivate field is also stored on skill.metadata.autoActivate, so custom ToolProvider chains can read it to drive their own composition.

When to use Skills vs Steering vs Instruction

You want                                Use
Always-on persona / tone / format       defineSteering
Conditional rule (predicate-based)      defineInstruction({ activeWhen })
LLM-activated playbook + tools          defineSkill (this guide)
Cross-run state                         defineMemory

Anti-patterns

  • Don't put always-relevant content in a skill. If it's relevant 100% of the time, it belongs in the system prompt or as Steering. Skills are for sometimes-relevant.
  • Don't define dozens of tiny skills. The LLM picks by description; too many descriptions to scan = analysis paralysis. 3–10 focused skills is the sweet spot.
  • Don't put sensitive credentials in a skill body. Skill bodies are LLM-readable plaintext; treat them as you would the system prompt.

Next steps

  • Skills, explained — the conceptual essay (why this design, cross-provider correctness, three-stage anatomy)
  • Tools guide — the underlying tool primitive Skills compose over
