Skills
defineSkill: LLM-activated body + tools. The LLM calls read_skill('billing') to load a body of guidance and unlock a set of tools for the rest of the turn. This page covers the Skills surface as shipped today; the full conceptual essay is in Skills, explained.
A support agent handles billing 5% of the time, tech 80% of the time, and refunds 15% of the time. Putting all three playbooks in the system prompt means the billing playbook costs tokens on the 95% of turns where billing isn't relevant, and the other two are wasted in the same way whenever they don't apply. Exposing them as skills the LLM activates on demand is what defineSkill does: the body + tools land only when the LLM asks for them.
What a Skill is
A Skill is the llm-activated flavor of the Injection primitive. It bundles:
- A body: the playbook text the LLM follows when the skill is active
- A set of tools: capabilities the LLM can call only after activating
- A description: what the LLM sees BEFORE activating (used to decide whether to read the skill)
The library auto-attaches a read_skill tool to the agent. The LLM activates by calling read_skill('billing'); the framework looks up the skill, formats the body as the tool result, and lets the LLM use it on the next iteration.
Define and attach a skill
const billingSkill = defineSkill({
  id: 'billing',
  description: 'Read for refund / charge / billing questions. Unlocks process_refund.',
  body: 'When handling billing: confirm the order id, then call process_refund. Always state the amount + payment method in the final reply.',
  tools: [refundTool],
});

The body reads the way you'd brief a junior employee. The tools array holds capabilities only available once the skill is active: process_refund is locked away until the LLM has explicitly chosen to handle billing.
Why progressive disclosure matters
Three reasons:
- Token cost: skill bodies are LARGE (200–2000 tokens of playbook text). Loading all of them on every turn is wasteful; loading on demand means a body costs tokens only on the turns that activate it.
- Context discipline: the LLM sees ONLY the playbook for what it's currently doing. No cross-domain confusion.
- Capability scoping: process_refund exists in the agent's tool registry only when billing is active. The LLM literally cannot call it during an unrelated turn, which is defense-in-depth against tool-call mistakes.
This is the pattern Anthropic shipped as "Agent SDK Skills" — agentfootprint reimplements it with cross-provider correctness via the Injection primitive so it works on mock(), OpenAI, Bedrock, Ollama identically.
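The token-cost argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the 5% / 80% / 15% traffic split comes from the support-agent example above, while the 1,000-token body size and 30-token description size are assumed round numbers, not measurements. It also ignores the cost of the read_skill round trip itself.

```typescript
// Illustrative arithmetic only. The traffic split is from the support-agent
// example; body and description sizes are ASSUMED round numbers.
interface Playbook {
  id: string;
  share: number;      // fraction of turns where this playbook is relevant
  bodyTokens: number; // assumed size of the playbook body
  descTokens: number; // assumed size of the always-visible description
}

const playbooks: Playbook[] = [
  { id: 'billing', share: 0.05, bodyTokens: 1000, descTokens: 30 },
  { id: 'tech', share: 0.8, bodyTokens: 1000, descTokens: 30 },
  { id: 'refunds', share: 0.15, bodyTokens: 1000, descTokens: 30 },
];

// Everything in the system prompt: every body is paid for on every turn.
function alwaysLoadedCost(books: Playbook[]): number {
  return books.reduce((sum, b) => sum + b.bodyTokens, 0);
}

// On demand: every description is always visible, but a body is paid for
// only on the share of turns where it is activated.
function expectedOnDemandCost(books: Playbook[]): number {
  const descriptions = books.reduce((sum, b) => sum + b.descTokens, 0);
  const bodies = books.reduce((sum, b) => sum + b.share * b.bodyTokens, 0);
  return descriptions + bodies;
}

const alwaysLoaded = alwaysLoadedCost(playbooks); // 3000 tokens per turn
const onDemand = expectedOnDemandCost(playbooks); // roughly 1090 tokens per turn
```

Under these assumptions the on-demand layout costs roughly a third of the always-loaded one, and the gap widens as more rarely used playbooks are added.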
When a skill activates
The LLM's tool-call to read_skill({ id: 'billing' }) is the activation event. The framework:
- Looks up billing in the agent's registered skills
- Formats the body as a tool result the LLM sees on the next iteration
- Adds billing.tools to the agent's tool registry for the rest of the turn
- Emits agentfootprint.skill.activated for observability
The skill stays active until the agent run ends. The next agent.run() starts fresh.
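The activation steps above can be modeled in a few lines. This is a hypothetical sketch, not agentfootprint's internals: the Tool, Skill, and RunState types, the activateSkill function, and the wording of the unknown-skill message are all assumptions made for illustration. Only the event name is taken from the text above.

```typescript
// Hypothetical model of the activation steps; NOT the library's implementation.
interface Tool { name: string }
interface Skill { id: string; description: string; body: string; tools: Tool[] }

interface RunState {
  toolRegistry: Map<string, Tool>; // tools the LLM may call during this run
  events: string[];                // emitted observability events
}

function activateSkill(
  registered: Map<string, Skill>,
  run: RunState,
  id: string,
): string {
  // Step 1: look up the skill among the registered skills.
  const skill = registered.get(id);
  if (!skill) return `Unknown skill: ${id}`; // message wording is an assumption

  // Step 2: the body becomes the tool result shown on the next iteration.
  const toolResult = skill.body;

  // Step 3: unlock the skill's tools for the rest of the run.
  for (const tool of skill.tools) run.toolRegistry.set(tool.name, tool);

  // Step 4: record the activation event for observability.
  run.events.push('agentfootprint.skill.activated');

  return toolResult;
}

const registeredSkills = new Map<string, Skill>([
  ['billing', {
    id: 'billing',
    description: 'Refund / charge / billing.',
    body: 'Confirm the order id, then call process_refund.',
    tools: [{ name: 'process_refund' }],
  }],
]);

// A fresh RunState per run mirrors "the next agent.run() starts fresh".
const runState: RunState = { toolRegistry: new Map(), events: [] };
const activationResult = activateSkill(registeredSkills, runState, 'billing');
```

Because the run state is created per run, nothing unlocked here leaks into the next run.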
Per-provider surface selection (surfaceMode)
import { defineSkill, resolveSurfaceMode } from 'agentfootprint';

const billingSkill = defineSkill({
  id: 'billing',
  description: 'Refund / charge / billing.',
  body: 'When handling billing: confirm the order id...',
  tools: [refundTool],
  surfaceMode: 'auto', // resolves to 'both' on Claude ≥ 3.5; 'tool-only' elsewhere
});

// Pure inspector: see what 'auto' will resolve to in your stack.
resolveSurfaceMode('anthropic', 'claude-sonnet-4-5-20250929'); // → 'both'
resolveSurfaceMode('openai', 'gpt-4o'); // → 'tool-only'

Four modes: 'system-prompt', 'tool-only', 'both', 'auto'. See Skills, explained for the full per-provider attention argument.
Per-mode runtime dispatch
What each mode actually does at runtime when the LLM activates the skill via read_skill('id'):
| surfaceMode | System slot (next iteration) | read_skill tool result |
|---|---|---|
| 'system-prompt' | body lands here | confirmation only |
| 'tool-only' | body SUPPRESSED | body delivered verbatim |
| 'both' | body lands here | body delivered verbatim |
| 'auto' (default) | body lands here | confirmation only |
Two consequences worth knowing:
- 'tool-only' is recency-first by protocol. The LLM sees the body as the most recent tool result on the next iteration, and providers' attention to the latest message is consistently strong. No reliance on system-prompt training adherence.
- 'auto' keeps the body in the system slot so existing consumers see no surprises. To get provider-aware resolution (Claude ≥ 3.5 → 'both'; everything else → 'tool-only'), call resolveSurfaceMode(provider, model) yourself and pass the concrete mode, or set a registry-level default via new SkillRegistry({ surfaceMode }).
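The dispatch table can be read as a small pure function. The sketch below is a hypothetical model of that table, not the library's code: the Dispatch shape, the dispatchBody name, and the confirmation wording are assumptions made for illustration.

```typescript
// Hypothetical sketch of the per-mode dispatch table; not the real engine.
type SurfaceMode = 'system-prompt' | 'tool-only' | 'both' | 'auto';

interface Dispatch {
  systemSlot: string | null; // what lands in the system slot next iteration
  toolResult: string;        // what read_skill returns to the LLM
}

function dispatchBody(mode: SurfaceMode, id: string, body: string): Dispatch {
  // Confirmation wording is an assumption for illustration.
  const confirmation = `Skill '${id}' activated.`;
  switch (mode) {
    case 'tool-only':
      // Body suppressed from the system slot, delivered verbatim as tool result.
      return { systemSlot: null, toolResult: body };
    case 'both':
      return { systemSlot: body, toolResult: body };
    case 'system-prompt':
    case 'auto': // per the table, 'auto' routes like 'system-prompt'
      return { systemSlot: body, toolResult: confirmation };
  }
}

const toolOnly = dispatchBody('tool-only', 'billing', 'Confirm the order id.');
const auto = dispatchBody('auto', 'billing', 'Confirm the order id.');
```

Writing the table as an exhaustive switch means adding a fifth mode would fail to compile until its routing is decided.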
Long-context refresh (refreshPolicy)
defineSkill({
  id: 'critical-rule',
  description: 'Critical reasoning rule for long-context runs',
  body: 'When the value is ambiguous, ask for clarification before acting.',
  refreshPolicy: { afterTokens: 50_000, via: 'tool-result' },
});

refreshPolicy is intended to re-inject the body via a tool result once the run passes a token threshold. The field is reserved and typed today; the engine ignores it until the long-context refresh hook is implemented, so specifying refreshPolicy is non-breaking.
SkillRegistry — centralized governance
For shared skill catalogs across multiple agents:
import { Agent, SkillRegistry } from 'agentfootprint';

const registry = new SkillRegistry();
registry.register(billingSkill).register(refundSkill).register(complianceSkill);

const supportAgent = Agent.create({ provider }).skills(registry).build();
const escalationAgent = Agent.create({ provider }).skills(registry).build();

// Add a skill; every consumer Agent picks it up at next build.
registry.register(newSkill);

agent.skills(registry) is the bulk-register companion to .skill(skill). Use the registry pattern when 2+ agents share overlapping skills; use .skill(...) directly when one agent has its own catalog.
SkillRegistry methods: register(skill) · replace(id, skill) · unregister(id) · get(id) · has(id) · list() · clear() · size · toTools() · resolveForSkill(skillOrId, provider?, model?). Throws on duplicate register (use replace for explicit overwrites). Throws on non-Skill flavor inputs.
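The duplicate-versus-replace behavior is easy to picture with a Map-backed model. The class below is a sketch written for this page, not the real SkillRegistry: it covers only a subset of the listed methods, and the error message, the boolean return of unregister, and the simplified Skill shape are assumptions.

```typescript
// Minimal Map-backed model of the documented behavior. A sketch only:
// SketchRegistry is a hypothetical name, not the library's class.
interface Skill { id: string; description: string; body: string }

class SketchRegistry {
  private readonly skills = new Map<string, Skill>();

  register(skill: Skill): this {
    if (this.skills.has(skill.id)) {
      // Error wording is an assumption; the docs only say it throws.
      throw new Error(`Skill '${skill.id}' is already registered; use replace().`);
    }
    this.skills.set(skill.id, skill);
    return this; // returning `this` is what allows chained register() calls
  }

  // Explicit overwrite: never throws on an existing id.
  replace(id: string, skill: Skill): this {
    this.skills.set(id, skill);
    return this;
  }

  unregister(id: string): boolean { return this.skills.delete(id); }
  get(id: string): Skill | undefined { return this.skills.get(id); }
  has(id: string): boolean { return this.skills.has(id); }
  list(): Skill[] { return Array.from(this.skills.values()); }
  get size(): number { return this.skills.size; }
}

const demoRegistry = new SketchRegistry();
demoRegistry
  .register({ id: 'billing', description: 'Billing help', body: 'v1' })
  .register({ id: 'refund', description: 'Refund help', body: 'v1' });

let duplicateRejected = false;
try {
  demoRegistry.register({ id: 'billing', description: 'Billing help', body: 'v2' });
} catch {
  duplicateRejected = true; // a second register() with the same id throws
}

// replace() is the explicit path for overwriting an existing id.
demoRegistry.replace('billing', { id: 'billing', description: 'Billing help', body: 'v2' });
```

Throwing on duplicates makes accidental shadowing loud when several teams contribute to one catalog, while replace keeps intentional overwrites possible.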
Registry-level defaults — new SkillRegistry({ surfaceMode, providerHint }) (v2.5)
When every skill in a registry should share the same surfaceMode, set it once on the constructor instead of repeating it on every defineSkill:
import { SkillRegistry } from 'agentfootprint';

// All skills here default to 'tool-only' (overrides defineSkill's 'auto')
const registry = new SkillRegistry({ surfaceMode: 'tool-only' });
registry.register(billingSkill); // billingSkill.surfaceMode 'auto' → resolves to 'tool-only'
registry.register(refundSkill);

// providerHint helps when the registry is composed far from the agent
// (test fixtures, design-time inspectors, multi-provider routing).
const registry2 = new SkillRegistry({ providerHint: 'anthropic' });

The cascade for surfaceMode resolution is:
- Per-skill explicit surfaceMode wins. defineSkill({ surfaceMode: 'both' }) is honored regardless of the registry default.
- The registry's surfaceMode constructor option applies next (if set and not 'auto').
- The global resolveSurfaceMode(provider, model) is the fallback: Claude ≥ 3.5 → 'both', everything else → 'tool-only'.
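The three-level cascade can be sketched as a function that takes the global resolver as a parameter. This is an illustration under assumptions, not the registry's actual code: resolveCascade and stubResolve are hypothetical names, and the stub deliberately checks only the provider name instead of reproducing the real Claude ≥ 3.5 version rule.

```typescript
// Hypothetical sketch of the resolution cascade; not the library's code.
type ConcreteMode = 'system-prompt' | 'tool-only' | 'both';
type SurfaceMode = ConcreteMode | 'auto';
type GlobalResolver = (provider: string, model: string) => ConcreteMode;

function resolveCascade(
  skillMode: SurfaceMode | undefined,
  registryMode: SurfaceMode | undefined,
  globalResolve: GlobalResolver,
  provider: string,
  model: string,
): ConcreteMode {
  // 1. An explicit (non-'auto') per-skill mode always wins.
  if (skillMode !== undefined && skillMode !== 'auto') return skillMode;
  // 2. Otherwise the registry default applies, if set and not 'auto'.
  if (registryMode !== undefined && registryMode !== 'auto') return registryMode;
  // 3. Otherwise fall back to the provider-aware global resolver.
  return globalResolve(provider, model);
}

// Stand-in for the global resolver. ASSUMPTION: the real rule inspects the
// model version; this stub only looks at the provider name.
const stubResolve: GlobalResolver = (provider) =>
  provider === 'anthropic' ? 'both' : 'tool-only';

const explicitWins = resolveCascade('both', 'tool-only', stubResolve, 'openai', 'gpt-4o');
const registryDefault = resolveCascade('auto', 'tool-only', stubResolve, 'anthropic', 'claude-sonnet-4-5');
const globalFallback = resolveCascade('auto', undefined, stubResolve, 'openai', 'gpt-4o');
```

The return type excludes 'auto', which matches the statement that a resolved mode is always concrete.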
Inspect the resolved mode for any registered skill:
registry.resolveForSkill('billing', 'anthropic', 'claude-sonnet-4-5');
// → returns 'system-prompt' | 'tool-only' | 'both' (never 'auto')

Per-mode routing is live at runtime: 'tool-only' suppresses the body from the system slot and delivers it via the read_skill tool result, 'both' does both, and 'system-prompt' / 'auto' keep the body in the system slot (see Per-mode runtime dispatch). resolveForSkill(...) lets you inspect the resolved mode at design time.
registry.toTools() — explicit composition (v2.5)
When you want to wire skill discovery into a custom tool chain (e.g., a gatedTools layer that filters by role) instead of the Agent's auto-attached read_skill, use toTools():
import { SkillRegistry } from 'agentfootprint';
import { gatedTools, staticTools } from 'agentfootprint/tool-providers';
import { PermissionPolicy } from 'agentfootprint/security';

const registry = new SkillRegistry();
registry.register(billingSkill).register(refundSkill);

const { listSkills, readSkill } = registry.toTools();
// listSkills: Tool, no-arg discovery (the LLM calls it to enumerate skills)
// readSkill: Tool, same as the auto-attached one (activation by id)

const policy = PermissionPolicy.fromRoles({...}, 'support');
const allTools = [listSkills!, readSkill!, lookupTool, refundTool];
const provider = gatedTools(staticTools(allTools), (n) => policy.isAllowed(n));

Two reasons to choose toTools() over the auto-attach:
- Token-efficient discovery: the auto-attached read_skill embeds the catalog in its description (every iteration's tool list pays the cost). list_skills lets the LLM browse on demand; read_skill's description can stay terse. For ~20+ skill registries, this matters.
- Permission gating: pass read_skill through gatedTools like any other tool, so a readonly role can see list_skills but not read_skill, or vice versa.
toTools() returns { listSkills: undefined, readSkill: undefined } for an empty registry — filter with .filter(Boolean) before adding to a tool list.
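One caveat with .filter(Boolean) in TypeScript is that it drops undefined entries at runtime but typically does not narrow the element type, so the result is still typed as possibly undefined. A small type guard is one way around that. The sketch below uses a stand-in Tool interface and simulates the empty-registry result by hand; isTool is a hypothetical helper, not part of the library.

```typescript
// Stand-in for the library's Tool type (an assumption for this sketch).
interface Tool { name: string }

// Type guard: narrows (Tool | undefined)[] to Tool[] when passed to filter().
function isTool(t: Tool | undefined): t is Tool {
  return t !== undefined;
}

// Simulates what toTools() is documented to return for an empty registry.
const emptyResult: { listSkills: Tool | undefined; readSkill: Tool | undefined } = {
  listSkills: undefined,
  readSkill: undefined,
};

const lookupOrder: Tool = { name: 'lookup_order' };

// Only defined tools survive, and the array is typed as Tool[].
const safeTools: Tool[] = [
  emptyResult.listSkills,
  emptyResult.readSkill,
  lookupOrder,
].filter(isTool);
```

This avoids the non-null assertions (listSkills!, readSkill!) when the registry may legitimately be empty, for example in test fixtures.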
Per-skill tool gating — autoActivate
By default, a Skill's tools array is ADDED to the agent's tool registry on activation. With ~3 skills and ~5 tools each, that's fine. With 20 skills and 100+ tools, the LLM's choice space gets noisy — every iteration's tool list pays the cost.
autoActivate: 'currentSkill' narrows the choice space: when you set it, the skill's tools are EXCLUDED from the static tool list and surface to the LLM ONLY on iterations after the skill is activated by read_skill('id'). Skills WITHOUT autoActivate keep the additive behavior (their tools are always visible).
import { Agent, defineSkill } from 'agentfootprint';

const billingSkill = defineSkill({
  id: 'billing',
  description: 'Billing assistance',
  body: '...',
  tools: [refundTool, chargeTool],
  autoActivate: 'currentSkill',
});

// The Agent reads skill.metadata.autoActivate and wires the gate for you:
// billing's tools stay hidden until the LLM calls read_skill('billing').
const agent = Agent.create({ provider })
  .tool(lookupOrderTool) // always-visible baseline
  .skill(billingSkill)
  .skill(refundSkill) // also autoActivate: 'currentSkill'
  .build();

For a tool chain you compose yourself (outside the Agent's auto-attach, e.g., a gatedTools permission layer), materialize the same gate with skillScopedTools(id, tools), reading ctx.activeSkillId per iteration:
import { skillScopedTools, staticTools, type ToolProvider } from 'agentfootprint/tool-providers';

const baseline = staticTools([lookupOrderTool, listSkills, readSkill]);
const billingScope = skillScopedTools('billing', [refundTool, chargeTool]);
const refundScope = skillScopedTools('refund', [reverseTool]);

const provider: ToolProvider = {
  id: 'composite',
  list: (ctx) => [
    ...baseline.list(ctx),
    ...billingScope.list(ctx),
    ...refundScope.list(ctx),
  ],
};

What the LLM sees per iteration:
| ctx.activeSkillId | Visible tools |
|---|---|
| undefined (no skill) | lookup_order, list_skills, read_skill |
| 'billing' | baseline + refund, charge |
| 'refund' | baseline + reverse |
This is a Dynamic ReAct payoff: the next iteration's tool list reshapes based on what just happened. 3× context-budget reduction in large catalogs + sharper LLM tool-choice.
The autoActivate field is also stored on skill.metadata.autoActivate, so custom ToolProvider chains can read it to drive their own composition.
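To see why the visible tool list changes per iteration, it helps to model the gate itself. The following is a self-contained sketch with hypothetical stand-ins (sketchStaticTools, sketchSkillScopedTools, composeProviders, and the simplified Tool / ToolContext / ToolProvider shapes); it mirrors the behavior described above but is not the library's implementation.

```typescript
// Hypothetical stand-ins for the library's types; shapes are assumptions.
interface Tool { name: string }
interface ToolContext { activeSkillId: string | undefined }
interface ToolProvider { id: string; list(ctx: ToolContext): Tool[] }

// Always-visible tools, regardless of the active skill.
function sketchStaticTools(tools: Tool[]): ToolProvider {
  return { id: 'static', list: () => tools };
}

// Tools that surface only while the named skill is the active one.
function sketchSkillScopedTools(skillId: string, tools: Tool[]): ToolProvider {
  return {
    id: `scoped:${skillId}`,
    list: (ctx) => (ctx.activeSkillId === skillId ? tools : []),
  };
}

// Concatenates the lists of several providers for the current iteration.
function composeProviders(providers: ToolProvider[]): ToolProvider {
  return {
    id: 'composite',
    list: (ctx) =>
      providers.reduce<Tool[]>((acc, p) => acc.concat(p.list(ctx)), []),
  };
}

const composite = composeProviders([
  sketchStaticTools([{ name: 'lookup_order' }, { name: 'list_skills' }, { name: 'read_skill' }]),
  sketchSkillScopedTools('billing', [{ name: 'refund' }, { name: 'charge' }]),
  sketchSkillScopedTools('refund', [{ name: 'reverse' }]),
]);

// Names of the tools visible for a given active skill.
function visibleNames(activeSkillId: string | undefined): string[] {
  return composite.list({ activeSkillId }).map((t) => t.name);
}

const noSkill = visibleNames(undefined);      // baseline only
const billingActive = visibleNames('billing'); // baseline + refund, charge
```

Because list(ctx) is evaluated on every iteration, no tool registry has to be mutated when the active skill changes; the scoped tools simply stop appearing.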
When to use Skills vs Steering vs Instruction
| You want | Use |
|---|---|
| Always-on persona / tone / format | defineSteering |
| Conditional rule (predicate-based) | defineInstruction({ activeWhen }) |
| LLM-activated playbook + tools | defineSkill (this guide) |
| Cross-run state | defineMemory |
Anti-patterns
- Don't put always-relevant content in a skill. If it's relevant 100% of the time, it belongs in the system prompt or as Steering. Skills are for sometimes-relevant.
- Don't define dozens of tiny skills. The LLM picks by description; too many descriptions to scan = analysis paralysis. 3–10 focused skills is the sweet spot.
- Don't put sensitive credentials in a skill body. Skill bodies are LLM-readable plaintext; treat them as you would the system prompt.
Next steps
- Skills, explained — the conceptual essay (why this design, cross-provider correctness, three-stage anatomy)
- Tools guide — the underlying tool primitive Skills compose over
