# Skills
A support agent handles billing 5% of the time, tech 80%, and refunds 15%. Putting all three playbooks in the system prompt wastes tokens: 95% of the time the billing playbook isn’t relevant. Exposing them as tools the LLM activates on demand is what `defineSkill` does — the body and tools land in context only when the LLM asks for them.
## What a Skill is

A Skill is the LLM-activated flavor of the Injection primitive. It bundles:
- A `body` — the playbook text the LLM follows when the skill is active
- A set of `tools` — capabilities the LLM can call only after activating
- A `description` — what the LLM sees BEFORE activating (used to decide whether to read the skill)
The library auto-attaches a `read_skill` tool to the agent. The LLM activates a skill by calling `read_skill('billing')`; the framework looks up the skill, formats the body as the tool result, and lets the LLM use it on the next iteration.
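The activation flow can be sketched as a plain lookup. This is a hypothetical illustration — the types and the `activateSkill` helper are assumptions, not the library's internals:

```typescript
// Hypothetical sketch of the read_skill activation flow. All shapes
// here are assumptions; the real framework's types differ.
interface SketchSkill {
  id: string;
  description: string;
  body: string;
  tools: string[]; // tool names unlocked on activation
}

function activateSkill(
  skills: Map<string, SketchSkill>,
  id: string,
): { toolResult: string; unlockedTools: string[] } {
  const skill = skills.get(id);
  if (!skill) {
    // Unknown id: surface it as the tool result so the LLM can recover.
    return { toolResult: `Unknown skill: ${id}`, unlockedTools: [] };
  }
  // The body becomes the tool result the LLM reads on the next
  // iteration; the skill's tools join the registry for this turn.
  return { toolResult: skill.body, unlockedTools: skill.tools };
}

const skills = new Map<string, SketchSkill>([
  ['billing', {
    id: 'billing',
    description: 'Refund / charge / billing questions.',
    body: 'Confirm the order id, then call process_refund.',
    tools: ['process_refund'],
  }],
]);

const activation = activateSkill(skills, 'billing');
// activation.toolResult is the playbook body;
// activation.unlockedTools now contains 'process_refund'.
```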
## Define and attach a skill

```ts
const billingSkill = defineSkill({
  id: 'billing',
  description: 'Read for refund / charge / billing questions. Unlocks process_refund.',
  body: 'When handling billing: confirm the order id, then call process_refund. Always state the amount + payment method in the final reply.',
  tools: [refundTool],
});
```

The body reads the way you’d brief a junior employee. The `tools` array holds capabilities only available once the skill is active — `process_refund` is locked away until the LLM has explicitly chosen to handle billing.
## Why progressive disclosure matters

Three reasons:
- Token cost — skill bodies are LARGE (200–2000 tokens of playbook text). Loading all of them on every turn is wasteful; loading on demand amortizes the cost.
- Context discipline — the LLM sees ONLY the playbook for what it’s currently doing. No cross-domain confusion.
- Capability scoping — `process_refund` exists in the agent’s tool registry only when billing is active. The LLM literally cannot call it during an unrelated turn — defense-in-depth against tool-call mistakes.
This is the pattern Anthropic shipped as “Agent SDK Skills” — agentfootprint reimplements it with cross-provider correctness via the Injection primitive, so it works identically on `mock()`, OpenAI, Bedrock, and Ollama.
## When a skill activates

The LLM’s tool call to `read_skill({ id: 'billing' })` is the activation event. The framework:
- Looks up `billing` in the agent’s registered skills
- Formats the body as a tool result the LLM sees on the next iteration
- Adds `billing.tools` to the agent’s tool registry for the rest of the turn
- Emits `agentfootprint.skill.activated` for observability
The skill stays active until the agent run ends. The next agent.run() starts fresh.
## Per-provider surface selection (`surfaceMode`)

```ts
import { defineSkill, resolveSurfaceMode } from 'agentfootprint';

const billingSkill = defineSkill({
  id: 'billing',
  description: 'Refund / charge / billing.',
  body: 'When handling billing: confirm the order id...',
  tools: [refundTool],
  surfaceMode: 'auto', // resolves to 'both' on Claude ≥ 3.5; 'tool-only' elsewhere
});

// Pure inspector — see what 'auto' will resolve to in your stack:
resolveSurfaceMode('anthropic', 'claude-sonnet-4-5-20250929'); // → 'both'
resolveSurfaceMode('openai', 'gpt-4o'); // → 'tool-only'
```

Four modes: `'system-prompt'`, `'tool-only'`, `'both'`, `'auto'`. See Skills, explained for the full per-provider attention argument.
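The documented `'auto'` rule can be approximated in a few lines. This is a sketch: `pickAuto` is a hypothetical helper, and the bare provider check stands in for the library's real model-generation detection:

```typescript
type ResolvedMode = 'system-prompt' | 'tool-only' | 'both';

// Sketch of the documented rule: Claude >= 3.5 resolves to 'both',
// everything else to 'tool-only'. Treating any anthropic model as
// ">= 3.5" is a simplification; a real resolver inspects the model id.
function pickAuto(provider: string, _model: string): ResolvedMode {
  return provider === 'anthropic' ? 'both' : 'tool-only';
}

pickAuto('anthropic', 'claude-sonnet-4-5-20250929'); // 'both'
pickAuto('openai', 'gpt-4o'); // 'tool-only'
```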
## Per-mode runtime dispatch (v2.5+)

What each mode actually does at runtime when the LLM activates the skill via `read_skill('id')`:
| surfaceMode | System slot (next iteration) | read_skill tool result |
|---|---|---|
| 'system-prompt' | body lands here | confirmation only |
| 'tool-only' | body SUPPRESSED | body delivered verbatim |
| 'both' | body lands here | body delivered verbatim |
| 'auto' (default) | body lands here | confirmation only |
Two consequences worth knowing:
- `'tool-only'` is recency-first by protocol. The LLM sees the body as the most recent tool result on the next iteration — providers’ attention to the latest message is consistently strong. No reliance on system-prompt training adherence.
- `'auto'` preserves v2.4 behavior so existing consumers see no surprises. The Block A4 cascade resolves `'auto'` against provider/model context (Claude ≥ 3.5 → `'both'`; everything else → `'tool-only'`); that runtime resolution lands in a future v2.5.x — for v2.5 today, `'auto'` is treated as `'system-prompt'` at the dispatch layer.
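The dispatch table maps directly onto a switch. A minimal sketch with assumed shapes (`dispatchBody` is hypothetical, not the library's dispatch layer):

```typescript
type SurfaceModeOpt = 'system-prompt' | 'tool-only' | 'both' | 'auto';

interface Dispatch {
  systemSlot: string | null; // lands in the system slot next iteration
  toolResult: string;        // what read_skill returns to the LLM
}

// Sketch of the per-mode dispatch table; 'auto' mirrors
// 'system-prompt' per the v2.5 behavior described above.
function dispatchBody(mode: SurfaceModeOpt, body: string): Dispatch {
  switch (mode) {
    case 'system-prompt':
    case 'auto':
      return { systemSlot: body, toolResult: 'Skill activated.' };
    case 'tool-only':
      return { systemSlot: null, toolResult: body };
    case 'both':
      return { systemSlot: body, toolResult: body };
  }
}
```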
## Long-context refresh (`refreshPolicy`)

```ts
defineSkill({
  id: 'critical-rule',
  description: 'Critical reasoning rule for long-context runs',
  body: 'When the value is ambiguous, ask for clarification before acting.',
  refreshPolicy: { afterTokens: 50_000, via: 'tool-result' },
});
```

Re-injects the body via tool result once the run passes a token threshold. The runtime hook lands in v2.5 (long-context attention work); the API surface ships in v2.4 and is non-breaking.
## SkillRegistry — centralized governance

For shared skill catalogs across multiple agents:
```ts
import { Agent, SkillRegistry } from 'agentfootprint';

const registry = new SkillRegistry();
registry.register(billingSkill).register(refundSkill).register(complianceSkill);

const supportAgent = Agent.create({ provider }).skills(registry).build();
const escalationAgent = Agent.create({ provider }).skills(registry).build();

// Add a skill — every consumer Agent picks it up at next build.
registry.register(newSkill);
```

`agent.skills(registry)` is the bulk-register companion to `.skill(t)`. Use the registry pattern when 2+ agents share overlapping skills; use `.skill(...)` directly when one agent has its own catalog.
SkillRegistry methods: register(skill) · replace(id, skill) · unregister(id) · get(id) · has(id) · list() · clear() · size · toTools() · resolveForSkill(skillOrId, provider?, model?). Throws on duplicate register (use replace for explicit overwrites). Throws on non-Skill flavor inputs.
## Registry-level defaults — `new SkillRegistry({ surfaceMode, providerHint })` (v2.5)

When every skill in a registry should share the same `surfaceMode`, set it once on the constructor instead of repeating it on every `defineSkill`:
```ts
import { SkillRegistry } from 'agentfootprint';

// All skills here default to 'tool-only' (overrides defineSkill's 'auto')
const registry = new SkillRegistry({ surfaceMode: 'tool-only' });
registry.register(billingSkill); // billingSkill.surfaceMode 'auto' → resolves to 'tool-only'
registry.register(refundSkill);

// providerHint helps when the registry is composed far from the agent
// (test fixtures, design-time inspectors, multi-provider routing).
const registry2 = new SkillRegistry({ providerHint: 'anthropic' });
```

The cascade for surfaceMode resolution is:
- Per-skill explicit `surfaceMode` wins. `defineSkill({ surfaceMode: 'both' })` is honored regardless of registry default.
- Registry’s `surfaceMode` constructor option (if set and not `'auto'`).
- Global `resolveSurfaceMode(provider, model)` — Claude ≥ 3.5 → `'both'`, everything else → `'tool-only'`.
Inspect the resolved mode for any registered skill:
```ts
registry.resolveForSkill('billing', 'anthropic', 'claude-sonnet-4-5');
// → returns 'system-prompt' | 'tool-only' | 'both' (never 'auto')
```

Forward-compat: Block C (v2.5+) wires this cascade into the runtime so per-mode routing diversity (suppressing the system prompt for `'tool-only'`, etc.) takes effect. Today consumers express intent; the runtime tightens later without API change.
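The three-step cascade reduces to a short precedence chain. A sketch under stated assumptions (`cascade` is a hypothetical helper; the provider check stands in for real model detection):

```typescript
type CascadeMode = 'system-prompt' | 'tool-only' | 'both' | 'auto';

// Sketch of the resolution cascade: per-skill explicit mode wins,
// then the registry constructor default, then the global rule.
function cascade(
  skillMode: CascadeMode | undefined,
  registryMode: CascadeMode | undefined,
  provider: string,
): Exclude<CascadeMode, 'auto'> {
  if (skillMode && skillMode !== 'auto') return skillMode;
  if (registryMode && registryMode !== 'auto') return registryMode;
  // Global default ("Claude >= 3.5" simplified to a provider check).
  return provider === 'anthropic' ? 'both' : 'tool-only';
}

cascade('both', 'tool-only', 'openai');     // 'both'      (per-skill wins)
cascade('auto', 'tool-only', 'anthropic');  // 'tool-only' (registry default)
cascade(undefined, undefined, 'anthropic'); // 'both'      (global rule)
```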
## `registry.toTools()` — explicit composition (v2.5)

When you want to wire skill discovery into a custom tool chain (e.g., a `gatedTools` layer that filters by role) instead of the Agent’s auto-attached `read_skill`, use `toTools()`:
```ts
import { SkillRegistry } from 'agentfootprint';
import { gatedTools, staticTools } from 'agentfootprint/tool-providers';
import { PermissionPolicy } from 'agentfootprint/security';

const registry = new SkillRegistry();
registry.register(billingSkill).register(refundSkill);

const { listSkills, readSkill } = registry.toTools();
// listSkills: Tool — no-arg discovery (LLM calls to enumerate skills)
// readSkill: Tool — same as the auto-attached one (activation by id)

const policy = PermissionPolicy.fromRoles({...}, 'support');
const allTools = [listSkills!, readSkill!, lookupTool, refundTool];
const provider = gatedTools(staticTools(allTools), (n) => policy.isAllowed(n));
```

Two reasons to choose `toTools()` over the auto-attach:
- Token-efficient discovery — the auto-attached `read_skill` embeds the catalog in its `description` (every iteration’s tool list pays the cost). `list_skills` lets the LLM browse on demand; `read_skill`’s description can stay terse. For ~20+ skill registries, this matters.
- Permission gating — pass `read_skill` through `gatedTools` like any other tool, so a `readonly` role can see `list_skills` but not `read_skill`, or vice versa.
`toTools()` returns `{ listSkills: undefined, readSkill: undefined }` for an empty registry — filter with `.filter(Boolean)` before adding to a tool list.
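Defensive composition of the `toTools()` result might look like this (a sketch with an assumed minimal `Tool` shape and a hypothetical `composeToolList` helper):

```typescript
interface Tool { name: string }

// toTools() yields possibly-undefined entries for an empty registry;
// filter them out before building the agent's tool list.
function composeToolList(
  skillTools: { listSkills?: Tool; readSkill?: Tool },
  rest: Tool[],
): Tool[] {
  return [skillTools.listSkills, skillTools.readSkill, ...rest]
    .filter((t): t is Tool => t !== undefined);
}

composeToolList({}, [{ name: 'lookup_order' }]); // just lookup_order
```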
## Per-skill tool gating — `autoActivate` (v2.5)

By default, a Skill’s `tools` array is ADDED to the agent’s tool registry on activation. With ~3 skills and ~5 tools each, that’s fine. With 20 skills and 100+ tools, the LLM’s choice space gets noisy — every iteration’s tool list pays the cost.
autoActivate: 'currentSkill' declares intent: when this skill is active, ONLY this skill’s tools should be visible (plus whatever baseline you compose alongside).
```ts
import { defineSkill } from 'agentfootprint';
import { skillScopedTools, staticTools, type ToolProvider } from 'agentfootprint/tool-providers';

const billingTools = [refundTool, chargeTool];

const billingSkill = defineSkill({
  id: 'billing',
  description: 'Billing assistance',
  body: '...',
  tools: billingTools,
  autoActivate: 'currentSkill',
});

// Materialize the gate manually today (Block C v2.5 wires this from
// skill.metadata.autoActivate automatically):
const baseline = staticTools([lookupOrderTool, listSkills, readSkill]);
const billingScope = skillScopedTools('billing', billingTools);
const refundScope = skillScopedTools('refund', [reverseTool]);

const provider: ToolProvider = {
  id: 'composite',
  list: (ctx) => [
    ...baseline.list(ctx),
    ...billingScope.list(ctx),
    ...refundScope.list(ctx),
  ],
};
```

What the LLM sees per iteration:
| ctx.activeSkillId | Visible tools |
|---|---|
| undefined (no skill) | lookup_order, list_skills, read_skill |
| 'billing' | baseline + refund, charge |
| 'refund' | baseline + reverse |
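The visibility table can be sketched as a context-driven filter. Hypothetical shapes only (this approximates what a skill-scoped provider would do; it is not `skillScopedTools` itself):

```typescript
interface ToolCtx { activeSkillId?: string }

// Sketch: baseline tools are always visible; skill-scoped tools
// appear only while their skill is the active one.
const baselineTools = ['lookup_order', 'list_skills', 'read_skill'];
const scopedTools: Record<string, string[]> = {
  billing: ['refund', 'charge'],
  refund: ['reverse'],
};

function visibleTools(ctx: ToolCtx): string[] {
  const scoped = ctx.activeSkillId ? scopedTools[ctx.activeSkillId] ?? [] : [];
  return [...baselineTools, ...scoped];
}

visibleTools({});                           // baseline only
visibleTools({ activeSkillId: 'billing' }); // baseline + refund, charge
```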
This is a Dynamic ReAct payoff: the next iteration’s tool list reshapes based on what just happened. In large catalogs this yields roughly a 3× context-budget reduction and sharper LLM tool choice.
Today: the `autoActivate` field is stored on `skill.metadata.autoActivate`; consumers wire `skillScopedTools(...)` manually. Block C (v2.5+): the runtime reads `skill.metadata.autoActivate` and wires the scope automatically when you call `agent.skills(registry)`.
## When to use Skills vs Steering vs Instruction

| You want | Use |
|---|---|
| Always-on persona / tone / format | defineSteering |
| Conditional rule (predicate-based) | defineInstruction({ activeWhen }) |
| LLM-activated playbook + tools | defineSkill (this guide) |
| Cross-run state | defineMemory |
## Anti-patterns

- Don’t put always-relevant content in a skill. If it’s relevant 100% of the time, it belongs in the system prompt or as Steering. Skills are for sometimes-relevant content.
- Don’t define dozens of tiny skills. The LLM picks by description; too many descriptions to scan = analysis paralysis. 3–10 focused skills is the sweet spot.
- Don’t put sensitive credentials in a skill body. Skill bodies are LLM-readable plaintext; treat them as you would the system prompt.
## Next steps

- Skills, explained — the conceptual essay (why this design, cross-provider correctness, three-stage anatomy)
- Tools guide — the underlying tool primitive Skills compose over