AI Agent Skills: What They Are and How They Work
AI Agent Skills: What They Are and How They Work
LLMs know facts. What they lack is procedural knowledge — the specific, ordered, reasoned steps that describe how work actually gets done in a real system. A release workflow. An incident triage protocol. The exact sequence to deploy a smart contract to mainnet.
OpenAI measured the impact on their own SDK repos: by using skills to automate verification, release review, and PR handoff, development throughput increased +44% in 3 months — 457 PRs merged versus 316, same team, same codebase.
Agent skills are the solution the industry has converged on. On December 18, 2025, Anthropic launched the SKILL.md format as an open standard. Two months later, OpenAI adopted it in Codex and ChatGPT. As of May 2026: 26+ platforms — Claude Code, OpenAI Codex, GitHub Copilot, Gemini CLI, Cursor, VS Code — all reading the same format.
This article covers the architecture, compares Anthropic and OpenAI implementations, and provides production patterns. Primary sources are listed — read them before the summaries.
The Number That Justifies Everything
OpenAI maintains their Agents SDK repos (Python + TypeScript) using Codex with repo-local skills. Documented results between December 2025 and February 2026:
| Metric | Before (Sept–Nov 2025) | After (Dec 2025–Feb 2026) | Delta |
|---|---|---|---|
| PRs merged — Python | 182 | 226 | +24% |
| PRs merged — TypeScript | 134 | 231 | +72% |
| Total | 316 | 457 | +44% |
Same team. Same codebase. Skills added for: automated verification, release review, PR draft generation.
Source: developers.openai.com/blog/skills-agents-sdk
Official Sources
| Platform | Official Documentation |
|---|---|
| Agent Skills Standard | agentskills.io — Apache 2.0 |
| Full Specification | agentskills.io/specification |
| Claude / Claude Code | docs.claude.com → agents-and-tools → agent-skills |
| Claude Code CLI | code.claude.com → docs → skills |
| OpenAI Codex skills | developers.openai.com/codex/skills |
| OpenAI blog — real case | developers.openai.com/blog/skills-agents-sdk |
| OpenAI skills repo | github.com/openai/skills |
| Official validator | github.com/agentskills/agentskills → skills-ref |
Read these first. This article adds what the docs don't have: production tradeoffs, architecture decisions, and the differences that actually matter between platforms.
What a Skill Is
A skill is a directory containing a SKILL.md file at its root.
The SKILL.md frontmatter has exactly two required fields. Constraints from the official spec:
| Field | Required | Rules |
|---|---|---|
name | Yes | Max 64 chars · lowercase letters, numbers, hyphens only · must match the parent directory name |
description | Yes | Max 1024 chars · describe what to do AND when to use it |
license | No | License name or reference to a bundled file |
compatibility | No | Max 500 chars · system requirements (git, docker, etc.) |
metadata | No | Key-value map — this is where version goes, not at the root |
allowed-tools | No | Experimental — space-separated pre-approved tools |
Minimal valid example:
Critical rule on
name: the name must exactly match the parent directory name. A skill atpr-summary-for-stakeholders/SKILL.mdmust havename: pr-summary-for-stakeholders. Uppercase letters and consecutive hyphens (--) are invalid and will fail validation.
To version a skill, use metadata — there is no version field at the frontmatter root in the official spec:
The body is raw markdown. Step-by-step instructions, conditional logic, error handling.
Validate your skill before using it in production:
This is procedural memory made explicit, versioned, and executable.
Progressive Disclosure: How 100 Skills Don't Saturate the Context Window
The standard solves the context window problem with progressive disclosure. From the official agentskills.io spec:
Matching is done by the LLM itself, not keyword search. The description is the only thing the agent reads to decide whether to activate a skill. That is why it is the most important part — not the instructions.
Practical implications for production:
- A dense 800-line
SKILL.md→ split intoreferences/ - Happy path in the body, edge cases in
references/edge-cases.md - Heavy scripts in
scripts/— never inline them in the body
How Anthropic Implements Skills: Two Distinct Surfaces
1. Claude Code CLI Skills
Claude Code loads skills in this priority order:
| Scope | Path |
|---|---|
| User global | ~/.claude/skills/ |
| Project | .claude/skills/ (relative to repo root) |
The format follows the open standard exactly. Claude Code-specific extensions:
- Invocation control: configurable via Claude Code settings —
allow_implicit_invocation: falserequires explicit invocation via/skill-name. This parameter is in the Claude Code configuration, not in theSKILL.mdfrontmatter - Bundled skills:
/claude-api,/code-review,/batch,/debug,/loop - Sub-agents: skills can spawn sub-agents for parallel work
2. Pre-built Skills via the Claude API
A separate surface: skills hosted by Anthropic, activated via the container parameter in the Messages API. Available skills: pptx, xlsx, docx, pdf.
⚠️ Both beta headers required:
code-execution-2025-08-25ANDskills-2025-10-02. Missing either one causes a silent failure with no clear error message.
List available skills:
Note: Beta headers contain a date in their name and may change. Verify the current header at platform.claude.com/docs/en/build-with-claude/skills-guide at integration time.
How OpenAI Codex Implements Skills
Official Paths
| Scope | Path | Use Case |
|---|---|---|
| User global | ~/.codex/skills/ | Personal skills across all projects |
| Project | .codex/skills/ | Team skills versioned in the repo |
| Cross-platform standard | ~/.agents/skills/ | agentskills.io standard path — also read by Codex, Gemini CLI, and others |
Which path to use? —
~/.codex/skills/for Codex-first workflows —~/.agents/skills/for a single folder shared across Claude Code + Codex + Gemini CLI —.codex/skills/inside the repo for team skills versioned in git
Full Path Comparison by Platform
| Agent | User skills | Project skills |
|---|---|---|
| Claude Code | ~/.claude/skills/ | .claude/skills/ |
| Codex CLI | ~/.codex/skills/ | .codex/skills/ |
| Gemini CLI / Open standard | ~/.agents/skills/ | .agents/skills/ |
| Cursor | N/A | .cursor/skills/ |
VS Code / GitHub Copilot: see code.visualstudio.com/docs/copilot/customization/agent-skills — structure varies by version.
The SKILL.md file is identical across all these platforms. Only the installation path changes.
openai.yaml — Codex-only Extension
Codex supports an optional openai.yaml file inside the skill folder. Other agents ignore it.
Installation
AGENTS.md + Skills: OpenAI's Production Pattern
A critical pattern that introduction articles miss: skills alone are not enough in production. They become powerful when combined with AGENTS.md.
AGENTS.md is a file at the root of a repo. It tells Codex which rules to always follow before starting any work. This is where you make skills mandatory.
Skills that OpenAI actually uses in production in their SDK repos (public, verifiable on GitHub):
| Skill | Role |
|---|---|
code-change-verification | Runs the full verification stack (format, lint, typecheck, tests) |
docs-sync | Audits docs against the codebase — finds gaps and stale content |
examples-auto-run | Runs examples in auto mode with structured logs |
final-release-review | Compares previous tag vs current RC — GREEN / BLOCKED decision |
implementation-strategy | Decides approach and compatibility boundary before touching code |
openai-knowledge | Pulls OpenAI docs via Docs MCP — prevents hallucination |
pr-draft-summary | Generates PR title + description at handoff time |
test-coverage-improver | Identifies coverage gaps and proposes high-value tests |
The model is clear:
AGENTS.mddefines when to call a skill- The skill defines how to do the work
- The separation is intentional — one without the other is incomplete
Documented result on both OpenAI Agents SDK repos: +44% throughput in 3 months.
The Ecosystem: 26+ Platforms, One Format
As of May 2026, the SKILL.md standard is implemented by 26+ platforms:
Claude Code · OpenAI Codex · GitHub Copilot · VS Code · Gemini CLI · Cursor · Amp · Junie (JetBrains) · OpenHands · Goose · Letta · Firebender · OpenCode · OpenClaw · Autohand · Mux · Piebald · and more...
The marketplace ecosystem has emerged. Platforms like Agensi.io distribute security-scanned skills before publication. The distinction from raw GitHub skills matters in production.
Skills vs RAG vs MCP vs Fine-Tuning
These four approaches handle different types of knowledge. They are not interchangeable.
| Approach | What it gives the agent | What it doesn't give |
|---|---|---|
| Skills | Procedural knowledge — how to do tasks, in what order | Factual lookup, external API access |
| MCP | Tool access — calling external APIs and services | When to call them, how to handle results |
| RAG | Factual knowledge — relevant chunks from a knowledge base | How to do things, procedural judgment |
| Fine-tuning | Permanent weight changes — baked-in knowledge | Flexibility; expensive to redo when procedures change |
The cognitive science framework is precise:
- Semantic memory (facts about the world) → RAG, knowledge bases
- Episodic memory (past experiences) → context windows, memory systems
- Procedural memory (how to do things) → skill files
In production, these complement each other. A skill provides the judgment for when and how to use an MCP tool. RAG provides the reference material a skill may need during execution.
Real OpenAI example: the openai-knowledge skill is a wrapper around OpenAI's Docs MCP. The skill says when to call the MCP and what to do with it. MCP provides access. Skill provides judgment.
The Security Constraint Nobody Mentions
A skill with scripts has access to the filesystem, environment variables, and API keys present in the shell. That is what makes it powerful. That is also its attack surface.
The three vectors to audit before installing any third-party skill:
1. Prompt injection in SKILL.md The SKILL.md body is loaded directly into the agent's context. An attacker can inject instructions that hijack behavior — data exfiltration, bypassing system rules.
2. Tool poisoning in scripts
A scripts/setup.sh can do anything: exfiltrate ~/.ssh/, call an external URL, silently modify a config file. The agent runs it on instruction from the SKILL.md — without asking for confirmation.
3. Credential harvesting in assets
An assets/config.json that asks for an API key "for testing" and sends it to a remote endpoint.
The distinction that matters in production:
| Source | Security scan | Recommendation |
|---|---|---|
| Agensi.io | ✅ Before publication | OK with a quick audit |
github.com/openai/skills official | ✅ Maintained by OpenAI | Trustworthy |
| Third-party / community GitHub | ❌ None | Read every file before executing |
Treat skill installation like installing an npm package with scripts.postinstall: you read before you run.
What Makes a Production-Grade Skill
Three things consistently determine whether a skill works reliably in production. These are the patterns OpenAI documented from their own repos.
1. The Description Is the Most Critical Line
The description is routing metadata, not a summary. It must state: when the skill applies, what type of changes trigger it, and explicit exclusions.
OpenAI's lesson: if routing is unreliable, fix the description before adding more code to the body.
2. Model vs Scripts: The Right Split
| What belongs in the model | What belongs in scripts |
|---|---|
| Interpretation, comparison, reporting | Deterministic, repeated shell work |
| Decisions requiring context | Fixed command sequences |
| Explaining results | Parsing, validation, formatting |
Scripts behave like mini-CLIs: deterministic stdout output, explicit error codes, outputs to known file paths.
3. Failure Modes Must Be Explicit
4. Version Your Skills Like Code
A skill that silently changes in production is a silent behavior change in your agent. Version like code, revert like code.
Complete Example: Release Review Skill
Complete skill inspired by final-release-review from the OpenAI Agents SDK repos — public and verifiable on GitHub.
Structure:
SKILL.md
scripts/fetch-diff.sh
references/release-criteria.md
Validation before use
Official Resources
- Agent Skills Standard: agentskills.io — Apache 2.0 spec
- Full Specification: agentskills.io/specification
- Claude Agent Skills overview: docs.claude.com
- Claude Code skills guide: code.claude.com
- Skills in the Claude API: platform.claude.com/docs/en/build-with-claude/skills-guide
- OpenAI — Skills in production (real case): developers.openai.com/blog/skills-agents-sdk
- OpenAI Codex skills: developers.openai.com/codex/skills
- OpenAI skills catalog: github.com/openai/skills
- Codex CLI paths: agensi.io/learn/where-are-codex-cli-skills-stored
- Companion repo: github.com/Komluc/agent-skills-production
Luc K. SEGBEDZI — AI Systems Engineer & Blockchain Architect Founder, MGS (MA Group Solutions) · Lomé, Togo Builder: every article has a GitHub repo