# skill-maker
An Agent Skill that creates other agent skills.
```
$ "Create a skill for writing git commit messages"
$ "Build a SKILL.md that helps with data pipeline validation"
$ "Package this debugging process as a skill"
```
## What it does
Skill-maker guides an AI coding agent through the full skill-creation lifecycle: intent capture, drafting a SKILL.md, running an eval loop with isolated subagents, refining based on grading signals, and optimizing the trigger description.
## The 5 Phases
1. **Capture Intent** — Clarify what the skill should do
2. **Draft** — Generate SKILL.md, scripts, references, assets
3. **Eval Loop** — Spawn subagents, grade assertions, iterate
4. **Refine** — Fix failing assertions, improve instructions
5. **Finalize** — Validate, optimize description, install
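The Draft phase produces a SKILL.md for the new skill. A minimal sketch of what that file might look like (the frontmatter fields and layout here are assumptions, not the exact output of skill-maker; the skill name and instructions are hypothetical):

```markdown
---
name: git-commit-messages
description: Helps write conventional git commit messages. Use when the user asks to commit changes or draft a commit message.
---

# Git Commit Messages

## Instructions

1. Inspect the staged changes with `git diff --cached`.
2. Choose a conventional-commit type (feat, fix, docs, refactor, chore).
3. Write a one-line summary under 72 characters, then a short body explaining why.
```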
## The Eval Loop
The core of skill-maker. Each iteration:

- spawns isolated subagents per test case
- grades assertions with bundled Bun TypeScript scripts
- aggregates results into a benchmark
- repeats until pass_rate plateaus (delta < 2% for 3 consecutive runs)
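The stopping rule above can be sketched as follows. This is a hypothetical helper, not one of the bundled scripts; `hasPlateaued` and its defaults are illustrative:

```typescript
// Plateau check: stop iterating once the pass-rate improvement stays
// below `delta` (2%) for `runs` (3) consecutive runs.
// `passRates` holds one aggregate pass rate per eval-loop iteration, 0..1.
function hasPlateaued(passRates: number[], delta = 0.02, runs = 3): boolean {
  // Need runs + 1 data points to observe `runs` consecutive deltas.
  if (passRates.length < runs + 1) return false;
  const recent = passRates.slice(-(runs + 1));
  for (let i = 1; i < recent.length; i++) {
    if (recent[i] - recent[i - 1] >= delta) return false; // still improving
  }
  return true;
}

console.log(hasPlateaued([0.2, 0.6, 0.9]));                      // false: still improving
console.log(hasPlateaued([0.2, 0.6, 0.9, 0.905, 0.91, 0.912])); // true: three small deltas
```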
## Benchmark
Evaluated across 9 skills and 189 assertions, comparing with-skill vs. without-skill subagent pairs.
- **100%** pass rate with skill
- **+73.6%** avg improvement
- **2.4** avg iterations
| Skill | Baseline pass rate | Delta |
|---|---|---|
| database-migration | 4.2% | +95.8% |
| pdf-toolkit | 4.2% | +95.8% |
| error-handling | 8.3% | +91.7% |
| api-doc-generator | 16.7% | +83.3% |
| pr-description | 20.8% | +79.2% |
| changelog-generator | 20.8% | +79.2% |
| monitoring-setup | 26.1% | +73.9% |
| code-reviewer | 41.7% | +58.3% |
| git-conventional-commits | 72.3% | +27.7% |
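Each Delta is simply the with-skill pass rate minus the baseline. A sketch of that calculation over a hypothetical benchmark shape (the `SkillResult` interface is an assumption, not the actual benchmark.json schema):

```typescript
// Hypothetical per-skill benchmark entry; field names are illustrative.
interface SkillResult {
  skill: string;
  baseline: number;  // pass rate without the skill, 0..1
  withSkill: number; // pass rate with the skill, 0..1
}

const results: SkillResult[] = [
  { skill: "database-migration", baseline: 0.042, withSkill: 1.0 },
  { skill: "code-reviewer", baseline: 0.417, withSkill: 1.0 },
];

for (const r of results) {
  // Delta in percentage points, matching the table above.
  const delta = ((r.withSkill - r.baseline) * 100).toFixed(1);
  console.log(`${r.skill}: +${delta}%`);
}
```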
All skills reach 100% pass rate after the eval loop. See examples/README.md for convergence charts, timing data, and per-skill breakdowns.
## Install

```shell
# Clone the repo and install the skill
git clone https://github.com/accolver/skill-maker.git
cd skill-maker
mkdir -p ~/.agents/skills
cp -r skill-maker ~/.agents/skills/skill-maker
```