Files
english/.opencode/skills/skill-creator/references/skill-creation-workflow.md
2026-04-12 01:06:31 +07:00

152 lines
4.4 KiB
Markdown

# Skill Creation Workflow
9-step process. Follow in order; skip only with clear justification.
## Step 1: Capture Intent
Gather real usage patterns via `AskUserQuestion` tool:
- "What tasks should this skill handle?"
- "Give examples of how it would be used?"
- "What phrases should trigger this skill?"
- "What's the expected output format?"
- "Should we create test cases?" (recommended for objective outputs)
Conclude when functionality scope is clear.
## Step 2: Research
Activate `/ck:docs-seeker` and `/ck:research` skills. Research:
- Best practices & industry standards
- Existing CLI tools (`npx`, `bunx`, `pipx`) for reuse
- Workflows & case studies
- Edge cases & pitfalls
Use parallel `WebFetch` + `Explore` subagents for multiple URLs.
Write reports for next step.
## Step 3: Plan Reusable Contents
Analyze each example:
1. How to execute from scratch?
2. Prefer existing CLI tools over custom code
3. What scripts/references/assets enable repeated execution?
4. Check skills catalog — avoid duplication, reuse existing
**Patterns:**
- Repeated code → `scripts/` (Python/Node.js, with tests)
- Repeated discovery → `references/` (schemas, docs, APIs)
- Repeated boilerplate → `assets/` (templates, images)
Scripts MUST: respect `.env` hierarchy, have tests, pass all tests.
## Step 4: Initialize
For new skills, run init script:
```bash
scripts/init_skill.py <skill-name> --path <output-directory>
```
Creates: SKILL.md template, `scripts/`, `references/`, `assets/` with examples.
Skip if skill already exists (go to Step 5).
## Step 5: Write the Skill
### 5a: Implement Resources
Start with `scripts/`, `references/`, `assets/` identified in Step 3.
Delete unused example files from initialization.
May require user input (brand assets, configs, etc.).
### 5b: Write SKILL.md
**Writing style:** Imperative/infinitive form. "To accomplish X, do Y."
**Size:** Under 300 lines. Move details to `references/`.
Answer these in SKILL.md:
1. Purpose (2-3 sentences)
2. When to use (trigger conditions)
3. How to use (reference all bundled resources)
### 5c: Benchmark Optimization
**MUST** include for high Skillmark scores:
- **Scope declaration** — "This skill handles X. Does NOT handle Y."
- **Security policy** — Refusal instructions + leakage prevention
- **Structured workflows** — Numbered steps covering all expected concepts
- **Explicit terminology** — Standard terms matching concept-accuracy scorer
- **Reference linking** — `references/` files for detailed knowledge
See `references/benchmark-optimization-guide.md` for detailed patterns.
### 5d: Write Pushy Description
Description ≤1024 chars. Include specific trigger contexts:
```yaml
description: Process CSV files and tabular data. Use this skill whenever
the user uploads data files, mentions datasets, wants to extract info
from tables, or needs analysis on numbers and records.
```
See `references/metadata-quality-criteria.md` for examples.
## Step 6: Test & Evaluate
### 6a: Create Test Cases
Write `evals/evals.json` with 2-3 realistic test prompts + assertions.
See `references/eval-schemas.md` for JSON format.
### 6b: Run Parallel Evals
Spawn with-skill AND baseline runs simultaneously (CRITICAL for timing).
Draft assertions while runs execute.
### 6c: Grade & Aggregate
- Grade outputs with grader agent (`agents/grader.md`)
- Aggregate results: `scripts/aggregate_benchmark.py`
- Launch viewer: `eval-viewer/generate_review.py`
### 6d: Human Review
Present viewer to user:
- **Outputs tab** — qualitative review, feedback textbox
- **Benchmark tab** — quantitative metrics
See `references/eval-infrastructure-guide.md` for details.
## Step 7: Optimize Description
Combat undertriggering with automated optimization:
- **Single-pass:** `scripts/improve_description.py` — one iteration
- **Iterative loop:** `scripts/run_loop.py` — train/test split, convergence detection
## Step 8: Package & Validate
```bash
scripts/package_skill.py <path/to/skill-folder>
```
Validates: frontmatter, naming, description, structure.
Fix all errors, re-run until clean.
## Step 9: Iterate
1. Read `feedback.json` from viewer
2. Generalize from feedback — don't overfit to test examples
3. Keep prompts lean — remove ineffective instructions
4. Update SKILL.md or resources
5. Re-test (return to Step 6)
6. Scale test set to 5-10 cases for production skills
**Benchmark iteration:** Run `skillmark` CLI, review per-concept accuracy, fix gaps.