Skills
Skills are the unified reusable capability layer in this system. Every skill lives in skills/<name>/SKILL.md; its frontmatter sorts it into one of two sub-types:
- User-invocable skills (user-invocable: true): slash-command workflows (e.g. /code-review), run under Orchestrator direction.
- Agent-loaded skills: knowledge modules an agent reads for domain expertise.
Each row's description is the skill's own frontmatter description, verbatim.
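As a rough illustration of how the frontmatter could drive the two tables, the sketch below reads each SKILL.md, classifies it by the user-invocable flag, and pulls out the description. It rests on several assumptions that the text above does not confirm: that frontmatter is delimited by `---` lines, that the fields are flat `key: value` pairs, and that a skill without `user-invocable: true` counts as agent-loaded. The function names (`read_frontmatter`, `classify`, `catalog`) are hypothetical and are not part of the plugin.

```python
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs between leading '---' fences.

    Assumption: frontmatter is simple enough that a full YAML parser
    is not needed. Multi-line or nested values are not handled here.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing fence ends the frontmatter
            break
        if line.startswith((" ", "\t")):  # skip indented continuation lines
            continue
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        fields[key.strip()] = value
    return fields


def classify(fields: dict[str, str]) -> str:
    """Return the sub-type implied by the `user-invocable` flag.

    Assumption: anything not flagged `true` is treated as agent-loaded.
    """
    if fields.get("user-invocable", "").lower() == "true":
        return "user-invocable"
    return "agent-loaded"


def catalog(skills_root: Path) -> list[tuple[str, str, str]]:
    """List (skill name, sub-type, description) for every skills/<name>/SKILL.md."""
    rows = []
    for skill_file in sorted(skills_root.glob("*/SKILL.md")):
        fields = read_frontmatter(skill_file.read_text(encoding="utf-8"))
        rows.append(
            (skill_file.parent.name, classify(fields), fields.get("description", ""))
        )
    return rows
```

Calling `catalog(Path("skills"))` would return one tuple per skill directory, which could then be split by sub-type to produce the two tables. A real implementation would likely use a proper YAML parser, since descriptions may span several lines.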
User-invocable skills
| Skill | File | Description |
|---|---|---|
| /adr-tools | adr-tools/SKILL.md | Create and manage Architecture Decision Records using the npryce adr-tools CLI. Use when the user asks to "add an ADR", "record this decision", "create an ADR", "supersede ADR N", "link ADRs", "generate the ADR table of contents", or any request involving the adr command. Pairs with the adr-author agent — this skill is the mechanics (commands, files, links); adr-author is the decision framework (when an ADR is warranted) and the prose authoring. |
| /agent-audit | agent-audit/SKILL.md | Audit code-review agents, skills, and hooks for structural compliance. Use this when adding or modifying any agent, skill, or hook file, or for a periodic health check of the toolkit. Trigger phrases: "audit the agents", "check compliance", "validate the skills", "are the agents correct", or any time agent/skill files change. |
| /agent-eval | agent-eval/SKILL.md | Run eval fixtures against review agents and grade results. Use this after adding or modifying a review agent, to validate detection accuracy, or when the user says "run the evals", "test the agents", "check for regressions", or "how accurate is the agent". |
| /agent-readiness | agent-readiness/SKILL.md | Score how ready the current repository is for AI-assisted development against the Agent-Readiness Scorecard. Use when the user asks "how agent-ready is this repo", "score this repo for agents", "agent readiness", or wants a tiered readiness report. Scores YOUR project repo's readiness — not the dev-team plugin's own review agents and routing (for that, use /harness-audit). |
| /api-design | api-design/SKILL.md | Contract-first API design for stable, evolvable interfaces. Use whenever defining a new API endpoint, inter-service boundary, or modifying an existing contract. Includes backward compatibility checklist and error contract specification. |
| /apply-fixes | apply-fixes/SKILL.md | Apply correction prompts generated by /code-review. Use this whenever the user wants to apply, fix, or action the results of a code review — phrases like "apply the fixes", "fix the issues", "apply corrections", or after /code-review has run and produced a corrections/ directory. |
| /artifact-lifecycle | artifact-lifecycle/SKILL.md | Report on skill and agent usage data from metrics/artifact-usage.json, classifying each artifact as active, stale (>= 30 days unused), or an archive candidate (>= 90 days unused). Proposes CLAUDE.md overrides for stale artifacts and exclusions for archive candidates. Pinned skills are always exempt. Use when the user asks to "review artifact lifecycle", "find stale skills", or "/artifact-lifecycle". |
| /benchmark | benchmark/SKILL.md | Capture runtime performance metrics (Core Web Vitals, resource sizes, load times) for web pages. Compare against baselines and performance budgets. Use when the user says "benchmark", "check performance", "page speed", "web vitals", "performance regression", or "how fast is this page". |
| /branch-workflow | branch-workflow/SKILL.md | Clean branch completion workflow — PR creation, merge strategy, and cleanup. Use this skill when implementation is complete and it's time to ship — after Phase 3 human gate passes. Also use when the user says "create a PR", "merge this", "ship it", "finish this branch", or asks about merge strategy. |
| /browse | browse/SKILL.md | Launch a browser to navigate URLs, take screenshots, click elements, and fill forms. Use for visual verification, e2e testing, and interactive debugging. |
| /build | build/SKILL.md | Execute an approved implementation plan using TDD. Reads the plan, implements each step with RED-GREEN-REFACTOR, runs inline review checkpoints, and produces verification evidence. Use when the user says "build this", "implement the plan", "start building", or after /plan has been approved. |
| /careful | careful/SKILL.md | Toggle careful mode. When active, destructive commands (rm -rf, force-push, DROP TABLE, etc.) are blocked instead of just warned about. |
| /cd-test-architecture | cd-test-architecture/SKILL.md | Evaluate an existing application's tests and recommend a CD-pipeline-aligned test architecture — fast, deterministic tests with minimal tooling that fully validate behavior (including cross-service interaction) and run in CI without configuring the rest of the system. Use when the user says "evaluate how this app is tested", "design a test architecture", "align our tests for CD", "make our CI tests deterministic", "our tests need the whole system configured", "our tests live in another repo / Postman / manual scripts", or asks for UI/service/batch test patterns. |
| /ci-debugging | ci-debugging/SKILL.md | Systematic CI/CD failure diagnosis with hypothesis-first approach, environment delta analysis, and anti-patterns. Use when CI fails, pipelines break, or the user says "CI is failing", "build broke", "pipeline error", or "tests pass locally but fail in CI". |
| /code-review | code-review/SKILL.md | Run all enabled review agents against target files. Use this whenever the user asks for a code review, wants feedback on their code, says "review my code", "check this before I PR", "what's wrong with this", "run the agents", or has just finished implementing a feature. Use proactively before commits and pull requests. |
| /competitive-analysis | competitive-analysis/SKILL.md | Compare this plugin against external plugins, tools, feature sets, or ideas to find gaps and weaknesses. Produces a structured gap analysis report with rough specs for closing each gap. Use this skill whenever the user references capabilities from OUTSIDE the plugin — another plugin they found, a competitor's tool, a feature list from a different project, a repo URL, or a hypothetical concept for capabilities we lack. Trigger phrases include "how do we compare to X", "what does Y have that we don't", "what are we missing", "gap analysis", "competitive analysis", "weaknesses compared to", "stack up against", "where do we fall short", and "should we add X — I saw it in another tool". Also trigger when the user pastes a feature list or describes capabilities they saw elsewhere and asks whether we should have them. Do NOT trigger for internal operations like running reviews, auditing our own agents, adding skills, threat modeling, domain analysis, or debugging — those use other skills. |
| /context-loading-protocol | context-loading-protocol/SKILL.md | Decide which agents and skills to load for a given task. Use at the start of every task to select the minimum viable context load, calculate the token budget, and stay below the 40% utilization ceiling. |
| /context-summarization | context-summarization/SKILL.md | Compress conversation history when context utilization approaches 40%. Use when too many files have been read, the conversation is long, or output quality is degrading — write a structured summary to memory/ and start a fresh context window. |
| /continue | continue/SKILL.md | Resume work from a prior session by reading phase progress files in memory/ and active plans. Use this when starting a new session on in-progress work, or when the user says "continue", "pick up where I left off", "resume", or "what was I working on". |
| /cost-report | cost-report/SKILL.md | Report actual token spend and dollar cost of dispatched work — per agent and total — and flag cost regressions. Use when the user asks "how much did that cost", "token spend", "cost of this run", "cost report", or wants to check for a cost regression after /code-review or an orchestration run. |
| /coverage-baseline | coverage-baseline/SKILL.md | Multi-workflow coverage baseline worker. Detects the repo's coverage tool from its build manifest, runs it, records the resulting line+branch percentages as the baseline, and posts the number to the parent issue (or local FEATURE.md). This number is the floor every later phase must improve on. Used by /test-upgrade (default) and /test-modernize (Phase 3), each via its own --workflow namespace. |
| /coverage-delta | coverage-delta/SKILL.md | Multi-workflow coverage delta worker. Reads the baseline coverage, re-runs the same coverage tool against the current suite, computes the delta on line+branch percentages, and posts it to the parent issue (or local FEATURE.md). Called after each Story so the operator sees coverage move with every test added. Used by /test-upgrade (default) and /test-modernize (Phase 4), each via its own --workflow namespace. |
| /design-doc | design-doc/SKILL.md | Produce a written design document in docs/specs/ with user approval before planning begins. Use this skill during the Research phase when a feature request, architectural change, or non-trivial task enters the pipeline. Ensures misunderstandings are caught before any planning or implementation work starts. Also use when the user says "brainstorm", "design", "spec", or "let's think through this". |
| /design-interrogation | design-interrogation/SKILL.md | Relentlessly interview the user about a plan, design, or feature spec to surface unresolved decisions, hidden assumptions, and edge cases. Use when the user says "grill me", "stress-test this plan", "poke holes in my design", "what am I missing", or before committing to a plan that feels under-examined. Unlike /specs (which produces artifacts) this skill produces clarity — it's a thinking tool. Also use proactively in the Research phase when a design doc has implicit decisions that need to be made explicit. |
| /design-it-twice | design-it-twice/SKILL.md | Generate multiple radically different interface designs for a module using parallel sub-agents, then compare and synthesize. Based on Ousterhout's "Design It Twice" principle. Use when the user wants to explore interface options, design an API, compare module shapes, or says "design it twice", "what are my options", or "show me alternatives". Also use when the Architect agent is designing a new module boundary or public interface. |
| /docker-image-audit | docker-image-audit/SKILL.md | Audit Docker images and Dockerfiles for security vulnerabilities, bloat, and best-practice violations using hadolint, Trivy, and Grype. Produces a structured severity report with actionable fixes. Use this skill whenever the user wants to check a Docker image for security issues, scan a container for vulnerabilities, audit a Dockerfile, harden a Docker image, reduce image size, minimize attack surface, check for CVEs in a container, or says things like "is this Dockerfile secure?", "scan my image", "check my container for vulnerabilities", "how can I make this image smaller?", "audit my Docker setup", or "harden this container". Also trigger when the user has just created or modified a Dockerfile and wants validation before shipping it. |
| /docker-image-create | docker-image-create/SKILL.md | Generate production-ready Dockerfiles from project source code. Detects language/framework automatically and produces multi-stage builds with minimal, distroless, or slim base images. Use this skill whenever the user wants to containerize an application, create a Dockerfile, dockerize a project, build a Docker image, or says things like "make this run in Docker", "create a container for this app", "I need a Dockerfile", "package this for deployment", or "containerize this service". Also trigger when the user has an existing Dockerfile and wants it rewritten for production use, or when they ask about Docker best practices for their project. |
| /domain-analysis | domain-analysis/SKILL.md | Strategic DDD health assessment of an existing system. Use whenever someone asks to analyze their architecture, assess domain health, find coupling problems, map bounded contexts, trace event flows across services, or understand what is slowing down delivery. Trigger on phrases like "what's wrong with our architecture", "where is the coupling", "assess our domain", "event storming", "value stream", "friction report", "bounded contexts", or "why is everything so tangled". Apply to existing codebases — use domain-driven-design skill for greenfield modeling. |
| /domain-driven-design | domain-driven-design/SKILL.md | Model software around the business domain. Use when designing bounded contexts, defining aggregates and value objects, mapping context relationships, or working with complex business logic. Apply before implementation to prevent model drift. |
| /exploratory-testing | exploratory-testing/SKILL.md | Charter-driven exploratory testing — probe a running feature/endpoint with structured heuristics, evaluate charter quality, run adversarial expansion, classify defects, and auto-triage critical findings into an incremental report. Use when the user runs /explore, says "explore this endpoint", "poke at this feature", "find bugs in the running app", or wants hands-off exploratory testing of a live target. |
| /explore | explore/SKILL.md | Charter-driven exploratory testing of a running feature or endpoint. Dispatches the QA Engineer in "Chaos Specialist" mode to probe with structured heuristics (Goldilocks, Happy-Path Divergence, Telemetry Deepening, Invariant Probing, CRUD Sweep), run adversarial expansion, and auto-triage critical defects into an incremental report. Use when the user says "explore this endpoint", "poke at this feature", or wants hands-off exploratory testing of a live target. |
| /farley-score | farley-score/SKILL.md | Evaluate test quality using Dave Farley's 8 properties with a weighted Farley Score. Use when reviewing test suites, after writing tests, or when the user says "score my tests", "test quality", "Farley score", or "how good are my tests". |
| /feature-file-validation | feature-file-validation/SKILL.md | Validate Gherkin feature files for structural quality, determinism, and implementation independence, then verify each scenario has matching test automation. Use this skill whenever reviewing test files, feature files, or BDD scenarios — including during /code-review when .feature files or step definition files appear in the changeset. Also use when a user asks to "check my feature files", "validate my Gherkin", "are my scenarios testable", or "do my feature files have tests". |
| /feedback-learning | feedback-learning/SKILL.md | Capture amend/learn/remember/forget keywords from the user and update agent or skill configurations. Invoke immediately when the user issues any of these trigger words — parse the change, preview a diff, apply it, and log it to the audit trail. |
| /freeze | freeze/SKILL.md | Scope-lock file editing to a specific glob pattern. Only files matching the pattern can be edited until /unfreeze is called. |
| /frontend-architecture | frontend-architecture/SKILL.md | Frontend component architecture review — dispatch the component-architecture-review agent over the frontend component files to catch reusable components that should be extracted, duplicated UI patterns, prop drilling, component-granularity problems, and inconsistent component APIs as a frontend evolves. Use when the user says "review the frontend architecture", "are my components reusable", "is this UI duplicated", "should this be a shared component", "check for prop drilling", or before extracting a component library. Advisory — it recommends, it does not edit. |
| /gherkin-derive | gherkin-derive/SKILL.md | Derive Gherkin scenarios directly from a codebase — standalone, with no prior legacy-modernization analysis. Discovers the public surface (OpenAPI, routes, existing tests, then exported signatures), recommends a BDD binding mode via the bdd-value-guide rubric, and writes .feature files plus (in bdd-runner mode) pending step-definition stubs. Use it on its own to capture intended behavior before changing tests, or as Phase 1b of /test-upgrade. Unlike /gherkin-public it needs no /test-modernize component map and creates no tracker Stories. |
| /gherkin-public | gherkin-public/SKILL.md | Author Gherkin scenarios for the entire public interface of a repository — every API endpoint, UI screen, batch-job entry point, library export, and event type — at the observable boundary, not internal steps. The scenarios become the executable specification of intended behavior before any test or production-code change lands. After the operator approves the scenarios at the Phase-2 gate, this skill also creates the Phase-4 and Phase-5 [Component tests] Stories that will bind their test code to specific scenario names — so the component tests are written from the approved Gherkin, not from the assessment. |
| /governance-compliance | governance-compliance/SKILL.md | Audit logging, quality gates, and ethics procedures for the agent team. Use for periodic compliance reviews, when logging task completion events, or when an ethical concern arises that requires human escalation. |
| /guard | guard/SKILL.md | Activate both careful mode and freeze mode together. Blocks destructive commands and scope-locks editing to the specified pattern. Use for production-critical debugging sessions. |
| /harness-audit | harness-audit/SKILL.md | Analyze review agent effectiveness, model routing, and orchestration complexity against actual usage data. Produces a report of harness components that may be candidates for simplification or removal. Use periodically to prevent harness staleness as model capabilities improve. Audits the dev-team plugin's OWN harness from runtime metrics — not your project repo's readiness (for that, use /agent-readiness). |
| /help | help/SKILL.md | List all available slash commands with their descriptions. |
| /hexagonal-architecture | hexagonal-architecture/SKILL.md | Design with ports and adapters to separate business logic from infrastructure. Use when designing a new service, reviewing structural compliance, or deciding how to introduce a new external dependency without coupling the domain. |
| /human-oversight-protocol | human-oversight-protocol/SKILL.md | Approval gates, intervention commands, and transparency requirements. Use to classify any agent action as autonomous/notify/approve, respond to override/pause/stop commands, or structure a plan review before the implementation phase begins. |
| /init-dev-team | init-dev-team/SKILL.md | Install required tools for the dev-team plugin. OS-aware (macOS, Linux, Windows Git Bash): installs jq and python3 as hard dependencies, then prompts for language selection (JS/TS, Java, C#) to install the matching mutation testing tool (Stryker, pitest, Stryker.NET). Run this when the mutation gate reports a missing tool. |
| /issues-from-assessment | issues-from-assessment/SKILL.md | Convert a /cd-test-architecture assessment into a parent + Phase-tagged child issues on the tracker the operator points at (ADO, GitHub, GitLab, Jira). Dispatches by parent URL host to the tracker's own CLI (az boards, gh, glab, acli). When no parent URL is given, or when the required CLI is not installed, falls back to local plan files under ./plans/test-modernize/ after informing the operator. |
| /issues-from-plan | issues-from-plan/SKILL.md | Break a plan into independently-grabbable GitHub issues. Use when the user says "create issues from this plan", "break this into tickets", "file issues", or wants to distribute plan steps across a team. |
| /js-project-init | js-project-init/SKILL.md | Initialize a new JavaScript project with ES modules, functional style, prettier, eslint, editorconfig, vitest, and gitignore. Use this skill whenever the user wants to start a new JS project, scaffold a Node.js app, create a new package, bootstrap a JavaScript repo, or says things like "init a new project", "set up a JS project", "create a new node app", "start a new frontend project", or "bootstrap a new package". Also trigger when the user asks to add standard JS tooling (linting, formatting, testing) to an empty or near-empty directory. |
| /legacy-code | legacy-code/SKILL.md | Safely modify code that lacks tests. Use whenever tasked with changing code without test coverage — apply characterization tests and dependency-breaking techniques before making any behavioral changes. |
| /mermaid-diagramming | mermaid-diagramming/SKILL.md | Create Mermaid diagrams using the project's blue-gray theme. Use whenever the user asks to draw a diagram, create a flowchart, visualize a process, document architecture, or add any Mermaid diagram to a markdown file. Trigger on phrases like "draw a diagram", "create a flowchart", "visualize this", "add a mermaid diagram", "document the flow", "sequence diagram", "architecture diagram", or any request to diagram a process or system. |
| /model-routing-check | model-routing-check/SKILL.md | Read-only diagnostic for effort-band model routing. Prints the effective band → model map (shipped defaults or the per-environment ladder), the ladder file (or a ready-to-edit starter when none exists), the captured session model, and the most recent routing-bump events from the resolver log. Touches no files; no side effects. |
| /mutation-testing | mutation-testing/SKILL.md | Validate test suite quality by running a real mutation testing tool and triaging surviving mutants. Use after writing tests to verify assertions catch behavioral changes, when evaluating test coverage quality, or as a CI quality gate on critical modules. The AI value here is triage — classifying survivors, writing fix tests — not generating or estimating mutations. |
| /performance-metrics | performance-metrics/SKILL.md | Log task completion data to metrics/. Use at the end of every task to record tokens, cost, agents used, rework cycles, and hallucination events. Also use for periodic reporting to identify efficiency and quality trends. |
| /plan | plan/SKILL.md | Create a structured implementation plan with goal, acceptance criteria, incremental TDD steps, and a pre-PR quality gate. Use this for tasks that need a plan but not the full three-phase orchestration, or when the user says "plan this", "make a plan", "break this down", or "how should I implement this". |
| /pr | pr/SKILL.md | Run a pre-PR quality gate (tests, typecheck, lint, code review) and then create a pull request with a structured summary. Use when the user says "create a PR", "open a PR", "submit for review", or "I'm done with this feature". |
| /quality-gate-pipeline | quality-gate-pipeline/SKILL.md | Unified quality gate for agent output — self-validation, verification evidence, and review-correction loops. Consolidates accuracy-validation, verification-before-completion, and task-review-correction into a single three-phase pipeline. Use before delivery, at completion, and during rework. |
| /quality-targets-converge | quality-targets-converge/SKILL.md | Phase-5 worker for /test-modernize. Closes the gap between the current test suite and the four quality targets (line+branch coverage ≥ 90%, zero surviving mutants, 100% deterministic, fastest pre-merge wall-clock achievable on-machine). Each iteration reads the latest measurements, picks the largest gap, and dispatches the smallest action that moves it. Stops only when all four targets are green or each gap is explicitly waived by the operator with a recorded reason. |
| /review | review/SKILL.md | Alias for /code-review. Run all enabled review agents against target files. Use this whenever the user asks for a code review, wants feedback on their code, says "review my code", "check this before I PR", "what's wrong with this", "run the agents", or has just finished implementing a feature. |
| /review-agent | review-agent/SKILL.md | Run a single named review agent against target files. Use this when the user names a specific agent (e.g. "run security-review", "check for test issues", "run js-fp-review on this file") rather than wanting the full suite. Prefer this over /code-review when only one concern is relevant or speed matters. Also used by the orchestrator for inline review checkpoints during Phase 3 implementation. |
| /review-summary | review-summary/SKILL.md | Generate a compact summary of the most recent code review results and save it for future sessions. Use this at the end of a coding session after /code-review has run, or when the user says "summarize the review", "save the results", "generate a summary", or wants to preserve review context before closing a session. |
| /semantic-duplication-scan | semantic-duplication-scan/SKILL.md | Detect business logic reimplemented in multiple architectural layers. Builds a persistent computation-register.json by annotating non-trivial computation functions with structured semantic descriptions, then clusters entries to surface duplicate domain concepts. Runs in full-scan mode on first use, incremental (git-diff-based) mode on subsequent runs. Use when the user wants to find logical duplication that linters and diff-scoped review agents miss — the same domain calculation independently reimplemented across layers. |
| /semantic-scan | semantic-scan/SKILL.md | Build a computation register and detect semantic duplicates across architectural layers. Finds business logic reimplemented multiple times in different layers — the same domain calculation independently appearing in domain services, client adapters, and presentation components. Runs incrementally (git-diff-based) after the first scan. Produces a structured duplicate report with file:line references and canonical location suggestions. |
| /semgrep-analyze | semgrep-analyze/SKILL.md | Run Semgrep static analysis on target files and return structured findings. Use this when the user wants static analysis, SAST scanning, or security scanning — phrases like "run semgrep", "scan for vulnerabilities", "static analysis on this code", or as a pre-review gate when security findings are needed before AI agents run. |
| /session-review | session-review/SKILL.md | Mine real Claude Code session transcripts to suggest plugin improvements that cut token spend, reduce re-work, and improve accuracy. Use when the user asks to "review my sessions", "where am I wasting tokens", "why does this keep re-doing work", or "/session-review". |
| /setup | setup/SKILL.md | Detect a project's tech stack and auto-generate project-level CLAUDE.md, PostToolUse hooks, and language-specific agent templates in one shot. Use this when onboarding a new project, or when the user says "setup", "bootstrap", "configure this project", or "detect my stack". |
| /ship | ship/SKILL.md | Run the full spec-to-merge pipeline as one command: spec, plan, TDD build, code review, and a PR with auto-merge — pausing at the existing human gates. Use when the user says "ship this", "take this feature end to end", "implement this issue", "we need to build", or wants the spec->plan->build->PR flow without re-assembling it each time. |
| /specs | specs/SKILL.md | Collaborative workflow for producing the three specification artifacts (intent, architecture notes, acceptance criteria) that describe a change and its goals before any implementation begins. Use when starting any new feature or behavior change — do not write code until artifacts pass the consistency gate. BDD/Gherkin scenarios are authored later, per slice, in /plan. |
| /systematic-debugging | systematic-debugging/SKILL.md | Four-phase debugging protocol (reproduce, investigate, root-cause, fix) that prevents guess-and-fix thrashing. Use this skill whenever a test fails, a bug is reported, an error occurs during implementation, or any unexpected behavior is encountered. Prevents the common LLM failure mode of guessing at fixes without understanding the problem. |
| /telemetry | telemetry/SKILL.md | Manage and report the opt-in, privacy-clean usage telemetry beacon. Use when the user asks to "enable/disable telemetry", "show telemetry", "usage stats", "which commands do I use", or "how often is the commit gate bypassed". |
| /test-audit-disable | test-audit-disable/SKILL.md | Phase-3 worker for /test-modernize. Audits the existing test suite for tests that cannot fail — no assertions, assertions on constants, expect-true, swallowed exceptions, self-equality — and disables each one by skip-and-tag (never deletes). Records each disabled test plus its reason in a JSON log under memory/test-modernize/ so Phase 4 can later repair them. Pairs with /coverage-baseline to produce a true baseline coverage number. |
| /test-design | test-design/SKILL.md | Deep test-design review: dispatch test-review (tactical quality) and test-smell-review (xUnit smells, double selection, pyramid placement) in parallel, then run the test-design-advisor skill to recommend how to test hard-to-test code. Use when the user says "review my tests", "how should I test this", "is this testable", "test design review", or before writing a suite for an untested module. Advisory — it recommends, it does not edit. |
| /test-design-advisor | test-design-advisor/SKILL.md | Advise on test design — assess testability, recommend the right test-pyramid layer and test-double strategy, and propose a behavior-preserving refactor sequence to make hard-to-test code testable. Use when the user says "how should I test this", "is this testable", "design tests for this", "what's the right test for X", or before writing tests for an untested module. |
| /test-driven-development | test-driven-development/SKILL.md | Enforce RED-GREEN-REFACTOR cycle with hard gates. Use this skill whenever writing new code, fixing bugs, or adding features — any time implementation code will be written or modified. Prevents the common LLM failure mode of writing implementation first and tests later (or never). Also use when reviewing code to verify TDD discipline was followed. |
| /test-health | test-health/SKILL.md | Project-wide test-strategy audit — derive the suite's shape and shape-vs-architecture fit, map coverage to the Agile Testing Quadrants, roll up coverage + mutation health, flag flaky tests and automation maturity, and produce an ordered improvement plan. Delegates CD-determinism + pipeline assessment to cd-test-architecture. Use when the user says "audit our tests", "how healthy is our test suite", "test strategy review", or runs /test-health. Advisory — writes a report, does not edit. |
| /test-modernize | test-modernize/SKILL.md | Modernize a legacy repository's tests for CD as one orchestrated workflow — assessment, public-interface Gherkin, disable cannot-fail tests with baseline coverage, add every test that needs no production-code refactoring, then minimum refactor-for-testability until coverage, mutation, determinism, and speed targets are met. Outputs phase issues to ADO, GitHub, GitLab, Jira, or local plans/specs files — whichever the parent issue URL resolves to (empty falls back to local files). |
| /test-upgrade | test-upgrade/SKILL.md | General-purpose analyze-then-improve test workflow for any JS/TS, Go, Java, or C# codebase. A 4-phase orchestrator: analyze with /test-health, optionally derive Gherkin with /gherkin-derive, triage into work items, implement each Story with /build + /coverage-delta + the mutation-kill agent, and validate with /quality-targets-converge. Local-first, BDD optional, Go-aware. Use when the user says "upgrade our tests", "improve test quality", or runs /test-upgrade. Lighter and general-purpose where /test-modernize is a brownfield legacy rescue. |
| /threat-modeling | threat-modeling/SKILL.md | Structured STRIDE security analysis for identifying threats, attack surfaces, and mitigations. Use before implementing any new API, service, authentication change, or data flow crossing trust boundaries — security analysis belongs in the design phase, not after. |
| /triage | triage/SKILL.md | Investigate a bug, find its root cause, and write a portable triage record to .triage/ |
| /ubiquitous-language | ubiquitous-language/SKILL.md | Build or refresh the project's ubiquitous language glossary — one markdown file per business concept at .plans/domain/<Concept>.md plus a _index.md. Mines grep-based signals (class names, enum values, interface names, domain-event names, BDD scenario names, validator rules) and applies a four-gate filter to keep only genuine business concepts. Optional interactive interview phase to refine definitions and capture behavior (state transitions, invariants, synonyms to avoid). Language-agnostic — works for JS/TS, C#, Java, Python, Go, or any mix. Use whenever the user says "build the glossary", "extract domain terms", "document the ubiquitous language", "what are the domain concepts", or when domain-review surfaces pervasive terminology inconsistency (3+ names for the same concept). |
| /unfreeze | unfreeze/SKILL.md | Lift the scope lock set by /freeze. All files become editable again. |
| /upgrade | upgrade/SKILL.md | Check for and apply plugin updates using the official Claude Code plugin update mechanism. |
| /version | version/SKILL.md | Report the installed version of the dev-team plugin. |
Agent-loaded skills
| Skill | File | Description |
|---|---|---|
| browser-testing | browser-testing/SKILL.md | Patterns and templates for browser-based QA using Playwright. Covers navigation, form interaction, screenshot capture, visual verification, and CAPTCHA/auth handoff. |
| performance-benchmark | performance-benchmark/SKILL.md | Capture runtime performance metrics (Core Web Vitals, resource sizes, load times) against defined budgets. Compare to baselines, flag regressions, and maintain trend history. Complements the code-level performance-review agent with actual runtime measurement. |
| static-analysis-integration | static-analysis-integration/SKILL.md | SARIF-first pre-pass stage for /code-review that runs available static analysis tools and normalizes their output to the unified finding envelope defined in security-primitives-contract v1.0.0. Deduplicates findings across tools and passes confirmed issues to AI agents so they can focus on semantic concerns. |