Skip to content

Agentic Dev Team

Docs

📖 Documentation

Three Claude Code plugins for engineering workflows. Install one or all.

  • dev-team gives Claude Code a full persona-driven development team: an Orchestrator that routes tasks, specialist agents (engineer, QA, architect, reviewers…), skills that encode reusable knowledge, and the four-command feature workflow /specs → /plan → /build → /pr.
  • security-assessment is the security companion. It adds a deterministic-first /security-assessment pipeline (SAST + LLM judgment + FP-reduction + exec report), a /cross-repo-analysis command for multi-repo attack chains, and an adversarial ML red-team harness (/redteam-model) for self-owned model endpoints.
  • marketplace-dev is the plugin-author's toolkit. It scaffolds new plugins and marketplaces (/scaffold-plugin, /scaffold-marketplace, /init-plugin-eval), audits any plugin for structural compliance (/plugin-audit), advises on the markdown-vs-script agent decision (/agent-type-advisor), and ships the migrated agent/skill authoring toolkit (/agent-create, /agent-add, /agent-remove).

dev-team is the hub: it owns the primitives contract (codebase-recon, ACCEPTED-RISKS.md, unified finding envelope) that security-assessment builds on, so install dev-team first and add security-assessment when you need it. marketplace-dev is independent — it has no hard runtime dependency on dev-team and can be installed on its own to build or maintain plugins.

Plugins

Plugin What it does Key commands Required tools Optional tools
dev-team Persona-driven development team, reviewer swarm, TDD-gated build loop /specs, /plan, /build, /pr, /code-review, /triage jq, gh semgrep, playwright, hadolint/trivy/grype; auto-detected formatters and test/type/lint runners
security-assessment Tool-first security assessment + red-team pipeline /security-assessment, /cross-repo-analysis, /redteam-model, /export-pdf dev-team, Python ≥ 3.10, tier-1 SAST (semgrep, gitleaks, trivy, hadolint, actionlint) grype, PDF-export deps
marketplace-dev Scaffold, audit, and maintain Claude Code plugins and marketplaces /scaffold-plugin, /scaffold-marketplace, /plugin-audit, /agent-type-advisor, /agent-create jq git

Plugin names link to each plugin's README (or, for marketplace-dev, its CLAUDE.md guide), where the full tool list and per-tool install commands live. Claude Code itself is assumed. First time here? Start with dev-team; add security-assessment only when you run full /security-assessment pipelines against target repos, and marketplace-dev when you're building or maintaining plugins.

Getting Started

New here? The Getting Started guide is the full walkthrough — installing each plugin, configuring a project, the day-to-day workflow, and the diagnostic commands.

Quick install of the core plugin:

claude plugin marketplace add bdfinst/agentic-dev-team
claude plugin install dev-team@bfinster

Then run /init-dev-team and /setup in your project. Optional plugins (security-assessment, marketplace-dev), self-hosted git hosts, install scopes, and the /upgrade flow are all covered in the Getting Started guide.

Dev team workflow

Four commands drive feature development from idea to pull request:

/specs  →  /plan  →  /build  →  /pr
Step Command What it does
1. Specify /specs Describe the change and its goals — Intent, Architecture notes, Acceptance Criteria. A consistency gate must pass before moving on. Skip for bug fixes, refactors, or trivial changes.
2. Plan /plan Decompose the feature into vertical slices, author each slice's Gherkin scenarios, and lay out the TDD steps that satisfy them. Four plan-review personas (Acceptance Test, Design, UX, Strategic critics) challenge the plan before the human sees it. Human approves before any code is written.
3. Build /build Execute the approved plan slice by slice. Each step follows RED-GREEN-REFACTOR with inline review checkpoints (spec-compliance first, then quality agents). Produces verification evidence.
4. Ship /pr Run quality gates (tests, typecheck, lint, code review) and open a pull request.

Each step produces artifacts the next step consumes. The spec describes what and why; the plan turns that into per-slice behavioral contracts (Gherkin) and how. Human review gates sit between transitions.

Workflow: specs → plan → build → pr

For bug fixes or simple tasks, skip /specs and start at /plan — or go straight to implementation.

Supporting commands

Command When to use
/code-review Run review agents, auto-fix actionable issues, re-run until clean (up to 5 iterations)
/continue Resume an in-progress build or plan across sessions
/browse Visual QA via Playwright
/benchmark Runtime performance metrics (Core Web Vitals, resource sizes) against baselines
/careful / /freeze / /guard Safety modes for production-critical sessions
/triage Investigate a bug and file a GitHub issue with a TDD fix plan

Automated pre-commit review

Every git commit is automatically gated by /code-review. A PreToolUse hook detects commit attempts and blocks them until a passing review exists for the exact set of staged files.

Flow: attempt commit → hook blocks → Claude runs /code-review → if pass/warn, a .review-passed gate file is written → next commit attempt succeeds.

Bypass: git commit --no-verify skips the review gate.

Security assessment pipeline

/security-assessment <path> runs a six-phase pipeline against one or more target repos. Deterministic tools do the detection; LLM agents handle the judgment stages.

Phase Runs Output
0. Recon codebase-recon agent memory/recon-<slug>.{json,md}
1. Tool-first detection semgrep, gitleaks, trivy, hadolint, actionlint, custom rulesets unified findings stream
1b. Judgment security-review, business-logic-domain-review agents appended findings
1c. Suppression ACCEPTED-RISKS.md gate (deterministic) filtered stream + audit log
2. FP-reduction 5-stage rubric (reachability, environment, controls, dedup, severity) disposition register
2b. Severity floors deterministic domain-class calibration floor-adjusted scores
3. Narrative + compliance tool-finding-narrative-annotator, compliance-mapping skill 4-domain narrative + compliance JSON
4. Cross-repo service-comm parser, shared-cred hash match (multi-target only) mermaid diagram + SARIF
5. Exec report exec-report-generator agent publication-ready 7-section markdown

Zero-install flow: scripts/run-assessment-local.sh runs the same pipeline from the repo checkout without installing the plugin. Auto-detects the claude CLI; degrades to deterministic-only when absent. See the user guide for the full runbook.

Adversarial ML red-team: /redteam-model probes a self-owned model endpoint (localhost / RFC1918 by default; public targets require a signed authorization.md). Eight probes covering recon, evasion, extraction, and report synthesis.


Contributing

Developing, testing, or releasing the plugins? See CONTRIBUTING.md — local-dev setup (including live installs via symlinks), the /agent-eval and /agent-audit test commands, the security comparative-testing harness, how to add agents and skills, and the release process.

Documentation

Start here

Doc Covers
Getting Started User tutorial — the workflow, suggested skills, worked examples
Contributing Local development, testing, adding agents/skills, releasing
Plugin Development Guide Project North Star, repo structure, working rules

dev-team — architecture

Doc Covers
Plugin README Install, prerequisites, optional tools, quality gates
Orchestration Pipeline Three-phase workflow, registries, context management
Architecture Context management, quality assurance, governance, model routing
Team Structure Org chart, team + review-agent dispatch diagrams
Agents Agent roster, persona template, adding/removing/customizing

dev-team — workflows & skills

Doc Covers
Workflows /ship and /test-modernize orchestration with human gates
Code Review Process /code-review end-to-end flow and the agents it dispatches
Skills & Commands Skills catalog and slash-command reference
Session Review Mining session transcripts for improvement suggestions
Session Review — OSS complements OSS tools that complement the session-review loop

dev-team — model routing

Doc Covers
Model Routing Environment-aware effort-band → model resolution, defaults, ladder, troubleshooting
Model Routing — Overrides Authoring .claude/model-ladder.json: schema, precedence, worked ladders, migration guarantee

dev-team — evaluation & quality

Doc Covers
Eval System How review-agent accuracy is measured and graded
Eval Running Guide Running the eval fixtures
Eval Maintenance Maintaining fixtures, catching regressions, tracking accuracy
Test Evaluation Test-evaluation procedures

dev-team — operations & observability

Doc Covers
Concurrent Use Worktree isolation and concurrent build strategy
CodeGraph Nudge The hook that nudges CodeGraph over multi-file Read/Grep
Telemetry — CI access Giving CI read-only access to the telemetry repo
Telemetry — repo security How machines write digests to the telemetry repo

security-assessment plugin

Doc Covers
Plugin README Design philosophy, install, when to use vs. /code-review
Plugin Architecture Hooks, SARIF orchestration, LLM-safety bounds, red-team targets
User Guide Path-A (plugin) vs. Path-B (zero-install) runbook, tool matrix
Accepted Risks Format ACCEPTED-RISKS.md schema and format
Comparative Testing Fixture repo, ground truth, scoring methodology

marketplace-dev plugin

Doc Covers
Plugin Guide Slash commands, the structural review agent, conventions enforced, eval fixtures
Agent-type Decision Rules The markdown-vs-script decision matrix (rules R1–R10)
Marketplace Builder Playbook The conventions marketplace-dev encodes — layout, frontmatter contracts, release/catalog sync

Repo-level guides & decision records

Doc Covers
Marketplace Builder Playbook Building a plugin that scaffolds/audits marketplace monorepos
Plugin Skills in the Web Environment Running a plugin's skills from a Claude Code web session
Architecture Decision Records Indexed ADRs — model routing, knowledge indexing, eval tiers, integration topology
Design Specs Effort-band model routing design specification

CodeGraph

This repository uses CodeGraph for semantic code intelligence.