Add external runner support for quality gate reviews by nhorton · Pull Request #206 · Unsupervisedcom/deepwork

nhorton · 2026-02-10T18:25:14Z

Summary

This PR introduces configurable quality gate review modes to DeepWork, allowing users to choose between Claude CLI subprocess reviews (external runner) and agent self-review via instructions files.

Key Changes

- New --external-runner CLI option: Added to the deepwork serve command with support for "claude" mode. Defaults to None for self-review mode.
- QualityGate refactoring:
  - Made ClaudeCLI optional (can be None for self-review mode)
  - Added configurable max_inline_files parameter (5 for external runner, 0 for self-review)
  - New build_review_instructions_file() method generates markdown instructions for agent self-review
  - Updated evaluate() to raise an error if a CLI is required but not provided
- WorkflowTools dual-path quality gate handling:
  - Self-review mode: Generates a review instructions file and returns guidance telling the agent to spawn a subagent for verification
  - External runner mode: Uses the existing Claude CLI subprocess evaluation path
  - Both paths share the common review dict and output spec building logic
- Server configuration:
  - create_server() now accepts an external_runner parameter
  - Instantiates QualityGate with appropriate settings based on runner mode
  - Updated .mcp.json to use --external-runner claude by default
- File I/O: Added aiofiles import for async file writing of review instructions
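
The option and gate wiring described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `QualityGateSketch`, `make_gate`, `build_parser`, and the `Reviewer` callable are hypothetical names, and argparse is used only to keep the example self-contained (the project may use a different CLI library). Only the flag name, its single "claude" choice, the None default, and the 5/0 values for max_inline_files come from the description.

```python
import argparse
from typing import Callable, Optional, Sequence

# Hypothetical reviewer signature: (criteria, output_paths) -> passed?
Reviewer = Callable[[Sequence[str], Sequence[str]], bool]


class QualityGateSketch:
    """Illustrative stand-in for QualityGate; not the project's class."""

    def __init__(self, cli: Optional[Reviewer] = None, max_inline_files: int = 5) -> None:
        self.cli = cli  # None means self-review mode
        self.max_inline_files = max_inline_files

    def evaluate(self, criteria: Sequence[str], output_paths: Sequence[str]) -> bool:
        if not criteria:
            return True  # nothing to check, so the gate passes automatically
        if self.cli is None:
            raise RuntimeError("evaluate() needs a CLI; self-review mode has none")
        return self.cli(criteria, output_paths)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="deepwork")
    sub = parser.add_subparsers(dest="command", required=True)
    serve = sub.add_parser("serve")
    # Only "claude" is accepted; omitting the flag leaves the value as None.
    serve.add_argument("--external-runner", choices=["claude"], default=None)
    return parser


def make_gate(external_runner: Optional[str], cli: Optional[Reviewer] = None) -> QualityGateSketch:
    """Pick gate settings from the runner mode (5 inline files vs. 0)."""
    if external_runner == "claude":
        return QualityGateSketch(cli=cli, max_inline_files=5)
    return QualityGateSketch(cli=None, max_inline_files=0)
```

In this sketch, parsing `["serve"]` yields a gate with no CLI, so any non-empty criteria list makes `evaluate()` raise, which mirrors the "raise an error if a CLI is required but not provided" behavior listed above.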

Implementation Details

- Self-review instructions include output listings, quality criteria, guidance, and clear evaluation guidelines
- File embedding strategy is configurable: the external runner embeds up to 5 files inline for efficiency, while self-review always lists paths only to keep instructions concise
- Review instructions are written to .deepwork/tmp/quality_review_<session>_<step>.md
- The agent receives detailed feedback with instructions to spawn a subagent, review findings, fix issues, and retry until all criteria pass
- Backward compatible: existing code using the external runner continues to work unchanged
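
As a rough illustration of the self-review path, the sketch below renders a path-only instructions file and writes it under .deepwork/tmp/ using the naming pattern given above. The function names, section headings, and wording of the rendered markdown are assumptions made for the example; the PR's build_review_instructions_file() may be structured differently, and it writes asynchronously via aiofiles, whereas this sketch uses a plain synchronous write to stay dependency-free.

```python
from pathlib import Path
from typing import Optional, Sequence


def review_instructions_path(project_root: Path, session_id: str, step_id: str) -> Path:
    """Location pattern from the PR: .deepwork/tmp/quality_review_<session>_<step>.md"""
    return project_root / ".deepwork" / "tmp" / f"quality_review_{session_id}_{step_id}.md"


def render_review_instructions(
    output_paths: Sequence[str],
    criteria: Sequence[str],
    guidance: Optional[str] = None,
) -> str:
    """Build markdown that lists outputs by path only (no file contents)."""
    lines = ["# Quality Review Instructions", "", "## Outputs to review", ""]
    lines += [f"- {p}" for p in output_paths]
    lines += ["", "## Quality criteria", ""]
    lines += [f"{i}. {c}" for i, c in enumerate(criteria, start=1)]
    if guidance:
        lines += ["", "## Guidance", "", guidance]
    lines += [
        "",
        "## Evaluation",
        "",
        "Read each listed file and report pass or fail for every criterion.",
    ]
    return "\n".join(lines) + "\n"


def write_review_instructions(project_root: Path, session_id: str, step_id: str, text: str) -> Path:
    """Write the instructions file, creating .deepwork/tmp/ if needed."""
    path = review_instructions_path(project_root, session_id, step_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path
```

Listing paths rather than embedding contents corresponds to the max_inline_files=0 setting: the subagent reads the files itself, so the instructions stay short regardless of output size.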

Testing

Updated unit tests to reflect new initialization behavior and added external_runner parameter to test fixtures.

https://claude.ai/code/session_015Lub1RgLErD6kC6k8cEmSV

…e execution

Introduces --external-runner CLI param to the serve command that controls how quality gate reviews are executed:

- external_runner=None (default): Agent self-review mode. finished_step dumps review instructions to .deepwork/tmp/ and returns guidance for the agent to verify its own work via a subagent, then call finished_step again with quality_review_override_reason once passing.
- external_runner="claude": Claude CLI subprocess mode (existing behavior). Quality reviews are evaluated by spawning Claude as a subprocess.

Also makes the max_inline_files threshold configurable per QualityGate instance (was hard-coded as MAX_INLINE_FILES=5). Claude subprocess mode uses 5 (embed up to 5 files inline), self-review mode uses 0 (always reference files by path so the subagent reads them directly).

The installer now generates .mcp.json with --external-runner claude so Claude Code users get the subprocess review behavior by default.
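
The two finished_step paths in this commit message can be pictured with the simplified sketch below. It is an assumption-laden model, not the project's code: `StepResult`, the "COMPLETED" status string, the keyword arguments other than quality_review_override_reason, and the `run_external_review` callable are all hypothetical. Only the NEEDS_WORK status, the override parameter name, and the overall flow are taken from the PR text.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class StepResult:
    status: str  # "NEEDS_WORK" appears in the PR; "COMPLETED" is a placeholder
    feedback: str = ""


def finished_step(
    *,
    has_reviews: bool,
    external_runner: Optional[str],
    instructions_path: str,
    quality_review_override_reason: Optional[str] = None,
    run_external_review: Optional[Callable[[], Tuple[bool, str]]] = None,
) -> StepResult:
    if not has_reviews:
        # Steps without reviews skip the quality gate entirely.
        return StepResult("COMPLETED")

    if external_runner is None:
        # Self-review mode: the agent verifies via a subagent, then calls again
        # with an override reason once every criterion passes.
        if quality_review_override_reason:
            return StepResult("COMPLETED", f"Review overridden: {quality_review_override_reason}")
        return StepResult(
            "NEEDS_WORK",
            f"Spawn a subagent to follow {instructions_path}, fix any findings, "
            "then call finished_step again with quality_review_override_reason.",
        )

    # External runner mode: delegate to the subprocess-based reviewer.
    if run_external_review is None:
        raise ValueError("external runner mode requires a reviewer callable")
    passed, feedback = run_external_review()
    return StepResult("COMPLETED" if passed else "NEEDS_WORK", feedback)
```

The first self-review call always returns NEEDS_WORK here because the server cannot observe the subagent's verdict; the override reason is the agent's way of reporting that the review passed.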

Adds 45 new tests across 4 files covering the external_runner feature:

- TestConfigurableMaxInlineFiles (7 tests): QualityGate constructor with max_inline_files=0, 5, 10, None; payload behavior at each threshold
- TestEvaluateWithoutCli (2 tests): evaluate() raises without CLI, empty criteria still auto-passes
- TestBuildReviewInstructionsFile (9 tests): file structure, criteria, numbered reviews, notes, guidance, per-file listings, path-only mode
- TestExternalRunnerSelfReview (9 tests): NEEDS_WORK status, feedback content, instructions file written, criteria in file, path-only refs, file naming, override-then-complete flow, skip for reviewless steps, notes propagation
- TestExternalRunnerClaude (4 tests): evaluate_reviews called, no instructions file written, failing gate feedback, attempt tracking
- TestExternalRunnerInit (3 tests): default None, explicit value, no-gate
- TestClaudeAdapterMCPRegistration (7 tests): creates .mcp.json, includes --external-runner claude, full args, idempotent, updates old config, preserves other servers
- TestServeExternalRunnerOption (4 tests): default None, claude passthrough, invalid choice rejected, help output
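
The MCP registration properties exercised by TestClaudeAdapterMCPRegistration (creating the file, being idempotent, updating an old entry, preserving other servers) could be satisfied by logic along these lines. This is a sketch under assumptions: the server key "deepwork", the command name, and the exact argument list are guesses, since the PR text only states that the args include --external-runner claude. The top-level "mcpServers" mapping follows the usual .mcp.json layout.

```python
import json
from pathlib import Path

# Assumed entry; only "--external-runner claude" is confirmed by the PR text.
DESIRED_ENTRY = {"command": "deepwork", "args": ["serve", "--external-runner", "claude"]}


def register_deepwork_server(config: dict) -> bool:
    """Insert or update the server entry in place; return True if it changed."""
    servers = config.setdefault("mcpServers", {})
    if servers.get("deepwork") == DESIRED_ENTRY:
        return False  # already up to date, so repeated runs are no-ops
    servers["deepwork"] = dict(DESIRED_ENTRY)
    return True


def update_mcp_json(path: Path) -> bool:
    """Create or update the JSON file at `path`, leaving other servers untouched."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    changed = register_deepwork_server(config)
    if changed:
        path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return changed
```

Returning a changed flag makes the idempotency property easy to assert: a second call on the same file should report no change and leave the file contents as they were.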

claude added 3 commits February 10, 2026 17:14

style: Apply ruff formatting to new and modified files

0ea6486

nhorton wants to merge 3 commits into main from claude/add-external-runner-mcp-ZNa9C

2 participants