Add external runner support for quality gate reviews #206
Open
…e execution

Introduces --external-runner CLI param to the serve command that controls how quality gate reviews are executed:

- external_runner=None (default): Agent self-review mode. finished_step dumps review instructions to .deepwork/tmp/ and returns guidance for the agent to verify its own work via a subagent, then call finished_step again with quality_review_override_reason once passing.
- external_runner="claude": Claude CLI subprocess mode (existing behavior). Quality reviews are evaluated by spawning Claude as a subprocess.

Also makes the max_inline_files threshold configurable per QualityGate instance (was hard-coded as MAX_INLINE_FILES=5). Claude subprocess mode uses 5 (embed up to 5 files inline), self-review mode uses 0 (always reference files by path so the subagent reads them directly).

The installer now generates .mcp.json with --external-runner claude so Claude Code users get the subprocess review behavior by default.

https://claude.ai/code/session_015Lub1RgLErD6kC6k8cEmSV
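The threshold behavior described in this commit can be sketched roughly as follows. This is an illustration only, not the code from the PR: `max_inline_files_for` and `render_outputs` are hypothetical helpers, and the output layout is an assumption. Only the values 5 and 0 and the "embed up to N files, otherwise reference by path" rule come from the commit message.

```python
from typing import Optional

# Values stated in the commit message.
CLAUDE_MAX_INLINE_FILES = 5        # Claude subprocess mode
SELF_REVIEW_MAX_INLINE_FILES = 0   # agent self-review mode


def max_inline_files_for(external_runner: Optional[str]) -> int:
    """Pick the inline threshold for a review mode (hypothetical helper)."""
    if external_runner == "claude":
        return CLAUDE_MAX_INLINE_FILES
    if external_runner is None:
        return SELF_REVIEW_MAX_INLINE_FILES
    raise ValueError(f"unsupported external runner: {external_runner!r}")


def render_outputs(files: dict, max_inline_files: int) -> str:
    """Embed file contents when the file count is within the threshold;
    otherwise list paths only so the reviewer reads the files directly.
    The text layout here is an assumption, not DeepWork's actual format."""
    if files and len(files) <= max_inline_files:
        return "\n".join(f"--- {path} ---\n{body}" for path, body in files.items())
    return "\n".join(f"- {path}" for path in files)
```

With a threshold of 0 the first branch can never be taken, which matches the "always reference files by path" behavior described for self-review mode.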
Adds 45 new tests across 4 files covering the external_runner feature:

- TestConfigurableMaxInlineFiles (7 tests): QualityGate constructor with max_inline_files=0, 5, 10, None; payload behavior at each threshold
- TestEvaluateWithoutCli (2 tests): evaluate() raises without CLI, empty criteria still auto-passes
- TestBuildReviewInstructionsFile (9 tests): file structure, criteria, numbered reviews, notes, guidance, per-file listings, path-only mode
- TestExternalRunnerSelfReview (9 tests): NEEDS_WORK status, feedback content, instructions file written, criteria in file, path-only refs, file naming, override-then-complete flow, skip for reviewless steps, notes propagation
- TestExternalRunnerClaude (4 tests): evaluate_reviews called, no instructions file written, failing gate feedback, attempt tracking
- TestExternalRunnerInit (3 tests): default None, explicit value, no-gate
- TestClaudeAdapterMCPRegistration (7 tests): creates .mcp.json, includes --external-runner claude, full args, idempotent, updates old config, preserves other servers
- TestServeExternalRunnerOption (4 tests): default None, claude passthrough, invalid choice rejected, help output

https://claude.ai/code/session_015Lub1RgLErD6kC6k8cEmSV
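The properties that TestClaudeAdapterMCPRegistration checks (creation, idempotence, updating an old config, preserving other servers) could be satisfied by something like the sketch below. It is not the adapter code from this PR. The server name `deepwork`, the `command` value, and the exact argument list are assumptions; only the `--external-runner claude` arguments come from the commit messages. The `mcpServers` layout follows the usual `.mcp.json` convention.

```python
import json
from pathlib import Path

SERVER_NAME = "deepwork"  # hypothetical server key
SERVER_ENTRY = {
    "command": "deepwork",  # hypothetical command name
    "args": ["serve", "--external-runner", "claude"],
}


def register_mcp_server(project_dir: Path) -> bool:
    """Create or update .mcp.json; return True if the file was changed."""
    config_path = project_dir / ".mcp.json"
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    if servers.get(SERVER_NAME) == SERVER_ENTRY:
        return False  # already up to date: idempotent no-op
    # Only our own entry is replaced, so other servers are preserved.
    servers[SERVER_NAME] = dict(SERVER_ENTRY)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return True
```

Comparing the whole entry, rather than just checking for the key, is what lets an old config without the flag be upgraded in place.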
Summary
This PR introduces configurable quality gate review modes to DeepWork, allowing users to choose between Claude CLI subprocess reviews (external runner) and agent self-review via instructions files.
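As a rough illustration of the option's behavior (default `None`, `claude` accepted, anything else rejected), here is a minimal sketch using the standard library's `argparse`. DeepWork's actual CLI framework and help text are not shown in this PR page, so the parser layout and the help string are assumptions.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser with a `serve` subcommand (illustrative only)."""
    parser = argparse.ArgumentParser(prog="deepwork")
    sub = parser.add_subparsers(dest="command", required=True)
    serve = sub.add_parser("serve")
    serve.add_argument(
        "--external-runner",
        choices=["claude"],  # the only runner named in this PR
        default=None,        # None selects agent self-review mode
        help="Run quality gate reviews via an external runner.",
    )
    return parser


def parse_external_runner(argv: Sequence[str]) -> Optional[str]:
    """Return the selected runner, or None for self-review mode."""
    return build_parser().parse_args(argv).external_runner
```

For example, `parse_external_runner(["serve"])` returns `None`, and `parse_external_runner(["serve", "--external-runner", "claude"])` returns `"claude"`. An unknown value makes `argparse` exit with a usage error.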
Key Changes
- New `--external-runner` CLI option: Added to `deepwork serve` command with support for `"claude"` mode. Defaults to `None` for self-review mode.
- QualityGate refactoring:
  - `ClaudeCLI` is now optional (can be `None` for self-review mode)
  - `max_inline_files` parameter (5 for external runner, 0 for self-review)
  - `build_review_instructions_file()` method generates markdown instructions for agent self-review
  - `evaluate()` raises an error if the CLI is required but not provided
- WorkflowTools dual-path quality gate handling:
- Server configuration:
  - `create_server()` now accepts `external_runner` parameter
  - Installer generates `.mcp.json` to use `--external-runner claude` by default
- File I/O: Added `aiofiles` import for async file writing of review instructions

Implementation Details
- Review instructions file: `.deepwork/tmp/quality_review_<session>_<step>.md`

Testing
Updated unit tests to reflect new initialization behavior and added `external_runner` parameter to test fixtures.

https://claude.ai/code/session_015Lub1RgLErD6kC6k8cEmSV
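The review instructions file used in self-review mode might be assembled along these lines. This is a sketch under stated assumptions. The path pattern `.deepwork/tmp/quality_review_<session>_<step>.md` comes from the PR description, but the function names, the markdown layout, and the use of synchronous `pathlib` writes (the PR itself mentions `aiofiles`) are illustrative choices.

```python
from pathlib import Path
from typing import List, Optional


def review_instructions_path(project_root: Path, session_id: str, step_id: str) -> Path:
    """Build the instructions file path from the pattern in the PR description."""
    return project_root / ".deepwork" / "tmp" / f"quality_review_{session_id}_{step_id}.md"


def build_review_instructions(
    criteria: List[str], file_paths: List[str], notes: Optional[str] = None
) -> str:
    """Render a markdown checklist (hypothetical layout, not DeepWork's)."""
    lines = ["# Quality Review Instructions", "", "## Criteria"]
    lines += [f"{i}. {criterion}" for i, criterion in enumerate(criteria, 1)]
    # Self-review mode references files by path instead of embedding them.
    lines += ["", "## Files to review"]
    lines += [f"- {path}" for path in file_paths]
    if notes:
        lines += ["", "## Notes", notes]
    return "\n".join(lines) + "\n"


def write_review_instructions(path: Path, content: str) -> Path:
    """Write the file synchronously, creating .deepwork/tmp/ if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)
    return path
```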