-
Notifications
You must be signed in to change notification settings - Fork 189
Add PPL bugfix skill for Claude Code #5307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
qianheng-aws
wants to merge
10
commits into
opensearch-project:main
Choose a base branch
from
qianheng-aws:bugfix-skill
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
a5fd530
Add PPL bugfix harness and /ppl-bugfix slash command
qianheng-aws d4b9b01
Improve PPL bugfix command: permission modes, parallel dispatch, file…
qianheng-aws b9584ee
Fix follow-up harness to check PR comments including bot feedback
qianheng-aws 57bfdb0
Address review feedback from bot comments on PR #5307
qianheng-aws 638f0ce
Tighten git permissions: restrict push and reset to safe variants
qianheng-aws 6b29b60
Add retrospective step to follow-up harness
qianheng-aws ce54987
Streamline ppl-bugfix workflow: fix 5 cross-cutting issues
qianheng-aws 45bbb51
Phase 4: clarify that harness improvements go in the same PR
qianheng-aws 42caa29
Enforce hard stop when bug is already fixed on main
qianheng-aws e2ff864
Streamline ppl-bugfix harness and CLAUDE.md after eval
qianheng-aws File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,125 @@ | ||
| --- | ||
| allowed-tools: Agent, Read, Bash(gh:*), Bash(git:*) | ||
| description: Run the PPL bugfix harness for a GitHub issue or follow up on an existing PR | ||
| --- | ||
|
|
||
| Fix a PPL bug or follow up on an existing PR using the harness in `.claude/harness/ppl-bugfix-harness.md`. | ||
|
|
||
| ## Input | ||
|
|
||
| Accepts one or more issue/PR references. Multiple references are processed in parallel (each gets its own subagent + worktree). | ||
|
|
||
| - `/ppl-bugfix #1234` — single issue | ||
| - `/ppl-bugfix PR#5678` — single PR | ||
| - `/ppl-bugfix #1234 #5678 PR#9012` — multiple in parallel | ||
| - `/ppl-bugfix https://github.com/opensearch-project/sql/issues/1234` — URL | ||
|
|
||
| Optional mode flag (append to any of the above): | ||
| - `--safe` — `acceptEdits` mode. Auto-approve file edits only, Bash commands require manual approval. (Most conservative) | ||
| - `--yolo` — `bypassPermissions` mode. Fully trusted, no prompts. Subagent runs in an isolated worktree so this is safe. (Default) | ||
|
|
||
| > **Note**: `bypassPermissions` skips the interactive prompt but still respects the allow-list in `~/.claude/settings.json`. Ensure git/gh write commands are in the global allow-list. | ||
|
|
||
| Examples: | ||
| - `/ppl-bugfix #1234` — single issue, defaults to yolo | ||
| - `/ppl-bugfix #1234 #5678 --yolo` — two issues in parallel | ||
| - `/ppl-bugfix PR#5293 PR#5300` — two PRs in parallel | ||
| - `/ppl-bugfix #1234 PR#5678 --safe` — mix of issue and PR | ||
|
|
||
| If no argument given, ask for an issue or PR number. | ||
|
|
||
| ## Step 0: Resolve Permission Mode | ||
|
|
||
| Parse the mode flag from the input arguments: | ||
|
|
||
| | Flag | Mode | | ||
| |------|------| | ||
| | `--safe` | `acceptEdits` | | ||
| | `--yolo` | `bypassPermissions` | | ||
| | _(no flag)_ | `bypassPermissions` (default) | | ||
|
|
||
| Use the resolved mode as the `mode` parameter when dispatching the subagent in Step 2A/2B. | ||
|
|
||
| ## Step 1: Resolve Each Reference | ||
|
|
||
| For each issue/PR reference in the input, resolve its state. Run these lookups in parallel when there are multiple references. | ||
|
|
||
| ```bash | ||
| # Issue → PR (check multiple closing keyword variants) | ||
| gh pr list --search "Resolves #<issue_number>" --json number,url,state --limit 5 | ||
| gh pr list --search "Fixes #<issue_number>" --json number,url,state --limit 5 | ||
| gh pr list --search "Closes #<issue_number>" --json number,url,state --limit 5 | ||
|
|
||
| # PR → Issue | ||
| gh pr view <pr_number> --json body | jq -r '.body' | grep -oiE '(resolves|fixes|closes) #[0-9]+' | grep -oE '[0-9]+' | ||
| ``` | ||
|
|
||
| | State | Action | | ||
| |-------|--------| | ||
| | Issue exists, no PR | **Initial Fix** (Step 2A) | | ||
| | Issue exists, open PR found | **Follow-up** (Step 2B) | | ||
| | PR provided directly | **Follow-up** (Step 2B) | | ||
|
|
||
| ## Step 2: Dispatch Subagents | ||
|
|
||
| Dispatch one subagent per reference. When there are multiple references, dispatch all subagents in a single message (parallel execution). | ||
|
|
||
| ### 2A: Initial Fix | ||
|
|
||
| ``` | ||
| Agent( | ||
| mode: "<resolved_mode>", | ||
| isolation: "worktree", | ||
| name: "bugfix-<issue_number>", | ||
| description: "PPL bugfix #<issue_number>", | ||
| prompt: "Read .claude/harness/ppl-bugfix-harness.md and follow it to fix GitHub issue #<issue_number>. | ||
| Follow Phase 0 through Phase 3 in order. | ||
| Phase 0.3 defines TDD execution flow. Do NOT skip any phase. | ||
| CRITICAL: If Phase 0.1 determines the bug is already fixed on main, HARD STOP. | ||
| Do NOT write tests, do NOT create a PR — just comment/close the issue and report back. | ||
| If the bug IS reproducible, post the Decision Log (Phase 3.4) before completing." | ||
| ) | ||
| ``` | ||
|
|
||
| ### 2B: Follow-up | ||
|
|
||
| ``` | ||
| Agent( | ||
| mode: "<resolved_mode>", | ||
| isolation: "worktree", | ||
| name: "bugfix-<issue_number>", | ||
| description: "PPL bugfix #<issue_number> followup", | ||
| prompt: "Read .claude/harness/ppl-bugfix-followup.md and follow it. | ||
| PR: <pr_number> (<pr_url>), Issue: #<issue_number>" | ||
| ) | ||
| ``` | ||
|
|
||
| ## Step 3: Report Back | ||
|
|
||
| After all subagents complete, report a summary for each: | ||
| - Classification, fix summary, PR URL, worktree path and branch, items needing human attention (2A) | ||
| - What was addressed, current PR state, whether another round is needed (2B) | ||
|
|
||
| ## Subagent Lifecycle | ||
|
|
||
| Subagents are task-scoped. They complete and release context — they cannot poll for events. | ||
|
|
||
| ``` | ||
| Agent A (Phase 0-3) → creates PR → completes | ||
| (CI runs, reviewers comment, conflicts arise) | ||
| Agent B (Phase 3.5) → handles feedback → completes | ||
| (repeat as needed) | ||
| Agent N (Phase 3.5) → gh pr ready → done | ||
| ``` | ||
|
|
||
| Context is preserved across agents via: | ||
| - **Decision Log** (PR comment) — single source of truth for rejected alternatives, pitfalls, design rationale | ||
| - **GitHub state** (PR diff, review comments, CI logs) — reconstructed by each follow-up agent | ||
|
|
||
| ## Rules | ||
|
|
||
| - Subagent reads `.claude/harness/ppl-bugfix-harness.md` and fetches issue/PR details itself — do NOT inline content into the prompt | ||
| - If bug is not reproducible (Phase 0.1), stop and report — do not proceed | ||
| - Issue ↔ PR auto-resolution means the user never needs to track PR numbers manually | ||
| - **Do NOT use `mode: "auto"` for subagents** — `auto` mode does not work for subagents; Bash commands still require manual approval. Only `bypassPermissions` reliably skips permission checks. | ||
| - **Always dispatch subagent** — even for trivial follow-ups (remove co-author, force push). Do NOT run commands directly in the main session; subagents with `bypassPermissions` skip permission prompts, the main session does not. | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,103 @@ | ||
| # PPL Bugfix Follow-up | ||
|
|
||
| ## Rules | ||
|
|
||
| - Do NOT add `Co-Authored-By` lines in commits — only DCO `Signed-off-by` | ||
|
|
||
| --- | ||
|
|
||
| ## Reconstruct Context | ||
|
|
||
| The follow-up agent runs in a fresh worktree. First checkout the PR branch, then load state: | ||
|
|
||
| ```bash | ||
| # Checkout the PR branch in this worktree | ||
| gh pr checkout <pr_number> | ||
|
|
||
| # Resolve fork remote — the worktree may only have origin (upstream) | ||
| git remote -v | ||
| # If no fork remote exists, add it: | ||
| git remote add fork https://github.com/<fork_owner>/sql.git | ||
|
|
||
| # Load PR state — reviews, CI, mergeability | ||
| gh pr view <pr_number> --json title,body,state,reviews,statusCheckRollup,mergeable | ||
| gh pr checks <pr_number> | ||
|
|
||
| # Load ALL comments — includes bot comments (Code-Diff-Analyzer, PR Reviewer Guide, Code Suggestions) and human comments | ||
| gh pr view <pr_number> --json comments --jq '.comments[] | {author: .author.login, body: .body}' | ||
| ``` | ||
|
|
||
| Categorize ALL signals — not just CI and human reviews: | ||
|
|
||
| | Signal | Type | | ||
| |--------|------| | ||
| | `statusCheckRollup` has failures | CI failure | | ||
| | `reviews` has CHANGES_REQUESTED | Review feedback | | ||
| | `mergeable` is CONFLICTING | Merge conflict | | ||
| | Bot comments with actionable suggestions | Review feedback (treat like human review) | | ||
| | All pass + approved | Ready — run `gh pr ready` | | ||
|
|
||
| ## Handle Review Feedback | ||
|
|
||
| For each comment (human OR bot), **cross-check against the Decision Log first**: | ||
|
|
||
| | Type | Action | | ||
| |------|--------| | ||
| | Code change | If already rejected in Decision Log, reply with reasoning. Otherwise make the change, new commit, push | | ||
| | Question | Reply with explanation — Decision Log often has the answer | | ||
| | Nit | Fix if trivial | | ||
| | Disagreement | Reply with Decision Log reasoning; if reviewer insists, escalate to user | | ||
|
|
||
| ```bash | ||
| git add <files> && git commit -s -m "Address review feedback: <description>" | ||
| git push -u fork <branch_name> | ||
| ``` | ||
|
|
||
| ## Clean Up Commit History | ||
|
|
||
| When you need to amend a commit (e.g. remove Co-Authored-By, reword message) and the branch has a merge commit on top, don't try `git reset --soft origin/main` — it will include unrelated changes if main has moved. Instead cherry-pick the fix onto latest main: | ||
|
|
||
| ```bash | ||
| git checkout -B clean-branch origin/main | ||
| git cherry-pick <fix_commit_sha> | ||
| git commit --amend -s -m "<updated message>" | ||
| git push fork clean-branch:<pr_branch> --force-with-lease | ||
| ``` | ||
|
|
||
| ## Handle CI Failures | ||
|
|
||
| ```bash | ||
| gh pr checks <pr_number> # Identify failures | ||
| gh run view <run_id> --log-failed # Read logs | ||
| # Test failure → fix locally, push new commit | ||
| # Spotless → ./gradlew spotlessApply, push | ||
| # Flaky → gh run rerun <run_id> --failed | ||
| ``` | ||
|
|
||
| ## Handle Merge Conflicts | ||
|
|
||
| ```bash | ||
| git fetch origin && git merge origin/main # Resolve conflicts | ||
| ./gradlew spotlessApply && ./gradlew test && ./gradlew :integ-test:integTest # Re-verify | ||
| git commit -s -m "Resolve merge conflicts with main" | ||
| git push -u fork <branch_name> | ||
| ``` | ||
|
|
||
| ## Mark Ready | ||
|
|
||
| ```bash | ||
| gh pr ready <pr_number> | ||
| ``` | ||
|
|
||
| ## Retrospective | ||
|
|
||
| After handling follow-up, reflect on the feedback received and check if it reveals gaps in the harness or command: | ||
|
|
||
| For each comment addressed (bot or human): | ||
| - **Does the feedback point to a pattern the harness should have prevented?** → Add guidance to the relevant Phase in `ppl-bugfix-harness.md` | ||
| - **Was this a repeated mistake across PRs?** → Add to Quick Reference or Case Index | ||
| - **Did the harness template produce the problematic code?** → Fix the template directly | ||
| - **Was a permission or tool missing?** → Add to `.claude/settings.json` | ||
| - **Did the follow-up workflow itself miss this signal?** → Update this file | ||
|
|
||
| If any improvement is needed, make the edit and include it in the same commit. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,151 @@ | ||
| # PPL Bugfix Harness | ||
|
|
||
| ## Phase 0: Triage | ||
|
|
||
| ### 0.1 Load & Reproduce | ||
|
|
||
| ```bash | ||
| gh issue view <issue_number> --repo opensearch-project/sql | ||
| ``` | ||
|
|
||
| Write a failing test or run an existing one to reproduce the bug on `main`. | ||
|
|
||
| If the bug **does not reproduce** (correct results, not infra failure): | ||
|
|
||
| | Finding | Action | | ||
| |---------|--------| | ||
| | Already fixed | `gh issue comment` + `gh issue close` | | ||
| | Older version only | `gh issue comment` + `gh issue close` | | ||
| | Intermittent | Label `flaky` or `needs-info`, do NOT close | | ||
| | Can't reproduce | Comment asking for repro steps, label `needs-info` | | ||
|
|
||
| **HARD STOP** — do not proceed. Report back. | ||
|
|
||
| ### 0.2 Classify | ||
|
|
||
| Identify the bug layer (Grammar, AST/Functions, Type System, Optimizer, Execution, DI/Resource) and record it. Consult `.claude/harness/ppl-bugfix-reference.md` for fix-path-specific guidance if needed. | ||
|
|
||
| ### 0.3 Guardrails | ||
|
|
||
| Stop and report back if: | ||
| - Root cause unclear after reading 15+ source files | ||
| - Fix breaks 5+ unrelated tests | ||
| - Same build error 3 times in a row | ||
|
|
||
| ### 0.4 Execution Flow | ||
|
|
||
| ``` | ||
| Triage → Write FAILING test → Fix → Remaining tests → Verify → Commit → PR → Decision Log → Completion Gate | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Phase 1: Fix | ||
|
|
||
| Find and fix the root cause. Consult `.claude/harness/ppl-bugfix-reference.md` for path-specific patterns and examples. | ||
|
|
||
| --- | ||
|
|
||
| ## Phase 2: Tests | ||
|
|
||
| Consult `.claude/harness/ppl-bugfix-reference.md` for test templates. | ||
|
|
||
| Required deliverables: | ||
| - Failing test reproducing the bug (written BEFORE the fix) | ||
| - Unit tests covering happy path and edge cases | ||
| - Integration test (`*IT.java` extending `CalcitePPLIT`) | ||
| - YAML REST test at `integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/<ISSUE>.yml` | ||
|
|
||
| --- | ||
|
|
||
| ## Phase 3: Verify & Submit | ||
|
|
||
| ### 3.1 Verify | ||
|
|
||
| ```bash | ||
| ./gradlew spotlessApply | ||
| ./gradlew :<module>:test --tests "<TestClass>" | ||
| ./gradlew test | ||
| ./gradlew :integ-test:integTest -Dtests.class="*<YourIT>" | ||
| ``` | ||
|
|
||
| Run `./gradlew :integ-test:yamlRestTest` if YAML tests were added. Run `./gradlew generateGrammarSource && ./gradlew :ppl:test` if grammar was modified. | ||
|
|
||
| ### 3.2 Commit & PR | ||
|
|
||
| ```bash | ||
| git add <changed_files> | ||
| git commit -s -m "[BugFix] Fix <description> (#<issue_number>)" | ||
| git fetch origin && git merge origin/main | ||
| ./gradlew test && ./gradlew :integ-test:integTest -Dtests.class="*<YourIT>" | ||
|
|
||
| # Resolve fork remote (check git remote -v; add if missing) | ||
| git remote add fork https://github.com/<fork_owner>/sql.git | ||
| git push -u fork <branch_name> | ||
| ``` | ||
|
|
||
| Do NOT add Co-Authored-By lines. Use the git user name to infer the fork owner, or fall back to "qianheng-aws". | ||
|
|
||
| ```bash | ||
| gh pr create --draft --repo opensearch-project/sql \ | ||
| --title "[BugFix] Fix <description> (#<issue_number>)" \ | ||
| --body "$(cat <<'EOF' | ||
| ### Description | ||
| <Brief description of fix and root cause> | ||
|
|
||
| ### Related Issues | ||
| Resolves #<issue_number> | ||
|
|
||
| ### Check List | ||
| - [x] New functionality includes testing | ||
| - [x] Commits signed per DCO (`-s`) | ||
| - [x] `spotlessCheck` passed | ||
| - [x] Unit tests passed | ||
| - [x] Integration tests passed | ||
| EOF | ||
| )" | ||
| ``` | ||
|
|
||
| ### 3.3 Decision Log | ||
|
|
||
| Post as a PR comment: | ||
|
|
||
| ```bash | ||
| gh pr comment <pr_number> --body "$(cat <<'EOF' | ||
| ## Decision Log | ||
| **Root Cause**: <what actually caused the bug> | ||
| **Approach**: <what was implemented and why> | ||
| **Alternatives Rejected**: <what was considered and why it didn't work> | ||
| **Pitfalls**: <dead ends, subtle edge cases discovered> | ||
| **Things to Watch**: <known limitations, areas needing follow-up> | ||
| EOF | ||
| )" | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Completion Gate | ||
|
|
||
| Do NOT report "done" until every item below is checked. List each in your final report: | ||
|
|
||
| - [ ] **Unit tests**: New test class or methods | ||
| - [ ] **Integration test**: New `*IT.java` test | ||
| - [ ] **YAML REST test**: `issues/<ISSUE>.yml` | ||
| - [ ] **spotlessApply**: Ran successfully | ||
| - [ ] **Tests pass**: Affected modules | ||
| - [ ] **Commit**: DCO sign-off, `[BugFix]` prefix, no Co-Authored-By | ||
| - [ ] **Draft PR**: `--draft`, body contains `Resolves #<issue>` | ||
| - [ ] **Decision Log**: PR comment posted | ||
|
|
||
| If any item is blocked, report which and why. | ||
|
|
||
| --- | ||
|
|
||
| ## Phase 4: Retrospective | ||
|
|
||
| - [ ] Symptom in Quick Reference? Add if missing. | ||
| - [ ] Classification correct? Fix routing if misleading. | ||
| - [ ] Test template worked as-is? Fix if broken. | ||
| - [ ] New pattern? Add to Case Index. | ||
|
|
||
| Include harness improvements in the same PR. |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Include "agent-team review" before publish PR?