
feat: add devnet-runner and devnet-log-review Claude skills #130

Merged: MegaRedHand merged 1 commit into main from add-devnet-skills on Feb 13, 2026

Conversation

@pablodeymo
Collaborator

Motivation

We need project-level Claude skills for managing local devnets and analyzing their logs directly from this repo, instead of relying on global user-level skills.

Description

Adds two Claude skills under .claude/skills/:

devnet-runner - Orchestrates local devnets from the repo root:

  • Manages lean-quickstart validator config and client image tags
  • Timed runs with automatic log collection via run-devnet-with-timeout.sh
  • All paths adapted to work from the ethlambda repo root (cd lean-quickstart && ...)
  • Reference guide for supported clients, images, and ports

devnet-log-review - Analyzes devnet logs:

  • analyze-logs.sh - Main entry point producing markdown summary
  • count-errors-warnings.sh - Per-node error/warning counts (filters benign patterns)
  • count-blocks.sh - Client-aware block production stats
  • check-consensus-progress.sh - Last slot reached and proposer assignments
  • show-errors.sh - Detailed error display with filtering
  • Reference guides for fork analysis, finalization debugging, error classification, and client log patterns

How to Test

# Run a devnet (requires lean-quickstart and Docker)
.claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh 120

# Analyze the resulting logs
.claude/skills/devnet-log-review/scripts/analyze-logs.sh

Add project-level skills for managing local devnets and analyzing their logs.

devnet-runner: orchestrates lean-quickstart devnets from the repo root,
with validator config management, timed runs, and automatic log collection.

devnet-log-review: analyzes devnet logs with scripts for error counting,
block production stats, consensus progress, and reference guides for
fork analysis, finalization debugging, and error classification.
@github-actions

🤖 Kimi Code Review

This PR adds comprehensive devnet tooling and documentation for lean consensus testing. The changes are well-structured and provide valuable infrastructure for debugging consensus issues. Here are my observations:

Strengths

  • Excellent documentation: The devnet-log-review skill provides thorough guides for analyzing consensus failures, forks, and finalization issues
  • Practical scripts: All shell scripts are production-ready with proper error handling and ANSI code stripping
  • Progressive disclosure: The documentation structure allows users to start with quick analysis and dive deeper as needed
  • Client-specific patterns: Detailed knowledge of each client's log formats and error patterns

Minor Issues

1. Script portability in run-devnet-with-timeout.sh

File: .claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh

The script uses kill $PID which may not properly terminate the entire process tree. Consider:

# Line 31
pkill -TERM -P "$PID" 2>/dev/null || true  # Kill child processes first; ignore "no match" under set -e
kill "$PID" 2>/dev/null || true

2. Missing validation in count-blocks.sh

File: .claude/skills/devnet-log-review/scripts/count-blocks.sh

The script could validate that proposed and processed counts make sense:

# After the loop, add validation
if [[ $proposed -gt 0 && $processed -eq 0 ]]; then
    echo "Warning: $node proposed $proposed blocks but processed 0 - possible validation failures" >&2
fi

3. Race condition in log dumping

File: .claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh

The script dumps logs immediately after timeout, but containers might still be shutting down:

# Add graceful shutdown before dumping
echo "Stopping nodes gracefully..."
docker stop $(docker ps --format '{{.Names}}' | grep -E '^(zeam|ream|qlean|lantern|lighthouse|grandine|ethlambda)_') 2>/dev/null || true
sleep 2  # Allow containers to flush logs

4. Missing error handling in check-consensus-progress.sh

File: .claude/skills/devnet-log-review/scripts/check-consensus-progress.sh

The script should handle cases where no slot numbers are found:

# After extracting last_slot
if [[ "$last_slot" == "0" || "$last_slot" == "N/A" ]]; then
    printf "%-20s %12s\n" "$node" "NO SLOTS"
else
    printf "%-20s %12s\n" "$node" "$last_slot"
fi

Security Considerations

  • Docker socket access: The scripts assume Docker socket access, which is reasonable for devnet testing
  • File permissions: All scripts are executable and use safe defaults (set -euo pipefail)
  • Input validation: Scripts properly handle missing log directories and empty files

Documentation Quality

The documentation is exceptionally well-written with:

  • Clear progressive disclosure from quick start to deep dives
  • Practical examples for every investigation pattern
  • Accurate client-specific log format descriptions
  • Comprehensive error classification with actionable remediation steps

Overall Assessment

This is a high-quality addition that significantly improves the devnet testing experience. The minor issues above are non-blocking and can be addressed in follow-up PRs. The tooling will be invaluable for debugging consensus issues and interoperability problems between clients.


Automated review by Kimi (Moonshot AI) · custom prompt

@github-actions

🤖 Codex Code Review

Summary
This PR only adds devnet log review/run skills, reference docs, and helper shell scripts under .claude/skills/. There are no changes to consensus logic, fork choice, state transition, XMSS, or SSZ code. So none of the consensus-layer items you called out are affected.

Findings

  1. Containers may be left running: run-devnet-with-timeout.sh kills the spin-node.sh process but does not stop containers, which can leave stale nodes running and contaminate subsequent runs. Consider adding a stop step after the sleep (or in a trap) to call ./spin-node.sh --node all --stop.
    File: .claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh:21-34

  2. Missing signal handling / cleanup: If the script is interrupted (Ctrl+C), it won’t dump logs or stop nodes. Add a trap to ensure log dump and cleanup on INT, TERM, and EXIT.
    File: .claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh:21-34

  3. Input validation: The script doesn’t validate that the timeout is a positive integer. A non-numeric value will make sleep error. Consider validating with a simple regex guard.
    File: .claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh:7-24
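
Findings 1 and 2 could be handled together with a trap-based cleanup. The following is a minimal sketch, not the script's actual code: `dump_logs`, `stop_nodes`, and `run_with_cleanup` are hypothetical names, and the two helpers are stubbed with `echo` where the real script would dump container logs and call `./spin-node.sh --node all --stop`.

```bash
# Hypothetical placeholders for the real log dump and container stop steps.
dump_logs()  { echo "dumping logs"; }
stop_nodes() { echo "stopping nodes"; }

# Run a command in a subshell whose traps guarantee cleanup on normal exit,
# Ctrl+C (INT), or TERM.
run_with_cleanup() {
    (
        cleanup_done=0
        cleanup() {
            # Run at most once, even when a signal handler leads into EXIT.
            [[ "$cleanup_done" -eq 1 ]] && return 0
            cleanup_done=1
            dump_logs || true
            stop_nodes || true
        }
        trap cleanup EXIT
        trap 'exit 130' INT    # exit triggers the EXIT trap above
        trap 'exit 143' TERM
        "$@"
    )
}
```

Routing INT and TERM through `exit` keeps all cleanup logic in the single EXIT handler, so the log dump and node stop cannot be skipped or run twice.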

Consensus-layer review
No code changes in fork choice, attestation processing, justification/finalization, state transitions, XMSS, or SSZ.

If you want, I can propose a small patch to harden run-devnet-with-timeout.sh with trap, validation, and container stop.


Automated review by OpenAI Codex · custom prompt

@github-actions

🤖 Claude Code Review

Based on my review of PR #130, here's my analysis:


PR #130 Review: Add devnet-runner and devnet-log-review Claude Skills

Summary

This PR adds comprehensive Claude AI skills for managing local devnets and analyzing their logs. The code is well-structured with clear documentation and follows good shell scripting practices. I found several issues ranging from minor to moderate severity that should be addressed.

Issues Found

1. Script Portability & Robustness Issues

run-devnet-with-timeout.sh:12-19 - Path resolution could fail in edge cases

REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
  • This assumes exactly 4 levels deep from repo root
  • Consider using git rev-parse --show-toplevel for more reliable repo root detection
  • Add validation that we're actually in a git repository

run-devnet-with-timeout.sh:34 - Silent kill may leave orphaned processes

kill $PID 2>/dev/null
  • Should send SIGTERM first, wait, then SIGKILL if needed
  • The script doesn't stop the Docker containers before killing the monitoring process
  • This could leave containers running after the script exits

Suggested fix:

# Graceful shutdown: SIGTERM first, SIGKILL only if the process is still alive
kill -TERM "$PID" 2>/dev/null || true
sleep 2
if kill -0 "$PID" 2>/dev/null; then
    kill -KILL "$PID" 2>/dev/null || true
fi
wait "$PID" 2>/dev/null || true

# Stop containers
cd "$QUICKSTART_DIR"
NETWORK_DIR=local-devnet ./spin-node.sh --node all --stop 2>/dev/null || true

2. Security & Input Validation

run-devnet-with-timeout.sh:7-10 - No validation of timeout parameter

if [ -z "$1" ]; then
    echo "Usage: $0 <seconds>"
    exit 1
fi
  • Missing numeric validation - could pass negative numbers or non-numeric values
  • No upper bound - could accidentally run for hours/days
  • Should validate: [[ "$1" =~ ^[0-9]+$ ]] && [ "$1" -gt 0 ] && [ "$1" -lt 7200 ]
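
The validation suggested above can be wrapped in a small function. This is a sketch under stated assumptions: `validate_timeout` is a hypothetical name, and the 7200-second exclusive upper bound is taken from the suggestion rather than from the script.

```bash
# Accepts a timeout in seconds; succeeds only for integers in 1..(max-1).
validate_timeout() {
    local value="${1:-}"
    local max="${2:-7200}"   # assumed exclusive upper bound, in seconds

    if [[ ! "$value" =~ ^[0-9]+$ ]]; then
        echo "timeout must be a positive integer, got: '$value'" >&2
        return 1
    fi
    # 10# forces base 10 so a value like "08" is not parsed as octal.
    if (( 10#$value <= 0 || 10#$value >= max )); then
        echo "timeout must be between 1 and $((max - 1)) seconds" >&2
        return 1
    fi
}
```

A caller could replace the existing `-z` check with `validate_timeout "${1:-}" || exit 1`, which also covers the missing-argument case.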

Multiple scripts - Use of grep -E with user-controlled patterns

  • While not directly user-controlled in this PR, the scripts use complex regex patterns
  • Consider using grep -F (fixed strings) where appropriate for performance and safety

3. Error Handling Issues

analyze-logs.sh:58-59 - Complex error counting could silently fail

errors=$(grep -i "error" "$f" 2>/dev/null | grep -cvE "manifest unknown|NoFinalizedStateFound|HandshakeTimedOut" 2>/dev/null) || errors=0
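  • The grep -cv (count non-matching lines) idiom is fragile: grep -c exits with status 1 whenever the count is 0, so the || errors=0 fallback fires on every clean log
  • If the first grep fails (for example on an unreadable file), 2>/dev/null hides the message and the count silently becomes 0, which masks real issues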
  • Consider separating into clearer logic with explicit error handling
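
One way to separate the two failure modes is to check readability explicitly and only then count. A minimal sketch, assuming a hypothetical `count_real_errors` helper and the three benign patterns quoted above:

```bash
# Prints the number of lines containing "error" (case-insensitive) that do not
# match a benign pattern. Returns 2 if the file cannot be read.
count_real_errors() {
    local file="$1"
    local benign="${2:-manifest unknown|NoFinalizedStateFound|HandshakeTimedOut}"

    if [[ ! -r "$file" ]]; then
        echo "count_real_errors: cannot read '$file'" >&2
        return 2
    fi
    # grep -c always prints a count; it exits 1 when that count is 0,
    # which is not a failure here, hence the || true.
    grep -i "error" "$file" | grep -cvE "$benign" || true
}
```

With this shape a missing log surfaces as a non-zero status instead of being reported as zero errors.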

count-errors-warnings.sh:19-26 - count_filtered function has complex error handling

result=$(grep -i "$pattern" "$file" 2>/dev/null | grep -cvE "$exclude" 2>/dev/null) || result=0
echo "${result:-0}"
  • The ${result:-0} fallback is redundant since || result=0 already handles it
  • The double stderr redirect 2>/dev/null masks potential issues

4. Race Conditions & Timing Issues

run-devnet-with-timeout.sh:22-24 - Race condition in process management

./spin-node.sh --node all --generateGenesis 2>&1 | tee "$REPO_ROOT/devnet.log" &
PID=$!
sleep "$1"
  • The script captures the PID of tee, not spin-node.sh
  • When killing $PID, it only kills tee, leaving spin-node.sh running
  • Containers will keep running since only the wrapper process is killed

Suggested fix:

# Use process substitution so $! is spin-node.sh rather than tee
cd "$QUICKSTART_DIR"
NETWORK_DIR=local-devnet ./spin-node.sh --node all --generateGenesis > >(tee "$REPO_ROOT/devnet.log") 2>&1 &
SPIN_PID=$!
sleep "$1"
# Kill the actual spin-node.sh process
kill "$SPIN_PID" 2>/dev/null || true

5. Data Quality & Log Analysis Issues

count-blocks.sh:54-88 - Client detection could fail with unconventional naming

client="${node%_*}"
  • Assumes node names always follow {client}_{index} pattern
  • No fallback if naming convention isn't followed
  • Consider adding validation: if [[ ! "$node" =~ ^[a-z]+_[0-9]+$ ]]; then continue; fi
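
The validation and the extraction can live in one helper. A sketch, assuming a hypothetical `client_of_node` function and the `{client}_{index}` convention described above:

```bash
# Prints the client name for a node such as "zeam_0"; returns 1 for names
# that do not follow the {client}_{index} convention.
client_of_node() {
    local node="$1"
    if [[ ! "$node" =~ ^[a-z]+_[0-9]+$ ]]; then
        echo "skipping '$node': expected {client}_{index}" >&2
        return 1
    fi
    # Strip the shortest suffix matching "_*", i.e. the index.
    echo "${node%_*}"
}
```

Inside the per-node loop this could be used as `client=$(client_of_node "$node") || continue`.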

show-errors.sh:37 - ANSI strip function uses basic sed

strip_ansi() {
    sed 's/\x1b\[[0-9;]*m//g'
}
  • This regex handles SGR color codes; consider the more comprehensive 's/\x1b\[[0-9;]*[a-zA-Z]//g' to catch other CSI escape sequences
  • Should handle OSC sequences: 's/\x1b\][^\x07]*\x07//g'
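
Both suggested patterns can be combined in one `sed` call. This is a sketch rather than the script's code; it builds the escape and bell characters with bash `$'...'` quoting, on the assumption that `\x1b` inside a `sed` expression is not supported by every `sed` implementation.

```bash
# Removes CSI sequences (ESC [ ... letter) and BEL-terminated OSC sequences
# (ESC ] ... BEL) from stdin.
strip_ansi() {
    local esc=$'\x1b' bel=$'\x07'
    sed -e "s/${esc}\[[0-9;]*[a-zA-Z]//g" \
        -e "s/${esc}][^${bel}]*${bel}//g"
}
```

For example, `printf '\033[31mERROR\033[0m\n' | strip_ansi` leaves only the text `ERROR`.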

6. Documentation Issues

SKILL.md files - Inconsistent timing calculations

  • devnet-runner/SKILL.md:32-43 documents timing formula
  • The formula includes "genesis offset + startup buffer"
  • However, run-devnet-with-timeout.sh doesn't apply these offsets—it just sleeps for the exact timeout
  • Either the script should implement the formula, or the docs should clarify the script uses raw timeout

SKILL.md references to ports 9001-9008 - No validation that ports are available

  • Scripts don't check if ports are in use before starting
  • Consider adding: lsof -i :9001 >/dev/null && echo "Port 9001 in use" && exit 1
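
The single-port check generalizes to the whole documented range. A minimal sketch, assuming `lsof` is installed and using hypothetical helper names `port_in_use` and `check_ports_free`:

```bash
# Succeeds when something is bound to the given port (relies on lsof).
port_in_use() {
    lsof -i :"$1" >/dev/null 2>&1
}

# Reports every busy port in [first, last]; returns 1 if any is in use.
check_ports_free() {
    local first="$1" last="$2" port busy=0
    for (( port = first; port <= last; port++ )); do
        if port_in_use "$port"; then
            echo "Port $port in use" >&2
            busy=1
        fi
    done
    return "$busy"
}
```

Calling `check_ports_free 9001 9008 || exit 1` before starting the nodes lists all conflicts at once instead of stopping at the first.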

7. Minor Code Quality Issues

Multiple scripts - Inconsistent shell options

  • Some use set -euo pipefail, others use set -uo pipefail
  • The -e flag (exit on error) is important for robustness
  • check-consensus-progress.sh:9 uses -euo, others don't
  • Recommendation: Use set -euo pipefail consistently for all scripts

count-blocks.sh:16 - ANSI strip is defined but could be more efficient

strip_ansi < "$file" | grep -cE "$pattern"
  • Reading entire file into strip_ansi for every pattern check is inefficient
  • Consider stripping once and storing in temp file for multi-pattern searches
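
Stripping once and reusing the result might look like the sketch below. `count_patterns` is a hypothetical helper, the `strip_ansi` shown is a stand-in for the script's own function, and the patterns passed in are up to the caller.

```bash
# Stand-in for the script's strip_ansi (SGR color codes only).
strip_ansi() {
    local esc=$'\x1b'
    sed "s/${esc}\[[0-9;]*m//g"
}

# Strips ANSI codes from the file once, then prints "<pattern>TAB<count>"
# for each extended-regex pattern given after the file name.
count_patterns() {
    local file="$1"; shift
    local cleaned pattern count
    cleaned=$(mktemp)
    strip_ansi < "$file" > "$cleaned"
    for pattern in "$@"; do
        # grep -c exits 1 on a zero count, which is not an error here.
        count=$(grep -cE -- "$pattern" "$cleaned" || true)
        printf '%s\t%s\n' "$pattern" "$count"
    done
    rm -f "$cleaned"
}
```

The temporary file trades a little disk I/O for running `sed` once per log rather than once per pattern.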

analyze-logs.sh:16-21 - Uses nullglob but sets it globally

shopt -s nullglob
  • Good practice, but should shopt -u nullglob at the end to restore state
  • Or use subshell: ( shopt -s nullglob; ... )
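
The subshell variant can be expressed as a function whose body is a subshell, so `nullglob` never leaks into the caller. A sketch with a hypothetical `list_log_files` helper:

```bash
# Prints one *.log path per line for the given directory, or nothing if there
# are none. The parentheses make the body a subshell, so the shopt is scoped.
list_log_files() (
    shopt -s nullglob
    files=("$1"/*.log)
    if (( ${#files[@]} > 0 )); then
        printf '%s\n' "${files[@]}"
    fi
)
```

Because the option change happens inside the subshell, no explicit `shopt -u nullglob` is needed afterwards.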

Positive Observations

  1. Good separation of concerns - Runner and log review are separate skills
  2. Progressive disclosure - Reference docs are separate from main SKILL.md
  3. Comprehensive documentation - Clear usage examples and troubleshooting guides
  4. Client-aware patterns - Scripts handle differences between zeam, ream, ethlambda, etc.
  5. Filtering of benign errors - Smart exclusion of known false positives like "manifest unknown"

Recommendations

High Priority

  1. Fix the PID capture issue in run-devnet-with-timeout.sh (Item 4)
  2. Add numeric validation to timeout parameter (Item 2)
  3. Add proper container cleanup before script exit (Item 1)

Medium Priority

  1. Standardize set -euo pipefail across all scripts (Item 7)
  2. Improve error handling in analyze-logs.sh error counting (Item 3)
  3. Fix documentation inconsistency about timing calculations (Item 6)

Low Priority

  1. Enhance ANSI stripping regex for comprehensive escape sequence handling (Item 5)
  2. Add port availability checks before starting devnet (Item 6)
  3. Optimize file reading in count-blocks.sh (Item 7)

Verdict

🟡 APPROVE WITH CHANGES - The PR provides valuable devnet management tooling with good documentation. However, the process management issues in run-devnet-with-timeout.sh (especially the PID capture and cleanup) should be fixed before merging to prevent orphaned Docker containers and confusion during testing.

The shell scripts are generally well-written and handle edge cases thoughtfully. Once the high-priority items are addressed, this will be a solid addition to the development workflow.


Automated review by Claude (Anthropic) · custom prompt

@greptile-apps
Contributor

greptile-apps bot commented Feb 13, 2026

Greptile Overview

Greptile Summary

Adds two Claude skills for managing local devnets and analyzing their logs.

devnet-runner provides orchestration for running lean consensus validators with configurable timeouts, automatic log collection, and comprehensive documentation covering validator config management, Docker image tags, port allocation, and troubleshooting workflows.

devnet-log-review provides analysis tooling with client-aware pattern matching for block counting, error classification with benign pattern filtering, consensus progress tracking, and detailed reference guides for fork analysis, finalization debugging, and client-specific log patterns.

The implementation follows good shell scripting practices with proper error handling (set -euo pipefail), ANSI code stripping, safe defaults, and executable permissions. Documentation uses progressive disclosure to manage context efficiently.

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk - adds new tooling without modifying core functionality.
  • Score reflects well-written shell scripts with proper error handling and comprehensive documentation. Minor style improvements suggested for signal handling and code deduplication, but no functional issues found.
  • Pay attention to run-devnet-with-timeout.sh to verify signal propagation works as expected in your environment.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .claude/skills/devnet-runner/scripts/run-devnet-with-timeout.sh | Adds script to run devnet with timeout and automatic log collection. Minor issue: signal propagation may not work reliably with backgrounded pipeline. |
| .claude/skills/devnet-log-review/scripts/analyze-logs.sh | Main analysis entry point orchestrating error counts, block stats, and consensus progress. Duplicates error counting logic from count-errors-warnings.sh. |
| .claude/skills/devnet-log-review/scripts/count-blocks.sh | Client-aware block counting with pattern matching for each client type. Well-structured with ANSI stripping and safe defaults. |
| .claude/skills/devnet-log-review/scripts/check-consensus-progress.sh | Shows last slot reached and proposer assignments per node. Handles multiple slot format variations correctly. |
| .claude/skills/devnet-runner/SKILL.md | Comprehensive documentation for devnet management with clear workflows, timing calculations, and troubleshooting guides. |
| .claude/skills/devnet-log-review/SKILL.md | Well-organized analysis documentation with progressive disclosure pattern and clear investigation workflows. |

Sequence Diagram

sequenceDiagram
    participant User
    participant RunScript as run-devnet-with-timeout.sh
    participant SpinNode as spin-node.sh
    participant Docker
    participant AnalyzeScript as analyze-logs.sh
    
    User->>RunScript: Execute with timeout (e.g., 120s)
    RunScript->>RunScript: Validate lean-quickstart exists
    RunScript->>SpinNode: Start nodes with --generateGenesis
    SpinNode->>Docker: Start validator containers
    Docker-->>SpinNode: Containers running
    RunScript->>RunScript: Sleep for specified duration
    RunScript->>Docker: Dump logs from all containers
    Docker-->>RunScript: Log files (*.log)
    RunScript->>SpinNode: Kill process (SIGTERM)
    RunScript-->>User: Logs saved to repo root
    
    User->>AnalyzeScript: Analyze saved logs
    AnalyzeScript->>AnalyzeScript: Count errors/warnings
    AnalyzeScript->>AnalyzeScript: Count blocks proposed/processed
    AnalyzeScript->>AnalyzeScript: Check consensus progress
    AnalyzeScript-->>User: Markdown summary with health status

Last reviewed commit: 8d92edf

Contributor

@greptile-apps greptile-apps bot left a comment

13 files reviewed, 2 comments


Comment on lines +22 to +24
NETWORK_DIR=local-devnet ./spin-node.sh --node all --generateGenesis 2>&1 | tee "$REPO_ROOT/devnet.log" &
PID=$!
sleep "$1"

Signal may not propagate correctly to spin-node.sh: in bash, $! after a backgrounded pipeline is the PID of the pipeline's last command (here tee), not spin-node.sh. The kill on line 33 therefore sends SIGTERM to tee, which may not reliably stop the spin-node.sh process.

Consider capturing the process group and using kill -TERM -$PID to signal the entire process group, or use pkill to target spin-node.sh directly.


Comment on lines +54 to +61
total_errors=0
for f in "${log_files[@]}"; do
    node=$(basename "$f" .log)
    if [[ "$node" != "devnet" ]]; then
        errors=$(grep -i "error" "$f" 2>/dev/null | grep -cvE "manifest unknown|NoFinalizedStateFound|HandshakeTimedOut" 2>/dev/null) || errors=0
        total_errors=$((total_errors + errors))
    fi
done

Duplicates error counting logic from count-errors-warnings.sh with slightly different filter patterns. The benign patterns here (manifest unknown|NoFinalizedStateFound|HandshakeTimedOut) should match those in count-errors-warnings.sh:15 to ensure consistency.

Consider sourcing the benign patterns from a shared location or calling the existing script's function.
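
A shared location for the patterns could be a small file that both scripts source. The sketch below is an assumption about how that might look: the file name `benign-patterns.sh`, the array name, and `benign_regex` are hypothetical, and the list contains only the three patterns quoted from analyze-logs.sh, since the list in count-errors-warnings.sh is not shown here.

```bash
# Hypothetical shared file (e.g. benign-patterns.sh) sourced by both scripts.
BENIGN_ERROR_PATTERNS=(
    "manifest unknown"
    "NoFinalizedStateFound"
    "HandshakeTimedOut"
)

# Joins the patterns with "|" into a single extended regex for grep -E.
benign_regex() {
    local IFS='|'
    echo "${BENIGN_ERROR_PATTERNS[*]}"
}
```

Each script would then filter with `grep -vE "$(benign_regex)"`, so adding a new benign pattern is a one-line change in one place.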



@MegaRedHand MegaRedHand changed the title from "Add devnet-runner and devnet-log-review Claude skills" to "feat: add devnet-runner and devnet-log-review Claude skills" Feb 13, 2026
@MegaRedHand MegaRedHand merged commit ec8d91a into main Feb 13, 2026
7 checks passed
@MegaRedHand MegaRedHand deleted the add-devnet-skills branch February 13, 2026 19:27