feat: AI-friendly structured workspace for Claude Code / Codex #18
…gest under debug
- `analyze` now runs the full ingest pipeline (Drain + semantic labeling + DuckDB storage) before launching the AI agent
- Move `ingest` command under `debug ingest` for step-by-step debugging
- Extract shared pipeline helpers into `cmd/lapp/pipeline.go`
- Remove top-level `templates` command
- Add workspace path constraint to analyzer system prompt to prevent the agent from scanning files outside the workspace directory
- Add Langfuse tracing support with docker-compose for local dev
- Update CLAUDE.md with new CLI structure and code style notes
Instrument the entire pipeline with OTel spans: CLI commands, multiline merge, Drain parsing, semantic labeling, DuckDB storage, and analyzer. HTTP clients for LLM calls use the otelhttp transport for deep request traces.
- Add `pkg/tracing/otel.go` with OTLP HTTP exporter (env-gated via `OTEL_TRACING_ENABLED`)
- Add Jaeger service to `docker-compose.yml` (UI on port 16686)
- Wire `InitOTel` in `main.go` with graceful shutdown
- Add `ctx` parameter to `DrainParser.Feed`/`Templates` and `multiline.Merge`/`MergeSlice`
- Wrap eino OpenRouter HTTP clients with `otelhttp.NewTransport`
Extract AnalyzeWithTemplates() that accepts pre-computed templates, so the analyze command passes the same DrainParser output to both DuckDB storage and the workspace builder. Previously, Analyze() created a second DrainParser with fresh UUIDs, causing template IDs in the workspace to diverge from those in the database.
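The ID divergence described above can be reproduced with a small sketch. The `Template` type and the `parse`, `ids`, and `equal` helpers are hypothetical stand-ins, not the project's `DrainParser` API; the only assumption carried over from the commit message is that each parser run mints fresh random IDs.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// Template is a hypothetical stand-in for a parsed template with a random ID.
type Template struct {
	ID      string
	Pattern string
}

// parse mimics a parser that mints a fresh random ID on every run.
func parse(patterns []string) []Template {
	out := make([]Template, 0, len(patterns))
	for _, p := range patterns {
		b := make([]byte, 8)
		if _, err := rand.Read(b); err != nil {
			panic(err)
		}
		out = append(out, Template{ID: hex.EncodeToString(b), Pattern: p})
	}
	return out
}

// ids stands in for a consumer (storage or workspace builder) that records
// the template IDs it was given.
func ids(ts []Template) []string {
	out := make([]string, len(ts))
	for i, t := range ts {
		out[i] = t.ID
	}
	return out
}

// equal reports whether two ID lists match element by element.
func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	patterns := []string{"connection to <*> timed out"}

	// Parse once and hand the same slice to both consumers: IDs agree.
	shared := parse(patterns)
	fmt.Println("shared templates match:", equal(ids(shared), ids(shared)))

	// Parse a second time for the second consumer: IDs diverge.
	fmt.Println("re-parsed templates match:", equal(ids(shared), ids(parse(patterns))))
}
```

Passing the pre-computed slice to both consumers is the design choice the commit describes; the sketch only shows why a second parse cannot be substituted for it.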
… agents
Replace all old CLI commands (analyze, debug *) with a new `workspace`
command group (create, add-log, analyze) that builds a structured
directory with pattern directories named by LLM-generated semantic IDs.
Workspace structure: logs/, patterns/<semantic-id>/{pattern.md,samples.log},
patterns/unmatched/, notes/{summary.md,errors.md}, and AGENTS.md.
Closes #17
Keep workspace-based CLI from our branch, discard old analyze/debug/ingest commands and pipeline.go that master modified. Preserve OTel tracing from both branches.
…que stdin names
- Sanitize semantic IDs with `[a-z0-9-]` regex before using as directory names to prevent path traversal from LLM output
- Sort filenames before iterating in `mergeAllLogs` for deterministic rebuild output across runs
- Use `UnixNano` instead of `Unix` for stdin log filenames to avoid collisions within the same second
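The three fixes in this commit can be sketched roughly as follows. The function names (`sanitizeSemanticID`, `sortedNames`, `stdinLogName`), the `unnamed` fallback, and the `stdin-<nanos>.log` filename format are assumptions made for illustration; the commit message only states the `[a-z0-9-]` character class, the sorting, and the switch to `UnixNano`.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
	"time"
)

// nonIDChars matches every run of characters outside [a-z0-9-].
var nonIDChars = regexp.MustCompile(`[^a-z0-9-]+`)

// sanitizeSemanticID (hypothetical name) lowercases an LLM-supplied ID and
// collapses each run of disallowed characters into one hyphen, so the result
// cannot contain path separators or "..".
func sanitizeSemanticID(raw string) string {
	id := nonIDChars.ReplaceAllString(strings.ToLower(raw), "-")
	id = strings.Trim(id, "-")
	if id == "" {
		return "unnamed" // assumed fallback; the PR does not specify one
	}
	return id
}

// sortedNames returns a sorted copy so iteration order is the same on every run.
func sortedNames(names []string) []string {
	out := append([]string(nil), names...)
	sort.Strings(out)
	return out
}

// stdinLogName (hypothetical format) builds a filename from a nanosecond
// timestamp, so two writes within the same second get different names.
func stdinLogName(nanos int64) string {
	return fmt.Sprintf("stdin-%d.log", nanos)
}

func main() {
	fmt.Println(sanitizeSemanticID("../../etc/Passwd"))
	fmt.Println(sortedNames([]string{"b.log", "a.log"}))
	fmt.Println(stdinLogName(time.Now().UnixNano()))
}
```

Nanosecond timestamps make collisions unlikely rather than impossible; two reads that observe the same clock value would still produce the same name.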
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f05c01eb0d
    label, hasLabel := labelMap[tid]
    if !hasLabel {
        continue
Preserve templates when semantic labels are missing
`computePatterns` drops any Drain template that is not present in the LLM label map (`if !hasLabel { continue }`), but `semantic.parseResponse` accepts partial arrays and does not verify that there is one label per input pattern. When the model omits one `pattern_id`, all lines for that template are later treated as unmatched, so `patterns/` and `notes/summary.md` lose real clusters and the counts become incorrect. Keep unlabeled templates with a fallback semantic ID and description instead of skipping them.
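One way to apply this suggestion is sketched below. The `Label` type, the `labelFor` helper, and the `unlabeled-<template-id>` naming scheme are hypothetical; the actual types in `computePatterns` are not shown in this review, so treat this as an illustration of the fallback idea only.

```go
package main

import "fmt"

// Label is a hypothetical stand-in for the semantic label returned by the LLM.
type Label struct {
	SemanticID  string
	Description string
}

// labelFor returns the LLM label for a template ID, or a fallback derived
// from the template ID when the model omitted that template.
func labelFor(labelMap map[string]Label, tid string) Label {
	if label, ok := labelMap[tid]; ok {
		return label
	}
	return Label{
		SemanticID:  "unlabeled-" + tid, // assumed naming scheme
		Description: "No semantic label was returned for this template",
	}
}

func main() {
	labels := map[string]Label{
		"t1": {SemanticID: "db-timeout", Description: "Database timeouts"},
	}
	// "t2" is missing from the label map but is kept with a fallback label
	// instead of being skipped.
	for _, tid := range []string{"t1", "t2"} {
		fmt.Println(tid, "->", labelFor(labels, tid).SemanticID)
	}
}
```

Deriving the fallback from the template ID keeps each unlabeled template in its own directory, rather than merging them all into one bucket.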
    dest := filepath.Join(dir, "logs", filepath.Base(logFile))
    if err := os.WriteFile(dest, data, 0o644); err != nil {
Avoid overwriting existing logs with same basename
`add-log` always writes to `logs/<basename>` with `os.WriteFile`, so adding two different files that share the same filename (for example `/var/log/app.log`, then `/tmp/archive/app.log`) silently replaces the first file. Because rebuilds process whatever is in `logs/`, this can drop previously ingested data and change pattern discovery unexpectedly. Use a collision-safe destination name (or fail on conflict) instead of overwriting unconditionally.
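A collision-safe write could look roughly like this. `writeNoClobber`, the `-1`, `-2` suffix scheme, and the 1000-attempt cap are assumptions for illustration, not code from the PR; `addTwice` exists only to demonstrate the behavior in a temporary directory. The sketch relies on `os.O_EXCL`, which makes file creation fail when the destination already exists.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// writeNoClobber (hypothetical) writes data into dir under base, appending
// -1, -2, ... before the extension when a name is already taken.
func writeNoClobber(dir, base string, data []byte) (string, error) {
	ext := filepath.Ext(base)
	stem := strings.TrimSuffix(base, ext)
	for i := 0; i < 1000; i++ {
		name := base
		if i > 0 {
			name = fmt.Sprintf("%s-%d%s", stem, i, ext)
		}
		dest := filepath.Join(dir, name)
		// O_EXCL fails if dest exists, so there is no check-then-write race.
		f, err := os.OpenFile(dest, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
		if errors.Is(err, fs.ErrExist) {
			continue // name taken, try the next suffix
		}
		if err != nil {
			return "", err
		}
		_, werr := f.Write(data)
		cerr := f.Close()
		if werr != nil {
			return "", werr
		}
		if cerr != nil {
			return "", cerr
		}
		return dest, nil
	}
	return "", fmt.Errorf("no free name for %s in %s", base, dir)
}

// addTwice writes two files with the same basename into a temporary
// directory and returns the names that were actually used.
func addTwice(base string) ([]string, error) {
	dir, err := os.MkdirTemp("", "ws-logs")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	var names []string
	for _, content := range []string{"first", "second"} {
		dest, err := writeNoClobber(dir, base, []byte(content))
		if err != nil {
			return nil, err
		}
		names = append(names, filepath.Base(dest))
	}
	return names, nil
}

func main() {
	names, err := addTwice("app.log")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names)
}
```

Failing on conflict, the other option the review mentions, would amount to returning the `fs.ErrExist` error instead of retrying with a suffix.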
Summary
- … `analyze`, `debug *`) with a new `workspace` command group: `create`, `add-log`, `analyze`
- … `pkg/workspace/` package with embedded Go templates for `AGENTS.md`, `pattern.md`, `summary.md`, `errors.md`
- … `pkg/analyzer/` to expose `RunAgentWithPrompt()` and `BuildWorkspaceSystemPrompt()` for the new workspace layout
- … (`--db` flag removed); `pkg/store/` kept for tests/future use

New CLI
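Workspace structure
- `logs/`
- `patterns/<semantic-id>/{pattern.md,samples.log}`
- `patterns/unmatched/`
- `notes/{summary.md,errors.md}`
- `AGENTS.md`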
Closes #17
Test plan
- `make ci` passes (fmt, vet, build, lint, unit tests)
- `lapp workspace create /tmp/test-ws` creates directory structure with AGENTS.md
- `lapp workspace add-log /tmp/test-ws <logfile>` populates patterns/ and notes/
- `echo "test" | lapp workspace add-log /tmp/test-ws --stdin` reads from stdin
- `lapp workspace analyze /tmp/test-ws "what errors?"` runs agent