feat: AI-friendly structured workspace for Claude Code / Codex #18

Open
STRRL wants to merge 6 commits into master from spawn-pagent-workspace

STRRL (Owner) commented Mar 5, 2026

Summary

  • Replace all old CLI commands (analyze, debug *) with a new workspace command group: create, add-log, analyze
  • Build structured file-based workspaces with pattern directories named by LLM-generated semantic IDs — navigable by any AI coding agent or human using basic unix tools
  • New pkg/workspace/ package with embedded Go templates for AGENTS.md, pattern.md, summary.md, errors.md
  • Refactor pkg/analyzer/ to expose RunAgentWithPrompt() and BuildWorkspaceSystemPrompt() for the new workspace layout
  • Remove DuckDB dependency from CLI (--db flag removed); pkg/store/ kept for tests/future use

New CLI

lapp workspace create <dir>
lapp workspace add-log <dir> <logfile> [--model] [--stdin]
lapp workspace analyze <dir> [question] [--model]

Workspace structure

my-investigation/
├── AGENTS.md
├── logs/
├── patterns/
│   ├── server-startup/
│   │   ├── pattern.md
│   │   └── samples.log
│   ├── connection-timeout/
│   │   ├── pattern.md
│   │   └── samples.log
│   └── unmatched/
│       └── samples.log
└── notes/
    ├── summary.md
    └── errors.md

Closes #17

Test plan

  • make ci passes (fmt, vet, build, lint, unit tests)
  • lapp workspace create /tmp/test-ws creates directory structure with AGENTS.md
  • lapp workspace add-log /tmp/test-ws <logfile> populates patterns/ and notes/
  • echo "test" | lapp workspace add-log /tmp/test-ws --stdin reads from stdin
  • lapp workspace analyze /tmp/test-ws "what errors?" runs agent

…gest under debug

- `analyze` now runs the full ingest pipeline (Drain + semantic labeling +
  DuckDB storage) before launching the AI agent
- Move `ingest` command under `debug ingest` for step-by-step debugging
- Extract shared pipeline helpers into `cmd/lapp/pipeline.go`
- Remove top-level `templates` command
- Add workspace path constraint to analyzer system prompt to prevent
  the agent from scanning files outside the workspace directory
- Add Langfuse tracing support with docker-compose for local dev
- Update CLAUDE.md with new CLI structure and code style notes

Instrument the entire pipeline with OTel spans: CLI commands, multiline
merge, Drain parsing, semantic labeling, DuckDB storage, and analyzer.
HTTP clients for LLM calls use otelhttp transport for deep request traces.

- Add pkg/tracing/otel.go with OTLP HTTP exporter (env-gated via OTEL_TRACING_ENABLED)
- Add Jaeger service to docker-compose.yml (UI on port 16686)
- Wire InitOTel in main.go with graceful shutdown
- Add ctx parameter to DrainParser.Feed/Templates and multiline.Merge/MergeSlice
- Wrap eino OpenRouter HTTP clients with otelhttp.NewTransport

Extract AnalyzeWithTemplates() that accepts pre-computed templates,
so the analyze command passes the same DrainParser output to both
DuckDB storage and the workspace builder. Previously, Analyze()
created a second DrainParser with fresh UUIDs, causing template IDs
in the workspace to diverge from those in the database.

… agents

Replace all old CLI commands (analyze, debug *) with a new `workspace`
command group (create, add-log, analyze) that builds a structured
directory with pattern directories named by LLM-generated semantic IDs.

Workspace structure: logs/, patterns/<semantic-id>/{pattern.md,samples.log},
patterns/unmatched/, notes/{summary.md,errors.md}, and AGENTS.md.

Closes #17

chatgpt-codex-connector[bot]: This comment was marked as resolved.

STRRL added 2 commits March 5, 2026 00:26
Keep workspace-based CLI from our branch, discard old analyze/debug/ingest
commands and pipeline.go that master modified. Preserve OTel tracing from
both branches.

…que stdin names

- Sanitize semantic IDs with [a-z0-9-] regex before using as directory
  names to prevent path traversal from LLM output
- Sort filenames before iterating in mergeAllLogs for deterministic
  rebuild output across runs
- Use UnixNano instead of Unix for stdin log filenames to avoid
  collisions within the same second
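
For reviewers, the sanitization step described in the first bullet could look roughly like the sketch below. The function name `sanitizeSemanticID`, the lowercasing step, and the `unnamed-pattern` fallback are assumptions made for illustration; only the `[a-z0-9-]` character class comes from the commit message.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// disallowed matches every run of characters outside [a-z0-9-].
var disallowed = regexp.MustCompile(`[^a-z0-9-]+`)

// sanitizeSemanticID is a hypothetical helper. It reduces an LLM-provided
// ID to characters in [a-z0-9-] so the result cannot contain path
// separators or ".." segments when used as a directory name.
func sanitizeSemanticID(raw string) string {
	id := strings.ToLower(strings.TrimSpace(raw))
	id = disallowed.ReplaceAllString(id, "-")
	id = strings.Trim(id, "-")
	if id == "" {
		// Assumed fallback so the caller never gets an empty name.
		return "unnamed-pattern"
	}
	return id
}

func main() {
	for _, raw := range []string{"server-startup", "Connection Timeout", "../../etc/passwd"} {
		fmt.Printf("%q -> %q\n", raw, sanitizeSemanticID(raw))
	}
}
```

With this sketch, `../../etc/passwd` becomes `etc-passwd`, so the joined path stays inside patterns/.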

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f05c01eb0d

Comment on lines +109 to +111
label, hasLabel := labelMap[tid]
if !hasLabel {
	continue

P1: Preserve templates when semantic labels are missing

computePatterns drops any Drain template not present in the LLM label map (if !hasLabel { continue }), but semantic.parseResponse accepts partial arrays and does not verify one label per input pattern. When the model omits one pattern_id, all lines for that template are later treated as unmatched, so patterns/ and notes/summary.md lose real clusters and counts become incorrect. Keep unlabeled templates with a fallback semantic ID/description instead of skipping them.
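
One possible shape for the suggested fallback is sketched below. The `label` type, the `labelOrFallback` helper, and the `unlabeled-` prefix are all hypothetical names chosen for illustration; the actual types in pkg/workspace/ and pkg/semantic/ may differ.

```go
package main

import "fmt"

// label is a hypothetical stand-in for the semantic label type.
type label struct {
	SemanticID  string
	Description string
}

// labelOrFallback returns the LLM label for a template when present.
// Otherwise it derives a deterministic placeholder from the template ID,
// so the template keeps its own pattern directory and counts instead of
// being folded into unmatched.
func labelOrFallback(labels map[string]label, templateID string) label {
	if l, ok := labels[templateID]; ok {
		return l
	}
	short := templateID
	if len(short) > 8 {
		short = short[:8] // keep directory names short
	}
	return label{
		SemanticID:  "unlabeled-" + short,
		Description: "No semantic label was returned for this template",
	}
}

func main() {
	labels := map[string]label{
		"t1": {SemanticID: "server-startup", Description: "Server start"},
	}
	for _, tid := range []string{"t1", "t2"} {
		fmt.Println(tid, "->", labelOrFallback(labels, tid).SemanticID)
	}
}
```

Deriving the placeholder from the template ID keeps it stable across rebuilds, provided the template IDs themselves are stable.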

Comment on lines +164 to +165
dest := filepath.Join(dir, "logs", filepath.Base(logFile))
if err := os.WriteFile(dest, data, 0o644); err != nil {

P2: Avoid overwriting existing logs with same basename

add-log always writes to logs/<basename> with os.WriteFile, so adding two different files that share the same filename (for example /var/log/app.log then /tmp/archive/app.log) silently replaces the first file. Because rebuilds process whatever is in logs/, this can drop previously ingested data and change pattern discovery unexpectedly. Use a collision-safe destination name (or fail on conflict) instead of unconditional overwrite.
