feat(ai): PR 7 — E2E Tests + Fixtures + Config by ianwhitedeveloper · Pull Request #421 · paralleldrive/riteway

ianwhitedeveloper · 2026-02-27T18:22:44Z

Context

Part of the PR #394 consolidation effort. Targets ai-testing-framework-implementation-consolidation.

Dependency order: Foundation → Utilities → Parsers → Config + Validation → Core Runner → CLI + Output (#420, merged) → E2E (this PR) → outputFormat + riteway ai init (#423)

Next: Draft PR #423 (pr/agent-output-format) — outputFormat serialization strategy + riteway ai init eject command — is queued behind this PR.

What's in this PR

End-to-end test suite + fixture files + dedicated Vitest config for the AI testing framework, plus post-consolidation fixups to test-extractor.js.

`source/e2e.test.js`

Full E2E test coverage using describe.skipIf(!isClaudeAuthenticated) to gate all tests on real Claude CLI auth:

- Full workflow: `runAITests` + `recordTestOutput` against `sum-function-test.sudo`; asserts assertion count, pass/fail, run counts, TAP file content, and filename format
- `--agent-config` file flow: loads `claude-agent-config.json` via `loadAgentConfig`, verifies config shape, runs the same fixture
- Validation error tests (extraction-only, faster than full workflow):
  - `MISSING_PROMPT_UNDER_TEST`: fixture with no import statement
  - `MISSING_USER_PROMPT`: fixture with no `userPrompt` field
  - `NO_ASSERTIONS_FOUND`: fixture with no assertion lines
- SudoLang `userPrompt`: happy-path test proving the framework handles SudoLang syntax in `userPrompt`

All error-path tests use Try (not try/catch) and assert the full error.cause object — name, message, code, and testFile.

Fixtures (`source/fixtures/`)

| File | Purpose |
| --- | --- |
| `sum-function-spec.mdc` | Self-contained prompt-under-test spec (no project-specific ai/ dependency) |
| `sum-function-test.sudo` | Primary happy-path fixture (3 assertions) |
| `sudolang-prompt-test.sudo` | Happy-path fixture with SudoLang `userPrompt` |
| `claude-agent-config.json` | Valid agent config for `--agent-config` file-loading flow |
| `no-prompt-under-test.sudo` | Triggers `MISSING_PROMPT_UNDER_TEST` |
| `missing-user-prompt.sudo` | Triggers `MISSING_USER_PROMPT` |
| `no-assertions.sudo` | Triggers `NO_ASSERTIONS_FOUND` |

Config + scripts

- `vitest.config.e2e.js`: dedicated config (`include: ['source/e2e.test.js']`, `testTimeout: 300000`)
- `vitest.config.js`: updated comment on `e2e.test.js` exclusion
- `package.json`: `"test:e2e": "vitest run --config vitest.config.e2e.js"`
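For orientation, a dedicated E2E config with the two settings listed above might look like the sketch below. The `include` and `testTimeout` values are taken from this description; the use of `defineConfig` and the overall file layout are assumptions, and the actual file in the PR may contain more.

```javascript
// vitest.config.e2e.js (sketch; the real file may differ)
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    // Only the E2E suite is collected by this config.
    include: ['source/e2e.test.js'],
    // 300000 ms = 5 minutes per test, to allow for real agent calls.
    testTimeout: 300000,
  },
});
```

With a config like this, `vitest run --config vitest.config.e2e.js` collects only the E2E file, while the default config keeps excluding it from `npm test`.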

Post-consolidation fixups (`source/test-extractor.js`)

- Non-inferring extraction prompt: `buildExtractionPrompt` now includes explicit EXTRACTION RULES instructing the agent to return `""` / `[]` for missing fields rather than synthesizing a `userPrompt` or extracting assertions from imported file contents. This makes `MISSING_USER_PROMPT` and `NO_ASSERTIONS_FOUND` reliably testable end-to-end.
- Agent-attributed error messages: `MISSING_USER_PROMPT` and `NO_ASSERTIONS_FOUND` messages now correctly say "Extraction agent returned…" rather than "Test file does not…", accurately pointing at the source of truth.
- Validation reorder (A1): `userPrompt` and assertions checks now fire before `resolveImportPaths()` IO, so structural errors surface before any filesystem reads.
- `buildJudgePrompt` guard (A2): the CONTEXT (Prompt Under Test) section is now conditionally omitted when `promptUnderTest` is empty, matching the `buildResultPrompt` pattern.
- Mock consistency (A3): the TAP YAML embed in the `ai-runner.test.js` mock uses `JSON.stringify()` for consistency with `extractionResult` and `resultText`.

⚠️ E2E tests: local-only, team decision needed

npm run test:e2e requires an authenticated Claude CLI and must be run locally. These tests do not run in CI by default (no claude binary available in the CI environment).

A team decision is needed on CI strategy, for example:

- Skip e2e in CI permanently (unit coverage is sufficient for most paths)
- Run e2e in CI with a secrets-injected Claude API key on a scheduled basis
- Gate e2e on a manual workflow trigger

Until a decision is made, contributors should run npm run test:e2e locally before merging PRs that touch the extraction prompt, fixtures, or agent config.

Why no deterministic failure fixture?

Deterministic E2E failure tests are not viable with capable LLMs as both result and judge agents — the result agent satisfies requirements from first principles regardless of bad prompt context, and the judge scores the actual output rather than the prompt quality. The failure detection path is fully covered by unit tests with mock agents in ai-runner.test.js. See source/fixtures/README.md for the full rationale.

Test results

npm test         → 190 tests passing
npm run lint     → Lint complete.
npm run ts       → TypeScript check complete.
npm run test:e2e → 6/6 passing (requires Claude auth)

- Add source/e2e.test.js: two Vitest describe blocks covering the full workflow (runAITests + recordTestOutput) and the --agent-config JSON file-loading flow; uses describe.skipIf, onTestFinished cleanup, and extracted timeout constants
- Add sum-function-test.sudo + sum-function-spec.mdc: self-contained fixture that exercises the import/promptUnderTest pipeline without depending on project-specific ai/ rules
- Add claude-agent-config.json: fixture for --agent-config file flow
- Add vitest.config.e2e.js: dedicated config for npm run test:e2e
- Update vitest.config.js: correct stale comment on e2e exclusion
- Add test:e2e script to package.json
- Update fixtures/README.md: accurate descriptions + rationale for omitting a deterministic failure fixture

E2E failure path is proven at unit level (ai-runner.test.js mock agents); deterministic failure fixtures are not viable with capable LLMs as both result and judge agents.

Made-with: Cursor

- Add no-prompt-under-test.sudo, missing-user-prompt.sudo, no-assertions.sudo fixtures to trigger extraction validation errors (MISSING_PROMPT_UNDER_TEST, MISSING_USER_PROMPT, NO_ASSERTIONS_FOUND) through the full E2E pipeline
- Add sudolang-prompt-test.sudo fixture to verify the framework handles SudoLang syntax in the userPrompt field
- Add describe.skipIf blocks for each validation error case and for the SudoLang happy-path scenario
- Replace try/catch error capture with Try helper, consistent with ai-runner.test.js and test-extractor.test.js conventions
- Update fixtures/README.md with new fixture descriptions

Made-with: Cursor

- Replace { name, code } partial assertion shapes with full error?.cause comparisons for all three validation error tests (MISSING_PROMPT_UNDER_TEST, MISSING_USER_PROMPT, NO_ASSERTIONS_FOUND)
- Expected objects now include name, message, code, and testFile: the complete deterministic cause shape from test-extractor.js
- Follows Jan's review principle from PR #409: deterministic functions should assert the complete expected value, not individual properties

Made-with: Cursor

- A1: validate userPrompt/assertions before resolveImportPaths so structural errors surface before any filesystem IO
- A2: guard CONTEXT section in buildJudgePrompt matching existing buildResultPrompt pattern; add test for empty case
- A3: use JSON.stringify for TAP YAML in mock helper, eliminating backtick-in-template fragility

Made-with: Cursor

- Rewrite buildExtractionPrompt with explicit EXTRACTION RULES block: agent must return "" / [] for missing fields rather than inferring userPrompt or assertions from import/context
- Update MISSING_USER_PROMPT and NO_ASSERTIONS_FOUND messages to correctly attribute failures to the agent, not the test file
- Add explicit "return []" fallback to importPaths rule for consistency with rules 1 and 3
- Remove !assertions dead code (parseExtractionResult guarantees an array; only .length === 0 check is needed)
- Update e2e expected messages and test-extractor snapshot test to match revised prompt and error messages
- buildJudgePrompt now conditionally omits CONTEXT section when promptUnderTest is empty, matching buildResultPrompt pattern
- Reorder extractTests validation: structural checks before IO

Made-with: Cursor

Base automatically changed from pr/ai-test-output-cli to ai-testing-framework-implementation-consolidation March 4, 2026 16:58

ianwhitedeveloper added 5 commits March 4, 2026 13:41

ianwhitedeveloper force-pushed the pr/ai-e2e-fixtures branch from 9506069 to 7da79f3 Compare March 4, 2026 19:42

ianwhitedeveloper marked this pull request as ready for review March 4, 2026 20:07

ianwhitedeveloper mentioned this pull request Mar 4, 2026

feat(ai): AI Testing Framework — consolidation staging branch [6/8 → master] #411

Draft

ianwhitedeveloper requested review from ericelliott and janhesters March 4, 2026 20:29

ianwhitedeveloper wants to merge 5 commits into ai-testing-framework-implementation-consolidation from pr/ai-e2e-fixtures
