feat: add evaluations UI with eval sets, runs, and evaluators #87

Open
cristipufu wants to merge 2 commits into main from feat/eval-ui-overhaul

@cristipufu
Member

Summary

  • Add full evaluations UI: eval sets (with I/O and evaluators tabs), eval runs (with score/I/O/logs tabs and trace tree), and evaluators management (create/edit forms, category filtering, card layout)
  • Add backend support: eval data models, eval service, REST + WebSocket routes for eval sets, runs, and evaluators
  • Restructure sidebar layout with shared header, activity bar, and section-specific content panels
  • Add resizable split-pane layout with slide in/out animations, route-driven item selection, and auto-select on navigation
  • Bump version to 0.0.61

Test plan

  • Verify eval sets page loads with item grid, I/O and Evaluators tabs in sidebar
  • Verify eval runs page shows results grid with score/I/O/logs tabs and trace tree
  • Verify evaluators page with create/edit forms, category filtering, and card toggle
  • Verify sidebar resize drag works smoothly without lag
  • Verify sidebar slide in/out animations work on item selection/deselection
  • Verify route-driven navigation and auto-select behavior
  • Verify debug page is completely unchanged

🤖 Generated with Claude Code

cristipufu and others added 2 commits February 24, 2026 09:54
- Add eval data models, service layer, and REST/WS routes for eval sets, runs, and evaluators
- Add frontend eval pages: eval set detail with I/O and evaluators tabs, eval run results with score/I/O/logs tabs and trace tree
- Add evaluators management with create/edit forms, category filtering, and card-based layout
- Restructure sidebar: shared header with activity bar, section-specific content panels
- Add resizable split-pane layout with slide in/out animations and drag handle
- Route-driven item selection for eval runs (#/evals/runs/:id/:itemName)
- Auto-select latest run or first eval set on navigation
- Bump version to 0.0.61

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
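
To illustrate the route-driven selection and auto-select behavior described in the commit above, here is a minimal sketch of parsing the `#/evals/runs/:id/:itemName` hash route. It is written in Python purely for illustration; the PR's frontend code is not shown here, and the names `RunRoute`, `parse_run_route`, and `resolve_run_id` are hypothetical. It also assumes that both route segments are optional and that runs are listed newest first.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class RunRoute(NamedTuple):
    """Hypothetical parsed form of '#/evals/runs/:id/:itemName'."""
    run_id: Optional[str]
    item_name: Optional[str]


def parse_run_route(hash_fragment: str) -> Optional[RunRoute]:
    """Return a RunRoute for eval-run routes, or None for any other route."""
    # Split first, then decode each segment, so an encoded '/' inside an
    # item name cannot be mistaken for a path separator.
    parts = [unquote(p) for p in hash_fragment.lstrip("#").strip("/").split("/")]
    if parts[:2] != ["evals", "runs"] or len(parts) > 4:
        return None
    run_id = parts[2] if len(parts) > 2 else None
    item_name = parts[3] if len(parts) > 3 else None
    return RunRoute(run_id, item_name)


def resolve_run_id(route: RunRoute, run_ids_newest_first: list[str]) -> Optional[str]:
    """Auto-select the latest run when the route does not name one."""
    if route.run_id is not None:
        return route.run_id
    # Assumption: the caller passes run ids ordered newest first.
    return run_ids_newest_first[0] if run_ids_newest_first else None
```

Keeping the selection in the route rather than in component state means a reload or a shared link restores the same run and item, which may be why the commit describes selection as route-driven.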
…reation

- Add POST /eval-sets/{set_id}/items endpoint to append items to eval sets
- Add AddToEvalModal component for adding completed run I/O to eval sets
- Fix create_eval_set to include id and version "1.0" fields required by EvaluationSet
- Fix add_eval_item to include item id (uuid) required by EvaluationItem
- Fix create_local_evaluator to omit empty defaultEvaluationCriteria
  (empty dict fails Pydantic validation for evaluators requiring expectedOutput)
- Add incrementEvalSetCount store action for sidebar count updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
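
The three payload fixes listed in the commit above can be sketched roughly as follows. This is not the PR's code: the helper names are hypothetical, and every field other than `id`, `version`, and `defaultEvaluationCriteria` is an assumption made for illustration. The commit states that the item id is a uuid; using a uuid for the eval set id as well is an assumption.

```python
import uuid
from typing import Any, Optional


def build_eval_set_payload(name: str) -> dict[str, Any]:
    """Build an eval set dict carrying the required id and version fields."""
    return {
        "id": str(uuid.uuid4()),  # assumption: set ids are uuid strings
        "version": "1.0",         # version value named in the commit message
        "name": name,             # hypothetical field
        "items": [],              # hypothetical field
    }


def append_eval_item(eval_set: dict[str, Any], inputs: dict[str, Any]) -> dict[str, Any]:
    """Append an item with its own uuid id and return it."""
    item = {"id": str(uuid.uuid4()), "inputs": inputs}  # "inputs" is hypothetical
    eval_set["items"].append(item)
    return item


def build_evaluator_payload(
    name: str, criteria: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Omit defaultEvaluationCriteria entirely when it is None or empty."""
    payload: dict[str, Any] = {"id": str(uuid.uuid4()), "name": name}
    if criteria:  # skips both None and {}
        payload["defaultEvaluationCriteria"] = criteria
    return payload
```

Leaving the key out, rather than sending `{}`, lets the receiving model treat the criteria as absent instead of validating an empty object against a schema that requires `expectedOutput`.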