Improve error handling in crawler and add comprehensive test suite#54

Merged
YusukeHirao merged 4 commits into main from
claude/update-dealer-dependency-RsQAB
Mar 11, 2026

Conversation

@YusukeHirao
Member

Summary

This PR enhances error handling in the crawler to properly handle AggregateError exceptions from the dealer, adds a comprehensive test suite for error scenarios, and improves worker-level error handling to ensure graceful degradation.

Key Changes

  • Refactored deal-level error handling: Extracted error emission logic into a new #emitDealErrors() method that handles both AggregateError (emitting each inner error as a separate event) and regular errors (emitting a single event). Both start() and startMultiple() now call this method.

  • Added worker-level error handling: Wrapped the worker's main processing logic in a try-catch block to catch and emit errors from individual URL processing, allowing the crawler to continue processing remaining URLs even when one fails.

  • Comprehensive test suite: Added 238 lines of tests (crawler.spec.ts) covering:

    • AggregateError handling with multiple errors being emitted as individual events
    • Non-Error values within AggregateError being converted to Error instances
    • Regular Error handling as a single event
    • crawlEnd event emission after deal failures
    • Worker-level exception handling and continuation
    • Both start() and startMultiple() methods
  • Updated documentation: Clarified in CLAUDE.md and ARCHITECTURE.md that the core package uses a bounded Promise pool rather than deal() for parallel processing, and noted that @d-zero/dealer is used for progress display via Lanes.
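The worker-level handling described above can be illustrated with a minimal sketch of a bounded Promise pool in which each worker wraps its per-URL work in try-catch. This is not the crawler's actual code: `runPool`, `processUrl`, and `onError` are hypothetical names introduced here, and the real worker also routes failures through handleScrapeError() before emitting.

```typescript
// Hypothetical callback standing in for the crawler's error event emission.
type ErrorHandler = (url: string, error: Error) => void;

// Process `urls` with at most `limit` concurrent workers.
// A failure on one URL is reported through `onError` and does not stop the pool.
async function runPool(
	urls: readonly string[],
	limit: number,
	processUrl: (url: string) => Promise<void>,
	onError: ErrorHandler,
): Promise<void> {
	let next = 0; // index of the next URL to hand out, shared by all workers

	const worker = async (): Promise<void> => {
		while (next < urls.length) {
			const url = urls[next++];
			if (url === undefined) break;
			try {
				await processUrl(url);
			} catch (error) {
				// Normalize non-Error throwables, report, then move on to the next URL.
				onError(url, error instanceof Error ? error : new Error(String(error)));
			}
		}
	};

	const workerCount = Math.min(limit, urls.length);
	await Promise.all(Array.from({ length: workerCount }, () => worker()));
}
```

Because the catch sits inside the loop, a rejected `processUrl` call costs only that one URL; the worker picks up the next index and the `Promise.all` still resolves once every URL has been attempted.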

Implementation Details

  • The #emitDealErrors() method checks whether the error is an AggregateError and, if so, iterates through its errors array; otherwise it treats the error as a single failure.
  • Worker errors are caught and passed to handleScrapeError() for state management, then emitted as error events with proper context (URL, external flag, etc.).
  • All error events maintain consistent structure with pid, isMainProcess, url, isExternal, and error fields.
  • Dependencies updated: @d-zero/dealer (1.6.3 → 1.7.0) and @d-zero/shared (0.20.0 → 0.20.1) across all affected packages.
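A rough sketch of the AggregateError unwrapping described above, written as a standalone function rather than the private #emitDealErrors() method. The event fields follow the list in this description; `CrawlErrorEvent`, `ErrorContext`, `toError`, and the `emit` callback are hypothetical names, and the real method's signature may differ. It assumes a TypeScript target whose lib includes ES2021 so that `AggregateError` is typed.

```typescript
// Event shape using the fields named in this PR description.
interface CrawlErrorEvent {
	pid: number | undefined;
	isMainProcess: boolean;
	url: string;
	isExternal: boolean;
	error: Error;
}

// Hypothetical context: every field of the event except the error itself.
type ErrorContext = Omit<CrawlErrorEvent, 'error'>;

// Convert any thrown value into an Error instance.
function toError(value: unknown): Error {
	return value instanceof Error ? value : new Error(String(value));
}

// Emit one event per inner error of an AggregateError, or a single event otherwise.
// Returns the number of events emitted.
function emitDealErrors(
	error: unknown,
	context: ErrorContext,
	emit: (event: CrawlErrorEvent) => void,
): number {
	const inner: unknown[] = error instanceof AggregateError ? error.errors : [error];
	for (const item of inner) {
		emit({ ...context, error: toError(item) });
	}
	return inner.length;
}
```

Called with `new AggregateError([new Error('a'), 'b'])`, this sketch emits two events, the second carrying an Error whose message is `'b'`; called with a plain Error, it emits one.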

https://claude.ai/code/session_01DZApYkRAury35FhWGY72xf

2 participants