Improve error handling in crawler and add comprehensive test suite #54
Merged
YusukeHirao merged 4 commits into main on Mar 11, 2026
Summary
This PR enhances error handling in the crawler to properly handle AggregateError exceptions from the dealer, adds a comprehensive test suite for error scenarios, and improves worker-level error handling to ensure graceful degradation.
Key Changes
- Refactored deal-level error handling: Extracted the error emission logic into a new #emitDealErrors() method that handles both AggregateError (emitting each inner error as a separate event) and regular errors (emitting a single event). Both start() and startMultiple() now use this method.
- Added worker-level error handling: Wrapped the worker's main processing logic in a try-catch block that catches and emits errors from individual URL processing, so the crawler keeps processing the remaining URLs when one fails.
- Comprehensive test suite: Added 238 lines of test coverage (crawler.spec.ts) covering:
  - AggregateError handling with multiple errors being emitted as individual events
  - AggregateError being converted to Error instances
  - crawlEnd event emission after deal failures
  - start() and startMultiple() methods
- Updated documentation: Clarified in CLAUDE.md and ARCHITECTURE.md that the core package uses a bounded Promise pool rather than deal() for parallel processing, and noted that @d-zero/dealer is used for progress display via Lanes.
Implementation Details
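To illustrate the AggregateError fan-out that #emitDealErrors() performs, here is a minimal TypeScript sketch. It is not the crawler's actual code: the free function emitDealErrors, the DealErrorEvent type, and the emit callback are assumptions made for illustration, as is the conversion of non-Error values via String().

```typescript
// Hypothetical event payload; the real crawler's event carries more context.
type DealErrorEvent = { error: Error };

// Sketch of the fan-out: one event per inner error of an AggregateError,
// or a single event for any other thrown value.
function emitDealErrors(
  error: unknown,
  emit: (event: DealErrorEvent) => void,
): number {
  const inner: unknown[] =
    error instanceof AggregateError ? error.errors : [error];
  for (const item of inner) {
    // Assumption: non-Error values are wrapped so listeners always get an Error.
    emit({ error: item instanceof Error ? item : new Error(String(item)) });
  }
  // Returns how many events were emitted.
  return inner.length;
}
```

Under these assumptions, a caller such as start() or startMultiple() would wrap its deal step in try/catch, pass the caught value to this helper, and then emit crawlEnd regardless of the outcome.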
- The #emitDealErrors() method checks whether the error is an AggregateError and iterates through its errors array; otherwise it treats the error as a single failure.
- Worker errors are passed to handleScrapeError() for state management, then emitted as error events with proper context (URL, external flag, etc.).
- Error events include the pid, isMainProcess, url, isExternal, and error fields.
- Updated @d-zero/dealer (1.6.3 → 1.7.0) and @d-zero/shared (0.20.0 → 0.20.1) across all affected packages.
https://claude.ai/code/session_01DZApYkRAury35FhWGY72xf
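The worker-level handling can be pictured as a bounded Promise pool in which each worker wraps its per-URL work in try/catch. The sketch below is an illustration under stated assumptions, not the package's implementation: runPool, PoolOptions, and the scrape and isExternal callbacks are hypothetical names, and only the event field names (pid, isMainProcess, url, isExternal, error) and handleScrapeError() come from the description above.

```typescript
// Field names follow the PR description; the types are assumptions.
type WorkerErrorEvent = {
  pid: number;
  isMainProcess: boolean;
  url: string;
  isExternal: boolean;
  error: Error;
};

// Hypothetical options bag for the sketch.
type PoolOptions = {
  pid: number;
  isMainProcess: boolean;
  scrape: (url: string) => Promise<void>;
  isExternal: (url: string) => boolean;
  handleScrapeError: (url: string, error: Error) => void;
  onError: (event: WorkerErrorEvent) => void;
};

// Runs at most `concurrency` workers that drain a shared queue.
async function runPool(
  urls: readonly string[],
  concurrency: number,
  options: PoolOptions,
): Promise<void> {
  const queue = [...urls];
  const worker = async (): Promise<void> => {
    for (let url = queue.shift(); url !== undefined; url = queue.shift()) {
      try {
        await options.scrape(url);
      } catch (caught) {
        const error = caught instanceof Error ? caught : new Error(String(caught));
        // Update state first, then report; the loop moves on to the next URL.
        options.handleScrapeError(url, error);
        options.onError({
          pid: options.pid,
          isMainProcess: options.isMainProcess,
          url,
          isExternal: options.isExternal(url),
          error,
        });
      }
    }
  };
  const size = Math.max(1, Math.min(concurrency, urls.length));
  await Promise.all(Array.from({ length: size }, () => worker()));
}
```

Because the catch sits inside the loop, a failure on one URL costs only that URL: the worker picks up the next queued item instead of rejecting the whole pool.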