Add @nitpicker/query and @nitpicker/mcp-server packages #56
Merged
YusukeHirao merged 12 commits into main on Mar 13, 2026
Add two new packages for querying .nitpicker archive files via MCP:

- @nitpicker/query: Archive lifecycle management and 12 query functions (getSummary, listPages, getPageDetail, getPageHtml, listLinks, listResources, listImages, getViolations, findDuplicates, findMismatches, getResourceReferrers, checkHeaders)
- @nitpicker/mcp-server: MCP server exposing 14 tools via stdio transport (open_archive, close_archive + 12 query tools)

Crawler changes:

- Add getKnex() to ArchiveAccessor and Database for SQL-level queries
- Add DB_Image type definition to archive types

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
Add explicit return type to CallToolRequestSchema handler to avoid deep type instantiation, and fix count query type access.

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
…, update docs

- Add unit tests for 7 query functions (get-page-detail, get-page-html, list-links, list-resources, list-images, find-mismatches, get-resource-referrers)
- Add 16 integration tests for mcp-server covering all 14 tools, error handling, and lifecycle management
- Replace unsafe type casts with requireString/optionalNumber validation helpers
- Replace destructuring patterns causing unused variable lint errors with omit helper
- Update ARCHITECTURE.md, CLAUDE.md, and README.md with query and mcp-server packages

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
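
The omit helper named in this commit is not shown in the PR text. As a minimal sketch of the idea, assuming a hypothetical signature: destructuring such as `const { html, ...rest } = row` binds `html` even when it is never read, which trips unused-variable lint rules, whereas a helper returns the remainder without binding the dropped key.

```typescript
// Hypothetical helper: the name comes from the commit message, but the
// signature and body are assumptions, not the code from @nitpicker/query.
// Returns a shallow copy of `obj` without the listed keys.
function omit<T extends object, K extends keyof T>(obj: T, ...keys: K[]): Omit<T, K> {
  const copy = { ...obj };
  for (const key of keys) {
    delete copy[key];
  }
  return copy as Omit<T, K>;
}

// Example row shape; the field names are illustrative only.
const pageRow = { url: "https://example.com/", status: 200, html: "<html></html>" };
const pageWithoutHtml = omit(pageRow, "html");
console.log(Object.keys(pageWithoutHtml));
```

The input object is left untouched, so the same row can still be passed elsewhere with its full set of fields.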
Source code fixes:

- getViolations: rewrite to read analysis/violations file instead of N+1 per-page queries; replace empty catch blocks with proper error handling
- ArchiveManager: fix resource leak in close() by calling archive.close() to destroy DB connection and clean up tmpDir
- mcp-server: add NaN check to optionalNumber(); add runtime enum validation for link type, mismatch type, and duplicate field; replace raw Knex query in open_archive with getSummary()
- check-headers/get-page-detail: replace silent catch blocks with console.warn for JSON parse errors
- All query functions: replace non-null assertion [0]! with optional chaining [0]?.total ?? 0 with explanatory comments

Test improvements:

- Add get-violations.spec.ts (9 tests): filtering, pagination, ENOENT
- Add archive-manager.spec.ts (9 tests): lifecycle, cleanup, error cases
- mcp-server.spec.ts: strengthen assertions (toBe instead of toBeGreaterThanOrEqual), add SDK internal API comment, add enum validation error tests
- list-pages.spec.ts: add 6 missing filter tests (statusMin, statusMax, missingDescription, urlPattern, sortBy/sortOrder, directory)
- list-links.spec.ts: replace weak assertions with precise toMatchObject checks, add pagination test

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
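
The validation changes in this commit can be illustrated with a small sketch. The helper names requireString and optionalNumber come from the commit messages; their bodies here are assumptions, and requireEnum, LINK_TYPES, and the three enum values are hypothetical names introduced only for the example.

```typescript
type ToolArgs = Record<string, unknown>;

// Assumed behavior: reject anything that is not a non-empty string.
function requireString(args: ToolArgs, key: string): string {
  const value = args[key];
  if (typeof value !== "string" || value === "") {
    throw new Error(`"${key}" must be a non-empty string`);
  }
  return value;
}

// Assumed behavior: a missing value is fine, a non-number or NaN is not.
function optionalNumber(args: ToolArgs, key: string): number | undefined {
  const value = args[key];
  if (value === undefined || value === null) {
    return undefined;
  }
  // typeof NaN === "number", so NaN needs its own explicit check.
  if (typeof value !== "number" || Number.isNaN(value)) {
    throw new Error(`"${key}" must be a number`);
  }
  return value;
}

// Hypothetical enum values, used only to demonstrate runtime validation.
const LINK_TYPES = ["broken", "external", "orphaned"] as const;

// Hypothetical helper: narrows a string to one of the allowed literals at runtime.
function requireEnum<T extends string>(args: ToolArgs, key: string, allowed: readonly T[]): T {
  const value = requireString(args, key);
  const match = allowed.find((candidate) => candidate === value);
  if (match === undefined) {
    throw new Error(`"${key}" must be one of: ${allowed.join(", ")}`);
  }
  return match;
}
```

Returning the matched literal from the allowed list, rather than casting the input, keeps the narrowed type honest without a type assertion.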
- list-images.ts: extract oversizedThreshold to local variable to avoid non-null assertion
- list-links.spec.ts: use direct items[0] with toMatchObject instead of .find()
- archive-manager.spec.ts: use hardcoded expected IDs instead of computed values

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
- ArchiveManager: add .nitpicker extension validation to reject arbitrary file types
- ArchiveManager: add MAX_OPEN_ARCHIVES (20) limit to prevent resource exhaustion via unlimited archive opens
- ArchiveManager: log warning on close failure instead of silently swallowing errors
- getViolations: use error.code === 'ENOENT' instead of fragile string matching on error.message
- mcp-server: sanitize error messages to avoid leaking internal file paths (/tmp, /home, /root, /usr)
- Add tests for extension validation and concurrent archive limit

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
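
A rough sketch of the guards described in this commit follows. MAX_OPEN_ARCHIVES and its value of 20 come from the commit message; assertCanOpen and isNotFound are hypothetical names, and the real ArchiveManager may order or phrase these checks differently.

```typescript
const MAX_OPEN_ARCHIVES = 20; // limit stated in the commit message
const ARCHIVE_EXTENSION = ".nitpicker";

// Hypothetical pre-flight check run before any extraction work starts.
function assertCanOpen(filePath: string, openCount: number): void {
  if (!filePath.endsWith(ARCHIVE_EXTENSION)) {
    throw new Error(`Only ${ARCHIVE_EXTENSION} files can be opened`);
  }
  if (openCount >= MAX_OPEN_ARCHIVES) {
    throw new Error(`Too many open archives (limit: ${MAX_OPEN_ARCHIVES})`);
  }
}

// Node.js file-system errors carry a stable `code` property, which is more
// robust than matching on the human-readable message text.
function isNotFound(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    "code" in error &&
    (error as { code?: unknown }).code === "ENOENT"
  );
}
```

Checking the extension and the open count first means a rejected request costs nothing beyond two comparisons.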
- Add file existence check (accessSync) before opening archives
- Resolve symlinks (realpathSync) and re-validate extension to prevent symlink-based path traversal attacks
- Simplify error message sanitization to strip all multi-segment absolute paths
- Add tests for missing file and symlink traversal scenarios

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
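
The PR text does not show the sanitization code, so the following is only one way "strip all multi-segment absolute paths" might be approached. The function name sanitizeErrorMessage, the regular expression, and the `[path]` placeholder are all assumptions, and the sketch covers POSIX-style paths only.

```typescript
// Matches an absolute path with at least two segments (for example /tmp/x/y)
// when it starts the string or follows whitespace, a quote, "(", "=" or ":".
// Requiring that prefix keeps URLs and relative paths out of the match.
const ABSOLUTE_PATH = /(^|[\s'"(=:])((?:\/[\w.@-]+){2,}\/?)/g;

// Hypothetical helper: replaces each matched path with a fixed placeholder.
function sanitizeErrorMessage(message: string): string {
  return message.replace(ABSOLUTE_PATH, "$1[path]");
}

console.log(sanitizeErrorMessage("ENOENT: no such file or directory, open '/tmp/nitpicker-abc/db.sqlite'"));
// prints "ENOENT: no such file or directory, open '[path]'"
```

Matching on path shape rather than on a list of known prefixes such as /tmp or /home means a path under an unexpected directory is still stripped.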
ArchiveManager now deduplicates open calls by resolved real path. When the same .nitpicker file is opened again, the existing extraction and DB connection are shared via reference counting, so no redundant untar is performed. Resources are released only when all references to the same file are closed.

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
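
The reference-counting scheme described in this commit can be sketched in isolation. RefCountedRegistry and its method names are hypothetical; the sketch assumes the caller has already resolved the real path and leaves extraction and DB handling to the injected create and destroy callbacks.

```typescript
interface SharedEntry<T> {
  resource: T;
  refCount: number;
}

// Hypothetical registry keyed by an already-resolved real path.
class RefCountedRegistry<T> {
  private readonly entries = new Map<string, SharedEntry<T>>();
  private readonly create: (realPath: string) => T;
  private readonly destroy: (resource: T) => void;

  constructor(create: (realPath: string) => T, destroy: (resource: T) => void) {
    this.create = create;
    this.destroy = destroy;
  }

  // Reuses the existing resource when the same path is opened again.
  acquire(realPath: string): { resource: T; isNew: boolean } {
    const existing = this.entries.get(realPath);
    if (existing) {
      existing.refCount += 1;
      return { resource: existing.resource, isNew: false };
    }
    const resource = this.create(realPath);
    this.entries.set(realPath, { resource, refCount: 1 });
    return { resource, isNew: true };
  }

  // Returns true only when the last reference was released and cleanup ran.
  release(realPath: string): boolean {
    const entry = this.entries.get(realPath);
    if (!entry) {
      return false;
    }
    entry.refCount -= 1;
    if (entry.refCount > 0) {
      return false;
    }
    this.entries.delete(realPath);
    this.destroy(entry.resource);
    return true;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Because the map is keyed by path rather than by handle, reopening the same file adds a reference without adding an entry, which matches the later test note that reopening does not count toward the unique-file limit.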
- Add explicit expect(archive).toBeDefined() before non-null assertions
- Fix symlink test: use rmSync before symlinkSync instead of try/catch, wrap assertion in try/finally for reliable cleanup
- Rename test to match actual verification: "同じファイルの再オープンは ユニークファイル数の上限にカウントされない" ("reopening the same file does not count toward the unique-file limit")
- Fix closeAll race condition: use sequential loop instead of Promise.all to prevent concurrent close on same shared entry
- Add tmpDir existence check in ref-count partial-close test

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
- ARCHITECTURE.md: add reference counting / dedup description for ArchiveManager
- archive-manager.ts: add missing @throws for file-not-found, clarify @returns includes archive on first open only
- check-headers.ts: fix @param options description to include filter

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
zod was incorrectly listed as a direct dependency of @nitpicker/mcp-server in the lockfile but is not in package.json. This caused yarn install --immutable to fail in CI.

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
Summary
This PR introduces two new packages, @nitpicker/query and @nitpicker/mcp-server, to enable querying .nitpicker archive files.
Key Changes
@nitpicker/query Package
- getSummary() - Site-wide statistics (page counts, status distribution, metadata fulfillment rates)
- listPages() - Pages with rich filtering (status codes, missing metadata, URL patterns, directory paths)
- getPageDetail() - Detailed page information including links and redirects
- listLinks() - Link analysis (broken, external, orphaned pages)
- listImages() - Image inventory with quality issue detection (missing alt, dimensions, lazy-loading)
- listResources() - Sub-resource tracking (CSS, JS, fonts, etc.)
- checkHeaders() - Security header validation (CSP, X-Frame-Options, HSTS, etc.)
- findDuplicates() - Metadata duplication detection
- findMismatches() - Canonical/OG tag mismatches
- getViolations() - Accessibility and validation violations
- getPageHtml() - HTML snapshot retrieval
- getResourceReferrers() - Resource usage tracking
@nitpicker/mcp-server Package
Supporting Changes
- ARCHITECTURE.md: documents the new packages
- README.md: MCP Server setup instructions
- CLAUDE.md: documentation for AI context
- @nitpicker/crawler types: DB_Image interface for image tracking
- cspell.json
Implementation Details
- @modelcontextprotocol/sdk for standards-compliant AI assistant integration

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc