Add @nitpicker/query and @nitpicker/mcp-server packages by YusukeHirao · Pull Request #56 · d-zero-dev/nitpicker

YusukeHirao · 2026-03-13T02:38:05Z

Summary

This PR introduces two new packages to enable querying .nitpicker archive files:

@nitpicker/query - A query API library providing SQL-level filtering, aggregation, and pagination for archive data
@nitpicker/mcp-server - A Model Context Protocol (MCP) server exposing archive queries to AI assistants like Claude

Key Changes

@nitpicker/query Package

ArchiveManager: Manages lifecycle of opened archives with reference counting to prevent resource exhaustion (max 20 concurrent archives)
Query Functions: 12 specialized query functions for archive analysis:
- getSummary() - Site-wide statistics (page counts, status distribution, metadata fulfillment rates)
- listPages() - Pages with rich filtering (status codes, missing metadata, URL patterns, directory paths)
- getPageDetail() - Detailed page information including links and redirects
- listLinks() - Link analysis (broken, external, orphaned pages)
- listImages() - Image inventory with quality issue detection (missing alt, dimensions, lazy-loading)
- listResources() - Sub-resource tracking (CSS, JS, fonts, etc.)
- checkHeaders() - Security header validation (CSP, X-Frame-Options, HSTS, etc.)
- findDuplicates() - Metadata duplication detection
- findMismatches() - Canonical/OG tag mismatches
- getViolations() - Accessibility and validation violations
- getPageHtml() - HTML snapshot retrieval
- getResourceReferrers() - Resource usage tracking
Type Definitions: Comprehensive TypeScript interfaces for all query options and results
SQL Optimization: All queries use Knex.js with database-level filtering for performance on large datasets (10,000+ pages)

@nitpicker/mcp-server Package

MCP Server Implementation: Stdio-based MCP server exposing all query functions as tools
Tool Definitions: 14 MCP tools with JSON Schema input validation and LLM-friendly descriptions
Argument Validation: Type-safe extraction and validation of tool arguments with helpful error messages
Enum Validation: Validates link types, mismatch types, and duplicate fields
Comprehensive Tests: Full test suite covering all tools and edge cases

Supporting Changes

Updated ARCHITECTURE.md to document new packages
Updated README.md with MCP Server setup instructions
Added CLAUDE.md documentation for AI context
Extended @nitpicker/crawler types with DB_Image interface for image tracking
Added security header keyword to cspell.json

Implementation Details

Reference Counting: Archives opened multiple times reuse the same extraction, preventing redundant untarring
Resource Limits: Maximum 20 concurrent open archives to prevent file descriptor exhaustion
SQL-Level Filtering: All filtering, sorting, and pagination happens at the database level for efficiency
Temporary Directory Management: Automatic cleanup of extracted archives when all references are closed
MCP Protocol: Uses @modelcontextprotocol/sdk for standards-compliant AI assistant integration
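
The reference counting and resource limit described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual ArchiveManager: `RefCountedRegistry`, `acquire`, `release`, and the injected `create`/`destroy` callbacks are hypothetical names, and only the limit of 20 is taken from this PR.

```typescript
// Minimal sketch of reference counting with a cap on distinct entries.
// All names are hypothetical; only the limit of 20 comes from the PR text.
const MAX_OPEN_ARCHIVES = 20;

interface Entry<T> {
  resource: T;
  refCount: number;
}

class RefCountedRegistry<T> {
  private readonly entries = new Map<string, Entry<T>>();
  private readonly create: (key: string) => T;
  private readonly destroy: (resource: T) => void;
  private readonly maxEntries: number;

  constructor(
    create: (key: string) => T,
    destroy: (resource: T) => void,
    maxEntries: number = MAX_OPEN_ARCHIVES,
  ) {
    this.create = create;
    this.destroy = destroy;
    this.maxEntries = maxEntries;
  }

  // Reuses an existing entry for the same key (e.g. a resolved real path);
  // only a new key counts toward the limit.
  acquire(key: string): T {
    const existing = this.entries.get(key);
    if (existing) {
      existing.refCount += 1;
      return existing.resource;
    }
    if (this.entries.size >= this.maxEntries) {
      throw new Error(`Too many open archives (max ${this.maxEntries})`);
    }
    const resource = this.create(key);
    this.entries.set(key, { resource, refCount: 1 });
    return resource;
  }

  // Returns true only when the last reference was released and the
  // underlying resource was destroyed.
  release(key: string): boolean {
    const entry = this.entries.get(key);
    if (!entry) {
      throw new Error(`Not open: ${key}`);
    }
    entry.refCount -= 1;
    if (entry.refCount > 0) {
      return false;
    }
    this.entries.delete(key);
    this.destroy(entry.resource);
    return true;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

In this sketch the extraction and cleanup steps are passed in as callbacks so that the counting logic stays independent of the file system.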

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc

Add two new packages for querying .nitpicker archive files via MCP:
- @nitpicker/query: Archive lifecycle management and 12 query functions (getSummary, listPages, getPageDetail, getPageHtml, listLinks, listResources, listImages, getViolations, findDuplicates, findMismatches, getResourceReferrers, checkHeaders)
- @nitpicker/mcp-server: MCP server exposing 14 tools via stdio transport (open_archive, close_archive + 12 query tools)

Crawler changes:
- Add getKnex() to ArchiveAccessor and Database for SQL-level queries
- Add DB_Image type definition to archive types

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc

Add explicit return type to CallToolRequestSchema handler to avoid deep type instantiation, and fix count query type access. https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc

…, update docs
- Add unit tests for 7 query functions (get-page-detail, get-page-html, list-links, list-resources, list-images, find-mismatches, get-resource-referrers)
- Add 16 integration tests for mcp-server covering all 14 tools, error handling, and lifecycle management
- Replace unsafe type casts with requireString/optionalNumber validation helpers
- Replace destructuring patterns causing unused variable lint errors with omit helper
- Update ARCHITECTURE.md, CLAUDE.md, and README.md with query and mcp-server packages

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc

Source code fixes:
- getViolations: rewrite to read analysis/violations file instead of N+1 per-page queries; replace empty catch blocks with proper error handling
- ArchiveManager: fix resource leak in close() by calling archive.close() to destroy DB connection and clean up tmpDir
- mcp-server: add NaN check to optionalNumber(); add runtime enum validation for link type, mismatch type, and duplicate field; replace raw Knex query in open_archive with getSummary()
- check-headers/get-page-detail: replace silent catch blocks with console.warn for JSON parse errors
- All query functions: replace non-null assertion [0]! with optional chaining [0]?.total ?? 0 with explanatory comments

Test improvements:
- Add get-violations.spec.ts (9 tests): filtering, pagination, ENOENT
- Add archive-manager.spec.ts (9 tests): lifecycle, cleanup, error cases
- mcp-server.spec.ts: strengthen assertions (toBe instead of toBeGreaterThanOrEqual), add SDK internal API comment, add enum validation error tests
- list-pages.spec.ts: add 6 missing filter tests (statusMin, statusMax, missingDescription, urlPattern, sortBy/sortOrder, directory)
- list-links.spec.ts: replace weak assertions with precise toMatchObject checks, add pagination test

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
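
The argument validation mentioned in this commit (requireString, optionalNumber with a NaN check, and runtime enum validation) might look roughly like the sketch below. The helper names requireString and optionalNumber appear in the commit message, but their signatures here are assumptions; `optionalEnum`, `Args`, and the `LINK_TYPES` values are hypothetical and only loosely based on the link categories named in the PR description.

```typescript
// Sketch of type-safe extraction of MCP tool arguments.
// Signatures are assumed; optionalEnum and LINK_TYPES are hypothetical.
type Args = Record<string, unknown>;

function requireString(args: Args, key: string): string {
  const value = args[key];
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`"${key}" is required and must be a non-empty string`);
  }
  return value;
}

function optionalNumber(args: Args, key: string): number | undefined {
  const value = args[key];
  if (value === undefined || value === null) {
    return undefined;
  }
  // typeof NaN is "number", so NaN needs its own check.
  if (typeof value !== "number" || Number.isNaN(value)) {
    throw new Error(`"${key}" must be a number`);
  }
  return value;
}

function optionalEnum<T extends string>(
  args: Args,
  key: string,
  allowed: readonly T[],
): T | undefined {
  const value = args[key];
  if (value === undefined) {
    return undefined;
  }
  if (typeof value !== "string" || !allowed.includes(value as T)) {
    throw new Error(`"${key}" must be one of: ${allowed.join(", ")}`);
  }
  return value as T;
}

// Hypothetical enum values for illustration only.
const LINK_TYPES = ["broken", "external", "orphaned"] as const;
```

Throwing a descriptive error at the boundary lets the server return a helpful message to the client instead of passing an unchecked value into a query.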

- list-images.ts: extract oversizedThreshold to local variable to avoid non-null assertion
- list-links.spec.ts: use direct items[0] with toMatchObject instead of .find()
- archive-manager.spec.ts: use hardcoded expected IDs instead of computed values

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc

- ArchiveManager: add .nitpicker extension validation to reject arbitrary file types
- ArchiveManager: add MAX_OPEN_ARCHIVES (20) limit to prevent resource exhaustion via unlimited archive opens
- ArchiveManager: log warning on close failure instead of silently swallowing errors
- getViolations: use error.code === 'ENOENT' instead of fragile string matching on error.message
- mcp-server: sanitize error messages to avoid leaking internal file paths (/tmp, /home, /root, /usr)
- Add tests for extension validation and concurrent archive limit

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
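
The move from matching on error.message to checking error.code can be illustrated with a small type guard. This is a sketch only; `hasErrorCode` is a hypothetical helper name, not something taken from the PR.

```typescript
// Checks the machine-readable `code` property instead of the message text.
// hasErrorCode is a hypothetical helper for illustration.
function hasErrorCode(error: unknown, code: string): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    "code" in error &&
    (error as { code?: unknown }).code === code
  );
}

// Usage sketch: treat a missing file as "no data" and rethrow anything else.
function handleReadError(error: unknown): never[] {
  if (hasErrorCode(error, "ENOENT")) {
    return [];
  }
  throw error;
}
```

Message text can vary between platforms and runtime versions, whereas the `code` property on Node.js system errors is intended for programmatic checks.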

- Add file existence check (accessSync) before opening archives
- Resolve symlinks (realpathSync) and re-validate extension to prevent symlink-based path traversal attacks
- Simplify error message sanitization to strip all multi-segment absolute paths
- Add tests for missing file and symlink traversal scenarios

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
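
One possible way to strip multi-segment absolute paths from error messages is sketched below. The regular expression, the `<path>` placeholder, and the name `sanitizeErrorMessage` are assumptions made for illustration; the PR does not show the actual pattern used.

```typescript
// Matches an absolute POSIX path with two or more segments, e.g. /tmp/x/y.
// The leading group requires the path to start the string or to follow
// whitespace, a quote, "(" or "=", so URL paths such as
// https://example.com/a/b are left alone.
const ABSOLUTE_PATH = /(^|[\s'"(=])(?:\/[^\s\/:'"]+){2,}\/?/g;

// Hypothetical helper: replaces each matched path with a fixed placeholder.
function sanitizeErrorMessage(message: string): string {
  return message.replace(ABSOLUTE_PATH, "$1<path>");
}
```

Single-segment values such as `/pages` are left untouched in this sketch, since they are more likely to be site URLs than internal file system locations.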

ArchiveManager now deduplicates open calls by resolved real path. When the same .nitpicker file is opened again, the existing extraction and DB connection are shared via reference counting — no redundant untar is performed. Resources are released only when all references to the same file are closed. https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc

- Add explicit expect(archive).toBeDefined() before non-null assertions
- Fix symlink test: use rmSync before symlinkSync instead of try/catch, wrap assertion in try/finally for reliable cleanup
- Rename test to match actual verification: "同じファイルの再オープンはユニークファイル数の上限にカウントされない" ("Reopening the same file does not count toward the unique-file limit")
- Fix closeAll race condition: use sequential loop instead of Promise.all to prevent concurrent close on same shared entry
- Add tmpDir existence check in ref-count partial-close test

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc
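
The closeAll fix can be illustrated with a simplified model of a shared entry. The sketch below shows one way a concurrent close could go wrong and how a sequential loop avoids it; it is not the actual ArchiveManager code, and `SharedEntry`, `closeHandle`, and `closeAllSequential` are hypothetical names.

```typescript
// Hypothetical shared entry: several handle IDs may point at the same entry.
interface SharedEntry {
  refCount: number;
  releaseCalls: number;
}

// Simulated async close: reads the count, yields, then writes it back.
// If two of these run concurrently on the same entry (as with Promise.all),
// both read the same starting count and the entry never reaches zero.
async function closeHandle(entry: SharedEntry): Promise<void> {
  const remaining = entry.refCount - 1;
  await Promise.resolve(); // stands in for async teardown work
  entry.refCount = remaining;
  if (entry.refCount === 0) {
    entry.releaseCalls += 1; // stands in for DB close and tmpDir cleanup
  }
}

// Sequential variant: each close finishes before the next one starts.
async function closeAllSequential(
  handles: Map<string, SharedEntry>,
): Promise<void> {
  for (const id of [...handles.keys()]) {
    const entry = handles.get(id);
    if (entry) {
      await closeHandle(entry);
      handles.delete(id);
    }
  }
}
```

With two handles sharing one entry, the sequential loop takes the count from 2 to 1 to 0 and releases the entry exactly once.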

@throws

- ARCHITECTURE.md: add reference counting / dedup description for ArchiveManager
- archive-manager.ts: add missing @throws for file-not-found, clarify @returns includes archive on first open only
- check-headers.ts: fix @param options description to include filter

https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc

zod was incorrectly listed as a direct dependency of @nitpicker/mcp-server in the lockfile but is not in package.json. This caused yarn install --immutable to fail in CI. https://claude.ai/code/session_01XmSXeM4Jx8rzxwzu6GSvGc

claude added 12 commits March 11, 2026 08:59

fix: resolve TS2589 and TS2339 build errors in mcp-server

d20c8ad

docs: add README.md for query and mcp-server packages

e20bd29

fix: remove stale zod entry from lockfile

658fc38

YusukeHirao merged commit 59e2282 into main Mar 13, 2026
3 checks passed

YusukeHirao deleted the claude/implement-archive-mcp-server-8XLGZ branch March 13, 2026 02:53

YusukeHirao linked an issue Mar 13, 2026 that may be closed by this pull request

feat: new implementation of an MCP server for querying .nitpicker archives #21

Closed

YusukeHirao merged 12 commits into main from claude/implement-archive-mcp-server-8XLGZ
