feat: static embedding for serverless buckaroo rendering #560

Open
paddymul wants to merge 14 commits into main from feat/static-embed

@paddymul
Collaborator

Summary

Closes #553. Adds a static-only rendering path: Python generates a self-contained artifact with parquet-b64 serialized data, JS renders it without any server connection.

  • buckaroo.prepare_buckaroo_artifact(df) — generates a dict with df_data, df_viewer_config, summary_stats_data, all parquet b64
  • buckaroo.to_html(df) — generates a complete HTML page referencing static-embed.js
  • BuckarooStaticTable React component — resolves parquet b64 payloads, renders DFViewer
  • static-embed.tsx esbuild entry point for static HTML pages
  • Registers AG-Grid v35 modules (PinnedRowModule, CellStyleModule, etc.) for full feature support
  • Handles hyparquet BigInt→Number conversion in parseParquetRow

Files changed

  • buckaroo/artifact.py — Python artifact generation (pandas, polars, file paths)
  • packages/buckaroo-js-core/src/components/BuckarooStaticTable.tsx — React component
  • packages/js/static-embed.tsx — esbuild entry point
  • packages/buckaroo-js-core/src/index.ts — added named exports
  • packages/buckaroo-js-core/src/components/DFViewerParts/resolveDFData.ts — BigInt fix
  • packages/buckaroo-js-core/src/components/DFViewerParts/DFViewerInfinite.tsx — AG-Grid v35 module registration

Test plan

  • 17 Python unit tests (tests/unit/artifact_test.py) — parquet b64 format, JSON serialization, polars, file paths
  • 6 Playwright integration tests (pw-tests/static-embed.spec.ts) — table renders, correct headers, data values, summary stats pinned rows with dtype info, histogram row present
  • Existing test suites pass (basic_widget, serialization_utils, widget_extension)

🤖 Generated with Claude Code

@github-actions
github-actions bot commented Feb 23, 2026

📦 TestPyPI package published

pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.12.12.dev22418221000

or with uv:

uv pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.12.12.dev22418221000

MCP server for Claude Code

claude mcp add buckaroo-table -- uvx --from "buckaroo[mcp]==0.12.12.dev22418221000" --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo-table

paddymul and others added 10 commits February 23, 2026 22:56
Add --minify and --sourcemap flags to the widget.js build command.
widget.js drops from 3.5 MB to 1.8 MB. standalone.js also gets --sourcemap.
Add *.js.map to .gitignore for the new sourcemaps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bump all @ag-grid-community/* packages from ^32.3.3 to ^32.3.9
to pick up bug fixes before the v33 migration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace lodash (CJS) with lodash-es (ESM) as a runtime dependency.
lodash-es allows bundlers (vite, esbuild) to tree-shake unused functions.
Keep lodash in devDependencies for Jest (mapped via moduleNameMapper).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace `import * as _ from "lodash-es"` with named imports like
`import { map, filter, keys } from "lodash-es"` across all files.
Remove unused lodash imports from 10 files that didn't use any
lodash functions. This ensures all bundlers (including esbuild)
can tree-shake unused lodash functions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consolidate 6 @ag-grid-community/* packages into 2 unified packages
(ag-grid-community + ag-grid-react). v33 improves internal tree-shaking.

- Rewrite all imports across 16 source files
- Rename CSS variables: --ag-grid-size → --ag-spacing,
  --ag-alpine-active-color → --ag-accent-color,
  --ag-range-selection-background-color-3 → --ag-range-selection-background-color
- ESM bundle: 2,277 KB → 1,722 KB (-24.4%)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Zero breaking changes. Version bump only.
ESM bundle: 1,722 KB → 1,773 KB (+3%)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add suppressNoRowsOverlay to infinite row model grid options
(v35 adds default overlays that flash during data loading).
ESM bundle: 1,773 KB → 1,827 KB (+3%)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add prepare_buckaroo_artifact() and to_html() for generating self-contained
static HTML tables without a notebook kernel or server. Both df_data and
summary_stats are serialized as parquet b64 for compact transport.

Python: buckaroo/artifact.py with pandas, polars, and file path support.
JS: BuckarooStaticTable component + static-embed.tsx esbuild entry point.
Fix: register AG-Grid v35 modules (PinnedRow, CellStyle, etc.) and handle
hyparquet BigInt→Number conversion in parseParquetRow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove stale per-project pnpm-lock.yaml that still referenced
@ag-grid-community v32 packages. Align deploy-ghpage and
publish-storybook workflows with checks.yml by using explicit
pnpm install --frozen-lockfile from the workspace root.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Deploy step was pushing to gh-pages on PR builds, racing with
the deploy-ghpage workflow and causing push failures. Only deploy
when merging to main.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@paddymul
Collaborator Author

@codex

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9d46a541ab

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

</head>
<body>
<div id="root"></div>
<script>window.__BUCKAROO_ARTIFACT__ = {artifact_json};</script>

P1: Escape embedded artifact JSON before injecting into script

The HTML template writes artifact_json directly into an inline <script> tag, but json.dumps does not escape </script> sequences; if a dataframe-derived field in df_viewer_config (for example a column/header name) contains that substring, the browser terminates the script early and treats the remainder as executable HTML/JS. This creates an XSS vector whenever untrusted data is rendered to static HTML.
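
One common way to close this gap, shown here as a minimal sketch rather than the PR's actual fix, is to post-process the `json.dumps` output so that `<`, `>` and `&` are emitted as `\uXXXX` escapes. In JSON these characters can only occur inside string literals, so the escaped text parses back to the same value. The helper name `dumps_for_inline_script` is hypothetical and is not part of buckaroo.

```python
import json


def dumps_for_inline_script(obj) -> str:
    """Serialize obj to JSON that cannot terminate an inline <script> element.

    Hypothetical helper for illustration; not part of buckaroo's API.
    """
    text = json.dumps(obj)
    # "<", ">" and "&" only ever appear inside JSON string literals, so
    # swapping them for \uXXXX escapes keeps the document valid JSON while
    # removing "</script>" and "<!--" sequences from the raw HTML.
    return (
        text.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


if __name__ == "__main__":
    payload = {"column": "</script><script>alert(1)</script>"}
    print(dumps_for_inline_script(payload))
```

Under this assumption the template would interpolate the escaped string in place of the raw `artifact_json`; the browser's JavaScript parser decodes the escapes, so the value assigned to `window.__BUCKAROO_ARTIFACT__` would be unchanged.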

Comment on lines +65 to +67
    } else if (typeof val === 'bigint') {
        // hyparquet decodes INT64 as BigInt; convert to Number for JSON compat
        parsed[key] = Number(val);

P1: Preserve INT64 precision when decoding parquet BigInt values

Converting hyparquet bigint values with Number(...) silently loses precision for values above 2^53-1, so static embeds can corrupt large integer IDs/timestamps and then sort/filter on incorrect values. This regression is introduced by coercing every bigint cell to a JS number during parquet row parsing.
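
The limit described here comes from IEEE 754 doubles, which Python's `float` shares with JavaScript's `Number`, so the effect and one possible policy can be sketched in Python. This is only an illustration: the real change would belong in `parseParquetRow` on the TypeScript side, and the helper name `int64_to_json_safe` and the string fallback are assumptions, not something the PR specifies.

```python
# Same value as JavaScript's Number.MAX_SAFE_INTEGER.
MAX_SAFE_INTEGER = 2**53 - 1


def int64_to_json_safe(value: int):
    """Keep value numeric only when a double can hold it exactly.

    Hypothetical policy for illustration: integers outside the safe range
    are returned as decimal strings so that no digits are lost.
    """
    if -MAX_SAFE_INTEGER <= value <= MAX_SAFE_INTEGER:
        return value
    return str(value)


if __name__ == "__main__":
    big_id = 2**53 + 1
    # A double cannot tell these two integers apart.
    print(float(big_id) == float(2**53))
    print(int64_to_json_safe(42), int64_to_json_safe(big_id))
```

A string fallback keeps identifiers intact but changes sort and filter semantics for those cells, so a real fix would need to decide how the grid should treat such columns.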

Comment on lines +136 to +137
    elif suffix in ('.json', '.jsonl', '.ndjson'):
        return pl.read_ndjson(path)

P2: Read .json paths with a JSON parser in the Polars branch

The Polars path routes .json files through pl.read_ndjson, which expects newline-delimited JSON; standard JSON files (array/object form) that work in the pandas fallback will fail whenever Polars is installed. This makes prepare_buckaroo_artifact(<json path>) behavior environment-dependent and breaks valid .json inputs.
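
The distinction can be sketched with the standard library alone: `.json` is parsed as a single document, while `.jsonl` and `.ndjson` are parsed line by line. The function name `load_json_records` is hypothetical. In the Polars branch the analogous split would presumably be `pl.read_json` for `.json` and `pl.read_ndjson` for the newline-delimited suffixes, though the PR does not show that change.

```python
import json
from pathlib import Path

# Suffixes treated as newline-delimited JSON (one document per line).
NDJSON_SUFFIXES = {".jsonl", ".ndjson"}


def load_json_records(path):
    """Load a JSON file, choosing the parser from the file suffix.

    Hypothetical helper for illustration; not part of buckaroo's API.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    with path.open(encoding="utf-8") as fh:
        if suffix in NDJSON_SUFFIXES:
            # Parse each non-empty line as its own JSON document.
            return [json.loads(line) for line in fh if line.strip()]
        if suffix == ".json":
            # Parse the whole file as a single array or object.
            return json.load(fh)
    raise ValueError(f"unsupported suffix: {suffix!r}")
```

Dispatching on the suffix keeps the result independent of which dataframe library happens to be installed, which is the environment dependence the comment points out.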

The test was mocking `@ag-grid-community/react` (old modular path) but
the component now imports from `ag-grid-react` (v35 monolithic path).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Make it easier to statically embed buckaroo