Skip to content

fix(runtime): align Node warmup protocol and fail fast#196

Merged
bbopen merged 2 commits intomainfrom
codex/fix-node-warmup-protocol
Mar 10, 2026
Merged

fix(runtime): align Node warmup protocol and fail fast#196
bbopen merged 2 commits intomainfrom
codex/fix-node-warmup-protocol

Conversation

@bbopen
Copy link
Owner

@bbopen bbopen commented Feb 28, 2026

Summary\n- send warmup commands with the current protocol envelope (protocol/method/params + numeric id)\n- reject legacy warmup shape { method, params } with explicit migration guidance\n- surface warmup send/response failures with command index + module/function context\n- ensure WorkerPool disposes transports and does not retain workers when warmup fails\n\n## Validation\n- npx vitest --run test/runtime_node.test.ts test/optimized-node.test.ts test/worker-pool.test.ts

@coderabbitai
Copy link

coderabbitai bot commented Feb 28, 2026

Warning

Rate limit exceeded

@bbopen has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 22 minutes and 16 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b91ff607-ba97-49ad-aa9b-f13b78d03361

📥 Commits

Reviewing files that changed from the base of the PR and between 987c7e8 and 05331c7.

📒 Files selected for processing (5)
  • src/runtime/node.ts
  • src/runtime/worker-pool.ts
  • test/optimized-node.test.ts
  • test/runtime_node.test.ts
  • test/worker-pool.test.ts
📝 Walkthrough

Walkthrough

Warmup command handling was formalized into a strict WarmupCommand shape with normalization and numeric request IDs; warmup messaging now uses PROTOCOL_ID and robust error wrapping. WorkerPool initialization was hardened to dispose partially-initialized workers on warmup failure. Tests updated to cover new formats and failure cases.

Changes

Cohort / File(s) Summary
Warmup normalization & protocol
src/runtime/node.ts
Introduce WarmupCommand shape and normalization (reject legacy {method, params}), use numeric warmup IDs and PROTOCOL_ID, add per-command request IDs, stricter envelope validation and BridgeExecutionError wrapping for encoding/sending/parsing failures.
Worker pool init safety
src/runtime/worker-pool.ts
Wrap onWorkerReady in try/catch, dispose transport on warmup/init errors (suppressing disposal errors), and only push worker into pool after successful readiness.
Tests & error exports
test/optimized-node.test.ts, test/runtime_node.test.ts, test/worker-pool.test.ts, src/runtime/errors.js
Update tests to use new warmup command shape, add tests rejecting legacy/non-array warmup input and onWorkerReady failure; export BridgeProtocolError for protocol-related assertions.

Sequence Diagram

sequenceDiagram
    participant Client
    participant NodeBridge
    participant Normalizer
    participant Worker
    participant Transport

    Client->>NodeBridge: initialize({ warmupCommands })
    NodeBridge->>Normalizer: normalize(warmupCommands)
    Normalizer-->>NodeBridge: WarmupCommand[]
    NodeBridge->>Worker: create + onWorkerReady()
    Worker->>Transport: for each WarmupCommand -> send PROTOCOL_ID (numeric reqId)
    Transport-->>Worker: response / error
    alt all commands succeed
        Worker-->>NodeBridge: ready
        NodeBridge->>NodeBridge: push worker to pool
        NodeBridge-->>Client: initialized OK
    else any command fails
        Worker-->>NodeBridge: BridgeExecutionError / protocol error
        NodeBridge->>Transport: dispose()
        NodeBridge-->>Client: initialization rejected
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hopped through code to tidy the nest,
Warmup commands now wear their best vest.
If a worker trips, I tidy the floor,
No half-built burrows left anymore,
Protocols hum — we're ready to test!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the main changes: aligning Node warmup protocol with the current envelope format and implementing fail-fast behavior for warmup failures.
Description check ✅ Passed The description is directly related to the changeset, covering the key improvements: protocol alignment, legacy format rejection, error surfacing, and worker cleanup on warmup failure.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch codex/fix-node-warmup-protocol

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot added bug Something isn't working enhancement New feature or request area:runtime-node Area: Node runtime bridge priority:p2 Priority P2 (medium) labels Feb 28, 2026
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a06067b2b2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +199 to +200
const warmups = commands ?? [];
return warmups.map((command, index) => {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate warmupCommands is an array before mapping

normalizeWarmupCommands assumes commands has .map(), so a JS/config caller that passes a non-array value (for example a single object) now gets a raw TypeError (warmups.map is not a function) before any of the new protocol validation/migration messaging runs. This turns a user-input validation case into an opaque runtime crash and bypasses the explicit BridgeProtocolError paths added in this commit.

Useful? React with 👍 / 👎.

Comment on lines +552 to +553
const message = JSON.stringify({
id: generateWarmupId(),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Wrap warmup serialization failures with command context

The warmup payload is serialized before entering the guarded send/parse error handling, so unsupported argument shapes (e.g. BigInt or circular values in args) cause JSON.stringify to throw a bare TypeError without the command index/module/function context this change is trying to provide. This makes warmup failures harder to diagnose and inconsistent with the new fail-fast error reporting.

Useful? React with 👍 / 👎.

coderabbitai[bot]
coderabbitai bot previously requested changes Feb 28, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/runtime/node.ts`:
- Around line 550-599: The warmup loop around warmupCommands currently builds
the request with JSON.stringify and treats any JSON response without an "error"
key as success; wrap the message construction in a try/catch and throw a
BridgeExecutionError with the same "Warmup command `#N` (module.function)" context
on JSON encoding failures (the message variable created from
generateWarmupId()/PROTOCOL_ID/worker.transport.send), and after parsing the
response validate that a proper success envelope exists (e.g. the parsed object
either contains an expected "result" or explicit success marker) — if neither
"error" nor the expected success field is present, throw a BridgeExecutionError
for a malformed success payload referencing warmupCommands[index], commandLabel
and the request id; keep using BridgeExecutionError for all wrapped failures so
send, encode, decode, and malformed-success cases include the command context.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6245e95 and a06067b.

📒 Files selected for processing (5)
  • src/runtime/node.ts
  • src/runtime/worker-pool.ts
  • test/optimized-node.test.ts
  • test/runtime_node.test.ts
  • test/worker-pool.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: os (windows-latest)
🧰 Additional context used
🧠 Learnings (13)
📓 Common learnings
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-263
Timestamp: 2026-01-19T21:14:40.872Z
Learning: In `src/runtime/bridge-core.ts` and similar hot request/response loop implementations in the tywrap repository, avoid adding extra defensive validation (e.g., runtime shape checks on error payloads) in tight loops unless the protocol boundary is untrusted or there's a concrete bug report. The Python bridge protocol is controlled and validated via tests, so defensive checks would add unnecessary branching overhead without meaningful benefit.
Learnt from: bbopen
Repo: bbopen/tywrap PR: 136
File: src/runtime/node.ts:238-249
Timestamp: 2026-01-20T01:33:12.841Z
Learning: In the tywrap repository, when reviewing init/startup patterns, prioritize cases that break core functionality over edge cases with diagnostic-only impact (e.g., missing bridgeInfo metadata). If workers are healthy and operational, defer low-impact defensive checks to follow-up PRs.
Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: docs/adr/002-bridge-protocol.md:599-602
Timestamp: 2026-01-20T16:01:14.323Z
Learning: In `src/runtime/node-bridge.ts` (NodeBridge), a test message is sent to the Python subprocess to confirm the bridge is responsive before marking it as ready.
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-298
Timestamp: 2026-01-19T21:48:45.693Z
Learning: In `src/runtime/bridge-core.ts`, keep `normalizeErrorPayload` to validate error payloads from the Python subprocess. The subprocess boundary is effectively untrusted, and normalizing error responses prevents `undefined: undefined` errors on malformed payloads. Error responses are not the hot path, so the small validation overhead is acceptable for the added resilience.
📚 Learning: 2026-01-20T01:34:07.064Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 136
File: src/runtime/node.ts:444-458
Timestamp: 2026-01-20T01:34:07.064Z
Learning: When reviewing promise-based polling patterns (e.g., recursive setTimeout in waitForAvailableWorker functions), ensure that any recursive timer is tracked and cleared on timeout or promise resolution to prevent timer leaks. Do not rely on no-op checks after settlement; verify clearTimeout is invoked in all paths and consider using an explicit cancellation (e.g., AbortController) or a guard to cancel pending timers when the promise settles.

Applied to files:

  • test/worker-pool.test.ts
  • src/runtime/worker-pool.ts
  • test/optimized-node.test.ts
  • test/runtime_node.test.ts
  • src/runtime/node.ts
📚 Learning: 2026-01-19T21:00:30.825Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: test/adversarial_playground.test.ts:339-360
Timestamp: 2026-01-19T21:00:30.825Z
Learning: In adversarial tests for the bridge (test/adversarial_playground.test.ts), recovery assertions should be kept in the finally block adjacent to cleanup to verify resilience under stress, even if that means post-timeout checks might mask the original error. The goal is to surface recovery failures as legitimate test failures.

Applied to files:

  • test/worker-pool.test.ts
  • test/optimized-node.test.ts
  • test/runtime_node.test.ts
📚 Learning: 2026-01-19T21:48:27.823Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: test/runtime_bridge_fixtures.test.ts:59-66
Timestamp: 2026-01-19T21:48:27.823Z
Learning: In fixture-based tests (e.g., test/runtime_bridge_fixtures.test.ts) and similar tests in the tywrap repository, prefer early returns when Python or fixture files are unavailable. Do not rely on Vitest dynamic skip APIs; a silent pass is intentional for environments lacking Python/fixtures. Treat missing fixtures as optional soft-guards and ensure the test remains non-disruptive in non-availability scenarios.

Applied to files:

  • test/worker-pool.test.ts
  • test/optimized-node.test.ts
  • test/runtime_node.test.ts
📚 Learning: 2026-01-20T18:37:05.670Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: src/runtime/process-io.ts:37-40
Timestamp: 2026-01-20T18:37:05.670Z
Learning: In the tywrap repo, ESLint is used for linting (not Biome). Do not flag regex literals that contain control characters (e.g., \u001b, \u0000-\u001F) as lint errors, since the current ESLint configuration allows them. When reviewing TypeScript files (e.g., src/**/*.ts), rely on ESLint results and avoid raising issues about these specific control-character regex literals unless there is a competing config change or a policy requiring explicit escaping.

Applied to files:

  • test/worker-pool.test.ts
  • src/runtime/worker-pool.ts
  • test/optimized-node.test.ts
  • test/runtime_node.test.ts
  • src/runtime/node.ts
📚 Learning: 2026-01-20T16:00:49.829Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: docs/adr/002-bridge-protocol.md:203-211
Timestamp: 2026-01-20T16:00:49.829Z
Learning: In the BridgeProtocol implementation (tywrap), reject Map and Set explicitly before serialization (e.g., in safeStringify or the serialization entrypoint) with a clear error like "Bridge protocol does not support Map/Set values". Do not rely on post-hoc checks of non-string keys at the point of JSON.stringify, since Maps/Sets cannot round-trip. This should be enforced at the boundary where data is prepared for serialization to ensure deterministic errors and prevent invalid data from propagating.

Applied to files:

  • test/worker-pool.test.ts
  • src/runtime/worker-pool.ts
  • test/optimized-node.test.ts
  • test/runtime_node.test.ts
  • src/runtime/node.ts
📚 Learning: 2026-01-19T21:14:29.869Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:375-385
Timestamp: 2026-01-19T21:14:29.869Z
Learning: In the runtime env-var parsing (e.g., in src/runtime/bridge-core.ts and similar modules), adopt a tolerant, best-effort policy: numeric env values should parse by taking the leading numeric portion (e.g., TYWRAP_CODEC_MAX_BYTES=1024abc -> 1024). Only reject clearly invalid values (non-numeric start or <= 0). This reduces surprising failures from minor typos. Add tests to cover partial numeric prefixes, and ensure downstream logic documents the trusted range; consider a small helper to extract a positive integer from a string with a safe fallback.

Applied to files:

  • src/runtime/worker-pool.ts
  • src/runtime/node.ts
📚 Learning: 2026-01-19T21:14:35.390Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-263
Timestamp: 2026-01-19T21:14:35.390Z
Learning: In src/runtime/bridge-core.ts and similar hot request/response loop implementations in the tywrap repository, avoid adding extra defensive validation (such as runtime, shape, or error-payload checks) inside tight loops unless the protocol boundary is untrusted or there is a concrete bug report. The Python bridge protocol is controlled via tests, so these checks add unnecessary branching overhead without meaningful benefit. Apply this guidance to other hot-path runtime loops in src/runtime/**/*.ts, and re-enable additional validations only when a documented risk or failure scenario is identified. Ensure tests cover protocol validation where applicable.

Applied to files:

  • src/runtime/worker-pool.ts
  • src/runtime/node.ts
📚 Learning: 2026-01-19T21:48:45.693Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-298
Timestamp: 2026-01-19T21:48:45.693Z
Learning: In `src/runtime/bridge-core.ts`, keep `normalizeErrorPayload` to validate error payloads from the Python subprocess. The subprocess boundary is effectively untrusted, and normalizing error responses prevents `undefined: undefined` errors on malformed payloads. Error responses are not the hot path, so the small validation overhead is acceptable for the added resilience.

Applied to files:

  • test/optimized-node.test.ts
  • test/runtime_node.test.ts
  • src/runtime/node.ts
📚 Learning: 2026-01-19T21:14:40.872Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:260-263
Timestamp: 2026-01-19T21:14:40.872Z
Learning: In `src/runtime/bridge-core.ts` and similar hot request/response loop implementations in the tywrap repository, avoid adding extra defensive validation (e.g., runtime shape checks on error payloads) in tight loops unless the protocol boundary is untrusted or there's a concrete bug report. The Python bridge protocol is controlled and validated via tests, so defensive checks would add unnecessary branching overhead without meaningful benefit.

Applied to files:

  • test/optimized-node.test.ts
  • test/runtime_node.test.ts
📚 Learning: 2026-01-20T16:01:14.323Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: docs/adr/002-bridge-protocol.md:599-602
Timestamp: 2026-01-20T16:01:14.323Z
Learning: In `src/runtime/node-bridge.ts` (NodeBridge), a test message is sent to the Python subprocess to confirm the bridge is responsive before marking it as ready.

Applied to files:

  • test/optimized-node.test.ts
  • test/runtime_node.test.ts
  • src/runtime/node.ts
📚 Learning: 2026-01-19T21:14:37.032Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 127
File: src/runtime/bridge-core.ts:375-385
Timestamp: 2026-01-19T21:14:37.032Z
Learning: In tywrap (src/runtime/bridge-core.ts and similar), environment variable parsing follows a tolerant/best-effort policy. For example, `TYWRAP_CODEC_MAX_BYTES=1024abc` should be accepted as 1024. Only reject clearly invalid values (non-numeric start or <=0). This avoids surprising failures from minor typos.

Applied to files:

  • test/optimized-node.test.ts
📚 Learning: 2026-01-20T16:01:14.323Z
Learnt from: bbopen
Repo: bbopen/tywrap PR: 152
File: docs/adr/002-bridge-protocol.md:599-602
Timestamp: 2026-01-20T16:01:14.323Z
Learning: In the tywrap repository, bridges currently use implicit ready detection: the first successful response from the Python subprocess proves the bridge is ready to handle requests.

Applied to files:

  • test/runtime_node.test.ts
🧬 Code graph analysis (3)
test/optimized-node.test.ts (1)
src/runtime/errors.ts (1)
  • BridgeProtocolError (13-13)
test/runtime_node.test.ts (1)
src/runtime/node.ts (1)
  • NodeBridge (279-525)
src/runtime/node.ts (3)
src/runtime/errors.ts (2)
  • BridgeProtocolError (13-13)
  • BridgeExecutionError (31-33)
src/runtime/worker-pool.ts (1)
  • PooledWorker (47-53)
src/runtime/transport.ts (1)
  • PROTOCOL_ID (20-20)
🔇 Additional comments (5)
src/runtime/worker-pool.ts (1)

412-426: Good warmup-failure cleanup in worker creation.

Transport disposal on onWorkerReady failure plus deferring this.workers.push(worker) until success correctly prevents retaining partially initialized workers.

src/runtime/node.ts (1)

21-25: Warmup normalization and constructor wiring look solid.

The new normalization path is fail-fast, gives explicit migration guidance for legacy warmup shape, and cleanly feeds a strict internal warmup command model into initialization.

Also applies to: 123-130, 198-236, 298-315, 535-546

test/runtime_node.test.ts (1)

9-9: Warmup integration tests are strong and targeted.

The new cases validate both successful per-worker warmup and failure surfacing/readiness behavior, with solid cleanup in finally.

Also applies to: 706-795

test/worker-pool.test.ts (1)

289-307: Great failure-path regression test for worker warmup.

This test captures the key contract: reject acquire, dispose transport, and keep workerCount at zero when onWorkerReady fails.

test/optimized-node.test.ts (1)

7-7: Constructor tests correctly enforce the warmup shape transition.

Accepting { module, functionName, args } while rejecting legacy { method, params } with BridgeProtocolError is exactly the right compatibility posture for this PR.

Also applies to: 74-87

@bbopen bbopen dismissed coderabbitai[bot]’s stale review March 10, 2026 21:34

Superseded by commit 987c7e8; all review findings are addressed.

@bbopen bbopen force-pushed the codex/fix-node-warmup-protocol branch from 987c7e8 to 05331c7 Compare March 10, 2026 21:37
@bbopen bbopen merged commit 30f4a40 into main Mar 10, 2026
18 of 19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:runtime-node Area: Node runtime bridge bug Something isn't working enhancement New feature or request priority:p2 Priority P2 (medium)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant