fix: tolerate non-JSON output in JSONL subprocess parser#829
fix: tolerate non-JSON output in JSONL subprocess parser#829dkpoulsen wants to merge 3 commits intoAutoMaker-Org:v1.0.0rcfrom
Conversation
Agents may run commands like bun/npm install that output plain text to
stdout. Previously, this caused JSON parse errors that failed the entire
feature execution. Now non-JSON lines are logged but skipped gracefully.
Only lines starting with { or [ are attempted to be parsed as JSON.
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the robustness of subprocess management by making the JSONL parsing mechanism more tolerant of unexpected output. It addresses a critical issue where plain text output from CLI tools like Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
📝 WalkthroughWalkthroughAdds tolerant JSONL parsing to Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant ChildProc as Child Process
participant Parser
participant Logger
Client->>ChildProc: spawn process (cmd, args, env, cwd)
activate ChildProc
ChildProc-->>Parser: stdout lines (streamed)
Parser->>Parser: isJsonCandidate(line)?
alt JSON candidate
Parser->>Parser: JSON.parse(line)
Parser-->>Client: emit parsed object
else Non-JSON
Parser->>Logger: sanitize & truncate line
Logger-->>Client: log Non-JSON output (mark "(final)" if last)
end
ChildProc-->>Parser: EOF / final partial line
Parser->>Logger: handle final-line as above
ChildProc-->>Client: exit code / stderr
deactivate ChildProc
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
libs/platform/src/subprocess.ts (1)
197-204: Consider extracting JSON-candidate filtering into one helper.The same parse-gate/logging logic appears in both the chunk loop and final-buffer flush path, which can drift over time.
♻️ Minimal refactor sketch
+ const queueJsonIfCandidate = (rawLine: string, isFinal = false) => { + const trimmed = rawLine.trim(); + if (!trimmed) return; + const firstChar = trimmed[0]; + if (firstChar !== '{' && firstChar !== '[') { + console.log( + isFinal + ? `[SubprocessManager] Non-JSON output (final): ${trimmed}` + : `[SubprocessManager] Non-JSON output: ${trimmed}` + ); + return; + } + try { + eventQueue.push(JSON.parse(trimmed)); + } catch (parseError) { + console.error( + isFinal + ? `[SubprocessManager] Failed to parse final JSONL line: ${trimmed}` + : `[SubprocessManager] Failed to parse JSONL line: ${trimmed}`, + parseError + ); + eventQueue.push({ type: 'error', error: `Failed to parse output: ${trimmed}` }); + } + }; ... - for (const line of lines) { - const trimmed = line.trim(); - if (!trimmed) continue; - const firstChar = trimmed[0]; - if (firstChar !== '{' && firstChar !== '[') { - console.log(`[SubprocessManager] Non-JSON output: ${trimmed}`); - continue; - } - try { - eventQueue.push(JSON.parse(trimmed)); - } catch (parseError) { - ... - } - } + for (const line of lines) queueJsonIfCandidate(line); ... - if (lineBuffer.trim()) { - const trimmed = lineBuffer.trim(); - const firstChar = trimmed[0]; - if (firstChar === '{' || firstChar === '[') { - ... - } else { - console.log(`[SubprocessManager] Non-JSON output (final): ${trimmed}`); - } - } + if (lineBuffer.trim()) queueJsonIfCandidate(lineBuffer, true);Also applies to: 233-249
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@libs/platform/src/subprocess.ts` around lines 197 - 204, Extract the JSON-candidate detection and logging into a small helper (e.g., isJsonCandidate(line: string): boolean and/or logNonJson(line: string)): move the logic that computes trimmed, checks firstChar !== '{' && firstChar !== '[' and logs `[SubprocessManager] Non-JSON output: ${trimmed}` into that helper, then call it from both the chunk-processing loop and the final-buffer flush path so both places use the same gate; update the references in the methods using the trimmed variable (the chunk loop and final flush in subprocess.ts) to call the helper and continue/skip when it returns false to avoid duplicated logic and drift.libs/platform/tests/subprocess.test.ts (1)
150-175: Optional: add one case for non-JSON final partial line (no trailing\n).That would directly exercise the
(final)non-JSON path in the parser and prevent regressions there.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@libs/platform/tests/subprocess.test.ts` around lines 150 - 175, Add a new test exercising the parser's "(final)" non-JSON path by creating a mock process via createMockProcess whose last stdout chunk is a non-JSON partial line without a trailing "\n" (e.g., 'npm WARN something' as the final string) and use vi.mocked(cp.spawn).mockReturnValue(mockProcess) with spawnJSONLProcess(baseOptions) and collectAsyncGenerator to consume it; assert that only valid JSONL entries are yielded (via collectAsyncGenerator results) and that consoleSpy.log was called with a message containing "[SubprocessManager] Non-JSON output" for that final partial line, referencing the existing test setup and functions spawnJSONLProcess, createMockProcess, collectAsyncGenerator, and cp.spawn.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@libs/platform/src/subprocess.ts`:
- Around line 197-204: Extract the JSON-candidate detection and logging into a
small helper (e.g., isJsonCandidate(line: string): boolean and/or
logNonJson(line: string)): move the logic that computes trimmed, checks
firstChar !== '{' && firstChar !== '[' and logs `[SubprocessManager] Non-JSON
output: ${trimmed}` into that helper, then call it from both the
chunk-processing loop and the final-buffer flush path so both places use the
same gate; update the references in the methods using the trimmed variable (the
chunk loop and final flush in subprocess.ts) to call the helper and
continue/skip when it returns false to avoid duplicated logic and drift.
In `@libs/platform/tests/subprocess.test.ts`:
- Around line 150-175: Add a new test exercising the parser's "(final)" non-JSON
path by creating a mock process via createMockProcess whose last stdout chunk is
a non-JSON partial line without a trailing "\n" (e.g., 'npm WARN something' as
the final string) and use vi.mocked(cp.spawn).mockReturnValue(mockProcess) with
spawnJSONLProcess(baseOptions) and collectAsyncGenerator to consume it; assert
that only valid JSONL entries are yielded (via collectAsyncGenerator results)
and that consoleSpy.log was called with a message containing
"[SubprocessManager] Non-JSON output" for that final partial line, referencing
the existing test setup and functions spawnJSONLProcess, createMockProcess,
collectAsyncGenerator, and cp.spawn.
There was a problem hiding this comment.
Code Review
This pull request makes the JSONL subprocess parser more robust by gracefully handling non-JSON output, preventing crashes from noisy command-line tools. However, the newly introduced logging of non-JSON lines to the console poses a risk of leaking sensitive information (e.g., tokens, PII) into the server logs. It is recommended to sanitize these logs or only enable them in a debug environment. Additionally, there's an opportunity to reduce code duplication by refactoring the line processing logic into a shared helper function to improve maintainability.
libs/platform/src/subprocess.ts
Outdated
| }); | ||
| } | ||
| } else { | ||
| console.log(`[SubprocessManager] Non-JSON output (final): ${trimmed}`); |
There was a problem hiding this comment.
Logging the final line of subprocess output when it fails to parse as JSON can leak sensitive information into the server logs. This line should be sanitized or logged only in debug mode. Additionally, the logic for processing the final line buffer in this block is very similar to the logic for processing lines from stdout in the data event handler (lines 194-218), leading to code duplication. lineBuffer.trim() is also called twice here. Consider extracting the line processing logic into a single helper function to centralize it and resolve the redundant trim() call.
libs/platform/src/subprocess.ts
Outdated
| const firstChar = trimmed[0]; | ||
| if (firstChar !== '{' && firstChar !== '[') { | ||
| // Non-JSON output (e.g., "bun install v1.3.10") - log but don't error | ||
| console.log(`[SubprocessManager] Non-JSON output: ${trimmed}`); |
There was a problem hiding this comment.
The application logs untrusted output from a subprocess directly to the console. If the subprocess (e.g., npm install, bun install) outputs sensitive information such as API keys, authentication tokens, or Personally Identifiable Information (PII) in its stdout/stderr, this information will be recorded in the server logs. Additionally, this can lead to Log Injection if an attacker can control the subprocess output to inject malicious log entries.
Consider masking sensitive patterns (e.g., tokens) or only enabling this logging when a specific debug flag is active.
|
Thanks for this! Please target the v1.0.0rc branch instead of main |
…SON path - Extract isJsonCandidate() and logNonJsonOutput() helpers to avoid duplicated logic between chunk-processing loop and final-buffer flush - Add test case for partial non-JSON line at end of stream (no trailing newline) - Ensures consistent behavior and prevents drift between both code paths Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
libs/platform/tests/subprocess.test.ts (1)
177-207: Refactor duplicated mock-process wiring in final-line test.Line 179–Line 190 re-implements process/stream setup already present in
createMockProcess, which makes future test updates easy to miss in one path. Consider extendingcreateMockProcesswith a “raw stdout chunks / no trailing newline” mode and reusing it here.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@libs/platform/tests/subprocess.test.ts` around lines 177 - 207, The test "should log non-JSON output (final) for partial lines at end of stream" duplicates the mock process/stream wiring; refactor by extending the existing createMockProcess helper to accept a mode/option (e.g., rawFinalChunk or noTrailingNewline) that configures stdout to emit provided chunks without a trailing '\n' and to end the stream, then update this test to call createMockProcess({ noTrailingNewline: true }) (or similar option name) and remove the inline Readable/stdout/stderr setup and cp.spawn mock wiring so the test reuses the common helper.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@libs/platform/src/subprocess.ts`:
- Around line 24-27: The logNonJsonOutput function currently prints raw
subprocess text; instead sanitize control/ANSI sequences and truncate the
payload before logging to prevent secret leaks and log forging. Update
logNonJsonOutput(trimmed: string, isFinal: boolean) to first remove or escape
non-printable/control characters and strip ANSI escape sequences, then cap the
length (e.g., 200 chars) and append a clear truncation marker (like
"…[truncated]") before composing the console message including the existing
suffix; ensure you still preserve a short safe preview of the original string
for debugging.
---
Nitpick comments:
In `@libs/platform/tests/subprocess.test.ts`:
- Around line 177-207: The test "should log non-JSON output (final) for partial
lines at end of stream" duplicates the mock process/stream wiring; refactor by
extending the existing createMockProcess helper to accept a mode/option (e.g.,
rawFinalChunk or noTrailingNewline) that configures stdout to emit provided
chunks without a trailing '\n' and to end the stream, then update this test to
call createMockProcess({ noTrailingNewline: true }) (or similar option name) and
remove the inline Readable/stdout/stderr setup and cp.spawn mock wiring so the
test reuses the common helper.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 47964c6b-ebbe-4f81-98fc-4ee2c13a11d1
📒 Files selected for processing (2)
libs/platform/src/subprocess.tslibs/platform/tests/subprocess.test.ts
- Add sanitizeForLog() helper to strip ANSI escape sequences - Remove control characters that could enable log forging - Truncate output to 200 chars with '…[truncated]' marker - Add tests for ANSI stripping and truncation behavior Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
libs/platform/src/subprocess.ts (1)
15-19: Consider trimming before checking the first character.
isJsonCandidatechecks the first character of the input, but the input may have leading whitespace. Whileline.trim()is called before invoking this function in the current code paths (Lines 237, 272), the helper itself doesn't enforce this. If called with untrimmed input, it would returnfalsefor valid JSON like" {}".This is fine given current usage, but consider either documenting the precondition or trimming internally for defensive robustness.
♻️ Optional: Make the helper self-contained
function isJsonCandidate(line: string): boolean { if (!line) return false; - const firstChar = line[0]; + const firstChar = line.trimStart()[0]; return firstChar === '{' || firstChar === '['; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@libs/platform/src/subprocess.ts` around lines 15 - 19, The isJsonCandidate helper currently inspects the first character of the provided line without trimming, so leading whitespace will make valid JSON like " {}" return false; update the function isJsonCandidate in subprocess.ts to defensively trim the input (e.g., call line = line.trim()) before checking length and the first character, and ensure it still returns false for empty/undefined input; alternatively, if you prefer not to change behavior, add a clear comment on isJsonCandidate documenting the requirement that callers must pass a trimmed string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@libs/platform/src/subprocess.ts`:
- Around line 15-19: The isJsonCandidate helper currently inspects the first
character of the provided line without trimming, so leading whitespace will make
valid JSON like " {}" return false; update the function isJsonCandidate in
subprocess.ts to defensively trim the input (e.g., call line = line.trim())
before checking length and the first character, and ensure it still returns
false for empty/undefined input; alternatively, if you prefer not to change
behavior, add a clear comment on isJsonCandidate documenting the requirement
that callers must pass a trimmed string.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1269d473-dd2e-47cc-9858-a624809b6b50
📒 Files selected for processing (2)
libs/platform/src/subprocess.tslibs/platform/tests/subprocess.test.ts
Agents may run commands like bun/npm install that output plain text to stdout. Previously, this caused JSON parse errors that failed the entire feature execution. Now non-JSON lines are logged but skipped gracefully.
Only lines starting with { or [ are attempted to be parsed as JSON.
Summary by CodeRabbit
New Features
Bug Fixes
Tests