fix: sanitize control characters in ToolCall::arguments() before json_decode #937

Open
ruttydm wants to merge 2 commits into prism-php:main from ruttydm:fix/sanitize-toolcall-control-characters

@ruttydm ruttydm commented Mar 2, 2026

Summary

  • Sanitize raw control characters (0x00-0x1F, 0x7F) from tool call argument strings before json_decode in ToolCall::arguments()
  • Raw bytes in the 0x00-0x1F range are invalid inside JSON strings per RFC 8259 (they must be escaped) and cause JsonException: Control character error, possibly incorrectly encoded
  • Observed 11 times in production with the DeepSeek streaming provider

Context

Some providers (notably DeepSeek) occasionally include raw control characters in streamed tool call argument JSON. The OpenRouter handler already handles malformed arguments defensively; this PR adds similar protection at the ToolCall level so that all providers benefit.

Fixes #936
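
For readers without the diff at hand, here is a minimal sketch of the approach described above. It is not the actual body of ToolCall::arguments(): the helper name `decodeToolArguments`, the exact regex, and the empty-string fallback are assumptions made for illustration. The byte range is the one named in the summary. RFC 8259 only requires escaping U+0000 through U+001F, so 0x7F is included here purely to mirror the PR description.

```php
<?php

/**
 * Hypothetical helper (not Prism's real method): strip raw control bytes
 * from a tool call argument string, then decode it as JSON.
 *
 * @return array<string, mixed>
 */
function decodeToolArguments(string $raw): array
{
    // Remove bytes 0x00-0x1F and 0x7F. preg_replace() may return null on
    // failure, so cast the result back to string before decoding.
    $sanitized = (string) preg_replace('/[\x00-\x1F\x7F]/', '', $raw);

    // Assumption for this sketch: treat empty arguments as "no arguments".
    if ($sanitized === '') {
        return [];
    }

    // Throws JsonException on any remaining syntax problem.
    return json_decode($sanitized, true, 512, JSON_THROW_ON_ERROR);
}

// A streamed payload with a stray 0x01 byte inside a string value.
$raw = "{\"city\": \"Ams\x01terdam\"}";

$args = decodeToolArguments($raw);
echo $args['city'], PHP_EOL; // Amsterdam
```

One trade-off of stripping rather than escaping: a raw newline or tab inside a string value is dropped instead of being preserved as `\n` or `\t`. Newlines and tabs between JSON tokens are only insignificant whitespace, so removing them does not change the decoded result.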

Contributor

@sixlive sixlive left a comment

Can you address the CI errors, please?

@ruttydm ruttydm requested a review from sixlive March 31, 2026 07:52
ruttydm and others added 2 commits April 3, 2026 23:22
…_decode

Some providers (e.g. DeepSeek) may include raw control characters
(0x00-0x1F, 0x7F) in streamed tool call argument JSON. These bytes are
invalid per RFC 8259 and cause json_decode to throw a JsonException with
"Control character error, possibly incorrectly encoded".

Strip them before decoding. This mirrors the defensive approach already
used in the OpenRouter streaming handler.

Fixes prism-php#936

preg_replace() returns string|null, but json_decode() expects string.
Add explicit (string) cast to satisfy both PHPStan and Rector's
NullToStrictStringFuncCallArgRector rule.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
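
The second commit's reasoning can be illustrated with a small sketch. The function name `stripControlChars` and the patterns are hypothetical and not taken from the PR diff. The `/u` modifier in the second half is used only because it is a reliable way to make preg_replace() return null (on invalid UTF-8 input) and is not assumed to be part of the actual change.

```php
<?php

// Hypothetical wrapper: preg_replace() is typed string|null for a string
// subject, so the cast guarantees a string for json_decode().
function stripControlChars(string $raw): string
{
    return (string) preg_replace('/[\x00-\x1F\x7F]/', '', $raw);
}

var_dump(stripControlChars("a\x00b")); // string(2) "ab"

// Demonstration of the null case: with /u, an invalid UTF-8 subject makes
// preg_replace() return null instead of a string.
$result = preg_replace('/[\x00-\x1F]/u', '', "\xFF");
var_dump($result);                                  // NULL
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR); // bool(true)
var_dump((string) $result);                         // string(0) ""
```

Note that the cast turns a null result into an empty string, so in this sketch a regex failure would reach json_decode() as '' rather than as the original input.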
@ruttydm ruttydm force-pushed the fix/sanitize-toolcall-control-characters branch from d8ed209 to 7e4d3a8 April 3, 2026 21:22
Author

ruttydm commented Apr 3, 2026

Rebased on latest main (up to #970). CI doesn't seem to trigger for fork PRs — could a maintainer approve the workflow run?

Development

Successfully merging this pull request may close these issues.

JsonException: Control character error in ToolCall::arguments() with DeepSeek streaming

2 participants