[Repo Assist] perf: bulk-append unescaped character runs in JSON string parser #1715
Draft
github-actions[bot] wants to merge 2 commits into main
Replace per-character StringBuilder.Append(char) calls with StringBuilder.Append(string, start, length) for consecutive runs of unescaped characters in the JSON string parser.
For strings with no escape sequences (the common case in real-world JSON), this reduces the number of Append calls from O(n) to O(1), which should meaningfully speed up JsonValue.Parse on workloads with many string values.
The approach mirrors what JsonStringEncodeTo already does for serialisation: track a chunk start position and flush accumulated characters as a bulk substring when an escape or closing quote is hit.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🤖 This is an automated pull request from Repo Assist, an AI assistant for this repository.
Summary
Optimises the JSON string parser (parseString in JsonValue.fs) to avoid per-character StringBuilder.Append calls by batching consecutive unescaped characters into a single Append(string, start, length) call.
Motivation
The inner loop of parseString previously called buf.Append(s.[i]) for every unescaped character. For a 100-character string value with no escape sequences that means 100 Append method calls, each with its own bounds-check, capacity-check, and write.
With this change, an unescaped run is tracked by a single chunkStart mutable. When an escape sequence (or the closing ") is reached, the whole run is flushed in one buf.Append(s, chunkStart, length) call. For the common case of strings with no escapes at all, the total number of Append calls drops from O(n) to O(1); the characters are still copied, but in one bulk operation per run.
Precedent
JsonStringEncodeTo (the serialisation counterpart) already uses exactly this chunk-flush pattern. This change applies the same technique symmetrically to the parser.
Change details
- src/FSharp.Data.Json.Core/JsonValue.fs (parseString): replace character-by-character Append with a chunkStart-tracking loop + flushChunk helper.
- RELEASE_NOTES.md: version bump to 8.1.4.
No public API is changed; this is a pure implementation improvement.
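For reviewers who want the shape of the technique without opening the diff, the chunk-flush loop can be sketched roughly as below. This is an illustration only, not the code in JsonValue.fs: parseStringSketch, its signature, and the reduced escape table are hypothetical, and the real parseString handles more cases (for example \u escapes). Here flushChunk takes the run bounds as arguments, an assumption made so the helper does not need to capture a mutable local.

```fsharp
open System.Text

// Hypothetical, simplified stand-in for parseString (not the PR's code).
// `start` is assumed to be the index just after the opening quote.
// Returns the decoded string and the index just past the closing quote.
let parseStringSketch (s: string) (start: int) : string * int =
    let buf = StringBuilder()

    // Append the unescaped run s.[runStart .. runEnd - 1] with one call.
    let flushChunk (runStart: int) (runEnd: int) =
        if runEnd > runStart then
            buf.Append(s, runStart, runEnd - runStart) |> ignore

    let mutable i = start
    let mutable chunkStart = start
    let mutable closed = false

    while not closed && i < s.Length do
        match s.[i] with
        | '"' ->
            // Closing quote: flush whatever run is pending and stop.
            flushChunk chunkStart i
            closed <- true
            i <- i + 1
        | '\\' when i + 1 < s.Length ->
            // Escape: flush the run before it, then append the decoded char.
            flushChunk chunkStart i
            let decoded =
                match s.[i + 1] with
                | '"' -> '"'
                | '\\' -> '\\'
                | '/' -> '/'
                | 'n' -> '\n'
                | 't' -> '\t'
                | 'r' -> '\r'
                | c -> failwithf "escape \\%c is not handled in this sketch" c
            buf.Append(decoded) |> ignore
            i <- i + 2
            chunkStart <- i
        | _ ->
            // Unescaped character: just extend the current run.
            i <- i + 1

    if not closed then failwith "unterminated string"
    buf.ToString(), i
```

With an input such as "\"hello\\nworld\" rest" and start = 1, the sketch performs two bulk appends (hello and world) and one single-character append for the decoded newline, instead of eleven per-character appends.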
Test Status
- dotnet build src/FSharp.Data.Json.Core/FSharp.Data.Json.Core.fsproj: 0 errors, 4 pre-existing warnings
- dotnet test tests/FSharp.Data.Core.Tests: 2896 passed, 0 failed (including all existing FsCheck property-based round-trip tests for parseString)