Replace nested Stream.concat with direct iteration in expression merging#6725
Open
Replace nested Stream.concat with direct iteration in expression merging#6725
Conversation
…he previous value associated with key
2bf8d81 to
e40e293
Compare
|
dagnir
reviewed
Feb 25, 2026
| mergeValuesInto(merged, action.expressionValues()); | ||
| } | ||
|
|
||
| return merged.isEmpty() ? Collections.emptyMap() : Collections.unmodifiableMap(merged); |
Contributor
There was a problem hiding this comment.
Is this necessary? Why not just returned Collections.unmodifiableMap(merged) unconditionally?
Contributor
Author
There was a problem hiding this comment.
It's not necessary but we can avoid wrapping an empty map with UnmodifiableMap this way.
Comment on lines
+183
to
+184
| String.format("Attempt to coalesce two expressions with conflicting expression names. " | ||
| + "Expression name key = '%s'", key)); |
Contributor
Author
There was a problem hiding this comment.
No, but I can see why you'd think that. It's a bit hard to read the diff.
In the current code streamOfExpressionNames calls joinNames() which has the same validation.
In the new code I just moved the validation here.
We also have an existing test covering this specific codepath
RanVaknin
added a commit
that referenced
this pull request
Feb 25, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.



Overview
This PR targets two bottlenecks in
UpdateExpressionConverter.toExpressionas part of the Update operation request pipeline.toExpressionscales and becomes the most noticable bottleneck the bigger the input is: ~40% cpu time in SMALL datasets, and ~74% in HUGE_FLAT datasets.Interactive Flamegraph
Testing
We have fairly good test coverage already:
convert_mixedActions_uniqueNameTokensandconvert_mixedActionsverify the string output for correctness. This PR changes the underlying implementation but the API surface remains unchanged.1. String.format overhead (~18% CPU time)
groupExpressionsuses String.format to build the update expression for each attribute. Can replace with a StringJoiner which avoids the formatter's overhead.2. Nested
Stream.concat()inmergeExpressionNames()andmergeExpressionValues()(15-50% CPU time)Current implementation:
The problem is that
Stream.reduce()operation callsExpression.joinNames()for each action type (SET, ADD, REMOVE, DELETE). This method creates a new HashMap and copies all existing entries on every iteration:Example: With N attributes distributed across 4 action types, Stream.reduce() creates 3 intermediate HashMaps. For example, with 20 attributes (5 per action type), it copies 5 entries, then 10, then 15, before producing the final 20 entry result. The cost scales poorly with dataset size.
Instead we can iterate directly on each action and use a single HashMap:
This eliminates intermediate HashMap allocations and theredundant entry copying. The same proposed optimization also applies to
mergeExpressionValues().Results:
toExpressionwent from ~40% CPU time to ~15%groupExpressionswent from 18% CPU time to ~2.5%mergeExpressionValuesandmergeExpressionNameswent down from 18% to ~4%