[SPARK-56007][CONNECT] Fix ArrowDeserializer to use positional binding for rows by hvanhovell · Pull Request #54832 · apache/spark

hvanhovell · 2026-03-16T20:09:22Z

What changes were proposed in this pull request?

This PR switches RowEncoder deserialization in the Spark Connect Scala client from name-based lookup to positional binding to correctly handle duplicate column names.
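To illustrate why name-based lookup breaks on duplicate column names, here is a minimal, Spark-free sketch. It is not the actual `ArrowDeserializer` code: `BindingSketch`, `Field`, `bindByName`, and `bindByPosition` are hypothetical names invented for this example, and the field-count check uses a plain `require` rather than whatever error condition the patch raises.

```scala
// Hypothetical model of row deserialization; none of these names exist in Spark.
object BindingSketch {
  final case class Field(name: String)

  // Name-based lookup: each schema field is resolved by searching the incoming
  // column names. With duplicate names, every duplicate resolves to the first
  // matching column, so later columns are silently ignored.
  def bindByName(
      schema: Seq[Field],
      columnNames: Seq[String],
      values: Seq[Any]): Seq[Any] = {
    schema.map(field => values(columnNames.indexOf(field.name)))
  }

  // Positional binding: the i-th schema field reads the i-th column, so
  // duplicate names are irrelevant. The field counts must match.
  def bindByPosition(schema: Seq[Field], values: Seq[Any]): Seq[Any] = {
    require(
      schema.length == values.length,
      s"field count mismatch: expected ${schema.length}, got ${values.length}")
    schema.indices.map(i => values(i))
  }
}
```

With a schema of two fields both named `id` and incoming values `1` and `2`, `bindByName` reads the first column twice, while `bindByPosition` preserves both values.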

Why are the changes needed?

The Spark Connect Scala client can't handle rows with duplicate column names. This is a regression with respect to classic.

Does this PR introduce any user-facing change?

Yes. It fixes a bug.

How was this patch tested?

I added tests to ArrowEncoderSuite.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code v2.1.76

…g for RowEncoder and validate schema

Switch RowEncoder deserialization from name-based lookup to positional binding to correctly handle duplicate column names. Add field-count and field-name mismatch error conditions with new tests.

Co-authored-by: Isaac

HyukjinKwon

LGTM cc @zhengruifeng

Fix the `bind to schema` test:
- Correct `wideSchemaEncoder` (remove stray `a: int` field)
- Fix narrow schema field order (C before d) and element struct fields (da, db not da, dc)
- Supply complete wide-schema input rows (include dc boolean in d elements)
- Correct expected output to match narrow schema projection
- Add try/finally to ensure both iterators are always closed
- Fix `unknown field` to expect `SparkRuntimeException` not `AnalysisException`

Co-authored-by: Isaac

hvanhovell added 2 commits March 16, 2026 19:29

Merge apache/master into SPARK-56007

b563796

Co-authored-by: Isaac

HyukjinKwon approved these changes Mar 16, 2026

hvanhovell added 2 commits March 19, 2026 22:15

Fixes...

ef09595

hvanhovell wants to merge 4 commits into apache:master from hvanhovell:SPARK-56007
