chore: run Spark 3.4 tests with native_datafusion scan [WIP] #3722

Draft
andygrove wants to merge 2 commits into apache:main from andygrove:spark-sql-native-datafusion-3.4

andygrove commented Mar 17, 2026

Which issue does this PR close?

N/A - CI enablement

Rationale for this change

The native_datafusion Spark SQL tests were already running for Spark 3.5 but not for Spark 3.4. Adding 3.4 coverage revealed test failures that need to be skipped with the appropriate ignore tags.

What changes are included in this PR?

  • Add Spark 3.4.3 to the native_datafusion CI workflow
  • Update dev/diffs/3.4.3.diff to tag failing tests with IgnoreCometNativeDataFusion and IgnoreCometNativeScan, matching tags already applied in the 3.5.8 diff plus 3 tests specific to Spark 3.4:
    • FileBasedDataSourceSuite - "Spark native readers should respect spark.sql.caseSensitive"
    • ParquetIOSuite - "SPARK-35640: read binary as timestamp should throw schema incompatible error"
    • ParquetIOSuite - "SPARK-35640: int as long should throw schema incompatible error"
    • ParquetQuerySuite - "SPARK-36182: can't read TimestampLTZ as TimestampNTZ"
    • ParquetQuerySuite - "SPARK-34212 Parquet should read decimals correctly"
    • ParquetQuerySuite - "row group skipping doesn't overflow when reading into larger type"
    • ParquetSchemaSuite - "schema mismatch failure error message for parquet vectorized reader"
    • ParquetSchemaSuite - "SPARK-45604: schema mismatch failure error on timestamp_ntz to array<timestamp_ntz>"
    • ParquetFilterSuite - "filter pushdown - StringPredicate" (IgnoreCometNativeScan)
    • ParquetFilterSuite - "SPARK-25207: exception when duplicate fields in case-insensitive mode"
    • DynamicPartitionPruningSuite - "static scan metrics"
    • ExtractPythonUDFsSuite - "Python UDF should not break column pruning/filter pushdown -- Parquet V1"
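
As a rough illustration of how tag-based skipping like this can work, here is a minimal, self-contained Python model. It is a sketch under stated assumptions, not Comet's implementation: the real tags live in the Scala test code patched by the diff. The `TaggedTest` type, the `should_skip` and `partition` helpers, the `other_native_scan` name, and the mapping of tags to scan implementations are all hypothetical. Only the two tag names and the test names come from this description.

```python
from dataclasses import dataclass

# Tag names as mentioned in this PR description.
IGNORE_NATIVE_DATAFUSION = "IgnoreCometNativeDataFusion"
IGNORE_NATIVE_SCAN = "IgnoreCometNativeScan"

# Assumption: which scan implementations each tag applies to.
# "other_native_scan" is a hypothetical placeholder, not a real setting.
TAG_APPLIES_TO = {
    IGNORE_NATIVE_DATAFUSION: {"native_datafusion"},
    IGNORE_NATIVE_SCAN: {"native_datafusion", "other_native_scan"},
}


@dataclass(frozen=True)
class TaggedTest:
    """A test identified by suite and name, carrying zero or more ignore tags."""
    suite: str
    name: str
    tags: frozenset = frozenset()


def should_skip(test: TaggedTest, scan_impl: str) -> bool:
    """Return True if any tag on the test applies to the active scan implementation."""
    return any(scan_impl in TAG_APPLIES_TO.get(tag, set()) for tag in test.tags)


def partition(tests, scan_impl):
    """Split tests into (run, skipped) lists for the given scan implementation."""
    run = [t for t in tests if not should_skip(t, scan_impl)]
    skipped = [t for t in tests if should_skip(t, scan_impl)]
    return run, skipped


# Example: two tests from the list above plus a hypothetical untagged test.
# The tag on "static scan metrics" is assumed for illustration.
tests = [
    TaggedTest("ParquetFilterSuite", "filter pushdown - StringPredicate",
               frozenset({IGNORE_NATIVE_SCAN})),
    TaggedTest("DynamicPartitionPruningSuite", "static scan metrics",
               frozenset({IGNORE_NATIVE_DATAFUSION})),
    TaggedTest("ExampleSuite", "untagged test"),
]
run, skipped = partition(tests, "native_datafusion")
```

In this model, both tagged tests are skipped when the scan implementation is `native_datafusion`, and only the untagged test runs. Keeping the tag-to-implementation mapping in one table makes the scope of each tag easy to audit.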

How are these changes tested?

By running the Spark SQL native_datafusion tests in CI for Spark 3.4.3.

Tag tests with IgnoreCometNativeDataFusion and IgnoreCometNativeScan
to match tags already applied in the 3.5.8 diff, plus 3 tests that
are specific to Spark 3.4.