@aseembits93
Contributor

Summary

  • Add comparison support for PyArrow types (Table, RecordBatch, Array, ChunkedArray, Scalar, Schema, Field, DataType)
  • Add pyarrow>=15.0.0 to test dependencies
  • Add comprehensive tests for all PyArrow type comparisons

Test plan

  • All 149 comparator tests pass
  • New PyArrow tests cover equality and inequality for all supported types
  • Existing pandas tests still pass (verified pyarrow checks don't interfere)

🤖 Generated with Claude Code

aseembits93 and others added 2 commits February 4, 2026 10:40
Add comparison support for PyArrow types including Table, RecordBatch,
Array, ChunkedArray, Scalar, Schema, Field, and DataType.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@claude

claude bot commented Feb 4, 2026

PR Review Summary

Prek Checks

Passed (after auto-fix)

  • Fixed 4 RUF100 (unused-noqa) errors in codeflash/verification/comparator.py — removed unused # noqa: PGH003 comments from numba imports
  • Committed and pushed as style: auto-fix linting issues

Mypy

⚠️ Pre-existing issues only — no new mypy errors introduced by this PR

  • comparator.py: 27 errors (all no-any-return from comparing Any-typed objects — pre-existing pattern throughout the file)
  • test_comparator.py: ~400 errors (all no-untyped-def — missing return type annotations on test methods, standard for test files)

Code Review

No critical issues found

The PR adds pyarrow type support to the comparator function, following the existing pattern used for numpy, pandas, scipy, torch, etc.:

  • Correct placement: pyarrow checks are placed after scipy sparse and before pandas, avoiding type overlap issues
  • Type coverage: Handles Table, RecordBatch, ChunkedArray, Array, Scalar, Schema, Field, DataType
  • Null handling: Scalar null values are handled explicitly via is_valid checks before calling equals()
  • bool() wrapping: Applied consistently to equals() calls, including fixing the pre-existing pandas equals() return (line 408)
  • Lazy import: pyarrow is imported conditionally inside the HAS_PYARROW guard, consistent with other optional dependencies
  • Good test coverage design: test_pyarrow() tests equality, inequality, type mismatches, nulls, nested types, and list arrays
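The pattern described in the bullets above (optional import guard, explicit null handling for scalars, and bool() around equals()) might look roughly like the sketch below. It is illustrative only and not the code in codeflash/verification/comparator.py: the helper name compare_pyarrow and its "return None when not applicable" convention are assumptions made for this example.

```python
try:
    import pyarrow as pa  # optional dependency, guarded like the other backends

    HAS_PYARROW = True
except ImportError:
    pa = None
    HAS_PYARROW = False


def compare_pyarrow(orig, new):
    """Hypothetical helper: True/False for pyarrow inputs, None if not applicable."""
    if not HAS_PYARROW:
        return None
    arrow_types = (
        pa.Table, pa.RecordBatch, pa.ChunkedArray, pa.Array,
        pa.Scalar, pa.Schema, pa.Field, pa.DataType,
    )
    if not isinstance(orig, arrow_types):
        return None  # let the caller fall through to other checks (e.g. pandas)
    if type(orig) is not type(new):
        return False  # e.g. an int64 array compared with a float64 array
    # Handle null scalars explicitly instead of relying on equals() semantics.
    if isinstance(orig, pa.Scalar) and not (orig.is_valid and new.is_valid):
        return orig.is_valid == new.is_valid
    # Coerce to a plain Python bool so callers always get a consistent type.
    return bool(orig.equals(new))


if HAS_PYARROW:
    print(compare_pyarrow(pa.table({"a": [1, 2]}), pa.table({"a": [1, 2]})))
    print(compare_pyarrow(pa.array([1, 2, 3]), pa.array([1.0, 2.0, 3.0])))
```

Returning None for non-pyarrow inputs is one way to keep such a check from interfering with later branches; the real comparator may structure this differently.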

Test Coverage

| File | Stmts (main) | Stmts (PR) | Coverage (main) | Coverage (PR) | Delta |
|---|---|---|---|---|---|
| codeflash/verification/comparator.py | 341 | 378 | 62% | 56% | -6% |

Note: The coverage decrease comes from the new pyarrow code paths (37 new statements), which were not exercised in this test environment: pyarrow is in the tests dependency group but was not installed in the CI runner. The test_pyarrow test is skipped via pytest.skip() when pyarrow is unavailable. When pyarrow is installed, the test exercises all new code paths (Table, RecordBatch, Array, ChunkedArray, Scalar, Schema, Field, DataType; both equality and inequality). There is no coverage regression on existing code.
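The reported drop can be sanity-checked from the statement counts alone. The sketch below assumes the percentages are rounded to the nearest integer and that the set of covered statements is the same on both branches (that is, none of the 37 new statements ran); the variable names are made up for this example.

```python
stmts_main, stmts_pr = 341, 378
new_stmts = stmts_pr - stmts_main  # statements added by the PR

# 62% is a rounded figure, so collect every covered-statement count consistent with it.
candidates = [c for c in range(stmts_main + 1) if round(100 * c / stmts_main) == 62]

# If the covered count is unchanged, coverage on the PR branch is c / 378.
pr_coverage = {round(100 * c / stmts_pr) for c in candidates}

print(new_stmts)
print(candidates)
print(pr_coverage)
```

Every covered count compatible with 62% on main gives 56% over 378 statements, which matches the table: the decrease is explained by the larger denominator rather than by lost coverage of existing code.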


Last updated: 2026-02-11

@KRRT7 KRRT7 merged commit 3dd19c6 into main Feb 11, 2026
25 of 28 checks passed
@KRRT7 KRRT7 deleted the pyarrow-comparator branch February 11, 2026 02:15