(cleanup) remove Python 2 remaining items by mykaul · Pull Request #727 · scylladb/python-driver

mykaul · 2026-03-04T19:57:23Z

Pre-review checklist

This is 100% OpenCode's work, so take it with a grain of salt; I still need to go over it. I can also cherry-pick each item separately. I asked it to split the work into independent items as much as possible.

I have split my patch into logically separate commits.
All commit messages clearly explain what they change and why.
I added relevant tests for new features and bug fixes.
All commits compile, pass static checks and pass tests.
PR description sums up the changes and reasons why they should be introduced.
I have provided docstrings for the public items that I want to introduce.
I have adjusted the documentation in ./docs/source/.
I added appropriate Fixes: annotations to PR description.

In Python 3, iterator objects do not have a .next() method; the built-in next() function must be used instead. The call at line 355 of test_class_construction.py used the Python 2 pattern iter(...).next(), which would raise AttributeError if ever reached at runtime. Currently the test passes only because CQLEngineException is raised before .next() is called, but this is fragile: if the exception timing changes, the test would fail with AttributeError instead of the expected CQLEngineException. Replace with next(iter(...)) for correct Python 3 usage.
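The difference can be seen in a minimal sketch. The list contents below are a hypothetical stand-in, not the values used in test_class_construction.py:

```python
# Hypothetical stand-in for the iterable used in the test.
values = iter(["first", "second"])

# Python 2 iterators exposed a .next() method; Python 3 iterators do not,
# so iter(...).next() would raise AttributeError.
has_legacy_next = hasattr(values, "next")

# The built-in next() works on any iterator in Python 3.
first = next(values)

print(has_legacy_next, first)
```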

In Python 3, calling str.encode() can only raise UnicodeEncodeError, never UnicodeDecodeError. The except UnicodeDecodeError branches in AsciiType.serialize and UTF8Type.serialize were leftover from Python 2, where str.encode() could trigger an implicit decode of a byte string. These dead except branches silently masked the intended behavior. In Python 3, if the input is already bytes there is no .encode() to call, so the original code would raise AttributeError rather than returning the value as-is. Replace the try/except pattern with explicit isinstance(var, bytes) checks, which correctly handles both str and bytes inputs on Python 3.
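A minimal sketch of the isinstance-based pattern, assuming a free function named serialize_utf8 (a hypothetical name for illustration; the real methods are AsciiType.serialize and UTF8Type.serialize, whose exact bodies are not shown here):

```python
def serialize_utf8(var):
    """Hypothetical helper illustrating the explicit bytes check."""
    # bytes objects have no .encode() on Python 3, so pass them through.
    if isinstance(var, bytes):
        return var
    # str is always unicode text on Python 3; encode it to bytes.
    return var.encode("utf-8")


# bytes has .decode() but no .encode(), which is why the old
# try/except UnicodeDecodeError pattern could never help here.
bytes_has_encode = hasattr(b"abc", "encode")

encoded = serialize_utf8("caf\u00e9")
passthrough = serialize_utf8(b"raw")
```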

In Python 2, filter() returned a list. In Python 3, it returns a lazy iterator that can only be consumed once. The column_aliases variable assigned from filter() at metadata.py:2273 may be iterated multiple times downstream (e.g., for length checks and enumeration), which would silently produce empty results on the second pass. Wrap the filter() call in list() to ensure the result is a concrete list that supports repeated iteration, indexing, and len().
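The single-use behavior is easy to reproduce. The names list below is made up for illustration and is not the data handled in metadata.py:

```python
# Hypothetical input; None and "" are the falsy entries to drop.
names = ["key", "", "column1", None]

# A bare filter() is a one-shot iterator on Python 3.
lazy = filter(None, names)
first_pass = list(lazy)
second_pass = list(lazy)  # already exhausted

# Wrapping in list() gives a concrete sequence that supports len(),
# indexing, and repeated iteration.
aliases = list(filter(None, names))
alias_count = len(aliases)
```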

The module-level "".encode("utf8") call was a workaround for CPython bug #10923, where importing the utf8 codec for the first time in a background thread could cause a deadlock due to the import lock. This bug was fixed in CPython 3.3 (2012), and the driver now requires Python 3.9+. The workaround is dead code that serves no purpose and confuses readers.

Python 3 uses __str__ for string representation; __unicode__ was the Python 2 equivalent for unicode strings. The UnicodeMixin base class currently bridges the two by wiring __str__ to call __unicode__(), but this indirection is unnecessary on Python 3. Rename all 18 __unicode__ method definitions in statements.py directly to __str__, and update the one direct __unicode__() call in __repr__ to __str__(). The __str__ definitions on the subclasses now take precedence over the inherited UnicodeMixin.__str__ lambda, so behavior is unchanged.
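A rough sketch of the before and after shapes. The UnicodeMixin below is an approximation based on the description above (a __str__ lambda that calls __unicode__()), and the clause classes are hypothetical, not the ones in statements.py:

```python
class UnicodeMixin:
    # Approximation of the shim: __str__ delegates to __unicode__().
    __str__ = lambda self: self.__unicode__()


class OldClause(UnicodeMixin):
    # Python 2 era: the text lives in __unicode__.
    def __unicode__(self):
        return '"name" = %s'


class RenamedClause(UnicodeMixin):
    # Intermediate state: __str__ defined on the subclass takes
    # precedence over the inherited lambda.
    def __str__(self):
        return '"name" = %s'


class NewClause:
    # Final state: no mixin, __str__ defined directly.
    def __str__(self):
        return '"name" = %s'


rendered = [str(OldClause()), str(RenamedClause()), str(NewClause())]
```

All three forms render the same string, which is why the rename can be done commit by commit before the mixin itself is deleted.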

Rename the 2 __unicode__ methods in AbstractQueryableColumn and ModelQuerySet to __str__. Remove the redundant __str__ wrapper in ModelQuerySet that existed solely to bridge __str__ -> __unicode__ for Python 2 compatibility. In Python 3, __str__ is the canonical string representation method; __unicode__ served that role in Python 2. The indirection through UnicodeMixin is no longer needed for these classes.

…rs, named

Rename the remaining 5 __unicode__ method definitions across cqlengine/models.py (ColumnQueryEvaluator), cqlengine/functions.py (QueryValue, Token), cqlengine/operators.py (BaseQueryOperator), and cqlengine/named.py (NamedColumn) to __str__. This is part of the systematic removal of the Python 2 UnicodeMixin pattern. The __str__ definitions on each class now take precedence over the inherited UnicodeMixin.__str__ lambda, so behavior is unchanged.

UnicodeMixin was a Python 2/3 compatibility shim that wired __str__ to call __unicode__(). Now that all __unicode__ methods have been renamed to __str__ in prior commits, UnicodeMixin serves no purpose.
- Delete the UnicodeMixin class from cassandra/cqlengine/__init__.py
- Remove UnicodeMixin from the inheritance list of 6 classes: ValueQuoter, BaseClause, BaseCQLStatement, AbstractQueryableColumn, QueryValue, BaseQueryOperator
- Remove all 'from cassandra.cqlengine import UnicodeMixin' imports

The classes now define __str__ directly, which is the standard Python 3 approach for string representation.

absolute_import became the default behavior in Python 3.0. These imports were needed in Python 2 to prevent relative imports from shadowing stdlib modules (e.g., 'import io' resolving to a local module instead of the stdlib). Since the driver requires Python 3.9+, these are dead code. Removed from: cassandra/protocol.py, cassandra/cqltypes.py, cassandra/connection.py, cassandra/cluster.py, and tests/integration/cqlengine/query/test_queryset.py.

WeakSet has been available in the weakref module since Python 2.7 and in every Python 3 version. The try/except ImportError fallback to cassandra.util.WeakSet was unreachable dead code on Python 3.
- Replace try/except with direct 'from weakref import WeakSet' in cluster.py, pool.py, and io/asyncorereactor.py
- Delete the ~210-line custom WeakSet class and its _IterationGuard helper from cassandra/util.py
- Remove the now-unused 'from _weakref import ref' import
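For reference, a small sketch of the standard-library WeakSet that replaces the custom class. The Session class here is a hypothetical placeholder, not the driver's Session:

```python
import gc
from weakref import WeakSet


class Session:
    """Hypothetical placeholder for an object tracked by weak reference."""


sessions = WeakSet()
session = Session()
sessions.add(session)
size_while_alive = len(sessions)

# Dropping the last strong reference lets the entry disappear.
# gc.collect() is only needed on interpreters without reference counting.
del session
gc.collect()
size_after_release = len(sessions)
```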

Copilot

Pull request overview

This PR continues the Python 2 cleanup by removing remaining unicode/UnicodeMixin compatibility shims and updating tests/docs/code paths to assume Python 3-only semantics (the project now requires Python >=3.9).

Changes:

Removes Python 2-era unicode patterns (u'', __unicode__, UnicodeMixin) and normalizes string handling across driver and cqlengine.
Simplifies Python-version conditionals/fallback imports (e.g., WeakSet imports) and applies formatting-only refactors in several modules/tests.
Updates unit/integration tests and Sphinx config to reflect Python 3-only behavior and representations.

Reviewed changes

Copilot reviewed 34 out of 37 changed files in this pull request and generated 6 comments.


File	Description
tests/unit/test_types.py	Replaces `u''` literals with plain `str` in type read/write tests.
tests/unit/test_row_factories.py	Removes Python 3.0–3.6 conditional expectations for namedtuple creation.
tests/unit/test_orderedmap.py	Updates unicode-key tests to Python 3 `str` keys.
tests/unit/test_metadata.py	Replaces `u''` literals with `str` in metadata CQL export tests.
tests/unit/test_marshalling.py	Updates UTF8/unicode expectations to Python 3 `str` and cleans up ordered map inserts.
tests/unit/advanced/test_insights.py	Removes Python-version-specific namespace logic and reformats expected dicts.
tests/integration/standard/test_query.py	Updates unicode query strings/column names to Python 3 `str`.
tests/integration/standard/test_cluster.py	Updates expected row tuples to Python 3 `str`.
tests/integration/cqlengine/model/test_udts.py	Updates unicode literals to Python 3 `str`.
tests/integration/cqlengine/model/test_model_io.py	Updates unicode literals to Python 3 `str` in model IO assertions.
tests/integration/cqlengine/model/test_class_construction.py	Mostly formatting + Python 3 iterator usage (`next(iter(...))`) and string literal normalization.
tests/integration/cqlengine/columns/test_validation.py	Removes old Python-version branches and normalizes string usage/formatting in validation tests.
setup.py	Removes Python 2-era subprocess gating and refactors extension/doc build setup logic.
docs/conf.py	Normalizes string literals and formatting in Sphinx configuration.
cassandra/query.py	Python 3 string/formatting cleanup; keeps namedtuple fallback paths but modernizes literals/layout.
cassandra/pool.py	Removes legacy `WeakSet` fallback and reformats shard-aware code blocks.
cassandra/io/asyncorereactor.py	Removes legacy `WeakSet` fallback and modernizes literals/formatting.
cassandra/encoder.py	Deprecates Python 2 “unicode” semantics and standardizes encoding/quoting behavior for Python 3.
cassandra/datastax/graph/query.py	Python 3 string/formatting cleanup and minor readability refactors.
cassandra/datastax/graph/graphson.py	Python 3 cleanup + formatting; updates docs/comments describing supported Python types.
cassandra/datastax/graph/fluent/_query.py	Python 3 string/formatting cleanup and improves readability of traversal query generation.
cassandra/cqlengine/statements.py	Removes `UnicodeMixin` usage and converts `__unicode__` implementations to `__str__`.
cassandra/cqlengine/operators.py	Removes `UnicodeMixin` usage and converts operator stringification to `__str__`.
cassandra/cqlengine/named.py	Converts `__unicode__` to `__str__` and normalizes string literals.
cassandra/cqlengine/models.py	Removes `UnicodeMixin` usage and normalizes string literals/formatting across model machinery.
cassandra/cqlengine/functions.py	Removes `UnicodeMixin` usage and converts `__unicode__` to `__str__`.
cassandra/cqlengine/__init__.py	Removes `UnicodeMixin` definition entirely.


setup.py

cassandra/datastax/graph/graphson.py

tests/unit/test_row_factories.py

tests/unit/test_orderedmap.py

The subprocess module has been part of Python's standard library since Python 2.4. The try/except ImportError guard and has_subprocess flag were unreachable dead code that added unnecessary indentation and complexity to the doc-building logic. Replace with a direct 'import subprocess' and remove the conditional guard around the documentation build steps.

Python 3 uses __bool__ for truth-value testing; __nonzero__ was the Python 2 equivalent. The code previously defined __nonzero__ and aliased __bool__ = __nonzero__ for cross-compatibility. Since Python 3 never calls __nonzero__, rename the method directly to __bool__ and remove the alias.
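A short sketch of why the alias was the only part doing any work on Python 3. Both classes are hypothetical examples, not the class touched by this commit:

```python
class LegacyResult:
    # Python 3 never calls __nonzero__, so this method is ignored and
    # the instance falls back to the default truthiness (True).
    def __nonzero__(self):
        return False


class Result:
    def __init__(self, rows):
        self.rows = list(rows)

    # Python 3 consults __bool__ for truth-value testing.
    def __bool__(self):
        return len(self.rows) > 0


legacy_truth = bool(LegacyResult())
empty_truth = bool(Result([]))
full_truth = bool(Result([1]))
```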

This integration test was permanently dead code: it contained a guard 'if sys.version_info[0:2] != (2, 7): raise SkipTest(...)' which means it was always skipped on Python 3. The skip reason stated that the test compares static strings from dict items whose ordering is not deterministic on Python 3. Since the driver no longer supports Python 2, and fixing the test to use order-independent comparison is a separate concern, remove the permanently-skipped test entirely.

Two code blocks in test_validation.py were guarded by 'if sys.version_info < (3, 1):', making them unreachable on Python 3. The blocks used unichr() (a Python 2 builtin that does not exist in Python 3) and u'' string prefixes for unicode validation tests. In Python 3, chr() already returns a unicode character and all strings are unicode, so the adjacent chr(233) tests already cover the same functionality. Remove the dead blocks entirely.
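As a quick check of the claim that chr() covers what unichr() used to do, the following sketch uses only built-ins:

```python
import builtins

# unichr() does not exist on Python 3.
has_unichr = hasattr(builtins, "unichr")

# chr() returns a one-character str, and every str is unicode text.
e_acute = chr(233)
utf8_bytes = e_acute.encode("utf-8")
```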

…3.9+

Three version guards were left over from Python 2/3.x compatibility:
1. cluster.py: 'if sys.version_info[0] >= 3 and sys.version_info[1] >= 7' guarded the Eventlet/futurist ThreadPoolExecutor workaround. Since the driver requires 3.9+, this is always True. Removed the guard, dedented the body, and updated the docstring and error message to drop the 'Python 3.7+' qualifier (the issue is inherent to Eventlet, not a version-specific regression).
2. test_row_factories.py: NAMEDTUPLE_CREATION_BUG was defined as 'sys.version_info >= (3,) and sys.version_info < (3, 7)', which is always False on 3.9+. The test's dead branch tested a warning path that can never trigger. Removed the constant, the dead branch, the unused 'sys' import, and simplified the test to just verify long column lists work.
3. test_insights.py: 'if sys.version_info > (3,)' guarded a namespace suffix that is always needed on Python 3. Removed the guard and the now-unused 'sys' import.

All 608 unit tests pass.
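The three guard expressions quoted above can be evaluated directly. This sketch assumes it runs on one of the supported interpreters (Python 3.9 or newer); the variable names are made up for illustration:

```python
import sys

# Guard from cluster.py: always True on any 3.x release from 3.7 onward.
eventlet_guard = sys.version_info[0] >= 3 and sys.version_info[1] >= 7

# Constant from test_row_factories.py: only True on Python 3.0 through 3.6.
namedtuple_creation_bug = sys.version_info >= (3,) and sys.version_info < (3, 7)

# Guard from test_insights.py: True on every Python 3 release.
insights_guard = sys.version_info > (3,)

print(eventlet_guard, namedtuple_creation_bug, insights_guard)
```

On a supported interpreter the first and third are constant True and the second is constant False, so each guarded branch is either always taken or never taken.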

On Python 3, the u'' prefix is a no-op since all strings are already unicode. These prefixes were left over from Python 2 compatibility and add visual noise without any semantic effect. Removed u'' prefixes from:
- cassandra/query.py: __str__ methods for SimpleStatement, PreparedStatement, BoundStatement, BatchStatement, and a docstring example showing OrderedMapSerializedKey output
- cassandra/datastax/graph/query.py: GraphStatement.__str__
- cassandra/datastax/graph/fluent/_query.py: TraversalBatch.__str__ and as_graph_statement query construction
- docs/conf.py: project name and copyright strings

All 608 unit tests pass.
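The no-op claim can be checked with plain literals (the strings are arbitrary examples):

```python
prefixed = u"caf\u00e9"
plain = "caf\u00e9"

# Both literals produce the same type and the same value on Python 3.
same_type = type(prefixed) is type(plain) is str
same_value = prefixed == plain
```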

On Python 3, the u'' prefix is a no-op since all strings are already unicode. These prefixes were left over from Python 2 compatibility. Removed 67 u-prefix occurrences across 9 test files:
- tests/unit/test_types.py (3)
- tests/unit/test_orderedmap.py (3)
- tests/unit/test_marshalling.py (6)
- tests/unit/test_metadata.py (27)
- tests/integration/standard/test_types.py (4)
- tests/integration/standard/test_query.py (13)
- tests/integration/standard/test_cluster.py (8)
- tests/integration/cqlengine/model/test_udts.py (1)
- tests/integration/cqlengine/model/test_model_io.py (2)

All 608 unit tests pass.

Several comments and docstrings still referenced Python 2 concepts that no longer apply now that the driver requires Python 3.9+:
- encoder.py: Updated cql_encode_unicode() docstring to note it is unused since Python 2 removal (str is always unicode on Python 3). Also fixed the method body: it was calling val.encode('utf-8') which on Python 3 converts str to bytes, producing wrong output. Now it passes val directly to cql_quote.
- metadata.py: Changed 'will always be a unicode' to 'will always be a str' (line 2155). Updated unhexlify comment to say 'str input' instead of 'unicode input' and fixed typo 'everythin' (line 2350).
- graphson.py: Removed '(PY2)'/'(PY3)' qualifiers from the type mapping table. Updated 'long' to 'int' for varint, 'str (unicode)' to 'str' for inet, removed 'buffer (PY2)' from blob entries.
- util.py: Updated comment on _positional_rename_invalid_identifiers to remove stale 'Python 2.6' reference.
- asyncorereactor.py: Removed stale 'TODO: Remove when Python 2 support is removed' since Python 2 support has been removed. The guard itself is still needed for interpreter shutdown scenarios.

All 608 unit tests pass.
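To illustrate the encoder.py fix, here is a sketch of how encoding before quoting goes wrong on Python 3. The quote() function is a hypothetical stand-in written for this example; it is not the driver's cql_quote and may differ from it in detail:

```python
def quote(term):
    """Hypothetical stand-in for a CQL quoting helper (an assumption)."""
    if isinstance(term, str):
        # Escape embedded single quotes by doubling them.
        return "'%s'" % term.replace("'", "''")
    # Anything that is not str is rendered with str().
    return str(term)


value = "it's"

# Old shape: encode first, so quote() receives bytes and the result
# carries the b-prefixed repr instead of a quoted CQL string.
old_output = quote(value.encode("utf-8"))

# New shape: pass the str straight through.
new_output = quote(value)
```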

mykaul · 2026-03-05T17:20:31Z

Fixed all comments.

mykaul · 2026-03-05T17:21:03Z

@copilot code review[agent] - please re-review

mykaul added 10 commits March 4, 2026 18:58

mykaul marked this pull request as draft March 4, 2026 19:57

mykaul requested a review from Copilot March 4, 2026 19:57

Copilot started reviewing on behalf of mykaul March 4, 2026 19:58

Copilot AI reviewed Mar 4, 2026


mykaul added 8 commits March 5, 2026 18:34

mykaul force-pushed the python_2_no_more branch from 172606e to 59b6813 Compare March 5, 2026 16:43

mykaul wants to merge 18 commits into scylladb:master from mykaul:python_2_no_more

2 participants
