(cleanup) remove Python 2 remaining items #727
Draft
mykaul wants to merge 18 commits into scylladb:master
In Python 3, iterator objects do not have a .next() method; the built-in next() function must be used instead. The call at line 355 of test_class_construction.py used the Python 2 pattern iter(...).next(), which would raise AttributeError if ever reached at runtime. Currently the test passes only because CQLEngineException is raised before .next() is called, but this is fragile: if the exception timing changes, the test would fail with AttributeError instead of the expected CQLEngineException. Replace with next(iter(...)) for correct Python 3 usage.
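A minimal sketch of the difference described above. The list `columns` is a hypothetical stand-in for whatever iterable the test actually uses.

```python
# Hypothetical stand-in for the iterable used in the test.
columns = ["id", "name", "value"]

it = iter(columns)

# Python 3 iterators expose __next__, not the Python 2 .next() method,
# so it.next() would raise AttributeError.
has_py2_next = hasattr(it, "next")

# The built-in next() is the correct way to pull the first element.
first = next(iter(columns))

print(has_py2_next, first)
```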
In Python 3, calling str.encode() can only raise UnicodeEncodeError, never UnicodeDecodeError. The except UnicodeDecodeError branches in AsciiType.serialize and UTF8Type.serialize were leftover from Python 2, where str.encode() could trigger an implicit decode of a byte string. These dead except branches silently masked the intended behavior. In Python 3, if the input is already bytes there is no .encode() to call, so the original code would raise AttributeError rather than returning the value as-is. Replace the try/except pattern with explicit isinstance(var, bytes) checks, which correctly handles both str and bytes inputs on Python 3.
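The isinstance-based pattern can be sketched roughly as follows. `serialize_text` is a hypothetical helper written for illustration; it is not the driver's actual AsciiType.serialize or UTF8Type.serialize code.

```python
def serialize_text(var, encoding="utf-8"):
    """Hypothetical helper: encode str, pass bytes through unchanged."""
    # bytes has no .encode() on Python 3, so check the type explicitly
    # instead of relying on an except branch that can never fire.
    if isinstance(var, bytes):
        return var
    return var.encode(encoding)


encoded_str = serialize_text("café")
passthrough = serialize_text(b"raw")

# Encoding failures on Python 3 surface as UnicodeEncodeError.
try:
    serialize_text("café", encoding="ascii")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = type(exc).__name__
```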
In Python 2, filter() returned a list. In Python 3, it returns a lazy iterator that can only be consumed once. The column_aliases variable assigned from filter() at metadata.py:2273 may be iterated multiple times downstream (e.g., for length checks and enumeration), which would silently produce empty results on the second pass. Wrap the filter() call in list() to ensure the result is a concrete list that supports repeated iteration, indexing, and len().
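A small illustration of the one-shot behavior, using a made-up list rather than the real column_aliases data:

```python
# Made-up input; None and "" are falsy and get filtered out.
names = ["a", None, "b", ""]

lazy = filter(None, names)
first_pass = list(lazy)
second_pass = list(lazy)  # the iterator is already exhausted here

# Wrapping in list() gives a concrete sequence that supports
# repeated iteration, indexing, and len().
concrete = list(filter(None, names))
count = len(concrete)
head = concrete[0]
```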
The module-level "".encode("utf8") call was a workaround for CPython
bug #10923, where importing the utf8 codec for the first time in a
background thread could cause a deadlock due to the import lock.
This bug was fixed in CPython 3.3 (2012), and the driver now requires
Python 3.9+. The workaround is dead code that serves no purpose and
confuses readers.
Python 3 uses __str__ for string representation; __unicode__ was the Python 2 equivalent for unicode strings. The UnicodeMixin base class currently bridges the two by wiring __str__ to call __unicode__(), but this indirection is unnecessary on Python 3. Rename all 18 __unicode__ method definitions in statements.py directly to __str__, and update the one direct __unicode__() call in __repr__ to __str__(). The __str__ definitions on the subclasses now take precedence over the inherited UnicodeMixin.__str__ lambda, so behavior is unchanged.
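A rough sketch of why the rename is behavior-preserving. The mixin below is reconstructed from the description above (a __str__ lambda that calls __unicode__()), and `OldClause` / `NewClause` are hypothetical classes, not ones from statements.py.

```python
class UnicodeMixin:
    # Shim as described: __str__ forwards to __unicode__().
    __str__ = lambda self: self.__unicode__()


class OldClause(UnicodeMixin):
    # Before the rename: reached only via the mixin's lambda.
    def __unicode__(self):
        return '"name" = ?'


class NewClause(UnicodeMixin):
    # After the rename: defined on the subclass, so normal attribute
    # lookup finds it before the inherited UnicodeMixin.__str__.
    def __str__(self):
        return '"name" = ?'


old_text = str(OldClause())
new_text = str(NewClause())
```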
Rename the 2 __unicode__ methods in AbstractQueryableColumn and ModelQuerySet to __str__. Remove the redundant __str__ wrapper in ModelQuerySet that existed solely to bridge __str__ -> __unicode__ for Python 2 compatibility. In Python 3, __str__ is the canonical string representation method; __unicode__ served that role in Python 2. The indirection through UnicodeMixin is no longer needed for these classes.
…rs, named Rename the remaining 5 __unicode__ method definitions across cqlengine/models.py (ColumnQueryEvaluator), cqlengine/functions.py (QueryValue, Token), cqlengine/operators.py (BaseQueryOperator), and cqlengine/named.py (NamedColumn) to __str__. This is part of the systematic removal of the Python 2 UnicodeMixin pattern. The __str__ definitions on each class now take precedence over the inherited UnicodeMixin.__str__ lambda, so behavior is unchanged.
UnicodeMixin was a Python 2/3 compatibility shim that wired __str__ to call __unicode__(). Now that all __unicode__ methods have been renamed to __str__ in prior commits, UnicodeMixin serves no purpose.
- Delete the UnicodeMixin class from cassandra/cqlengine/__init__.py
- Remove UnicodeMixin from the inheritance list of 6 classes: ValueQuoter, BaseClause, BaseCQLStatement, AbstractQueryableColumn, QueryValue, BaseQueryOperator
- Remove all 'from cassandra.cqlengine import UnicodeMixin' imports
The classes now define __str__ directly, which is the standard Python 3 approach for string representation.
absolute_import became the default behavior in Python 3.0. These imports were needed in Python 2 to prevent implicit relative imports from shadowing stdlib modules (e.g., 'import io' resolving to a local module instead of the stdlib). Since the driver requires Python 3.9+, these are dead code. Removed from: cassandra/protocol.py, cassandra/cqltypes.py, cassandra/connection.py, cassandra/cluster.py, and tests/integration/cqlengine/query/test_queryset.py.
WeakSet has been available in the weakref module since Python 2.7 and in all Python 3 versions. The try/except ImportError fallback to cassandra.util.WeakSet was unreachable dead code on Python 3.
- Replace try/except with direct 'from weakref import WeakSet' in cluster.py, pool.py, and io/asyncorereactor.py
- Delete the ~210-line custom WeakSet class and its _IterationGuard helper from cassandra/util.py
- Remove the now-unused 'from _weakref import ref' import
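For reference, a short sketch of the stdlib WeakSet used through the direct import. `Connection` is a hypothetical placeholder class, not the driver's connection type.

```python
import gc
from weakref import WeakSet


class Connection:
    """Hypothetical placeholder for an object tracked by weak reference."""


live = WeakSet()
conn = Connection()
live.add(conn)
count_while_alive = len(live)

# Dropping the last strong reference lets the set forget the object.
del conn
gc.collect()
count_after_release = len(live)
```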
Pull request overview
This PR continues the Python 2 cleanup by removing remaining unicode/UnicodeMixin compatibility shims and updating tests/docs/code paths to assume Python 3-only semantics (the project now requires Python >=3.9).
Changes:
- Removes Python 2-era unicode patterns (u'', __unicode__, UnicodeMixin) and normalizes string handling across driver and cqlengine.
- Simplifies Python-version conditionals/fallback imports (e.g., WeakSet imports) and applies formatting-only refactors in several modules/tests.
- Updates unit/integration tests and Sphinx config to reflect Python 3-only behavior and representations.
Reviewed changes
Copilot reviewed 34 out of 37 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| tests/unit/test_types.py | Replaces u'' literals with plain str in type read/write tests. |
| tests/unit/test_row_factories.py | Removes Python 3.0–3.6 conditional expectations for namedtuple creation. |
| tests/unit/test_orderedmap.py | Updates unicode-key tests to Python 3 str keys. |
| tests/unit/test_metadata.py | Replaces u'' literals with str in metadata CQL export tests. |
| tests/unit/test_marshalling.py | Updates UTF8/unicode expectations to Python 3 str and cleans up ordered map inserts. |
| tests/unit/advanced/test_insights.py | Removes Python-version-specific namespace logic and reformats expected dicts. |
| tests/integration/standard/test_query.py | Updates unicode query strings/column names to Python 3 str. |
| tests/integration/standard/test_cluster.py | Updates expected row tuples to Python 3 str. |
| tests/integration/cqlengine/model/test_udts.py | Updates unicode literals to Python 3 str. |
| tests/integration/cqlengine/model/test_model_io.py | Updates unicode literals to Python 3 str in model IO assertions. |
| tests/integration/cqlengine/model/test_class_construction.py | Mostly formatting + Python 3 iterator usage (next(iter(...))) and string literal normalization. |
| tests/integration/cqlengine/columns/test_validation.py | Removes old Python-version branches and normalizes string usage/formatting in validation tests. |
| setup.py | Removes Python 2-era subprocess gating and refactors extension/doc build setup logic. |
| docs/conf.py | Normalizes string literals and formatting in Sphinx configuration. |
| cassandra/query.py | Python 3 string/formatting cleanup; keeps namedtuple fallback paths but modernizes literals/layout. |
| cassandra/pool.py | Removes legacy WeakSet fallback and reformats/shard-aware related code blocks. |
| cassandra/io/asyncorereactor.py | Removes legacy WeakSet fallback and modernizes literals/formatting. |
| cassandra/encoder.py | Deprecates Python 2 “unicode” semantics and standardizes encoding/quoting behavior for Python 3. |
| cassandra/datastax/graph/query.py | Python 3 string/formatting cleanup and minor readability refactors. |
| cassandra/datastax/graph/graphson.py | Python 3 cleanup + formatting; updates docs/comments describing supported Python types. |
| cassandra/datastax/graph/fluent/_query.py | Python 3 string/formatting cleanup and improves readability of traversal query generation. |
| cassandra/cqlengine/statements.py | Removes UnicodeMixin usage and converts __unicode__ implementations to __str__. |
| cassandra/cqlengine/operators.py | Removes UnicodeMixin usage and converts operator stringification to __str__. |
| cassandra/cqlengine/named.py | Converts __unicode__ to __str__ and normalizes string literals. |
| cassandra/cqlengine/models.py | Removes UnicodeMixin usage and normalizes string literals/formatting across model machinery. |
| cassandra/cqlengine/functions.py | Removes UnicodeMixin usage and converts __unicode__ to __str__. |
| cassandra/cqlengine/\_\_init\_\_.py | Removes UnicodeMixin definition entirely. |
The subprocess module has been part of Python's standard library since Python 2.4. The try/except ImportError guard and has_subprocess flag were unreachable dead code that added unnecessary indentation and complexity to the doc-building logic. Replace with a direct 'import subprocess' and remove the conditional guard around the documentation build steps.
Python 3 uses __bool__ for truth-value testing; __nonzero__ was the Python 2 equivalent. The code previously defined __nonzero__ and aliased __bool__ = __nonzero__ for cross-compatibility. Since Python 3 never calls __nonzero__, rename the method directly to __bool__ and remove the alias.
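A quick illustration of why the alias was the only part doing any work on Python 3. Both classes are hypothetical.

```python
class OnlyNonzero:
    # Python 3 never calls this, so instances fall back to the
    # default truthiness (True).
    def __nonzero__(self):
        return False


class WithBool:
    # Python 3 consults __bool__ for truth-value testing.
    def __bool__(self):
        return False


legacy_truthy = bool(OnlyNonzero())
modern_truthy = bool(WithBool())
```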
This integration test was permanently dead code: it contained a guard 'if sys.version_info[0:2] != (2, 7): raise SkipTest(...)' which means it was always skipped on Python 3. The skip reason stated that the test compares static strings from dict items whose ordering is not deterministic on Python 3. Since the driver no longer supports Python 2, and fixing the test to use order-independent comparison is a separate concern, remove the permanently-skipped test entirely.
Two code blocks in test_validation.py were guarded by 'if sys.version_info < (3, 1):', making them unreachable on Python 3. The blocks used unichr() (a Python 2 builtin that does not exist in Python 3) and u'' string prefixes for unicode validation tests. In Python 3, chr() already returns a unicode character and all strings are unicode, so the adjacent chr(233) tests already cover the same functionality. Remove the dead blocks entirely.
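The equivalence the commit relies on can be checked directly; this is an illustrative snippet, not code from test_validation.py.

```python
import builtins

# chr() returns a one-character str (unicode) on Python 3.
e_acute = chr(233)
is_text = isinstance(e_acute, str)

# The u'' prefix is accepted but changes nothing.
same_literal = (u"caf\u00e9" == "caf\u00e9")

# unichr() does not exist as a builtin on Python 3.
has_unichr = hasattr(builtins, "unichr")
```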
…3.9+
Three version guards were left over from Python 2/3.x compatibility:
1. cluster.py: 'if sys.version_info[0] >= 3 and sys.version_info[1] >= 7' guarded the Eventlet/futurist ThreadPoolExecutor workaround. Since the driver requires 3.9+, this is always True. Removed the guard, dedented the body, and updated the docstring and error message to drop the 'Python 3.7+' qualifier (the issue is inherent to Eventlet, not a version-specific regression).
2. test_row_factories.py: NAMEDTUPLE_CREATION_BUG was defined as 'sys.version_info >= (3,) and sys.version_info < (3, 7)', which is always False on 3.9+. The test's dead branch tested a warning path that can never trigger. Removed the constant, the dead branch, the unused 'sys' import, and simplified the test to just verify long column lists work.
3. test_insights.py: 'if sys.version_info > (3,)' guarded a namespace suffix that is always needed on Python 3. Removed the guard and the now-unused 'sys' import.
All 608 unit tests pass.
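The "always True" and "always False" claims can be sanity-checked by evaluating the guard expressions against plain tuples standing in for sys.version_info. The function names and the list of versions below are illustrative assumptions.

```python
def eventlet_guard(version):
    # Mirrors the cluster.py condition quoted above.
    return version[0] >= 3 and version[1] >= 7


def namedtuple_bug_guard(version):
    # Mirrors the NAMEDTUPLE_CREATION_BUG definition quoted above.
    return version >= (3,) and version < (3, 7)


def insights_guard(version):
    # Mirrors the test_insights.py condition quoted above.
    return version > (3,)


# Illustrative (major, minor) pairs at or above the 3.9 floor.
supported = [(3, 9), (3, 10), (3, 11), (3, 12), (3, 13)]

eventlet_always_true = all(eventlet_guard(v) for v in supported)
bug_always_false = not any(namedtuple_bug_guard(v) for v in supported)
insights_always_true = all(insights_guard(v) for v in supported)
```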
On Python 3, the u'' prefix is a no-op since all strings are already unicode. These prefixes were left over from Python 2 compatibility and add visual noise without any semantic effect. Removed u'' prefixes from:
- cassandra/query.py: __str__ methods for SimpleStatement, PreparedStatement, BoundStatement, BatchStatement, and a docstring example showing OrderedMapSerializedKey output
- cassandra/datastax/graph/query.py: GraphStatement.__str__
- cassandra/datastax/graph/fluent/_query.py: TraversalBatch.__str__ and as_graph_statement query construction
- docs/conf.py: project name and copyright strings
All 608 unit tests pass.
On Python 3, the u'' prefix is a no-op since all strings are already unicode. These prefixes were left over from Python 2 compatibility. Removed 67 u-prefix occurrences across 9 test files:
- tests/unit/test_types.py (3)
- tests/unit/test_orderedmap.py (3)
- tests/unit/test_marshalling.py (6)
- tests/unit/test_metadata.py (27)
- tests/integration/standard/test_types.py (4)
- tests/integration/standard/test_query.py (13)
- tests/integration/standard/test_cluster.py (8)
- tests/integration/cqlengine/model/test_udts.py (1)
- tests/integration/cqlengine/model/test_model_io.py (2)
All 608 unit tests pass.
Several comments and docstrings still referenced Python 2 concepts that
no longer apply now that the driver requires Python 3.9+:
- encoder.py: Updated cql_encode_unicode() docstring to note it is
unused since Python 2 removal (str is always unicode on Python 3).
Also fixed the method body: it was calling val.encode('utf-8') which
on Python 3 converts str to bytes, producing wrong output. Now it
passes val directly to cql_quote.
- metadata.py: Changed 'will always be a unicode' to 'will always be
a str' (line 2155). Updated unhexlify comment to say 'str input'
instead of 'unicode input' and fixed typo 'everythin' (line 2350).
- graphson.py: Removed '(PY2)'/'(PY3)' qualifiers from the type
mapping table. Updated 'long' to 'int' for varint, 'str (unicode)'
to 'str' for inet, removed 'buffer (PY2)' from blob entries.
- util.py: Updated comment on _positional_rename_invalid_identifiers
to remove stale 'Python 2.6' reference.
- asyncorereactor.py: Removed stale 'TODO: Remove when Python 2
support is removed' since Python 2 support has been removed. The
guard itself is still needed for interpreter shutdown scenarios.
All 608 unit tests pass.
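To illustrate the encoder.py fix described above: on Python 3, encoding before quoting hands bytes to the quoting step, and the bytes repr leaks into the output. `quote_sketch` is a simplified, hypothetical stand-in for a CQL quoting helper; it is not the driver's cql_quote.

```python
def quote_sketch(term):
    # Simplified stand-in (assumption): quote str values and fall back
    # to str() for anything else.
    if isinstance(term, str):
        return "'%s'" % term.replace("'", "''")
    return str(term)


val = "it's café"

# Old pattern: encode first, so the helper receives bytes and the
# result is the repr of a bytes object rather than a quoted string.
old_result = quote_sketch(val.encode("utf-8"))

# New pattern: pass the str through directly.
new_result = quote_sketch(val)
```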
Author: Fixed all comments.
Author: @copilot code review[agent] - please re-review
Pre-review checklist
This is 100% OpenCode's work, so take it with a grain of salt; I still need to go over it. I can also cherry-pick each commit separately. I've asked it to split the changes into independent items as much as possible.
./docs/source/.Fixes:annotations to PR description.