@codeflash-ai codeflash-ai bot commented Feb 10, 2026

⚡️ This pull request contains optimizations for PR #1443

If you approve this dependent PR, these changes will be merged into the original PR branch fix/java-exception-assignment-instrumentation.

This PR will be automatically closed if the original PR is merged.


📄 33% (0.33x) speedup for JavaAssertTransformer._detect_variable_assignment in codeflash/languages/java/remove_asserts.py

⏱️ Runtime : 1.63 milliseconds → 1.23 milliseconds (best of 250 runs)

📝 Explanation and details

The optimization achieves a 33% runtime speedup (from 1.63ms to 1.23ms) by removing per-call regex setup overhead through two key changes:

What Changed:

  1. Precompiled regex pattern: The regex pattern r"(\w+(?:<[^>]+>)?)\s+(\w+)\s*=\s*$" is now compiled once in __init__ and stored as self._assign_re, so _detect_variable_assignment no longer hands the raw pattern string to re.search (and pays for resolving it) on every call.

  2. Direct substring search: Instead of first extracting line_before_assert = source[line_start:assertion_start] and then searching it, the optimized version searches the source string in place using self._assign_re.search(source, line_start, assertion_start), passing the bounds as the pos and endpos arguments.
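
The two changes above can be sketched roughly as follows. This is a reconstruction from the description, not the code in remove_asserts.py: the class name AssignmentDetector, the function names, and the use of rfind to locate the line start are assumptions made for illustration; only the pattern, the slice expression, and the search(source, line_start, assertion_start) call are quoted from the PR.

```python
import re

# Pattern quoted in the PR description.
ASSIGN_PATTERN = r"(\w+(?:<[^>]+>)?)\s+(\w+)\s*=\s*$"


def detect_before(source, assertion_start):
    """Assumed original shape: slice the line prefix, then call re.search."""
    line_start = source.rfind("\n", 0, assertion_start) + 1
    line_before_assert = source[line_start:assertion_start]
    match = re.search(ASSIGN_PATTERN, line_before_assert)
    return (match.group(1), match.group(2)) if match else (None, None)


class AssignmentDetector:
    """Hypothetical stand-in for JavaAssertTransformer (illustration only)."""

    def __init__(self):
        # Compile once per instance instead of passing the string on each call.
        self._assign_re = re.compile(ASSIGN_PATTERN)

    def detect(self, source, assertion_start):
        line_start = source.rfind("\n", 0, assertion_start) + 1
        # pos/endpos bound the search; "$" matches at endpos, so no slice is needed.
        match = self._assign_re.search(source, line_start, assertion_start)
        return (match.group(1), match.group(2)) if match else (None, None)


if __name__ == "__main__":
    source = (
        "int x = 1;\n"
        "IllegalArgumentException exception = "
        "assertThrows(IllegalArgumentException.class, () -> {});"
    )
    start = source.index("assertThrows")
    print(detect_before(source, start))
    print(AssignmentDetector().detect(source, start))
```

For in-range offsets the two forms should agree, because endpos makes the search behave as if the string ended there. One caveat worth checking in the real code: in CPython a negative endpos is clamped to 0 rather than counted from the end as a slice index would be, so the two forms can diverge for a negative assertion_start.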

Why This Is Faster:

  • Per-call regex setup overhead eliminated: Line profiler shows the original code spent 53.4% of total time (3.89ms out of 7.29ms) on re.search(pattern, line_before_assert). This line was called 1,057 times, so the pattern string had to be resolved by the re module on each of those 1,057 calls. (CPython caches compiled patterns, so this cost is mostly cache lookup and dispatch rather than a full recompile.) With a precompiled pattern, the same line accounts for just 30.8% (1.20ms out of 3.91ms).

  • Reduced string allocations: By passing line_start and assertion_start as positional bounds to search(), we avoid creating the temporary line_before_assert substring (which took 5% of time in the original), reducing memory churn.
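
A rough way to reproduce this comparison locally is a small timeit harness such as the one below. It is a sketch, not the profiler run quoted above: the function names and the sample source line are made up for illustration, and absolute timings will vary by machine and Python version, so no expected numbers are given.

```python
import re
import timeit

PATTERN = r"(\w+(?:<[^>]+>)?)\s+(\w+)\s*=\s*$"
COMPILED = re.compile(PATTERN)
SOURCE = "IllegalArgumentException exception = assertThrows(() -> {});"
END = SOURCE.index("assertThrows")


def module_level():
    # Slice first, then let the re module resolve the pattern string each call.
    return re.search(PATTERN, SOURCE[:END])


def precompiled():
    # Reuse the compiled pattern and bound the search with pos/endpos.
    return COMPILED.search(SOURCE, 0, END)


def compare(number=20_000):
    """Return total seconds for `number` calls of each variant."""
    return {
        "re.search + slice": timeit.timeit(module_level, number=number),
        "compiled.search(pos, endpos)": timeit.timeit(precompiled, number=number),
    }


if __name__ == "__main__":
    for label, seconds in compare().items():
        print(f"{label}: {seconds:.4f}s")
```

Both variants return the same groups for this input; only the per-call overhead differs.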

Performance Across Test Cases:
The optimization shows improvements across nearly all scenarios (the one regression is the error path where None is passed as the source, which is 10.8% slower):

  • Simple cases: 35-45% faster (e.g., simple variable assignment: 39.1% faster)
  • No-match cases: 82-101% faster (e.g., no assignment: 101% faster) - the fixed per-call regex setup cost dominates when there is nothing to match
  • Complex generics: Still 6-14% faster despite more complex matching
  • Large-scale test (1000 iterations): 36.7% faster, suggesting the benefit scales with repeated calls

Impact on Workloads:
Since _detect_variable_assignment is called for every assertion in Java test code being analyzed, and the JavaAssertTransformer is likely instantiated once per file/session, this optimization provides cumulative benefits. The precompilation happens once at instantiation, then every subsequent call benefits from the compiled pattern - making it especially valuable when processing files with many assertions (as demonstrated by the 1000-iteration test showing consistent 36.7% improvement).

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 1104 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
import pytest  # used for our unit tests
from codeflash.languages.java.remove_asserts import JavaAssertTransformer

def test_simple_variable_assignment_detection():
    # Create transformer instance (analyzer is optional and not used by this helper)
    transformer = JavaAssertTransformer(function_name="assertThrows")

    # Build a simple source line where a variable of a concrete type is assigned the result of an assertion call.
    source = "IllegalArgumentException exception = assertThrows(new Runnable() {});"
    # Find the index where the assertion invocation begins
    assertion_start = source.index("assertThrows")

    # Call the private helper and assert we get the expected (type, name)
    var_type, var_name = transformer._detect_variable_assignment(source, assertion_start) # 4.21μs -> 3.03μs (39.1% faster)

def test_generic_type_assignment_detection():
    # Test a generic type such as List<String>
    transformer = JavaAssertTransformer(function_name="assertThrows")

    source = "List<String> result = assertThrows(() -> {});"
    assertion_start = source.index("assertThrows")

    var_type, var_name = transformer._detect_variable_assignment(source, assertion_start) # 4.20μs -> 2.98μs (40.9% faster)

def test_no_assignment_returns_none():
    # If there is no variable assignment before the assertion, function should return (None, None)
    transformer = JavaAssertTransformer(function_name="assertThrows")

    source = "assertThrows(() -> { /* no assignment */ });"  # assertion starts at beginning of line
    assertion_start = source.index("assertThrows")

    var_type, var_name = transformer._detect_variable_assignment(source, assertion_start) # 2.10μs -> 1.04μs (101% faster)

def test_indented_assignment_and_tab_between_type_and_name():
    # Leading whitespace and tabs between type and variable name should still be matched
    transformer = JavaAssertTransformer(function_name="assertSomething")

    # Indentation and a tab between type and variable
    source = "    Map<String, Integer>\tmyMap = assertSomething();"
    assertion_start = source.index("assertSomething")

    var_type, var_name = transformer._detect_variable_assignment(source, assertion_start) # 4.36μs -> 3.23μs (34.8% faster)

def test_variable_with_underscore_and_numbers():
    # Variable names with underscores and types with numbers should match (\w includes digits and underscores)
    transformer = JavaAssertTransformer(function_name="assertThrows")

    source = "Type123 var_name_1 = assertThrows();"
    assertion_start = source.index("assertThrows")

    var_type, var_name = transformer._detect_variable_assignment(source, assertion_start) # 3.86μs -> 2.84μs (35.9% faster)

def test_assertion_at_start_of_file_no_assignment_returns_none():
    # When the assertion is at the very start of the file, there is no previous line content to match
    transformer = JavaAssertTransformer(function_name="assertThrows")

    source = "assertThrows(() -> {});"
    assertion_start = 0  # assertion starts at the beginning of file

    var_type, var_name = transformer._detect_variable_assignment(source, assertion_start) # 2.05μs -> 1.12μs (82.9% faster)

def test_passing_none_for_source_raises_attribute_error():
    # The function expects a str. Passing None should raise an AttributeError when it tries to call rfind.
    transformer = JavaAssertTransformer(function_name="assertThrows")

    with pytest.raises(AttributeError):
        # This will fail at source.rfind(...) inside the method because NoneType has no rfind
        transformer._detect_variable_assignment(None, 5) # 2.80μs -> 3.14μs (10.8% slower)

def test_qualified_type_with_dots_is_not_matched():
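    # A fully qualified type (with dots) cannot be captured whole because the type group only allows \w and <...>.
    # Note: an unanchored search can still match the trailing simple name ("MyType") and the variable.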
    transformer = JavaAssertTransformer(function_name="assertThrows")

    source = "com.example.MyType myVar = assertThrows();"
    assertion_start = source.index("assertThrows")

    var_type, var_name = transformer._detect_variable_assignment(source, assertion_start) # 6.50μs -> 5.42μs (20.0% faster)

def test_assertion_start_beyond_length_still_uses_line_before_end():
    # If assertion_start is beyond the end of the source, slicing will use the full line content.
    transformer = JavaAssertTransformer(function_name="assertThrows")

    # Provide a line that ends with " = " which would normally indicate an assignment
    source = "ShortType shortVar = "
    assertion_start = len(source) + 50  # intentionally beyond the string length

    var_type, var_name = transformer._detect_variable_assignment(source, assertion_start) # 4.00μs -> 2.83μs (40.9% faster)

def test_large_scale_detection_repeated_calls():
    # Build a large source with many lines and place one variable assignment near the end.
    transformer = JavaAssertTransformer(function_name="assertThrows")

    # Create 1000 lines of filler and then the target line
    filler_lines = [f"int filler{i} = {i};" for i in range(999)]
    target_line = "MyBigType myBigVar = assertThrows(new Runnable() {});"
    all_lines = filler_lines + [target_line]
    source = "\n".join(all_lines)

    # Compute the assertion start index for the target line
    assertion_start = source.index("assertThrows")

    # Call the function repeatedly to exercise performance with repeated usage
    for _ in range(1000):  # repeat the call 1000 times
        var_type, var_name = transformer._detect_variable_assignment(source, assertion_start) # 1.25ms -> 910μs (36.7% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.languages.java.parser import JavaAnalyzer, get_java_analyzer
from codeflash.languages.java.remove_asserts import JavaAssertTransformer

class TestDetectVariableAssignment:
    """Test suite for JavaAssertTransformer._detect_variable_assignment method."""

    def setup_method(self):
        """Set up test fixtures before each test method."""
        # Create a real JavaAssertTransformer instance for testing
        self.transformer = JavaAssertTransformer(
            function_name="testMethod",
            qualified_name="com.example.TestClass.testMethod"
        )

    # ========== BASIC TESTS ==========
    # These tests verify fundamental functionality under normal conditions

    def test_simple_variable_assignment_with_exception(self):
        """Test detection of simple exception variable assignment to assertThrows."""
        # Arrange: source code with exception variable assignment
        source = "IllegalArgumentException exception = assertThrows(IllegalArgumentException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 5.37μs -> 3.69μs (45.3% faster)

    def test_simple_variable_assignment_with_ex(self):
        """Test detection of 'ex' variable name in assertion assignment."""
        # Arrange: source code with 'ex' as variable name
        source = "Exception ex = assertThrows(Exception.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.47μs -> 3.26μs (37.2% faster)

    def test_no_variable_assignment(self):
        """Test when assertion has no variable assignment."""
        # Arrange: source code without variable assignment (just calling assertThrows)
        source = "assertThrows(IllegalArgumentException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 2.37μs -> 1.25μs (90.2% faster)

    def test_multiple_spaces_between_type_and_name(self):
        """Test detection with multiple spaces between type and variable name."""
        # Arrange: source code with extra whitespace
        source = "IOException   ioError   =   assertThrows(IOException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.72μs -> 3.36μs (40.4% faster)

    def test_variable_name_with_numbers(self):
        """Test detection of variable names containing numbers."""
        # Arrange: source code with numbered variable
        source = "RuntimeException error123 = assertThrows(RuntimeException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.49μs -> 3.22μs (39.1% faster)

    def test_generic_type_simple(self):
        """Test detection with simple generic types."""
        # Arrange: source code with generic type parameter
        source = "List<String> result = assertThrows(Exception.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.65μs -> 3.29μs (41.3% faster)

    def test_generic_type_nested(self):
        """Test detection with nested generic types."""
        # Arrange: source code with nested generic type
        source = "Map<String, List<Integer>> data = assertThrows(Exception.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 9.53μs -> 8.32μs (14.5% faster)

    def test_assertion_at_line_start(self):
        """Test when assertion call is at the beginning of a line."""
        # Arrange: source code with assertion at line start (no variable assignment)
        source = "assertThrows(IllegalArgumentException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 2.35μs -> 1.22μs (92.3% faster)

    def test_multiline_with_newline_before(self):
        """Test detection when there's a newline before the assertion."""
        # Arrange: source code with newline before assertion
        source = "SomeClass obj = new SomeClass();\nNullPointerException npe = assertThrows(NullPointerException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.52μs -> 3.20μs (41.4% faster)

    def test_tab_characters_in_whitespace(self):
        """Test detection with tab characters as whitespace."""
        # Arrange: source code with tabs
        source = "ArrayIndexOutOfBoundsException\terrorArray\t=\tassertThrows(ArrayIndexOutOfBoundsException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.57μs -> 3.21μs (42.3% faster)

    # ========== EDGE TESTS ==========
    # These tests evaluate behavior under extreme or unusual conditions

    def test_empty_source_string(self):
        """Test with empty source code."""
        # Arrange: empty source
        source = ""
        assertion_start = 0
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 2.47μs -> 1.36μs (81.8% faster)

    def test_assertion_at_position_zero(self):
        """Test when assertion is at position 0 in source."""
        # Arrange: assertion at the very start
        source = "assertThrows(Exception.class, () -> {});"
        assertion_start = 0
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 2.42μs -> 1.33μs (81.7% faster)

    def test_assertion_position_beyond_assignment(self):
        """Test when assertion_start points after the actual assignment."""
        # Arrange: source with assignment, but assertion_start in middle of assertThrows
        source = "MyException myEx = assertThrows(MyException.class, () -> {});"
        assertion_start = source.find("assertThrows") + 5  # Point into "assertThrows"
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 10.6μs -> 9.55μs (10.9% faster)

    def test_underscore_in_type_name(self):
        """Test with underscores in type name (though uncommon)."""
        # Arrange: source with underscore in type (if supported by Java)
        source = "Custom_Exception err = assertThrows(Custom_Exception.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.59μs -> 3.24μs (42.0% faster)
        
        # Assert: verify underscore handling (may not match if pattern doesn't include _)
        # This tests edge case behavior
        if var_type is not None:
            pass

    def test_very_long_type_name(self):
        """Test with very long generic type parameter."""
        # Arrange: source with long type name
        source = "Map<String, Map<String, List<String>>> veryLongName = assertThrows(Exception.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 14.7μs -> 13.5μs (9.24% faster)

    def test_single_character_variable_name(self):
        """Test with single character variable name."""
        # Arrange: source with single char variable
        source = "IOException e = assertThrows(IOException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.49μs -> 3.12μs (43.6% faster)

    def test_equals_without_space_before_assignment(self):
        """Test with no space before equals sign."""
        # Arrange: source with equals directly after type
        source = "Exception error= assertThrows(Exception.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.31μs -> 3.07μs (40.2% faster)

    def test_equals_without_space_after_assignment(self):
        """Test with no space after equals sign."""
        # Arrange: source with equals directly before method
        source = "RuntimeException err =assertThrows(RuntimeException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.33μs -> 3.11μs (39.2% faster)

    def test_fully_qualified_type_name(self):
        """Test when type includes package (the dotted name cannot be captured whole; an unanchored search may still match the trailing simple name)."""
        # Arrange: source with fully qualified type
        source = "java.io.IOException ioex = assertThrows(java.io.IOException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 5.50μs -> 4.05μs (35.7% faster)

    def test_keyword_like_variable_name(self):
        """Test with variable name that resembles keyword (but is valid)."""
        # Arrange: source with 'exception' as variable (resembles Exception type)
        source = "IOException exception = assertThrows(IOException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.28μs -> 3.07μs (39.7% faster)

    def test_closing_bracket_in_generic_type(self):
        """Test that closing bracket in generic is properly handled."""
        # Arrange: source with generic type having multiple levels
        source = "Optional<RuntimeException> opt = assertThrows(RuntimeException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.58μs -> 3.18μs (44.2% faster)

    def test_assertion_start_greater_than_source_length(self):
        """Test when assertion_start exceeds source length."""
        # Arrange: invalid assertion_start position
        source = "Exception ex = assertThrows(Exception.class, () -> {});"
        assertion_start = len(source) + 100
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 15.6μs -> 14.3μs (8.75% faster)

    def test_negative_assertion_start_position(self):
        """Test with negative assertion_start (edge case)."""
        # Arrange: negative position (edge case, shouldn't occur in practice)
        source = "Exception ex = assertThrows(Exception.class, () -> {});"
        assertion_start = -1
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 15.2μs -> 1.48μs (928% faster)

    def test_special_characters_in_source_before_assignment(self):
        """Test with special characters before the assignment."""
        # Arrange: source with special chars (comments, strings) before
        source = '// someComment\nNullPointerException npe = assertThrows(NullPointerException.class, () -> {});'
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.37μs -> 3.12μs (40.0% faster)

    def test_generic_type_with_wildcard(self):
        """Test generic type with wildcard parameter."""
        # Arrange: source with wildcard in generic
        source = "List<?> items = assertThrows(Exception.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.43μs -> 3.34μs (32.6% faster)

    def test_generic_type_with_extends_bound(self):
        """Test generic type with extends bound."""
        # Arrange: source with bounded type parameter
        source = "List<? extends Exception> errors = assertThrows(Exception.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.53μs -> 3.26μs (39.0% faster)
        
        # Assert: pattern may not fully match complex bounds, test for robustness
        # The pattern uses [^>]+ which stops at first >
        if var_type is not None:
            pass

    def test_multiple_equals_on_line(self):
        """Test when there are multiple equals signs on the line."""
        # Arrange: source with multiple equals (comparison before assignment)
        source = "if (a == b) IOException ex = assertThrows(IOException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 5.29μs -> 4.05μs (30.8% faster)

    # ========== LARGE-SCALE TESTS ==========
    # These tests assess performance and scalability with large data

    def test_very_long_source_file_with_many_assertions(self):
        """Test with a large source file containing many assertion lines."""
        # Arrange: build a source string with 1000 lines before the target assertion
        lines = []
        for i in range(1000):
            if i % 10 == 0:
                lines.append(f"SomeException ex{i} = assertThrows(SomeException.class, () -> {{}});")
            else:
                lines.append(f"int x{i} = {i};")
        
        # Add the target assertion
        target_assertion = "TargetException targetEx = assertThrows(TargetException.class, () -> {});"
        lines.append(target_assertion)
        source = "\n".join(lines)
        
        # Find position of the target assertion
        assertion_start = source.rfind("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.81μs -> 3.35μs (43.4% faster)

    def test_very_long_single_line(self):
        """Test with a very long single line (no newlines) before assertion."""
        # Arrange: create a long line of variable declarations
        prefix = "int a=1, b=2, c=3;" * 100  # Repeat to make it very long
        source = prefix + "LongException longEx = assertThrows(LongException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 104μs -> 102μs (1.38% faster)

    def test_deeply_nested_generic_types(self):
        """Test with deeply nested generic type parameters (10 levels)."""
        # Arrange: build a deeply nested generic type
        generic_type = "Map"
        for i in range(10):
            generic_type += "<String, "
        generic_type += "String"
        for _ in range(10):
            generic_type += ">"
        
        source = f"{generic_type} deepMap = assertThrows(Exception.class, () -> {{}});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 20.5μs -> 19.1μs (7.00% faster)

    def test_many_variables_with_similar_names(self):
        """Test detection when many similar variable names exist nearby."""
        # Arrange: source with many similar declarations
        source = ""
        for i in range(100):
            source += f"Exception ex{i} = someMethod{i}();\n"
        
        source += "IOException targetException = assertThrows(IOException.class, () -> {});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.73μs -> 3.40μs (39.2% faster)

    def test_whitespace_variations_in_large_context(self):
        """Test with various whitespace patterns in a large file."""
        # Arrange: create source with inconsistent whitespace patterns
        source = ""
        patterns = [
            "IOException  e  =  assertThrows(IOException.class, () -> {});",
            "RuntimeException\re\r=\rassertThrows(RuntimeException.class, () -> {});",
            "IllegalStateException\tise\t=\tassertThrows(IllegalStateException.class, () -> {});",
        ]
        
        for pattern in patterns:
            source += pattern + "\n"
        
        # Test each pattern
        for pattern in patterns:
            assertion_start = pattern.find("assertThrows")
            # Create a source with just this pattern
            test_source = pattern
            actual_assertion_start = test_source.find("assertThrows")
            
            # Act: call the method under test
            var_type, var_name = self.transformer._detect_variable_assignment(test_source, actual_assertion_start) # 7.90μs -> 5.72μs (38.2% faster)

    def test_large_number_of_assertions_in_sequence(self):
        """Test with 500 sequential assertion assignments."""
        # Arrange: create 500 sequential assertions
        source_lines = []
        for i in range(500):
            source_lines.append(f"Exception ex{i} = assertThrows(Exception.class, () -> {{}});")
        
        source = "\n".join(source_lines)
        
        # Test first, middle, and last assertions
        test_indices = [0, 250, 499]
        for idx in test_indices:
            line = source_lines[idx]
            # Calculate position in full source
            assertion_start = source.find("assertThrows", source.find(line))
            
            # Act: call the method under test
            var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 8.16μs -> 5.62μs (45.2% faster)

    def test_mixed_types_large_dataset(self):
        """Test with large number of different exception types."""
        # Arrange: create source with many different exception types
        exception_types = [
            "IOException", "SQLException", "RuntimeException", "IllegalArgumentException",
            "NullPointerException", "ArrayIndexOutOfBoundsException", "ClassCastException",
            "IllegalStateException", "UnsupportedOperationException", "ConcurrentModificationException"
        ]
        
        source_lines = []
        for i in range(100):
            exc_type = exception_types[i % len(exception_types)]
            source_lines.append(f"{exc_type} ex{i} = assertThrows({exc_type}.class, () -> {{}});")
        
        source = "\n".join(source_lines)
        
        # Test detection at various points
        for i in range(0, 100, 10):
            line_num = i
            exc_type = exception_types[i % len(exception_types)]
            
            # Find this line in the source
            line_pattern = f"ex{i} = assertThrows"
            assertion_start = source.find("assertThrows", source.find(line_pattern))
            
            # Act: call the method under test
            var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 16.6μs -> 12.2μs (36.9% faster)

    def test_performance_with_1000_line_source(self):
        """Test performance with 1000-line source document."""
        # Arrange: create a 1000-line source file
        source_lines = []
        for i in range(1000):
            if i % 7 == 0:
                source_lines.append(f"Exception ex{i} = assertThrows(Exception.class, () -> {{}});")
            else:
                source_lines.append(f"int value{i} = {i};")
        
        source = "\n".join(source_lines)
        
        # Find the last assertion
        assertion_start = source.rfind("assertThrows")
        
        # Act: call the method under test (should complete quickly)
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 4.69μs -> 3.03μs (54.8% faster)

    def test_extreme_generic_nesting_with_multiple_bounds(self):
        """Test with extreme generic nesting (100+ characters in type)."""
        # Arrange: create an extremely complex generic type
        complex_type = "Map<String, Map<String, Map<String, List<Optional<RuntimeException>>>>>"
        source = f"{complex_type} complex = assertThrows(RuntimeException.class, () -> {{}});"
        assertion_start = source.find("assertThrows")
        
        # Act: call the method under test
        var_type, var_name = self.transformer._detect_variable_assignment(source, assertion_start) # 22.8μs -> 21.5μs (6.11% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
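
The generated tests above record timings but show no explicit assertions, so it may help to see what the quoted pattern captures for a few of their inputs. The sketch below applies the pattern directly; groups_before is a hypothetical helper, and whether _detect_variable_assignment returns exactly these tuples is an assumption based on the description (search over the text between the line start and the assertion).

```python
import re

# Pattern quoted in the PR description.
ASSIGN_RE = re.compile(r"(\w+(?:<[^>]+>)?)\s+(\w+)\s*=\s*$")


def groups_before(source, marker="assertThrows"):
    """Return (type, name) captured just before `marker`, or (None, None)."""
    end = source.index(marker)
    start = source.rfind("\n", 0, end) + 1
    match = ASSIGN_RE.search(source, start, end)
    return (match.group(1), match.group(2)) if match else (None, None)


CASES = [
    "IllegalArgumentException exception = assertThrows(() -> {});",
    "List<String> result = assertThrows(() -> {});",
    "List<? extends Exception> errors = assertThrows(() -> {});",
    "Map<String, List<Integer>> data = assertThrows(() -> {});",
    "com.example.MyType myVar = assertThrows();",
    "assertThrows(() -> {});",
]

if __name__ == "__main__":
    for case in CASES:
        print(f"{case!r} -> {groups_before(case)}")
```

Two results are worth noting. Nested generics such as Map<String, List<Integer>> yield (None, None), because [^>]+ cannot span the inner ">". A fully qualified type such as com.example.MyType is not rejected outright: since search is unanchored on the left, it matches the trailing simple name and yields ("MyType", "myVar").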

To edit these changes, run git checkout codeflash/optimize-pr1443-2026-02-10T21.31.59 and push.

@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Feb 10, 2026
@mashraf-222 mashraf-222 merged commit 5c302bf into fix/java-exception-assignment-instrumentation Feb 11, 2026
13 of 29 checks passed
@mashraf-222 mashraf-222 deleted the codeflash/optimize-pr1443-2026-02-10T21.31.59 branch February 11, 2026 12:21