
difflib.GestaltSequenceMatcher implementation #144132

@dg-pb

Description


Feature or enhancement

Proposal:

Improve result quality by making it practical to run Longest Common Subsequence matching with autojunk=False, with little to no degradation in performance.

The implemented GestaltSequenceMatcher, with default settings, gives results equivalent to SequenceMatcher(autojunk=False).

| case | (len a, len b) | Myers | difflib(aj=1) | difflib(aj=0) | Gestalt | Gestalt(balancing=0.66) |
|---|---|---|---|---|---|---|
| [L] diff_key | (15, 15) | 8 | 5 (-3) | 5 (-3) | 5 (-3) | 8 (0) |
| [L] diff_mid1 | (28, 27) | 11 | 11 (0) | 11 (0) | 11 (0) | 11 (0) |
| [L] diff_mid2 | (128, 128) | 73 | 72 (-1) | 72 (-1) | 72 (-1) | 72 (-1) |
| [L] diff_mid3 | (131, 131) | 63 | 50 (-13) | 50 (-13) | 50 (-13) | 57 (-6) |
| [L] difflib.py | (2041, 2334) | 1883 | 1878 (-5) | 1879 (-4) | 1879 (-4) | 1879 (-4) |
| [L] linux_sched.c | (11373, 11832) | 10844 | 10793 (-51) | 10837 (-7) | 10837 (-7) | 10844 (0) |
| [L] react.js | (3555, 3337) | 2631 | 2568 (-63) | 2581 (-50) | 2581 (-50) | 2581 (-50) |
| [L] webpack.js | (2327, 3429) | 929 | 772 (-157) | 899 (-30) | 899 (-30) | 916 (-13) |
| Total: | | 16442 | 16149 (-293) | 16334 (-108) | 16334 (-108) | 16368 (-74) |
| Avg % diff: | | 0% | 9.94% | 8.11% | 8.11% | 1.80% |
| Runtime (s): | | 3.311 | 0.108 | 2.789 | 0.093 | 0.113 |

The table above shows the number of individual match points, which is not a definitive metric. A more definitive metric would be the number of alignment points sacrificed for the sake of quality.

However, it can be safely assumed that autojunk=False (and the new matcher) produces higher-quality diffs by not resorting to approximations. Conveniently, it also increases the number of match points.
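For reference, "match points" here presumably means the total number of matched elements, i.e. the sum of matching-block sizes (an assumption about how the table figures were computed):

```python
import difflib

a = 'qabxcd'
b = 'abycdf'
sm = difflib.SequenceMatcher(None, a, b, autojunk=False)

# Sum of matching-block sizes; the final block returned by
# get_matching_blocks() is a zero-size sentinel, so it adds nothing.
match_points = sum(block.size for block in sm.get_matching_blocks())
print(match_points)  # 4  ('ab' and 'cd' match)
```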

The new optional balancing argument can be tuned to reduce the chance of greedily committing to unbalanced matches. This often produces more concise diffs with more lines matched, while retaining the block-oriented nature of the algorithm.
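For background, the "block-oriented nature" refers to the Ratcliff/Obershelp (gestalt) strategy: commit to the longest common block, then recurse on the pieces to either side. A rough sketch of that recursion, built on the existing SequenceMatcher (illustrative only, not the proposed implementation; a balancing knob would presumably discount long but lopsided candidate blocks before committing to them):

```python
import difflib

def gestalt_match_points(a, b):
    # Greedy gestalt recursion: take the longest common block,
    # then match the regions to its left and right independently.
    sm = difflib.SequenceMatcher(None, a, b, autojunk=False)
    m = sm.find_longest_match(0, len(a), 0, len(b))
    if m.size == 0:
        return 0
    return (m.size
            + gestalt_match_points(a[:m.a], b[:m.b])
            + gestalt_match_points(a[m.a + m.size:], b[m.b + m.size:]))

print(gestalt_match_points('qabxcd', 'abycdf'))  # 4
```

Because each step commits greedily to a single block, a long match that is badly misaligned between the two sides can lock in a poor overall alignment; that is the failure mode the balancing parameter targets.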

For visual inspection, the outputs of the different algorithms on a tailored case can be seen HERE!

Legacy

The O(n) solution is not that complex: about 100 lines of Python.

I hacked in an initial implementation and ran context_diff on difflib.py itself, comparing the file before and after the change:

```python
import difflib, timeit

with open('/Users/Edu/lib/src/cx_amb/src/cx/cli/snd/diff/difflib.py') as f:
    s1 = f.read().splitlines(keepends=True)

with open('/Users/Edu/lib/src/cx_amb/src/cx/cli/snd/diff/difflib_new.py') as f:
    s2 = f.read().splitlines(keepends=True)

timeit.timeit(
    lambda: list(difflib.context_diff(s1, s2, fromfile='before.py', tofile='after.py')),
    number=100)
# 2.4 ms per run (Current)
# 2.9 ms per run (New)
```

So roughly 20% slower on this input.
Also, this is a fairly rough plug-in, so there is a chance that the gap can be bridged.

An O(n) solution should also avoid some of the user dissatisfaction:

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

I made a couple of posts in an unrelated thread. The performance tables there show that this is faster even for inputs of size 10.
https://discuss.python.org/t/include-time-and-space-complexity-in-documentation/105683/32

Linked PRs

Metadata

Labels

performance: Performance or resource usage
stdlib: Standard Library Python modules in the Lib/ directory
type-feature: A feature request or enhancement
