Dennis Sweeney <sweeney.dennis...@gmail.com> added the comment:

Indeed, this is just a very unlucky case.

    >>> n = len(longer)
    >>> from collections import Counter
    >>> Counter(s[:n])
    Counter({0: 9056995, 255: 6346813})
    >>> s[n-30:n+30].replace(b'\x00', b'.').replace(b'\xff', b'@')
    b'..............................@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@'
    >>> Counter(s[n:])
    Counter({255: 18150624})


When checking "base", we're in this situation:

    pattern:     @@@@@@@@
     string:     .........@@@@@@@@
    Algorithm says:     ^ these last characters don't match.
                         ^ this next character is not in the pattern
                         Therefore, skip ahead a bunch:

     pattern:              @@@@@@@@
      string:     .........@@@@@@@@

     This is a match!


Whereas when checking "longer", we're in this situation:

    pattern:     @@@@@@@@@
     string:     .........@@@@@@@@
    Algorithm says:      ^ these last characters don't match.
                          ^ this next character *is* in the pattern.
                          We can't jump forward.

     pattern:      @@@@@@@@@
      string:     .........@@@@@@@@

     Start comparing at every single alignment...
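
To make the difference between the two cases concrete, here is a minimal
sketch of the skip heuristic described above. It is not the actual C
implementation: count_alignments is a hypothetical helper, an exact set
stands in for whatever membership test the real code uses, and the shift
after a failed candidate is simplified to a single step. It only counts how
many alignments get tried.

```python
from typing import Tuple


def count_alignments(haystack: bytes, needle: bytes) -> Tuple[int, int]:
    """Return (index, alignments_tried) for a simplified skip-ahead search.

    Hypothetical illustration only; not CPython's real search routine.
    """
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0, 0
    chars = set(needle)  # assumption: exact set of bytes occurring in the pattern
    last = needle[-1]
    i = 0
    tried = 0
    while i <= n - m:
        tried += 1
        # Check the last byte of the window first, then the whole window.
        if haystack[i + m - 1] == last and haystack[i:i + m] == needle:
            return i, tried
        # Look at the byte just past the current window.
        nxt = haystack[i + m] if i + m < n else None
        if nxt is not None and nxt not in chars:
            # No match can overlap that byte, so jump past it.
            i += m + 1
        else:
            # The byte is in the pattern: we can only advance by one.
            i += 1
    return -1, tried


haystack = b'.' * 9 + b'@' * 12
print(count_alignments(haystack, b'@' * 8))  # (9, 2)
print(count_alignments(haystack, b'@' * 9))  # (9, 10)
```

With the shorter pattern the search jumps straight over the run of dots,
while the pattern that is one byte longer has to try every alignment along
the way. On inputs with millions of bytes, that gap grows accordingly.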


I'm attaching reproducer.py, which replicates this from scratch without loading 
data from a file.

----------
Added file: https://bugs.python.org/file49499/reproducer.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41972>
_______________________________________