Dennis Sweeney <sweeney.dennis...@gmail.com> added the comment:

Indeed, this is just a very unlucky case.

    >>> n = len(longer)
    >>> from collections import Counter
    >>> Counter(s[:n])
    Counter({0: 9056995, 255: 6346813})
    >>> s[n-30:n+30].replace(b'\x00', b'.').replace(b'\xff', b'@')
    b'..............................@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@'
    >>> Counter(s[n:])
    Counter({255: 18150624})

When checking "base", we're in this situation:

           pattern: @@@@@@@@
            string: .........@@@@@@@@
    Algorithm says:        ^ these last characters don't match.
                            ^ this next character is not in the pattern
    Therefore, skip ahead a bunch:

           pattern:          @@@@@@@@
            string: .........@@@@@@@@

    This is a match!

Whereas when checking "longer", we're in this situation:

           pattern: @@@@@@@@@
            string: .........@@@@@@@@
    Algorithm says:         ^ these last characters don't match.
                             ^ this next character *is* in the pattern.
    We can't jump forward.

           pattern:  @@@@@@@@
            string: .........@@@@@@@@

    Start comparing at every single alignment...

I'm attaching reproducer.py, which replicates this from scratch without loading data from a file.

----------
Added file: https://bugs.python.org/file49499/reproducer.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41972>
_______________________________________
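
For illustration, the skip rule described in the comment can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual C implementation: the function name find_with_skip is hypothetical, an ordinary set stands in for whatever membership test the real code uses, and the only cost counted is the number of alignments tried.

```python
from typing import Tuple


def find_with_skip(haystack: bytes, needle: bytes) -> Tuple[int, int]:
    """Return (index of first match or -1, number of alignments tried).

    Hypothetical, simplified model of the skip rule in the comment above.
    """
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0, 0
    in_pattern = set(needle)  # assumption: exact set membership
    i = 0
    tried = 0
    while i + m <= n:
        tried += 1
        if haystack[i + m - 1] == needle[m - 1]:
            # Last characters match: compare the whole window.
            if haystack[i:i + m] == needle:
                return i, tried
            i += 1  # conservative shift (a simplification)
        else:
            nxt = i + m  # the character just past the current window
            if nxt < n and haystack[nxt] not in in_pattern:
                # Every alignment covering nxt must fail: jump past it.
                i += m + 1
            else:
                # The next character is in the pattern: no jump possible.
                i += 1
    return -1, tried


haystack = b"." * 9 + b"@" * 8
print(find_with_skip(haystack, b"@" * 8))  # "base"-like case
print(find_with_skip(haystack, b"@" * 9))  # "longer"-like case
```

On the small strings from the diagrams, the 8-byte pattern is found after two alignments, while the 9-byte pattern is tried at every one of the nine possible alignments. In the real data each such alignment can also cost a long character-by-character comparison, which this sketch hides behind a slice comparison.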