[issue31561] difflib pathological behavior with mixed line endings

2017-09-24 Thread Tim Peters
Tim Peters added the comment: The text/binary distinction you have in mind doesn't particularly apply to difflib: it compares sequences of hashable objects. "Text files" are typically converted by front ends to lists of strings, but, e.g., the engine is just as happy comparing tuples of floa
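
A minimal sketch of the point above, assuming nothing beyond the standard library: `difflib.SequenceMatcher` accepts any sequences of hashable items, so tuples of floats work just as well as lines of text. The sample data `a` and `b` are invented for illustration.

```python
import difflib

# Two sequences of hashable objects (tuples of floats), not text lines.
# The values are made up for this example.
a = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
b = [(1.0, 2.0), (3.5, 4.0), (5.0, 6.0)]

# The first argument is the optional "junk" predicate; None disables it.
matcher = difflib.SequenceMatcher(None, a, b)

# Each opcode is (tag, i1, i2, j1, j2), describing how a[i1:i2]
# maps onto b[j1:j2].
opcodes = matcher.get_opcodes()
for tag, i1, i2, j1, j2 in opcodes:
    print(tag, a[i1:i2], b[j1:j2])
```

Only the middle element differs, so the engine reports one `replace` flanked by two `equal` runs.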

[issue31561] difflib pathological behavior with mixed line endings

2017-09-24 Thread Raymond Hettinger
Raymond Hettinger added the comment:
> Of course I can understand if all this is out of the scope of difflib and not an endeavor worth taking up.
I agree with that sentiment. Data normalization for comparability belongs upstream from difflib (i.e. normalizing line-endings, unicode normaliza
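
One way a caller might normalize line endings upstream of difflib, as a sketch only. The helper `normalize_newlines` and the sample strings are hypothetical and not part of difflib or the tracker discussion.

```python
import difflib


def normalize_newlines(text):
    """Hypothetical helper: map \\r\\n and bare \\r to \\n before diffing."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Same content, different line-ending conventions (invented sample data).
unix_text = "alpha\nbeta\ngamma\n"
dos_text = "alpha\r\nbeta\r\ngamma\r\n"

a = normalize_newlines(unix_text).splitlines(keepends=True)
b = normalize_newlines(dos_text).splitlines(keepends=True)

# After normalization the two line lists are identical, so
# unified_diff yields no output at all.
diff = list(difflib.unified_diff(a, b, fromfile="file1", tofile="file2"))
print(diff)
```

Doing this in the front end keeps difflib itself agnostic about what the sequence items mean.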

[issue31561] difflib pathological behavior with mixed line endings

2017-09-24 Thread Mahmoud Al-Qudsi
Mahmoud Al-Qudsi added the comment: @tim.peters No, `icdiff` is not part of core and probably should be omitted from the remainder of this discussion. I just checked, and it's actually not a mix of line endings in each file; one file uses \n and the other uses \r\n. You can actually

[issue31561] difflib pathological behavior with mixed line endings

2017-09-23 Thread Tim Peters
Tim Peters added the comment: I'm not familiar with `icdiff` - it's not part of the Python distribution, right? If so, you should really talk to its author(s). If two files have different line endings, then no pair of lines between them can be equal, and the difference engine will struggle mi
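
The effect described above can be reproduced on a tiny scale. This is an illustrative sketch with invented sample lines, not the files attached to the issue: when every line in one input ends in `\n` and every line in the other ends in `\r\n`, no two lines compare equal.

```python
import difflib

# The same three lines with different terminators (invented sample data).
unix_lines = ["alpha\n", "beta\n", "gamma\n"]
dos_lines = ["alpha\r\n", "beta\r\n", "gamma\r\n"]

matcher = difflib.SequenceMatcher(None, unix_lines, dos_lines)

# get_matching_blocks() always ends with a zero-length sentinel block;
# here the sentinel is the only entry because no line pair is equal.
blocks = matcher.get_matching_blocks()
ratio = matcher.ratio()
print(blocks, ratio)
```

At the line level the two inputs share nothing, even though they differ by a single trailing character per line.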

[issue31561] difflib pathological behavior with mixed line endings

2017-09-23 Thread Raymond Hettinger
Changes by Raymond Hettinger: nosy: +tim.peters

[issue31561] difflib pathological behavior with mixed line endings

2017-09-23 Thread Mahmoud Al-Qudsi
Mahmoud Al-Qudsi added the comment: Attaching file2. Added file: https://bugs.python.org/file47165/file2

[issue31561] difflib pathological behavior with mixed line endings

2017-09-23 Thread Mahmoud Al-Qudsi
New submission from Mahmoud Al-Qudsi: While using the icdiff command line interface to difflib, I ran into an interesting issue where difflib took 47 seconds to compare two simple text documents (a PHP source code file that had been refactored via phptidy). On subsequent analysis, it turned ou