Bugs item #1711800, was opened at 2007-05-03 06:24
Message generated for change (Comment added) made by collinwinter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1711800&group_id=5470

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: Python Library
Group: Python 2.6
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Christian Hammond (chipx86)
Assigned to: Nobody/Anonymous (nobody)
Summary: SequenceMatcher bug with insert/delete block after "replace"

Initial Comment:
difflib.SequenceMatcher fails to distinguish between a "replace" block and
an "insert" or "delete" block when the "insert/delete" immediately follows
a "replace". It lumps both changes together as one big "replace" block.

This happens because of how get_opcodes() works. get_opcodes() loops through
the matching blocks, grouping them into tags and ranges. However, if a block
of text is changed and new text is then added immediately after it,
get_opcodes() can't see this. All it knows is that the next matching block
comes after the added text.

As an example, consider these strings:

"ABC"

"ABCD
EFG."

Any diffing program will show that the first line was replaced and the
second was inserted. SequenceMatcher, however, reports a single replace and
includes both lines in its range.

I've attached a testcase that reproduces this for both replace>insert and
replace>delete blocks.

----------------------------------------------------------------------

>Comment By: Collin Winter (collinwinter)
Date: 2007-06-05 19:40

Message:
Logged In: YES
user_id=1344176
Originator: NO

Thanks for the test case! Is there any chance you could also provide a
patch to fix it?

----------------------------------------------------------------------

Comment By: Gabriel Genellina (gagenellina)
Date: 2007-05-07 01:40

Message:
Logged In: YES
user_id=479790
Originator: NO

Maybe you are more interested in a Differ object. These are the results
from your example using
'\n'.join(difflib.Differ().compare(a,b))

- This is my old file, containing only one line.
+ This is my new file.
+ It contains multiple lines.
+ SequenceMatcher should see two blocks as a result.

and

+ This is my new file, containing only one line.
- This is my old file.
- It contains multiple lines.
- SequenceMatcher should see two blocks as a result.

----------------------------------------------------------------------
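
For reference, the behaviour described in the initial comment can be
reproduced with a short script. This is only a minimal sketch, not the
attached testcase: the inputs follow the "ABC" example from the report,
assuming the second string spans two lines, and split_replace() is a
hypothetical helper (not part of difflib) that naively pairs lines by
position to break a lopsided "replace" into a "replace" followed by an
"insert" or "delete".

```python
import difflib


def opcodes_for(a, b):
    """Return SequenceMatcher opcodes for two sequences of lines."""
    return difflib.SequenceMatcher(None, a, b).get_opcodes()


def split_replace(opcodes):
    """Hypothetical helper, not part of difflib.

    Splits each "replace" whose two ranges differ in length into a
    same-length "replace" plus a trailing "insert" or "delete".
    Lines are paired purely by position, which is an assumption and
    may not match what a human would consider the best alignment.
    """
    result = []
    for tag, i1, i2, j1, j2 in opcodes:
        len_a, len_b = i2 - i1, j2 - j1
        if tag != "replace" or len_a == len_b:
            result.append((tag, i1, i2, j1, j2))
            continue
        n = min(len_a, len_b)
        result.append(("replace", i1, i1 + n, j1, j1 + n))
        if len_b > len_a:
            # Extra lines on the "b" side: report them as an insert.
            result.append(("insert", i1 + n, i2, j1 + n, j2))
        else:
            # Extra lines on the "a" side: report them as a delete.
            result.append(("delete", i1 + n, i2, j1 + n, j2))
    return result


old = ["ABC"]
new = ["ABCD", "EFG."]

# No line is common to both sides, so there is a single "replace"
# covering every line: [('replace', 0, 1, 0, 2)]
print(opcodes_for(old, new))

# After the positional split:
# [('replace', 0, 1, 0, 1), ('insert', 1, 1, 1, 2)]
print(split_replace(opcodes_for(old, new)))

# The reverse direction yields a "replace" followed by a "delete".
print(split_replace(opcodes_for(new, old)))
```

The split only changes how an existing "replace" range is reported. It
does not change which blocks SequenceMatcher considers matching.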