Jan <pf...@yahoo.com.br> added the comment:

hi all,
just got bitten by this, so i took the time to reiterate the issue.

according to the docs (http://docs.python.org/library/difflib.html), find_longest_match() should return the longest matching string:

"If isjunk was omitted or None, find_longest_match() returns (i, j, k) such that a[i:i+k] is equal to b[j:j+k], where alo <= i <= i+k <= ahi and blo <= j <= j+k <= bhi. For all (i', j', k') meeting those conditions, the additional conditions k >= k', i <= i', and if i == i', j <= j' are also met. In other words, of all maximal matching blocks, return one that starts earliest in a, and of all those maximal matching blocks that start earliest in a, return the one that starts earliest in b."

but after a couple of hours of debugging i finally convinced myself that the bug was in the library ... and i ended up here :)

any ideas on how to work around this bug/feature and just get the longest matching string? (from a normal/newbie user perspective, that is, without patching the library code)

from the comments (which i couldn't follow entirely), does it use some concept of popularity that is not exposed by the API? how is "popularity" defined?

many thanks!

- jan

ps.: using ubuntu's python 2.5.2

ps2.: an example of a string pair where the issue shows up:

s1='Floor Box SystemsFBS Floor Box Systems - Manufacturer & supplier of FBS floor boxes, electrical ... experience, FBS Floor Box Systems continue ... raceways, floor box. ...www.floorboxsystems.com'
s2='FBS Floor Box SystemsFBS Floor Box Systems - Manufacturer & supplier of FBS floor boxes, electrical floor boxes, wood floor box, concrete floor box, surface mount floor box, raised floor ...www.floorboxsystems.com'

----------
nosy: +janpf

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1528074>
_______________________________________
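
On the workaround question in the comment above: one possible approach that needs no change to difflib is to compute the longest common substring directly with plain dynamic programming. The sketch below is only an illustration under that assumption, not how difflib works internally. The function name longest_common_substring is hypothetical, and the tie-breaking (earliest in a, then earliest in b) is written to mirror the documented contract quoted in the comment. It uses no popularity heuristic and costs O(len(a) * len(b)) time.

```python
def longest_common_substring(a, b):
    """Return (i, j, k) such that a[i:i+k] == b[j:j+k] and k is maximal.

    Among maximal matches, prefer the smallest i, then the smallest j.
    Hypothetical helper for illustration; it is not part of difflib.
    """
    best_i, best_j, best_k = 0, 0, 0
    # prev[j + 1] holds the length of the common run ending at a[i - 1], b[j].
    prev = [0] * (len(b) + 1)
    for i, ca in enumerate(a):
        cur = [0] * (len(b) + 1)
        for j, cb in enumerate(b):
            if ca == cb:
                k = prev[j] + 1          # extend the run ending one step earlier
                cur[j + 1] = k
                start = (i - k + 1, j - k + 1)
                if k > best_k or (k == best_k and start < (best_i, best_j)):
                    best_i, best_j, best_k = start[0], start[1], k
        prev = cur
    return best_i, best_j, best_k


# Small check using the example from the difflib documentation.
i, j, k = longest_common_substring(" abcd", "abcd abcd")
match_text = " abcd"[i:i + k]   # (i, j, k) == (0, 4, 5), match_text == " abcd"
```

Calling longest_common_substring(s1, s2) on the string pair above would give the indices of the longest shared block, and s1[i:i+k] is the matching text. On Python 2.7.1 / 3.2 and later, SequenceMatcher also accepts autojunk=False, which turns the popularity heuristic off; that option is not available in 2.5.2.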