GZ wrote: > I should distinguish between modifications and additions. In my above > example, one line is modified/replaced, one line is added and one line > is deleted. There are a total of 3 edits. I am looking for an > alternative python library other than difflib that minimizes this > number (edit distance).
Since you know the lingo, "edit distance", I expect you've already Googled up the general results. Depending on what kinds of edits we count, the problem of finding a minimal diff can be low-order poly- time, NP-hard, or incomputable. Gonzalo Navarro's 2001 survey, "A Guided Tour to Approximate String Matching", is now available on-line for free: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.96.7225&rep=rep1&type=pdf The paper notes, "issues have been put aside to keep a reasonable scope," but presents more than most of us ever wanted to know about the subject. I don't know of a module that does what GZ asks. There are scores of line-oriented diff implementations, and there are minimal-edit- distance diffs, but the combination is rare at best. Problem domains that call for a true minimal diff tend not to work in units of lines of text. -- --Bryan -- http://mail.python.org/mailman/listinfo/python-list