On 23 Jan, 06:59, "krishnakant Mane" <[EMAIL PROTECTED]> wrote:
> On 23/01/2008, Paul Hankin <[EMAIL PROTECTED]> wrote:
> > On Jan 22, 6:57 pm, "krishnakant Mane" <[EMAIL PROTECTED]> wrote:
> > > hello all,
> > > I have a bit of a confusing question.
> > > firstly I wanted a library which can do an svn like diff with two files.
> > > let's say I have file1 and file2 where file2 contains some thing which
> > > file1 does not have. now if I do readlines() on both the files, I
> > > have a list of all the lines.
> > > I now want to do a diff and find out which word is added or deleted or
> > > changed.
> > > and that too on which character, if not at least want to know the word
> > > that has the change.
> > > any ideas please?
> >
> > Have a look at difflib in the standard library.
>
> I am aware of the difflib library but still can't figure out.
> I know that differences in two lines can be got but how to get it between
> words?

The base functionality is in SequenceMatcher: this class takes a pair of
sequences of any type and tries to match them. Each sequence may be a list
of lines, a single line (seen as a sequence of characters), or a list of
words (perhaps obtained with line.split()).

Built on top of SequenceMatcher is the text Differ. It takes two sequences
of lines and does its work in two steps: first it tries to match blocks of
lines (using a SequenceMatcher), and then the unmatched blocks are analyzed
further to show intraline differences (with another SequenceMatcher,
treating each line as a sequence of characters).
See the example at http://docs.python.org/lib/differ-examples.html -
perhaps this is what you want.

Note that Differ has no concept of "word"; if you want to report only
whole-word differences, take a look at Differ's _fancy_replace method,
which is where the intraline comparison is done.

--
Gabriel Genellina
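
The word-level idea above (feeding SequenceMatcher lists of words instead of lines or characters) could be sketched roughly as follows. This is only a minimal illustration: word_diff is a made-up helper name, not part of difflib, and it assumes that splitting on whitespace with line.split() is an acceptable definition of "word" (punctuation stays attached to its word).

```python
import difflib

def word_diff(old_line, new_line):
    """Return (tag, old_words, new_words) for each changed run of words.

    Hypothetical helper for illustration; "words" are simply the result
    of str.split(), so punctuation is not separated from the words.
    """
    old_words = old_line.split()
    new_words = new_line.split()
    # Passing None as the first argument means no elements are treated as junk.
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # tag is one of 'equal', 'replace', 'delete', 'insert'
        if tag != "equal":
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes

if __name__ == "__main__":
    for tag, old, new in word_diff("the quick brown fox",
                                   "the quick red fox jumps"):
        print(tag, old, new)
```

To compare whole files this way, one option is to match the lines first and then call such a helper only on the pairs of lines reported as replaced, which mirrors the two-step approach Differ takes with characters.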