Alfredo Braunstein wrote:
Amazing review, thanks. On a quick read, I've found that the variant that
perfectly fits our editing/CT capabilities is 1-degree, unit-cost edit
distance. Unfortunately the algorithm cited for this purpose [Sel77; I
haven't looked at it yet, though] seems to have time and space complexity
of O(|T1||T2|), which doesn't seem very practical. (By the way, this seems
roughly similar to what you get with string edit distance in which insets
are symbols of an extended alphabet, with the cost of substitution
recursively defined as edit distance, using the plain Levenshtein
algorithm.)
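For concreteness, here is a rough sketch of that extended-alphabet idea in
Python. All names here (Inset, subst_cost, edit_distance) are made up for
illustration, not existing LyX code; the substitution cost simply recurses
into inset contents:

# Plain Levenshtein over sequences whose symbols are either characters
# or insets, with inset-vs-inset substitution cost defined recursively
# as the edit distance of their contents.

class Inset:
    """Stand-in for a LyX inset: just a sequence of symbols."""
    def __init__(self, contents):
        self.contents = contents

def subst_cost(x, y):
    """Unit substitution cost, except inset-vs-inset recurses."""
    if isinstance(x, Inset) and isinstance(y, Inset):
        return edit_distance(x.contents, y.contents)
    return 0 if x == y else 1

def edit_distance(a, b):
    """Plain O(|a||b|) Levenshtein using the cost function above."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + subst_cost(a[i - 1], b[j - 1]))
    return d[n][m]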
Only X percent of insets will have changed when you compare two
documents. In a long document, X will be small.
Therefore, you can probably optimise most of the problem away so that the
complexity will not hurt you. Calculate a SHA-1 hash of the contents of
each inset, and use that for the initial comparisons, rather than the
complete list of characters.
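Something along these lines (just a sketch; serialize() is an assumed
helper that turns an inset's contents into bytes, not an existing call):

import hashlib

def inset_fingerprint(inset):
    # SHA-1 of the inset's serialized contents: two insets can then be
    # compared by a 20-byte digest instead of their full character lists.
    return hashlib.sha1(inset.serialize()).digest()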
Then the complexity is the product of the two documents' inset counts
(roughly the number of insets squared), which shouldn't be too bad.
After this, you get a reduced list of insets that have changed, and then
you can rerun the algorithm on the text of those only.
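Roughly like this (again only a sketch: doc.insets is an assumed attribute,
and I'm assuming a simple one-to-one pairing of insets between the two
documents):

def diff_changed_insets(doc1, doc2):
    # First pass: pair up corresponding insets and keep only the pairs
    # whose fingerprints differ (cheap digest comparison only).
    changed = [(a, b) for a, b in zip(doc1.insets, doc2.insets)
               if inset_fingerprint(a) != inset_fingerprint(b)]
    # Second pass: rerun the expensive character-level algorithm on the
    # text of the changed insets only.
    return [(a, b, edit_distance(a.contents, b.contents))
            for a, b in changed]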
Regards,
Asger