code_berzerker wrote:
>> If document order doesn't matter, try sorting the elements of each level in
>> the two documents by some arbitrary deterministic key, such as (tag name,
>> text, attr count, whatever), and then compare them in order, instead of
>> trying to find matches in multiple passes. itertools.groupby() might be
>> your friend here.
>
> I think that sorting multiple times by each attribute will cost more
> than I've managed to do:
[...]
>     let1 = [x for x in et1.iter()]
>     let2 = [x for x in et2.iter()]
> [...]
>     while let1:
>         el = let1.pop(0)
>         foundEl = findMatchingElem(el, let2)
>         if foundEl is None:
>             return False
>         let2.remove(foundEl)
>     return True
>
> def findMatchingElem(el, eList):
>     for elem in eList:
>         if elemsEqual(el, elem):
>             return elem
>     return None
[...]
> Notice that if documents are in exact same order, each element is
> compared only once!

Not in your code.

Stefan
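
For reference, the sorting approach suggested in the quoted text could be
sketched roughly as below. This is only an illustration, not code from the
thread: canonical_key() and trees_equal_unordered() are hypothetical names,
and the sketch assumes that whitespace-only differences in text and tail
should be ignored and that sibling order never matters. Each level is sorted
once by a single tuple key, rather than once per attribute, and the two
trees are then compared in that order. The quoted matching loop, by
contrast, scans let2 linearly for every element of let1, so it can become
quadratic when the documents are ordered differently.

```python
import xml.etree.ElementTree as ET


def canonical_key(elem):
    """Hypothetical helper: build an order-independent key for an element.

    Children are reduced to their own keys and sorted, so two elements get
    the same key whenever they differ only in the order of their children.
    """
    children = sorted(canonical_key(child) for child in elem)
    return (
        elem.tag,
        (elem.text or "").strip(),           # assumption: ignore surrounding whitespace
        (elem.tail or "").strip(),
        tuple(sorted(elem.attrib.items())),  # attribute order never matters
        tuple(children),
    )


def trees_equal_unordered(root1, root2):
    """Compare two parsed trees while ignoring sibling order at every level."""
    return canonical_key(root1) == canonical_key(root2)


doc_a = ET.fromstring("<r><a x='1'>t</a><b/></r>")
doc_b = ET.fromstring("<r><b/><a x='1'>t</a></r>")
print(trees_equal_unordered(doc_a, doc_b))
```

Building one tuple key per element keeps the comparison to a single sort per
level. If duplicate siblings need to be counted or reported rather than just
compared, itertools.groupby() over the sorted keys would be one way to do it.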