R. David Murray <rdmur...@bitdance.com> added the comment:

I checked the speed of the proposed patch and found that it was definitely slower than the original code. So I took another look at the original and refactored it in a different way: instead of moving the sibling relinking into a second pass, I changed the code to relink siblings only when a node is removed. The new patch passes all tests and is faster than the old code. I tested the timing both against the same small nested document I used in testNormalize2 and by running normalize on a 37K HTML document (a copy of the xml.dom.minidom chapter from the Library Reference):

original code:
    testNormalize2: [2.5144219398498535, 2.5053589344024658, 2.5059471130371094]
    example.html:   [44.641155958175659, 44.575434923171997, 44.996657133102417]

original patch:
    testNormalize2: [2.7070891857147217, 2.7012341022491455, 2.7003159523010254]
    example.html:   [67.908604860305786, 68.088788986206055, 67.92288613319397]

My patch:
    testNormalize2: [2.4626028537750244, 2.4619381427764893, 2.4617609977722168]
    example.html:   [22.780415058135986, 22.780103921890259, 22.721666097640991]

IMO my refactoring is also easier to understand than either the old code or the proposed patch.

Patch, including new test, is attached, and also pushed to bzr+ssh://bazaar.launchpad.net/~rdmurray/python/issue2170.

----------
versions: +Python 2.7, Python 3.0, Python 3.1
Added file: http://bugs.python.org/file13349/issue2170.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2170>
_______________________________________
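
For readers without the patch at hand, the idea described in the comment (relinking siblings only at the moment a node is removed, rather than in a separate pass) might look roughly like the sketch below. This is not the attached patch and not minidom's actual code: the Node class, its is_text() and append() methods, and the _unlink() helper are hypothetical names invented for illustration.

```python
class Node:
    """Hypothetical minimal DOM node; NOT minidom's real class."""

    def __init__(self, data=None):
        self.data = data              # a str for text nodes, None for elements
        self.childNodes = []
        self.previousSibling = None
        self.nextSibling = None

    def is_text(self):
        return self.data is not None

    def append(self, child):
        # Link the new child to the current last child, then store it.
        if self.childNodes:
            last = self.childNodes[-1]
            last.nextSibling = child
            child.previousSibling = last
        self.childNodes.append(child)
        return child


def _unlink(child):
    """Splice one node out of its sibling chain (hypothetical helper)."""
    prev, nxt = child.previousSibling, child.nextSibling
    if prev is not None:
        prev.nextSibling = nxt
    if nxt is not None:
        nxt.previousSibling = prev
    child.previousSibling = child.nextSibling = None


def normalize(node):
    """Merge adjacent text nodes and drop empty ones in a single pass.

    Sibling pointers are touched only when a node is actually removed;
    nodes that stay keep the links they already had.
    """
    kept = []
    for child in node.childNodes:
        if child.is_text():
            if not child.data:
                _unlink(child)                # empty text node: remove it
                continue
            if kept and kept[-1].is_text():
                kept[-1].data += child.data   # merge into the previous text
                _unlink(child)
                continue
        else:
            normalize(child)                  # recurse into element children
        kept.append(child)
    node.childNodes[:] = kept
```

Because each removal repairs its neighbours' links immediately, no second pass over the surviving children is needed. That is the property the comment credits for the speedup, although this sketch makes no claim about reproducing the timings above.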