Hi all,

I'd like to ask about the most reasonable or recommended way to modify the functionality of a standard library module (if that is recommended at all).

I'm using difflib.SequenceMatcher for character-wise comparison of texts. This may not be a usual use case, but the results are fine for the given task. However, there were some corner cases where the reported differences were clearly larger than needed. As it turned out, this is due to a kind of special-casing of relatively frequent items; cf.
http://bugs.python.org/issue1528074#msg29269
http://bugs.python.org/issue2986

The solution (or workaround) for me was to modify the SequenceMatcher class by adding another parameter, checkpopular=True, which influences the behaviour of the __chain_b function accordingly. The possible speed penalty with this optimisation turned off (checkpopular=False) doesn't really matter for now, and the comparison results are much better for my use cases.
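
For illustration, a minimal sketch of the effect described above. It does not use the checkpopular modification; instead it assumes Python 3.2 or newer, where SequenceMatcher accepts an autojunk argument that switches the same heuristic off. The test strings are made up for the demonstration.

```python
import difflib

# Two strings that differ only in their first character.
# Each is longer than 200 items, and "a" and "b" each occur far more
# often than 1% of the length, so the heuristic treats them as popular.
a = "x" + "ab" * 150
b = "y" + "ab" * 150

# Default behaviour: popular items are dropped from the index built for b.
ratio_default = difflib.SequenceMatcher(None, a, b).ratio()

# autojunk=False (Python 3.2+) disables the popularity heuristic.
ratio_plain = difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()

print(ratio_default, ratio_plain)
```

Although the two strings share 300 of their 301 characters, the default ratio should come out far lower than the one computed with the heuristic disabled.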
However, I'd like to ask how best to maintain this modified functionality in the source code. I tried some possibilities, which seem to work, but I'd appreciate suggestions on the preferred way in such cases.

- It is simply possible to keep a modified source file difflib.py in the script directory.
- Furthermore, one can subclass difflib.SequenceMatcher and override its __chain_b function (however, the name doesn't look like a "public" function ...).
- I guess it wouldn't be recommended to directly replace difflib.SequenceMatcher._SequenceMatcher__chain_b ...

In all cases I have either a copy of the whole file or of the respective function as part of my source.

I'd appreciate comments or suggestions on this, or perhaps other, better approaches to this problem.

Thanks in advance,
vbr
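
P.S. The subclassing option from the list above could look roughly like the sketch below. The class name PlainSequenceMatcher is hypothetical, and the body is a simplified stand-in for the original function rather than a copy of it. It assumes the CPython 3 internals (the b2j, bjunk and bpopular attributes), which are not a public interface and may change between versions.

```python
import difflib


class PlainSequenceMatcher(difflib.SequenceMatcher):
    """Hypothetical subclass that skips the "popular items" heuristic."""

    # SequenceMatcher calls self.__chain_b() internally, which Python
    # mangles to _SequenceMatcher__chain_b. Defining that exact name
    # here overrides it. This relies on private implementation details.
    def _SequenceMatcher__chain_b(self):
        # Map each item of b to the list of positions where it occurs.
        b2j = {}
        for i, elt in enumerate(self.b):
            b2j.setdefault(elt, []).append(i)

        # Still honour an explicit isjunk callable, if one was given.
        junk = set()
        if self.isjunk:
            junk = {elt for elt in b2j if self.isjunk(elt)}
            for elt in junk:
                del b2j[elt]

        self.b2j = b2j
        self.bjunk = junk
        self.bpopular = set()  # nothing is ever treated as popular


# Made-up test strings that differ only in their first character.
a = "x" + "ab" * 150
b = "y" + "ab" * 150
ratio_default = difflib.SequenceMatcher(None, a, b).ratio()
ratio_plain = PlainSequenceMatcher(None, a, b).ratio()
print(ratio_default, ratio_plain)
```

Only the one function lives in my own source this way, at the price of depending on a name-mangled private method.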