Mark Summerfield <[EMAIL PROTECTED]> wrote:
> As you've no doubt realised, this particular implementation gives best
> performance when the pattern of use is: lots of edits, lots of
> lookups, ..., and gives worst performance when the pattern of use is:
> edit, lookup, edit, lookup (in which case using a dict and sorted() is
> probably better).
>
> So there is lots of scope for someone to do a version that has good
> performance for all patterns of use:-)
I think that's the point, though: you can't write one implementation that has
good performance for all patterns of use. You either have to compromise on
performance somewhere, or use an implementation specifically matched to your
use case. For example, if the use case were some updates followed by iterating
over only the first few keys, it might be faster to use a heap for the
iteration.

I suspect that for many use patterns you could improve performance
significantly by having two levels of invalidation for the key list: in
__setitem__, if the key already exists you don't need to discard the list in
__keycache (although if a key or cmp function was provided it may no longer be
sorted). In that case it might be worthwhile keeping __keycache but flagging
it as needing to be sorted again the next time it is used. Given that Python's
sort is very fast on sorted or nearly sorted lists, this could provide a
worthwhile speedup for cases where the values change but the keys don't change
often.
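Here is a minimal sketch of that two-level invalidation idea. It assumes a
SortedDict-like class with a private __keycache list as in the implementation
being discussed; the class name, the optional key function, the __needs_sort
flag and the sortedkeys() method are just illustrative, not Mark's actual API:

class SortedDict(dict):
    """Sketch only: dict with a cached sorted key list and two levels
    of cache invalidation (discard vs. re-sort)."""

    def __init__(self, data=None, key=None):
        super().__init__(data or {})
        self.__key = key            # optional sort key function for the keys
        self.__keycache = None      # None => cache must be rebuilt from scratch
        self.__needs_sort = False   # True => cache is complete but may be unsorted

    def __setitem__(self, k, value):
        if k in self:
            # Existing key: the cache still contains every key, so keep it.
            # It only needs re-sorting if the ordering could have changed;
            # this sketch conservatively assumes that whenever a key
            # function was supplied.
            if self.__key is not None:
                self.__needs_sort = True
        else:
            # New key: the cache is now incomplete, so throw it away.
            self.__keycache = None
        super().__setitem__(k, value)

    def __delitem__(self, k):
        super().__delitem__(k)
        self.__keycache = None

    def sortedkeys(self):
        if self.__keycache is None:
            # Full rebuild: O(n log n).
            self.__keycache = sorted(self, key=self.__key)
        elif self.__needs_sort:
            # Timsort handles already- or nearly-sorted input in close to
            # linear time, so re-sorting the cache is much cheaper than
            # rebuilding it.
            self.__keycache.sort(key=self.__key)
        self.__needs_sort = False
        return self.__keycache

With this arrangement, a pattern like "update existing values many times, then
iterate" never throws the key list away, so only the cheap re-sort (or nothing
at all, if no key function was given) is paid on the next lookup.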