On Sep 26, 9:31 am, Duncan Booth <[EMAIL PROTECTED]> wrote: > Mark Summerfield <[EMAIL PROTECTED]> wrote: > > As you've no doubt realised, this particular implementation gives best > > performance when the pattern of use is: lots of edits, lots of > > lookups, ..., and gives worst performance when the pattern of use is: > > edit, lookup, edit, lookup (in which case using a dict and sorted() is > > probably better). > > > So there is lots of scope for someone to do a version that has good > > performance for all patterns of use:-) > > I that's the point though: you can't write one implementation that has good > performance for all patterns of use: you have to either compromise on > performance somewhere, or use an implementation specifically matched to > your use case. > > For example, if the use case was some updates followed by iterating over > the first few keys only, it may be faster to use a heap for iteration. > > I suspect that for many use patterns you could improve performance > significantly by having two levels of invalidation for the the key list: in > __setitem__, if the key already exists you don't need to discard the list > in __keycache (although if a key or cmp function was provided it may no > longer be sorted). In that case it might be worthwhile keeping __keycache > but flagging it as needing to be sorted again next time it is used. Given > that Python's sort is very fast on sorted or nearly sorted lists this could > provide a worthwhile speedup for cases where the values change but the keys > don't change often.
More flexibly, keep a set of inserted keys that haven't yet been included in the sorted list, and a set of deleted keys that haven't yet been removed from the sorted list. The cache is invalid if either of these sets are empty - and to make it valid you can choose what to do based on the sizes of the two sets (and the sorted list). For instance, if there's just been one insertion you're probably better doing an insertion rather than a full resort. Of course, there's a few nasty cases here but it's always possible to just throw away the sorted list and reconstruct it from the dict keys when things get too hairy (eg, the user deletes a key that's in the inserted-but-not-yet- sorted set). -- Paul Hankin -- http://mail.python.org/mailman/listinfo/python-list