Hi, I really do not like it. So -1 for me. Your two methods are very specialized whereas the dict type is very generic. Usually, when I see something like this in code, I can smell it's a patch to overcome some shortcomings on a previous design, thereby making the economy of re-designing. Simply said, that's bad programming.
After that patch to provide a solution for only two of the more common use-cases, you are nonetheless stucked with the old solution for all the other use-cases (what if the value type is another dictionary or some user-made class ?). Here's an alternate solution which I think answers all of the problems you mentionned while being generic. === BEGIN SNAP def update_or_another_great_name(self, key, createFunc, updtFunc): try: self[key] <<<= updtFunc(self[key]) ## This is "slow" with Python "=" since the key has to be searched ## twice But the new built-in method just has to update the value the ## first time the key is found. Therefore speed should be ok. return True except KeyError: self[key] = createFunc() return false ## Now your two specialized methods can be easily written as : ## A built-in should be provided for this (if not already proposed) : def identical(val): return val def count(self, key, qty=1): self.update_or_another_great_name(key, identical, partial(operator.add, qty)) ## partial is coming from : http://www.python.org/peps/pep-0309.html ## Using only built-in function (assuming "identical") as arguments makes it ## ok for speed (I guess). def appendlist(self, key, *values): self.update_or_another_great_name(key, partial(list, values), partial(ListType.extend, X = values)) ## The first "partial" usage here is an abuse just to make sure that the ## list is not actually constructed before needed. It should work. ## The second usage is more uncertain as we need to bind the arguments from ## the right. Therefore I have to use the name of the parameter and I am not ## sure if there's one. As this list is very prolific, someone might have an ## idea on how to improve this. === END SNAP By using only built-in constructs, this should be fast enough. Otherwise, optimizing these built-ins is a much more clean and sane way of thinking then messing the API with ad-hoc propositions. Reviewing the problems you mention : > The readability issues with the existing constructs are: > > * They are awkward to teach, create, read, and review. The method update_or_another_great_name is easy to understand, I think. But it might not always be easy to use it efficiently with built-ins. But this is always the case. "Recipees" can be added to show how to efficiently use the method. > * Their wording tends to hide the real meaning (accumulation). Solved. > * The meaning of setdefault() 's method name is not self-evident. Solved. > > The performance issues with the existing constructs are: > > * They translate into many opcodes which slows them considerably. I really don't know what will be the outcome of the solution I propose. I certainly do not know anything about how my Python code translates into opcodes. > * The get() idiom requires two dictionary lookups of the same key. Solved > * The setdefault() idiom instantiates a new, empty list prior to every Solved > call. * That new list is often not needed and is immediately discarded. Solved > * The setdefault() idiom requires an attribute lookup for extend/append. Solved > * The setdefault() idiom makes two function calls. Solved And perhaps, what you say here is also true for your two special use-cases : > For other > uses, plain Python code suffices in terms of speed, clarity, and avoiding > unnecessary instantiation of empty containers: > > if key not in d: > d.key = {subkey:value} > else: > d[key][subkey] = value Much better than adding special cases on a generic class. Special cases always demultiply and if we open the door .... Regards, Francis Girard Le samedi 19 Mars 2005 02:24, Raymond Hettinger a ÃcritÂ: > I would like to get everyone's thoughts on two new dictionary methods: > > def count(self, value, qty=1): > try: > self[key] += qty > except KeyError: > self[key] = qty > > def appendlist(self, key, *values): > try: > self[key].extend(values) > except KeyError: > self[key] = list(values) > > The rationale is to replace the awkward and slow existing idioms for > dictionary based accumulation: > > d[key] = d.get(key, 0) + qty > d.setdefault(key, []).extend(values) > > In simplest form, those two statements would now be coded more readably as: > > d.count(key) > d.appendlist(key, value) > > In their multi-value forms, they would now be coded as: > > d.count(key, qty) > d.appendlist(key, *values) > > The error messages returned by the new methods are the same as those > returned by the existing idioms. > > The get() method would continue to exist because it is useful for > applications other than accumulation. > > The setdefault() method would continue to exist but would likely not make > it into Py3.0. > > > PROBLEMS BEING SOLVED > --------------------- > > The readability issues with the existing constructs are: > > * They are awkward to teach, create, read, and review. > * Their wording tends to hide the real meaning (accumulation). > * The meaning of setdefault() 's method name is not self-evident. > > The performance issues with the existing constructs are: > > * They translate into many opcodes which slows them considerably. > * The get() idiom requires two dictionary lookups of the same key. > * The setdefault() idiom instantiates a new, empty list prior to every > call. * That new list is often not needed and is immediately discarded. > * The setdefault() idiom requires an attribute lookup for extend/append. > * The setdefault() idiom makes two function calls. > > The latter issues are evident from a disassembly: > >>> dis(compile('d[key] = d.get(key, 0) + qty', '', 'exec')) > > 1 0 LOAD_NAME 0 (d) > 3 LOAD_ATTR 1 (get) > 6 LOAD_NAME 2 (key) > 9 LOAD_CONST 0 (0) > 12 CALL_FUNCTION 2 > 15 LOAD_NAME 3 (qty) > 18 BINARY_ADD > 19 LOAD_NAME 0 (d) > 22 LOAD_NAME 2 (key) > 25 STORE_SUBSCR > 26 LOAD_CONST 1 (None) > 29 RETURN_VALUE > > >>> dis(compile('d.setdefault(key, []).extend(values)', '', 'exec')) > > 1 0 LOAD_NAME 0 (d) > 3 LOAD_ATTR 1 (setdefault) > 6 LOAD_NAME 2 (key) > 9 BUILD_LIST 0 > 12 CALL_FUNCTION 2 > 15 LOAD_ATTR 3 (extend) > 18 LOAD_NAME 4 (values) > 21 CALL_FUNCTION 1 > 24 POP_TOP > 25 LOAD_CONST 0 (None) > 28 RETURN_VALUE > > In contrast, the proposed methods use only a single attribute lookup and > function call, they use only one dictionary lookup, they use very few > opcodes, and they directly access the accumulation functions, > PyNumber_Add() or PyList_Append(). IOW, the performance improvement > matches the readability improvement. > > > ISSUES > ------ > > The proposed names could possibly be improved (perhaps tally() is more > active and clear than count()). > > The appendlist() method is not as versatile as setdefault() which can be > used with other object types (perhaps for creating dictionaries of > dictionaries). However, most uses I've seen are with lists. For other > uses, plain Python code suffices in terms of speed, clarity, and avoiding > unnecessary instantiation of empty containers: > > if key not in d: > d.key = {subkey:value} > else: > d[key][subkey] = value > > > > Raymond Hettinger -- http://mail.python.org/mailman/listinfo/python-list