On 25 February 2014 22:36, Tim Chase <python.l...@tim.thechases.com> wrote:
> On 2014-02-25 22:21, Duncan Booth wrote:
>> > It would save some space if I didn't have to duplicate all the
>> > keys into sets (on the order of 10-100k small strings), instead
>> > being able to directly perform the set-ops on the dicts. But
>> > otherwise, it was pretty readable & straight-forward.
>> >
>> It doesn't matter whether they were small strings or full-length
>> novels, creating a set from a dict doesn't duplicate any strings.
>
> pre-my-new-learning-about .viewkeys() it sounds like set(my_dict)
> would have the overhead of storing an additional reference to a
> string per set-entry (rather than duplicating every string).
> With .viewkeys()/.keys(), it sounds like that overhead would go away.
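A quick way to check the point above for yourself (a minimal sketch; the dict `d` and its keys are made up for illustration): in CPython, building a set from a dict inserts references to the very same key objects, so no string data is copied.

```python
# Build a set from a dict and verify that the set entry and the
# dict key are the same object (same identity, not just equal).
d = {"alpha": 1, "beta": 2}
s = set(d)

key_from_set = next(k for k in s if k == "alpha")
key_from_dict = next(k for k in d if k == "alpha")

# Same object: the set holds a reference, not a copy of the string.
print(key_from_set is key_from_dict)  # → True
```

The per-entry cost of the set is therefore just its own hash-table slot, not a second copy of each string.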
You lose the redundant hash table by using the keys view. A set's hash table takes 20-40 bytes per entry on this computer (according to sys.getsizeof), and on the same system a short string takes a similar amount of memory. The .keys() view object apparently takes 24 bytes in total, so yes, it's a reasonable memory saving if the dicts are of any significant size.

However, the .keys() views are not necessarily as well optimised, so it may be faster to convert to sets for some operations. One example I found some time ago: for sets, A & B will iterate over the smaller set, but for dicts, A.keys() & B.keys() always iterates over the left-hand operand (or right-hand - I don't remember which). That makes a considerable difference if one set is significantly larger than the other.

Oscar
--
https://mail.python.org/mailman/listinfo/python-list
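To make the two points above concrete, here's a small sketch (the dict sizes and key contents are made up; exact byte counts from sys.getsizeof vary by CPython version and platform): the keys view is a constant-size wrapper over the dict's own hash table, and it supports set operations directly.

```python
import sys

# Two overlapping dicts: keys "5000".."9999" are common to both.
d1 = {str(i): i for i in range(10_000)}
d2 = {str(i): i for i in range(5_000, 15_000)}

# The view is a small fixed-size object; set(d1) builds a whole
# second hash table that grows with the number of entries.
print(sys.getsizeof(d1.keys()))  # small and constant
print(sys.getsizeof(set(d1)))    # much larger

# Set operations work on the views without materialising sets first;
# the result of each operation is an ordinary set.
common = d1.keys() & d2.keys()
only_d1 = d1.keys() - d2.keys()
print(len(common))   # → 5000
print(len(only_d1))  # → 5000
```

Whether the view-based intersection is actually faster than `set(d1) & set(d2)` depends on the relative sizes, for the iteration-order reason described above, so it's worth timing both on your own data.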