Raymond Hettinger wrote:
> Stefan Behnel wrote:
I stumbled over the fact that 'frozenset()' doesn't return a constant but
creates a new object every time. Since it is immutable, I wrote to c.l.py that this behaviour is different from what tuple() & Co do.
It is not quite correct to say that this is what all immutables do:
    >>> x = 500
    >>> y = 600 - 100
    >>> x is y
    False
I know. The same is true for concatenated strings, etc. But whenever an immutable object is created directly ('by hand'), it holds. It also holds, btw, for tuple() - as opposed to ().
For tuples, it is an optimization of a frequent use case (internally, empty tuples are passed around for empty argument lists).
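A quick way to check this on a given interpreter is sketched below. The helper name same_object is made up for illustration, and the results for tuple and frozenset are implementation details that may differ between interpreters and versions, so no particular output is promised here.

```python
def same_object(factory):
    """Return True if two successive calls to factory() yield the same object."""
    # Both results are alive during the comparison, so 'is' is meaningful here.
    return factory() is factory()


# Print what this interpreter does for a few immutable types.
for name, factory in [("tuple", tuple), ("frozenset", frozenset), ("str", str)]:
    print(name, same_object(factory))
```

Either way, equality is unaffected: tuple() == () and frozenset() == frozenset() always hold.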
I definitely see the use. When I tried to figure out how to implement the patch, I looked at the source of tuple objects and saw that there is quite a bunch of cached constants. I'm pretty sure such optimizations are not necessary for sets.
Do you have real use cases for frequent creation of frozenset([])? I would be interested in seeing the application. As designed, the principal use case for frozensets was in implementing sets of sets. The secondary case was for using sets as dictionary keys. Neither of those use cases entails storing more than one instance of frozenset([]).
I actually use frozensets whenever I know that my set is going to be immutable (I thought that was what they were meant for).
And similar to the usage of tuples as replacements for empty lists, I definitely pass a frozenset whenever I need a dummy set-like object.
If I know that frozenset() is not constant, I may end up keeping a dummy reference around somewhere that is passed instead of a 'new' frozenset(). But that is definitely uglier than making frozenset() constant internally.
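For illustration, the dummy-reference workaround could look roughly like this. EMPTY_SET and tags_for are hypothetical names invented for this sketch, not anything from the patch or from real code.

```python
# Hypothetical module-level dummy, created once and shared by all callers.
EMPTY_SET = frozenset()


def tags_for(item, index=None):
    """Return the tags recorded for item, or the shared empty set."""
    if index is None:
        return EMPTY_SET
    # Fall back to the shared instance instead of building a new frozenset().
    return index.get(item, EMPTY_SET)
```

It works, but every module that wants this has to carry its own constant around.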
I actually see the difference between
constant frozenset()
and
constant frozenset([])
and the like. Just imagine things like
frozenset(i for i in [False]*1000 if i)
I wouldn't want a guarantee that that one returns a singleton. I guess that check would really make it inefficient: create the object, start adding nothing, finish adding nothing, check if anything was added, discard object, return singleton. Brrrrr...
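In pure Python, the distinction I have in mind could be sketched like this. make_frozenset and _EMPTY are hypothetical names, and this is only an approximation of the idea, not the C patch itself: only the call without arguments is short-circuited, everything else is built as usual.

```python
_EMPTY = frozenset()  # hypothetical shared instance


def make_frozenset(*args):
    """Hypothetical factory: only the argument-less call returns the shared object."""
    if not args:
        return _EMPTY
    # Any iterable, even one that turns out to be empty, is built normally,
    # so there is no "was anything added?" check after construction.
    return frozenset(*args)
```

With this, make_frozenset() always returns the same object, while make_frozenset(i for i in [False]*1000 if i) is merely equal to it, with no identity guarantee.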
But frozenset() should still be the most common case of creating empty sets.
I think the main use case is passing them instead of sets whenever a method interface demands a set-like object for read-only access or wants to return an empty set. Tuples cannot always replace this.
I started using sets rather frequently in my source and frozenset() tends to turn up relatively often.
I'll ponder your idea for a bit. As it stands, the patch needs more work (to remove the singleton just before Python exits -- see similar operations for freelists).
I kinda figured that. Would have been too easy, then... :)
I don't mind finishing the patch but am not sure it is the right thing to do. It is essentially an optimization of one case at the expense of added code and of slightly de-optimizing other cases.
Very slightly, I'd say. The only case that eats a couple of instructions at creation time is 'new' being called from a subclass. I wouldn't expect many people to subclass frozenset - but then, that's just me...
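As a rough Python analogue of that type check (the real change would live in the C implementation; CachedFrozenSet and Sub are hypothetical names used only for this sketch):

```python
class CachedFrozenSet(frozenset):
    """Hypothetical stand-in for a frozenset whose empty instance is cached."""

    _empty = None

    def __new__(cls, *args):
        # Only the exact type with no arguments gets the shared instance.
        if cls is CachedFrozenSet and not args:
            if CachedFrozenSet._empty is None:
                CachedFrozenSet._empty = super().__new__(cls)
            return CachedFrozenSet._empty
        # Subclasses and non-empty calls pay for one extra comparison only.
        return super().__new__(cls, *args)


class Sub(CachedFrozenSet):
    """A subclass: always gets a fresh object."""
```

Here CachedFrozenSet() is CachedFrozenSet() holds, while each Sub() call creates a new object after failing the 'cls is' test.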
Also, I don't know what else needs to be changed and what a difference that makes. But if the added code is reasonably limited, I'd vote for it.
Stefan