Gary Herron wrote:
> Emanuele D'Arrigo wrote:
>> Hi everybody,
>>
>> while testing a module today I stumbled on something that I can work
>> around but I don't quite understand.
>>
>
> *Do NOT use "is" to compare immutable types.* **Ever!**

Huh? How am I supposed to compare immutable types for identity then? Your
bizarre instruction would prohibit:

    if something is None

which is the recommended way to compare to None, which is immutable. The
standard library has *many* identity tests against None.

I would say, *always* use "is" to compare any type whenever you intend to
compare by *identity* instead of equality. That's what it's for. If you use
it to test for equality, you're doing it wrong. But in the very rare cases
where you care about identity (and you almost never do), "is" is the
correct tool to use.

> It is an implementation choice (usually driven by efficiency
> considerations) to choose when two strings with the same value are stored
> in memory once or twice. In order for Python to recognize when a newly
> created string has the same value as an already existing string, and so
> use the already existing value, it would need to search *every* existing
> string whenever a new string is created.

Not at all. It's quite easy, and efficient. Here's a pure Python string
constructor that caches strings.

    class CachedString(str):
        _cache = {}
        def __new__(cls, value):
            # Return the cached string with this value, storing it on
            # first sight. This is a hash lookup, not a scan of every
            # existing string.
            s = cls._cache.setdefault(value, value)
            return s

Python even includes a built-in function to do this: intern(), although in
Python 3.0 it has been moved to sys.intern().

> Clearly that's not going to be efficient.

Only if you do it the inefficient way.

> However, the C implementation of Python does a limited version
> of such a thing -- at least with strings of length 1.

No, that's not right. The identity test fails for some strings of length
one.

>>> a = '\n'
>>> b = '\n'
>>> len(a) == len(b) == 1
True
>>> a is b
False

Clearly, Python doesn't intern all strings of length one. What Python
actually interns are strings that look like, or could be, identifiers:

>>> a = 'heresareallylongstringthatisjustmade' \
... 'upofalphanumericcharacterssuitableforidentifiers123_'
>>>
>>> b = 'heresareallylongstringthatisjustmade' \
... 'upofalphanumericcharacterssuitableforidentifiers123_'
>>> a is b
True

It also does a similar thing for small integers, currently -5 through to
256 in CPython, although this is an implementation detail subject to
change.

-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list
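
The identity-versus-equality distinction discussed above can be sketched as follows. This is only an illustration, assuming Python 3, where intern() lives in the sys module; the names parts, a, b, ia, ib and something are made up for the example. The strings are built at run time so that the compiler cannot merge them into one constant.

```python
import sys

# Build two equal strings at run time (hypothetical example values).
parts = ["spam", "and", "eggs"]
a = " ".join(parts)
b = " ".join(parts)

# Equality compares values and is always the right test for "same text".
print(a == b)

# Identity asks whether both names refer to one object. For strings the
# answer is an implementation detail, so no particular result is assumed.
print(a is b)

# sys.intern() returns one canonical object per distinct string value,
# so the interned results are guaranteed to be the same object.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)

# Identity is the appropriate test for singletons such as None.
something = None
print(something is None)
```

Only the equality test, the interned comparison and the None check have guaranteed results; the plain "a is b" line is there to show the kind of test that should not be relied on.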