On Wed, 12 Oct 2005, Jorgen Grahn wrote: > On Mon, 10 Oct 2005 20:37:03 +0100, Tom Anderson <[EMAIL PROTECTED]> wrote: >> On Mon, 10 Oct 2005, it was written: > ... >>> There is no way you can avoid making garbage. Python conses everything, >>> even integers (small positive ones are cached). >> >> So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar? > > If the SmallInteger hack is something like this, it does: > >>>> a = 42 >>>> b = 42 >>>> a is b > True >>>> a = 42000 >>>> b = 42000 >>>> a is b > False >>>> > > ... which I guess is what if referred to above as "small positive > ones are cached".
That's not what i meant. In both smalltalk and python, every single variable contains a reference to an object - there isn't the object/primitive distinction you find in less advanced languages like java. Except that in smalltalk, this isn't true: in ST, every variable *appears* to contain a reference to an object, but implementations may not actually work like that. In particular, SmallTalk 80 (and some earlier smalltalks, and all subsequent smalltalks, i think) handles small integers (those that fit in wordsize-1 bits) differently: all variables contain a word, whose bottom bit is a tag bit; if it's one, the word is a genuine reference, and if it's zero, the top bits of the word contain a signed integer. The innards of the VM know about this (where it matters), and do the right thing. All this means that small (well, smallish - up to a billion!) integers can be handled with zero heap space and much reduced instruction counts. Of course, it means that references are more expensive, since they have to be checked for integerness before dereferencing, but since this is a few instructions at most, and since small integers account for a huge fraction of the variables in most programs (as loop counters, array indices, truth values, etc), this is a net win. See the section 'Representation of Small Integers' in: http://users.ipa.net/~dwighth/smalltalk/bluebook/bluebook_chapter26.html#TheObjectMemory26 The precise implementation is sneaky - the tag bit for an integer is zero, so in many cases you can do arithmetic directly on the word, with a few judicious shifts here and there; the tag bit for a pointer is one, and the pointer is stored in two's-complement form *with the bottom bit in the same place as the tag bit*, so you can recover a full-length pointer from the word by complementing the whole thing, rather than having to shift. Since pointers are word-aligned, the bottom bit is always a zero, so in the complement it's always a one, so it can also be the status bit! I think this came from LISP initially (most things do) and was probably invented by Guy Steele (most things were). tom -- That's no moon! -- http://mail.python.org/mailman/listinfo/python-list