Steven D'Aprano wrote: > Python lists are over-allocated: whenever they need to be resized, they > are made a little bit larger than necessary so that appends will be fast. > > See: > > http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c > > I'm interested in gathering some statistics on this, and to do so I need > a way of measuring the list's logical size versus its actual size. The > first is easy: it's just len(list). Is there some way of getting the > allocated size of the list?
With some brute force guesswork... >>> a = [None] * 42 >>> for i in range(10): ... print i, ctypes.c_ulong.from_address(id(a)+i*8) ... 0 c_ulong(1L) 1 c_ulong(8488896L) 2 c_ulong(42L) 3 c_ulong(37432768L) 4 c_ulong(42L) 5 c_ulong(1L) 6 c_ulong(8510496L) 7 c_ulong(32L) 8 c_ulong(18446744073709551615L) 9 c_ulong(8935143189711421440L) >>> a = [] >>> for i in range(42): a.append(None) ... >>> for i in range(10): ... print i, ctypes.c_ulong.from_address(id(a)+i*8) ... 0 c_ulong(1L) 1 c_ulong(8488896L) 2 c_ulong(42L) 3 c_ulong(40136160L) 4 c_ulong(46L) 5 c_ulong(0L) 6 c_ulong(0L) 7 c_ulong(0L) 8 c_ulong(0L) 9 c_ulong(0L) ... you can see that the size sits at offset 4*sizeof(long) on my 64-bit Python. With that information we can take a look at a list's overallocation strategy: >>> a = [] >>> for i in range(20): ... print i, ctypes.c_ulong.from_address(id(a)+4*8) ... a.append(None) ... 0 c_ulong(0L) 1 c_ulong(4L) 2 c_ulong(4L) 3 c_ulong(4L) 4 c_ulong(4L) 5 c_ulong(8L) 6 c_ulong(8L) 7 c_ulong(8L) 8 c_ulong(8L) 9 c_ulong(16L) 10 c_ulong(16L) 11 c_ulong(16L) 12 c_ulong(16L) 13 c_ulong(16L) 14 c_ulong(16L) 15 c_ulong(16L) 16 c_ulong(16L) 17 c_ulong(25L) 18 c_ulong(25L) 19 c_ulong(25L) Peter -- http://mail.python.org/mailman/listinfo/python-list