Re: RE Module Performance

Chris Angelico Thu, 25 Jul 2013 12:23:27 -0700

On Fri, Jul 26, 2013 at 5:07 AM,  <[email protected]> wrote:
> Let's start with a simple string: \textemdash or \textendash
>
> >>> sys.getsizeof('–')
> 40
> >>> sys.getsizeof('a')
> 26


Most of the cost is in those two apostrophes. Look:

>>> sys.getsizeof('a')
26
>>> sys.getsizeof(a)
8

Okay, that's slightly unfair (bonus points: figure out what I did to
make this work; there are at least two right answers) but still, look
at what an empty string costs:

>>> sys.getsizeof('')
25

Or look at the difference between one of these characters and two:

>>> sys.getsizeof('aa')-sys.getsizeof('a')
1
>>> sys.getsizeof('––')-sys.getsizeof('–')
2

That's what the characters really cost. The overhead is fixed, so for
a string of any real length it is almost completely insignificant. The
storage requirement for a BMP-only string with at least one character
above U+00FF converges to two bytes per character as the string grows.
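
If you'd rather measure than take my word for it, here is a rough sketch.
It assumes CPython 3.3 or later (the flexible string representation);
per_char_cost is just a name I made up for illustration, and the absolute
sizes will differ between builds, so only the differences mean anything.
On such a build I'd expect roughly 1, 2 and 4 bytes per character.

```python
import sys

def per_char_cost(ch, n=1000):
    """Bytes added per extra character, measured between lengths n and 2n.

    Subtracting two sizes cancels the fixed per-object overhead, the same
    trick as getsizeof('aa') - getsizeof('a') but averaged over n chars.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

samples = [
    ("ASCII", "a"),
    ("BMP, above U+00FF", "\u2013"),      # EN DASH
    ("non-BMP", "\U0001F600"),            # an astral-plane character
]

for label, ch in samples:
    print(label, per_char_cost(ch))

# Total bytes per character, overhead included, as the string grows.
for n in (1, 10, 100, 1000, 100000):
    print(n, sys.getsizeof("\u2013" * n) / n)
```

The second loop is the "converges" part: the fixed overhead gets divided
by an ever larger length, so the ratio drifts down towards the
per-character figure.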

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list
