On 2017-11-01 19:26, Ned Batchelder wrote:
From David Beazley (https://twitter.com/dabeaz/status/925787482515533830):
>>> a = 'n'
>>> b = 'ñ'
>>> sys.getsizeof(a)
50
>>> sys.getsizeof(b)
74
>>> float(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: 'ñ'
>>> sys.getsizeof(b)
77
Huh?
It's all explained in PEP 393.
The float() call ends up creating an additional representation of the
value (UTF-8 plus a zero-byte terminator) and caching it on the string
object, so there are then the bytes for 'ñ' itself plus the 3 bytes of
the cached UTF-8 (0xC3 0xB1 0x00). That accounts for the jump from 74
to 77.
When the string is ASCII, the bytes of the UTF-8 representation are
identical to those of the original string, so it can share them.
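
If you want to check this on your own interpreter, here's a rough
sketch. The helper names (utf8_cache_growth, expected_cache_bytes) are
made up for this post, and whether float() actually triggers the cache
is a CPython implementation detail that may differ between versions, so
the growth can also come out as 0.

```python
import sys


def expected_cache_bytes(s):
    # Size of the cached UTF-8 copy: the encoded bytes plus one
    # zero-byte terminator.
    return len(s.encode("utf-8")) + 1


def utf8_cache_growth(s, trigger):
    # Measure how much sys.getsizeof(s) changes after calling trigger(s).
    # Assumption: trigger may raise ValueError (as float('ñ') does).
    before = sys.getsizeof(s)
    try:
        trigger(s)
    except ValueError:
        pass
    return sys.getsizeof(s) - before


a = "n"
b = "ñ"

# ASCII string: the UTF-8 form is shared, so no growth is expected.
growth_a = utf8_cache_growth(a, float)

# Non-ASCII string: growth is either 0 (no cache was created on this
# interpreter) or the size of the cached UTF-8 copy.
growth_b = utf8_cache_growth(b, float)

print("ascii growth:", growth_a)
print("non-ascii growth:", growth_b,
      "(cached copy would be", expected_cache_bytes(b), "bytes)")
```

On the interpreter from the original session the non-ASCII growth would
be 3, matching 74 -> 77.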