On 2017-11-01 19:26, Ned Batchelder wrote:
  From David Beazley (https://twitter.com/dabeaz/status/925787482515533830):

      >>> a = 'n'
      >>> b = 'ñ'
      >>> sys.getsizeof(a)
      50
      >>> sys.getsizeof(b)
      74
      >>> float(b)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      ValueError: could not convert string to float: 'ñ'
      >>> sys.getsizeof(b)
      77

Huh?

It's all explained in PEP 393.

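CPython is creating an additional representation of the value (UTF-8 plus a zero-byte terminator) and caching it on the string object, so the object then holds both the bytes for 'ñ' and the bytes for its UTF-8 form (0xC3 0xB1 0x00). Those 3 extra bytes account for the size going from 74 to 77.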

When the string is ASCII, the bytes of the UTF-8 representation are identical to those of the original string, so it can share them.
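
The byte counts can be checked from Python itself. This is only a rough sketch: it re-encodes the strings with str.encode rather than inspecting the cached buffer, and it assumes that the one-byte-per-character storage of 'ñ' matches its Latin-1 encoding. The name cached_extra is made up for illustration.

```python
a = 'n'
b = 'ñ'

# UTF-8 form of 'ñ' is two bytes; the cached copy also carries a zero-byte terminator.
utf8_b = b.encode('utf-8')
cached_extra = len(utf8_b) + 1
print(utf8_b, cached_extra)   # b'\xc3\xb1' 3

# ASCII: the UTF-8 bytes equal the one-byte-per-character bytes, so they can be shared.
print(a.encode('utf-8') == a.encode('latin-1'))   # True

# Non-ASCII: the two forms differ, so a separate UTF-8 copy is needed.
print(b.encode('utf-8') == b.encode('latin-1'))   # False
```

The 3 printed for cached_extra lines up with the jump from 74 to 77 in the session above. The absolute getsizeof values depend on the CPython version and build, so I wouldn't rely on those.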
--
https://mail.python.org/mailman/listinfo/python-list
