On 2/10/14 9:43 AM, Tim Chase wrote:
On 2014-02-10 06:07, wxjmfa...@gmail.com wrote:
Python does not save memory at all. A str (unicode string)
uses less memory only - and only - because and when one uses
explicitly characters which are consuming less memory.
Not only is the memory gain zero, Python falls back to the
worst case.
>>> import sys
>>> sys.getsizeof('a' * 1000000)
1000025
>>> sys.getsizeof('a' * 1000000 + 'œ')
2000040
>>> sys.getsizeof('a' * 1000000 + 'œ' + '\U00010000')
4000048
If Python used UTF-32 for EVERYTHING, then all three of those cases
would be 4000048, so your own numbers clearly disprove your claim
that "python does not save memory at all".
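To take the fixed per-object header out of the comparison, one can measure how many bytes each additional character costs. This is only a minimal sketch, assuming CPython 3.3 or later (where the flexible string representation applies); the helper name bytes_per_char is made up for illustration and is not part of any library.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate the storage cost per character of a string made only of `ch`.

    Hypothetical helper. Subtracting two sizes cancels the fixed object
    header, leaving only the per-character cost. Relies on CPython's
    sys.getsizeof, so results on other implementations may differ.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

narrow = bytes_per_char('a')            # ASCII only
medium = bytes_per_char('\u0153')       # outside Latin-1, inside the BMP
wide = bytes_per_char('\U00010000')     # outside the BMP
print(narrow, medium, wide)
```

On such an interpreter the three values should come out as 1, 2 and 4 bytes per character, which matches the three sizes quoted above.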
The opposite of what the utf8/utf16 do!
>>> sys.getsizeof(('a' * 1000000 + 'œ' +
... '\U00010000').encode('utf-8'))
1000023
>>> sys.getsizeof(('a' * 1000000 + 'œ' +
... '\U00010000').encode('utf-16'))
2000025
However, as pointed out repeatedly, string indexing in a fixed-width
encoding is O(1), while indexing into a variable-width encoding (e.g.
UTF-8/UTF-16) is O(N). The FSR gives the benefits of O(1) indexing
while saving space when a string doesn't need to use a full 32-bit
width.
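To illustrate why indexing a variable-width encoding is O(N), here is a rough sketch of what fetching the i-th code point from raw UTF-8 bytes involves. The function names utf8_seq_len and utf8_index are invented for this example, and the code assumes the input is valid UTF-8; it is not how any particular implementation does it.

```python
def utf8_seq_len(data, pos):
    """Length in bytes of the UTF-8 sequence starting at data[pos]."""
    if pos >= len(data):
        raise IndexError("code point index out of range")
    lead = data[pos]
    if lead < 0x80:
        return 1          # 0xxxxxxx: ASCII
    if lead >> 5 == 0b110:
        return 2          # 110xxxxx
    if lead >> 4 == 0b1110:
        return 3          # 1110xxxx
    if lead >> 3 == 0b11110:
        return 4          # 11110xxx
    raise ValueError("invalid UTF-8 lead byte")

def utf8_index(data, i):
    """Return the i-th code point of UTF-8 `data` as a str.

    Every preceding sequence has to be stepped over, so the cost
    grows linearly with i (assumes valid UTF-8 and i >= 0).
    """
    pos = 0
    for _ in range(i):
        pos += utf8_seq_len(data, pos)
    return data[pos:pos + utf8_seq_len(data, pos)].decode('utf-8')

s = 'a' * 5 + '\u0153' + '\U00010000' + 'z'
data = s.encode('utf-8')
print(utf8_index(data, 6) == s[6])
```

With a fixed-width layout, s[i] is a single offset computation, whereas the loop above has to walk over every earlier character first.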
-tkc
Please don't engage in this debate with JMF. His mind is made up, and
he will not be swayed, no matter how persuasive and reasonable your
arguments. Just ignore him.
--
Ned Batchelder, http://nedbatchelder.com