On 07/14/2017 07:31 AM, Marko Rauhamaa wrote:
> Of course, UTF-8 in a bytes object doesn't make the situation any
> better, but does it make it any worse?
>
> As it stands, we have
>
>    è --[encode>-- Unicode --[reencode>-- UTF-8
>
> Why is one encoding format better than the other?

This is precisely the logic behind Google using UTF-8 for strings in Go,
rather than having some O(1) abstract type like Python has. Many other
languages do the same. The argument is that, because of the very issues
you mention, O(1) lookup in a string isn't that important: looking up a
particular index in a Unicode string is rarely the right thing to do, so
UTF-8 is just fine as a native, in-memory type.
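
To make the point about indexing concrete, here is a small Python 3 sketch. It is my own illustration rather than anything from the thread, and the variable names are made up. It assumes the character under discussion is "è" and compares the precomposed form with the decomposed form (plain "e" followed by a combining grave accent):

```python
import unicodedata

# Two spellings of the same visible character "è":
composed = "\u00e8"      # one code point: LATIN SMALL LETTER E WITH GRAVE
decomposed = "e\u0300"   # two code points: "e" + COMBINING GRAVE ACCENT

# The number of code points differs even though both display identically.
print(len(composed), len(decomposed))      # 1 2

# O(1) indexing by code point splits the decomposed form and drops the accent.
print(decomposed[0])                       # e

# The UTF-8 byte lengths differ as well.
print(len(composed.encode("utf-8")))       # 2
print(len(decomposed.encode("utf-8")))     # 3

# Only after normalization do the two spellings compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True
```

So even with constant-time access to code points, s[i] does not reliably give you a "character" in the sense a user would mean. You usually end up iterating and normalizing anyway, and that works just as well over UTF-8.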