Ben Bacarisse <ben.use...@bsb.me.uk>:

> It's 21. The reason being (or at least part of the reason being) that
> 21 bits can be UTF-8 encoded in 4 bytes: 11110xxx 10xxxxxx 10xxxxxx
> 10xxxxxx (3 + 3*6).
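
For anyone who wants to check the arithmetic, here is a minimal sketch in Python 3. The variable names are mine and purely illustrative; it only counts the payload bits in the quoted pattern and compares them with what Python's own UTF-8 codec produces for the highest code point.

```python
# Count the payload ('x') bits in the 4-byte UTF-8 pattern quoted above.
pattern = "11110xxx 10xxxxxx 10xxxxxx 10xxxxxx"
payload_bits = pattern.count("x")        # 3 + 3*6

# Largest value those bits could carry, versus the actual Unicode ceiling.
pattern_max = (1 << payload_bits) - 1    # all payload bits set
unicode_max = 0x10FFFF                   # highest Unicode code point

# Encode the highest code point with the standard codec.
encoded = chr(unicode_max).encode("utf-8")

print(payload_bits, hex(pattern_max), len(encoded), encoded.hex())
```

Note that the pattern has room for more values than Unicode actually defines: the ceiling of U+10FFFF sits below the largest number the 21 payload bits could hold.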

I bet the reason is UTF-16. Microsoft and Sun/Oracle would have insisted
on a maximum of 4 bytes per character. UTF-16 surrogate pairs can just
barely reach into 21 bits (up to U+10FFFF, not the full 21-bit range),
and only at the expense of creating an ugly hole inside Unicode (the
surrogate block U+D800..U+DFFF). Politics, politics.
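
A rough sketch of the surrogate arithmetic, in Python 3, to show where the ceiling and the hole come from. The helper name to_surrogates and the constants are made up for illustration; this is not taken from any library.

```python
# UTF-16 surrogate-pair arithmetic (illustrative names, not a library API).
HIGH_START = 0xD800   # high surrogates: 0xD800..0xDBFF (0x400 values)
LOW_START = 0xDC00    # low surrogates:  0xDC00..0xDFFF (0x400 values)


def to_surrogates(cp):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    offset = cp - 0x10000                       # a 20-bit value
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


# 0x400 * 0x400 pairs on top of the first 0x10000 code points.
max_cp = 0x10000 + 0x400 * 0x400 - 1
pair = to_surrogates(max_cp)

# The "hole": surrogate code points cannot be encoded as characters.
try:
    chr(0xD800).encode("utf-8")
    hole_is_encodable = True
except UnicodeEncodeError:
    hole_is_encodable = False

print(hex(max_cp), max_cp.bit_length(), [hex(u) for u in pair])
```

Two 10-bit halves give 20 bits, and adding the 0x10000 offset is what pushes the top code point into a 21st bit.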


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list
