Rustom Mody <rustompm...@gmail.com> writes:

> On Sunday, March 20, 2016 at 10:32:07 AM UTC+5:30, Steven D'Aprano wrote:
<snip>
>> Unicode (the character set part of it) is a set of abstract 23-bit numbers,
>
> 23? Or 21?

It's 21. The reason (or at least part of the reason) is that 21 bits
can be UTF-8 encoded in 4 bytes:

    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

That is 3 payload bits in the lead byte plus 6 in each of the three
continuation bytes (3 + 3*6 = 21).

<snip>
-- 
Ben.
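
For anyone who wants to check the arithmetic, here is a minimal sketch of
the 4-byte layout described above. The function name utf8_encode_4byte is
made up for illustration; it is not part of the standard library, which
already does this via str.encode("utf-8"). Note that Unicode stops at
U+10FFFF, so not every 21-bit value is a valid code point.

```python
PAYLOAD_BITS = 3 + 3 * 6  # 3 bits in the lead byte, 6 in each continuation byte


def utf8_encode_4byte(cp):
    """Pack a code point into 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("4-byte UTF-8 covers U+10000..U+10FFFF only")
    return bytes([
        0xF0 | (cp >> 18),            # 11110xxx: top 3 bits
        0x80 | ((cp >> 12) & 0x3F),   # 10xxxxxx: next 6 bits
        0x80 | ((cp >> 6) & 0x3F),    # 10xxxxxx: next 6 bits
        0x80 | (cp & 0x3F),           # 10xxxxxx: low 6 bits
    ])


if __name__ == "__main__":
    cp = 0x1F600
    encoded = utf8_encode_4byte(cp)
    # Cross-check against Python's built-in encoder.
    assert encoded == chr(cp).encode("utf-8")
    print(encoded.hex(" "))
    print(PAYLOAD_BITS, (0x10FFFF).bit_length())
```

The cross-check against chr(cp).encode("utf-8") is the useful part: if the
bit shuffling were wrong, the assertion would fail.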