Marc-Andre Lemburg <m...@egenix.com> added the comment:

John Machin wrote:
>
> John Machin <sjmac...@users.sourceforge.net> added the comment:
>
> @lemburg: RFC 2279 was obsoleted by RFC 3629 over 6 years ago.

I know.

> The standard now says 21 bits is it.

It says that the current Unicode codespace only uses 21 bits. In the
early days 16 bits were considered enough, so it wouldn't surprise me
if they extend that range again at some point in the future - after all,
leaving 11 bits unused in UCS-4 is a huge waste of space.

If you have a reference that the Unicode Consortium has decided to stay
with that limit forever, please quote it.

> F5-FF are declared to be invalid. I don't understand what you mean by
> "supporting those possibilities". The code is correctly issuing an error
> message. The goal of supporting the new resyncing and FFFD-emitting rules
> might be better met however by throwing away the code in the default clause
> and instead merely setting the entries for F5-FF in the utf8_code_length
> array to zero.

Fair enough. Let's do that.

The reference in the table should then be updated to RFC 3629.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8271>
_______________________________________
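
For illustration only, here is a rough Python model of the table change agreed on above. It is not the actual utf8_code_length array from the C source; the names build_code_length_table, UTF8_CODE_LENGTH and MAX_CODE_POINT are hypothetical, and the ranges are my reading of RFC 3629. A zero entry means "not a valid start byte", so F5-FF need no special default clause.

```python
# Hypothetical model of a lead-byte length table (assumption: this is NOT
# the real CPython array, only a sketch of the idea discussed above).

def build_code_length_table():
    """Map each possible first byte to the sequence length it announces."""
    table = [0] * 256                 # 0 = not a valid start byte
    for b in range(0x00, 0x80):
        table[b] = 1                  # ASCII, single-byte sequences
    for b in range(0xC2, 0xE0):
        table[b] = 2                  # C0 and C1 stay 0 (overlong forms only)
    for b in range(0xE0, 0xF0):
        table[b] = 3
    for b in range(0xF0, 0xF5):
        table[b] = 4                  # F4 is the last lead byte RFC 3629 allows
    return table                      # 80-BF (continuation) and F5-FF stay 0


UTF8_CODE_LENGTH = build_code_length_table()

# Upper end of the current codespace; it fits in 21 bits,
# which leaves 32 - 21 = 11 bits unused in a UCS-4 code unit.
MAX_CODE_POINT = 0x10FFFF

if __name__ == "__main__":
    print(MAX_CODE_POINT.bit_length())
    print([UTF8_CODE_LENGTH[b] for b in (0xF4, 0xF5, 0xFF)])
    # Python 3's own decoder rejects F5 as a start byte and, with
    # errors="replace", substitutes U+FFFD for it.
    print(ascii(b"\xf5abc".decode("utf-8", errors="replace")))
```

With a table like this, the decoder can treat every zero entry the same way (report an error or emit U+FFFD and resync), which is the simplification John suggested.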