Ezio Melotti <ezio.melo...@gmail.com> added the comment:

After the recent discussions on python-dev I went through the Unicode howto and fixed a few things, then I found this issue, so I'm attaching the patch here.
The patch mostly addresses markup issues, but it also removes the usage of 'byte string'.

A few more things that should be done:
 * clarify some more terms (e.g. code points, code units, characters, possibly scalar values, etc.);
 * mention the differences between narrow and wide builds, including:
   - a discussion about the UCS-2/UTF-16 implementation of narrow builds;
   - something about surrogates and surrogate pairs;
   - effects of slicing and indexing on narrow builds;
   - functions/methods that (don't) accept non-BMP chars on narrow builds;
 * something about Unicode support in the re module (this can probably wait until after the 'regex' inclusion).
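
As a rough illustration of the surrogate-pair and indexing items above, here is a minimal sketch for Python 3. It is not taken from the patch, and the helper name utf16_code_units is hypothetical. It splits a string into UTF-16 code units, which is roughly what len() and indexing operated on in narrow builds (before Python 3.3), so a non-BMP character shows up as two units.

```python
import struct

def utf16_code_units(text):
    """Hypothetical helper: return the UTF-16 code units of text as ints."""
    data = text.encode("utf-16-le")  # little-endian, no BOM
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

# One BMP character ('a') followed by one non-BMP character (U+1F600).
s = "a\U0001F600"
units = utf16_code_units(s)

print(len(s))                    # code points (on Python 3.3 and later)
print(len(units))                # UTF-16 code units
print([hex(u) for u in units])   # the last two units form a surrogate pair

# A high surrogate is in 0xD800-0xDBFF, a low surrogate in 0xDC00-0xDFFF.
high, low = units[1], units[2]
is_pair = 0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF
print(is_pair)
```

Slicing between the two units of such a pair is what produces a lone surrogate, which is the kind of effect the howto would need to explain for narrow builds.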
Also the codecs doc has a section about Unicode and encodings that might be moved to the howto.

----------
assignee: georg.brandl ->
resolution: fixed ->
stage:  -> commit review
versions: +Python 3.3
Added file: http://bugs.python.org/file23081/issue4153-2.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4153>
_______________________________________