Marc-Andre Lemburg <m...@egenix.com> added the comment:

Ezio Melotti wrote:
>
> Ezio Melotti <ezio.melo...@gmail.com> added the comment:
>
> [This should probably be discussed on python-dev or in another issue, so feel
> free to move the conversation there.]
>
> The current implementation considers printable """all the characters except
> those characters defined in the Unicode character database as following
> categories are considered printable.
>   * Cc (Other, Control)
>   * Cf (Other, Format)
>   * Cs (Other, Surrogate)
>   * Co (Other, Private Use)
>   * Cn (Other, Not Assigned)
>   * Zl Separator, Line ('\u2028', LINE SEPARATOR)
>   * Zp Separator, Paragraph ('\u2029', PARAGRAPH SEPARATOR)
>   * Zs (Separator, Space) other than ASCII space('\x20')."""
>
> We could also arbitrary exclude all the non-BMP chars, but that shouldn't be
> based on the availability of the fonts IMHO.

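To make the quoted category list concrete, here is a minimal sketch of that rule written against the unicodedata module. The names NON_PRINTABLE and is_printable are hypothetical and only for illustration; this is not the actual implementation, just a restatement of the categories listed above, compared against str.isprintable() on a few sample code points.

```python
import unicodedata

# General categories that the quoted definition treats as non-printable.
NON_PRINTABLE = {"Cc", "Cf", "Cs", "Co", "Cn", "Zl", "Zp", "Zs"}


def is_printable(ch):
    """Approximate the quoted rule for a single code point (illustrative only)."""
    if ch == " ":
        # ASCII space is the one Zs character the definition keeps printable.
        return True
    return unicodedata.category(ch) not in NON_PRINTABLE


# Letter, ASCII space, control, no-break space, line separator,
# euro sign, lone surrogate, private use.
samples = ["a", " ", "\n", "\u00a0", "\u2028", "\u20ac", "\ud800", "\ue000"]

for ch in samples:
    # ascii() keeps the output safe on any console encoding.
    print(ascii(ch), unicodedata.category(ch), is_printable(ch), ch.isprintable())
```

Note that the check looks only at the Unicode database; it says nothing about whether a font for the code point is available.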
Without fonts, you can't print the code points, even if the Unicode
database defines the code point as not having one of the above
classes. And that's probably also the reason why the Unicode
database doesn't define a printable property :-)

I also find the use of Zl, Zp and Zs in the definition somewhat
arbitrary: whitespace is certainly printable. This also doesn't
match the isprint() C lib API:

http://www.cplusplus.com/reference/clibrary/cctype/isprint/

"A printable character is any character that is not a control character."

>> Note that Python3 will send printable code points as-is to the
>> console, so whether or not a code point is considered printable
>> should take the common availability of fonts being able to display
>> the code point into account. Otherwise, a user would just see a
>> square box instead of the much more useful escape sequence
>
> If the concern is about the usefulness of repr() in the console, note that on
> the Windows terminal trying to display most of the characters results in an
> error (see #5110), and that makes repr() barely usable.
> ascii() might be an alternative if the user wants to see the escape sequence
> instead of a square box.

That's a different problem, but indeed also related to the printable
property which was introduced as part of the Unicode repr() change:
if the console encoding cannot represent the printable code points,
you get an error.

I was never a fan of the Unicode repr() change to begin with. The
repr() of an object should work in almost all cases. Being able to
read the repr() of an object in clear text is only secondary.

IMHO, allowing all printable code points to pass through unescaped
was not beneficial. We have str() for getting readable representations
of objects.

Anyway, we're stuck with it now, so have to work around the
issues...
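
As a rough illustration of the repr() vs. ascii() point and the console encoding problem, here is a small sketch. The helper fits_console and the choice of cp437 as a stand-in for a limited console encoding are assumptions for the example, not something taken from the tracker discussion.

```python
text = "\u20ac \U0001d11e"  # euro sign and a non-BMP musical symbol

readable = repr(text)  # printable code points pass through unescaped
escaped = ascii(text)  # every non-ASCII code point is backslash-escaped


def fits_console(s, encoding):
    """Return True if s can be encoded with the given console encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# cp437 is only an assumed example of a console encoding without these characters.
print(escaped)
print(fits_console(readable, "cp437"))
print(fits_console(escaped, "cp437"))

# One possible workaround: escape only what the target encoding cannot represent.
fallback = readable.encode("ascii", errors="backslashreplace").decode("ascii")
print(fallback)
```

The ascii() form always survives the encoding step, at the cost of readability; the backslashreplace fallback keeps the output readable wherever the encoding allows it.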
----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5127>
_______________________________________