https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103305
Pekka S <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]
                   |                            |i.fi

--- Comment #14 from Pekka S <[email protected]> ---
(In reply to Jonathan Wakely from comment #6)
> (In reply to Jonathan Wakely from comment #5)
> >   static const mask blank     = space;
>
> We might want to use blank = _ISspace | _ISblank for this last one, but I
> don't really understand what newlib defines those categories as:
>
>
> #define isblank(__c) \
>   __extension__ ({ __typeof__ (__c) __x = (__c); \
>         (__ctype_lookup(__x)&_B) || (int) (__x) == '\t';})
>         (__ctype_lookup(__x)&_ISblank) || (int) (__x) == '\t';})
>
> This definition is weird ... why is '\t' not already handled by _ISblank?

It has been attempted in the past:
https://sourceware.org/legacy-ml/newlib/2009/threads.html#00342

The 8-bit mask in use is simply not wide enough to disambiguate all POSIX
character classes; namely, the space, blank and print classes are the
problematic ones to distinguish properly. The naming of newlib character
classes does not fully align with POSIX, and this has to do with restrictions
that come from space constraints.

Also, libstdc++-v3/config/locale/newlib/ctype_members.cc does not handle the
blank class even though newlib supports wctype("blank"). As explained above,
in this case it really doesn't matter, since matching a character to a (true
POSIX) class using a mask bit alone is not possible.

Anyway, I made a similar patch but never got around to submitting it. I also
used _ISblank | _ISspace since IMHO it is "less wrong" than _ISspace (or equal
to space) alone, and added a note explaining the issue. (Yes, I was about to
repeat history.)
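
To illustrate the point about the 8-bit mask, here is a minimal sketch in C. It is not newlib's actual code: the SK_* flags and the sk_* functions are hypothetical names, and the bit layout and table contents are assumptions loosely modeled on newlib's <ctype.h>. The assumption is that the "blank" bit is set only for ' ' and also feeds the print test, so '\t' cannot carry it without becoming printable and must be special-cased instead.

```c
#include <assert.h>

/* Hypothetical 8-bit class flags (assumed layout, for illustration only). */
enum {
    SK_U = 0x01, /* upper */
    SK_L = 0x02, /* lower */
    SK_N = 0x04, /* digit */
    SK_S = 0x08, /* space */
    SK_P = 0x10, /* punct */
    SK_C = 0x20, /* cntrl */
    SK_X = 0x40, /* hex letter */
    SK_B = 0x80  /* "blank": assumed to be set for ' ' only */
};

/* Hypothetical per-character lookup for 7-bit ASCII. */
static unsigned char sk_lookup(int c)
{
    assert(c >= 0 && c < 128);
    if (c == ' ')               return SK_S | SK_B;
    if (c >= '\t' && c <= '\r') return SK_C | SK_S; /* \t \n \v \f \r */
    if (c < 0x20 || c == 0x7f)  return SK_C;
    if (c >= '0' && c <= '9')   return SK_N;
    if (c >= 'A' && c <= 'F')   return SK_U | SK_X;
    if (c >= 'A' && c <= 'Z')   return SK_U;
    if (c >= 'a' && c <= 'f')   return SK_L | SK_X;
    if (c >= 'a' && c <= 'z')   return SK_L;
    return SK_P;
}

/* print is derived from other bits, including SK_B for the space character. */
static int sk_isprint(int c)
{
    return (sk_lookup(c) & (SK_P | SK_U | SK_L | SK_N | SK_B)) != 0;
}

/* Testing the mask bit alone misses '\t' ... */
static int sk_isblank_mask_only(int c)
{
    return (sk_lookup(c) & SK_B) != 0;
}

/* ... so the class test needs an explicit extra comparison. */
static int sk_isblank(int c)
{
    return sk_isblank_mask_only(c) || c == '\t';
}

/* A combined space|blank mask matches '\t', but also '\n' and friends. */
static int sk_matches_space_or_blank(int c)
{
    return (sk_lookup(c) & (SK_S | SK_B)) != 0;
}
```

Under these assumptions, a mask of space | blank accepts every blank character but also every other whitespace character, which is why it can only be "less wrong" rather than an exact match for the POSIX blank class.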
