On Wed, Apr 08, 2009 at 09:41:18AM +0200, Giacomo A. Catenazzi wrote:
> Roger Leigh wrote:
>> I wasn't aware that this level of checking was performed, though
>> it does make sense.  But, does it not reject non 7-bit input in the C
>> locale for completeness?
>>
>> Should tools doing "raw" I/O not be using lower level interfaces
>> such as fread() and fwrite() rather than the "formatted" print
>> functions which are specified to behave in a locale-dependent
>> manner?
>
> printf is not locale dependent, but on numeric display
> (and eventually on some extensions).

Each C FILE* stream has an associated locale.  Look at
struct _IO_FILE_complete in libio.h.  The example program I posted
demonstrates that this does actually happen; the output streams use
the current locale, and there is a UTF-8 [narrow]/UCS-4 [wide]
conversion to the locale codeset on output.

When you output a string to a stream, there is a conversion step from
the exec charset (either narrow or wide) to the codeset of the
stream's associated locale.  I haven't yet found documentation of
exactly where this happens (it's all in the libc internals), but I
would hazard a guess that all the "string" functions go through this
step, whereas the lower-level byte-based I/O functions skip it.  This
machinery is also used by the C++ iostream locale imbue() mechanism.

So while printf itself might not do the conversion, it's done at some
point, probably when printf copies the formatted string to the stream
buffer.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.
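
P.S.  For anyone who wants to poke at the wide-stream case, here is a
minimal sketch.  It is not the example program referred to above, just
an illustration written under a few assumptions: a POSIX system (it
uses fileno(), lseek() and read() to look at the raw bytes), and a
helper name, wide_output_bytes(), which is purely hypothetical.  It
sets LC_CTYPE, writes a wide string with fputws() to a temporary
stream, and then reads back whatever bytes actually reached the file,
so the conversion done by the stream can be inspected.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <unistd.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of any library): write `text` to a
 * temporary wide-oriented stream under the LC_CTYPE locale
 * `locale_name`, then copy the raw bytes that reached the file into
 * `out` (at most `cap` bytes).
 *
 * Returns the number of bytes read back, or -1 if the locale is not
 * available, the temporary file cannot be created, or the wide to
 * multibyte conversion fails.
 *
 * Note: this changes the process-wide LC_CTYPE setting.
 */
long wide_output_bytes(const char *locale_name, const wchar_t *text,
                       unsigned char *out, size_t cap)
{
    assert(locale_name != NULL && text != NULL && out != NULL);

    /* Select the locale before the stream becomes wide-oriented. */
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    long n = -1;

    /* fputws() makes the stream wide-oriented; the wide characters are
       converted to the locale's multibyte encoding on their way out. */
    if (fputws(text, fp) >= 0 && fflush(fp) == 0) {
        /* Read the bytes through the file descriptor rather than with
           byte-oriented stdio calls on a wide-oriented stream. */
        int fd = fileno(fp);
        if (fd >= 0 && lseek(fd, 0, SEEK_SET) == 0)
            n = (long)read(fd, out, cap);
    }

    fclose(fp);
    return n;
}
```

Calling wide_output_bytes("C", L"abc", buf, sizeof buf) should give
back the three bytes "abc".  Passing the name of an installed UTF-8
locale (the name varies between systems) together with a non-ASCII
wide string lets you see which multibyte sequence ends up in the
file; the helper returns -1 if that locale is missing or the
characters cannot be represented in its codeset.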