Mike Gran <spk...@yahoo.com>:

> On Tuesday, February 14, 2017 1:07 PM, Marko Rauhamaa
> <ma...@pacujo.net> wrote:
>> Unicode strings are a special data type that have relatively little
>> practical use. Byte strings are much more fundamental. C's "char *"
>> is perfect.
>
> Human language itself is of limited practical use except for
> communicating information to people that read languages that have a
> text representation.

Unicode is useful, don't get me wrong. However, Unicode is not the same
as "human language itself". Unicode is a huge can of worms, and yet not
big enough. It is best reserved for the use of text-processing
applications. It shouldn't be shoved down the throat of each and every
application.

A much more fundamental data type is the byte string, which can
represent many things, including Unicode. With UTF-8, I mostly don't
need an interpretive step to deal with plain text. Sure, I can't know
the visual width of my plain text string, but it's not simply the number
of Unicode code points, either, because of diacritics and other similar
complications.

>> In particular, filenames are *not*, nor can they be mapped to,
>> Unicode strings in Linux.
>
> True. Linux should follow OpenBSD and make all locales UTF-8.

Maybe, but Guile should wait until Linux has made the transition. There
are no signs of such a transition at the moment. Linux deals in bytes
and couldn't care less about interpreting those bytes.


Marko
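
To make the two points above concrete, here is a minimal sketch in Python
(not Guile, and not anything from the thread itself). The sample string
and the filename are made up for illustration. It shows that byte count,
code point count, and a rough "visible character" count all differ for
the same text, and that a byte string Linux would accept as a filename
need not decode as UTF-8.

```python
import unicodedata

# Hypothetical sample: "été" spelled with combining acute accents (U+0301)
# instead of precomposed letters.
text = "e\u0301te\u0301"
raw = text.encode("utf-8")

byte_len = len(raw)        # length of the UTF-8 byte string
code_points = len(text)    # number of Unicode code points

# Rough count of visible characters: skip combining marks.
# This ignores wide and zero-width characters, so it is only an
# approximation and not a real display width.
visible = sum(1 for ch in text if not unicodedata.combining(ch))

# Hypothetical filename: "café.txt" encoded as Latin-1. The kernel treats
# it as plain bytes, but it is not valid UTF-8.
name = b"caf\xe9.txt"
try:
    name.decode("utf-8")
    decodes = True
except UnicodeDecodeError:
    decodes = False

print(byte_len, code_points, visible, decodes)
```

The three counts disagree even for this short string, and the filename
stays perfectly usable as bytes even though the strict UTF-8 decode
fails.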