Re: [bug-gnulib] ISSLASH on Woe32

Bruno Haible Thu, 28 Apr 2005 08:14:03 -0700

Paul Eggert wrote:
> Why would gnulib itself need to care
> about the difference between (2) and (4)?  Either way, gnulib can
> easily look for '/' and '\' in path names.  Isn't it up to the
> supplier of the underlying system-call implementation, and/or the
> gnulib user, to decide whether (2) or (4) is in use?  In other words,
> can't gnulib itself be agnostic about (2) versus (4)?


Let's take an example contained in gnulib. (You can find many more examples
which are half contained in gnulib and half contained in
coreutils/findutils/...)
Take localcharset.c. (Forget for a moment that the code is currently not
used on Woe32, for different reasons.)

The code currently is essentially

      dir = relocate (LIBDIR);

      /* Concatenate dir and base into freshly allocated file_name.  */
      {
        size_t dir_len = strlen (dir);
        size_t base_len = strlen (base);
        int add_slash = (dir_len > 0 && !ISSLASH (dir[dir_len - 1]));
        file_name = (char *) malloc (dir_len + add_slash + base_len + 1);
        if (file_name != NULL)
          {
            memcpy (file_name, dir, dir_len);
            if (add_slash)
              file_name[dir_len] = DIRECTORY_SEPARATOR;
            memcpy (file_name + dir_len + add_slash, base, base_len + 1);
          }
      }

      fp = fopen (file_name, "r");

In approach (2) LIBDIR will be a UTF-8 encoded pathname. The ISSLASH
operation will therefore work correctly. However, fopen() expects a
string in locale encoding, not in UTF-8 encoding. Therefore we have
to replace the last line with

      char *real_file_name = u8_conv_to_locale (file_name);
      fp = fopen (real_file_name, "r");
      free (real_file_name);

Or, alternatively, replace the whole set of libc functions dealing with
pathnames with wrappers that take a UTF-8 string:

      fp = u8_fopen (file_name, "r");

Whereas in approach (4), we can leave the code as it is.
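
To make the cost of approach (2) concrete, here is a minimal sketch of what
such a wrapper could look like. It assumes POSIX iconv() and
nl_langinfo (CODESET) are available, which is not a given on Woe32. The names
u8_to_locale_sketch and u8_fopen_sketch are hypothetical stand-ins for the
u8_conv_to_locale and u8_fopen mentioned above, not existing gnulib functions.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <langinfo.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to the
   encoding of the current locale.  Returns a malloc'ed string, or NULL
   if the conversion is not possible.  */
static char *
u8_to_locale_sketch (const char *u8)
{
  assert (u8 != NULL);

  iconv_t cd = iconv_open (nl_langinfo (CODESET), "UTF-8");
  if (cd == (iconv_t) -1)
    return NULL;

  size_t in_left = strlen (u8);
  /* Assumption: 4 output bytes per input byte, plus some slack for shift
     sequences, is enough for any locale encoding.  */
  size_t out_left = 4 * in_left + 16;
  char *out = malloc (out_left + 1);
  if (out != NULL)
    {
      char *in_ptr = (char *) u8;
      char *out_ptr = out;
      if (iconv (cd, &in_ptr, &in_left, &out_ptr, &out_left) == (size_t) -1
          /* Flush any pending shift state of a stateful encoding.  */
          || iconv (cd, NULL, NULL, &out_ptr, &out_left) == (size_t) -1)
        {
          free (out);
          out = NULL;
        }
      else
        *out_ptr = '\0';
    }
  iconv_close (cd);
  return out;
}

/* Hypothetical wrapper: like fopen(), but takes a UTF-8 file name.  */
static FILE *
u8_fopen_sketch (const char *u8_name, const char *mode)
{
  char *real_name = u8_to_locale_sketch (u8_name);
  if (real_name == NULL)
    return NULL;

  FILE *fp = fopen (real_name, mode);
  int saved_errno = errno;   /* free() may clobber errno */
  free (real_name);
  errno = saved_errno;
  return fp;
}
```

Every libc function that takes a pathname (open, stat, rename, opendir, ...)
would need a wrapper of this kind, and each call pays for an allocation and
a conversion. File names that are not representable in the locale encoding
make the conversion fail.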

> For example, EUC-JP is also safe.  Or perhaps you're not
> mentioning this because Microsoft doesn't support EUC-JP?  (I'm not
> familiar with their support for various encodings.)

I'm not familiar with it either. But the most comprehensive charset aliases
table
http://dev.icu-project.org/cgi-bin/viewcvs.cgi/icu/source/data/mappings/convrtrs.txt?rev=1.115
shows that EUC-JP is unknown as a CP<nnn> encoding, whereas UTF-8 is known as
CP1208 and as CP65001. UTF-8 is also mentioned in
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_17si.asp
Therefore I think it's not possible to recommend an EUC-JP encoded locale
to Windows users.
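
For reference, the sense in which an encoding is "safe" for a byte-wise
ISSLASH can be sketched as follows. The helper last_slash_bytewise is
hypothetical, and the ISSLASH definition is the usual Woe32 one, repeated
here only to keep the fragment self-contained. The sample character is 表
(U+8868), whose second byte in CP932 (Shift_JIS) is 0x5C, the same value as
'\'.

```c
#include <assert.h>
#include <stddef.h>

#define ISSLASH(C) ((C) == '/' || (C) == '\\')

/* Hypothetical helper: return the index of the last byte for which
   ISSLASH is true, or -1 if there is none.  It looks at single bytes
   only and knows nothing about the encoding of NAME.  */
static long
last_slash_bytewise (const char *name)
{
  assert (name != NULL);

  long last = -1;
  for (size_t i = 0; name[i] != '\0'; i++)
    if (ISSLASH (name[i]))
      last = (long) i;
  return last;
}

/* The file name C:\dir\<U+8868> in three encodings.  */

/* UTF-8: the character is E8 A1 A8; all bytes are >= 0x80.  */
static const char name_utf8[] = "C:\\dir\\\xe8\xa1\xa8";

/* EUC-JP: the character is C9 BD; all bytes are >= 0x80.  */
static const char name_eucjp[] = "C:\\dir\\\xc9\xbd";

/* CP932: the character is 95 5C; the second byte equals '\\'.  */
static const char name_cp932[] = "C:\\dir\\\x95\x5c";
```

With these inputs, last_slash_bytewise finds the real separator at index 6
for the UTF-8 and EUC-JP names, but reports index 8 for the CP932 name,
which is the trail byte of the character and not a separator.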

Bruno



_______________________________________________
bug-gnulib mailing list
bug-gnulib@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-gnulib
