Motomichi Matsuzaki wrote:

> At Wed, 27 Dec 2000 12:05:57 +0200,
> Maxim Sobolev <[EMAIL PROTECTED]> wrote:
> > Several days ago I got a CD with Russian filenames on it and discovered that
> > I'm unable to read those filenames. After some hacking I produced a patch,
>
> Vladimir Kushnir's patch addresses this.
>
> http://www.freebsd.org/cgi/getmsg.cgi?fetch=270425+0+/usr/local/www/db/text/2000/freebsd-hackers/20001203.freebsd-hackers
>
> and it is based on my patch:
>
> http://triaez.kaisei.org/~mzaki/joliet/
>
> > which should solve this problem in a manner similar to what we have in
> > the msdosfs module (i.e. user-provided conversion table). I have to emphasize
> > that it's a temporary solution until we have iconv support in the kernel.
>
> *PLEASE* be careful about filename I18N.
>
> 1. Joliet extension
>
> The Joliet extension is built on Unicode,
> and is a "multilingual" filesystem.
> We can find CDs which contain files named in all of
> the English, French, Russian, Chinese, and Japanese languages.
> So charset conversion per mount is not sufficient.

You can specify multiple charset conversion tables for each mount point; the
only problem is creating the appropriate conversion tables (I do not have any
CDs with anything other than English/Russian filenames :-> ).
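
For illustration only, here is a minimal sketch of what a user-provided
conversion table amounts to. It is written in Python rather than kernel C and
is not the code from the patch. It assumes that Joliet names arrive as
big-endian UCS-2 and that a single 8-bit table is in use; build_table and
convert_name are hypothetical names.

```python
def build_table(charset):
    """Build a code point -> byte map for one 8-bit charset (hypothetical helper)."""
    table = {}
    for byte in range(256):
        try:
            char = bytes([byte]).decode(charset)
        except UnicodeDecodeError:
            continue  # this byte is undefined in the charset
        table[ord(char)] = byte
    return table


def convert_name(raw, table, fallback=b"?"):
    """Convert a big-endian UCS-2 name to the table's charset (assumed input format)."""
    out = bytearray()
    for i in range(0, len(raw) - 1, 2):
        code = (raw[i] << 8) | raw[i + 1]  # one 16-bit code unit
        out += bytes([table[code]]) if code in table else fallback
    return bytes(out)


koi8 = build_table("koi8_r")
russian = "Привет.txt".encode("utf-16-be")
japanese = "日本.txt".encode("utf-16-be")
print(convert_name(russian, koi8))
print(convert_name(japanese, koi8))
```

The Japanese characters have no entry in a KOI8-R table, so they fall back to
"?". A table-based approach is only as good as the tables supplied.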

> 3. Relation to userland applications
>
> Currently, conversion tables between Unicode and local charsets are
> widely needed and implemented, for example for the Joliet extension,
> the FAT filesystem, TrueType rasterizers, WWW browsers, and so on.
> We should share the tables as far as possible for consistency.
> So the ideal solution to code conversion is not an in-kernel table
> but a userland shared library.
> Therefore, filename code conversion should also be done in userland
> as far as possible.
>
> 4. My rough idea
>
> My preliminary idea for filesystem I18N:
>
> * filenames recorded on Unix filesystems (e.g. FFS, MFS) use
>   an arbitrary codeset, for example Unicode.
>
> * interface between kernel and userland should use
>   filesystem-safe encoding, for example UTF-8.
>
> * userland applications can convert from/to the user-requested
>   charsets, such as latin-2, koi8, and euc-jp, using a shared library.
>
> * the Joliet extension and UDF, which are based on Unicode, need
>   no in-kernel conversion, in case Unix filesystems use Unicode.
>
> * the FAT filesystem, which uses both Unicode and conventional
>   codepages, requires in-kernel conversion in order to
>   write the conventional 8.3 names.
>
> Any ideas?
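
To make the quoted proposal concrete, the layering it describes can be
sketched roughly as follows, again in Python for brevity. It assumes the
on-disc Joliet name is big-endian UCS-2. kernel_to_utf8 and userland_to_local
are hypothetical names standing in for the kernel boundary and a shared
conversion library; they are not existing interfaces.

```python
def kernel_to_utf8(raw_ucs2_be):
    """Re-encode a big-endian UCS-2 name as UTF-8; no per-charset table is needed."""
    return raw_ucs2_be.decode("utf-16-be").encode("utf-8")


def userland_to_local(name_utf8, charset):
    """Convert a UTF-8 name to the user's charset, replacing unmappable characters."""
    return name_utf8.decode("utf-8").encode(charset, errors="replace")


raw = "Отчёт.txt".encode("utf-16-be")
name = kernel_to_utf8(raw)

# UCS-2 puts a NUL byte in every ASCII character; UTF-8 contains a NUL
# byte only if the name itself contains U+0000.
print(b"\x00" in raw, b"\x00" in name)

# The same UTF-8 name rendered for two different user locales.
print(userland_to_local(name, "koi8_r"))
print(userland_to_local(name, "iso8859_2"))
```

In this arrangement the choice of charset is made per user or per process
instead of per mount, which is the point of the proposal.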

Thanks for pointing this out, but I think that your proposal is too generic to
be committed any time soon (not to mention MFC'ing it). Moreover, as I pointed
out, efforts to provide generic Unicode functionality in kernel/userland are
currently underway, so it is likely that part of your work will be duplicated
or obsoleted.

What I'm proposing here is a quick-and-dirty (and correspondingly limited)
solution to allow mounting CDs with Unicode filenames on them. This solution is
meant to be temporary, until iconv-based kernel interfaces appear.

-Maxim


