Karsten Blees <karsten.bl...@gmail.com> writes:

> diff --git a/Documentation/i18n.txt b/Documentation/i18n.txt
> index e9a1d5d..e5f6233 100644
> --- a/Documentation/i18n.txt
> +++ b/Documentation/i18n.txt
> @@ -1,18 +1,28 @@
> -At the core level, Git is character encoding agnostic.
> -
> - - The pathnames recorded in the index and in the tree objects
> -   are treated as uninterpreted sequences of non-NUL bytes.
> -   What readdir(2) returns are what are recorded and compared
> -   with the data Git keeps track of, which in turn are expected
> -   to be what lstat(2) and creat(2) accepts.  There is no such
> -   thing as pathname encoding translation.
> +Git is to some extent character encoding agnostic.

I do not think removing this text makes much sense here unless you
add an equivalent to the new text below.

>   - The contents of the blob objects are uninterpreted sequences
>     of bytes.  There is no encoding translation at the core
>     level.
>  
> - - The commit log messages are uninterpreted sequences of non-NUL
> -   bytes.
> + - Pathnames are encoded in UTF-8 normalization form C. This

That is true only on some systems like OSX (with HFS+) and Windows,
no?  BSDs in general and Linux do not do any such mangling IIRC.  I
am OK with describing the mangling as a notable oddball to warn
users, though; i.e. not as the norm, as your new text suggests, but
as an exception.
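
To illustrate why normalization matters when pathnames are compared
as raw bytes, here is a minimal Python sketch. The file name is a
made-up example and nothing here is Git code; it only shows that the
composed (NFC) and decomposed (NFD) forms of the same visible name
are different byte sequences in UTF-8.

```python
import unicodedata

# Hypothetical file name containing a precomposed a-umlaut (U+00E4).
name = "M\u00e4rchen.txt"

# NFC keeps the precomposed character; NFD splits it into
# "a" followed by the combining diaeresis (U+0308).
nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)

nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")

print(nfc_bytes)               # precomposed form
print(nfd_bytes)               # decomposed form
print(nfc_bytes == nfd_bytes)  # a byte-wise comparison sees two names
```

A tool that treats pathnames as uninterpreted bytes would see these
as two distinct names, which is why a file system that rewrites one
form into the other is worth a warning.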

> +   platforms. If file system APIs don't use UTF-8 (which may be
> +   file system specific), it is recommended to stick to pure
> +   ASCII file names.

Hmph, who endorsed such a recommendation?  It is recommended to
stick to whatever naming scheme would not cause trouble for project
participants.  If your participants all want to (and can) use
ISO-8859-1, we do not discourage them from doing so.
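
As a rough illustration of the ISO-8859-1 case, the Python sketch
below encodes one made-up file name both ways. The name and the
helper `is_valid_utf8` are hypothetical and exist only for this
example. It shows that an ISO-8859-1 name is a perfectly stable byte
sequence, just not one that decodes as UTF-8.

```python
# Hypothetical file name with an e-acute (U+00E9).
name = "caf\u00e9.txt"

latin1_bytes = name.encode("iso-8859-1")  # one byte for the e-acute
utf8_bytes = name.encode("utf-8")         # two bytes for the e-acute


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (illustrative helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(latin1_bytes, is_valid_utf8(latin1_bytes))
print(utf8_bytes, is_valid_utf8(utf8_bytes))
```

As long as every participant's system passes those bytes through
unchanged, a byte-wise comparison of the names stays consistent,
whichever encoding the project agreed on.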