Hello again.

I appreciate the growing interest in this issue.

Torsten, may I ask what the benefit of your code is? My PR solves it
by only tweaking the function 'has_prohibited_utf_bom' in utf8.c, which
is likely the shortest way:

https://github.com/git/git/pull/550/files

To make sure everything is clear, here is a list of cases covering
Git's current behaviour and the new behaviour after my PR, regarding
this issue.

Current behaviour:

- Placing 'test.txt working-tree-encoding=UTF-16' in .gitattributes for
a new test.txt file with either a UTF-16 BE or LE BOM, and committing
everything -> The file gets re-encoded from UTF-8 (as stored
internally) to UTF-16 with the default system/libiconv endianness ->
Problem (if the user requires the opposite endianness in their project
for any reason). Note, however, that the user can still see
human-readable diffs for that file.

- Placing 'test.txt working-tree-encoding=UTF-16LE' or 'test.txt
working-tree-encoding=UTF-16BE' for a new test.txt file with either a
UTF-16 BE or LE BOM, and committing everything: we assume the user does
this because they require that exact endianness and write it explicitly
in an attempt to preserve it -> Git prohibits committing the file. In
addition, no human-readable diff is shown in the diff viewer/tool being
used; the file is simply shown as binary.
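
For anyone who has not looked at utf8.c, the rejection in the second
case comes down to a check roughly like the one below. This is only a
simplified, standalone sketch and not the actual Git code: the
'sketch_' names are made up for illustration, the encoding name is
compared with a plain strcmp(), and UTF-32 is left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: non-zero if data starts with the given BOM bytes. */
static int sketch_has_bom(const char *data, size_t len,
			  const char *bom, size_t bom_len)
{
	return len >= bom_len && memcmp(data, bom, bom_len) == 0;
}

/*
 * Hypothetical check: non-zero if enc names a fixed endianness
 * (UTF-16BE or UTF-16LE) and data nevertheless starts with a
 * UTF-16 BOM of either byte order.
 */
static int sketch_prohibited_utf16_bom(const char *enc,
				       const char *data, size_t len)
{
	static const char be_bom[] = { '\xFE', '\xFF' };
	static const char le_bom[] = { '\xFF', '\xFE' };
	int fixed_endian = !strcmp(enc, "UTF-16BE") ||
			   !strcmp(enc, "UTF-16LE");

	return fixed_endian &&
	       (sketch_has_bom(data, len, be_bom, sizeof(be_bom)) ||
		sketch_has_bom(data, len, le_bom, sizeof(le_bom)));
}
```

With this sketch, a buffer starting with FF FE is flagged when the
encoding is "UTF-16LE" or "UTF-16BE", but not when it is plain
"UTF-16", which matches the two cases described above.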

New behaviour:

- Rather than repeat it all here, please read my PR description:
https://github.com/git/git/pull/550

- Git translations may need to be tweaked in order to be consistent
with the new behaviour.

Thanks for your attention.
