On Mon, Sep 23, 2019 at 03:04:19AM -0700, Alexandr Miloslavskiy via 
GitGitGadget wrote:
> From: Alexandr Miloslavskiy <alexandr.miloslavs...@syntevo.com>

Thanks for the tests, some nit-picks inline.

>
> After I discovered that UTF-16-LE-BOM test was bugged and still
> succeeded...

My interpretation is that the \000\000 must be handled correctly
on all platforms, and that seems to be the case.

Would this make more sense:

After I discovered that UTF-16-LE-BOM test was bugged,
I decided that better tests are required

> ... I decided that better tests are required. Possibly the best
> option here is to compare git results against hardcoded ground truth.


>
> The new tests also cover more interesting chars where (ANSI != UTF-8).
>
> Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavs...@syntevo.com>
> ---
>  t/t0028-working-tree-encoding.sh | 39 ++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
>
> diff --git a/t/t0028-working-tree-encoding.sh 
> b/t/t0028-working-tree-encoding.sh
> index 5493cf3ca9..d0dd5dd0ea 100755
> --- a/t/t0028-working-tree-encoding.sh
> +++ b/t/t0028-working-tree-encoding.sh
> @@ -280,4 +280,43 @@ test_expect_success ICONV_SHIFT_JIS 'check roundtrip 
> encoding' '
>       git reset
>  '
>
> +# $1: checkout encoding
> +# $2: test string
> +# $3: binary test string in checkout encoding
> +test_commit_utf8_checkout_other () {
> +     encoding="$1"
> +     orig_string="$2"
> +     expect_bytes="$3"
> +
> +     test_expect_success "Commit utf-8, checkout ${encoding}" '

General remark:
Do we need the {} here?
${encoding} could be simpler written as $encoding

> +             test_when_finished "git checkout HEAD -- .gitattributes" &&
> +
> +             test_ext="commit_utf8_checkout_${encoding}" &&
> +             test_file="test.${test_ext}" &&
> +
> +             # Commit as utf-8

Another nit-pick:
Looking at the other test cases, should utf-8 be written as UTF-8
for consistency ?

> +             echo "*.${test_ext} text working-tree-encoding=utf-8" 
> >.gitattributes &&
> +             printf "${orig_string}" >"${test_file}" &&
> +             git add "${test_file}" &&
> +             git commit -m "Test data" &&
> +
> +             # Checkout in tested encoding
> +             rm "${test_file}" &&
> +             echo "*.${test_ext} text working-tree-encoding=${encoding}" 
> >.gitattributes &&
> +             git checkout HEAD -- "${test_file}" &&
> +
> +             # Test
> +             printf "${expect_bytes}" > "${test_file}.raw" &&
> +             test_cmp_bin "${test_file}.raw" "${test_file}"

More a style-nit: could we simply write like this:
                printf $expect_bytes > $test_file.raw &&
                test_cmp_bin $test_file.raw $test_file

(Even on other places)

> +     '
> +}
> +
> +test_commit_utf8_checkout_other "UTF-8"        "Test Тест" 
> "\124\145\163\164\040\320\242\320\265\321\201\321\202"
> +test_commit_utf8_checkout_other "UTF-16LE"     "Test Тест" 
> "\124\000\145\000\163\000\164\000\040\000\042\004\065\004\101\004\102\004"
> +test_commit_utf8_checkout_other "UTF-16BE"     "Test Тест" 
> "\000\124\000\145\000\163\000\164\000\040\004\042\004\065\004\101\004\102"
> +test_commit_utf8_checkout_other "UTF-16LE-BOM" "Test Тест" 
> "\377\376\124\000\145\000\163\000\164\000\040\000\042\004\065\004\101\004\102\004"
> +test_commit_utf8_checkout_other "UTF-16BE-BOM" "Test Тест" 
> "\376\377\000\124\000\145\000\163\000\164\000\040\004\042\004\065\004\101\004\102"
> +test_commit_utf8_checkout_other "UTF-32LE"     "Test Тест" 
> "\124\000\000\000\145\000\000\000\163\000\000\000\164\000\000\000\040\000\000\000\042\004\000\000\065\004\000\000\101\004\000\000\102\004\000\000"
> +test_commit_utf8_checkout_other "UTF-32BE"     "Test Тест" 
> "\000\000\000\124\000\000\000\145\000\000\000\163\000\000\000\164\000\000\000\040\000\000\004\042\000\000\004\065\000\000\004\101\000\000\004\102"
> +
>  test_done
> --
> gitgitgadget
>

Reply via email to