On 20/11/2018 16:28, Nguyễn Thái Ngọc Duy wrote:
> Commit b878579ae7 (clone: report duplicate entries on case-insensitive
> filesystems - 2018-08-17) adds a warning to user when cloning a repo
> with case-sensitive file names on a case-insensitive file system. The
> "find duplicate file" check was doing by comparing inode number (and
> only fall back to fspathcmp() when inode is known to be unreliable
> because fspathcmp() can't cover all case folding cases).
>
> The inode check is very simple, and wrong. It compares between a
> 32-bit number (sd_ino) and potentially a 64-bit number (st_ino). When
> an inode is larger than 2^32 (which seems to be the case for APFS), it
> will be truncated and stored in sd_ino, but comparing with itself will
> fail.
>
> As a result, instead of showing a pair of files that have the same
> name, we show just one file (marked before the beginning of the
> loop). We fail to find the original one.
>
> The fix could be just a simple type cast (*)
>
> dup->ce_stat_data.sd_ino == (unsigned int)st->st_ino
>
> but this is no longer a reliable test, there are 4G possible inodes
> that can match sd_ino because we only match the lower 32 bits instead
> of full 64 bits.
>
> There are two options to go. Either we ignore inode and go with
> fspathcmp() on Apple platform. This means we can't do accurate inode
> check on HFS anymore, or even on APFS when inode numbers are still
> below 2^32.
>
> Or we just to to reduce the odds of matching a wrong file by checking
> more attributes, counting mostly on st_size because st_xtime is likely
> the same. This patch goes with this direction, hoping that false
> positive chances are too small to be seen in practice.
>
> While at there, enable the test on Cygwin (verified working by Ramsay
> Jones)
Well, no, I tested the previous version of this patch. However, this
patch also passes the test. (Note _test_ singular - in order to check
that this patch doesn't cause a regression I would need to run the
whole test-suite - that takes 3.5 hours, if I'm not doing anything
else!)
>
> (*) this is also already done inside match_stat_data()
>
> Reported-by: Carlo Arenas <care...@gmail.com>
> Helped-by: Ramsay Jones <ram...@ramsayjones.plus.com>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
> ---
> So I'm going with match_stat_data(). But I don't know, perhaps just
> ignoring inode (like Carlo's original patch) is safer/better?
>
> Tested on case-insensitive JFS on Linux. But I don't think it really
> matters because I'm not even sure if I could push inode above 2^32
> with this. Hacking JFS for this test sounds fun, but no time for that.
>
> entry.c | 4 ++--
> t/t5601-clone.sh | 2 +-
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/entry.c b/entry.c
> index 5d136c5d55..0a3c451f5f 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -404,7 +404,7 @@ static void mark_colliding_entries(const struct checkout
> *state,
> {
> int i, trust_ino = check_stat;
>
> -#if defined(GIT_WINDOWS_NATIVE)
> +#if defined(GIT_WINDOWS_NATIVE) || defined(__CYGWIN__)
I was a little curious about this (but couldn't be bothered actually
read the code, post-application), so I removed this hunk from the
patch, rebuilt and ran the test again: it _passed_ the test. :-D
So, ...
ATB,
Ramsay Jones
> trust_ino = 0;
> #endif
>
> @@ -419,7 +419,7 @@ static void mark_colliding_entries(const struct checkout
> *state,
> if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
> continue;
>
> - if ((trust_ino && dup->ce_stat_data.sd_ino == st->st_ino) ||
> + if ((trust_ino && !match_stat_data(&dup->ce_stat_data, st)) ||
> (!trust_ino && !fspathcmp(ce->name, dup->name))) {
> dup->ce_flags |= CE_MATCHED;
> break;
> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
> index f1a49e94f5..c28d51bd59 100755
> --- a/t/t5601-clone.sh
> +++ b/t/t5601-clone.sh
> @@ -628,7 +628,7 @@ test_expect_success 'clone on case-insensitive fs' '
> )
> '
>
> -test_expect_success !MINGW,!CYGWIN,CASE_INSENSITIVE_FS 'colliding file
> detection' '
> +test_expect_success !MINGW,CASE_INSENSITIVE_FS 'colliding file detection' '
> grep X icasefs/warning &&
> grep x icasefs/warning &&
> test_i18ngrep "the following paths have collided" icasefs/warning
>