Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-09 Thread Elijah Newren
On Thu, Aug 9, 2018 at 2:59 PM Jeff King wrote: > On Thu, Aug 09, 2018 at 02:53:42PM -0700, Elijah Newren wrote: > > > On Thu, Aug 9, 2018 at 2:44 PM Jeff King wrote: > > > > The error message isn't quite as good, but does the user really need > > > > all the names of the file? If so, we gave th

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-09 Thread Junio C Hamano
Elijah Newren writes: > A possibly crazy idea: Don't bother reporting the other filename; just > report the OID instead. > > "Error: Foo.txt cannot be checked out because another file with hash > is in the way." Maybe even add a hint for the user: "Run > `git ls-files -s` to see all files a
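
A minimal Python sketch of how a user might act on such a hint, assuming they already know the OID from the error message. It parses the text printed by `git ls-files -s`, where each line has the form `<mode> <object> <stage><TAB><path>`, and returns every path recorded with that OID. The helper name `paths_with_oid` is hypothetical and not part of Git; paths that Git prints in quoted form (unusual characters, without `-z`) are not unquoted here.

```python
def paths_with_oid(ls_files_output, oid):
    """Return paths from `git ls-files -s` output whose object ID equals oid.

    Hypothetical helper for illustration; not part of Git.
    """
    matches = []
    for line in ls_files_output.splitlines():
        if not line.strip():
            continue
        # Each line is "<mode> <object> <stage>\t<path>".
        meta, _, path = line.partition("\t")
        _mode, obj, _stage = meta.split()
        if obj == oid:
            matches.append(path)
    return matches
```

If several unrelated files share the same content, all of them are listed, so the user still has to pick out the names that differ only in case.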

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-09 Thread Jeff King
On Thu, Aug 09, 2018 at 02:53:42PM -0700, Elijah Newren wrote: > On Thu, Aug 9, 2018 at 2:44 PM Jeff King wrote: > > > The error message isn't quite as good, but does the user really need > > > all the names of the file? If so, we gave them enough information to > > > figure it out, and this is

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-09 Thread Elijah Newren
On Thu, Aug 9, 2018 at 2:44 PM Jeff King wrote: > > The error message isn't quite as good, but does the user really need > > all the names of the file? If so, we gave them enough information to > > figure it out, and this is a really unusual case anyway, right? > > Besides, now we're back to line

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-09 Thread Jeff King
On Thu, Aug 09, 2018 at 02:40:58PM -0700, Elijah Newren wrote: > > I worry that the false positives make this a non-starter. I mean, if > > clone creates files 'A' and 'B' (both equal) and then tries to create > > 'b', would the collision code report that 'b' collided with 'A' because > > that w

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-09 Thread Elijah Newren
On Thu, Aug 9, 2018 at 2:14 PM Jeff Hostetler wrote: > On 8/9/2018 10:23 AM, Jeff King wrote: > > On Wed, Aug 08, 2018 at 05:41:10PM -0700, Junio C Hamano wrote: > >> If we found that there is something when we tried to write out > >> "Foo.txt", if we open "Foo.txt" on the working tree and hash-ob
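
The hash-the-existing-file idea quoted above can be sketched in Python as follows. This is an illustration under stated assumptions, not Git's implementation: the index is modelled as a plain dict from path to OID, and `blob_oid` and `candidates_for` are hypothetical names. The only Git fact relied on is that a SHA-1 blob ID is the hash of `blob <size>\0` followed by the content.

```python
import hashlib


def blob_oid(data: bytes) -> str:
    """SHA-1 blob ID: hash of b"blob <size>\\0" followed by the content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def candidates_for(on_disk: bytes, index: dict, new_path: str) -> list:
    """Index paths whose OID matches the file already occupying new_path.

    `index` is a hypothetical stand-in mapping path -> OID.
    """
    oid = blob_oid(on_disk)
    return sorted(p for p, o in index.items() if o == oid and p != new_path)
```

The false positives raised in this thread show up directly: if `A` and `B` have identical content and `b` collides on disk, both `A` and `B` come back as candidates even though only `B` is actually in the way.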

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-09 Thread Jeff King
On Thu, Aug 09, 2018 at 05:14:16PM -0400, Jeff Hostetler wrote: > > Clever. You might still run into false positives when there is > > duplicated content in the repository (especially, say, zero-length > > files). But the fact that you only do the hashing on known duplicates > > helps with that.

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-09 Thread Jeff Hostetler
On 8/9/2018 10:23 AM, Jeff King wrote: > On Wed, Aug 08, 2018 at 05:41:10PM -0700, Junio C Hamano wrote: > > > If we have an equivalence-class hashmap and feed it inodes (or again, > > > some system equivalent) as the keys, we should get buckets of > > > collisions. > > > > I guess one way to get "some system equival

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-09 Thread Jeff King
On Wed, Aug 08, 2018 at 05:41:10PM -0700, Junio C Hamano wrote: > > If we have an equivalence-class hashmap and feed it inodes (or again, > > some system equivalent) as the keys, we should get buckets of > > collisions. > > I guess one way to get "some system equivalent" that can be used as > the
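
A rough Python sketch of the equivalence-class idea quoted above, assuming a platform where `os.lstat()` reports meaningful `st_dev` and `st_ino` values. The function name `group_by_inode` is hypothetical. On a case-insensitive filesystem, two index paths that differ only in case resolve to the same file and therefore land in the same bucket.

```python
import os
from collections import defaultdict


def group_by_inode(paths):
    """Bucket paths by (device, inode); hypothetical helper, not Git code."""
    buckets = defaultdict(list)
    for path in paths:
        st = os.lstat(path)
        buckets[(st.st_dev, st.st_ino)].append(path)
    # Only buckets holding more than one name indicate a collision.
    return [sorted(names) for names in buckets.values() if len(names) > 1]
```

Hard links also share an inode, so on a case-sensitive filesystem this reports them as well; right after a fresh clone that should not occur, but it is an assumption of the sketch.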

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-08 Thread Junio C Hamano
Jeff King writes: > I think we really want to avoid doing that normalization ourselves if we > can. There are just too many filesystem-specific rules. Exactly; not having to learn these rules is the major (if not whole) point of the "let checkout notice the collision and then deal with it" appro

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-08 Thread Jeff King
On Wed, Aug 08, 2018 at 03:48:04PM -0400, Jeff Hostetler wrote: > > ce_match_stat() may not be a very good measure to see if two paths > > refer to the same file, though. After a fresh checkout, I would not > > be surprised if two completely unrelated paths have the same size > > and have same mt
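
The concern about stat-based matching can be illustrated with a small Python sketch. It is a deliberately simplified stand-in that compares only size and modification time; Git's real `ce_match_stat()` consults more stat fields than this, and the name `same_size_and_mtime` is hypothetical.

```python
import os


def same_size_and_mtime(path_a, path_b):
    """Simplified stand-in for a stat comparison: size and mtime only."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_size, a.st_mtime_ns) == (b.st_size, b.st_mtime_ns)
```

Two unrelated files of equal length that are written within the same timestamp granularity compare as equal here, even though their contents differ, which is the kind of false positive described in the quoted text.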

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-08 Thread Jeff Hostetler
On 8/7/2018 3:31 PM, Junio C Hamano wrote: > Nguyễn Thái Ngọc Duy writes: > > > One nice thing about this is we don't need platform specific code for > > detecting the duplicate entries. I think ce_match_stat() works even > > on Windows. And it's now equally expensive on all platforms :D > > ce_match_

Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-07 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy writes: > One nice thing about this is we don't need platform specific code for > detecting the duplicate entries. I think ce_match_stat() works even > on Windows. And it's now equally expensive on all platforms :D ce_match_stat() may not be a very good measure to see if

[PATCH v2] clone: report duplicate entries on case-insensitive filesystems

2018-08-07 Thread Nguyễn Thái Ngọc Duy
Paths that only differ in case work fine on case-sensitive filesystems, but if those repos are cloned onto a case-insensitive one, you'll get problems. The first thing to notice is that "git status" will never be clean, with no indication of what exactly is "dirty". This patch helps the situation a bit by
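
To make the problem concrete, here is a small Python sketch that lists which paths in a tree would collide when only case differs. It is not what the patch does: it uses `str.casefold()` as an approximation, whereas real filesystems apply their own folding and normalization rules, and replies in this thread argue against Git doing such normalization itself. The name `case_colliding_paths` is hypothetical.

```python
from collections import defaultdict


def case_colliding_paths(paths):
    """Group paths that become identical after Unicode case folding.

    Approximation only; actual filesystem rules may differ.
    """
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

For a repository containing `Foo.txt`, `foo.txt` and `README`, the first two end up in one group, and only one of them can exist in a case-insensitive working tree.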