On 8. 6. 2026 17:18, Sean McBride wrote:
On 5 Jun 2026, at 15:46, Johan Corveleyn wrote:
It detected identical files not only between our /branches and
/trunk but also among the pristine copies.
I'm curious about this last statement. What do you mean by "among
the pristine copies"? Among themselves in one single pristine area of
a WC? That would be ... unexpected, since they use the SHA-1 hash as
their filename. So unless you have SHA-1 collisions in there, there
should be no duplicate files in there.
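To illustrate why a single pristine area should not contain duplicates, here is a minimal Python sketch of content-addressed naming. The function name pristine_path is hypothetical, and the layout (a subdirectory named after the first two hex digits, plus a .svn-base suffix) is inferred from the path quoted in this thread, not taken from Subversion's source code.

```python
import hashlib
from pathlib import PurePosixPath


def pristine_path(content: bytes, wc_root: str = ".") -> PurePosixPath:
    """Return where a content-addressed store would keep `content`.

    Assumption: the layout mirrors the path seen in this thread,
    .svn/pristine/<first two hex digits>/<sha1>.svn-base
    """
    digest = hashlib.sha1(content).hexdigest()
    return (
        PurePosixPath(wc_root)
        / ".svn"
        / "pristine"
        / digest[:2]
        / f"{digest}.svn-base"
    )


# Identical content always maps to the same file name, so a second
# copy of the same bytes cannot occupy a second slot in the store.
same_a = pristine_path(b"identical bytes")
same_b = pristine_path(b"identical bytes")
other = pristine_path(b"different bytes")
print(same_a == same_b, same_a == other)
```

Because the name is derived from the content, two entries in one store can only hold the same bytes if their names are also the same, which means they are the same entry.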
For example, given this folder:
% ls -la
drwxr-xr-x   7 sean  staff  224 Feb 15  2020 .svn
drwxr-xr-x   5 sean  staff  160 Feb 19  2025 branches
drwxr-xr-x  25 sean  staff  800 Feb 11 14:56 trunk
I do a dry-run with the tool and it outputs:
using ./.svn/pristine/1e/1eb3de0f0fd4c1b67327614eb3db918f1a97e36d.svn-base as the clone origin (first seen)
cloning to ./branches/2.5.x/Docs/UserManuals/Image.png
cloning to ./trunk/Docs/UserManuals/Image.png
In other words, it has found 3 identical files. It does not matter that
their file names are different; their file contents are the same.
These 3 copies get reduced to 1 copy and 2 "pointers".
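The content-based matching described above can be sketched roughly as follows. This is only an illustration in Python, not the tool's actual algorithm: the names sha1_of_file and find_identical_files are hypothetical, grouping by SHA-1 is an assumption, and a real deduplicator would probably also compare sizes or bytes before cloning.

```python
import hashlib
import os
from collections import defaultdict


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in chunks so large files are not loaded whole."""
    h = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_identical_files(root: str) -> list[list[str]]:
    """Group regular files under `root` by content, ignoring their names.

    Returns only the groups that contain more than one path.
    """
    by_digest: dict[str, list[str]] = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_digest[sha1_of_file(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    for group in find_identical_files("."):
        origin, *clones = group
        print(f"origin: {origin}")
        for clone in clones:
            print(f"  identical: {clone}")
```

In this sketch the first path found in each group stands in for the "clone origin"; the traversal order here comes from os.walk and need not match the order the tool uses. A working file matches its pristine copy byte for byte only when no keyword expansion or end-of-line translation applies, which would normally be the case for a binary file such as the PNG in the example.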
OK, that makes sense: there's only one pristine file but two copies in
the WC. That's not surprising, given that you check out the whole tree,
trunk and branches included. Also it means that our pristine
deduplication works. :)
-- Brane