On 8. 6. 2026 17:18, Sean McBride wrote:

On 5 Jun 2026, at 15:46, Johan Corveleyn wrote:

        It detected identical files not only in our /branches vs
        /trunk but also among the pristine copies

    I'm curious about this last statement. What do you mean with "among
    the pristine copies"? Among themselves in one single pristine area of
    a WC? That would be ... unexpected, since they use the SHA-1 hash as
    their filename. So unless you have SHA-1 collisions in there, there
    should be no duplicate files in there.

For example, given this folder:

% ls -la
drwxr-xr-x   7 sean  staff  224 Feb 15  2020 .svn
drwxr-xr-x   5 sean  staff  160 Feb 19  2025 branches
drwxr-xr-x  25 sean  staff  800 Feb 11 14:56 trunk

I do a dry-run with the tool and it outputs:

using ./.svn/pristine/1e/1eb3de0f0fd4c1b67327614eb3db918f1a97e36d.svn-base as the clone origin (first seen)
cloning to ./branches/2.5.x/Docs/UserManuals/Image.png
cloning to ./trunk/Docs/UserManuals/Image.png

In other words, it has found 3 identical files. It does not matter that their file names differ; their contents are the same.

These 3 copies get reduced to 1 copy and 2 "pointers".
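
As a rough illustration of how identical files can be found regardless of name, here is a minimal Python sketch that groups files by the SHA-1 of their contents. It is not the tool used above: the function names `sha1_of_file` and `find_identical_files` are made up for this example, SHA-1 is chosen only because the pristine store names its files that way, and the sketch only reports duplicates without cloning anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file's contents, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(root):
    """Group regular files under root by content hash (hypothetical helper).

    Returns a dict mapping each SHA-1 digest to the list of paths that
    share it; digests that belong to a single file are left out.
    """
    groups = defaultdict(list)
    # pathlib's rglob("*") also descends into dot-directories such as .svn.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and not path.is_symlink():
            groups[sha1_of_file(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    for digest, paths in find_identical_files(".").items():
        print(digest)
        for path in paths:
            print("   ", path)
```

Run in the folder shown above, a sketch like this would be expected to put the pristine `.svn-base` file and the two `Image.png` working files in one group, since only contents are compared. A real deduplication tool may well compare sizes or raw bytes instead of hashing.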


OK, that makes sense, there's only one pristine file but two copies in the WC. That's not surprising, given that you check out the whole tree, trunk and branches included. Also it means that our pristine deduplication works. :)

-- Brane
