On Tue, Nov 20, 2018 at 10:06:57AM -0500, Derrick Stolee wrote: > On 11/20/2018 4:44 AM, Jeff King wrote: > > In cases like this I think it's often a good idea to have a perf test. > > Those are expensive anyway, and we'd have the double benefit of > > exercising the code and showing off the performance improvement. But the > > delta-island code only makes sense on a very specialized repo: one where > > you have multiple related but diverging histories. You could simulate > > that by picking two branches in a repository, but the effect is going to > > be miniscule. > > The changes in this series look correct. Too bad it is difficult to test. > > Perhaps we should add a performance test for the --delta-islands check that > would trigger the failure (and maybe a clone afterwards). There are lots of > freely available forks of git.git that present interesting fork structure. > Here are three that would suffice to make this interesting: > > https://github.com/git/git > https://github.com/git-for-windows/git > https://github.com/microsoft/git > > Of course, running it on a specific repo is up to the person running the > perf suite.
I hadn't really considered the possibility of reconstructing a fork network repository from public repos. It probably would be possible to include a script which does so, although: - I suspect it's going to be pretty expensive. We can use --reference to reduce the size of subsequent clones, but just the repacks you have to do to assemble the final shared repo can get pretty expensive. - That's 3 forks. Our real-world case has over 4000. I'm not sure of the size of the effects for smaller cases. So in short, I think it's an interesting avenue to explore, but it might turn out to be a dead-end. I'm not going to prioritize it right now, but if somebody wants to play with it, I'd be happy to look over the results. -Peff