On Wed, Feb 21, 2018 at 12:45:24PM -0700, Bob Proulx wrote:
> arn...@skeeve.com wrote:
> > I would have thought most people would do 'git pull' instead of 'git clone'
> > and that pulling wouldn't be quite as intensive, but who knows...
>
> I think that these days most people do not keep persistent state. The
> cynic in me assumes they are working from their phone. Instead they
> clone a new repository, do something, then toss it away. Even for
> continuous integration systems many people do not use a local cache.
> I can't prove this but it is just my observation of the way people
> work around me in real life. This makes large projects very I/O
> intensive when things happen.

As a relative young'un with a smartphone, I can say that doing
development on my phone sounds like a bad time! I typically keep full
clones around, even for one-off things. Source code is small and
gigabytes are cheap.

> I was actually using clone generically there. But if people were
> pulling then I would have expected the processes to have finished
> quickly. But we do periodically see large repositories getting cloned
> at the same time due to project announcements all run at the same time
> and take up time. I don't know which projects were getting pulled or
> cloned since we do not log that information. But previously Emacs has
> been the cause of it because the repos is large (and originally was
> larger) making cloning need a lot of bandwidth. When Emacs announces
> a new release there is usually a spike. So using it as an example
> without saying that was the project this time.

I imagine there are CI systems that do full clones and then check out
the commit they want. Of course this is a big waste of bandwidth, so in
Guix we are looking into the possibility of shallow cloning for related
use cases. This may be less efficient on the server; I guess it depends
on the server implementation. Still, we favor release tarballs because
they are small and the required software is much less complex.
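
To make the difference concrete, here is a rough sketch of the two
approaches as git command lines, built in Python. It is only an
illustration, not what Guix or any particular CI system actually runs.
The function names, the URL, the commit and the destination directory
are all placeholders I made up. Note also that fetching a single commit
by hash only works if the server permits it, which I am assuming here.

```python
import subprocess
from typing import List


def full_clone_cmds(url: str, commit: str, dest: str) -> List[List[str]]:
    """Download the entire history, then check out one commit."""
    return [
        ["git", "clone", url, dest],
        ["git", "-C", dest, "checkout", commit],
    ]


def shallow_fetch_cmds(url: str, commit: str, dest: str) -> List[List[str]]:
    """Fetch only the wanted commit, with history truncated to depth 1.

    Assumption: the server allows fetching this commit directly.
    """
    return [
        ["git", "init", dest],
        ["git", "-C", dest, "remote", "add", "origin", url],
        ["git", "-C", dest, "fetch", "--depth", "1", "origin", commit],
        ["git", "-C", dest, "checkout", "FETCH_HEAD"],
    ]


def run_all(cmds: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder values; only print the commands, do not touch the network.
    for cmd in shallow_fetch_cmds("https://example.org/repo.git", "COMMIT", "work"):
        print(" ".join(cmd))
```

Passing the result of either function to run_all() would actually
execute the commands. The shallow variant transfers far less data, at
the cost of whatever extra work the server does to compute the
truncated pack.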