On Tue, Feb 23, 2016 at 7:50 PM, Kristian Fiskerstrand <k...@gentoo.org> wrote:
>
> On 02/24/2016 01:33 AM, Duncan wrote:
>>
>> IMO, what's actually happening here is the slow deprecation of
>> rsync mirrors in favor of git. I doubt they'd be created at all
>> if gentoo were
>
> I don't agree to this at all. For one thing git is very resource
> intensive compared to rsync mirroring,

Is this actually true? For the typical use case of daily or close to
daily updates I'd think that git would be much more efficient. rsync
has to traverse an entire directory tree (both client and server-side,
though of course either could have it cached), synchronize the metadata
for every file across the network to determine what has changed, and
then figure out what changed in each file and transfer it.

With a large git repository with only a few hundred new commits, the
client just tells the server what its last commit is, the server walks
back in history to find it, and then the server can quickly identify
all the new commits/trees/blobs and send just those. With the COW
design of git this is very efficient, since it does not require
traversing any subdirectory in which no files have changed. In the
degenerate case where nothing has changed, an rsync still needs to walk
the full tree and send a file list, while git just sends a commit ID
and terminates.

Now, for an infrequent sync (think months) where most of the tree has
changed, I could certainly buy that a webrsync would be far more
efficient for everybody.

And just like rsync, git is easy to mirror, with GitHub being an
example of a service that will mirror anybody's repo for free, and they
seem to have no end to their bandwidth (though I've found that pushing
a full historical gentoo git tree to them does make them choke on it
for about 30 minutes before it shows up).

So, while I'll agree with the validity of your other points, I'd be
interested in actual data to back up the resource claim. I could see
that going either way, and the answer is likely to depend on how
well-optimized everything is. Linus did a pretty good job with git.

> For one thing we can't expect users to keep an up
> to date copy of all gentoo developer's OpenPGP keys to verify each git
> commit, additionally this will cause issues with retirement and
> similar situations (certificate revocation, subkey rotations, expiries).

Well, we could do something (eventually) to make tracking keys easier,
but I'll still buy that the thick manifests are more secure. Git commit
signatures are only bound to their contents with SHA-1. I get that
nobody has demonstrated a practical attack on that, but I think most
crypto experts wouldn't heartily endorse the design.

Keep in mind that we do have git mirrors that include metadata/etc
hosted on GitHub. I know people have concerns with their software being
proprietary, but as far as syncing goes it is just a mirror. I doubt
most of us audit all the distfiles mirrors we use to make sure they're
only using FOSS ftp/http servers and so on. There really isn't any
reason that it couldn't be hosted on infra either, assuming they wanted
the extra load (and I don't see the point in it, since it is just a
mirror, and if it ever goes away it is trivial to just point the
scripts that generate it to push to some other mirror instead - git
itself is completely FOSS).

Again, I have nothing against devs maintaining rsync and changelogs,
and users making use of them. I just don't see it as the end of the
world if devs decide to stop taking care of them.

--
Rich
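
P.S. To make the point about git not having to traverse unchanged
subdirectories a bit more concrete, here is a toy sketch in Python. It
is only an illustration of comparing content-hashed trees, not git's
real object format or fetch protocol. The names `Node`, `build`, and
`diff`, the hashing layout, and the sample tree are all made up for the
example.

```python
import hashlib


class Node:
    """Toy stand-in for a git object: a blob (bytes) or a tree (dict of Nodes)."""

    def __init__(self, content):
        self.content = content
        if isinstance(content, bytes):
            data = b"blob\0" + content
        else:
            # A tree's id depends only on its entry names and child ids.
            data = b"tree\0" + b"".join(
                name.encode() + b"\0" + child.oid.encode() + b"\n"
                for name, child in sorted(content.items())
            )
        self.oid = hashlib.sha1(data).hexdigest()


def build(spec):
    """Turn nested dicts/bytes into Nodes, hashing bottom-up once."""
    if isinstance(spec, bytes):
        return Node(spec)
    return Node({name: build(child) for name, child in spec.items()})


def diff(old, new, path="", changed=None, stats=None):
    """Collect changed paths, descending only where the ids differ."""
    changed = [] if changed is None else changed
    stats = {"visited": 0} if stats is None else stats
    stats["visited"] += 1
    if old is not None and new is not None and old.oid == new.oid:
        return changed, stats  # identical subtree: skip everything below it
    old_kids = old.content if old is not None and isinstance(old.content, dict) else None
    new_kids = new.content if new is not None and isinstance(new.content, dict) else None
    if old_kids is None or new_kids is None:
        changed.append(path)  # added, removed, or modified entry
        return changed, stats
    for name in sorted(old_kids.keys() | new_kids.keys()):
        diff(old_kids.get(name), new_kids.get(name), f"{path}/{name}", changed, stats)
    return changed, stats


old = build({
    "app-editors": {"vim": b"v1", "nano": b"v1"},
    "sys-apps": {"portage": b"v1", "openrc": b"v1"},
    "dev-lang": {"python": b"v1"},
})
new = build({
    "app-editors": {"vim": b"v2", "nano": b"v1"},
    "sys-apps": {"portage": b"v1", "openrc": b"v1"},
    "dev-lang": {"python": b"v1"},
})

changed, stats = diff(old, new)
print(changed, stats["visited"])  # ['/app-editors/vim'] 6

same, same_stats = diff(old, old)
print(same, same_stats["visited"])  # [] 1
```

The toy tree has 9 nodes in total, but only 6 are touched to find the
one changed file, and a single comparison is enough when nothing has
changed. A tool that compares file lists would have to look at every
entry in both cases.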