Using HTTP for this is inefficient. It makes for slower file transfers, because you keep rerunning path MTU probes and TCP slow start on every new connection. It also means extra socket handles being opened and closed.
Between major mirrors you don't have proxies (maybe NAT and a firewall with stateful multilayer inspection, but that's different). So FTP is available, and it is a more suitable protocol for bulk file transfer.

In the case of CPAN you don't have to go the log route. If the mirror knows its last sync time, it can use rsync to get the modlist et al., import them into SQLite, and then query by date to come up with the list of files to fetch, via FTP.

------Original Message------
From: Aristotle Pagaltzis
To: module-authors@perl.org
Sent: Mar 28, 2010 10:13 PM
Subject: Re: Trimming the CPAN - "Automatic Purging"

* Nicholas Clark <n...@ccl4.org> [2010-03-28 18:20]:
> I'm missing something here, I suspect.

Yes, you are.

> How can HTTP be more efficient than rsync? The only obvious
> method to me of mirroring a CPAN site by HTTP is to instruct
> a client (such as wget) to get it all.

As Arthur has repeatedly pointed this out: by first fetching a transaction log from the remote end, then playing it forward from the last synch point. (This is essentially what CPAN::Mini already does.)

It’s not very efficient protocol-wise, but it sure is rather cheap in terms of server I/O.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
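
P.S. To make the SQLite idea above a bit more concrete, here is a rough sketch in Python. It assumes the index files pulled down by rsync have already been parsed into (path, mtime) pairs; that parsing step, the table layout, the function names and the sample paths are all made up for illustration, not anything CPAN defines. It only works out which files changed since the last sync and leaves the actual FTP fetch aside.

```python
import sqlite3


def build_index(rows):
    """Load (path, mtime) pairs into an in-memory SQLite table.

    `rows` stands in for whatever the mirror parsed out of the index
    files it fetched with rsync (hypothetical input format).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER NOT NULL)"
    )
    # INSERT OR REPLACE keeps only the newest entry if a path repeats.
    conn.executemany(
        "INSERT OR REPLACE INTO files (path, mtime) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn


def files_since(conn, last_sync):
    """Return paths modified strictly after `last_sync` (epoch seconds)."""
    cur = conn.execute(
        "SELECT path FROM files WHERE mtime > ? ORDER BY mtime, path",
        (last_sync,),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    # Hypothetical sample data, not real CPAN paths or timestamps.
    sample = [
        ("authors/id/E/EX/EXAMPLE/Foo-1.0.tar.gz", 1269700000),
        ("authors/id/E/EX/EXAMPLE/Foo-1.1.tar.gz", 1269800000),
        ("authors/id/E/EX/EXAMPLE/Bar-0.2.tar.gz", 1269900000),
    ]
    conn = build_index(sample)
    for path in files_since(conn, last_sync=1269750000):
        print(path)  # each of these would then be fetched via FTP
```

The mirror would store the time of each successful run and pass it in as last_sync the next time around.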