Thanks Paul, but I'm having a hard time finding the precise version I'd like to archive on any ftp mirror. My scrape is actually working correctly now, though, since I added a sleep in there -- the source and machine-installation instructions are tidily tucked away in separate directories, with names, locations, and success/failure logged to a key-value (dictionary-ish, really) text file. I understand that an ftp request is more civilized, but this scrape is quite convenient for me. If it would be more palatable to the community, I can increase the sleep time in the loop to a couple of minutes or more, throw it on one of my Raspberry Pis, and forget about it for a while, since I'm not in a major hurry.
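
For anyone curious, the loop is nothing fancy. Here is a minimal sketch of the idea in Python (the URL, file names, and paths are placeholders for illustration, not my actual setup):

    import os
    import time
    import urllib.request

    # Placeholder values -- not the real mirror or directory layout.
    BASE_URL = "https://example.org/archive/"
    FILES = ["foo-1.0-source.tar.gz", "foo-1.0-install.txt"]
    SLEEP_SECONDS = 120  # a couple of minutes between requests
    LOG_PATH = "scrape-log.txt"

    os.makedirs("downloads", exist_ok=True)
    for name in FILES:
        dest = os.path.join("downloads", name)
        try:
            urllib.request.urlretrieve(BASE_URL + name, dest)
            status = "success"
        except OSError:
            status = "failure"
        # Append one key-value line per file: name, location, outcome.
        with open(LOG_PATH, "a") as log:
            log.write(f"name={name} location={dest} status={status}\n")
        time.sleep(SLEEP_SECONDS)  # be polite to the server

Bumping SLEEP_SECONDS higher is a one-line change, so letting it run slowly on a Pi is no trouble.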
Have a great Monday folks,
John

On Sat, Jun 12, 2021 at 5:19 PM Paul Wise <p...@debian.org> wrote:

> On Sat, Jun 12, 2021 at 8:15 PM John E Petersen wrote:
>
> > If I find it is possible to simply download the entire collection,
> > without having to host a mirror, I may very well go that route.
>
> That is definitely possible, there are two sides to every Debian
> mirror: 1) downloading Debian 2) making the files available on the
> web. The second part is definitely optional and many Debian folks do
> just the first part in order to serve their personal machines with
> Debian packages.
>
> > If I continue the scraping route, would adding wait time in my loop
> > between downloads make my repeated access less of a problem? I would
> > like to let it run until it is finished. It is tedious to restart my
> > scrape periodically.
>
> Please use the ftpmirror method recommended by Étienne, it is more
> likely to produce a correct result than scraping and much less likely
> to get blocked. The Debian archive is only updated every six hours, so
> it would be a waste of bandwidth to update more often than that.
>
> --
> bye,
> pabs
>
> https://wiki.debian.org/PaulWise