Thanks Paul, but I'm having a hard time finding the precise version I would
like to archive on any ftp mirror. My scrape is actually working correctly
now that I've added a sleep to the loop: the source and machine-installation
instructions are tidily tucked away in separate directories, with names,
locations, and success/failure appended to a key=value text file (more of a
dictionary, really). I get that an ftp request is more civilized, but this
scrape is quite convenient for me. If it makes things more palatable to the
community, I can increase the sleep time in the loop to a couple of minutes
or more, throw it on one of my Raspberry Pis, and forget about it for a
while, since I'm not in a major hurry.
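
For the curious, the loop is shaped roughly like this (the package list,
mirror URL, and directory names below are placeholders, not the real ones):

import time
import urllib.request
from pathlib import Path

# Placeholder inputs; the real package list and mirror URL differ.
PACKAGES = ["example-pkg_1.0.orig.tar.gz", "example-pkg_1.0_amd64.deb"]
BASE_URL = "https://mirror.example.org/debian/pool/main/e/example-pkg/"

with Path("scrape-status.log").open("a") as log:
    for name in PACKAGES:
        # Source tarballs and installation files go to separate directories.
        subdir = "source" if name.endswith(".tar.gz") else "install"
        dest = Path(subdir) / name
        dest.parent.mkdir(exist_ok=True)
        try:
            urllib.request.urlretrieve(BASE_URL + name, str(dest))
            log.write(f"{name}={dest} status=ok\n")
        except OSError as exc:
            log.write(f"{name}={dest} status=failed ({exc})\n")
        log.flush()
        time.sleep(120)  # two minutes between requests

Stretching that sleep out further for the Pi is a one-character change, so
it's no trouble at all.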

Have a great Monday, folks,
John

On Sat, Jun 12, 2021 at 5:19 PM Paul Wise <p...@debian.org> wrote:

> On Sat, Jun 12, 2021 at 8:15 PM John E Petersen wrote:
>
> > If I find it is possible to simply download the entire collection,
> > without having to host a mirror, I may very well go that route.
>
> That is definitely possible; there are two sides to every Debian
> mirror: 1) downloading Debian, and 2) making the files available on the
> web. The second part is optional, and many Debian folks do
> just the first part in order to serve their personal machines with
> Debian packages.
>
> > If I continue the scraping route, would adding wait time in my loop
> > between downloads make my repeated access less of a problem? I would like
> > to let it run until it is finished. It is tedious to restart my scrape
> > periodically.
>
> Please use the ftpmirror method recommended by Étienne; it is more
> likely to produce a correct result than scraping and much less likely
> to get blocked. The Debian archive is only updated every six hours, so
> it would be a waste of bandwidth to update more often than that.
>
> --
> bye,
> pabs
>
> https://wiki.debian.org/PaulWise
>
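
P.S. For anyone who finds this thread later: the mirror-pull route Paul
recommends above boils down to an rsync against a mirror's "debian" module,
run from cron no more often than every six hours. A stripped-down sketch
with a placeholder hostname and target directory (the real tooling, ftpsync
or debmirror, does a careful two-stage pass so clients never see a
half-updated archive, which a bare rsync like this does not):

import subprocess

MIRROR = "rsync://mirror.example.org/debian/"  # placeholder; pick a nearby mirror
DEST = "/srv/mirror/debian/"                   # local target directory

# One full pass; schedule from cron at most once every six hours.
subprocess.run(
    ["rsync", "--archive", "--delete", "--partial", "--timeout=300",
     MIRROR, DEST],
    check=True,
)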
