On Thu, 26 Dec 2019 16:13:33 +0000 "goleo ." <goleo...@gmail.com> wrote:
> I was wondering how much space distfiles on "ftp" take, so because
> I couldn't see that in my web browser clearly, I downloaded the page
> https://ftp.openbsd.org/pub/OpenBSD/distfiles/ as distfiles.txt

With wget, you can download the HTML of a web page and also recurse
into the links within it.

$ wget -r -l 0 -A '*.html' --no-parent -O everything.html https://ftp.openbsd.org/pub/OpenBSD/distfiles/

This command recurses to unlimited depth (-l 0) without going up into
the parent directory (--no-parent), downloads only other .html files
(from which more links can be acquired), and concatenates everything
into a single "everything.html" file.

After running for a few minutes, with just ~1.7MiB of HTML downloaded,
it tried to recurse into a lot of non-existent directories, so I cut it
short there. The figure may not be perfect.

$ grep -E '[0-9]$' everything.html | sed 's|.* \([0-9]*\)$|\1|' | awk '{sum+=$1} END{print sum / 1024 / 1024}'
65629

The sum of all file sizes, which are listed in kibibytes, divided by
1024^2 to turn it into gibibytes, comes to 65629 gibibytes, or about
64 tebibytes.

This number seems a little absurd, and I'm not sure whether I made a
mistake. It does not seem completely implausible either, however: the
tree does have files dating all the way back to 1990.

https://ftp.openbsd.org/pub/OpenBSD/distfiles/ja-fonts/
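
One thing worth double-checking is the unit of the size column, since the whole result depends on it. Below is a rough sketch, not a tested replacement for the pipeline above: a small helper (the name sum_sizes is my own, purely hypothetical) that sums the trailing integer on each input line and prints the total under both assumptions, sizes in bytes and sizes in kibibytes. It assumes the size is the last whitespace-separated field of each listing line.

```shell
#!/bin/sh
# sum_sizes: hypothetical helper, reads a directory listing on stdin.
# Assumption: the file size is the last whitespace-separated field
# of each listing line and is a plain non-negative integer.
sum_sizes() {
    awk '
        # Count only lines whose last field is all digits.
        $NF ~ /^[0-9]+$/ { sum += $NF }
        END {
            # %.0f avoids integer overflow in awk variants with a narrow %d.
            printf "raw sum: %.0f\n", sum
            # Total in GiB if the listed sizes are bytes.
            printf "if bytes: %.2f GiB\n", sum / 1024 / 1024 / 1024
            # Total in GiB if the listed sizes are KiB.
            printf "if KiB: %.2f GiB\n", sum / 1024 / 1024
        }
    '
}
```

It can be run as "sum_sizes < everything.html". By the same arithmetic, if the listing turned out to be in bytes rather than kibibytes, the 65629 printed above would be mebibytes, i.e. roughly 64 gibibytes in total. I have not checked which unit the listing actually uses.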