On 26/10/2024 17.25, Mike Gilbert wrote:
On Sat, Oct 26, 2024 at 1:03 AM Robin H. Johnson <robb...@gentoo.org> wrote:
On Tue, Oct 22, 2024 at 11:24:29PM -0400, Eli Schwartz wrote:
Apache has a mirror network, which only covers the most recent release
of any given package. They also have an additional site which does not
appear to be a CDN, and is throttled and can maybe ban you if you use it
too much. Unfortunately, it is also the ONLY way to actually get
historic releases of many packages.

Use it, and use it last -- after every other mirror has been tried,
which should handle latest releases. In combination with GENTOO_MIRRORS
this should ensure that users can actually download software when
needed, without running afoul of throttling.
This does not actually do it "last" as you claim.

Portage shuffles the list of thirdpartymirrors:
https://gitweb.gentoo.org/proj/portage.git/tree/lib/portage/package/ebuild/fetch.py#n1140
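
As a rough illustration of why shuffling defeats a "last" position (this is not Portage's actual code; the two example.org mirrors are made up, and only the archive URL comes from this thread):

```python
import random

# Hypothetical mirror list; the archive entry is deliberately written last.
APACHE_MIRRORS = [
    "https://mirror-a.example.org/apache/",
    "https://mirror-b.example.org/apache/",
    "https://archive.apache.org/dist/",
]


def shuffled(uris, rng):
    """Return a shuffled copy of the list, loosely mimicking the linked code."""
    uris = list(uris)
    rng.shuffle(uris)
    return uris


def archive_last_fraction(trials=10_000, seed=0):
    """Fraction of shuffles in which the archive URL still ends up last."""
    rng = random.Random(seed)
    archive = APACHE_MIRRORS[-1]
    hits = sum(shuffled(APACHE_MIRRORS, rng)[-1] == archive for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print(archive_last_fraction())
```

With three entries, the archive stays last in only about one third of the shuffles, so its position in the file gives no ordering guarantee.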

While it increases mirror burden, this should likely be done as a
distinct thirdpartymirror:
apache-historical https://archive.apache.org/dist/

And that gets used in ebuilds when distfiles fall off the main mirrors
[until such time as strictly ordered behavior is available].

I assume Eli was trying to avoid having to update old ebuilds.

I don't really see any point in adding an apache-historical entry in
thirdpartymirrors;

Such an entry would probably violate the "no single URL entries in mirrors" rule anyway.

just inline https://archive.apache.org/dist/ into
SRC_URI if the ebuild requires it.

I assume the problem is that the ebuild may not require it when it is written, but only at some later point in its lifetime.

TeX packaging is in a similar situation.


I wonder if we could introduce priority groups into the mirror entry specification. As an example:

foo https://mirror.foo.org | https://ftp.baz.de/foo … | https://archive.mirror.foo.org

The priority groups are separated by '|'.

Here, the first priority group is a DNS bouncer that delegates to your nearest mirror. Such bouncers are common for some archives.

The second priority group contains a selected set of real mirrors. It is helpful in cases where the DNS bouncer is malfunctioning. Those events are rare, but I believe there have been some reports.

The last priority group consists of the URL of the long-term archive, which holds all artifacts ever released but is rate-limited.

The package manager would first select one or more entries from the first priority group before proceeding to the next group(s).
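
A minimal sketch of how such an entry could be parsed and ordered, just to make the idea concrete. Nothing here is existing Portage code: the function names, the example.org hosts and the choice to try every entry of a group (in random order) before moving on are all my assumptions.

```python
import random

# Hypothetical entry in the proposed syntax; all hosts are placeholders.
EXAMPLE = (
    "foo https://mirror.foo.example.org"
    " | https://ftp1.foo.example.org/foo https://ftp2.foo.example.org/foo"
    " | https://archive.foo.example.org"
)


def parse_mirror_entry(line):
    """Split 'name url ... | url ... | url' into (name, list of URL groups)."""
    fields = line.split(None, 1)
    name = fields[0]
    rest = fields[1] if len(fields) > 1 else ""
    # '|' separates priority groups; whitespace separates URLs inside a group.
    groups = [group.split() for group in rest.split("|")]
    return name, [group for group in groups if group]


def fetch_order(groups, rng=random):
    """Shuffle within each group, but keep the groups in priority order."""
    ordered = []
    for group in groups:
        candidates = list(group)
        rng.shuffle(candidates)
        ordered.extend(candidates)
    return ordered


if __name__ == "__main__":
    name, groups = parse_mirror_entry(EXAMPLE)
    for uri in fetch_order(groups):
        print(name, uri)
```

The point is that load is still spread within a group, while the rate-limited archive in the last group is only contacted once everything before it has failed.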

I believe Apache and TeX packages would benefit from this.

Could this be something for EAPI 9? (I think it needs a new EAPI, but I'm happy to stand corrected.)

- Flow




