On 26/10/2024 17.25, Mike Gilbert wrote:
On Sat, Oct 26, 2024 at 1:03 AM Robin H. Johnson <robb...@gentoo.org> wrote:
On Tue, Oct 22, 2024 at 11:24:29PM -0400, Eli Schwartz wrote:
Apache has a mirror network, which only covers the most recent release of any given package. They also have an additional site which does not appear to be a CDN, and is throttled and can maybe ban you if you use it too much. Unfortunately, it is also the ONLY way to actually get historic releases of many packages.
Use it, and use it last -- after every other mirror has been tried, which should handle latest releases. In combination with GENTOO_MIRRORS this should ensure that users can actually download software when needed, without running afoul of throttling.
This does not actually do it "last" as you claim. Portage shuffles the list of thirdpartymirrors:
https://gitweb.gentoo.org/proj/portage.git/tree/lib/portage/package/ebuild/fetch.py#n1140
While it increases mirror burden; this should likely be done as a distinct thirdpartymirror:
apache-historical https://archive.apache.org/dist/
And that gets used in ebuilds when distfiles fall off the main mirrors [until such time as strictly ordered behavior is available].
I assume Eli was trying to avoid having to update old ebuilds. I don't really see any point in adding an apache-historical entry in thirdpartymirrors;
Such an entry would probably violate the "no single URL entries in mirrors" rule anyway.
just inline https://archive.apache.org/dist/ into SRC_URI if the ebuild requires it.
I assume the problem is that the ebuild may not require it when it is written, but only at some later point in its lifetime.
TeX packaging is in a similar situation.
I wonder if we could introduce priority groups into the mirror entry specification. As an example:
foo https://mirror.foo.org | https://ftp.baz.de/foo … | https://archive.mirror.foo.org
The priority groups are separated by '|'.
Here, the first priority group is a DNS bouncer that delegates to your nearest mirror. This setup is common for some archives.
The second priority group contains a selected set of real mirrors. It helps in cases where the DNS bouncer is malfunctioning. Such events are rare, but I believe there have been some reports.
The last priority group consists of the URL of the long-term archive, which holds all artifacts ever released but is rate limited.
The package manager would first select one or more entries from the first priority group before proceeding to the next group(s).
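To make the idea concrete, here is a rough Python sketch of how such an entry could be parsed and walked. It is not Portage code and the '|' syntax does not exist in thirdpartymirrors today; the function names parse_priority_groups and ordered_candidates are made up for illustration, and https://ftp.example.org/foo is a hypothetical extra mirror standing in for the elided ones. The only assumption about behaviour is that shuffling stays, but is confined to each group.

```python
import random


def parse_priority_groups(entry: str) -> tuple[str, list[list[str]]]:
    """Split 'name url ... | url ... | url' into (name, [[urls], ...]).

    Hypothetical format: whitespace separates URLs, '|' separates groups.
    """
    parts = entry.split(None, 1)
    name = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    groups = [group.split() for group in rest.split("|")]
    # Drop empty groups, e.g. from a trailing '|'.
    return name, [group for group in groups if group]


def ordered_candidates(groups, rng=random):
    """Yield URLs group by group, shuffling only within each group."""
    for group in groups:
        shuffled = list(group)  # copy so the parsed entry stays untouched
        rng.shuffle(shuffled)
        yield from shuffled


entry = (
    "foo https://mirror.foo.org"
    " | https://ftp.baz.de/foo https://ftp.example.org/foo"  # second URL is hypothetical
    " | https://archive.mirror.foo.org"
)
name, groups = parse_priority_groups(entry)
for url in ordered_candidates(groups):
    print(name, url)
```

With this ordering the rate-limited archive is only ever tried after every entry in the earlier groups, while load is still spread across the real mirrors in the middle group.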
I believe Apache and TeX packages would benefit from this.
Could this be something for EAPI 9? (I think it needs a new EAPI, but I am happy to stand corrected.)
- Flow