On 26.01.2020 17.19, Kristian Klausen via arch-mirrors wrote:
Hi

I'm considering setting up an Arch Linux mirror, and I'm considering a different design for it.


Hi

I finally had time to implement this, and the setup looks like this:
Cloudflare -> Cloudflare Workers -> Backblaze B2 bucket <- Tier1 mirror

The files are synced from mirror.ams1.nl.leaseweb.net to the Backblaze B2 bucket every hour, and they are fetched from the bucket by a Cloudflare Workers script. Cloudflare is configured to cache everything (size <=2GB*): database files are cached for 5 minutes, everything else is cached for 24 hours.
* CF is sponsoring a plan with a higher limit than the 512MB default
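
To make the caching rules above concrete, here is a rough sketch of the decision in Python (the real logic lives in the Workers script and the Cloudflare configuration, so this is only an illustration). The function name, the pattern for what counts as a "database file", and reading 2GB as 2 GiB are all assumptions of mine:

```python
import re

# Assumption: "database files" means the repo databases and their signatures,
# e.g. core.db, core.files, core.db.tar.gz, core.db.sig.
DB_FILE = re.compile(r"\.(db|files)(\.tar\.[a-z0-9]+)?(\.sig)?$")

DB_TTL_SECONDS = 5 * 60             # database files: 5 minutes
DEFAULT_TTL_SECONDS = 24 * 3600     # everything else: 24 hours
MAX_CACHEABLE_BYTES = 2 * 1024**3   # assumption: "2GB" read as 2 GiB


def cache_ttl_seconds(path: str, size_bytes: int) -> int:
    """Return the cache lifetime in seconds, or 0 if the object is too big to cache."""
    if size_bytes > MAX_CACHEABLE_BYTES:
        return 0
    if DB_FILE.search(path):
        return DB_TTL_SECONDS
    return DEFAULT_TTL_SECONDS
```

With these assumptions, core/os/x86_64/core.db gets 300 seconds and a package file gets 86400 seconds.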

I have done some quick testing: time to first byte isn't impressive (at least not when downloading from Europe), but the throughput is acceptable (80-100MB/s is achievable if the file is cached, and 8-12MB/s if not; both tested from Europe).

To make it easier to implement, I took some shortcuts:
* Directory listing isn't implemented
* "latest" files aren't synced
* Only packages in "pool/" are synced; the package files in the individual repos aren't synced, but if you request a package (\.pkg\.tar\.(xz|zst)(|.sig)$) it is automatically retrieved from the pool/ directory. This means that you can download e.g. Firefox from both:
https://archlinux.amirror.xyz/extra/os/x86_64/firefox-73.0.1-1-x86_64.pkg.tar.zst
https://archlinux.amirror.xyz/community/os/x86_64/firefox-73.0.1-1-x86_64.pkg.tar.zst
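
The pool/ lookup can be sketched roughly like this in Python (the actual Workers script is not shown here, so treat this as an illustration only). The regex is the one quoted above, unchanged; the function name and the two pool subdirectories are assumptions about the bucket layout:

```python
import re
from typing import List

# The pattern quoted in the list above, used unchanged.
PACKAGE = re.compile(r"\.pkg\.tar\.(xz|zst)(|.sig)$")

# Assumption: the bucket follows the usual pool/ layout with these subdirectories.
POOL_DIRS = ("pool/packages", "pool/community")


def bucket_keys(request_path: str) -> List[str]:
    """Return the bucket keys to try, in order, for a requested path."""
    path = request_path.lstrip("/")
    if PACKAGE.search(path) and not path.startswith("pool/"):
        # Only the file name matters; the repo part of the path is ignored.
        filename = path.rsplit("/", 1)[-1]
        return [f"{pool_dir}/{filename}" for pool_dir in POOL_DIRS]
    return [path]
```

Because the repo directory is dropped, both Firefox URLs above resolve to the same candidate keys, which is why either one works.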

I'm not sure if the shortcuts are acceptable, but they can be fixed if they are an issue.

Also, please note that archive, other and sources aren't synced.

Feel free to try it out: https://archlinux.amirror.xyz/

Best regards
Kristian Klausen

So instead of mirroring the whole thing, the idea is to mirror only the database files (core.db etc.) and download the packages on demand from a Tier 1 mirror (and let nginx cache them). That way, I only download the requested packages from the Tier 1 mirrors instead of the whole thing (saving Tier 1 bandwidth).

To provide even better performance, a CDN (e.g. Cloudflare) could be used to provide more caching. So we end up with a setup like this:
Cloudflare -> Nginx cache -> Tier1 mirrors (nginx with multiple upstreams)
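
The on-demand part of that chain could behave roughly like the Python sketch below. It is only a model of the idea (nginx would do this with its own cache and upstream handling); the class name, the injected fetch function and the in-memory store are hypothetical, and expiry is left out:

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical fetch function: returns the body, or None if that upstream fails.
Fetch = Callable[[str], Optional[bytes]]


class OnDemandCache:
    """Serve from the local cache; on a miss, try each Tier 1 upstream in order."""

    def __init__(self, upstreams: Sequence[str], fetch: Fetch) -> None:
        self.upstreams = list(upstreams)
        self.fetch = fetch
        self.store: Dict[str, bytes] = {}

    def get(self, path: str) -> Optional[bytes]:
        if path in self.store:
            return self.store[path]  # cache hit: no Tier 1 traffic
        for base in self.upstreams:
            body = self.fetch(f"{base.rstrip('/')}/{path.lstrip('/')}")
            if body is not None:
                self.store[path] = body  # keep it for later requests
                return body
        return None  # every upstream failed
```

Only the first request for a package reaches a Tier 1 mirror; later requests are answered from the cache, which is where the bandwidth saving would come from.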

Am I missing something? Is this a bad idea?
If I do set up a mirror like that, is there any chance it could be added as an official mirror?

Best regards
Kristian Klausen
