On Fri, May 03, 2024 at 06:23:05PM +1200, Michael Hudson-Doyle wrote:
> If we want to make apt update quicker / lighter on resources we should
> figure out if we can stop publishing some of the hashes (which entirely
> dominate the size of the compressed package lists). We currently have 4
> hashes in the lists (md5, sha1, sha256, sha512) -- I know Dimitri was
> trying to get us to the point that we could stop publishing MD5 at least
> but there are a few things out there that hardcode a dependence on it.
> Maybe oracular is a good time to turn off some hashes and see what breaks.

I did some further analysis.

Summary of results:

Adding proposed increases download size by 7% at worst. Stripping older
hashes reduces download size by around 35% to 40%. With stripping,
adding proposed would make a difference to download size of 4% at worst,
and overall we'd still have a >30% improvement.

Note however that when I say that there's an increase in download size
of 7% at worst, that's 2.4 MB. IMHO, that's negligible. The most we
could get in savings by stripping hashes is 15 MB, and that's assuming
no previous/ongoing cache.

Analysis

I suggest therefore that we don't need to worry about size from the
perspective of adding proposed. Stripping hashes will provide some
worthwhile benefit but I don't think we need to block adding proposed on
this. I've filed LP: #2067752 to track removing the old hashes.

Detailed results:

Noble

Download

| Without proposed | 29.2 MB       |
| With proposed    | 29.9 MB       |
| Difference       | 0.7 MB / 102% |

Considering just the Packages files from the download:

|                  | Not stripped | Stripped    | Difference  |
| Without proposed | 16431k       | 11035k      | 5396k / 67% |
| With proposed    | 16880k       | 11199k      | 5681k / 66% |
| Difference       | 449k / 103%  | 164k / 101% | 5232k / 68% |

Jammy

Download

| Without proposed | 33.9 MB       |
| With proposed    | 36.3 MB       |
| Difference       | 2.4 MB / 107% |

Considering just the Packages files from the download:

|                  | Not stripped | Stripped    | Difference  |
| Without proposed | 23661k       | 15229k      | 8432k / 64% |
| With proposed    | 25135k       | 15805k      | 9330k / 63% |
| Difference       | 1474k / 106% | 576k / 104% | 7856k / 67% |

Focal

Download

| Without proposed | 33.2 MB       |
| With proposed    | 34.5 MB       |
| Difference       | 1.3 MB / 104% |

Considering just the Packages files from the download:

|                  | Not stripped | Stripped    | Difference   |
| Without proposed | 23385k       | 13256k      | 10129k / 57% |
| With proposed    | 24214k       | 13756k      | 10458k / 57% |
| Difference       | 829k / 104%  | 500k / 104% | 9629k / 59%  |

Notes:

To compare like for like, I used `xz -9` from each corresponding series
for the stripped estimate, and recompressed with that same xz for the
not stripped estimate. In practice, Launchpad would presumably use a
newer-ish xz across all series.

Method:

Using lxd ubuntu:<series> container images

    find /var/lib/apt/lists /var/cache/apt -type f -delete
    apt-get update  # note how much it says it downloaded, e.g. "Fetched 33.9 MB in 5s (6519 kB/s)"

Add proposed (`add-apt-repository -p proposed` or edit deb822 by hand on
Noble due to LP: #2061128 and also manually on Focal)

    find /var/lib/apt/lists /var/cache/apt -type f -delete
    apt-get update  # note how much it says it downloaded, e.g. "Fetched 33.9 MB in 5s (6519 kB/s)"
    apt-get install -y dctrl-tools
    mkdir {un,}stripped
    cp /var/lib/apt/lists/*Packages unstripped
    cd unstripped
    # -I inverts -s, so every field except the three old hashes is kept
    for i in *; do grep-dctrl -I -s MD5sum,SHA1,SHA256 . < "$i" > "../stripped/$i"; done
    xz -9 *
    find -type f|xargs du -c  # record "unstripped" "with proposed" sizes
    find -type f|grep -v proposed|xargs du -c  # record "unstripped" "without proposed" sizes
    cd ../stripped
    xz -9 *
    find -type f|xargs du -c  # record "stripped" "with proposed" sizes
    find -type f|grep -v proposed|xargs du -c  # record "stripped" "without proposed" sizes
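
For anyone wanting to check the figures: the percentages in the Packages tables appear to be the ratio of the two xz-compressed sizes, rounded to the nearest percent (so "67%" means the stripped files are about 67% the size of the unstripped ones). A minimal sketch of that arithmetic using the Noble numbers; the `ratio` helper is a hypothetical name used only for illustration and is not part of apt or dctrl-tools:

```shell
#!/bin/sh
# ratio NEW OLD: print NEW as a percentage of OLD, rounded to the nearest percent.
# Hypothetical helper for illustration only.
ratio() {
    awk -v new="$1" -v old="$2" 'BEGIN { printf "%.0f%%\n", 100 * new / old }'
}

# Noble, sizes in kB as recorded by `du -c` in the tables.
ratio 11035 16431   # stripped vs not stripped, without proposed: prints 67%
ratio 16880 16431   # with vs without proposed, not stripped:     prints 103%
ratio 11199 16431   # stripped with proposed vs today's lists:    prints 68%
```

The last line is the ">30% improvement" case from the summary: even with proposed added, the stripped lists come out at roughly two thirds of what is downloaded today.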