On 4/5/2022 17:49, Jason A. Donenfeld wrote:
> Hi Matt,
>
> On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <matts...@gentoo.org> wrote:
>>
>> On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <zx...@gentoo.org> wrote:
>>> By the way, we're not currently _checking_ two hash functions during
>>> src_prepare(), are we?
>>
>> I don't know, but the hash-checking is definitely checked before
>> src_prepare().
>
> Er, during the builtin fetch phase. Anyway, you know what I meant. :)
>
> Anyway, looking at the portage source code, to answer my own question,
> it looks like the file is actually being read twice and both hashes
> computed. I would have at least expected an optimization like:
>
>   hash1_init(&hash1);
>   hash2_init(&hash2);
>   for chunk in file:
>     hash1_update(&hash1, chunk);
>     hash2_update(&hash2, chunk);
>   hash1_final(&hash1, out1);
>   hash2_final(&hash2, out2);
>
> But actually what's happening is the even less efficient:
>
>   hash1_init(&hash1);
>   for chunk in file:
>     hash1_update(&hash1, chunk);
>   hash1_final(&hash1, out1);
>   hash2_init(&hash2);
>   for chunk in file:
>     hash2_update(&hash2, chunk);
>   hash2_final(&hash2, out2);
>
> So the file winds up being opened and read twice. For huge tarballs like
> chromium or libreoffice...
>
> But either way you do it - the missed optimization above or the
> unoptimized reality below - there's still twice as much work being
> done. This is all unless I've misread the source code, which is
> possible, so if somebody knows this code well and I'm wrong here,
> please do speak up.
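
For reference, the single-pass variant from the quoted pseudocode could
look roughly like this in Python. This is only a minimal sketch built on
the standard hashlib module, not Portage's actual code: the function name
multi_hash, the 1 MiB chunk size, and the default algorithm pair are all
assumptions made for illustration.

```python
import hashlib


def multi_hash(path, algorithms=("blake2b", "sha512"), chunk_size=1 << 20):
    """Read the file once and feed every chunk to each hash object.

    Hypothetical helper for illustration; not a Portage API.
    """
    # One hash object per requested algorithm name.
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            # Update all hashes from the same in-memory chunk,
            # so the file is opened and read only once.
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

This saves the second open and read of the file, but as the quoted mail
points out, both digests still have to be computed, so the hashing work
itself is not reduced.
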

Not to go off-topic, but where in Portage's source does this logic live?
It seems like an easy fix for a slightly more efficient Portage.

-- 
Joshua Kinard
Gentoo/MIPS
ku...@gentoo.org
rsa6144/5C63F4E3F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic