On 2/27/23 17:44, Richard W.M. Jones wrote:
> On Mon, Feb 27, 2023 at 08:42:23AM -0600, Eric Blake wrote:
>> Or intentionally choose a hash that can be computed out-of-order, such
>> as a Merkle Tree. But we'd need a standard setup for all parties to
>> agree on how the hash is to be computed and checked, if it is going to
>> be anything more than just a linear hash of the entire guest-visible
>> contents.
>
> Unfortunately I suspect that by far the easiest way for people who
> host images to compute checksums is to run 'shaXXXsum' on them or sign
> them with a GPG signature, rather than engaging in a novel hash
> function. Indeed that's what is happening now:
>
> https://alt.fedoraproject.org/en/verify.html
If the output is produced with unordered writes, but the complete output
needs to be verified with a hash *chain*, that still allows for some
level of asynchrony. The start of the hashing need not be delayed until
after the end of the output, only until after the start of the output.

For example, nbdcopy could maintain the highest offset up to which the
output is contiguous, and on a separate thread, it could be hashing the
output up to that offset (a rough sketch of this bookkeeping follows
below).

Considering a gigantic output, the as-yet-unassembled blocks could
likely not be buffered in memory (that's why the writes are unordered in
the first place!), so the hashing thread would have to re-read the
output via NBD. Whether that would improve or degrade performance
overall is unclear to me. If the far end of the output network block
device can accommodate a reader that is independent of the writers, then
this level of overlap is beneficial. Otherwise, the extra reader thread
would just add more thrashing, and we'd be better off with a separate
read-through pass once writing is complete.

Laszlo
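To make the "trailing hasher" idea above a bit more concrete, here is a
minimal sketch, not nbdcopy code: it assumes the hashing thread opens
its own read-only libnbd connection to the destination and uses
OpenSSL's SHA-256, and "struct frontier" / frontier_publish() are
made-up names for the "highest contiguous offset" bookkeeping that the
writing side would have to perform.

/* Sketch of a hashing thread that trails the unordered writers.
 * The writing side calls frontier_publish() whenever the output is
 * known to be fully written in [0, offset); the hasher re-reads that
 * prefix over a separate NBD connection and feeds it into an
 * incremental SHA-256.  struct frontier / frontier_publish() are
 * illustrative only, not existing nbdcopy internals.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#include <libnbd.h>
#include <openssl/sha.h>

#define HASH_BUF_SIZE (4 * 1024 * 1024)

struct frontier {
  pthread_mutex_t lock;
  pthread_cond_t grown;
  uint64_t contiguous;          /* output is complete in [0, contiguous) */
  bool writers_done;
};

/* Called by the writing side when the contiguous prefix grows. */
static void
frontier_publish (struct frontier *f, uint64_t offset, bool done)
{
  pthread_mutex_lock (&f->lock);
  if (offset > f->contiguous)
    f->contiguous = offset;
  if (done)
    f->writers_done = true;
  pthread_cond_broadcast (&f->grown);
  pthread_mutex_unlock (&f->lock);
}

struct hasher_args {
  struct frontier *f;
  const char *dest_uri;         /* NBD URI of the destination */
};

static void *
hasher_thread (void *opaque)
{
  struct hasher_args *args = opaque;
  struct frontier *f = args->f;
  static unsigned char buf[HASH_BUF_SIZE];  /* single hasher thread */
  struct nbd_handle *nbd = nbd_create ();
  unsigned char digest[SHA256_DIGEST_LENGTH];
  SHA256_CTX ctx;
  uint64_t hashed = 0;

  if (nbd == NULL || nbd_connect_uri (nbd, args->dest_uri) == -1) {
    fprintf (stderr, "hasher: %s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }
  SHA256_Init (&ctx);

  for (;;) {
    uint64_t limit;

    /* Sleep until the contiguous prefix extends past what we hashed. */
    pthread_mutex_lock (&f->lock);
    while (f->contiguous <= hashed && !f->writers_done)
      pthread_cond_wait (&f->grown, &f->lock);
    limit = f->contiguous;
    pthread_mutex_unlock (&f->lock);

    if (limit <= hashed)        /* writers finished, nothing left to hash */
      break;

    /* Re-read [hashed, limit) from the destination and hash it. */
    while (hashed < limit) {
      uint64_t remaining = limit - hashed;
      size_t n = remaining > HASH_BUF_SIZE ? HASH_BUF_SIZE : (size_t) remaining;

      if (nbd_pread (nbd, buf, n, hashed, 0) == -1) {
        fprintf (stderr, "nbd_pread: %s\n", nbd_get_error ());
        exit (EXIT_FAILURE);
      }
      SHA256_Update (&ctx, buf, n);
      hashed += n;
    }
  }

  SHA256_Final (digest, &ctx);
  for (size_t i = 0; i < SHA256_DIGEST_LENGTH; i++)
    printf ("%02x", digest[i]);
  putchar ('\n');

  nbd_close (nbd);
  return NULL;
}

Because the re-reads stay strictly behind the write frontier, the digest
that falls out at the end is the same linear shaXXXsum that distributors
already publish; the open question is the one raised above, namely
whether the destination tolerates this extra reader while writes are
still in flight.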