On Thu, Jun 13, 2019 at 05:04:34PM +0200, Guus Sliepen wrote:
> On Fri, Jun 07, 2019 at 07:29:49PM +0200, Adam Borowski wrote:
>
> > I care about two use cases:
> > * boxes with HDDs or SD cards
> > * datacenter VMs, buildds
> [...]
> > No, there's no such thing as a 1-way machine that can
> > install a modern distro anymore[3]: oldest machine I own, a non-NX Pentium4,
> > is already -j2; when 3 years ago I needed the cheapest possible box with
> > • USB, • local storage, • ethernet; it had 4 cores and 512MB RAM.  Non-SMP
> > is dead and buried, forget about ever optimizing for that.
>
> Non-SMP is pretty alive when it comes to VM guests. So if you claim you
> care about that usecase, please do optimize for that as well.

That's -j1, which is a degenerate but valid case.  It won't run any slower
than current dpkg does.

> > * let's not care about power loss during install.  So no fsyncs, and no
> >   writing a single byte that's going to be overwritten later.  Do a global
> >   sync() only when entering grub-install.
>
> With KVM installs, I usually configure it to use unsafe IO, which
> basically has the same effect as eatmydata. If the installation was
> succesful, I can switch the IO mode back to something reliable. This
> indeed makes a huge difference in install speeds.

Yeah, but even with eatmydata it's pointless to write the whole status file
after every step, then sometimes parse it back.

> > * being able to unpack in parallel also means you don't need to care about
> >   order: install can go before apt-download has finished.  This is awesome
> >   when your mirror has a slower link than that 10Gb...  We can install
> >   package X the moment apt has fetched it even though it's still downloading
> >   packages Y and Z.
> >   (Nb: what's a good way to know apt is done?  I screen-scrape
> >   -oDebug::pkgAcquire looking for "Dequeuing" which is a nasty hack.)
>
> We already know before downloading packages what their dependencies are,
> so we can order the download such that the ones with the least
> dependencies are downloaded first, and so on. This will allow starting
> to install stuff while downloading other packages in a safe way.

Good idea, but I already decided to ignore dependencies altogether, to
further improve parallelizing unpack.  Ordering could improve apt+dpkg for
upgrades, though.

> It might be interesting to create a bootgraph-like chart of the
> installation process, to identify the actual bottlenecks and potentials
> for parallelization. Maybe we already have such a tool?

The current graph looks pretty linear.  You can just timestamp debootstrap /
apt messages.

> > So... any comments so far?  Any hints how to cheat the configure step?
>
> If two packages don't (reverse-)depend on each other in some way, how
> safe is it to configure them in parallel?

Alas, not at all -- postinsts assume they have exclusive control, dpkg
errors out instead of waiting when a lock is taken, etc.

Roughly 6-in-7 packages have a nice ENOENT-compliant postinst, but that's
about the only case that can be cleanly parallelized.  Anything else would
require some heuristics and/or manual review.  For example,
update-alternatives calls apparently need to be serialized on a single lock,
but are parallelizable with anything else.

A big majority of postinsts apply only to upgrades, but most are written by
hand, with more unique snowflake cases than my socks drawer.  There are
quite a few "# Automatically added by dh_ponies" stanzas, but a cursory look
shows they're accompanied by manual parts often enough that even automating
them away isn't warranted.

Even worse, while manually written postinsts stay unchanged in subsequent
uploads, e.g. dh_installdeb inserts its version number, making the builds
non-reproducible.

My current plan is to do the hard work manually, and store data about
elidable postinsts by their mangled hash, such as:

    deadbeef -> ignore

(for completely skippable) or

    deadbeef -> Lock: alternatives

to mark scripts that need to be run exclusively within their group but
without regard to anything else.


Meow!
-- 
How to squander your resources: those silly Swedes have a sauce named
"hovmästarsås", the best thing ever to put on cheese, yet they waste it
solely on mere salmon.
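
The hash table idea from the last paragraphs of the mail could be sketched
roughly as below.  This is only an illustration under assumptions, not
anything that exists in dpkg or debhelper: the names mangled_hash, Action
and plan are made up, SHA-256 is an arbitrary choice, and the "mangling" is
assumed to mean nothing more than dropping the version suffix from
"# Automatically added by dh_foo/<version>" marker lines.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumption: debhelper markers look like "# Automatically added by dh_foo/12.1.1".
_DH_MARKER = re.compile(r"^(# Automatically added by dh_\w+)/\S+", re.MULTILINE)


def mangled_hash(script: str) -> str:
    """Hash a postinst with the debhelper version suffix stripped (hypothetical scheme)."""
    normalized = _DH_MARKER.sub(r"\1", script)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Action:
    """Manually reviewed verdict for one known script hash."""
    ignore: bool = False        # "deadbeef -> ignore"
    lock: Optional[str] = None  # "deadbeef -> Lock: alternatives"


def plan(
    scripts: Dict[str, str], known: Dict[str, Action]
) -> Tuple[List[str], Dict[str, List[str]], List[str]]:
    """Split packages into skippable, per-lock-group, and fully exclusive postinsts."""
    skipped: List[str] = []
    groups: Dict[str, List[str]] = {}
    exclusive: List[str] = []
    for pkg, script in scripts.items():
        action = known.get(mangled_hash(script))
        if action is None:
            exclusive.append(pkg)   # unreviewed script: run it alone, as today
        elif action.ignore:
            skipped.append(pkg)     # reviewed as a no-op on fresh installs
        elif action.lock is not None:
            groups.setdefault(action.lock, []).append(pkg)  # serialize within the group
        else:
            exclusive.append(pkg)
    return skipped, groups, exclusive
```

Scripts inside one lock group would run one after another, while separate
groups could run concurrently; anything with an unknown hash falls back to
fully serialized execution, so an unreviewed postinst is never run less
safely than it is now.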