Package: dgit-infrastructure
Version: 13.16
Severity: important
t2u job 1459 failed Irrecoverable because the builder host rebooted
mid-build. I asked DSA, and:
Aurelien Jarno via RT writes ("[rt.debian.org #9884] Reboot(?) of
tag2upload-builder-01"):
> On all our hosts, reboot is only possible after taking the
> /var/run/reboot-lock lock. Therefore for the critical part of the
> tag2upload service you should take this lock. For instance to take
> the lock for the duration of a script:
>
> flock -s -n /var/run/reboot-lock your_script
>
> You can use the -E option to return a different error code when the
> lock hasn't been acquired.
>
> Of course you should minimize the time when the lock is taken to not
> make reboot difficult or impossible.
We should implement this. I'm not sure exactly how, though - the lock
would have to be taken on the builder host, outside the VM, and we
currently don't run many commands there. dgit-repos-server doesn't
have a way to do that right now. But maybe the oracled could do it.
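For concreteness, here is a rough sketch of what a wrapper on the
builder host might look like, based only on the flock invocation DSA
suggested. It assumes util-linux flock. The function name, the
REBOOT_LOCK and LOCK_BUSY_EXIT variables, and the exit status 75 are
all made up for illustration; nothing like this exists in the
tag2upload code today.

```shell
#!/bin/sh
# Sketch only: run a command while holding the host's reboot lock.
# run_with_reboot_lock, REBOOT_LOCK and LOCK_BUSY_EXIT are hypothetical
# names, not part of any existing tag2upload or dgit component.

run_with_reboot_lock () {
    # Lock file named in the RT reply; overridable so it can be tested
    # without touching the real lock.
    lock=${REBOOT_LOCK:-/var/run/reboot-lock}
    # Exit status to report when the lock cannot be taken (arbitrary
    # choice here, so the caller can tell "host is about to reboot"
    # apart from a failure of the command itself).
    busy=${LOCK_BUSY_EXIT:-75}

    # -s: take a shared lock, so several holders can coexist
    # -n: fail immediately instead of waiting for the lock
    # -E: exit status to use when -n fails to get the lock
    # Otherwise flock exits with the status of the command it ran.
    flock -s -n -E "$busy" "$lock" "$@"
}
```

Whatever ends up starting the build on the host would then call
something like `run_with_reboot_lock some-build-command`, and treat
the "lock busy" status as a reason to retry later rather than as a
build failure. Per the RT reply, the lock should only be held around
the critical part.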
I guess we can think about this in the context of the retry work.
Ian.
--
Ian Jackson <[email protected]> These opinions are my own.
Pronouns: they/he. If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.