Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-14 Thread Michael Richardson
Philip Prindeville wrote: >> In general, I think that this decision needs to be up-leveled to a >> build option. There are many cases where I would agree: you want the >> box to die rather than potentially come up insecurely. > A while ago I posted an option to “bake in” a defau

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-14 Thread Philip Prindeville
> On May 14, 2020, at 8:23 AM, Michael Richardson wrote: > > [snip] > > It depends a lot on the relative cost of sending a service person there to > repair the device (push the button, reflash or replace the device), vs the > risk of the box not operating at all. > > In the NAT44 home router

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-14 Thread Michael Richardson
Philip Prindeville wrote: >> A reboot with some logging on disk would allow for remote sysupgrades >> to have some kind of recoverability. > What if the failure left the box in a partially compromised state? > Would you want your firewall to “fail open”? I wouldn’t. It depends

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Michael Jones
On Wed, May 13, 2020 at 6:29 PM Philip Prindeville < philipp_s...@redfish-solutions.com> wrote: > > My lights-out machine rooms all have a dialup line, a terminal server, and > a power strip where I can cycle outlets… because I don’t like driving 7 > hours each way for 1 hour of productive work. >

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Philip Prindeville
> [snip] > > (2) If the box is in an indeterminate state then it’s not always clear that > there’s a safe path forward, and sometimes this is something that a human > needs to ascertain. > > There's no human that can ascertain anything. The board that is being > upgraded is being upgraded f

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Michael Jones
> > How the entire upgrade process works would be a good subject for > documenting on the Wiki if it’s not already. > Feel free :-) > How long are you thinking this I/O will take to complete? > Longer than the blazing speed of /bin/sh looping over /proc/* (1) It shouldn’t be happening very of

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Philip Prindeville
> On May 13, 2020, at 1:41 PM, John Clark via openwrt-devel > wrote: > > I’ve never had this problem with ‘reboot’, but there doesn’t seem to be a > really nice way to ‘reboot into a firmware upgrade initram image’, do the > work, then reboot with new firmware. kexec?

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Philip Prindeville
> On May 13, 2020, at 11:59 AM, Michael Jones wrote: > > E.g. if /lib/upgrade/script2 has returned, at all, the system needs to > reboot, because at this point /sbin/upgraded should be the only process > running. > > if /lib/upgrade/script2 has not returned after 1 hour, there's no chance th

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Philip Prindeville
Inline... > On May 12, 2020, at 11:17 PM, Michael Jones wrote: > > I've been investigating a problem with sysupgrade failing with the error > message "Failed to kill all processes", and then hanging indefinitely. > > This happens maybe once every 10-20 sysupgrades, and it's kind of a pain. >

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread John Clark via openwrt-devel
> On May 12, 2020, at 10:17 PM,

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread John Clark via openwrt-devel
> On May 12, 2020, at 10:17 PM,

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Michael Jones
On Wed, May 13, 2020 at 12:29 PM Michael Jones wrote: > > > On Wed, May 13, 2020 at 3:58 AM Jo-Philipp Wich wrote: > >> Hi, >> >> > Stuff like umounting external disks, fsync / swapoff etc. come to mind as >> well >> which should be doable at this point. >> >> >> > Right, that's also feasible. >

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Michael Jones
On Wed, May 13, 2020 at 3:58 AM Jo-Philipp Wich wrote: > Hi, > > > > > That loop-kill-all thing should be a kind of last resort really, > what's > > actually needed is some kind of "init 1" procd equivalent which > shuts down all > > services in a more or less clean manner. > > > > >

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Michael Jones
On Wed, May 13, 2020 at 4:42 AM Kevin 'ldir' Darbyshire-Bryant < l...@darbyshire-bryant.me.uk> wrote: > > > > On 13 May 2020, at 09:57, Jo-Philipp Wich wrote: > > > > Hi, > > > >> > >>That loop-kill-all thing should be a kind of last resort really, > what's > >>actually needed is some kin

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Kevin 'ldir' Darbyshire-Bryant
> On 13 May 2020, at 09:57, Jo-Philipp Wich wrote: > > Hi, > >> >>That loop-kill-all thing should be a kind of last resort really, what's >>actually needed is some kind of "init 1" procd equivalent which shuts >> down all >>services in a more or less clean manner. >> Beware th

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Jo-Philipp Wich
Hi, > > That loop-kill-all thing should be a kind of last resort really, what's > actually needed is some kind of "init 1" procd equivalent which shuts > down all > services in a more or less clean manner. > > > Oddly enough, the /lib/upgrade/stage2 script has some aspect of this.

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Michael Jones
> That loop-kill-all thing should be a kind of last resort really, what's > actually needed is some kind of "init 1" procd equivalent which shuts down > all > services in a more or less clean manner. > > Oddly enough, the /lib/upgrade/stage2 script has some aspect of this. It explicitly shuts down

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-13 Thread Jo-Philipp Wich
Hi Michael, > [...] > > Now that the very rough summary is out of the way, I have 4 questions. > > 1) I notice that the shell script /lib/upgrade/stage2 is doing a tight loop > with kill -9 to terminate processes. However, it's only looping a maximum of > 10 times, and it's going as fast as the s

Re: [OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-12 Thread Reiner Karlsberg
Applause, applause. The first (partial) docs of the magic of sysupgrade. And its pitfalls. Having had various issues with sysupgrade myself in the past (also doing sysupgrade OTA), I add the following notes: - Having open files on storage devices (i.e. for swap, but also explicitly opened) broke s

[OpenWrt-Devel] Sysupgrade and Failed to kill all processes

2020-05-12 Thread Michael Jones
I've been investigating a problem with sysupgrade failing with the error message "Failed to kill all processes", and then hanging indefinitely. This happens maybe once every 10-20 sysupgrades, and it's kind of a pain. So far I've determined this workflow that the sysupgrade command follows. Note,