On Wed, May 13, 2020 at 3:58 AM Jo-Philipp Wich <j...@mein.io> wrote:
> Hi,
>
> > That loop-kill-all thing should be a kind of last resort really, what's
> > actually needed is some kind of "init 1" procd equivalent which shuts
> > down all services in a more or less clean manner.
> >
> > Oddly enough, the /lib/upgrade/stage2 script has some aspect of this. It
> > explicitly shuts down (kill -9) telnet, dropbear, and ash before looping
> > with sigTERM, and then again with sigKILL.
> >
> > I find it very odd that it's explicitly singling out telnet, dropbear,
> > and ash. My OpenWRT build doesn't have any of these installed in the
> > first place. E.g. I have OpenSSH, and it's jumping straight to kill -9
> > instead of sending sigTERM first like it should.
>
> These are (in the case of telnet, were) the default services offering shell
> access in standard images the sysupgrade script was tailored for.
>
> The intention is to kill all user shell sessions to prevent interference
> with the subsequent upgrade process. An openssh case simply hasn't been
> added since it is uncommon, especially on lower end devices.
>
> The subsequent TERM / KILL loops are a poor mans attempt to cleanly shut
> down services. It obviously won't work for things having expensive teardown
> procedures (databases, squid proxy, etc.) - those really should be handled
> manually by the user before invoking sysupgrade. I mean obviously one can
> extend the grace period, but I guess there will always be unhandled cases.

I merely meant that I thought it odd that, instead of using SIGTERM on the
user-interactable processes, we jump straight to SIGKILL. I don't really see
what good singling out the user-interactable processes does if they'd be
sent SIGTERM and then SIGKILL like everything else anyway.

> Uhm, yeah sure, we could try writing the image again I guess. But
> eventually you have to give up if the storage device simply cannot be
> written cleanly.

Of course. Eventually we know it won't succeed, but flaky storage doesn't
necessarily mean that a second attempt, or an attempt to write the data in
smaller pieces, won't succeed. My concern is that giving up after one error
will lead to more soft-bricks than giving up after two. We could bikeshed on
this forever, though. I merely meant that one retry isn't unreasonable; 50
probably is.

> Stuff like umounting external disks, fsync / swapoff etc. come to mind as
> well which should be doable at this point.

Right, that's also feasible. In fact, I don't see any code at all for
unmounting existing filesystem mounts. Thanks for pointing that out.
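
To illustrate the signalling order argued for above (SIGTERM first, SIGKILL only as a fallback, for shell sessions as much as for anything else), here is a minimal POSIX sh sketch. `term_then_kill` and its grace-period argument are hypothetical names chosen for this example. It is not the code in /lib/upgrade/stage2, just one way the ordering could look.

```shell
#!/bin/sh
# Hypothetical helper (not part of /lib/upgrade/stage2):
# term_then_kill PID [GRACE_SECONDS]
# Sends SIGTERM, polls for up to GRACE_SECONDS, then falls back to SIGKILL.
term_then_kill() {
    pid=$1
    grace=${2:-3}    # assumed default grace period, in seconds

    # If the signal cannot be delivered, the process is most likely gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    while [ "$grace" -gt 0 ]; do
        # kill -0 only tests whether the PID can still be signalled.
        # Note: it also succeeds for an exited but not yet reaped child.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done

    # Grace period used up: force termination.
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

A call such as `term_then_kill "$some_pid" 5` would give a shell or daemon five seconds to exit on its own before it is killed. Processes with expensive teardown would still need a longer grace period or manual handling.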
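
The "one retry isn't unreasonable; 50 probably is" idea can be sketched as a small bounded-retry wrapper in POSIX sh. `retry` is a hypothetical helper, and `write_and_verify` in the usage note stands in for whatever command actually writes and checks the image. Neither name comes from the sysupgrade scripts.

```shell
#!/bin/sh
# Hypothetical helper: retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have been made.
# Returns 0 on the first success, 1 once the attempts are exhausted.
retry() {
    max_attempts=$1
    shift
    attempt=1
    while :; do
        "$@" && return 0                            # success: stop here
        [ "$attempt" -ge "$max_attempts" ] && return 1
        attempt=$((attempt + 1))
    done
}
```

With this, `retry 2 write_and_verify "$image" "$device"` would mean one initial attempt plus exactly one retry before giving up. The cap stays small and explicit, so a device that truly cannot be written still fails quickly.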
_______________________________________________
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel