Hello,

One problem we noticed in the analysis of the boot problem of bayfront
after the recent downtime¹ is that an interactive REPL would be opened
after an unbound variable was found in the shepherd config file:

--8<---------------cut here---------------start------------->8---
[ 13.098907] shepherd[1]: Service root started.
[ 13.100711] shepherd[1]: Service root running with value #t.
[ 13.103824] shepherd[1]: Service root has been started.
[ 13.426102] shepherd[1]: ice-9/boot-9.scm:1685:16: In procedure raise-exception:
[ 13.428099] shepherd[1]: Unbound variable: make-forkexec-constructor/container
[ 13.429912] shepherd[1]:
[ 13.431108] shepherd[1]: Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
[ 13.441983] shepherd[1]: GNU Guile 3.0.9
[ 13.442728] shepherd[1]: Copyright (C) 1995-2023 Free Software Foundation, Inc.
[ 13.443947] shepherd[1]:
[ 13.444427] shepherd[1]: Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
[ 13.445679] shepherd[1]: This program is free software, and you are welcome to redistribute it
[ 13.446919] shepherd[1]: under certain conditions; type `,show c' for
[ 13.447072] shepherd[1]: details.
[ 13.448737] shepherd[1]:
[ 13.449239] shepherd[1]: Enter `,help' for help.
--8<---------------cut here---------------end--------------->8---

This was unhelpful because we couldn’t interact with that REPL remotely
(no IPMI).  Even when you can interact, it’s of limited use; in this
case, if you type “,q”, it tries to continue and fails:

--8<---------------cut here---------------start------------->8---
Uncaught exception in task:
In fibers.scm:
    172:8  7 (_)
In ice-9/exceptions.scm:
   406:15  6 (_)
In ice-9/boot-9.scm:
  1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
In shepherd/service.scm:
   824:39  4 (_)
--8<---------------cut here---------------end--------------->8---

This is because we’re effectively adding #f in the middle of the list
passed to ‘register-services’ (see below).

This REPL-on-error “feature” comes from Guix System, not Shepherd, in
the config file generated from (gnu services shepherd):

  ;; Arrange to spawn a REPL if something goes wrong.  This is better
  ;; than a kernel panic.
  (call-with-error-handling
    (lambda ()
      (register-services
       (parameterize ((current-warning-port
                       (%make-void-port "w")))
         (map (lambda (file)
                (save-module-excursion
                 (lambda ()
                   (set-current-module (make-user-module))
                   (load-compiled file))))
              '#$(map scm->go files))))))

The rationale mentioned in the comment no longer holds: starting from
Shepherd 0.10.2, the config file is loaded in the background; if its
evaluation fails, shepherd keeps running (see
‘tests/config-failure.sh’, which tests this behavior).

I think we should change the above to log and gracefully handle failure
to load an individual service file.

Ludo’.

¹ https://lists.gnu.org/archive/html/info-guix/2024-05/msg00000.html
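
PS: For illustration, here is a rough sketch of the “log and skip”
approach suggested above, in plain Guile.  The names ‘load-or-false’ and
‘load-service-files’ are hypothetical and exist neither in Guix nor in
the Shepherd; LOAD-FILE stands for whatever procedure loads one service
file (in the generated config, presumably the ‘save-module-excursion’ +
‘load-compiled’ lambda).  Files that fail to load are dropped instead of
leaving #f in the list:

```scheme
;; Sketch only: these procedure names are hypothetical, not Guix or
;; Shepherd API.
(use-modules (srfi srfi-1))             ;for 'filter-map'

(define (load-or-false load-file file)
  "Call LOAD-FILE on FILE and return its result; if an exception is
raised, log it to the current error port and return #f."
  (catch #t
    (lambda ()
      (load-file file))
    (lambda (key . args)
      (format (current-error-port)
              "failed to load service file '~a': ~s ~s~%"
              file key args)
      #f)))

(define (load-service-files load-file files)
  "Return the results of loading FILES with LOAD-FILE, omitting the
files that could not be loaded."
  (filter-map (lambda (file)
                (load-or-false load-file file))
              files))
```

The resulting list could then be passed to ‘register-services’, so that
one broken service file would no longer prevent the others from being
registered.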