Ludovic Courtès <l...@gnu.org> skribis:

> 678 May 3 16:03:40 localhost vmunix: [ 10.221211] shepherd[1]: Service dbus-system has been started.
> 679 May 3 16:03:40 localhost vmunix: [ 10.222093] shepherd[1]: Service loopback has been started.
> 680 May 3 16:03:40 localhost wpa_supplicant[398]: Successfully initialized wpa_supplicant
> 681 May 3 16:03:40 localhost shepherd[1]: Service wpa-supplicant could not be started.
> 682 May 3 16:03:40 localhost shepherd[1]: Service networking depends on wpa-supplicant.
> 683 May 3 16:03:40 localhost shepherd[1]: Service networking could not be started.
> 684 May 3 16:03:40 localhost wpa_supplicant[400]: dbus: Could not request service name: already registered
> 685 May 3 16:03:40 localhost wpa_supplicant[400]: Failed to initialize wpa_supplicant
> 686 May 3 16:03:45 localhost shepherd[1]: Service wpa-supplicant could not be started.

My guess is that it goes like this:

  1. shepherd starts ‘networking’, which triggers the start of
     ‘wpa-supplicant’ (PID 398), which immediately “fails”.  Thus
     ‘networking’ is not started.

  2. shepherd continues and starts ‘wpa-supplicant’ directly (PID 400).
     This time it fails for good; after 5 seconds, since the PID file
     didn’t show up, shepherd says again that it could not be started.

Indeed, by looking at shepherd.conf from:

  guix gc -R $(guix system build gnu/system/install.scm) | grep shepherd.conf

one can see that ‘networking’ comes before ‘wpa-supplicant’ in the
expression:

  (for-each start '(… networking … wpa-supplicant …))

So why is ‘wpa-supplicant’ marked as failing to start on the first
attempt?

The only reason I can think of is that ‘read-pid-file’ from (shepherd
service) returns #f immediately instead of a number.  That can actually
happen if the PID file exists but is empty (or contains garbage).

You would expect wpa_supplicant to create its PID file atomically: write
it under a different name, then rename(2)… but no:

--8<---------------cut here---------------start------------->8---
int os_daemonize(const char *pid_file)
{
#if defined(__uClinux__) || defined(__sun__)
	return -1;
#else /* defined(__uClinux__) || defined(__sun__) */
	if (os_daemon(0, 0)) {
		perror("daemon");
		return -1;
	}

	if (pid_file) {
		FILE *f = fopen(pid_file, "w");
		if (f) {
			fprintf(f, "%u\n", getpid());
			fclose(f);
		}
	}

	return -0;
#endif /* defined(__uClinux__) || defined(__sun__) */
}
--8<---------------cut here---------------end--------------->8---

So there is a possibility, albeit unlikely, for shepherd to see the PID
file after it’s been opened but before it’s been written to.  (This
problem is not limited to the installer.)

I’m not 100% convinced that this is what’s happening there, but that’s
the only lead I have.  I’m surprised we haven’t seen other reports
before.

Thoughts?

Ludo’.
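
For illustration, here is a minimal sketch of the "write it under a different name, then rename(2)" approach mentioned above.  It is not wpa_supplicant code: the function name ‘write_pid_file_atomically’ and the ‘.tmp.<pid>’ suffix are assumptions made up for this example.  The point is only that a reader polling the final name sees either no file at all or a complete one, never an empty one.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of wpa_supplicant): write the current PID
 * to PID_FILE such that readers never observe an empty or partial file.
 * Returns 0 on success, -1 on failure. */
int write_pid_file_atomically(const char *pid_file)
{
	char tmp[4096];
	FILE *f;
	int n;

	/* Build a temporary name next to the target, so that both names
	 * live on the same file system and rename(2) is atomic. */
	n = snprintf(tmp, sizeof(tmp), "%s.tmp.%ld", pid_file,
		     (long) getpid());
	if (n < 0 || (size_t) n >= sizeof(tmp))
		return -1;

	f = fopen(tmp, "w");
	if (f == NULL)
		return -1;

	/* Write the complete contents to the temporary file first. */
	if (fprintf(f, "%ld\n", (long) getpid()) < 0) {
		fclose(f);
		unlink(tmp);
		return -1;
	}

	/* fclose flushes buffered data; a failure here means the file
	 * may be incomplete, so do not publish it. */
	if (fclose(f) != 0) {
		unlink(tmp);
		return -1;
	}

	/* Publish: the final name appears only once the data is there. */
	if (rename(tmp, pid_file) != 0) {
		unlink(tmp);
		return -1;
	}

	return 0;
}
```

With something along these lines, the window in which the PID file exists but is still empty would go away on the writer side.  A reader could still choose to treat an empty or unparsable file as "not ready yet" and keep polling, which would also cover daemons that do not write their PID file this way.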