Hello Timothy, Thanks for chiming in.
Timothy Sample <samp...@ngyro.com> skribis: > And here’s a reproducer: > > (let loop () > (define fd #f) > (let ((P (open-input-file "/dev/null"))) > ;; Does not change the revealed count of P. > (set! fd (fileno P))) > (let ((port (open-input-file "/dev/null"))) > (dup->port port "r" fd) > (close-port port) > (loop))) > > This results in EBADF in seemingly exactly the same way. (I had to run > it a few times: sometimes it runs out of file descriptors first.) This > happens on bootstrap Guile (2.0.9) and modern Guile. Nice reproducer; I fully agree with your analysis. See3in that ‘install-current-ports!’ creates temporary ports (via ‘dup’) for no reason since nobody captures their reference and they get GC’d soon after, I rewrote it like this: --8<---------------cut here---------------start------------->8--- (define (install-current-ports!) "Install all current ports into their usual file descriptors. For example, if @code{current-input-port} is a @code{file-port?}, make the process file descriptor 0 refer to the file open for @code{current-input-port}. If any current port is a @code{port?} but not a @code{file-port?}, its corresponding file descriptor will refer to @file{/dev/null}." ;; XXX: Input/output ports? Closing other FDs? (for-each (lambda (i) (gc) ;to trigger bugs (let ((current-port (fd->current-port i))) (match (current-port) ((? file-port? port) (dup->fdes port i)) (#f #t)))) (iota *fd-count*))) --8<---------------cut here---------------end--------------->8--- But this illustrates another problem: in the child process, right before ‘execve’, the finalization thread may be restarted, in which case it creates a new pipe. In the example below, the finalization pipe is on FDs 9 and 7, but ‘install-current-ports!’ blindly dups to FD 7, thereby closing one end of the finalization pipe that was just created: --8<---------------cut here---------------start------------->8--- 23647 pipe2([7, 9], O_CLOEXEC) = 0 23647 rt_sigprocmask(SIG_BLOCK, ~[], [], 8) = 0 23647 clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7f84204b3990, parent_tid=0x7f84204b3990, exit_signal=0, stack=0x7f841fb24000, stack_size=0x98ef80, tls=0x7f84204b36c0} => {parent_tid=[23648]}, 88) = 23648 […] 23647 write(9, "\0", 1) = 1 23648 <... read resumed>"\0", 1) = 1 23648 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 23648 read(7, <unfinished ...> 23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=35845839}) = 0 23647 dup2(12, 7) = 7 23647 fcntl(7, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE) 23647 lseek(7, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) 23647 dup2(12, 7) = 7 23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=35899320}) = 0 23647 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 23647 madvise(0x7f842207c000, 12288, MADV_DONTNEED) = 0 23647 write(9, "\0", 1) = 1 23648 <... read resumed>"\0", 1) = 1 23648 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 23648 read(7, <unfinished ...> 23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=39539830}) = 0 23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=39555997}) = 0 23647 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 23647 madvise(0x7f842207c000, 12288, MADV_DONTNEED) = 0 23647 madvise(0x7f8421d74000, 8192, MADV_DONTNEED) = 0 23647 write(9, "\0", 1) = -1 EPIPE (Broken pipe) 23647 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=23647, si_uid=1000} --- --8<---------------cut here---------------end--------------->8--- After that dup2(12, 7) call, writing to the finalization pipe yields SIGPIPE, which terminates the process (here it corresponds to a subshell running ‘expr’). Since we’re going to exec right after fork, we could turn off finalization around ‘primitive-fork’ such that the child doesn’t attempt to restart the finalization thread before exec. The Shepherd has code like this: --8<---------------cut here---------------start------------->8--- (define %set-automatic-finalization-enabled?! ;; When using a statically-linked Guile, for instance in the initrd, we ;; cannot resolve this symbol, but most of the time we don't need it ;; anyway. Thus, delay it. (let ((proc (delay (pointer->procedure int (dynamic-func "scm_set_automatic_finalization_enabled" (dynamic-link)) (list int))))) (lambda (enabled?) "Switch on or off automatic finalization in a separate thread. Turning finalization off shuts down the finalization thread as a side effect." (->bool ((force proc) (if enabled? 1 0)))))) (define-syntax-rule (without-automatic-finalization exp ...) "Turn off automatic finalization within the dynamic extent of EXP." (let ((enabled? #t)) (dynamic-wind (lambda () (set! enabled? (%set-automatic-finalization-enabled?! #f))) (lambda () exp ...) (lambda () (%set-automatic-finalization-enabled?! enabled?))))) --8<---------------cut here---------------end--------------->8--- Problem is, we cannot use the FFI on the statically-linked Guile. We could implement fork+exec in C, but we don’t have a C compiler at this early bootstrap stage. Thoughts? Ludo’.