Hello Timothy,

Thanks for chiming in.

Timothy Sample <samp...@ngyro.com> skribis:

> And here’s a reproducer:
>
>   (let loop ()
>     (define fd #f)
>     (let ((P (open-input-file "/dev/null")))
>       ;; Does not change the revealed count of P.
>       (set! fd (fileno P)))
>     (let ((port (open-input-file "/dev/null")))
>       (dup->port port "r" fd)
>       (close-port port)
>       (loop)))
>
> This results in EBADF in seemingly exactly the same way.  (I had to run
> it a few times: sometimes it runs out of file descriptors first.)  This
> happens on bootstrap Guile (2.0.9) and modern Guile.

Nice reproducer; I fully agree with your analysis.

See3in that ‘install-current-ports!’ creates temporary ports (via ‘dup’)
for no reason since nobody captures their reference and they get GC’d
soon after, I rewrote it like this:

--8<---------------cut here---------------start------------->8---
(define (install-current-ports!)
  "Install all current ports into their usual file descriptors.  For
example, if @code{current-input-port} is a @code{file-port?}, make the
process file descriptor 0 refer to the file open for
@code{current-input-port}.  If any current port is a @code{port?} but
not a @code{file-port?}, its corresponding file descriptor will refer
to @file{/dev/null}."
  ;; XXX: Input/output ports?  Closing other FDs?
  (for-each (lambda (i)
              (gc)  ;to trigger bugs
              (let ((current-port (fd->current-port i)))
                (match (current-port)
                  ((? file-port? port)
                   (dup->fdes port i))
                  (#f #t))))
            (iota *fd-count*)))
--8<---------------cut here---------------end--------------->8---

But this illustrates another problem: in the child process, right before
‘execve’, the finalization thread may be restarted, in which case it
creates a new pipe.

In the example below, the finalization pipe is on FDs 9 and 7, but
‘install-current-ports!’ blindly dups to FD 7, thereby closing one end
of the finalization pipe that was just created:

--8<---------------cut here---------------start------------->8---
23647 pipe2([7, 9], O_CLOEXEC)          = 0
23647 rt_sigprocmask(SIG_BLOCK, ~[], [], 8) = 0
23647 
clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
 child_tid=0x7f84204b3990, parent_tid=0x7f84204b3990, exit_signal=0, 
stack=0x7f841fb24000, stack_size=0x98ef80, tls=0x7f84204b36c0} => 
{parent_tid=[23648]}, 88) = 23648
[…]
23647 write(9, "\0", 1)                 = 1
23648 <... read resumed>"\0", 1)        = 1
23648 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
23648 read(7,  <unfinished ...>
23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=35845839}) = 0
23647 dup2(12, 7)                       = 7
23647 fcntl(7, F_GETFL)                 = 0x8002 (flags O_RDWR|O_LARGEFILE)
23647 lseek(7, 0, SEEK_CUR)             = -1 ESPIPE (Illegal seek)
23647 dup2(12, 7)                       = 7
23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=35899320}) = 0
23647 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
23647 madvise(0x7f842207c000, 12288, MADV_DONTNEED) = 0
23647 write(9, "\0", 1)                 = 1
23648 <... read resumed>"\0", 1)        = 1
23648 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
23648 read(7,  <unfinished ...>
23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=39539830}) = 0
23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=39555997}) = 0
23647 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
23647 madvise(0x7f842207c000, 12288, MADV_DONTNEED) = 0
23647 madvise(0x7f8421d74000, 8192, MADV_DONTNEED) = 0
23647 write(9, "\0", 1)                 = -1 EPIPE (Broken pipe)
23647 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=23647, 
si_uid=1000} ---
--8<---------------cut here---------------end--------------->8---

After that dup2(12, 7) call, writing to the finalization pipe yields
SIGPIPE, which terminates the process (here it corresponds to a subshell
running ‘expr’).

Since we’re going to exec right after fork, we could turn off
finalization around ‘primitive-fork’ such that the child doesn’t attempt
to restart the finalization thread before exec.  The Shepherd has code
like this:

--8<---------------cut here---------------start------------->8---
(define %set-automatic-finalization-enabled?!
  ;; When using a statically-linked Guile, for instance in the initrd, we
  ;; cannot resolve this symbol, but most of the time we don't need it
  ;; anyway.  Thus, delay it.
  (let ((proc (delay
                (pointer->procedure int
                                    (dynamic-func
                                     "scm_set_automatic_finalization_enabled"
                                     (dynamic-link))
                                    (list int)))))
    (lambda (enabled?)
      "Switch on or off automatic finalization in a separate thread.
Turning finalization off shuts down the finalization thread as a side effect."
      (->bool ((force proc) (if enabled? 1 0))))))

(define-syntax-rule (without-automatic-finalization exp ...)
  "Turn off automatic finalization within the dynamic extent of EXP."
  (let ((enabled? #t))
    (dynamic-wind
      (lambda ()
        (set! enabled? (%set-automatic-finalization-enabled?! #f)))
      (lambda ()
        exp ...)
      (lambda ()
        (%set-automatic-finalization-enabled?! enabled?)))))
--8<---------------cut here---------------end--------------->8---

Problem is, we cannot use the FFI on the statically-linked Guile.

We could implement fork+exec in C, but we don’t have a C compiler at
this early bootstrap stage.

Thoughts?

Ludo’.



Reply via email to