Kevin Ryde <[EMAIL PROTECTED]> writes:

> Looks about right.  What's the child process doing?  It's supposed to
> be writing to the parent to say continue.  (Unless it failed to fork
> there should be some child, either running or a zombie.)

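For readers following the thread: the "writing to the parent to say continue" step mentioned above can be pictured as a one-byte handshake over a pipe. The C sketch below is only an illustration under that assumption. The names `wait_for_child_go`, `child_crashes`, and `example_usage` are hypothetical, and none of this is taken from Guile's popen implementation. It shows a parent telling apart a child that signalled from one that died before signalling.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not Guile's code): fork a child and wait for it
   to send one "continue" byte over a pipe.
   Returns 1 if the byte arrived, 0 if the child went away without
   sending it (the parent's read sees EOF), and -1 on error.  */
int
wait_for_child_go (int child_crashes)
{
  int fds[2];
  if (pipe (fds) < 0)
    return -1;

  pid_t pid = fork ();
  if (pid < 0)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }

  if (pid == 0)
    {
      /* Child: keep only the write end.  */
      close (fds[0]);
      if (child_crashes)
        _exit (1);              /* stands in for a crash: no byte is sent */
      char go = 'c';
      ssize_t w = write (fds[1], &go, 1);
      _exit (w == 1 ? 0 : 1);
    }

  /* Parent: close our own copy of the write end first.  Otherwise
     read() could never report EOF if the child died early.  */
  close (fds[1]);

  char buf;
  ssize_t n;
  do
    n = read (fds[0], &buf, 1);
  while (n < 0 && errno == EINTR);
  close (fds[0]);

  /* Reap the child so it does not linger as a zombie.  */
  while (waitpid (pid, NULL, 0) < 0 && errno == EINTR)
    ;

  if (n == 1)
    return 1;
  return n == 0 ? 0 : -1;
}

/* Usage: one healthy child, one that exits before signalling.  */
void
example_usage (void)
{
  assert (wait_for_child_go (0) == 1);
  assert (wait_for_child_go (1) == 0);
}
```

In this sketch, a parent that kept its copy of the write end open would block in read() indefinitely once the child died. That is one way a hang of this general kind can arise; whether it is what happens in popen.test is not established here.
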
(Consider the following info preliminary.  I haven't had any time to
try and figure out the actual cause, but since I just discovered this,
and I have to stop for the moment, I wanted to let everyone else have
a look.)

After further investigation, it appears that particular child might
not be running, at least not on some of the runs.

I switched back to the original code (the code that would hang), added
some debug statements, ran strace -p -s 100, etc. on "make check", and
found that the child appears to be segfaulting at least some of the
time here (in popen.scm):

  (port-for-each
   (lambda (pt-entry)
     ;;(dbg-out (list 'pt-entry pt-entry))
     (false-if-exception
      (let ((pt-fileno (fileno pt-entry)))
        (if (not (or (= pt-fileno input-fdes)
                     (= pt-fileno output-fdes)
                     (= pt-fileno error-fdes)))
            (close-fdes pt-fileno))))))

When I uncomment the dbg-out statement above (which just writes the
arg and a newline to an output-port and then forces the output), I see
this on the console:

  ERROR: popen.test: open-output-pipe: no duplicate - arguments:
  ((wrong-type-arg "list-copy" "Wrong type argument in position ~A: ~S"
  (1 (pt-entry . #<freed cell 0x40305830; GC missed a reference>))
  ((pt-entry . #<freed cell 0x40305830; GC missed a reference>))))

this in the dbg-out output file:

  ...
  (pt-entry #<output: string 81079e0>)
  (pt-entry #<output: string 8106650>)
  (pt-entry #<freed cell 0x40643c18; GC missed a reference>)

and this in the strace (1402 is the forked child process):

  1402 write(7, "ERROR: popen.test: open-output-pipe: no duplicate - arguments: ((wrong-type-arg \"list-copy\" \"Wrong t"..., 263) = -1 EBADF (Bad file descriptor)
  1402 write(2, "ERROR", 5) = -1 EBADF (Bad file descriptor)
  1402 write(2, "\nException during displaying of ", 32) = -1 EBADF (Bad file descriptor)
  1402 write(7, "ERROR: popen.test: open-output-pipe: no duplicate - arguments: ((wrong-type-arg \"list-copy\" \"Wrong t"..., 263) = -1 EBADF (Bad file descriptor)
  1402 write(2, "ERROR", 5) = -1 EBADF (Bad file descriptor)
  1402 write(2, "\nException during displaying of ", 32) = -1 EBADF (Bad file descriptor)
  1402 exit_group(1) = ?

If I omit the dbg-out statement in the above code, then I can just see
the child die due to a SEGV in the strace log (2126 is the child):

  2126 close(12 <unfinished ...>
  2123 <... close resumed> ) = 0
  2126 <... close resumed> ) = 0
  2123 access("/etc/ld.so.nohwcap", F_OK <unfinished ...>
  2126 close(10 <unfinished ...>
  2123 <... access resumed> ) = -1 ENOENT (No such file or directory)
  2126 <... close resumed> ) = 0
  2123 open("/lib/tls/i686/cmov/libdl.so.2", O_RDONLY <unfinished ...>
  2126 close(29 <unfinished ...>
  2123 <... open resumed> ) = 5
  2126 <... close resumed> ) = 0
  2123 read(5, <unfinished ...>
  2126 --- SIGSEGV (Segmentation fault) @ 0 (0) ---

So I started from a clean tree, enabled core dumps, and here's what
gdb had to say about the resulting core:

  Program terminated with signal 11, Segmentation fault.
  #0  0x400729ca in scm_fileno (port=0x0) at ioext.c:180
  180       port = SCM_COERCE_OUTPORT (port);
  (gdb) where
  #0  0x400729ca in scm_fileno (port=0x0) at ioext.c:180
  #1  0x4005ad41 in ceval (x=0x404, env=0x40372710) at eval.c:4218
  #2  0x4005b26e in ceval (x=<value optimized out>, env=0x40372710) at eval.c:3634

In any case, as I said, consider all this preliminary.  For everything
but the core dump, I wasn't working from a clean tree.

-- 
Rob Browning
rlb @defaultvalue.org and @debian.org; previously @cs.utexas.edu
GPG starting 2002-11-03 = 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel