On Sat, 18 Jun 2022, Jacob Moody wrote:
I've attempted to reproduce it, trying to remove the libthread/notify
factors. I've come up with this:
#include <u.h>
#include <libc.h>

/* Send one UDP request, then block in read() waiting for a reply. */
static void
proc_udp(void*)
{
	char resp[512];
	char req[] = "request";
	int fd;
	int n;
	int pid;

	fd = dial("udp!185.157.221.201!5678", nil, nil, nil);
	if(fd < 0)
		exits("can't dial");

	if(write(fd, req, strlen(req)) != strlen(req))
		exits("can't write");

	pid = getpid();
	fprint(1, "start %d\n", pid);
	n = read(fd, resp, sizeof(resp)-1);
	fprint(1, "end %d %d\n", pid, n);
	exits(nil);
}

void
main(int, char**)
{
	int i;
	Waitmsg *wm;

	/* Fork ten children, each running proc_udp(). */
	for(i = 0; i < 10; i++){
		switch(fork()){
		case -1:
			sysfatal("fork %r");
		case 0:
			proc_udp(nil);
			sysfatal("ret");
		default:
			break;
		}
	}

	/* Collect one Waitmsg per child. */
	for(i = 0; i < 10; i++){
		wm = wait();
		if(wm == nil)
			sysfatal("wait: %r");
		print("proc %d died with message %s\n", wm->pid, wm->msg);
		free(wm);
	}
	exits(nil);
}
This code makes it pretty obvious that we are losing some children;
on my machine this program never exits. Some of the readers correctly
return -1 and the parent gets their Waitmsg, but that does not happen
for all of them.
Moody, I think this old thread will interest you:
https://marc.info/?t=112730920400001&r=1&w=2
Russ Cox explained there:
It appears that your program, at its core, is doing this:
void
readproc(void *v)
{
	int fd;
	char buf[100];

	fd = (int)v;
	read(fd, buf, sizeof buf);
}

void
threadmain(int argc, char **argv)
{
	int p[2];

	pipe(p);
	proccreate(readproc, (void*)p[0], 8192);
	proccreate(readproc, (void*)p[1], 8192);
	close(p[0]);
	/* and here you expect the first readproc to be done */
	close(p[1]);
	/* and here the second */
}
Each read call is holding up a reference to its channel
inside the kernel, so that even though you've closed the fd
and removed the ref from the fd table, there is still a reference
to each side of the pipe in the form of the process blocked
on the read.
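
(Illustration, not part of the quoted message.) The reference counting
described above can be replayed with a toy model. This is only a sketch:
Chan here is a made-up two-field struct, and chan_incref, chan_decref
and replay are hypothetical names, not the kernel's code. It assumes one
reference for the fd table entry and one for the blocked read.

```c
#include <assert.h>

/* Toy stand-in for a kernel channel: only the reference count matters. */
typedef struct Chan Chan;
struct Chan {
	int ref;	/* fd table entries plus system calls in progress */
	int closed;	/* set once the last reference is dropped */
};

static void
chan_incref(Chan *c)
{
	c->ref++;
}

static void
chan_decref(Chan *c)
{
	if(--c->ref == 0)
		c->closed = 1;	/* only now would the pipe end be torn down */
}

/*
 * Replay the sequence: the fd is opened, a read blocks on it, the fd
 * is closed, and finally the read returns.  Returns 1 if the channel
 * was still alive after close() and torn down only after the read.
 */
int
replay(void)
{
	Chan c = {0, 0};
	int alive_after_close;

	chan_incref(&c);	/* pipe(): the fd table holds a ref */
	chan_incref(&c);	/* read(): the blocked call holds its own ref */
	chan_decref(&c);	/* close(): fd table ref dropped, one ref left */
	alive_after_close = !c.closed;
	chan_decref(&c);	/* read returns: last ref dropped */
	return alive_after_close && c.closed;
}
```

In this model close() alone never brings the count to zero; only the
return of the blocked read does.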
I've never been sure whether the implicit ref held during
the system call is good behavior, but it's hard to change.
In your case, writing 0 (or anything) makes the read
finish, releasing the last ref to the underlying pipe when
the system call finishes, and then everything cleans up
as expected. So you've found your workaround, and now
we understand why it works.
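
(Illustration, not part of the quoted message.) The workaround can be
sketched as follows. This is a POSIX analogue using pthreads, not
Plan 9 libthread code, and it makes no claim about how either kernel
counts references. struct reader and wake_blocked_reader are
hypothetical names. It writes one byte rather than zero, on the
assumption that a zero-length write cannot be relied on to wake a
reader on a POSIX pipe.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical helper state: the fd to read and the result of read(). */
struct reader {
	int fd;
	ssize_t n;
};

static void *
readproc(void *v)
{
	struct reader *r = v;
	char buf[100];

	/* Blocks until data arrives or the write end is closed. */
	r->n = read(r->fd, buf, sizeof buf);
	return NULL;
}

/*
 * Start a reader on a pipe, wake it by writing one byte, and close
 * the descriptors only after the reader has been given something to
 * return with.  Returns the reader's read() result, or -1 on a setup
 * failure.
 */
ssize_t
wake_blocked_reader(void)
{
	int p[2];
	pthread_t t;
	struct reader r;

	if(pipe(p) < 0)
		return -1;
	r.fd = p[0];
	r.n = -1;
	if(pthread_create(&t, NULL, readproc, &r) != 0){
		close(p[0]);
		close(p[1]);
		return -1;
	}

	write(p[1], "x", 1);	/* the workaround: give the read something */
	close(p[1]);		/* if the write failed, EOF still wakes the reader */
	pthread_join(t, NULL);	/* the reader has returned from read() */
	close(p[0]);
	return r.n;
}
```

Calling wake_blocked_reader() from main should return 1 when the write
succeeds, since the reader sees the single byte. Build with something
like cc -pthread.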
------------------------------------------
9fans: 9fans
Permalink:
https://9fans.topicbox.com/groups/9fans/Tfa6823048ad90a21-M6e48031f9e8673387c0b47b8