On Sun, Feb 07, 2016 at 08:49:32PM +0100, Manuel Bouyer wrote:
> On Sun, Feb 07, 2016 at 03:23:58PM +0100, Manuel Bouyer wrote:
> > It looks like the read(2) syscall returns EAGAIN when the caller
> > expects it to block if there's no data available.
>
> I could capture the pipe setup before getting the stream of EAGAIN:
> 20110 1 nagios EMUL "netbsd"
> 20110 1 nagios CALL read(3,0xbb51a000,0x80000)
> 20110 1 nagios GIO fd 3 read 114 bytes
> "job_id=5513\0type=0\0command=/usr/pkg/libexec/nagios/check_snmp -H 10.\
> 128.12.0 -o .1.3.6.1.2.1.1.5.0\0timeout=60\0\^A\0\0\0"
> 20110 1 nagios RET read 114/0x72
> 20110 1 nagios CALL __gettimeofday50(0xbb59d08c,0)
> 20110 1 nagios RET __gettimeofday50 0
> 20110 1 nagios CALL __gettimeofday50(0xbf7fe4a4,0)
> 20110 1 nagios RET __gettimeofday50 0
> 20110 1 nagios CALL __gettimeofday50(0xbf7fe4ac,0)
> 20110 1 nagios RET __gettimeofday50 0
> 20110 1 nagios CALL __gettimeofday50(0xbf7fe468,0)
> 20110 1 nagios RET __gettimeofday50 0
> 20110 1 nagios CALL pipe
> 20110 1 nagios RET pipe 4, 5
> 20110 1 nagios CALL pipe
> 20110 1 nagios RET pipe 6, 7
> 20110 1 nagios CALL fcntl(4,4,4)
> 20110 1 nagios RET fcntl 0
> 20110 1 nagios CALL fcntl(6,4,4)
> 20110 1 nagios RET fcntl 0
> 20110 1 nagios CALL fork
> 20110 1 nagios RET fork 1822/0x71e
> 20110 1 nagios CALL close(5)
> 20110 1 nagios RET close 0
> 20110 1 nagios CALL close(7)
> 20110 1 nagios RET close 0
> 20110 1 nagios CALL read(4,0xbf7fd4e0,0x1000)
> 20110 1 nagios RET read -1 errno 35 Resource temporarily unavailable
> 20110 1 nagios CALL read(4,0xbf7fd4e0,0x1000)
> 20110 1 nagios RET read -1 errno 35 Resource temporarily unavailable
> 20110 1 nagios CALL read(4,0xbf7fd4e0,0x1000)
> 20110 1 nagios RET read -1 errno 35 Resource temporarily unavailable
>
> If I read this properly, the 2 fcntl calls do a F_SETFL with
> O_NONBLOCK. So it looks normal for the read to return EAGAIN in this case.
>
> What I find strange is that there's no call to poll(2) or select(2)
> before the call to read(2).
> If I read the sources properly the process should use poll(2) before calling
> read(2). I can't see why this would not show up in the ktrace output.
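
For reference, on NetBSD F_SETFL is 4 and O_NONBLOCK is 0x4, which is how
fcntl(4,4,4) in the trace decodes, and errno 35 is EAGAIN. A minimal sketch
of what that implies for a pipe that has no data yet; the function name is
made up for illustration and this is not code taken from nagios:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create a pipe, set O_NONBLOCK on the read end with a bare F_SETFL
 * (as the trace does), and read from it while it is still empty.
 * Returns the errno from read(2), 0 if read(2) unexpectedly succeeded,
 * or -1 if the setup itself failed.
 * Hypothetical helper, for illustration only.
 */
int
errno_from_empty_nonblocking_pipe(void)
{
	int fds[2];
	char buf[16];
	ssize_t n;
	int err;

	if (pipe(fds) == -1)
		return -1;
	if (fcntl(fds[0], F_SETFL, O_NONBLOCK) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	/* The write end is still open and nothing was written. */
	n = read(fds[0], buf, sizeof(buf));
	err = (n == -1) ? errno : 0;

	close(fds[0]);
	close(fds[1]);
	return err;
}
```

With the write end open and nothing written, the read should fail with
EAGAIN rather than block, which matches the three read(4,...) calls above.
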

There is a call to poll. What happens is that it gets a POLLIN event for
both fd 3 (which really has data to read) and fd 4 (which doesn't).
The read callback for fd 4 expects to be called only when there's really
data to be read, and if read returns EAGAIN it loops until it gets data.

poll is called with a set of descriptors, and returns that there is
1 descriptor ready to be read. But the POLLIN flag is set for both
descriptors 3 and 4.

Now the question is why is the POLLIN flag set when there's no data to read?
Zeroing out revents before calling poll(2) doesn't help.

The man page says:
     This implementation differs from the historical one in that a given
     file descriptor may not cause poll() to return with an error. In
     cases where this would have happened in the historical implementation
     (e.g. trying to poll a revoke(2)d descriptor), this implementation
     instead copies the events bitmask to the revents bitmask. Attempting
     to perform I/O on this descriptor will then return an error. This
     behaviour is believed to be more useful.

Does it do so if the file descriptor's error is EAGAIN? If so that's
not very useful ...

--
Manuel Bouyer <bou...@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
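
To illustrate the callback behaviour described above: a rough sketch of a
reader that treats EAGAIN after a POLLIN as "not ready yet" and hands
control back to the poll loop instead of spinning on read(2). This is not
the nagios code; read_if_ready and READ_NOT_READY are hypothetical names,
and the sketch assumes the descriptor is non-blocking so that a spurious
POLLIN cannot make read(2) block.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

#define READ_NOT_READY (-2)	/* hypothetical sentinel, not a system value */

/*
 * Poll a single descriptor for input, then try to read from it.
 * Returns the byte count from read(2) (0 means EOF), -1 with errno set
 * on a real error, or READ_NOT_READY when there is nothing to read,
 * including the case where poll(2) reported an event but read(2)
 * answered EAGAIN.
 */
ssize_t
read_if_ready(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd;
	ssize_t n;
	int rv;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	rv = poll(&pfd, 1, timeout_ms);
	if (rv == -1)
		return -1;		/* errno set by poll(2) */
	if (rv == 0)
		return READ_NOT_READY;	/* timed out, no event reported */

	/* Some event was reported; let read(2) say what is really there. */
	n = read(fd, buf, len);
	if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return READ_NOT_READY;	/* spurious wakeup: poll again */
	return n;
}
```

A caller would go back to its poll loop on READ_NOT_READY rather than
retrying the read immediately. That only hides the symptom; it does not
explain why POLLIN is set for fd 4 in the first place.
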