On 2013-08-01 19:15, Jan Kiszka wrote:
> Hi all,
>
> I'm tracking down a nasty stall of tap input over a custom 1.3.x QEMU
> version. Under certain load, our tap backend stops reading from the char
> device, and that even if we reset the guest. The frontend device
> (pcnet32) is able to receive (can_receive would return > 0), but the
                       ^^^^^^^

Yes, the pcnet lacks qemu_flush_queued_packets, like certain other NIC
models already have. We added that to pcnet_init and pcnet_start (patch
will follow), but that didn't make a difference, likely due to what I
described below.

Jan

> tap's fd is no longer registered with the iohandler list.
>
> I was digging into the involved code and found something fishy:
>
> net/tap.c:
> static void tap_send(void *opaque)
> {
>     ...
>     size = qemu_send_packet_async(&s->nc, buf, size,
>                                   tap_send_completed);
>     if (size == 0) {
>         tap_read_poll(s, false);
>     }
>
> So, if tap_send is registered for the mainloop polling (ie. can_receive
> returned true before starting to poll) but qemu_send_packet_async
> returns 0 now as qemu_can_send_packet/can_receive happens to report
> false in the meantime, we will disable read polling. If also write
> polling is off, the fd will be completely removed from the iohandler
> list. But even if write polling remains on, I wonder what should bring
> read polling back?
>
> We only have an unhandy reproduction scenario, so I wasn't able to
> confirm this theory on the target yet (and will not be before Monday,
> unfortunately). But any comments on this would be very welcome.
>
> Thanks,
> Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
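
To make the suspected sequence easier to discuss, here is a minimal toy model in C. It is only a sketch of the theory above, not QEMU code: ToyTap, toy_send_async, toy_tap_send and toy_flush are hypothetical names, and the assumption that a flush by the frontend is the only thing that turns read polling back on is exactly the part that still needs confirming on the target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in QEMU. */
typedef struct ToyTap {
    bool read_poll;        /* is the fd registered for read events? */
    bool peer_can_receive; /* what the frontend's can_receive would report */
    bool packet_queued;    /* a packet is parked, waiting for the peer */
} ToyTap;

/* Stand-in for an async send: returns 0 when the packet had to be queued. */
size_t toy_send_async(ToyTap *t, size_t size)
{
    if (!t->peer_can_receive) {
        t->packet_queued = true;
        return 0;
    }
    return size;
}

/* Stand-in for the read handler: turns read polling off on a deferred send. */
void toy_tap_send(ToyTap *t, size_t size)
{
    assert(t->read_poll); /* the handler only runs while read polling is on */
    if (toy_send_async(t, size) == 0) {
        t->read_poll = false;
    }
}

/*
 * Stand-in for a queue flush triggered by the frontend (assumption: this is
 * the only path in the model that re-enables read polling).
 */
void toy_flush(ToyTap *t)
{
    if (t->packet_queued && t->peer_can_receive) {
        t->packet_queued = false;
        t->read_poll = true;
    }
}
```

In this model, calling toy_tap_send while peer_can_receive is false clears read_poll, and flipping peer_can_receive back to true changes nothing by itself: read_poll only returns once toy_flush is called. If the frontend never flushes, or the flush does not reach the queued packet, the model stays stalled.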