On Wed, Oct 30, 2013 at 02:29:00PM +0400, Fedorov Sergey wrote:
> On 10/29/2013 06:55 PM, Stefan Hajnoczi wrote:
> >On Mon, Oct 21, 2013 at 03:44:46PM +0400, Fedorov Sergey wrote:
> >>After our discussion about this patch I decided to keep my patch in
> >>our branch until rebase onto a new release. Recently I have rebased
> >>our branch onto v1.5.3 and reverted my patch. Then I faced an issue
> >>when using user-mode networking with USB network device for mounting
> >>root file system through NFS. Fragmented UDP packets from host to
> >>guest are not handled properly. Seems that some fragments are lost
> >>or somehow stalled. See guest tcpdump log below.
> >>
> >>03:16:52.259690 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> >>proto UDP (17), length 164)
> >>    10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh
> >> Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
> >>512 bytes @ 0 max 4096 verf 0000000000000000
> >>03:16:52.262323 IP (tos 0x0, ttl 64, id 16, offset 0, flags [+],
> >>proto UDP (17), length 1500)
> >>    10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus
> >>POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> >>03:16:52.264592 IP (tos 0x0, ttl 64, id 16, offset 1480, flags [+],
> >>proto UDP (17), length 1500)
> >>    10.0.2.2 > 10.0.2.15: udp
> >>03:16:54.462961 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> >>proto UDP (17), length 164)
> >>    10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh
> >> Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
> >>512 bytes @ 0 max 4096 verf 0000000000000000
> >>03:16:54.466300 IP (tos 0x0, ttl 64, id 17, offset 0, flags [+],
> >>proto UDP (17), length 1500)
> >>    10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus
> >>POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> >>03:16:54.467084 IP (tos 0x0, ttl 64, id 17, offset 1480, flags [+],
> >>proto UDP (17), length 1500)
> >>    10.0.2.2 > 10.0.2.15: udp
> >>...
> >>
> >>I didn't investigate the cause of the problem in detail. I just reverted
> >>
> >>commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
> >>Author: Luigi Rizzo <ri...@iet.unipi.it>
> >>Date: Tue Feb 5 17:53:31 2013 +0100
> >>
> >>    net: fix qemu_flush_queued_packets() in presence of a hub
> >>
> >>And then applied my patch. After that everything works fine for me.
> >>See guest tcpdump log below.
> >>
> >>04:45:15.897245 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> >>proto UDP (17), length 164)
> >>    10.0.2.15.3642011847 > 10.0.2.2.nfs: 136 readdirplus fh
> >> Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
> >>512 bytes @ 0 max 4096 verf 0000000000000000
> >>04:45:15.899686 IP (tos 0x0, ttl 64, id 15, offset 0, flags [+],
> >>proto UDP (17), length 1500)
> >>    10.0.2.2.nfs > 10.0.2.15.3642011847: reply ok 1472 readdirplus
> >>POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> >>04:45:15.906253 IP (tos 0x0, ttl 64, id 15, offset 1480, flags [+],
> >>proto UDP (17), length 1500)
> >>    10.0.2.2 > 10.0.2.15: udp
> >>04:45:15.906687 IP (tos 0x0, ttl 64, id 15, offset 2960, flags
> >>[none], proto UDP (17), length 240)
> >>    10.0.2.2 > 10.0.2.15: udp
> >>
> >>So there must be something wrong with the already applied patch. What
> >>could you suggest?
> >The next step is to investigate the cause.
> >
> >Perhaps hw/usb/dev-network.c:usb_net_handle_datain() is not calling
> >qemu_flush_queued_packets() every time in_buf[] is read completely.
> >This if statement looks strange to me:
> >
> >if (s->in_ptr >= s->in_len &&
> >    (is_rndis(s) || (s->in_len & (64 - 1)) || !len)) {
> >    /* no short packet necessary */
> >    usb_net_reset_in_buf(s);
> >}
> >
> >Try placing printfs to find out whether qemu_flush_queued_packets() is
> >getting called when you see packet loss.
> >
> >Stefan
> >
>
> Seems that I have figured out the problem. net_hub_flush() does not
> flush the source port. And qemu_flush_queued_packets() also returns
> after calling net_hub_flush(). So I think the problem is that
> neither qemu_flush_queued_packets() nor net_hub_flush() calls
> qemu_net_queue_flush() for the source port. So I think it should be
> fixed in qemu_flush_queued_packets() by removing the return
> statement after calling net_hub_flush(). That fix does work for me.
> So I could submit that patch after getting permission for that.
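
For anyone following the thread, the control flow described above can be
sketched as a small standalone C model. This is not QEMU source: ToyQueue,
ToyHub and the toy_* functions are hypothetical names invented for
illustration, queues are reduced to a pending-packet counter, and the
return_early flag only exists to put the reported behaviour and the proposed
fix side by side.

```c
#include <stdbool.h>

#define TOY_NUM_PORTS 3

/* Hypothetical stand-in for a send queue: only counts pending packets. */
typedef struct {
    int pending;
} ToyQueue;

/* Hypothetical hub: one queue per port. */
typedef struct {
    ToyQueue ports[TOY_NUM_PORTS];
} ToyHub;

/* Drain one queue; report whether anything was actually flushed. */
static bool toy_queue_flush(ToyQueue *q)
{
    bool flushed = q->pending > 0;
    q->pending = 0;
    return flushed;
}

/* Models the hub flush as described in the thread:
 * every port is flushed except the source port. */
static bool toy_hub_flush(ToyHub *hub, int source)
{
    bool flushed = false;
    for (int i = 0; i < TOY_NUM_PORTS; i++) {
        if (i != source) {
            flushed |= toy_queue_flush(&hub->ports[i]);
        }
    }
    return flushed;
}

/* return_early == true : return right after the hub flush, so the
 *                        source port's queue is never drained.
 * return_early == false: the proposed fix, fall through and flush
 *                        the source port's queue as well. */
static bool toy_flush_queued_packets(ToyHub *hub, int source, bool return_early)
{
    bool flushed = toy_hub_flush(hub, source);
    if (return_early) {
        return flushed;
    }
    flushed |= toy_queue_flush(&hub->ports[source]);
    return flushed;
}
```

In this model, calling toy_flush_queued_packets() with return_early set
leaves the source port's pending count untouched, which would match packets
stalling in the source queue; with return_early cleared, every queue
including the source one ends up empty.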

Sounds good to me. I have CCed Luigi in case he wants to comment.

Stefan