Hi,

The problem doesn't seem to happen with the latest ovs 2.5.0 git branch (commit
ac93328273238b5dc86353222264fa4f30ad95e8, dpdk 16.04).
These are the stack traces we got with the 2.5.0 release.

Before running traffic:

(gdb) info threads
  Id   Target Id         Frame
  29   Thread 0x7fb44f004700 (LWP 4498) "dpdk_watchdog1" 0x00007fb44f0bef2d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  28   Thread 0x7fb44e803700 (LWP 4499) "vhost_thread2" 0x00007fb44f0e6ae3 in select () at ../sysdeps/unix/syscall-template.S:81
  27   Thread 0x7fb44e002700 (LWP 4500) "urcu3" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  26   Thread 0x7fb3fbfff700 (LWP 4601) "handler82" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  25   Thread 0x7fb418ff9700 (LWP 4602) "handler79" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  24   Thread 0x7fb4197fa700 (LWP 4603) "handler78" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  23   Thread 0x7fb419ffb700 (LWP 4604) "handler77" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  22   Thread 0x7fb44d400700 (LWP 4605) "handler80" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  21   Thread 0x7fb44cbff700 (LWP 4606) "handler81" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  20   Thread 0x7fb43ffff700 (LWP 4607) "handler83" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  19   Thread 0x7fb43f7fe700 (LWP 4608) "handler84" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  18   Thread 0x7fb43effd700 (LWP 4609) "handler85" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  17   Thread 0x7fb43e7fc700 (LWP 4610) "handler86" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  16   Thread 0x7fb43dffb700 (LWP 4611) "handler87" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  15   Thread 0x7fb43d7fa700 (LWP 4612) "handler89" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  14   Thread 0x7fb43cff9700 (LWP 4613) "handler88" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  13   Thread 0x7fb423fff700 (LWP 4614) "handler91" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  12   Thread 0x7fb4237fe700 (LWP 4615) "handler90" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  11   Thread 0x7fb422ffd700 (LWP 4616) "handler92" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  10   Thread 0x7fb4227fc700 (LWP 4617) "handler93" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  9    Thread 0x7fb421ffb700 (LWP 4618) "revalidator94" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  8    Thread 0x7fb4217fa700 (LWP 4619) "revalidator95" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  7    Thread 0x7fb420ff9700 (LWP 4620) "revalidator96" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  6    Thread 0x7fb41bfff700 (LWP 4621) "revalidator97" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  5    Thread 0x7fb41b7fe700 (LWP 4622) "revalidator98" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  4    Thread 0x7fb41affd700 (LWP 4623) "revalidator99" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  3    Thread 0x7fb41a7fc700 (LWP 4624) "revalidator100" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  2    Thread 0x7fb3fb7fe700 (LWP 4625) "pmd101" 0x00000000005c8088 in dp_netdev_process_rxq_port.isra ()
* 1    Thread 0x7fb45074eb00 (LWP 4497) "ovs-vswitchd" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fb3fb7fe700 (LWP 4625))]
#0  0x00000000005c8088 in dp_netdev_process_rxq_port.isra ()
(gdb) bt
#0  0x00000000005c8088 in dp_netdev_process_rxq_port.isra ()
#1  0x00000000005c84aa in pmd_thread_main ()
#2  0x0000000000648c54 in ovsthread_wrapper ()
#3  0x00007fb44f8c10a4 in start_thread (arg=0x7fb3fb7fe700) at pthread_create.c:309
#4  0x00007fb44f0ed87d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

After running traffic and ovs stuck:

(gdb) info threads
  Id   Target Id         Frame
  29   Thread 0x7fb44f004700 (LWP 4498) "dpdk_watchdog1" 0x00007fb44f0bef2d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  28   Thread 0x7fb44e803700 (LWP 4499) "vhost_thread2" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  27   Thread 0x7fb44e002700 (LWP 4500) "urcu3" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  26   Thread 0x7fb3fbfff700 (LWP 4601) "handler82" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  25   Thread 0x7fb418ff9700 (LWP 4602) "handler79" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  24   Thread 0x7fb4197fa700 (LWP 4603) "handler78" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  23   Thread 0x7fb419ffb700 (LWP 4604) "handler77" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  22   Thread 0x7fb44d400700 (LWP 4605) "handler80" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  21   Thread 0x7fb44cbff700 (LWP 4606) "handler81" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  20   Thread 0x7fb43ffff700 (LWP 4607) "handler83" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  19   Thread 0x7fb43f7fe700 (LWP 4608) "handler84" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  18   Thread 0x7fb43effd700 (LWP 4609) "handler85" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  17   Thread 0x7fb43e7fc700 (LWP 4610) "handler86" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  16   Thread 0x7fb43dffb700 (LWP 4611) "handler87" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  15   Thread 0x7fb43d7fa700 (LWP 4612) "handler89" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  14   Thread 0x7fb43cff9700 (LWP 4613) "handler88" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  13   Thread 0x7fb423fff700 (LWP 4614) "handler91" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  12   Thread 0x7fb4237fe700 (LWP 4615) "handler90" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  11   Thread 0x7fb422ffd700 (LWP 4616) "handler92" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  10   Thread 0x7fb4227fc700 (LWP 4617) "handler93" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  9    Thread 0x7fb421ffb700 (LWP 4618) "revalidator94" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  8    Thread 0x7fb4217fa700 (LWP 4619) "revalidator95" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  7    Thread 0x7fb420ff9700 (LWP 4620) "revalidator96" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  6    Thread 0x7fb41bfff700 (LWP 4621) "revalidator97" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  5    Thread 0x7fb41b7fe700 (LWP 4622) "revalidator98" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  4    Thread 0x7fb41affd700 (LWP 4623) "revalidator99" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  3    Thread 0x7fb41a7fc700 (LWP 4624) "revalidator100" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  2    Thread 0x7fb3fb7fe700 (LWP 4625) "pmd101" 0x00000000004664c9 in rte_vhost_dequeue_burst ()
* 1    Thread 0x7fb45074eb00 (LWP 4497) "ovs-vswitchd" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fb3fb7fe700 (LWP 4625))]
#0  0x00000000004664c9 in rte_vhost_dequeue_burst ()
(gdb) bt
#0  0x00000000004664c9 in rte_vhost_dequeue_burst ()
#1  0x000000000069fb85 in netdev_dpdk_vhost_rxq_recv ()
#2  0x00000000005f25d1 in netdev_rxq_recv ()
#3  0x00000000005c8076 in dp_netdev_process_rxq_port.isra ()
#4  0x00000000005c84aa in pmd_thread_main ()
#5  0x0000000000648c54 in ovsthread_wrapper ()
#6  0x00007fb44f8c10a4 in start_thread (arg=0x7fb3fb7fe700) at pthread_create.c:309
#7  0x00007fb44f0ed87d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Detach and re-attach:

(gdb) thread 2
[Switching to thread 2 (Thread 0x7fb450719b00 (LWP 5315))]
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
238     ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S: No such file or directory.
(gdb) bt
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007fb44f6b3ce3 in handle_fildes_io (arg=<optimized out>) at ../sysdeps/pthread/aio_misc.c:645
#2  0x00007fb44f8c10a4 in start_thread (arg=0x7fb450719b00) at pthread_create.c:309
#3  0x00007fb44f0ed87d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Detach and re-attach:

(gdb) thread 2
[Switching to thread 2 (Thread 0x7fb3fb7fe700 (LWP 4625))]
#0  0x0000000000463844 in rte_pktmbuf_free ()
(gdb) bt
#0  0x0000000000463844 in rte_pktmbuf_free ()
#1  0x00000000004668f4 in rte_vhost_dequeue_burst ()
#2  0x000000000069fb85 in netdev_dpdk_vhost_rxq_recv ()
#3  0x00000000005f25d1 in netdev_rxq_recv ()
#4  0x00000000005c8076 in dp_netdev_process_rxq_port.isra ()
#5  0x00000000005c84aa in pmd_thread_main ()
#6  0x0000000000648c54 in ovsthread_wrapper ()
#7  0x00007fb44f8c10a4 in start_thread (arg=0x7fb3fb7fe700) at pthread_create.c:309
#8  0x00007fb44f0ed87d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thanks
 

On Tuesday, 26 April 2016 11:29 AM, Daniele Di Proietto <diproiet...@ovn.org> wrote:

2016-04-26 9:08 GMT-07:00 Traynor, Kevin <kevin.tray...@intel.com>:

> -----Original Message-----
> From: discuss [mailto:discuss-boun...@openvswitch.org] On Behalf Of
> Kochba, Alon
> Sent: Tuesday, April 26, 2016 4:38 PM
> To: Ben Pfaff <b...@ovn.org>; Yi Ba <yby.develo...@yahoo.com>
> Cc: b...@openvswitch.org
> Subject: Re: [ovs-discuss] ovs get stuck when running traffic from VM
> to VM on same compute
>
> Hi Ben,
>
> Could you point us to the commit that fixed this issue?
> We already tried patching with this commit, which seemed relevant, but
> the issue still reproduced:
> https://github.com/openvswitch/ovs/commit/f519a72d9a3708fbc5f796f176e7c8bd3dcfb738
>
> We will retry with your suggestion of using the 2.5 branch code, but
> we might want to backport the specific fix unless there is a 2.5.1
> release including it.
> If the commit linked above is the one you were thinking of, please
> note a small difference - in the commit the rcu is blocked waiting for
> vhost_thread to quiesce, while in our case rcu is blocked waiting for
> pmd to quiesce.
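
To illustrate the distinction drawn in the quoted message, here is a toy model of quiescent-state-based RCU. It is only a sketch: the class `ToyRcu` and its methods are hypothetical names invented for this example and are not taken from the OVS source, and real implementations involve actual threads and blocking. The point it shows is that a grace period cannot end until every registered thread has quiesced, so a single thread that never reaches a quiescent point (whether it is a vhost thread or a pmd thread) is enough to block the RCU thread.

```python
# Toy, single-threaded model of quiescent-state-based RCU.
# All names are hypothetical; this is not the OVS implementation.

class ToyRcu:
    def __init__(self, thread_names):
        self.global_seq = 0
        # Last global sequence number each registered thread has acknowledged.
        self.seen = {name: 0 for name in thread_names}

    def quiesce(self, name):
        """Called by a thread at a point where it holds no RCU-protected pointers."""
        self.seen[name] = self.global_seq

    def start_grace_period(self):
        """Begin a new grace period; every thread must quiesce after this."""
        self.global_seq += 1
        return self.global_seq

    def blockers(self, target):
        """Threads that have not quiesced since grace period `target` began."""
        return sorted(n for n, s in self.seen.items() if s < target)


rcu = ToyRcu(["vhost_thread2", "handler82", "pmd101"])
target = rcu.start_grace_period()

# Threads that return to their main loop quiesce normally.
rcu.quiesce("vhost_thread2")
rcu.quiesce("handler82")
# "pmd101" is modeled as stuck inside a receive call, so it never quiesces.

stuck = rcu.blockers(target)
print(stuck)
```

In this model the grace period stays open for as long as `blockers(target)` is non-empty, which corresponds to the RCU thread waiting on the pmd thread in the report above.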

It sounds similar to the problem that this commit fixed. If so the fix
is applied to master and 2.5 branches.

https://github.com/openvswitch/ovs/commit/61c4e39460a7db3be7262a3b2af767a84167a9d8


Could you try applying the above commit and see if it fixes the problem?

If you manage to reproduce the problem, could you get a backtrace of the 
blocked thread (pmd101 in this case)?
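
When a saved `info threads` listing is available, one way to pick out the thread worth backtracing is to filter away the threads parked in blocking calls. The sketch below assumes the listing has one thread per line in the usual gdb layout; the helper `busy_threads` and the `IDLE_CALLS` set are hypothetical and are not part of gdb or OVS. The sample lines are copied from the listing posted in this thread.

```python
import re

# Hypothetical helper, not part of gdb or OVS: list threads whose top frame
# is not one of a few well-known blocking calls.
IDLE_CALLS = {"poll", "select", "nanosleep"}

# Matches entries that include a frame address, e.g.
#   (LWP 4625) "pmd101" 0x00000000004664c9 in rte_vhost_dequeue_burst ()
THREAD_RE = re.compile(
    r'\(LWP (\d+)\)\s*"([^"]+)"\s*0x[0-9a-f]+ in ([\w.@]+) \(\)'
)

def busy_threads(info_threads_text):
    """Return (lwp, name, function) for threads not parked in IDLE_CALLS."""
    result = []
    for match in THREAD_RE.finditer(info_threads_text):
        lwp, name, func = match.groups()
        if func not in IDLE_CALLS:
            result.append((int(lwp), name, func))
    return result

SAMPLE = """\
  28   Thread 0x7fb44e803700 (LWP 4499) "vhost_thread2" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  27   Thread 0x7fb44e002700 (LWP 4500) "urcu3" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
  2    Thread 0x7fb3fb7fe700 (LWP 4625) "pmd101" 0x00000000004664c9 in rte_vhost_dequeue_burst ()
* 1    Thread 0x7fb45074eb00 (LWP 4497) "ovs-vswitchd" 0x00007fb44f0e4d3d in poll () at ../sysdeps/unix/syscall-template.S:81
"""

busy = busy_threads(SAMPLE)
print(busy)
```

Note that a pmd thread polls continuously by design, so appearing in this list does not by itself mean the thread is stuck; repeated samples showing the same frames, as in the traces above, are a stronger hint.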

Thanks,

Daniele 


  
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
