Mechthild Buescher <mechthild.buesc...@ericsson.com> writes:

> Hi Aaron,
>
> I think that the vhost-sock-permissions is not needed - I will check
> whether it makes a difference. It's a leftover from an earlier
> configuration where we had the problem that qemu (started by libvirt)
> wasn't able to access the socket. That problem has been solved.

Can you share the solution?  It's unrelated, but this is something I'm
trying to solve at the moment.
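
For reference, the stopgap I've been using is just to fix up the
ownership/permissions on the vhost-user sockets after OVS creates them,
so the qemu process can open them.  A rough sketch (it assumes qemu runs
under a "qemu" group, which may not match your setup):

  chgrp qemu /var/run/openvswitch/vhost111 /var/run/openvswitch/vhost112
  chmod g+rw /var/run/openvswitch/vhost111 /var/run/openvswitch/vhost112

If you found something cleaner, I'd like to hear about it.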

> The tx queues are not configured by us, so I don't know where this
> value comes from. Maybe it's the default value? In any case, it's not
> intended.

Okay, that could well be the problem.  Please try setting the TX queues
explicitly first; if that resolves the crash, the issue is either in a
setup script or possibly in the code.
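
Roughly what I'd do to check (nothing new here, just the knobs you
already use, then re-reading the reported queue counts after each
change; whether the TX queue count can be set directly depends on the
OVS build, so treat this as a sketch rather than a recipe):

  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6
  ovs-vsctl set Interface dpdk0 options:n_rxq=2
  # compare configured_tx_queues vs. requested_tx_queues per port
  ovs-appctl dpif/show
  ovs-appctl dpif-netdev/pmd-rxq-show

If configured and requested counts still disagree after that, the 21
requested TX queues is where I'd dig next.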

> cat /proc/cpuinfo 
> processor     : 0
> vendor_id     : GenuineIntel
> cpu family    : 6
> model         : 62
> model name    : Intel(R) Xeon(R) CPU E5-2658 v2 @ 2.40GHz

Thanks for this; it helps narrow down which vectorized code path DPDK
will use.
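
One experiment that might narrow it down further: rebuild DPDK with the
vectorized ixgbe TX path disabled and see whether the crash goes away.
This is only a sketch for the stock DPDK 16.04 build system, and the
option name is from memory, so please verify it in config/common_base
before relying on it:

  # disable the vectorized ixgbe rx/tx routines, then rebuild
  sed -i 's/CONFIG_RTE_IXGBE_INC_VECTOR=y/CONFIG_RTE_IXGBE_INC_VECTOR=n/' \
      config/common_base
  make install T=x86_64-native-linuxapp-gcc

then rebuild OVS against that tree.  If the non-vectorized TX path does
not crash, we know where to look.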

> stepping      : 4
> microcode     : 0x40c
> cpu MHz               : 1200.000
> cache size    : 25600 KB
> physical id   : 0
> siblings      : 20
> core id               : 0
> cpu cores     : 10
> apicid                : 0
> initial apicid        : 0
> fpu           : yes
> fpu_exception : yes
> cpuid level   : 13
> wp            : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
> pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
> xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
> ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2
> x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida
> arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
> fsgsbase smep erms
> bogomips      : 4799.93
> clflush size  : 64
> cache_alignment       : 64
> address sizes : 46 bits physical, 48 bits virtual
> power management:
>
> Thanks again,
>
> BR/Mechthild
>
> -----Original Message-----
> From: Aaron Conole [mailto:acon...@redhat.com] 
> Sent: Monday, July 11, 2016 11:06 PM
> To: Mechthild Buescher
> Cc: Stokes, Ian; b...@openvswitch.org
> Subject: Re: [ovs-discuss] bug in ovs-vswitchd?!
>
> Mechthild Buescher <mechthild.buesc...@ericsson.com> writes:
>
>> Hi Aaron,
>>
>> Sorry for being unclear regarding the VM: I meant the DPDK usage 
>> inside the VM. So, the fault happens when using the VM. Inside the VM 
>> I can either bind the interfaces to DPDK or to linux - in both cases, 
>> the fault occurs.
>>
>> And I haven't applied any patch. I used the latest available version 
>> from the master branch - I don't know whether any patch is upstreamed 
>> to the master branch.
>
> Okay - I wonder what the vhost-sock-permissions command line is all about, 
> then?
>
> Can you confirm that 21 tx queues is not intended, then? (21 tx queues
> are showing in your configuration output.)
>
> Also, please send the cpu information (cat /proc/cpuinfo on the host).
>
>> Thanks in advance for your help,
>>
>> BR/Mechthild
>>
>> -----Original Message-----
>> From: Aaron Conole [mailto:acon...@redhat.com]
>> Sent: Monday, July 11, 2016 7:22 PM
>> To: Mechthild Buescher
>> Cc: Stokes, Ian; b...@openvswitch.org
>> Subject: Re: [ovs-discuss] bug in ovs-vswitchd?!
>>
>> Mechthild Buescher <mechthild.buesc...@ericsson.com> writes:
>>
>>> Hi Ian,
>>>
>>> Thanks for the fast reply! I also did some further investigation
>>> and could see that ovs-vswitchd usually stays alive when receiving
>>> packets but crashes when sending packets.
>>>
>>> Regarding your question:
>>> 1. We are running 1 VM with 2 vhost ports (in the simplified setup; 
>>> in the complete setup we use 1 VM & 5 vhost ports)
>>>
>>> 2. We are using libvirt to start the VM which is configured to use qemu:
>>> /usr/bin/qemu-system-x86_64 -name guest=ubuntu11_try,debug-threads=on
>>> -S -machine pc-i440fx-wily,accel=kvm,usb=off -cpu host -m 8192
>>> -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1
>>> -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/mnt/huge_1G/libvirt/qemu,share=yes,size=8589934592
>>> -numa node,nodeid=0,cpus=0-3,memdev=ram-node0
>>> -uuid 8a2ad7a3-9da1-4c69-a2ff-c7a680d9bc4a -no-user-config -nodefaults
>>> -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-300-ubuntu11_try/monitor.sock,server,nowait
>>> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
>>> -no-shutdown -boot strict=on
>>> -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
>>> -drive file=/root/perf/vms/ubuntu11.qcow2,format=qcow2,if=none,id=drive-virtio-disk0
>>> -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>>> -netdev tap,fd=21,id=hostnet0
>>> -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:3c:92:47,bus=pci.0,addr=0x3
>>> -netdev tap,fd=23,id=hostnet1
>>> -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:3c:a3:47,bus=pci.0,addr=0x4
>>> -chardev socket,id=charnet2,path=/var/run/openvswitch/vhost111
>>> -netdev type=vhost-user,id=hostnet2,chardev=charnet2
>>> -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:a0:11:02,bus=pci.0,addr=0x5
>>> -chardev socket,id=charnet3,path=/var/run/openvswitch/vhost112
>>> -netdev type=vhost-user,id=hostnet3,chardev=charnet3
>>> -device virtio-net-pci,netdev=hostnet3,id=net3,mac=52:54:00:a0:11:03,bus=pci.0,addr=0x6
>>> -chardev pty,id=charserial0
>>> -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0
>>> -device cirrus-vga,id=video0,bus=pci.0,addr=0x2
>>> -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on
>>>
>>> 3. In the VM we have both kind of interface bindings, virtio-pci and 
>>> igb_uio. For both type of interfaces, the crash of ovs-vswitchd can 
>>> be observed (The VM is still alive).
>>>
>>> 4. The ovs-vswitchd is started as follows and is configured to use a
>>> vxlan tunnel:
>>> ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
>>> ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x1
>>> ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=4096,0
>>> ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=6
>>> ovs-vsctl --no-wait set Interface dpdk0 options:n_rxq=2
>>> ovs-vsctl --no-wait set Interface vhost111 options:n_rxq=1
>>> ovs-vsctl --no-wait set Interface vhost112 options:n_rxq=1
>>> ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-sock-permissions=766
>>
>> Have you, perchance, applied some extra patches?  This was proposed,
>> but not accepted, as a possible workaround for a permissions issue
>> with ovs dpdk.
>>
>>> ovs-vswitchd --pidfile=$DB_PID --detach --monitor 
>>> --log-file=$LOG_FILE -vfile:dbg --no-chdir -vconsole:emer --mlockall 
>>> unix:$DB_SOCK
>>>
>>> ovs-vsctl add-port br-int vhost111 -- set Interface vhost111 type=dpdkvhostuser ofport_request=11
>>> ovs-vsctl add-port br-int vhost112 -- set Interface vhost112 type=dpdkvhostuser ofport_request=12
>>> ovs-vsctl add-br br-int -- set bridge br-int datapath_type=netdev
>>> ovs-vsctl set Bridge br-int other_config:datapath-id=0000f2b811144f41
>>> ovs-vsctl set Bridge br-int protocols=OpenFlow13
>>> ovs-vsctl add-port br-int vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=10.1.2.2 options:key=flow ofport_request=100
>>>
>>> 5. The ovs-log is attached - it contains the log from start to crash
>>> (with debug information). The crash has been provoked by setting up a
>>> virtio-pci interface in the VM, so no DPDK is used in the VM for this scenario.
>>>
>>> 6. The DPDK versions are:
>>> Host: dpdk 16.04 latest commit
>>> b3b9719f18ee83773c6ed7adda300c5ac63c37e9
>>> VM: (not used in this scenario) dpdk 2.2.0
>>
>> For confirmation, this happens whether or not you use a VM?  I just
>> want to make sure.  It's usually best to pair dpdk versions whenever
>> possible.
>>
>>> BR/Mechthild
>>>
>>> -----Original Message-----
>>> From: Stokes, Ian [mailto:ian.sto...@intel.com]
>>> Sent: Thursday, July 07, 2016 1:57 PM
>>> To: Mechthild Buescher; b...@openvswitch.org
>>> Subject: RE: bug in ovs-vswitchd?!
>>>
>>> Hi Mechthild,
>>>
>>> I've tried to reproduce this issue on my setup (Fedora 22, kernel
>>> 4.1.8) but have not been able to reproduce it.
>>>
>>> A few questions to help the investigation
>>>
>>> 1. Are you running 1 or 2 VMs in the setup (i.e. 1 VM with 2 vhost
>>> user ports, or 2 VMs with 1 vhost user port each)?
>>> 2. What are the parameters being used to launch the VM(s) attached
>>> to the vhost user ports?
>>> 3. Inside the VM, are the interfaces bound to igb_uio (i.e. using a
>>> dpdk app inside the guest), or are the interfaces being used as
>>> kernel devices inside the VM?
>>> 4. What parameters are you launching OVS with?
>>> 5. Can you provide an ovs log?
>>> 6. Can you confirm the DPDK version you are using in the host/VM (if
>>> DPDK is being used in the VM)?
>>>
>>> Thanks
>>> Ian
>>>
>>>> From: discuss [mailto:discuss-boun...@openvswitch.org] On Behalf Of 
>>>> Mechthild Buescher
>>>> Sent: Wednesday, July 06, 2016 1:54 PM
>>>> To: b...@openvswitch.org
>>>> Subject: [ovs-discuss] bug in ovs-vswitchd?!
>>>> 
>>>> Hi all,
>>>> 
>>>> we are using ovs with dpdk interfaces and vhostuser interfaces and
>>>> want to try the VMs with different multi-queue settings. When we
>>>> specify 2 cores and 2 queues for a dpdk interface but only one
>>>> queue for the vhost interfaces, ovs-vswitchd crashes at start of
>>>> the VM (or at the latest when traffic is sent).
>>>>
>>>> The version of ovs is : 2.5.90, master branch, latest commit
>>>> 7a15be69b00fe8f66a3f3929434b39676f325a7a)
>>>> It has been built and is running on: Linux version 3.13.0-87-generic
>>>> (buildd@lgw01-25) (gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
>>>> ) #133-Ubuntu SMP Tue May 24 18:32:09 UTC 2016
>>>>
>>>> The configuration is:
>>>> ovs-vsctl show
>>>> 0e191ed4-040b-458c-bad8-feb6f7c90e3a
>>>>     Bridge br-prv
>>>>         Port br-prv
>>>>             Interface br-prv
>>>>                 type: internal
>>>>         Port "dpdk0"
>>>>             Interface "dpdk0"
>>>>                 type: dpdk
>>>>                 options: {n_rxq="2"}
>>>>     Bridge br-int
>>>>         Port br-int
>>>>             Interface br-int
>>>>                 type: internal
>>>>         Port "vhost112"
>>>>             Interface "vhost112"
>>>>                 type: dpdkvhostuser
>>>>                 options: {n_rxq="1"}
>>>>         Port "vhost111"
>>>>             Interface "vhost111"
>>>>                 type: dpdkvhostuser
>>>>                options: {n_rxq="1"}
>>>>         Port "vxlan0"
>>>>             Interface "vxlan0"
>>>>                 type: vxlan
>>>>                 options: {key=flow, remote_ip="10.1.2.2"}
>>>>
>>>> ovs-appctl dpif-netdev/pmd-rxq-show
>>>> pmd thread numa_id 0 core_id 1:
>>>>                 port: vhost112   queue-id: 0
>>>>                 port: dpdk0      queue-id: 1
>>>> pmd thread numa_id 0 core_id 2:
>>>>                 port: dpdk0        queue-id: 0
>>>>                 port: vhost111   queue-id: 0
>>>>
>>>> ovs-appctl dpif/show
>>>>     br-int:
>>>>         br-int 65534/6: (tap)
>>>>         vhost111 11/3: (dpdkvhostuser: configured_rx_queues=1,
>>>>             configured_tx_queues=1, requested_rx_queues=1,
>>>>             requested_tx_queues=21)
>>>>         vhost112 12/5: (dpdkvhostuser: configured_rx_queues=1,
>>>>             configured_tx_queues=1, requested_rx_queues=1,
>>>>             requested_tx_queues=21)
>>>>         vxlan0 100/4: (vxlan: key=flow, remote_ip=10.1.2.2)
>>>>     br-prv:
>>>>         br-prv 65534/1: (tap)
>>>>         dpdk0 1/2: (dpdk: configured_rx_queues=2,
>>>>             configured_tx_queues=21, requested_rx_queues=2,
>>>>             requested_tx_queues=21)
>>
>> I'm a little concerned about the numbers reported here.  21 tx
>> queues is a bit much, I think.  I haven't tried reproducing this
>> yet, but can you confirm this is desired?
>>
>>>> 
>>>> (gdb) bt
>>>> #0  0x00000000005356e4 in ixgbe_xmit_pkts_vec ()
>>>> #1  0x00000000006df384 in rte_eth_tx_burst (nb_pkts=<optimized out>,
>>>>     tx_pkts=<optimized out>, queue_id=1, port_id=<optimized out>)
>>>>     at /opt/dpdk-16.04/x86_64-native-linuxapp-gcc//include/rte_ethdev.h:2791
>>>> #2  dpdk_queue_flush__ (qid=<optimized out>, dev=<optimized out>) at lib/netdev-dpdk.c:1099
>>>> #3  dpdk_queue_flush (qid=<optimized out>, dev=<optimized out>) at lib/netdev-dpdk.c:1133
>>>> #4  netdev_dpdk_rxq_recv (rxq=0x7fbe127ad4c0, packets=0x7fc26761e408,
>>>>     c=0x7fc26761e400) at lib/netdev-dpdk.c:1312
>>>> #5  0x000000000061be98 in netdev_rxq_recv (rx=<optimized out>,
>>>>     batch=batch@entry=0x7fc26761e400) at lib/netdev.c:628
>>>> #6  0x00000000005f17bb in dp_netdev_process_rxq_port (pmd=pmd@entry=0x29ea810,
>>>>     rxq=<optimized out>, port=<optimized out>, port=<optimized out>)
>>>>     at lib/dpif-netdev.c:2619
>>>> #7  0x00000000005f1b27 in pmd_thread_main (f_=0x29ea810) at lib/dpif-netdev.c:2864
>>>> #8  0x000000000067dde4 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:342
>>>> #9  0x00007fc26b90e184 in start_thread (arg=0x7fc26761f700) at pthread_create.c:312
>>>> #10 0x00007fc26af2237d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>>>> 
>>>> This is the minimal configuration which leads to the fault. Our
>>>> complete configuration contains more vhostuser interfaces than
>>>> above. We observed that only the combination of 2 cores/queues for
>>>> the dpdk interface and 1 queue for the vhostuser interfaces results
>>>> in an ovs-vswitchd crash, in detail:
>>>> Dpdk0: 1 cores/queues & all vhost-ports: 1 queue => successful
>>>> Dpdk0: 2 cores/queues & all vhost-ports: 1 queue => crash
>>>> Dpdk0: 2 cores/queues & all vhost-ports: 2 queue => successful
>>>> Dpdk0: 4 cores/queues & all vhost-ports: 1 queue => successful
>>>> Dpdk0: 4 cores/queues & all vhost-ports: 2 queue => successful
>>>> Dpdk0: 4 cores/queues & all vhost-ports: 4 queue => successful
>>>> 
>>>> Do you have any suggestions? 
>>
>> Can you please also supply the cpu (model number) that you're using?
>>
>> Thanks,
>> Aaron
>>
>>>> Best regards,
>>>>
>>>> Mechthild Buescher