2016-07-13 2:54 GMT-07:00 Mechthild Buescher <mechthild.buesc...@ericsson.com>:
> Hi Daniele,
>
> You are right – after pulling the latest master branch, ovs-vswitchd
> doesn't crash any more. Whether it's
> b59cc14e032da370021794bfd1bdd3d67e88a9a3 ("netdev-dpdk: Use instant sending
> instead of queueing of packets.") or one of the other queue-related patches,
> I can't say. There were a bunch of updates since last week ;-)

Thanks for confirming this. I think we're going to need to backport that
commit to branch-2.5.

> A quick repetition of some of our performance measurements shows that the
> performance is slightly better for multi-queue than for single queue, and a
> little bit higher than with the older ovs version. But that still needs to
> be verified.
>
> The configured_tx_queues for dpdk ports is still 21. I saw another patch
> which introduces n_txq, which I haven't tried yet but will do soon.

That n_txq option is only used for testing; the number of transmission queues
is determined by the datapath, or by qemu in the case of vhost.

> Thank you very much!
>
> BR/Mechthild
>
> *From:* Daniele Di Proietto [mailto:diproiet...@ovn.org]
> *Sent:* Wednesday, July 13, 2016 3:40 AM
> *To:* Mechthild Buescher
> *Cc:* Aaron Conole; b...@openvswitch.org
> *Subject:* Re: [ovs-discuss] bug in ovs-vswitchd?!
>
> Hi Mechthild,
>
> Have you tried with the latest master?
>
> I suspect this might have already been fixed by b59cc14e032d ("netdev-dpdk:
> Use instant sending instead of queueing of packets."). In this case I need
> to backport that on branch-2.5 as well.
>
> Thanks,
>
> Daniele
>
> 2016-07-12 17:23 GMT-07:00 Mechthild Buescher <mechthild.buesc...@ericsson.com>:
>
> Hi Aaron,
>
> I checked whether vhost-sock-permissions is needed to run our tests and
> can confirm that it's not needed. The removal of vhost-sock-permissions,
> however, does not make a difference for 2 cores and 1 queue - ovs-vswitchd
> is still crashing. Regarding the solution for the socket access: it was
> necessary to update some libvirt-related apparmor files to get it running
> (e.g. in the dedicated /etc/apparmor.d/libvirt/libvirt-<uuid>.files). As
> far as I remember, for older libvirt versions it was also necessary to
> update /etc/qemu.conf. But the updates very much depend on which libvirt
> version is used and where the sockets are stored, so I cannot give a
> general solution to the socket-access problem as there might be different
> reasons. Of course qemu must be the client of the socket if ovs is the
> server - I guess you know this ;-)
>
> I failed to configure fewer than 21 tx queues; neither
>     ovs-vsctl --no-wait set Open_vSwitch . other_config:n-dpdk-txqs=2
> nor
>     ovs-vsctl --no-wait set Interface dpdk0 options:n_txq=2
> changes the value of
>     dpdk0 1/2: (dpdk: configured_rx_queues=2, configured_tx_queues=21,
>     requested_rx_queues=2, requested_tx_queues=21)
>
> But since the number of tx queues is 21 for all our configurations, I
> wonder why it leads to a crash of ovs-vswitchd only if the number of cores
> is 2 and the number of queues is 1.
>
> Best regards,
>
> Mechthild
>
> -----Original Message-----
> From: Aaron Conole [mailto:acon...@redhat.com]
> Sent: Tuesday, July 12, 2016 4:46 PM
> To: Mechthild Buescher
> Cc: Stokes, Ian; b...@openvswitch.org
> Subject: Re: [ovs-discuss] bug in ovs-vswitchd?!
>
> Mechthild Buescher <mechthild.buesc...@ericsson.com> writes:
>
>> Hi Aaron,
>>
>> I think that the vhost-sock-permissions is not needed - I will check
>> whether it makes a difference.
>> It's a left-over from an earlier
>> configuration where we had the problem that qemu (started by libvirt)
>> wasn't able to access the socket. This problem has been solved.
>
> Can you share the solution? It's unrelated, but this is something I'm
> trying to solve at the moment.
>
>> The tx queues are not configured by us, so I don't know where this
>> value comes from. Maybe it's the default value?! So, it's not
>> intended.
>
> Okay, that could likely be the problem. Please try setting the TX queues
> first, and if that resolves the crash there is either a setup-script or
> possibly a code error.
>
>> cat /proc/cpuinfo
>> processor       : 0
>> vendor_id       : GenuineIntel
>> cpu family      : 6
>> model           : 62
>> model name      : Intel(R) Xeon(R) CPU E5-2658 v2 @ 2.40GHz
>
> Thanks for this, it helps to locate which vectorization code dpdk will use.
>
>> stepping        : 4
>> microcode       : 0x40c
>> cpu MHz         : 1200.000
>> cache size      : 25600 KB
>> physical id     : 0
>> siblings        : 20
>> core id         : 0
>> cpu cores       : 10
>> apicid          : 0
>> initial apicid  : 0
>> fpu             : yes
>> fpu_exception   : yes
>> cpuid level     : 13
>> wp              : yes
>> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
>> pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
>> pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
>> xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
>> ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2
>> x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida
>> arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
>> fsgsbase smep erms
>> bogomips        : 4799.93
>> clflush size    : 64
>> cache_alignment : 64
>> address sizes   : 46 bits physical, 48 bits virtual
>> power management:
>>
>> Thanks again,
>>
>> BR/Mechthild
>>
>> -----Original Message-----
>> From: Aaron Conole [mailto:acon...@redhat.com]
>> Sent: Monday, July 11, 2016 11:06 PM
>> To: Mechthild Buescher
>> Cc: Stokes, Ian; b...@openvswitch.org
>> Subject: Re: [ovs-discuss] bug in ovs-vswitchd?!
>>
>> Mechthild Buescher <mechthild.buesc...@ericsson.com> writes:
>>
>>> Hi Aaron,
>>>
>>> Sorry for being unclear regarding the VM: I meant the DPDK usage
>>> inside the VM. So, the fault happens when using the VM. Inside the VM
>>> I can either bind the interfaces to DPDK or to linux - in both cases
>>> the fault occurs.
>>>
>>> And I haven't applied any patch. I used the latest available version
>>> from the master branch - I don't know whether any patch has been
>>> upstreamed to the master branch.
>>
>> Okay - I wonder what the vhost-sock-permissions command line is all
>> about, then?
>>
>> Can you confirm that 21 tx queues is not intended, then (21 tx queues
>> is showing in your configuration output)?
>>
>> Also, please send the cpu information (cat /proc/cpuinfo on the host).
>>
>>> Thanks in advance for your help,
>>>
>>> BR/Mechthild
>>>
>>> -----Original Message-----
>>> From: Aaron Conole [mailto:acon...@redhat.com]
>>> Sent: Monday, July 11, 2016 7:22 PM
>>> To: Mechthild Buescher
>>> Cc: Stokes, Ian; b...@openvswitch.org
>>> Subject: Re: [ovs-discuss] bug in ovs-vswitchd?!
>>>
>>> Mechthild Buescher <mechthild.buesc...@ericsson.com> writes:
>>>
>>>> Hi Ian,
>>>>
>>>> Thanks for the fast reply! I also did some further investigations
>>>> where I could see that ovs-vswitchd usually stays alive when
>>>> receiving packets but crashes when sending packets.
>>>>
>>>> Regarding your questions:
>>>> 1. We are running 1 VM with 2 vhost ports (in the simplified setup;
>>>> in the complete setup we use 1 VM & 5 vhost ports).
>>>>
>>>> 2. We are using libvirt to start the VM, which is configured to use qemu:
>>>> /usr/bin/qemu-system-x86_64 -name guest=ubuntu11_try,debug-threads=on
>>>> -S -machine pc-i440fx-wily,accel=kvm,usb=off -cpu host -m 8192
>>>> -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1
>>>> -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/mnt/huge_1G/libvirt/qemu,share=yes,size=8589934592
>>>> -numa node,nodeid=0,cpus=0-3,memdev=ram-node0
>>>> -uuid 8a2ad7a3-9da1-4c69-a2ff-c7a680d9bc4a -no-user-config -nodefaults
>>>> -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-300-ubuntu11_try/monitor.sock,server,nowait
>>>> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
>>>> -no-shutdown -boot strict=on
>>>> -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
>>>> -drive file=/root/perf/vms/ubuntu11.qcow2,format=qcow2,if=none,id=drive-virtio-disk0
>>>> -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>>>> -netdev tap,fd=21,id=hostnet0
>>>> -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:3c:92:47,bus=pci.0,addr=0x3
>>>> -netdev tap,fd=23,id=hostnet1
>>>> -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:3c:a3:47,bus=pci.0,addr=0x4
>>>> -chardev socket,id=charnet2,path=/var/run/openvswitch/vhost111
>>>> -netdev type=vhost-user,id=hostnet2,chardev=charnet2
>>>> -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:a0:11:02,bus=pci.0,addr=0x5
>>>> -chardev socket,id=charnet3,path=/var/run/openvswitch/vhost112
>>>> -netdev type=vhost-user,id=hostnet3,chardev=charnet3
>>>> -device virtio-net-pci,netdev=hostnet3,id=net3,mac=52:54:00:a0:11:03,bus=pci.0,addr=0x6
>>>> -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0
>>>> -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2
>>>> -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on
>>>>
>>>> 3. In the VM we have both kinds of interface bindings, virtio-pci and
>>>> igb_uio. For both types of interfaces, the crash of ovs-vswitchd can
>>>> be observed (the VM is still alive).
>>>>
>>>> 4. ovs-vswitchd is started as follows and is configured to use a
>>>> vxlan tunnel:
>>>> ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
>>>> ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x1
>>>> ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=4096,0
>>>> ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=6
>>>> ovs-vsctl --no-wait set Interface dpdk0 options:n_rxq=2
>>>> ovs-vsctl --no-wait set Interface vhost111 options:n_rxq=1
>>>> ovs-vsctl --no-wait set Interface vhost112 options:n_rxq=1
>>>> ovs-vsctl --no-wait set Open_vSwitch . other_config:vhost-sock-permissions=766
>>>
>>> Have you, perchance, applied some extra patches? This was proposed,
>>> but not accepted, as a possible workaround for a permissions issue
>>> with ovs dpdk.
>>>>
>>>> ovs-vswitchd --pidfile=$DB_PID --detach --monitor --log-file=$LOG_FILE
>>>> -vfile:dbg --no-chdir -vconsole:emer --mlockall unix:$DB_SOCK
>>>>
>>>> ovs-vsctl add-port br-int vhost111 -- set Interface vhost111
>>>> type=dpdkvhostuser ofport_request=11
>>>> ovs-vsctl add-port br-int vhost112 -- set Interface vhost112
>>>> type=dpdkvhostuser ofport_request=12
>>>> ovs-vsctl add-br br-int -- set bridge br-int datapath_type=netdev
>>>> ovs-vsctl set Bridge br-int other_config:datapath-id=0000f2b811144f41
>>>> ovs-vsctl set Bridge br-int protocols=OpenFlow13
>>>> ovs-vsctl add-port br-int vxlan0 -- set interface vxlan0 type=vxlan
>>>> options:remote_ip=10.1.2.2 options:key=flow ofport_request=100
>>>>
>>>> 5. The ovs log is attached - it contains the log from start to crash
>>>> (with debug information). The crash has been provoked by setting up
>>>> a virtio-pci interface in the VM, so no DPDK is used in the VM for
>>>> this scenario.
>>>>
>>>> 6. The DPDK versions are:
>>>> Host: dpdk 16.04, latest commit b3b9719f18ee83773c6ed7adda300c5ac63c37e9
>>>> VM: (not used in this scenario) dpdk 2.2.0
>>>
>>> For confirmation, this happens whether or not you use a VM? I just
>>> want to make sure. It's usually best to pair dpdk versions whenever
>>> possible.
>>>
>>>> BR/Mechthild
>>>>
>>>> -----Original Message-----
>>>> From: Stokes, Ian [mailto:ian.sto...@intel.com]
>>>> Sent: Thursday, July 07, 2016 1:57 PM
>>>> To: Mechthild Buescher; b...@openvswitch.org
>>>> Subject: RE: bug in ovs-vswitchd?!
>>>>
>>>> Hi Mechthild,
>>>>
>>>> I've tried to reproduce this issue on my setup (Fedora 22, kernel
>>>> 4.1.8) but have not been able to reproduce it.
>>>>
>>>> A few questions to help the investigation:
>>>>
>>>> 1. Are you running 1 or 2 VMs in the setup (i.e. 1 VM with 2 vhost
>>>> user ports or 2 VMs with 1 vhost user port each)?
>>>> 2. What are the parameters being used to launch the VM/s attached to
>>>> the vhost user ports?
>>>> 3. Inside the VM, are the interfaces bound to igb_uio (i.e. using a
>>>> dpdk app inside the guest) or are the interfaces being used as
>>>> kernel devices inside the VM?
>>>> 4. What parameters are you launching OVS with?
>>>> 5. Can you provide an ovs log?
>>>> 6. Can you confirm the DPDK version you are using in the host/VM (if
>>>> being used in the VM)?
>>>>
>>>> Thanks
>>>> Ian
>>>>
>>>>> From: discuss [mailto:discuss-boun...@openvswitch.org] On Behalf Of
>>>>> Mechthild Buescher
>>>>> Sent: Wednesday, July 06, 2016 1:54 PM
>>>>> To: b...@openvswitch.org
>>>>> Subject: [ovs-discuss] bug in ovs-vswitchd?!
>>>>>
>>>>> Hi all,
>>>>>
>>>>> we are using ovs with dpdk interfaces and vhostuser interfaces and
>>>>> want to try the VMs with different multi-queue settings. When we
>>>>> specify 2 cores and 2 queues for a dpdk interface but only one queue
>>>>> for the vhost interfaces, ovs-vswitchd crashes at start of the VM
>>>>> (or at the latest when traffic is sent).
>>>>>
>>>>> The version of ovs is 2.5.90 (master branch, latest commit
>>>>> 7a15be69b00fe8f66a3f3929434b39676f325a7a).
>>>>> It has been built and is running on: Linux version 3.13.0-87-generic
>>>>> (buildd@lgw01-25) (gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3))
>>>>> #133-Ubuntu SMP Tue May 24 18:32:09 UTC 2016
>>>>>
>>>>> The configuration is:
>>>>> ovs-vsctl show
>>>>> 0e191ed4-040b-458c-bad8-feb6f7c90e3a
>>>>>     Bridge br-prv
>>>>>         Port br-prv
>>>>>             Interface br-prv
>>>>>                 type: internal
>>>>>         Port "dpdk0"
>>>>>             Interface "dpdk0"
>>>>>                 type: dpdk
>>>>>                 options: {n_rxq="2"}
>>>>>     Bridge br-int
>>>>>         Port br-int
>>>>>             Interface br-int
>>>>>                 type: internal
>>>>>         Port "vhost112"
>>>>>             Interface "vhost112"
>>>>>                 type: dpdkvhostuser
>>>>>                 options: {n_rxq="1"}
>>>>>         Port "vhost111"
>>>>>             Interface "vhost111"
>>>>>                 type: dpdkvhostuser
>>>>>                 options: {n_rxq="1"}
>>>>>         Port "vxlan0"
>>>>>             Interface "vxlan0"
>>>>>                 type: vxlan
>>>>>                 options: {key=flow, remote_ip="10.1.2.2"}
>>>>>
>>>>> ovs-appctl dpif-netdev/pmd-rxq-show
>>>>> pmd thread numa_id 0 core_id 1:
>>>>>     port: vhost112 queue-id: 0
>>>>>     port: dpdk0 queue-id: 1
>>>>> pmd thread numa_id 0 core_id 2:
>>>>>     port: dpdk0 queue-id: 0
>>>>>     port: vhost111 queue-id: 0
>>>>>
>>>>> ovs-appctl dpif/show
>>>>> br-int:
>>>>>     br-int 65534/6: (tap)
>>>>>     vhost111 11/3: (dpdkvhostuser: configured_rx_queues=1,
>>>>>         configured_tx_queues=1, requested_rx_queues=1, requested_tx_queues=21)
>>>>>     vhost112 12/5: (dpdkvhostuser: configured_rx_queues=1,
>>>>>         configured_tx_queues=1, requested_rx_queues=1, requested_tx_queues=21)
>>>>>     vxlan0 100/4: (vxlan: key=flow, remote_ip=10.1.2.2)
>>>>> br-prv:
>>>>>     br-prv 65534/1: (tap)
>>>>>     dpdk0 1/2: (dpdk: configured_rx_queues=2, configured_tx_queues=21,
>>>>>         requested_rx_queues=2, requested_tx_queues=21)
>>>
>>> I'm a little concerned about the numbers reported here. 21 tx queues
>>> is a bit much, I think. I haven't tried reproducing this yet, but
>>> can you confirm this is desired?
>>>>>
>>>>> (gdb) bt
>>>>> #0  0x00000000005356e4 in ixgbe_xmit_pkts_vec ()
>>>>> #1  0x00000000006df384 in rte_eth_tx_burst (nb_pkts=<optimized out>,
>>>>>     tx_pkts=<optimized out>, queue_id=1, port_id=<optimized out>)
>>>>>     at /opt/dpdk-16.04/x86_64-native-linuxapp-gcc//include/rte_ethdev.h:2791
>>>>> #2  dpdk_queue_flush__ (qid=<optimized out>, dev=<optimized out>)
>>>>>     at lib/netdev-dpdk.c:1099
>>>>> #3  dpdk_queue_flush (qid=<optimized out>, dev=<optimized out>)
>>>>>     at lib/netdev-dpdk.c:1133
>>>>> #4  netdev_dpdk_rxq_recv (rxq=0x7fbe127ad4c0, packets=0x7fc26761e408,
>>>>>     c=0x7fc26761e400) at lib/netdev-dpdk.c:1312
>>>>> #5  0x000000000061be98 in netdev_rxq_recv (rx=<optimized out>,
>>>>>     batch=batch@entry=0x7fc26761e400) at lib/netdev.c:628
>>>>> #6  0x00000000005f17bb in dp_netdev_process_rxq_port (pmd=pmd@entry=0x29ea810,
>>>>>     rxq=<optimized out>, port=<optimized out>, port=<optimized out>)
>>>>>     at lib/dpif-netdev.c:2619
>>>>> #7  0x00000000005f1b27 in pmd_thread_main (f_=0x29ea810) at lib/dpif-netdev.c:2864
>>>>> #8  0x000000000067dde4 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:342
>>>>> #9  0x00007fc26b90e184 in start_thread (arg=0x7fc26761f700) at pthread_create.c:312
>>>>> #10 0x00007fc26af2237d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>>>>>
>>>>> This is the minimal configuration which leads to the fault. Our
>>>>> complete configuration contains more vhostuser interfaces than
>>>>> above. We observed that only the combination of 2 cores/queues for
>>>>> the dpdk interface and 1 queue for the vhostuser interfaces results
>>>>> in an ovs-vswitchd crash, in detail:
>>>>> Dpdk0: 1 core/queue   & all vhost ports: 1 queue  => successful
>>>>> Dpdk0: 2 cores/queues & all vhost ports: 1 queue  => crash
>>>>> Dpdk0: 2 cores/queues & all vhost ports: 2 queues => successful
>>>>> Dpdk0: 4 cores/queues & all vhost ports: 1 queue  => successful
>>>>> Dpdk0: 4 cores/queues & all vhost ports: 2 queues => successful
>>>>> Dpdk0: 4 cores/queues & all vhost ports: 4 queues => successful
>>>>>
>>>>> Do you have any suggestions?
>>>
>>> Can you please also supply the cpu (model number) that you're using?
>>>
>>> Thanks,
>>> Aaron
>>>
>>>>> Best regards,
>>>>>
>>>>> Mechthild Buescher
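A side note on the vhost queue counts discussed above: per Daniele's remark,
the number of tx queues OVS ends up using on a dpdkvhostuser port follows
what qemu/the guest negotiates, which is why the vhost ports show
configured_tx_queues=1 while the requested value is 21. As a rough,
illustrative sketch only (not taken from this setup; queues=, mq=on and
vectors= are standard qemu virtio-net options, with vectors normally
2*queues+2), giving a vhost-user port two queue pairs would mean extending
the qemu arguments quoted earlier roughly like this:

    # illustrative only - two queue pairs on the vhost111 port
    -chardev socket,id=charnet2,path=/var/run/openvswitch/vhost111
    -netdev type=vhost-user,id=hostnet2,chardev=charnet2,queues=2
    -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:a0:11:02,bus=pci.0,addr=0x5,mq=on,vectors=6

The guest also needs the extra queues enabled (e.g. ethtool -L <iface>
combined 2 when the kernel virtio driver is used). Whether this relates to
the crash matrix above (2 dpdk queues + 1 vhost queue => crash) is a
separate question.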
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
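On the socket-access workaround Mechthild describes (updating the per-domain
libvirt apparmor file and, for older libvirt versions, qemu.conf): the lines
below are a purely hypothetical sketch of the kind of entries involved - the
exact file names, paths and settings depend on the libvirt version and on
where the vhost-user sockets are created:

    # per-domain apparmor file, e.g. /etc/apparmor.d/libvirt/libvirt-<uuid>.files
    # (hypothetical entry: let qemu open the vhost-user sockets created by ovs)
      "/var/run/openvswitch/vhost*" rw,

    # qemu.conf (older libvirt, hypothetical): run qemu as a user/group
    # that is allowed to reach the sockets
      user = "root"
      group = "root"

As noted in the thread, qemu is the client of these sockets when
ovs-vswitchd is the server, so whichever user qemu runs as needs read/write
access to them.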