On 19/11/2020 11:21, Stokes, Ian wrote:
>> Hi,
>> We are seeing an ovs-vswitchd service crash with a segfault in the
>> librte_vhost library when a DPDK application within a guest VM is stopped.
>>
>> We are using OVS 2.11.1 on CentOS 7.6 (3.10.0-1062 Linux kernel) with
>> DPDK 18.11.2.
> 
> Hi,
> 
> Is there a reason you are using OVS 2.11.1 and DPDK 18.11.2?  These are quite 
> old.
> 
> As a first step I would recommend using the latest releases of these branches
> that have been validated by the OVS community.
> 
> As of now that would be OVS 2.11.4 and DPDK 18.11.9. Please check whether the
> issue is still present there; my suspicion is that this could be an issue
> resolved in the DPDK library since 18.11.2.
> 

+1, there are 58 commits in the vhost library on the 18.11 branch since
18.11.2, so it might already be fixed. 18.11.10 is the latest release,
and the fix below has been in since 18.11.7.

$ git log --oneline v18.11.2..HEAD . | grep crash
90b5ba739f vhost: fix crash on port deletion

If you are planning to continue to use 18.11 for a while, I think you
will want to test the 18.11.11 Release Candidate that will be available
in a few weeks. It is the last planned 18.11 release, so any issues you
find *after* it is released won't be fixed.
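
As a side note, the kernel segfault line in the log quoted below can be
decoded to find the faulting offset inside librte_vhost.so.4. This is only
a minimal Python sketch, not anything from OVS or DPDK: the function name
decode_segfault and the regex are made up for illustration, and it assumes
the usual x86 page-fault error-code bits (bit 0 = page present, bit 1 =
write access, bit 2 = user mode).

```python
import re

# Kernel segfault line copied from the log quoted in this thread.
LOG = ("pmd16[25721]: segfault at 7f4d1d733000 ip 00007f4d2ae5d066 "
       "sp 00007f4d1ce65618 error 4 in librte_vhost.so.4[7f4d2ae52000+1a000]")

# Hypothetical pattern for the kernel's "segfault at ..." message format.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the faulting library offset and the decoded error bits."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "library": m["lib"],
        "fault_addr": int(m["addr"], 16),
        # Instruction pointer relative to the start of the mapping.
        "offset": ip - base,
        # Assumed x86 page-fault error-code bits.
        "page_present": bool(err & 0x1),
        "write": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
    }


if __name__ == "__main__":
    info = decode_segfault(LOG)
    print(info["library"], hex(info["offset"]), info)
```

Under those assumptions, "error 4" reads as a user-mode read of a page that
is not mapped, which would fit the theory that the guest memory went away
underneath the dequeue. With matching debuginfo installed, the offset could
probably be fed to something like addr2line against the library, assuming
the mapping in the log starts at the library's load base.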

Kevin.



> Regards
> Ian
> 
>>
>> We are using OVS-DPDK on the host and the guest VM is running a DPDK
>> application. With some traffic, if the application service within the VM is
>> restarted, then OVS crashes.
>>
>> This crash is not seen if the guest VM is restarted (instead of stopping
>> the application within the VM).
>>
>> The crash backtrace (attached below) points to the
>> rte_memcpy_generic() function in rte_memcpy.h. It looks like the crash occurs
>> when vhost is trying to dequeue packets from the guest VM (as the
>> application in the guest VM has stopped and the huge pages have been returned
>> to the guest kernel).
>>
>> We have tried enabling iommu in ovs by setting
>> "other_config:vhost-iommu-support=true" and enabling iommu in qemu using
>> the following configuration in the guest domain XML:
>> <iommu model='intel'>
>>     <driver intremap='on'/>
>> </iommu>
>> With iommu enabled, ovs-vswitchd still crashes when the guest VM restarts
>> the network service.
>>
>> Is this a known problem? Has anyone else seen a crash like this? How can
>> we protect ovs-vswitchd from crashing when a guest VM restarts the
>> network application or service?
>>
>> Thanks
>> Alex
>> ------------------------------------------------------------------------
>>
>> Log:
>> Oct 7 19:54:16 Branch81-Bravo kernel: [2245909.596635] pmd16[25721]:
>> segfault at 7f4d1d733000 ip 00007f4d2ae5d066 sp 00007f4d1ce65618 error 4 in
>> librte_vhost.so.4[7f4d2ae52000+1a000]
>> Oct 7 19:54:19 Branch81-Bravo systemd[1]: ovs-vswitchd.service: main process
>> exited, code=killed, status=11/SEGV
>>
>> Environment:
>> CentOS 7.6.1810
>> openvswitch-2.11.1-1.el7.centos.x86_64
>> openvswitch-kmod-2.11.1-1.el7.centos.x86_64
>> dpdk-18.11-2.el7.centos.x86_64
>> 3.10.0-1062.4.1.el7.x86_64
>> qemu-kvm-ev-2.12.0-18.el7.centos_6.1.1
>>
>> Core dump trace:
>> (gdb) bt
>> #-1 0x00007ffff205602e in rte_memcpy_generic (dst=<optimized out>,
>> src=0x7fffcef3607c, n=<optimized out>)
>> at /usr/src/debug/dpdk-18.11/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:793
>> Backtrace stopped: Cannot access memory at address 0x7ffff20558f0
>>
>> (gdb) list *0x00007ffff205602e
>> 0x7ffff205602e is in rte_memcpy_generic
>> (/usr/src/debug/dpdk-18.11/x86_64-native-linuxapp-gcc/include/rte_memcpy.h:793).
>> 788 }
>> 789
>> 790 /**
>> 791 * For copy with unaligned load
>> 792 */
>> 793 MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
>> 794
>> 795 /**
>> 796 * Copy whatever left
>> 797 */
>>
>> (gdb) list *0x00007ffff205c192
>> 0x7ffff205c192 is in rte_vhost_dequeue_burst
>> (/usr/src/debug/dpdk-18.11/lib/librte_vhost/virtio_net.c:1192).
>> 1187 * In zero copy mode, one mbuf can only reference data
>> 1188 * for one or partial of one desc buff.
>> 1189 */
>> 1190 mbuf_avail = cpy_len;
>> 1191 } else {
>> 1192 if (likely(cpy_len > MAX_BATCH_LEN ||
>> 1193 vq->batch_copy_nb_elems >= vq->size ||
>> 1194 (hdr && cur == m))) {
>> 1195 rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *,
>> 1196 mbuf_offset),
>> (gdb)
>>
>> _______________________________________________
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
