Hi,

Is this related to the following glibc bug? I'm not sure, because when I
check the glibc source of the installed version (2.35), the proposed patch
has already been applied.

https://sourceware.org/bugzilla/show_bug.cgi?id=12889

I can confirm that this problem only happens if I use stateful ACLs, which
rely on conntrack. The race condition occurs when a large number of
unreachable replies are received, for example when I run etcd on VMs and one
etcd node has been disabled, which causes massive connection attempts and
unreachable replies.
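
A minimal sketch of how such a burst of failing connection attempts could be
generated from a test VM, assuming Python 3 is available there. The function
names, the worker and attempt counts, and the placeholder address 192.0.2.10
(a documentation address) are illustrative assumptions, not part of my actual
setup; 2379 is the default etcd client port. I have not verified that this
alone triggers the crash.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def try_connect(host, port, timeout=1.0, connector=socket.create_connection):
    """Attempt one TCP connection; return True on success, False on failure."""
    try:
        sock = connector((host, port), timeout)
    except OSError:
        # OSError covers refused, timed-out, and unreachable outcomes.
        return False
    sock.close()
    return True


def connect_burst(host, port, attempts=1000, workers=50, timeout=1.0,
                  connector=socket.create_connection):
    """Run `attempts` connection attempts in parallel; return the failure count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda _: try_connect(host, port, timeout, connector),
            range(attempts),
        )
        return sum(1 for ok in results if not ok)


# Example call (hypothetical target; replace with the disabled etcd node):
#   failed = connect_burst("192.0.2.10", 2379)
#   print(f"{failed} connection attempts failed")
```

The `connector` parameter exists only so the burst logic can be exercised
without touching the network; in normal use the default
`socket.create_connection` is kept.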

Best regards.

On Mon, Mar 20, 2023, 10:58 PM Lazuardi Nasution <mrxlazuar...@gmail.com>
wrote:

> Hi Michael,
>
> Have you found a solution for this case? I see the same weird problem,
> without any information about which conntrack entries are causing
> this issue.
>
> I'm using OVS 3.0.1 with DPDK 21.11.2 on Ubuntu 22.04. By the way, this
> problem disappears after I remove some Kubernetes cluster VMs and some DB
> cluster VMs.
>
> Best regards.
>
>
>> Date: Thu, 29 Sep 2022 07:56:32 +0000
>> From: "Plato, Michael" <michael.pl...@tu-berlin.de>
>> To: "ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
>> Subject: [ovs-discuss] ovs-vswitchd crashes serveral times a day
>> Message-ID: <8e53d3d0674049e69b2b7f3c4b0b8...@tu-berlin.de>
>> Content-Type: text/plain; charset="us-ascii"
>>
>> Hi,
>>
>> we are about to roll out our new OpenStack infrastructure based on Yoga,
>> and during our testing we observed that the openvswitch-switch systemd
>> unit restarts several times a day, causing network interruptions for all
>> VMs on the compute node in question.
>> After some research we found that ovs-vswitchd crashes with the
>> following assertion failure:
>>
>> "2022-09-29T06:51:05.195Z|00003|util(pmd-c01/id:8)|EMER|../lib/conntrack.c:1095:
>> assertion conn->conn_type == CT_CONN_TYPE_DEFAULT failed in
>> conn_update_state()"
>>
>> To get more information about the connection that leads to this assertion
>> failure, I added some debug code to conntrack.c.
>> We have seen that we can trigger this issue by trying to connect from a
>> VM to a destination that is unreachable, for example:
>> curl https://www.google.de:444
>>
>> Shortly after that we get the assertion failure, and the debug code prints:
>>
>> conn_type=1 (may be CT_CONN_TYPE_UN_NAT) ?
>> src ip 172.217.16.67 dst ip 141.23.xx.xx rev src ip 141.23.xx.xx rev dst
>> ip 172.217.16.67 src/dst ports 444/46212 rev src/dst ports 46212/444
>> zone/rev zone 2/2 nw_proto/rev nw_proto 6/6
>>
>> ovs-appctl dpctl/dump-conntrack | grep "444"
>>
>> tcp,orig=(src=141.23.xx.xx,dst=172.217.16.67,sport=46212,dport=444),reply=(src=172.217.16.67,dst=141.23.xx.xx,sport=444,dport=46212),zone=2,protoinfo=(state=SYN_SENT)
>>
>> Versions:
>> ovs-vsctl --version
>> ovs-vsctl (Open vSwitch) 2.17.2
>> DB Schema 8.3.0
>>
>> ovn-controller --version
>> ovn-controller 22.03.0
>> Open vSwitch Library 2.17.0
>> OpenFlow versions 0x6:0x6
>> SB DB Schema 20.21.0
>>
>> DPDK 21.11.2
>>
>> We are now unsure if this is a misconfiguration or if we hit a bug.
>>
>> Thanks for any feedback
>>
>> Michael
>>
>>
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
