Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-03-13 Thread LIU Yulong via discuss
Hi Eelco, Thank you. Patch sent to the mail list: https://mail.openvswitch.org/pipermail/ovs-dev/2024-March/412474.html On Wed, Mar 13, 2024 at 5:34 PM Eelco Chaudron wrote: > > > > On 13 Mar 2024, at 10:19, LIU Yulong wrote: > > > Hi guys, > > > > Send a pull request with that try_lock movem

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-03-13 Thread Eelco Chaudron via discuss
On 13 Mar 2024, at 10:19, LIU Yulong wrote: > Hi guys, > > Send a pull request with that try_lock movement fix based on the former tests: > https://github.com/openvswitch/ovs/pull/421 > > Does that make sense to you? I’m a bit behind emails, etc. so did not look at your emails yet. But for OVS

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-03-13 Thread LIU Yulong via discuss
Hi guys, Send a pull request with that try_lock movement fix based on the former tests: https://github.com/openvswitch/ovs/pull/421 Does that make sense to you? Thank you. LIU Yulong On Tue, Mar 12, 2024 at 3:11 PM LIU Yulong wrote: > > Updates: > > Ukey attributes we already have: > > lo

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-03-12 Thread LIU Yulong via discuss
Updates: Ukey attributes we already have: long long int created OVS_GUARDED; /* Estimate of creation time. */ unsigned int state_thread OVS_GUARDED; /* Thread that transitions. */ Added more attributes [1] to the ukey: const char *state_before OVS_GUARDED; /* locator

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-03-01 Thread LIU Yulong via discuss
Hi, Add some updates: 1. We added a debug attribute `state_before` to the ukey to record more life cycle details of a ukey: state_where = 0x55576027b868 "ofproto/ofproto-dpif-upcall.c:", [1], it is UKEY_DELETED. state_before = 0x55576027b630 "ofproto/ofproto-dpif-upcall.c:", [2], it was
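
For readers following the `state_where` / `state_before` idea in the message above, here is a minimal standalone sketch of how a "previous transition" locator can be kept next to the latest one. It is not the OVS code or the patch from the thread: `demo_key`, `demo_state`, `demo_transition` and `SOURCE_LOCATOR` are hypothetical names chosen for illustration, and the sketch assumes each transition site is tagged with a "file:line" string literal.

```c
#include <assert.h>
#include <string.h>

/* Build a "file:line" string literal for the call site. */
#define STR_(x) #x
#define STR(x) STR_(x)
#define SOURCE_LOCATOR __FILE__ ":" STR(__LINE__)

/* Hypothetical, simplified life cycle; not the real ukey state set. */
enum demo_state { DEMO_VISIBLE, DEMO_EVICTING, DEMO_DELETED };

struct demo_key {
    enum demo_state state;
    const char *state_where;   /* Locator of the latest transition. */
    const char *state_before;  /* Locator of the transition before that. */
};

static void
demo_transition_at(struct demo_key *key, enum demo_state dst,
                   const char *where)
{
    /* In this sketch states only ever move forward. */
    assert(dst >= key->state);

    /* Shift the old locator down before overwriting it, so a core dump
     * shows the last two transition sites instead of only one. */
    key->state_before = key->state_where;
    key->state_where = where;
    key->state = dst;
}

/* Records the caller's source location automatically. */
#define demo_transition(KEY, DST) \
    demo_transition_at(KEY, DST, SOURCE_LOCATOR)
```

Both fields point at string literals, so they stay readable in gdb after the object itself has been freed or reused, as long as the memory has not been overwritten.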

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-27 Thread Eelco Chaudron via discuss
On 27 Feb 2024, at 9:49, LIU Yulong wrote: > Yes, that makes sense. > > Another question is how to distinguish the core at line of > ovs_mutex_trylock in revalidator_sweep__ is after the free(ukey), > since the core trace has no timestamp. This is hard to figure out without adding a time variab

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-27 Thread LIU Yulong via discuss
Yes, that makes sense. Another question is how to distinguish the core at line of ovs_mutex_trylock in revalidator_sweep__ is after the free(ukey), since the core trace has no timestamp. This line in the function 'ukey_create__' should be the only place where ovs allocated the memory for ukey: ht

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread Eelco Chaudron via discuss
On 27 Feb 2024, at 4:44, LIU Yulong wrote: > @Eelco, as you suggested, added such circular buffer to my local OVS: > https://github.com/gotostack/ovs/commit/939d88c3c5fcdb446b01f2afa8f1e80c3929db46 I should also add allocate logging, or else you might not know if a buffer was allocated at the

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread LIU Yulong via discuss
@Eelco, as you suggested, added such circular buffer to my local OVS: https://github.com/gotostack/ovs/commit/939d88c3c5fcdb446b01f2afa8f1e80c3929db46 gdb shows such data structure: 2232 ukey_free_buffer.index = (ukey_free_buffer.index + 1) % (1024 * 1024); // Circular buffer (gdb) p ukey_fre
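
The circular buffer mentioned above can be sketched as follows. This is an illustration only, not the code in the linked commit: `free_log`, `free_record`, `free_log_record` and `free_log_find` are hypothetical names, the capacity is reduced from the `1024 * 1024` seen in the gdb output, and the sketch is single-threaded. A version used from several threads would need a lock or an atomic index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacity; the gdb line in the thread wraps at 1024 * 1024. */
#define FREE_LOG_CAPACITY 8

struct free_record {
    const void *ptr;       /* Address that was released. */
    long long when_ms;     /* Caller-supplied timestamp. */
    unsigned int thread;   /* Caller-supplied thread id. */
};

struct free_log {
    struct free_record records[FREE_LOG_CAPACITY];
    size_t index;          /* Next slot to overwrite. */
    size_t total;          /* Number of records ever written. */
};

/* Remember one free() call, overwriting the oldest record when full. */
static void
free_log_record(struct free_log *log, const void *ptr,
                long long when_ms, unsigned int thread)
{
    assert(log != NULL);

    struct free_record *r = &log->records[log->index];
    r->ptr = ptr;
    r->when_ms = when_ms;
    r->thread = thread;

    log->index = (log->index + 1) % FREE_LOG_CAPACITY;  /* Wrap around. */
    log->total++;
}

/* Return the most recent retained record for 'ptr', or NULL if none. */
static const struct free_record *
free_log_find(const struct free_log *log, const void *ptr)
{
    size_t n = log->total < FREE_LOG_CAPACITY ? log->total
                                              : FREE_LOG_CAPACITY;

    for (size_t i = 0; i < n; i++) {
        /* Walk backwards starting from the newest slot. */
        size_t slot = (log->index + FREE_LOG_CAPACITY - 1 - i)
                      % FREE_LOG_CAPACITY;
        if (log->records[slot].ptr == ptr) {
            return &log->records[slot];
        }
    }
    return NULL;
}
```

When inspecting a core file, looking up the crashing pointer in such a log tells you whether it was freed recently and by which thread. The same address may be reallocated later, which is why the follow-up in the thread suggests logging allocations as well.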

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread LIU Yulong via discuss
@Ilya, thank you, I will add that patch. @Eelco, thank you again, I will add a RL log to the free(ukey). Hope we can get something useful. On Mon, Feb 26, 2024 at 7:55 PM Ilya Maximets wrote: > > On 2/26/24 11:20, Eelco Chaudron wrote: > > > > > > On 26 Feb 2024, at 11:10, LIU Yulong wrote: > >

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread Ilya Maximets via discuss
On 2/26/24 11:20, Eelco Chaudron wrote: > > > On 26 Feb 2024, at 11:10, LIU Yulong wrote: > >> Hi Eelco, >> >> Thank you for the quick response. >> >> I did not add those logs, because in order to reproduce the issue, we >> have to send lots of packets to the host. >> So there are too many ukeys

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread Eelco Chaudron via discuss
On 26 Feb 2024, at 11:10, LIU Yulong wrote: > Hi Eelco, > > Thank you for the quick response. > > I did not add those logs, because in order to reproduce the issue, we > have to send lots of packets to the host. > So there are too many ukeys created/deleted to do logging. Maybe a circular buffe

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread LIU Yulong via discuss
Hi Eelco, Thank you for the quick response. I did not add those logs, because in order to reproduce the issue, we have to send lots of packets to the host. So there are too many ukeys created/deleted to do logging. And can we ensure that this [1] is the only place for ovs to free the ukey? [1]

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread Eelco Chaudron via discuss
On 26 Feb 2024, at 9:33, LIU Yulong wrote: > Hi, > > I have read the code by comparing the call stack of the core files > carefully, and found > a potential race condition. Please confirm whether the following 3 threads > have a race condition. Just did some code trace, can such > race condition

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-26 Thread LIU Yulong via discuss
Hi, I have read the code by comparing the call stack of the core files carefully, and found a potential race condition. Please confirm whether the following 3 threads have a race condition. Just did some code trace, can such race condition happen? * PMD thread1 ===

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-21 Thread Eelco Chaudron via discuss
On 21 Feb 2024, at 4:26, LIU Yulong wrote: > Thank you very much for your reply. > > The problem is not easy to reproduce, we have to wait a random long time to > see > if the issue happens again. It can be more than one day or longer. > OVS 2.17 with dpdk 20.11 had run to core before, so it's

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-20 Thread LIU Yulong via discuss
Thank you very much for your reply. The problem is not easy to reproduce, we have to wait a random long time to see if the issue happens again. It can be more than one day or longer. OVS 2.17 with dpdk 20.11 had run to core before, so it's hard to say if it is related to DPDK. I'm running the ovs

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-19 Thread Eelco Chaudron via discuss
On 19 Feb 2024, at 13:09, Ilya Maximets wrote: > On 2/19/24 11:14, Eelco Chaudron wrote: >> >> >> On 19 Feb 2024, at 10:34, LIU Yulong wrote: >> >>> Hi OVS experts, >>> >>> Our ovs-vswitchd runs to core at the ovs_mutex_trylock(&ukey->mutex) in the >>> function revalidator_sweep__. >>> >>> I've

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-19 Thread Ilya Maximets via discuss
On 2/19/24 11:14, Eelco Chaudron wrote: > > > On 19 Feb 2024, at 10:34, LIU Yulong wrote: > >> Hi OVS experts, >> >> Our ovs-vswitchd runs to core at the ovs_mutex_trylock(&ukey->mutex) in the >> function revalidator_sweep__. >> >> I've sent the mail before but have no response. >> https://mail.

Re: [ovs-discuss] ovs-vswitchd core at revalidator_sweep__

2024-02-19 Thread Eelco Chaudron via discuss
On 19 Feb 2024, at 10:34, LIU Yulong wrote: > Hi OVS experts, > > Our ovs-vswitchd runs to core at the ovs_mutex_trylock(&ukey->mutex) in the > function revalidator_sweep__. > > I've sent the mail before but have no response. > https://mail.openvswitch.org/pipermail/ovs-discuss/2023-August/0526