On 5/8/24 18:01, Numan Siddique wrote:
> On Wed, May 8, 2024 at 8:42 AM Шагов Георгий via discuss <
> ovs-discuss@openvswitch.org> wrote:
> 
>> Hello everyone
>>
>>
>>
>> In some respects this might be considered a continuation of this thread:
>> (link1), yet it is different.
>>
>> After upgrading from OVN 22.03 to OVN 24.03, we have indeed seen a
>> 3-4x increase in performance.
>>
>> Yet we still observe high CPU load for the northd process; looking
>> deeper into the logs, we found:
>>
>>
>>
> 
> Thanks for reporting this issue.
> 
> 
>> 2024-05-07T08:36:46.505Z|18503|poll_loop|INFO|wakeup due to [POLLIN] on fd
>> 15 (10.34.22.66:60716<->10.34.22.66:6642) at lib/stream-fd.c:157 (94% CPU
>> usage)
>>
>> *2024-05-07T08:37:38.857Z|18504|inc_proc_eng|INFO|node: northd, recompute
>> (missing handler for input SB_datapath_binding) took 52313ms*
>>
>> *2024-05-07T08:37:48.335Z|18505|inc_proc_eng|INFO|node: lflow, recompute
>> (failed handler for input northd) took 7759ms*
>>
>> *2024-05-07T08:37:48.718Z|18506|timeval|WARN|Unreasonably long 62213ms
>> poll interval (56201ms user, 2900ms system)*
>>
>>
>>
>> As you can see, there is a significant delay of 52 seconds.
>>

This is huge indeed!
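
For anyone who wants to track these numbers over time, the recompute
timings quoted above can be pulled out of a northd log with a short
script. This is only a sketch: the regular expression is modeled on the
two inc_proc_eng lines in this thread and assumes each log entry is a
single unwrapped line; the function name parse_recomputes is made up
for illustration.

```python
import re

# Pattern modeled on the inc_proc_eng lines quoted in this thread
# (assumption: other OVN versions may format these lines differently).
RECOMPUTE_RE = re.compile(
    r"\|inc_proc_eng\|INFO\|node: (?P<node>\w+), recompute "
    r"\((?P<reason>missing|failed) handler for input (?P<input>\w+)\) "
    r"took (?P<ms>\d+)ms"
)


def parse_recomputes(log_text):
    """Return (node, reason, input, milliseconds) for each recompute line."""
    return [
        (m["node"], m["reason"], m["input"], int(m["ms"]))
        for m in RECOMPUTE_RE.finditer(log_text)
    ]


# The two log entries from the report, joined back into single lines.
SAMPLE = (
    "2024-05-07T08:37:38.857Z|18504|inc_proc_eng|INFO|node: northd, recompute "
    "(missing handler for input SB_datapath_binding) took 52313ms\n"
    "2024-05-07T08:37:48.335Z|18505|inc_proc_eng|INFO|node: lflow, recompute "
    "(failed handler for input northd) took 7759ms\n"
)

for node, reason, inp, ms in parse_recomputes(SAMPLE):
    print(f"{node}: {reason} handler for {inp}, {ms / 1000:.1f}s")
# northd: missing handler for SB_datapath_binding, 52.3s
# lflow: failed handler for northd, 7.8s
```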

>> Please correct me if I am wrong, but in my understanding ‘*missing handler
>> for*’ means that the inc-engine has no handler for changes to some input
>> node (in this sample: *SB_datapath_binding*).
>>
> 
> That's correct.
> 
>> Before plunging into development, it would be great to clarify and align
>> with the community's position:
>>
>>    - Why is there no handler for this node?
>>
>>
> Our approach has been to add a handler for an input change only if the
> change is frequent or can be handled easily. We have also skipped adding
> handlers where they would increase code complexity. Having said that, I
> think we are open to adding more handlers if it makes sense or results in
> scale improvements.
> 
> Right now we fall back to a full recompute of the northd engine for any
> change to a logical switch or logical router. Does your deployment
> create/delete logical switches/routers frequently? Is it possible to enable
> OVN debug logs and share them? I'm curious to know what the changes to
> SB datapath binding are.
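
As a rough illustration of the fallback described above, here is a toy
model of a node that recomputes everything unless a handler absorbs the
change. It is not OVN's actual engine API: the Node class, the
on_input_change method, and the input name "input_with_handler" are all
hypothetical. Only the two status strings mirror the "missing handler"
and "failed handler" messages in the log above.

```python
class Node:
    """Toy engine node: full recompute unless a handler absorbs the change."""

    def __init__(self, name, recompute):
        self.name = name
        self.recompute = recompute  # callable that rebuilds all node data
        self.handlers = {}          # input name -> incremental handler

    def on_input_change(self, input_name, change):
        handler = self.handlers.get(input_name)
        if handler is None:
            # No handler registered for this input: rebuild everything.
            self.recompute()
            return f"recompute (missing handler for input {input_name})"
        if not handler(change):
            # The handler could not process this change incrementally.
            self.recompute()
            return f"recompute (failed handler for input {input_name})"
        return "handled incrementally"


recomputes = []
northd = Node("northd", recompute=lambda: recomputes.append("northd"))
# Hypothetical handler that can only cope with "update" changes.
northd.handlers["input_with_handler"] = lambda change: change["kind"] == "update"

print(northd.on_input_change("SB_datapath_binding", {"kind": "update"}))
print(northd.on_input_change("input_with_handler", {"kind": "update"}))
print(northd.on_input_change("input_with_handler", {"kind": "delete"}))
```

In this model, adding a handler for SB_datapath_binding only pays off if
the handler can process the common changes without returning False.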
> 
> Feel free to share your OVN NB and SB DBs if you're ok with it.  I can
> deploy those DBs and see why recompute is so expensive.
> 
> 
> 
>>    - Is there a particular reason for this, or did the peculiarities of
>>    our installation simply highlight this issue?
>>
>>
> My guess is that your installation is frequently creating, deleting, or
> modifying logical switches or routers.
> 
> 
>>    - Do you think there is a reason to implement that handler
>>    (*SB_datapath_binding*)?
>>
>>
> I'm fine with adding a handler if it helps with scale. In our use cases, we
> don't frequently create/delete logical switches and routers,
> and hence it is OK to fall back to full recomputes for such changes.
> 
> 
>>
>> Any ideas are highly appreciated.
>>
> 
> You're welcome to work on it and submit patches to add a handler for
> SB_datapath_binding.
> 
> @Dumitru Ceara <dce...@redhat.com> @Han Zhou <hz...@ovn.org> if you've any
> reservations on adding more handlers please do comment here.
> 

In general, especially if it fixes a scalability issue like this one,
it's probably fine.  In practice it depends a bit on how much complexity
this would add to the code.

But the best way to tell is to have a way to reproduce this, e.g., NB/SB
databases and the NB/SB jsonrpc update that caused the recompute.

Regards,
Dumitru

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
