Hi List,

Apologies for the cross-post; I accidentally posted this to the bug list as
well.

I'm looking for some help troubleshooting an issue I'm seeing with OVN as
part of an OpenStack install.

Environment:
OpenStack: 2023.2
OVN: 23.09.3

Symptoms:
ovn-northd is sitting at about 80-90% CPU usage, with no apparent cause.
The log shows the following:

2025-03-16T09:41:31.909Z|01411|poll_loop|INFO|wakeup due to 0-ms timeout at lib/reconnect.c:677 (82% CPU usage)
2025-03-16T09:41:37.931Z|01412|poll_loop|INFO|Dropped 244 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
2025-03-16T09:41:37.931Z|01413|poll_loop|INFO|wakeup due to [POLLIN] on fd 12 (10.20.3.5:48108<->10.20.3.7:6642) at lib/stream-fd.c:157 (85% CPU usage)
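
To see which wakeup source dominates over a longer stretch of the log, the poll_loop lines can be tallied by the source location they report. The following is only a minimal Python sketch: the function name `tally_wakeups` is made up for illustration, and because the log itself reports dropped messages due to rate limiting, the counts are a sample rather than a complete record.

```python
import re
from collections import Counter

# Matches "wakeup due to <reason> at <file>:<line>" in poll_loop INFO lines.
WAKEUP_RE = re.compile(r"wakeup due to (.+?) at (\S+:\d+)")


def tally_wakeups(log_text):
    """Count poll_loop wakeups by the source location that reported them.

    Hypothetical helper, not part of OVN or OVS.
    """
    # Collapse whitespace so log lines hard-wrapped by a mail client still match.
    flat = " ".join(log_text.split())
    return Counter(m.group(2) for m in WAKEUP_RE.finditer(flat))


if __name__ == "__main__":
    # Sample taken from the log excerpt in this post.
    sample = """
    2025-03-16T09:41:31.909Z|01411|poll_loop|INFO|wakeup due to 0-ms timeout at
    lib/reconnect.c:677 (82% CPU usage)
    2025-03-16T09:41:37.931Z|01413|poll_loop|INFO|wakeup due to [POLLIN] on fd
    12 (10.20.3.5:48108<->10.20.3.7:6642) at lib/stream-fd.c:157 (85% CPU usage)
    """
    for location, count in tally_wakeups(sample).most_common():
        print(location, count)
```

Feeding it the whole ovn-northd log file instead of the sample would show whether the 0-ms reconnect timeout or the southbound socket accounts for most of the wakeups.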

Running strace against its PID shows:

write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
sendto(12, "{\"id\":126601,\"method\":\"transact\""..., 1562, 0, NULL, 0) = 1562
accept(7, 0x7ffc21c15160, [128])        = -1 EAGAIN (Resource temporarily unavailable)
write(9, "\0", 1)                       = 1
poll([{fd=15, events=POLLIN}, {fd=7, events=POLLIN}, {fd=5, events=POLLIN}, {fd=12, events=POLLIN}], 4, 1546) = 1 ([{fd=12, revents=POLLIN}])
getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=2357, tv_usec=89555}, ru_stime={tv_sec=62, tv_usec=784079}, ...}) = 0
write(9, "\0", 1)                       = 1
recvfrom(15, 0x61091006da4a, 1446, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(15, 0x61091006da4a, 1446, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(12, "{\"id\":null,\"method\":\"update2\",\"p"..., 1564, 0, NULL, NULL) = 1564
recvfrom(12, "ce416b4-d662-43e5-863d-06ccc1152"..., 4096, 0, NULL, NULL) = 388
recvfrom(12, 0x610911862914, 3708, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(12, 0x610911862914, 3708, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(12, 0x610911862914, 3708, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
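
For a longer capture, the strace output can be summarised by syscall and file descriptor rather than read by eye. This is a rough Python sketch under a few assumptions: the trace has been saved as plain text, the helper name `tally_syscalls` is invented for illustration, and only calls whose first argument is a bare number are attributed to an fd (so `poll` is counted with no fd).

```python
import re
from collections import Counter

# A syscall line starts with "name(", optionally followed by a numeric fd.
SYSCALL_RE = re.compile(r"^(\w+)\((\d+)?")


def tally_syscalls(strace_text):
    """Count strace lines by (syscall name, first numeric argument).

    Hypothetical helper. Wrapped continuation lines such as
    "temporarily unavailable)" do not start with "name(" and are skipped.
    """
    counts = Counter()
    for line in strace_text.splitlines():
        match = SYSCALL_RE.match(line.strip())
        if match:
            # The fd is kept as a string, or None when the first
            # argument is not a plain number.
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # A few lines taken from the trace in this post.
    sample = r'''
write(9, "\0", 1)                       = 1
write(9, "\0", 1)                       = 1
recvfrom(12, 0x610911862914, 3708, 0, NULL, NULL) = -1 EAGAIN (Resource
temporarily unavailable)
'''
    for (name, fd), count in tally_syscalls(sample).most_common():
        print(name, fd, count)
```

On the trace above this would put the one-byte writes to fd 9 at the top. The post does not identify what fd 9 is; on Linux, the symlink under /proc/<pid>/fd/ for that descriptor shows what it refers to.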

So I'm not sure whether this is an OVN issue or whether neutron-server is
calling it too often.

The odd thing is this is a lab environment with very little traffic or
change taking place.

Any suggestions on troubleshooting or narrowing down the cause would be
gratefully received.

Rgds
Steve.

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss