Hi All,

In a cluster running OVN 24.03.5, we are observing that on a few chassis that work as dedicated OVN Interconnection gateways, the ovn-controller process is running at almost 100% CPU usage:

2025-05-20T16:58:39.546Z|689641|poll_loop|INFO|wakeup due to [POLLIN] on fd 32 (FIFO pipe:[1813314324]) at controller/pinctrl.c:4173 (95% CPU usage)
2025-05-20T16:58:45.488Z|689642|poll_loop|INFO|Dropped 48 log messages in last 6 seconds (most recently, 1 seconds ago) due to excessive rate
2025-05-20T16:58:45.488Z|689643|poll_loop|INFO|wakeup due to [POLLIN] on fd 32 (FIFO pipe:[1813314324]) at controller/pinctrl.c:4173 (92% CPU usage)
2025-05-20T16:58:51.553Z|689644|poll_loop|INFO|Dropped 47 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
2025-05-20T16:58:51.553Z|689645|poll_loop|INFO|wakeup due to [POLLIN] on fd 32 (FIFO pipe:[1813314324]) at controller/pinctrl.c:4173 (98% CPU usage)
2025-05-20T16:58:57.514Z|689646|poll_loop|INFO|Dropped 50 log messages in last 6 seconds (most recently, 1 seconds ago) due to excessive rate
2025-05-20T16:58:57.514Z|689647|poll_loop|INFO|wakeup due to [POLLIN] on fd 32 (FIFO pipe:[1813314324]) at controller/pinctrl.c:4173 (95% CPU usage)
2025-05-20T16:59:03.558Z|689648|poll_loop|INFO|Dropped 49 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate

Checking what ovn-controller is doing in debug mode, we can see a lot of ARP packets like the ones below:

2025-05-20T17:10:21.149Z|00004|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=48| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XX.6.X31, dst-ip=172.XX.X.2XX
2025-05-20T17:10:21.149Z|00005|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=48| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XX1.6.XX1, dst-ip=172.XX.X.XX4
2025-05-20T17:10:21.271Z|00006|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XX1.X.X23, dst-ip=172.X6.X.2X3
2025-05-20T17:10:21.271Z|00007|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XX1.X.X23, dst-ip=172.X6.X.X41
2025-05-20T17:10:21.271Z|00008|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x60199dbd| in-port=338| src-mac=fa:16:3e:a7:a2:37, dst-mac=00:00:00:00:00:00| src-ip=172.XX.X2.X30, dst-ip=172.XX.X.X09
2025-05-20T17:10:21.271Z|00009|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=131| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XXX.X.X4, dst-ip=172.XX.X.X19
2025-05-20T17:10:21.272Z|00010|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XX1.X.X23, dst-ip=172.XX.X.X98
2025-05-20T17:10:21.277Z|00011|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=48| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XX1.X.1X1, dst-ip=172.XX.X.X05
2025-05-20T17:10:21.388Z|00012|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XX.X.X23, dst-ip=172.XX.X.2X2

As I understand it, a large number of ARP packets from different OVN virtual networks are reaching ovn-controller and making it use more CPU time. Shouldn't ovn-controller be able to handle these ARP packets without using so much CPU time?

Regards,
Tiago Pires
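
To quantify which ports and source MACs are behind the ARP packet-ins, the debug lines above could be tallied with a small script. This is only a minimal sketch: the regular expression is derived from the log format shown in this message and may not match other OVN versions, and the function name `tally_packet_ins` is hypothetical, not part of any OVN tooling.

```python
import re
from collections import Counter

# Pattern based solely on the pinctrl DBG lines quoted in this message
# (assumption: other OVN versions may format these lines differently).
PACKET_IN = re.compile(
    r"pinctrl received packet-in \| opcode=(?P<opcode>[^|]+)\|"
    r".*?OF_Cookie_ID=(?P<cookie>0x[0-9a-fA-F]+)\|"
    r"\s*in-port=(?P<in_port>\d+)\|"
    r"\s*src-mac=(?P<src_mac>[0-9a-fA-F:]+)"
)


def tally_packet_ins(lines):
    """Count packet-ins per (opcode, OpenFlow cookie, in-port, source MAC)."""
    counts = Counter()
    for line in lines:
        m = PACKET_IN.search(line)
        if m:  # lines that are not pinctrl packet-in records are skipped
            key = (m["opcode"], m["cookie"], int(m["in_port"]), m["src_mac"])
            counts[key] += 1
    return counts


# A few lines copied from the logs above, including one non-matching line.
SAMPLE = [
    "2025-05-20T16:58:39.546Z|689641|poll_loop|INFO|wakeup due to [POLLIN] on fd 32 (FIFO pipe:[1813314324]) at controller/pinctrl.c:4173 (95% CPU usage)",
    "2025-05-20T17:10:21.149Z|00004|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=48| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XX.6.X31, dst-ip=172.XX.X.2XX",
    "2025-05-20T17:10:21.271Z|00006|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XX1.X.X23, dst-ip=172.X6.X.2X3",
    "2025-05-20T17:10:21.271Z|00007|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x1367fe68| in-port=13| src-mac=fa:16:3e:1b:2b:77, dst-mac=00:00:00:00:00:00| src-ip=10.XX1.X.X23, dst-ip=172.X6.X.X41",
    "2025-05-20T17:10:21.271Z|00008|pinctrl(ovn_pinctrl0)|DBG|pinctrl received packet-in | opcode=ARP| OF_Table_ID=0| OF_Cookie_ID=0x60199dbd| in-port=338| src-mac=fa:16:3e:a7:a2:37, dst-mac=00:00:00:00:00:00| src-ip=172.XX.X2.X30, dst-ip=172.XX.X.X09",
]

if __name__ == "__main__":
    # Print the busiest (opcode, cookie, in-port, src-mac) combinations first.
    for key, n in tally_packet_ins(SAMPLE).most_common():
        print(n, key)
```

The function accepts any iterable of lines, so an open log file object can be passed in place of `SAMPLE`. Grouping by cookie and in-port is a design choice meant to show whether the load comes from a handful of flows and ports or is spread across many.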