Hi,

We are seeing ovn-controller churning constantly at 100% CPU usage.

Versions:
Open vSwitch 2.17.6
ovn-controller 22.09.2

2023-11-01T04:54:08.406Z|01514|poll_loop|INFO|wakeup due to [POLLIN] on fd 24 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:157 (100% CPU usage)
2023-11-01T04:54:11.053Z|01515|poll_loop|INFO|wakeup due to 2641-ms timeout at lib/rconn.c:687 (100% CPU usage)
2023-11-01T04:54:11.058Z|01516|poll_loop|INFO|wakeup due to [POLLIN] on fd 23 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:157 (100% CPU usage)

Our logical scale is not huge: roughly 400 hypervisors, and most
machines have at most 8 logical switch ports bound on br-int.

I think we are pushing a massive number of ARP flow entries into the OVS
switch, and I'm wondering if this is why we are seeing the high CPU usage.
ovs-ofctl dump-flows br-int | grep -i arp | wc -l
273259
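
One way to narrow down where these ARP flows come from is to group them by OpenFlow table rather than only counting them. The following is a minimal sketch, assuming the usual `ovs-ofctl dump-flows` text layout in which every flow line carries a `table=N` field. The helper name `count_flows_by_table` and the sample lines are made up for illustration and are not taken from the affected system.

```python
import re
from collections import Counter

# Each dump-flows line is assumed to contain a "table=N" field.
TABLE_RE = re.compile(r"\btable=(\d+)")


def count_flows_by_table(lines, keyword="arp"):
    """Count flow lines containing `keyword` (case-insensitive), per table.

    This mirrors `grep -i arp`, so it also matches fields such as arp_tpa.
    """
    counts = Counter()
    needle = keyword.lower()
    for line in lines:
        if needle not in line.lower():
            continue
        match = TABLE_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical sample lines in the dump-flows layout (not real data).
SAMPLE = """\
 cookie=0x0, duration=1.0s, table=8, n_packets=0, n_bytes=0, priority=50,arp,in_port=1 actions=drop
 cookie=0x0, duration=1.0s, table=27, n_packets=0, n_bytes=0, priority=100,arp,arp_tpa=10.0.0.5 actions=drop
 cookie=0x0, duration=1.0s, table=27, n_packets=0, n_bytes=0, priority=100,arp,arp_tpa=10.0.0.6 actions=drop
 cookie=0x0, duration=1.0s, table=27, n_packets=0, n_bytes=0, priority=90,ip,nw_dst=10.0.0.6 actions=drop
"""

# Print tables ordered from most to fewest matching flows.
for table, n in count_flows_by_table(SAMPLE.splitlines()).most_common():
    print(f"table={table}: {n} ARP flows")
```

On the hypervisor, the same function could be fed the real output (for example by iterating over `sys.stdin` with `ovs-ofctl dump-flows br-int` piped in). If one or two tables hold most of the flows, those table numbers can then be compared against the logical flows in the southbound database, for instance with `ovn-sbctl lflow-list`.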

Why would we see so many ARP entries being generated?

ovn-appctl coverage/show
Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=14bcd236:
cmap_expand                0.0/sec     0.000/sec        0.0000/sec   total: 3
netlink_sent               0.0/sec     0.000/sec        0.0692/sec   total: 338
netlink_received           0.0/sec     0.000/sec        0.0692/sec   total: 338
vconn_sent                 1.8/sec     0.600/sec        1.3228/sec   total: 650858
vconn_received             2.0/sec     0.617/sec        0.3764/sec   total: 1743
vconn_open                 0.0/sec     0.000/sec        0.0000/sec   total: 3
util_xalloc              13326.6/sec  8847.767/sec   143201.4078/sec   total: 657040773
unixctl_replied            0.2/sec     0.017/sec        0.0003/sec   total: 1
unixctl_received           0.2/sec     0.017/sec        0.0003/sec   total: 1
stream_open                0.0/sec     0.000/sec        0.0000/sec   total: 5
pstream_open               0.0/sec     0.000/sec        0.0000/sec   total: 1
seq_change                 6.2/sec     3.617/sec        7.6122/sec   total: 72727
rconn_sent                 1.8/sec     0.600/sec        1.3197/sec   total: 648897
rconn_queued               1.8/sec     0.600/sec        1.3197/sec   total: 648897
poll_zero_timeout          0.0/sec     0.017/sec        0.0175/sec   total: 88
poll_create_node          19.4/sec    10.550/sec       22.4789/sec   total: 225870
txn_success                0.0/sec     0.000/sec        0.0150/sec   total: 84
txn_incomplete             0.0/sec     0.000/sec        0.3678/sec   total: 3092
txn_unchanged              1.6/sec     1.067/sec        2.5200/sec   total: 30484
hmap_expand              302.2/sec   201.183/sec      642.3989/sec   total: 5011612
hmap_pathological          7.2/sec     4.800/sec       24.6650/sec   total: 151135
miniflow_malloc            0.0/sec     0.000/sec    11609.2186/sec   total: 43790437
flow_extract               1.6/sec     0.367/sec        0.0908/sec   total: 386
physical_run               0.0/sec     0.000/sec        0.0158/sec   total: 58
pinctrl_total_pin_pkts     1.6/sec     0.367/sec        0.0908/sec   total: 386
pinctrl_notify_main_thread   0.0/sec     0.000/sec        0.0003/sec   total: 1
lflow_conj_free            0.0/sec     0.000/sec        0.0089/sec   total: 32
lflow_conj_alloc           0.0/sec     0.000/sec        1.7628/sec   total: 6550
lflow_cache_trim           0.0/sec     0.000/sec        0.0019/sec   total: 9
lflow_cache_delete         0.0/sec     0.000/sec        0.0175/sec   total: 90
lflow_cache_miss           0.0/sec     0.000/sec       78.2608/sec   total: 373182
lflow_cache_hit            0.0/sec     0.000/sec     1785.4458/sec   total: 6552555
lflow_cache_add            0.0/sec     0.000/sec        0.0208/sec   total: 82447
lflow_cache_free_matches   0.0/sec     0.000/sec        0.0017/sec   total: 7
lflow_cache_free_expr      0.0/sec     0.000/sec        0.0158/sec   total: 83
lflow_cache_add_matches    0.0/sec     0.000/sec        0.0022/sec   total: 2444
lflow_cache_add_expr       0.0/sec     0.000/sec        0.0186/sec   total: 80003
consider_logical_flow      0.0/sec     0.000/sec     1519.2769/sec   total: 5645717
lflow_run                  0.0/sec     0.000/sec        0.0172/sec   total: 64
100 events never hit
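
To make dumps like this easier to compare, the counters can be parsed and ranked by their hourly rate. The following is a minimal sketch, assuming each row has a counter name, three `<rate>/sec` columns, and a `total:` field; it accepts rows on a single line as well as rows whose `total:` part wrapped onto the next line, as mail clients tend to do. The function name `parse_coverage` is made up for illustration, and the sample rows are copied from the ovn-controller output above.

```python
import re

# name, three "<rate>/sec" columns (5 s, 1 min, 1 h), then "total: <count>".
ROW_RE = re.compile(
    r"^(\S+)\s+([\d.]+)/sec\s+([\d.]+)/sec\s+([\d.]+)/sec\s+total:\s*(\d+)$"
)


def parse_coverage(text):
    """Parse coverage/show text into a list of dicts, one per counter."""
    rows = []
    pending = ""  # start of a row whose remainder wrapped to the next line
    for raw in text.splitlines():
        line = f"{pending} {raw.strip()}".strip()
        match = ROW_RE.match(line)
        if match:
            name, r5s, r1m, r1h, total = match.groups()
            rows.append({
                "name": name,
                "per_sec_5s": float(r5s),
                "per_sec_1m": float(r1m),
                "per_sec_1h": float(r1h),
                "total": int(total),
            })
            pending = ""
        elif "/sec" in line:
            # Looks like an incomplete row: keep it and join with the next line.
            pending = line
        else:
            # Header, hash line, "N events never hit", or blank line.
            pending = ""
    return rows


# Rows taken from the ovn-controller dump, in their wrapped form.
SAMPLE = """\
Event coverage, avg rate over last: 5 seconds, last minute, last hour,
 hash=14bcd236:
cmap_expand                0.0/sec     0.000/sec        0.0000/sec   total:
3
util_xalloc              13326.6/sec  8847.767/sec   143201.4078/sec
total: 657040773
lflow_cache_hit            0.0/sec     0.000/sec     1785.4458/sec   total:
6552555
100 events never hit
"""

# List counters from highest to lowest average rate over the last hour.
for row in sorted(parse_coverage(SAMPLE), key=lambda r: r["per_sec_1h"], reverse=True):
    print(f'{row["name"]:<20} {row["per_sec_1h"]:>14.4f}/sec  total={row["total"]}')
```

Sorting by the last-hour column rather than the 5-second column is a deliberate choice here: several counters in this dump (for example `consider_logical_flow` and `lflow_cache_hit`) show a high hourly rate but 0.0/sec over the last 5 seconds, so the short window alone would hide them.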


ovs-appctl coverage/show
Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=3ca88d5c:
nln_changed                0.0/sec     0.000/sec        0.0033/sec   total: 2838
netlink_sent              95.2/sec   110.667/sec      112.5772/sec   total: 175716759
netlink_recv_jumbo        19.8/sec    24.350/sec       24.8900/sec   total: 19517705
netlink_received         125.0/sec   141.633/sec      144.4631/sec   total: 216927167
netdev_set_ethtool         0.0/sec     0.000/sec        0.0003/sec   total: 148
netdev_get_ethtool         0.0/sec     0.000/sec        0.0017/sec   total: 889
netdev_set_hwaddr          0.0/sec     0.000/sec        0.0000/sec   total: 1
netdev_get_ifindex       110.8/sec   127.867/sec      126.3978/sec   total: 42850322
netdev_set_policing        0.0/sec     0.000/sec        0.1311/sec   total: 60019
vconn_sent                 0.2/sec     0.450/sec        1.2531/sec   total: 921184
vconn_received             0.2/sec     0.450/sec        1.3094/sec   total: 24143217
util_xalloc              3207.6/sec  3477.633/sec     4545.9147/sec   total: 6406423189
unixctl_replied            0.4/sec     0.200/sec        0.1706/sec   total: 405431
unixctl_received           0.4/sec     0.200/sec        0.1706/sec   total: 405431
stream_open                0.0/sec     0.000/sec        0.0000/sec   total: 1
pstream_open               0.0/sec     0.000/sec        0.0000/sec   total: 9
seq_change               1085.8/sec  1177.800/sec     1191.0867/sec   total: 2871230085
rconn_sent                 0.2/sec     0.450/sec        1.0297/sec   total: 919861
rconn_queued               0.2/sec     0.450/sec        1.0297/sec   total: 919861
rconn_overflow             0.0/sec     0.000/sec        0.0000/sec   total: 1414
poll_zero_timeout         27.2/sec    26.133/sec       26.6861/sec   total: 31541067
poll_create_node         339.6/sec   348.733/sec      352.6569/sec   total: 739173782
txn_success                0.2/sec     0.200/sec        0.2031/sec   total: 506211
txn_incomplete             0.2/sec     0.200/sec        0.2169/sec   total: 525150
txn_unchanged              0.0/sec     0.033/sec        0.0461/sec   total: 87571
netdev_get_stats          68.2/sec    68.200/sec       67.6836/sec   total: 156244559
mac_learning_expired       0.0/sec     0.000/sec        0.0086/sec   total: 3402
mac_learning_learned       0.0/sec     0.000/sec        0.0061/sec   total: 3449
hmap_expand               42.2/sec    55.617/sec       63.5119/sec   total: 103998889
hmap_pathological          0.0/sec     0.000/sec        0.5786/sec   total: 541510
hindex_expand              0.0/sec     0.000/sec        0.0000/sec   total: 17
hindex_pathological        0.0/sec     0.000/sec        0.0000/sec   total: 4355
miniflow_malloc            0.0/sec     0.000/sec        2.0386/sec   total: 46495678
flow_extract              17.4/sec    19.917/sec       19.5244/sec   total: 20628513
dpif_execute_with_help     0.0/sec     0.117/sec        0.1758/sec   total: 233722
dpif_execute              16.2/sec    19.233/sec       18.9425/sec   total: 19779730
dpif_flow_del              9.4/sec    13.067/sec       13.0742/sec   total: 15394670
dpif_flow_put             16.8/sec    17.683/sec       17.8483/sec   total: 19618541
dpif_flow_get              0.0/sec     0.000/sec        0.0000/sec   total: 29
dpif_flow_flush            0.0/sec     0.000/sec        0.0000/sec   total: 3
dpif_port_del              0.0/sec     0.000/sec        0.0006/sec   total: 309
dpif_port_add              0.0/sec     0.000/sec        0.0003/sec   total: 188
cmap_shrink                0.0/sec     0.000/sec        0.0267/sec   total: 116793
cmap_expand                0.0/sec     0.000/sec        0.0331/sec   total: 118948
ccmap_shrink               0.0/sec     0.000/sec        0.0000/sec   total: 27719
ccmap_expand               0.0/sec     0.000/sec        0.0000/sec   total: 1525
xlate_actions             27.4/sec    26.567/sec       39.7833/sec   total: 27537033
upcall_ukey_replace        0.0/sec     0.000/sec        0.0000/sec   total: 245
upcall_ukey_contention     0.0/sec     0.000/sec        0.0000/sec   total: 156
ukey_dp_change             0.0/sec     0.000/sec        0.0000/sec   total: 49
revalidate_missed_dp_flow   0.0/sec     0.000/sec        0.0031/sec   total: 1324
handler_duplicate_upcall   0.6/sec     1.267/sec        1.3233/sec   total: 251186
dumped_new_flow            0.0/sec     0.000/sec        0.0000/sec   total: 3496
dumped_duplicate_flow      0.0/sec     0.000/sec        0.0022/sec   total: 1035
rev_mac_learning           0.0/sec     0.000/sec        0.0089/sec   total: 3615
rev_flow_table             0.0/sec     0.000/sec        0.0150/sec   total: 33855
rev_port_toggled           0.0/sec     0.000/sec        0.0036/sec   total: 890
rev_reconfigure            0.0/sec     0.000/sec        0.0094/sec   total: 2455
packet_in_overflow         0.0/sec     0.000/sec        0.0000/sec   total: 27
ofproto_update_port        0.0/sec     0.000/sec        0.0092/sec   total: 25328
ofproto_recv_openflow      0.2/sec     0.450/sec        1.3089/sec   total: 24143105
ofproto_packet_out         0.0/sec     0.000/sec        0.0017/sec   total: 59238
ofproto_flush              0.0/sec     0.000/sec        0.0000/sec   total: 3
bridge_reconfigure         0.0/sec     0.000/sec        0.0094/sec   total: 3736
88 events never hit