Hi Adrian, thanks for the reply. Something went wrong with my previous mail (empty??) so trying again.
On Wed, 2024-01-17 at 12:43 +0100, Adrian Moreno wrote:
> Hi Michiel,
>
> On 1/15/24 11:45, Michiel van den Berg via discuss wrote:
> > Openvswitch + rstp - bug? or configuration mistake?
> >
> > Openvswitch with rstp enabled wont reply to arp requests, there for can not be
> > reached, until it sends traffic outside where other hosts can pick up its mac
> > address and send traffic.
> >
> > Below test shows the simplest bridge configuration I can make, with 1 external
> > and 1 int port. Ofcourse this is not how you would use STP in production, but it
> > works as a simple example.
> >
> > Test: (Debian 12, ifupdown2)
> >
> > # Ensure config is clean.
> > ovs-vsctl del-br storage
> >
> > # Create bridge with rstp enabled.
> > BRIDGE=storage
> > INTPORT=stor0
> > EXTPORT=ens19
> >
> > # Create bridge
> > ovs-vsctl add-br $BRIDGE
> > ovs-vsctl set Bridge $BRIDGE rstp_enable=true
>
> Are you not adding any rule to the bridge?

No OpenFlow rules; this should act as a simple l3 switch until we are ready to do more. Our DC is growing, however we are not yet at the point where we need such dynamic configuration.

> > # Add INTPORT
> > ovs-vsctl add-port $BRIDGE $INTPORT
> > ovs-vsctl set Port $INTPORT tag=18
> > ovs-vsctl set Interface $INTPORT type=internal
> > ovs-vsctl set Port $INTPORT other_config:rstp-enable=true # Is this even required? - doesnt change working status.
> >
> > # Add EXTPORT
> > ovs-vsctl add-port $BRIDGE $EXTPORT
> > ovs-vsctl set Port $EXTPORT other_config:rstp-enable=true
> >
> > # Above configuration is correct according to docs (outside of the intport
> > having rstp enabled). In this case ARP requests are being ignored.
> >
> > 11:14:37.395050 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 172.25.42.24 tell 172.25.42.21, length 28
> > 11:14:37.483090 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 172.25.42.24 tell 172.25.42.22, length 28
> > 11:14:38.418969 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 172.25.42.24 tell 172.25.42.21, length 28
> > 11:14:38.507020 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 172.25.42.24 tell 172.25.42.22, length 28
> >
> > Above is 2 systems (.21 and .22) trying to ping this system (.24)
>
> Where are those systems connected to?

Our regular setup across all our clusters is:

X Proxmox KVM hosts, each with 4 NICs:
- 2 connected to a frontend network (via ovs, bridge, rstp)
- 2 connected to a storage network (via ovs, bridge, rstp)

X Ceph hosts, each with 3 NICs:
- 1 connected to the frontend network (plain Linux interface)
- 2 connected to storage (via ovs, bridge, rstp)

2 'frontend' switches - Arista, 10g, rstp enabled
2 'storage' switches - Arista, 40g, rstp enabled
- each switch holds a link to each system and to the other switch in the same zone.

The machines involved in this test were 3 of the Ceph machines: 2 are on Debian bullseye with its stable version of OVS, and one has been upgraded to Debian bookworm and its stable version of OVS. .21, .22, and .24 are these machines, which are referenced throughout this mail.

> > From TCPDump I can also see STP is in the correct state
> >
> > 11:16:27.810779 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 1000.6a:06:2b:fe:2b:41.800f, length 36
> >     message-age 0.00s, max-age 6.00s, hello-time 2.00s, forwarding-delay 4.00s
> >     root-id 1000.6a:06:2b:fe:2b:41, root-pathcost 0, port-role Designated
>
> Can you send the output of "ovs-appctl rstp/show {bridge}"?
From the 'broken' .24 machine (ens20 is another adapter which is supposed to be part of the RSTP config, but has been left out for simplicity):

---- stor ----
Root ID:
  stp-priority    4096
  stp-system-id   6a:06:2b:fe:2b:41
  stp-hello-time  2s
  stp-max-age     6s
  stp-fwd-delay   4s
  root-port       ens19
  root-path-cost  2000

Bridge ID:
  stp-priority    32768
  stp-system-id   b2:57:88:4a:e5:4a
  stp-hello-time  2s
  stp-max-age     20s
  stp-fwd-delay   15s

  Interface  Role       State      Cost     Pri.Nbr
  ---------- ---------- ---------- -------- -------
  ens19      Root       Forwarding 2000     128.1

From an older installation (.21):

---- vmbr2 ----
Root ID:
  stp-priority    4096
  stp-system-id   6a:06:2b:fe:2b:41
  stp-hello-time  2s
  stp-max-age     6s
  stp-fwd-delay   4s
  root-port       eth1
  root-path-cost  200000

Bridge ID:
  stp-priority    32768
  stp-system-id   ea:9c:22:f2:07:41
  stp-hello-time  2s
  stp-max-age     6s
  stp-fwd-delay   4s

  Interface  Role       State      Cost     Pri.Nbr
  ---------- ---------- ---------- -------- -------
  eth2       Alternate  Discarding 200000   128.1
  eth1       Root       Forwarding 200000   128.2

The root bridge is the same on both machines, which shows me RSTP is communicating on some level. The difference in port cost between the machines is due to a change in the OVS source, where the speed assumed for a port of 'unknown' speed was raised from 100m to 10g; most machines are still running the older OVS version.

Btw, this problem already exists on the older version as well, but because of the order in which ifupdown builds the configuration, the MAC address becomes known in the network BEFORE RSTP gets enabled, which is something I've shown below.

> Is there anything odd in the logs?

Maybe, I'm not sure. At least to me, 2 things stand out right now.

I ran a debug log for a few minutes. With an otherwise stable test network (no RSTP changes), it does seem to transition port modes. A block like this comes by roughly every second (the only difference being the counters and time):

2024-01-15T12:47:01.242Z|08068|poll_loop|DBG|wakeup due to [POLLIN] on fd 31 (FIFO pipe:[17458]) at ../vswitchd/bridge.c:421 (0% CPU usage)
2024-01-15T12:47:01.243Z|08069|rstp_sm|DBG|stor: move_rstp()
2024-01-15T12:47:01.243Z|08070|rstp_sm|DBG|stor, port 1: Port_role_transition_sm 7 -> 6
2024-01-15T12:47:01.243Z|08071|rstp_sm|DBG|stor, port 1: port_transmit_sm 6 -> 3
2024-01-15T12:47:01.243Z|08072|rstp_sm|DBG|stor: move_rstp()
2024-01-15T12:47:01.243Z|08073|rstp_sm|DBG|stor, port 1: Port_role_transition_sm 6 -> 7
2024-01-15T12:47:01.243Z|08074|rstp_sm|DBG|stor, port 1: port_transmit_sm 3 -> 5
2024-01-15T12:47:01.243Z|08075|rstp_sm|DBG|stor: move_rstp()
2024-01-15T12:47:01.243Z|08076|rstp_sm|DBG|stor, port 1: port_transmit_sm 5 -> 6
2024-01-15T12:47:01.243Z|08077|rstp_sm|DBG|stor: move_rstp()

It seems weird to me the port wants to transition every second even though there are no actual changes in the network.

Every 2 seconds you see some information come by, probably related to stp-hello-time which is set to 2 seconds.

2024-01-15T12:47:03.295Z|03919|poll_loop(handler92)|DBG|wakeup due to [POLLIN] on fd 28 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/dpif-netlink.c:3195 (0% CPU usage)
2024-01-15T12:47:03.295Z|03920|rstp_sm(handler92)|DBG|stor: move_rstp()
2024-01-15T12:47:03.295Z|03921|rstp_sm(handler92)|DBG|stor, port 1: Port_receive_sm 4 -> 3
2024-01-15T12:47:03.295Z|03922|rstp_sm(handler92)|DBG|stor: move_rstp()
2024-01-15T12:47:03.295Z|03923|rstp_sm(handler92)|DBG|stor, port 1: Port_receive_sm 3 -> 4
2024-01-15T12:47:03.295Z|03924|rstp_sm(handler92)|DBG|stor, port 1: Port_information_sm 8 -> 9
2024-01-15T12:47:03.295Z|03925|rstp_sm(handler92)|DBG|stor: move_rstp()
2024-01-15T12:47:03.295Z|03926|rstp_sm(handler92)|DBG|v1: 1.000.6a062bfe2b41, 0, 1.000.6a062bfe2b41, 32783, 0
2024-01-15T12:47:03.295Z|03927|rstp_sm(handler92)|DBG|v2: 1.000.6a062bfe2b41, 0, 1.000.6a062bfe2b41, 32783, 0
2024-01-15T12:47:03.295Z|03928|rstp_sm(handler92)|DBG|superior_same
2024-01-15T12:47:03.295Z|03929|rstp_sm(handler92)|DBG|stor, port 1: Port_information_sm 9 -> 17
2024-01-15T12:47:03.295Z|03930|rstp_sm(handler92)|DBG|stor: move_rstp()
2024-01-15T12:47:03.295Z|03931|rstp_sm(handler92)|DBG|stor, port 1: Port_information_sm 17 -> 7
2024-01-15T12:47:03.295Z|03932|rstp_sm(handler92)|DBG|stor: move_rstp()
2024-01-15T12:47:03.295Z|03933|rstp_sm(handler92)|DBG|stor, port 1: Port_information_sm 7 -> 8
2024-01-15T12:47:03.295Z|03934|rstp_sm(handler92)|DBG|stor: move_rstp()
2024-01-15T12:47:03.295Z|08102|poll_loop|DBG|wakeup due to [POLLIN] on fd 31 (FIFO pipe:[17458]) at ../vswitchd/bridge.c:421 (0% CPU usage)
2024-01-15T12:47:03.295Z|03935|poll_loop(handler92)|DBG|wakeup due to 0-ms timeout at ../ofproto/ofproto-dpif-upcall.c:802 (0% CPU usage)

>
> Is all traffic between those ports dropped or only ARP? i.e: if you manually set
> the arp entries in both ends, does traffic flow again?

As far as I've been able to see, correct. However:

All machines are running lldpd, which naturally broadcasts information over the network. However, this does not seem to cause the MAC address to be known, even though LLDP itself works: LLDP information about the other end can be retrieved from both the switch and the machine.

Any attempt at a ping, telnet, ssh or whatever you can think of from the 'broken' (.24) machine to any working machine causes the whole network to come alive. E.g., after starting the bridge:

ping from .21 to .24 (not working)
ping from .22 to .24 (not working)
ping from .24 to .21 - will work.

After this, BOTH of the above pings start working too. This is likely due to some smarts in the switch, which can tell .22 about the MAC address it just learned, but this is a guess.

> If the ARP packet is being dropped inside OVS you could try running "ovs-appctl
> ofproto/trace", which will give us the reason why OVS decided to drop it.

ofproto is not available as we have no OpenFlow rules; this is a simple l3 setup.

> >
> > A working configuration would be removing the 3 rstp lines from above script. In
> > this case ofcourse RSTP is not available, but the port does reply to the arp
> > request ensuring other hosts can reach it.
> >
> > # Ensure config is clean.
> > ovs-vsctl del-br storage
> >
> > # Create bridge with rstp enabled.
> > BRIDGE=storage
> > INTPORT=stor0
> > EXTPORT=ens19
> >
> > # Create bridge
> > ovs-vsctl add-br $BRIDGE
> >
> > # Add INTPORT
> > ovs-vsctl add-port $BRIDGE $INTPORT
> > ovs-vsctl set Port $INTPORT tag=18
> > ovs-vsctl set Interface $INTPORT type=internal
> >
> > # Add EXTPORT
> > ovs-vsctl add-port $BRIDGE $EXTPORT
> >
> > # TCPDUMP:
> > 11:24:34.707063 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 172.25.42.24 tell 172.25.42.21, length 28
> > 11:24:35.211050 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 172.25.42.24 tell 172.25.42.22, length 28
> > 11:24:39.832310 ARP, Ethernet (len 6), IPv4 (len 4), Reply 172.25.42.21 is-at 9e:eb:24:bb:1f:17, length 28
> > 11:24:40.344110 ARP, Ethernet (len 6), IPv4 (len 4), Reply 172.25.42.22 is-at be:b9:84:91:a9:28, length 28
> >
> > How above TCPDUMP looks weird, would expect "Reply 172.25.42.24 is-at
> > <somewhere>" however that might be something im doing wrong with the tcpdump
> > command, let me know :-)
> >
> > Enabling RSTP on this bridge after starting it up, simply:
> >
> > ovs-vsctl set Bridge $BRIDGE rstp_enable=true
> > ovs-vsctl set Port $INTPORT other_config:rstp-enable=true
> > ovs-vsctl set Port $EXTPORT other_config:rstp-enable=true
> >
> > Will stop any ARP messages again, so clearing the mac address table on one of
> > the other hosts will cause communication to stop.
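
For anyone who wants to repeat that last step, here is a rough sketch of how the neighbour entry could be cleared and re-checked from one of the other hosts (e.g. .21). It assumes iproute2 and iputils ping are installed there. The TARGET default, the DRY_RUN switch and the run/repro helper names are made up for this illustration and are not part of the original test. By default it only prints the commands; the real run needs root.

```shell
#!/bin/sh
# Hypothetical helper names and defaults, for illustration only.
TARGET=${TARGET:-172.25.42.24}   # address of the host with RSTP enabled
DRY_RUN=${DRY_RUN:-1}            # 1 = only print commands, 0 = execute them

run() {
    # Always show the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

repro() {
    run ip neigh flush to "$TARGET"   # drop the cached MAC for the target
    run ping -c 3 -W 1 "$TARGET"      # triggers a fresh ARP request
    run ip neigh show to "$TARGET"    # INCOMPLETE or FAILED: no ARP reply arrived
}
```

Calling `repro` prints the three commands; `DRY_RUN=0 sh -c '. ./script.sh; repro'` (script name assumed) would actually run them.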
> >
> > Versions: *Please not this problem already exists for many years across
> > different versions*
> >
> > # cat /etc/debian_version
> > 12.4
> > # dpkg -l linux-image-amd64
> > linux-image-amd64 6.1.67-1
> > # uname -a
> > Linux ceph04-test 6.1.0-16-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.67-1 (2023-12-12) x86_64 GNU/Linux
> > # ovs-vswitchd --version
> > ovs-vswitchd (Open vSwitch) 3.1.0
> >
> > Other refs:
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2017-August/045083.html
> > https://forum.proxmox.com/threads/ovs-intport-you-cant-ping-me-unless-i-ping-you-first.104828/
> > https://serverfault.com/questions/1041970/ovs-bridge-inbound-broadcast-packets-dropped-when-rstp-enabled
> >
> > _______________________________________________
> > discuss mailing list
> > disc...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss