Replying to this, just so that we keep a record on the mailing list.
On 12.06.2025 17:01, Hannes Duerr wrote:
Tested as follows:
Created 5 Proxmox VE nodes
joined them as a cluster
added two interfaces per node, all interfaces on the same host bridge
Assigned the interfaces VLAN tags so that the nodes form a ring:
   ----1----
  /         \
 5           2
  \         /
   4-------3
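For anyone reproducing this: the ring can be wired by giving each node VM two NICs on the same bridge of the outer host and tagging each link with its own VLAN. A rough sketch, assuming the nodes are VMs with VMIDs 101-105, the bridge is vmbr0 and the link VLANs are 12, 23, 34, 45 and 51 (all of these IDs are made up):

    # node 1 (VMID 101): link to node 2 on VLAN 12, link to node 5 on VLAN 51
    qm set 101 -net1 virtio,bridge=vmbr0,tag=12
    qm set 101 -net2 virtio,bridge=vmbr0,tag=51
    # node 2 (VMID 102): link to node 3 on VLAN 23, link to node 1 on VLAN 12
    qm set 102 -net1 virtio,bridge=vmbr0,tag=23
    qm set 102 -net2 virtio,bridge=vmbr0,tag=12
    # ...and so on around the ring for nodes 3-5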
== OSPF ==
Created new OSPF fabric `backbone` with area 0.0.0.0 and IPv4 prefix
192.168.0.0/24
Added all 5 nodes and assigned them the IPv4 addresses 192.168.0.[1-5]
(unnumbered)
Checked routes with vtysh -c 'show ip ospf route' and pinged all IPs
-> works as expected
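A quick way to repeat that check from any node (just a sketch; vtysh and the fabric IPs are the ones mentioned above):

    # OSPF routing table as seen by FRR
    vtysh -c 'show ip ospf route'
    # reachability of all fabric IPs
    for i in 1 2 3 4 5; do ping -c 1 -W 1 192.168.0.$i; done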
Added PtP /31 addresses to the interfaces (numbered) and reloaded the config
Checked routes with vtysh -c 'show ip ospf route' and pinged all IPs
-> works as expected
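Just to illustrate what the numbered /31 setup means on the wire (the fabric configures this itself; the prefix and interface name below are made up): each PtP link gets a two-address /31 per RFC 3021, e.g. for the node1-node2 link:

    # on node 1, interface towards node 2
    ip addr add 10.0.12.0/31 dev ens20
    # on node 2, interface towards node 1
    ip addr add 10.0.12.1/31 dev ens20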
Removed nodes 4 and 5 from the `backbone` fabric
Created additional OSPF fabric `ospf2` with area 1.1.1.1 and IPv4
prefix 192.168.1.0/24
Added nodes 3, 4 and 5
Added PtP /31 addresses to the interfaces (numbered) and reloaded the config
┌───────────────────┐  ┌───────────────────┐
│   Area 0.0.0.0    │  │   Area 1.1.1.1    │
│                   │  │                   │
│ F1 <-> F2 <-> F3 <┼──┼> F3 <-> F4 <-> F5 │
│                   │  │                   │
└───────────────────┘  └───────────────────┘
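On node 3, which sits in both areas, this boils down to a single OSPF instance with interfaces in different areas. A rough sketch of the FRR primitives involved (not the config the fabrics plugin actually generates; the PtP interface names are assumed, and the router-id is assumed to be the node's backbone fabric IP):

    router ospf
     ospf router-id 192.168.0.3
    !
    interface dummy_backbone
     ip ospf area 0.0.0.0
    !
    ! PtP interface towards node 2 in area 0.0.0.0
    interface ens20
     ip ospf area 0.0.0.0
    !
    ! PtP interface towards node 4 in area 1.1.1.1
    interface ens21
     ip ospf area 1.1.1.1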
Checked routes with vtysh -c 'show ip route'
Codes: K - kernel route, C - connected, L - local, S - static,
O - OSPF, * - FIB route
[...]
O   192.168.0.1/32 [110/10] via 0.0.0.0, dummy_backbone onlink, rmapsrc 192.168.0.1, weight 1, 06:40:38
O>* 192.168.0.2/32 [110/20] via 192.168.0.2, ens20 onlink, rmapsrc 192.168.0.1, weight 1, 06:40:23
O>* 192.168.0.3/32 [110/30] via 192.168.0.2, ens20 onlink, rmapsrc 192.168.0.1, weight 1, 06:40:18
O   192.168.1.3/32 [110/30] via 192.168.0.2, ens20 onlink, rmapsrc 192.168.0.1, weight 1, 06:40:18
O   192.168.1.4/32 [110/40] via 192.168.0.2, ens20 onlink, rmapsrc 192.168.0.1, weight 1, 06:40:14
O   192.168.1.5/32 [110/50] via 192.168.0.2, ens20 onlink, rmapsrc 192.168.0.1, weight 1, 06:40:08
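To see which of those entries actually made it into the kernel FIB, compare with the kernel routing table (a sketch; only the routes marked '>*' above should show up):

    ip -4 route | grep -E '192\.168\.[01]\.'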
You can see that the OSPF routes to the second fabric are learned
automatically, but not installed into the FIB. Accordingly, they are not
visible in the kernel routing table. The reason for this is the
access-list restriction in /etc/frr/frr.conf:
`access-list pve_ospf_backbone_ips permit 192.168.0.0/24`
We already discussed this off-list and are keeping it like this for now.
This will probably be a future addition, something like "import-subnets"
or even "import-fabrics", where you can select other subnets/fabrics that
are allowed. We currently filter all routes in FRR so that only routes to
the actual fabric IPs (from the dummy interface) are inserted into the
FIB; this is to avoid inserting the PtP IP addresses there.
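For reference, the filtering in frr.conf works roughly like the following sketch; the access-list line is the one quoted above, while the route-map name, the `set src` value and the zebra `ip protocol` binding are assumptions based on the `rmapsrc` seen in the route output:

    access-list pve_ospf_backbone_ips permit 192.168.0.0/24
    !
    route-map pve_ospf_backbone permit 10
     match ip address pve_ospf_backbone_ips
     set src 192.168.0.1
    !
    ! apply the route-map to OSPF routes before zebra installs them into the kernel
    ip protocol ospf route-map pve_ospf_backbone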
== OpenFabric ==
Created new OpenFabric fabric `of1` with IPv6 prefix
2a02:ab8:308:3:eff:0:ff00:1/64
Added all 5 nodes and assigned them the IPv6
addresses 2a02:ab8:308:3:eff:0:ff00:[1-5] (unnumbered)
Checked routes with vtysh -c 'show openfabric route' and pinged all
IPs -> works as expected
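For context, an OpenFabric fabric maps to FRR's fabricd. A minimal sketch of the primitives involved (not the generated config; the NET and the interface name are made up):

    router openfabric of1
     net 49.0000.0000.0001.00
    !
    interface ens20
     ipv6 router openfabric of1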
Installed a Ceph cluster on all nodes and initialized 2 OSDs per node
Took one node down and the routes switched as expected
Took the node up again -> the node was not pingable anymore and the
routes did not come up again, even after waiting 10 minutes
Already talked to Gabriel about this but we're not yet sure what the
issue is here.
The issue here is two-fold:
* IPv6 forwarding was not enabled. We need to enable IPv6 forwarding
globally, because there is no per-interface switch as there is with
IPv4. This is fixed in v4 (see the sysctl sketch after this list).
* When booting up, there is a race between OpenFabric initializing the
interface (circuit) and the underlying interface coming up. This
results in fabricd not configuring the circuit, which is also why an
FRR restart after the initial boot fixes the issue. This is fixed by
https://github.com/FRRouting/frr/pull/17083, which is included in FRR
10.3.1 as shipped with Debian Trixie.
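For the first point, the global switch is net.ipv6.conf.all.forwarding; a minimal sketch of enabling it persistently by hand (v4 of the series may do this differently, and the drop-in file name is arbitrary):

    # enable IPv6 forwarding globally
    echo 'net.ipv6.conf.all.forwarding = 1' > /etc/sysctl.d/99-fabric-forwarding.conf
    sysctl --system
    # verify
    sysctl net.ipv6.conf.all.forwarding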
Thanks a lot for testing!