It's basically basic zone on steroids, quite exciting :)

-----Original Message-----
From: Nux <[email protected]>
Sent: 08 December 2025 14:18
To: [email protected]
Cc: Alex Mattioli <[email protected]>; Wido den Hollander <[email protected]>
Subject: Re: [DISCUSS] New network type to eliminate L2 and use pure L3 routing up until the VM
Hi Wido,

This looks pretty, pretty nice. Combined with the VNF feature it could
perhaps open a new chapter in CloudStack networking.

Looking forward to an initial implementation. Good job!

On 2025-12-08 21:05, Wido den Hollander via dev wrote:
> Hello,
>
> I have discussed this with Alex during CCC 2025 in Milan and as a
> follow-up I created an issue on GitHub:
> https://github.com/apache/cloudstack/issues/12210
>
> Wido
>
> On 24-09-2025 at 21:50, Wido den Hollander wrote:
>> Hello,
>>
>> I have a fascination for networking, as some might be aware. I think
>> that a proper network design is the solid foundation underneath a
>> cloud, allowing it to scale and provide the flexibility an
>> organization requires.
>>
>> I've worked a lot on the VXLAN+EVPN+BGP integration in CloudStack and
>> I think it's a great solution that should be the default for anybody
>> who starts to deploy CloudStack today.
>>
>> VXLAN does have its drawbacks: it requires VXLAN offloading in the
>> NIC, switches and routers that can process it, and additional
>> networking skills.
>>
>> In the end a VM needs connectivity, IPv4 and/or IPv6, which allows it
>> to connect to other servers and the rest of the internet.
>>
>> In the current design, whether it is traditional VLAN or VXLAN, we
>> still assume that there is an L2 network: the VLAN, or the VNI in
>> VXLAN.
>>
>> Technically these are not required and we can use pure L3 routing
>> towards the VMs from the host. In my opinion this can simplify
>> networking while also adding scalability.
>>
>>
>> ** cloudbr0 **
>> On a test machine with plain Libvirt+KVM I created cloudbr0:
>>
>> 113: cloudbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
>>     link/ether f6:73:63:49:1f:33 brd ff:ff:ff:ff:ff:ff
>>     inet 169.254.0.1/32 scope global cloudbr0
>>        valid_lft forever preferred_lft forever
>>     inet6 fe80::b009:e3ff:fe41:1394/64 scope link
>>        valid_lft forever preferred_lft forever
>>     inet6 fe80::1/64 scope link
>>        valid_lft forever preferred_lft forever
>>
>> You can see I've added two addresses to the bridge:
>>
>> - 169.254.0.1/32
>> - fe80::1/64
>>
>> ** Test VM **
>> I have deployed a test VM which I attached to cloudbr0 and manually
>> added the addresses using netplan:
>>
>> network:
>>   ethernets:
>>     ens18:
>>       addresses:
>>       - 2a14:9b80:103::100/128
>>       - 2.57.57.29/32
>>       nameservers:
>>         addresses:
>>         - 2620:fe::fe
>>         search: []
>>       routes:
>>       - to: 0.0.0.0/0
>>         via: 169.254.0.1
>>         on-link: true
>>       - to: ::/0
>>         via: fe80::1
>>   version: 2
>>
>> This results in:
>>
>> root@routing-test:~# ip addr show dev ens18
>> 2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
>>     link/ether bc:24:11:93:d7:94 brd ff:ff:ff:ff:ff:ff
>>     altname enp0s18
>>     inet 2.57.57.29/32 scope global ens18
>>        valid_lft forever preferred_lft forever
>>     inet6 2a14:9b80:103::100/128 scope global
>>        valid_lft forever preferred_lft forever
>>     inet6 fe80::be24:11ff:fe93:d794/64 scope link
>>        valid_lft forever preferred_lft forever
>> root@routing-test:~#
>>
>> In the VM you now see the IPv4 and IPv6 routes:
>>
>> root@routing-test:~# ip -4 r
>> default via 169.254.0.1 dev ens18 proto static onlink
>> root@routing-test:~# ip -6 r
>> 2a14:9b80:103::100 dev ens18 proto kernel metric 256 pref medium
>> fe80::/64 dev ens18 proto kernel metric 256 pref medium
>> default via fe80::1 dev ens18 proto static metric 1024 pref medium
>> root@routing-test:~#
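A minimal sketch of the hypervisor-side commands that could produce the cloudbr0 state shown earlier, assuming a plain Linux bridge; the bridge name and gateway addresses come from the example above, while the forwarding sysctls are an assumption about how the rest of the host is configured:

    # create the bridge and add the shared gateway addresses
    ip link add cloudbr0 type bridge
    ip link set cloudbr0 up
    ip addr add 169.254.0.1/32 dev cloudbr0
    ip addr add fe80::1/64 dev cloudbr0

    # the host routes between its uplink and cloudbr0,
    # so forwarding must be enabled for both protocols
    sysctl -w net.ipv4.ip_forward=1
    sysctl -w net.ipv6.conf.all.forwarding=1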
>> ** Static route and ARP/NDP entry **
>> On the HV I needed to add two routes and ARP/NDP entries pointing to
>> the VM:
>>
>> ip -6 route add 2a14:9b80:103::100/128 dev cloudbr0
>> ip -6 neigh add 2a14:9b80:103::100 lladdr BC:24:11:93:D7:94 dev cloudbr0 nud permanent
>> ip -4 route add 2.57.57.29/32 dev cloudbr0
>> ip -4 neigh add 2.57.57.29 lladdr BC:24:11:93:D7:94 dev cloudbr0 nud permanent
>>
>> BC:24:11:93:D7:94 is the MAC address of the VM in this case.
>>
>> ** L3 Routing with BGP **
>> On the hypervisor I have the FRR BGP daemon running, which advertises
>> the /32 and /128 routes:
>>
>> - 2.57.57.29/32
>> - 2a14:9b80:103::100/128
>>
>> ubuntu# sh ip route 2.57.57.29
>> Routing entry for 2.57.57.29/32
>>   Known via "kernel", distance 0, metric 0, best
>>   Last update 00:00:51 ago
>>   * directly connected, cloudbr0, weight 1
>>
>> hv-138-a12-26# show ipv6 route 2a14:9b80:103::100
>> Routing entry for 2a14:9b80:103::100/128
>>   Known via "static", distance 1, metric 0
>>   Last update 6d04h23m ago
>>     directly connected, cloudbr0, weight 1
>>
>> Routing entry for 2a14:9b80:103::100/128
>>   Known via "kernel", distance 0, metric 1024, best
>>   Last update 6d08h27m ago
>>   * directly connected, cloudbr0, weight 1
>>
>> ubuntu#
>>
>> Both addresses are now advertised upstream towards the other BGP
>> peers, while the hypervisor only receives the default routes from
>> upstream (0.0.0.0/0 and ::/0).
>>
>> ** CloudStack **
>> As we only route /32s or /128s towards a VM we gain a lot more
>> flexibility, as these IPs can be routed anywhere in your network: no
>> stretching of VLANs, nor routing VXLAN between sites.
>>
>> CloudStack orchestration will need to make sure we program the right
>> routes on the hypervisor, but this is something Libvirt hooks can
>> take care of.
>>
>> BGP is to be configured by the admin, and that is to be documented.
>>
>> This would be an additional type of network which will not support:
>>
>> - DHCP
>> - User-Data from the VR
>> - A VR at all
>>
>> UserData will need to come from ConfigDrive, and using ConfigDrive
>> the VM will need to configure the IPs locally.
>>
>> Security Grouping can and will still work as it does right now.
>>
>> ** IPv4 and IPv6 **
>> This idea is protocol independent, and since DHCP is no longer needed
>> it can work in multiple modes:
>>
>> - IPv4 only
>> - IPv6 only (really single stack!)
>> - IPv4+IPv6 (dual stack)
>>
>> ConfigDrive will take care of the network configuration.
>>
>> ** What's next? **
>>
>> I am not proposing anything to be developed right now, but I hope to
>> spark some ideas with people and get a discussion going.
>>
>> Will this lead to an implementation being written? Let's see!
>>
>> Wido
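The per-VM routes and neighbor entries above are what CloudStack orchestration would have to program on the hypervisor, for example through the Libvirt hooks mentioned in the mail. A rough sketch of what such a hook could look like, assuming the standard /etc/libvirt/hooks/qemu entry point; the MAC and addresses are hardcoded from the test above, where a real implementation would receive them per guest from CloudStack:

    #!/bin/sh
    # /etc/libvirt/hooks/qemu is invoked by libvirtd as:
    #   qemu <guest name> <operation> <sub-operation> ...
    OP="$2"

    # Example values taken from the test above; a real hook would
    # look these up per guest instead of hardcoding them.
    VM_MAC="BC:24:11:93:D7:94"
    VM_IP4="2.57.57.29"
    VM_IP6="2a14:9b80:103::100"
    BRIDGE="cloudbr0"

    case "$OP" in
      started)
        # point the /32 and /128 host routes at the bridge and pin the
        # VM's MAC so no ARP/NDP resolution is needed
        ip -4 route replace "$VM_IP4/32" dev "$BRIDGE"
        ip -4 neigh replace "$VM_IP4" lladdr "$VM_MAC" dev "$BRIDGE" nud permanent
        ip -6 route replace "$VM_IP6/128" dev "$BRIDGE"
        ip -6 neigh replace "$VM_IP6" lladdr "$VM_MAC" dev "$BRIDGE" nud permanent
        ;;
      stopped)
        ip -4 route del "$VM_IP4/32" dev "$BRIDGE" 2>/dev/null
        ip -4 neigh del "$VM_IP4" dev "$BRIDGE" 2>/dev/null
        ip -6 route del "$VM_IP6/128" dev "$BRIDGE" 2>/dev/null
        ip -6 neigh del "$VM_IP6" dev "$BRIDGE" 2>/dev/null
        ;;
    esac

With FRR set up to redistribute kernel and static routes (one way to get the behaviour shown in the mail), these entries would then be announced to the BGP peers automatically; a sketch of that follows below.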

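The mail leaves the BGP configuration to the admin. Since the FRR output above shows the per-VM prefixes as "kernel" and "static" routes on the hypervisor, one plausible approach (an assumption, not something the mail prescribes) is to redistribute those route types towards the upstream peers. A sketch of an frr.conf fragment; the AS numbers and neighbor addresses are purely illustrative:

    router bgp 65001
     neighbor 192.0.2.1 remote-as 65000
     neighbor 2001:db8::1 remote-as 65000
     !
     address-family ipv4 unicast
      neighbor 192.0.2.1 activate
      redistribute kernel
      redistribute static
     exit-address-family
     !
     address-family ipv6 unicast
      neighbor 2001:db8::1 activate
      redistribute kernel
      redistribute static
     exit-address-family

As described in the mail, the upstream routers would in turn send only 0.0.0.0/0 and ::/0 back, so the hypervisor carries just the default routes plus its locally attached VM prefixes.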