Harry, I'm still working to reproduce this, without success. I have set the .autoconf sysctl to 0 (which controls creation of local addresses in response to received Router Advertisements) and .addr_gen_mode to 1 (which disables generation of the SLAAC fe80:: link-local address).
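For concreteness, the settings above amount to something like the following (a sketch; vnet1 is a placeholder, substitute the actual bridge port):

    # Don't create addresses from received Router Advertisements.
    sysctl -w net.ipv6.conf.vnet1.autoconf=0
    # addr_gen_mode=1: don't generate the fe80:: link-local address.
    sysctl -w net.ipv6.conf.vnet1.addr_gen_mode=1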
In any event, .autoconf=0 and .addr_gen_mode=1 still fails to reproduce the issue on my test system. I find that if I disable mcast_flood on the relevant bridge ports (i.e., bridge link set dev vnet1 mcast_flood off) I do see the behavior you describe, but in that case no variant that I've tried (no vid, and all vids in use) of "bridge mdb add ... grp ff02::1:ff00:2" appears to permit ND traffic to pass to the VM destination.

Can you provide more specifics of how exactly the bridge and ports are configured? Ideally both the method used to set it up and the configuration details when failing (i.e., "ip -s -d link show" for the bridge and the relevant bridge ports, "bridge vlan show", "bridge mdb show", "bridge fdb show br [bridgename]").

Also, to answer a question from your original report: the default setting in the kernel for multicast_snooping (enabled, i.e., 1) hasn't changed recently (and quite possibly ever).

https://bugs.launchpad.net/bugs/1959702

Title:
  Regression: ip6 ndp broken, host bridge doesn't add vlan guest entry to mdb

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Starting at the end: because the bug presently requires each of the host's bridge ports to be ipv6 addressable for ipv6 to function in the guest, and most admins won't think to add special entries to their host's nftables.conf to allow for it ('because who knew?'), it represents what you might call a 'passive security vulnerability'.

  A recent kernel upgrade has broken ipv6/ip6 ndp in a host/kvm setup using a bridge on the host and vlans for some guests. I've tracked the problem to a failure of the mcast code to add entries to the host's mdb table. Manually adding the entries to the mdb on the bridge corrects the problem.

  It's very easy to demonstrate the bug in an all-ubuntu setup.

  1. On an ubuntu host, create two vms; I used libvirt, set up as below.

  2. On the host, create a bridge and vlan with two ports, each with the chosen vlan as PVID and egress untagged. Assign those ports one each to the guests as the interface; use e1000. Be sure NOT to autoconfigure the host side of the bridge ports with any ip4 or ip6 address (including fe80::); it's just an avoidable security risk. We don't want to allow the host any sort of ip access / exposure to the vlan. In other words, treat the host's bridge ports as if they were a 'real off-host switch', without any expectation that each bridge port be ip6 addressable on the bridge itself. (FWIW: Worth checking whether the problem goes away if the vlan is left tagged and not PVID, and the vlan is decoded in the guest as a separate interface. That imposes the burden of vlan management awareness on the guest and so is not acceptable as a solution.)

  3. On the host, assign a physical NIC to the bridge and the vlan to the nic. The egress is tagged for the chosen vlan and not PVID. Optionally set up an off-host gateway for the vlan, but it isn't necessary to show the bug. (A sketch of steps 2-3 and the step-6 workaround follows the list.)

  4. On each guest, manually assign a unique ip4 and ip6 address on the same subnet. (dhcp4 could work if there were an off-host router providing the related services; the bug prevents dhcp6 from working.)

  5. On one vm, ping the other. Notice that ip4 pings work and ip6 pings do not.

  6. Manually add the ff02::1:ffxx:xxxx (solicited-node) entry for each vm, on the vlan, to the host bridge's multicast table. Use 'temp' if you're quick enough, otherwise permanent.

  7. Notice pings between the guests now work on ip6 as well as ip4.
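  As a concrete sketch of steps 2-3 and the step-6 workaround (the names br0, vnet1, vnet2, eth0 and vlan 100 are placeholders, not my actual configuration):

    # Bridge with vlan filtering; guest ports get vlan 100
    # as PVID, egress untagged (step 2).
    ip link add br0 type bridge vlan_filtering 1
    ip link set br0 up
    ip link set vnet1 master br0
    ip link set vnet2 master br0
    bridge vlan add dev vnet1 vid 100 pvid untagged
    bridge vlan add dev vnet2 vid 100 pvid untagged
    # Physical NIC: vlan 100 tagged, not PVID (step 3).
    ip link set eth0 master br0
    bridge vlan add dev eth0 vid 100
    # Step 6 workaround: add each guest's solicited-node group by hand.
    # The groups below are illustrative; derive each from the guest's
    # real address (ff02::1:ff + its low 24 bits).
    bridge mdb add dev br0 port vnet1 grp ff02::1:ff00:1 permanent vid 100
    bridge mdb add dev br0 port vnet2 grp ff02::1:ff00:2 permanent vid 100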
  Using tcpdump and watching icmp6 traffic, you'll notice the packets making it across the various bridge ports the moment you manually add the appropriate ff02::1:ff... multicast address to the mdb table (a sketch of this check appears at the end of this report). Beware a false sense of security: once the ndp completes and the link addresses are in the fdb, everything can 'seem' fine, until the fdb entry times out and the required mdb entry must again be used to allow ndp to refresh the address. Setting mcast_querier doesn't help.

  Perhaps previous kernels turned off multicast snooping by default and just flooded all the bridge ports with all multicast traffic, so this bug was avoided. My hunch for why there hasn't been more complaint about this is that it takes an extra step NOT to autoconfigure the vm ports with fe80:: link-local addresses on the host. I believe the existence of the fe80 address on the host ports engages ndp code on the host, which loads the mdb as if preparing the host's side of the bridge to participate in ip4 and ip6 higher-layer traffic. But that's a 'bad hack that happens to work' -- it shouldn't be a requirement that each host vlan port have an ip6 address; after all, it didn't need an ip4 address...

  I've attached a linux-bug for you, but it's probably mostly unrelated info.
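  One way to make the tcpdump observation above concrete, a sketch using the same placeholder names as the earlier sketch (br0, vnet1, vlan 100):

    # Watch neighbor discovery on the bridge while toggling the mdb entry.
    tcpdump -e -n -i br0 icmp6
    # In another shell: a 'temp' entry ages out like a learned one,
    # a 'permanent' entry persists until deleted.
    bridge mdb add dev br0 port vnet1 grp ff02::1:ff00:1 temp vid 100
    bridge mdb show dev br0
    # ip6 pings between the guests should start passing the moment
    # the entry exists, and stall again once a temp entry ages out.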