Hi again (damn, I shouldn't work on Sunday ;)

I ran some tcpdump captures
on this setup:

   ---sw-core----
   |            |
switch1      switch2
   |            |
   |            |
     host linux

sw-core: 10.0.0.1
switch1: 10.0.0.2
switch2: 10.0.0.3

Each switch has the IGMP querier enabled.
The Linux host uses bonding (active-backup); querier and multicast_snooping are enabled.

Now, with tcpdump on vmbr0 of the Linux host, I see only queries coming from sw-core, which has the lowest IP:

09:44:38.692597 IP sw-core.odiso.net > all-systems.mcast.net: igmp query v2
09:44:38.692597 IP sw-core.odiso.net > all-systems.mcast.net: igmp query v2
...

Now I disable the IGMP querier on sw-core. After about 1 min, I see IGMP queries coming from switch1:

...
09:45:38.698272 IP switch1.odiso.net > all-systems.mcast.net: igmp query v2
09:46:38.703388 IP switch1.odiso.net > all-systems.mcast.net: igmp query v2

If I disable the querier on switch1, switch2 becomes the querier.
So the election works fine (the IGMP querier with the lowest IP address is the master).

Then I restore the initial setup, with 3 queriers, and sw-core becomes the master again.

I launch tcpdump again. Now, after some time (30 min to 1 h), I see IGMP queries coming from the Linux bridge and the Cisco switch at the same time!

10:43:26.703388 IP 0.0.0.0 > all-systems.mcast.net: igmp query v2
10:43:58.69233E IP sw-core.odiso.net > all-systems.mcast.net: igmp query v2

Note that the Linux bridge's IGMP queries use 0.0.0.0 as the source address.

I found the original cover letter of the patch series that disables the IGMP querier by default on the Linux bridge:
http://en.usenet.digipedia.org/thread/18960/28749/

"
[0/3] bridge: Do not send multicast queries by default
2012-04-13 14:36

This series of patches is aimed to change the default multicast snooping behaviour to one that is safer to deploy in the wild.
There have been numerous reports of switches misbehaving with our current behaviour of sending general queries, presumably because we're using a zero source IP address which is unavoidable as using anything else would interfere with multicast querier elections
"

So, I don't know about HP switches, but for Cisco switches it seems to break the IGMP querier election.

Maybe my problem was that my Proxmox host was the IGMP querier, and when I shut it down, no other IGMP querier took over, and snooping blocked all multicast addresses.

----- Original message -----
From: "Alexandre DERUMIER" <aderum...@odiso.com>
To: "Michael Rasmussen" <m...@datanom.net>
Cc: pve-devel@pve.proxmox.com
Sent: Sunday, 10 March 2013 08:29:59
Subject: Re: [pve-devel] corosync, multicast problem because of vmbr multicast_snooping enabled

>>@alexandre: What precise Cisco switch do you see the problems with?
>>What IOS version? Are there any firmware upgrade available?

cisco 2960g && cisco 6500 (I can't remember the IOS versions, but both are 2012 releases).

My biggest problem was 2 weeks ago: I shut down 1 of my nodes, and after 2 min, all nodes on the same VLAN (including different clusters with different multicast addresses) couldn't see each other.
Disabling IGMP on the Linux bridge resolved the problem.

So it should be related to snooping & IGMP queries, but I don't know whether the problem is on the physical switch or on the Linux bridge.
I'll try to reproduce the problem this week and do some tcpdump captures to find it.

Now, I see a lot of bug reports on the net about snooping on the Linux bridge (I don't know if they are about snooping or IGMP queries).
And I trust my good old Cisco switches more than a 2-year-old implementation on the Linux bridge.
Here is another bug with IGMP reports from bridge and bonding: if a failover occurs in the bonding, IGMP reports are not sent anymore :/
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1c3ac4289a0e4d60cbd4787b4a91de4a0c785df1

----- Original message -----
From: "Michael Rasmussen" <m...@datanom.net>
To: pve-devel@pve.proxmox.com
Sent: Saturday, 9 March 2013 21:22:27
Subject: Re: [pve-devel] corosync, multicast problem because of vmbr multicast_snooping enabled

On Sat, 9 Mar 2013 18:33:58 +0000
Dietmar Maurer <diet...@proxmox.com> wrote:

> So I think we talk about switch bugs here, not normal behavior.

I am leaning towards the same conclusion since I have never seen those queries cause any problems here.

@alexandre: What precise Cisco switch do you see the problems with?
What IOS version? Are there any firmware upgrade available?

According to Cisco the queries should not cause any problems but maybe this is what causes your problems:
"Multicast routers send host-query messages periodically to refresh their knowledge of memberships present on their networks. If, after some number of queries, the Cisco IOS software discovers that no local hosts are members of a multicast group, the software stops forwarding onto the local network multicast packets from remote origins for that group and sends a prune message upstream toward the source."
http://www.cisco.com/en/US/docs/ios/12_2/ip/configuration/guide/1cfmulti.html#wp1067822

--
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael <at> rasmussen <dot> cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir <at> datanom <dot> net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir <at> miras <dot> org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
The moving cursor writes, and having written, blinks on.
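
For anyone who wants to repeat the tcpdump comparison from this thread, here is a minimal Python sketch that lists which sources sent IGMP queries in a saved capture. It assumes the text output format shown in the captures quoted in this thread ("<time> IP <src> > <dst>: igmp query v2"); the function names query_sources and competing_queriers are made up for this example. Keep in mind that a capture taken across a normal querier change (for example, while disabling the querier on sw-core) will also show two sources, so more than one source is only a hint that two queriers are active at once.

```python
import re

# Matches tcpdump text lines of the form seen in this thread, e.g.:
#   09:45:38.698272 IP switch1.odiso.net > all-systems.mcast.net: igmp query v2
# Assumption: one packet per line, default tcpdump text output.
QUERY_RE = re.compile(r"^(\S+) IP (\S+) > (\S+): igmp query\b")


def query_sources(lines):
    """Return the distinct sources of IGMP queries, in first-seen order."""
    seen = []
    for line in lines:
        m = QUERY_RE.match(line.strip())
        if m and m.group(2) not in seen:
            seen.append(m.group(2))
    return seen


def competing_queriers(lines):
    """True if more than one source sent IGMP queries in the capture."""
    return len(query_sources(lines)) > 1


# Sample lines modelled on the captures in this thread.
sample = [
    "10:43:26.703388 IP 0.0.0.0 > all-systems.mcast.net: igmp query v2",
    "10:44:38.692597 IP sw-core.odiso.net > all-systems.mcast.net: igmp query v2",
]

if __name__ == "__main__":
    print("query sources:", query_sources(sample))
    if competing_queriers(sample):
        print("more than one querier seen in this capture")
```

A 0.0.0.0 entry in the result corresponds to the queries that the Linux bridge sends, as noted above.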
_______________________________________________
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel