Hi Folks, I have the following comments on draft-ietf-bess-evpn-irb-mcast. I also compare it to draft-sajassi-bess-evpn-mvpn-seamless-interop, which utilizes existing MVPN technology to achieve mcast-irb functionality in EVPN.
*1. Re-branding MVPN constructs into EVPN*

The *evpn-irb* draft imports a lot of MVPN constructs into EVPN. Originating multicast receiver interest "per PE" instead of "per BD" and the use of selective tunnels are a few examples. If the solution really is achievable through MVPN, why do we need to re-brand it in EVPN?

*2. Scale of BGP routes*

The *evpn-irb* solution mandates that a PE process and store the IMET NLRIs from all peer PEs in the tenant domain (as opposed to processing and storing only the NLRIs for BDs it has locally configured). This is required because multicast traffic could originate from any PE in any BD.

To put this in perspective, let's take the example of a tenant domain with 101 PEs, each PE having 1000 BDs, where each PE has at most 10 BDs in common with any other PE in the network. In this case PE1 will have to process and store 100 (remote PEs) x 1000 (BDs per PE) x 1 (IMET per BD) = 100,000 IMET routes. Essentially, the state is of order *"num of BDs" x "num of PEs"*. With the *seamless-interop* solution, a PE would need to process and store only 100 I-PMSI routes (the IMET equivalent in MVPN), i.e. one route from each peer PE. This is of order *"num of PEs"*. It should be noted that VXLAN supports a maximum of ~16 million BDs, so the *evpn-irb* solution results in huge per-PE overhead compared with *seamless-interop*.

*3. Control plane scale in fabric / core*

Each PE also needs one additional tunnel per BD apart from the existing BUM tunnel: essentially one tunnel for B+U and another for M. This is proposed to avoid all B+U traffic in BD1 indiscriminately reaching all PEs in the domain, irrespective of whether they have BD1 configured locally or not. This increases the state in the fabric by *"num of PEs" x "num of BDs"*; in the example above that again comes to 100,000 additional tunnels. The *seamless-interop* solution uses one tunnel per PE, so 100 additional tunnels achieve the same objective.

*4. Data plane considerations*

*4.1.* The data-plane nuances of the solution have been underplayed. For example, suppose PE1 has (S, G) receivers in BD2 through BD10, while source S belongs to the BD1 subnet on PE2. If BD1 is not configured locally on PE1, a special BD (called the SBD) is programmed as the IIF in the forwarding entry. If BD1 later gets configured on PE1, the IIF on PE1 changes from the SBD to BD1. This results in traffic disruption for all existing receivers in BD2 through BD10, even though no state change is observed in any of the receiver BDs. This is a significant drawback of the solution given the latency requirements of applications running in data centers. Additionally, since host mobility and on-demand BD configuration are critical functionalities for DC solutions, such a case can't be discounted.

*4.2.* The *evpn-irb* solution also proposes to relax the RPF check for a (*, G) multicast entry. This poses a great risk of traffic loops, especially in transient network conditions, in addition to poor debuggability. The *seamless-interop* solution doesn't have these drawbacks, and BD configuration or de-configuration wouldn't cause any change in existing forwarding state.

In a nutshell, even after borrowing MVPN constructs, the *evpn-irb* solution presents significant overhead in comparison to *seamless-interop* for modern data-center use cases, where flexible workload placement, workload mobility and data-plane efficiency are critical features.

Regards,
Ashutosh
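P.S. To make the scaling comparison in points 2 and 3 concrete, here is a quick back-of-the-envelope sketch. The function names are mine, purely illustrative; it just restates the arithmetic from the example above (101 PEs, 1000 BDs per PE).

```python
# Per-PE control-plane state in the example topology: 101 PEs, 1000 BDs each.
# Hypothetical helper names; this only restates the arithmetic in the email.

def evpn_irb_state(num_pes: int, bds_per_pe: int) -> dict:
    """State one PE must hold with evpn-irb: order num_PEs x num_BDs."""
    remote_pes = num_pes - 1
    return {
        # one IMET route per BD per remote PE
        "imet_routes": remote_pes * bds_per_pe,
        # one additional multicast tunnel per BD per remote PE
        "extra_tunnels": remote_pes * bds_per_pe,
    }

def seamless_interop_state(num_pes: int) -> dict:
    """State one PE must hold with seamless-interop: order num_PEs."""
    remote_pes = num_pes - 1
    return {
        # one I-PMSI route (IMET equivalent) per remote PE
        "ipmsi_routes": remote_pes,
        # one additional tunnel per remote PE
        "extra_tunnels": remote_pes,
    }

print(evpn_irb_state(101, 1000))    # 100,000 routes and 100,000 tunnels
print(seamless_interop_state(101))  # 100 routes and 100 tunnels
```

The gap grows linearly with the number of BDs per PE, which is why the evpn-irb state explodes long before the ~16M BD limit of VXLAN is approached.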
_______________________________________________ BESS mailing list [email protected] https://www.ietf.org/mailman/listinfo/bess
