[nvo3] Multi-subnet VNs [was Re: FW: New Version Notification for draft-yong-nvo3-frwk-dpreq-addition-00.txt]

Aldrin Isaac Wed, 19 Dec 2012 11:44:25 -0800

Hi Kireeti,

In E-VPN, ARP is only flooded when the MAC-IP binding is unknown in BGP.
Once it is known, the local PE responds locally to the ARP request.  This
scales quite well, so it's not the best reason to lean one way or the other.
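
To make the ARP behavior concrete, here is a minimal sketch in Python.
The names (ArpProxy, learn, handle_arp_request) are made up for
illustration and do not come from any E-VPN implementation; the dict
simply stands in for MAC-IP bindings learned via BGP.

```python
from typing import Dict


class ArpProxy:
    """Toy model of ARP handling at a PE (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # IP -> MAC bindings; assumed to be populated from BGP MAC+IP routes.
        self.bindings: Dict[str, str] = {}

    def learn(self, ip: str, mac: str) -> None:
        """Record a MAC-IP binding, as if a BGP route had been received."""
        self.bindings[ip] = mac

    def handle_arp_request(self, target_ip: str) -> str:
        """Answer locally when the binding is known, otherwise flood."""
        mac = self.bindings.get(target_ip)
        if mac is not None:
            return f"reply {mac}"  # local PE answers; no flooding needed
        return "flood"             # binding unknown in BGP


proxy = ArpProxy()
first = proxy.handle_arp_request("10.0.0.5")   # binding not yet known
proxy.learn("10.0.0.5", "00:11:22:33:44:55")
second = proxy.handle_arp_request("10.0.0.5")  # binding now known
```

In this sketch only the first request is flooded; once the binding is
learned, later requests for the same IP are answered locally.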


An alternative for edge routing using EVPN is for an NVE to localize the
VNs to which edge routing is desired and stand up a local IP forwarder
across these VNs using the IP info in the EVPN routes.  If the DMAC on a
packet is not present in the EVI and the payload is IP, then pass it to the
IP forwarder....
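
A rough sketch of that per-packet decision, in Python.  Everything here
is an assumption made for illustration: the Frame type, the classify
function, and the string labels are hypothetical, and what happens to a
non-IP frame with an unknown DMAC is not specified above, so it is
simply labeled as default L2 handling.

```python
from dataclasses import dataclass
from typing import Set

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD


@dataclass
class Frame:
    dmac: str       # destination MAC address of the received frame
    ethertype: int  # Ethertype from the L2 header


def classify(frame: Frame, evi_macs: Set[str]) -> str:
    """Decide how the NVE handles a frame (hypothetical labels)."""
    if frame.dmac in evi_macs:
        return "bridge"          # DMAC is present in the EVI: forward on MAC
    if frame.ethertype in (ETHERTYPE_IPV4, ETHERTYPE_IPV6):
        return "ip-forwarder"    # DMAC miss and payload is IP
    return "l2-default"          # assumption: normal L2 handling otherwise


evi = {"00:11:22:33:44:55"}
known = classify(Frame("00:11:22:33:44:55", ETHERTYPE_IPV4), evi)
miss_ip = classify(Frame("00:aa:bb:cc:dd:ee", ETHERTYPE_IPV4), evi)
miss_other = classify(Frame("00:aa:bb:cc:dd:ee", 0x8847), evi)
```

The MAC lookup comes first, so known destinations are still bridged and
only a DMAC miss on an IP payload reaches the IP forwarder.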

In regards to optimizing multicast, with EVPN this can be
done using a VN dedicated to multicast distribution, following
the VLAN-based MVR model.  It works well and is in use today.

Another problem that is addressed in EVPN is that segments can be
multihomed using LAG.  With IP-only solutions, a physical end station would
need to multihome by advertising a loopback IP over multiple physical IP
interfaces.

We can have our TORs and use them too!! :)

Best regards -- aldrin

On Wednesday, December 19, 2012, Kireeti Kompella wrote:

> Hi Aldrin,
>
> On Tue, Dec 18, 2012 at 8:29 PM, Aldrin Isaac <[email protected]>wrote:
>
>> Kireeti,
>>
>> I'm not clear what difference it makes whether a packet is unicast
>> forwarded using MAC address or IP address within a subnet
>
>
> Two important differences:
> a) you don't have to know the MAC address if you forward on IP.  I.e., you
> don't have to propagate the ARP to the destination (flood), get the reply,
> bind IP to MAC (ARP table), and maintain ARP binding (timeout, validate,
> etc.).  The first is a real problem; the rest are annoyances that become
> problems at scale.
>

> (Note that the ARMD WG was created to address this issue, and you know
> where that ended.)
>
> (Note further that this may be hard to do in general, but in the case of
> an orchestrated data center, you have the information about where a given
> IP lives, and you have a control plane (ORACLE) to inform all relevant
> NVEs.  And of course, an overlay to shield the infrastructure from poking
> its nose into your forwarding behavior -- i.e., the infra doesn't care
> whether you route or switch TS traffic.)
>
> b) In the quite common case where all traffic from a TS is IP, you don't
> have to maintain two tables and two forwarding paradigms at the NVE (one
> for IPs and one for MACs).  This is common enough to warrant optimization.
>
> A third difference is that if you have only unicast traffic, you don't
> have to maintain a multicast tree (for flooding).  For some, this is a nice
> bonus, but I know you have a multicast packet or two in your network :-)
>
>> as long as
>> it gets to the intended destination along the most optimal path,
>> particularly when the price to pay is non-standard behavior
>> (intra-subnet ARP manglers ;}, etc).  I understand the argument about
>> the sub-optimal routing from a third site, but when the primary sites
>> end up aggregating prefixes for scaling reasons that argument falls
>> off the table.  One way or other the piper gets paid.
>>
>
> One way, the piper gets paid a fair bit more than the other!
>
>> In terms of the real world issue of getting there from here --
>> personally I haven't seen any vendor working towards a standards-based
>> solution that will allow intra-subnet routing for subnets over
>> HW/TOR-based PE, let alone intra-subnet routing for subnets that span
>> across both hypervisor-based PE and TOR-based PE.  This makes me leery
>> of solutions that can only take us half way there, particularly during
>> the transition phase.  So if we're talking about network
>> virtualization based purely on hypervisors, "route IP, bridge non-IP"
>> may be realistic if you're willing to accept the caveats, but does not
>> seem to be otherwise.
>>
>
> Good point.  Clearly, this is not a local decision: "route IP, bridge
> non-IP" means that intra-subnet routes are propagated the same way as
> inter-subnet routes, and thus every NVE, h/w or s/w, must be on the same
> page.
>
> To make this concrete using BGP VPNs, "route IP, bridge non-IP" means all
> routes, intra- and inter-subnet, are propagated as IP VPN routes, and E-VPN
> routes contain MACs without IPs.  "Bridge intra-subnet IP and non-IP, route
> inter-subnet" means inter-subnet routes are propagated as IP VPN routes,
> and intra-VPN routes as E-VPN MAC+IP routes.
>
> We can have a chat off-list on h/w vendors working towards this.
> Hopefully, others will weigh the above arguments, and support this.
> Deployers (like you) have a say in this too :-)
>
>> Btw, I understand how multicast may be less than efficient when
>> building both inter and intra subnet trees for the same IP mcast group
>> that end up overlapping links (maybe even more than twice) -- but I'd
>> like to hear your take on any other *insolvable* issues with regard to
>> multicast.
>>
>
> Isn't that enough?  :-)  I am not a multicast expert, but I can try to dig
> up IRB multicast horror stories.
>
> Cheers,
> Kireeti.
>
>> Best regards -- aldrin
>>
>>
>>
>> On Tue, Dec 18, 2012 at 6:06 PM, Kireeti Kompella
>> <[email protected]> wrote:
>> > Hi Thomas,
>> >
>> > On Dec 18, 2012, at 09:03 , Thomas Narten <[email protected]> wrote:
>> >
>> >> Kireeti Kompella <[email protected]> writes:
>> >>
>> >>> The solution is simple: route if IP, bridge if not.  Yes, one could
>> >>> do IRB, but why?  IRB brings in complications, especially for
>> >>> multicast.  I'm sure someone suggested this already, so put me down
>> >>> as supporting this view.
>> >>
>> >> I'm not sure I understand the difference.
>> >>
>> >> From an *NVE* perspective, when it receives a packet (which will have
>> >> an L2 header), it can look at the Ethertype, and if it's IP, it can
>> >> route it. Otherwise, it can provide normal L2 service. So, in this
>> >> sense, "route if IP, bridge if not" is straightforward. And more to
>> >> the point, I assume that if the packet gets L2 service, the entire VN
>> >> is treated as a *single* broadcast domain. All nodes can reach all
>> >> other nodes. Right?
>> >
>> > Right.
>> >
>> >> Just so I understand, how is this different than IRB?  What does IRB
>> >> imply that the above does not?
>> >
>> > IRB follows the principle of "bridge when you can, route otherwise".
>> > So, an IP packet with dest IP in the same subnet actually gets bridged;
>> > the originator (e.g., the VM) is responsible for ARPing the IP address,
>> > slapping the right dest MAC on the packet and sending that to the NVE,
>> > which simply forwards based on dest MAC address *without* decrementing
>> > the TTL.
>> >
>> > If the dest IP is in another subnet, the packet is sent to the gateway
>> > (which for IRB would be the same NVE), which this time does an IP address
>> > lookup, decrements TTL and routes the packet.
>> >
>> > For multicast, there are even more differences.
>> >
>> >> But this is different than what (I believe) Lucy is arguing for. In
>> >> the case of a multi-subnet VN, you have one VN, but it contains
>> >> different subnets. Each subnet is intended to be one broadcast domain
>> >> (i.e., equivalent of a VLAN), so that when sending LL multicast and
>> >> the like on a specific subnet, such packets are *not* delivered to all
>> >> nodes in the VN, but only those that are part of subnet.
>> >
>> > If one were to configure multiple subnets on a VLAN, I wonder if LL
>> > traffic goes to all members of the VLAN, or just those in the same subnet
>> > as the sender.  I suspect the former (but don't know).
>> >
>> >> This is a more complex type of service to provide. And I'm not sure we
>> >> need this type of service to be provided by one VN.
>> >
>> > Agree.
>> >
>> >> A (seemingly
>> >> simpler) alternative would be to put each subnet in its own VN and
>> >> allow inter-subnet traffic to be handled as inter-VN traffic. So long
>> >> as that case is optimized (i.e., the ingress NVE can tunnel directly
>> >> to the egress NVE without adding triangular routing), this would seem
>> >> to be a cleaner way to implement this.
>> >
>> > Can be done.  However, we're on Lucy's topic; mine was "route if IP,
>> > bridge otherwise"; the goal was to rationalize the need for Layer 2
>> > forwarding for non-IP traffic, and inter- and intra-subnet routing.
>> >
>> > Kireeti.
>> >
>> >> Thomas
>> >>
>> >> _______________________________________________
>> >> nvo3 mailing list
>> >> [email protected]
>> >> https://www.ietf.org/mailman/listinfo/nvo3
>> >
>>
>
>
>
> --
> Kireeti
>

