;) see my "multiplexed" comment below (i.e. two servers mounting the same file system).
IO shares work on the server side .. Cheers,

Dimitri

On 8/28/12 10:25 AM, "Ivan Pepelnjak" <[email protected]> wrote:

> If I understood the vSphere manuals and discussions on various
> blogs/forums correctly, VMware solved most of this problem a long time
> ago with I/O shares and a few other features ... but don't trust a
> networking guy to know anything about storage :)
>
> On 8/28/12 7:10 PM, Jon Hudson wrote:
>> Dead on.
>>
>> Any time you have a fan-in/fan-out type traffic flow with filesystem
>> info, one person can ruin the party for everyone. FCoE is a perfect
>> example, where a pause frame sent on an aggregation link can end up
>> impacting many initiators. Or even at the controller level of any
>> array you can get a traffic jam of sorts on poorly designed and
>> laid-out subsystems. Or too few lines for food at an IETF social.
>>
>> Lots can be done with queues etc. to mitigate the issue, but it is
>> always something to be mindful of, especially if your remote
>> filesystem is not just a mounted LUN but the main system/boot LUN and
>> you have Windows paging over the wire.
>>
>> On Aug 28, 2012, at 9:55 AM, "Stiliadis, Dimitrios (Dimitri)"
>> <[email protected]> wrote:
>>
>>> FCoE is clearly not a requirement ...
>>>
>>> But there is something to be said about storage (and I should have
>>> responded in the other email about this). In general, storage
>>> isolation is done at the storage level and not at the network layer,
>>> so we can ignore it.
>>>
>>> If we take a storage server that exports a file system that is
>>> mounted by a hypervisor, and multiple tenants have their VMs in this
>>> file system, then a single network connection between the hypervisor
>>> and the storage device could potentially lead to head-of-line
>>> blocking and allow one tenant to influence the performance of
>>> another tenant.
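The head-of-line-blocking risk described above is easy to demonstrate with a toy queueing model. The sketch below is illustrative only (the service time and packet counts are made-up parameters); it compares one shared FIFO, where a small request from tenant B sits behind tenant A's burst, against per-tenant queues served round-robin:

```python
from collections import deque

SERVICE_TIME = 1  # time units to transmit one packet (illustrative)

def fifo_completion(a_burst, b_packets):
    """Single shared FIFO: tenant B's packets wait behind A's whole burst."""
    q = deque(["A"] * a_burst + ["B"] * b_packets)
    t, b_done = 0, None
    while q:
        tenant = q.popleft()
        t += SERVICE_TIME
        if tenant == "B":
            b_done = t
    return b_done

def round_robin_completion(a_burst, b_packets):
    """Per-tenant queues drained round-robin: B is isolated from A's burst."""
    queues = {"A": deque(["A"] * a_burst), "B": deque(["B"] * b_packets)}
    t, b_done = 0, None
    while any(queues.values()):
        for name, q in queues.items():
            if q:
                q.popleft()
                t += SERVICE_TIME
                if name == "B":
                    b_done = t
    return b_done

# Tenant A dumps a 100-packet burst; tenant B sends one small request.
print(fifo_completion(100, 1), round_robin_completion(100, 1))
```

With these numbers, B's request completes at t=101 behind the shared queue but at t=2 with per-tenant scheduling, which is exactly why isolation has to happen in the storage drivers/devices when everything is multiplexed onto one flow.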
>>> If my memory serves me correctly, VMware, for example, can only use
>>> two or four iSCSI initiators, which have to be shared by the
>>> different VMs of the hypervisor, and thus traffic from multiple
>>> tenants is multiplexed on the same network flow .. This means that
>>> storage drivers/devices have to take care of traffic isolation. And
>>> this can be perfectly fine in point-to-point situations, but it can
>>> get interesting in multiplexed scenarios ...
>>>
>>> (But we just don't want the storage guys to blame the network guys
>>> for performance issues ;)
>>>
>>> Dimitri
>>>
>>> On 8/28/12 9:44 AM, "Ivan Pepelnjak" <[email protected]> wrote:
>>>
>>>> In sane real-life designs the virtual network overlay solution
>>>> would not transport FCoE. I'm also positive someone will come up
>>>> with exactly that requirement sooner rather than later :D
>>>>
>>>> On 8/28/12 6:40 PM, Aldrin Isaac wrote:
>>>>
>>>> The question regarding FCoE is whether overlay solutions need to
>>>> transport it. I think the answer is no. If something operates at
>>>> the underlay level then it isn't in scope for NVO3, including DCB.
>>>>
>>>> On Tuesday, August 28, 2012, Somesh Gupta wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: [email protected] [mailto:[email protected]] On Behalf Of
>>>>> Ivan Pepelnjak
>>>>> Sent: Tuesday, August 28, 2012 12:22 AM
>>>>> To: Stiliadis, Dimitrios (Dimitri)
>>>>> Cc: Black, David; [email protected]; Linda Dunbar
>>>>> Subject: Re: [nvo3] Let's refocus on real world (was: Comments on
>>>>> Live Migration and VLAN-IDs)
>>>>>
>>>>> Dimitri,
>>>>>
>>>>> We're more in agreement than it might seem. I might have my doubts
>>>>> about the operational viability of the OpenStack-to-bare-metal use
>>>>> case you described below, but I'm positive someone will try to do
>>>>> that as well.
>>>>> In any case, regardless of whether we're considering VMs or
>>>>> bare-metal servers, in the simplest scenario the server-to-NVE
>>>>> connection is a point-to-point link, usually without VLAN tagging.
>>>>>
>>>>> In the VM/hypervisor case, the NVE is implemented in the hypervisor
>>>>> soft switch; in the bare-metal server case, it has to be
>>>>> implemented in the ToR switch.
>>>>
>>>> This is certainly only today's restriction. If NVO3 takes off, there
>>>> certainly could be a pseudo-driver in Linux that could implement the
>>>> NVE (like a VLAN driver) without much additional overhead.
>>>>
>>>>> It's important to keep in mind the limitations of ToR switches to
>>>>> ensure whatever solution we agree upon will be implementable in ToR
>>>>> switches as well, but it makes absolutely no sense to assume the
>>>>> NVE will not be in the hypervisor (because someone wants to support
>>>>> a customer having a decade-old VLAN-only hypervisor soft switch).
>>>>>
>>>>> As for ToR switch capabilities, Dell has demonstrated NVGRE support
>>>>> and Arista is right now showing off a hardware VXLAN VTEP
>>>>> prototype, so I guess it's safe to assume next-generation merchant
>>>>> silicon will support GRE- and UDP-based encapsulations well before
>>>>> we'll agree on what the NVO3 solution should be.
>>>>>
>>>>> Finally, can at least some of us agree that the topology that makes
>>>>> the most sense is a direct P2P link between the (VM or bare-metal)
>>>>> server and the NVE, using VLAN tagging only when a server
>>>>> participating in multiple L2 CUGs has interface limitations?
>>>>>
>>>>> Kind regards,
>>>>> Ivan
>>>>>
>>>>> On 8/27/12 6:55 AM, Stiliadis, Dimitrios (Dimitri) wrote:
>>>>>> Ivan:
>>>>>>
>>>>>> I agree and at the same time disagree with some of the statements
>>>>>> below. I would like to understand your view.
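The "pseudo-driver in Linux" idea floated above is essentially what the kernel's vxlan driver (merged around Linux 3.7, shortly after this thread) ended up providing: the NVE becomes a software netdevice, much like a VLAN sub-interface. A configuration sketch, assuming a kernel with the vxlan driver and a recent iproute2; the interface names, VNI and multicast group are purely illustrative:

```shell
# Software NVE as a kernel pseudo-device (multicast-learning flavour).
# vxlan100, eth0, VNI 100 and 239.1.1.1 are illustrative names/values.
ip link add vxlan100 type vxlan id 100 group 239.1.1.1 dev eth0 dstport 4789
ip link set vxlan100 up

# Bridge the overlay device to the tenant-facing (VM or bare-metal) port,
# so the server sees a plain untagged P2P link into the overlay.
ip link add br-tenant type bridge
ip link set vxlan100 master br-tenant
ip link set br-tenant up
```

This is the hypervisor-resident NVE case; the ToR-resident case would do the same encapsulation in switch hardware instead.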
>>>>>> See inline:
>>>>>>
>>>>>> On 8/25/12 8:22 AM, "Ivan Pepelnjak" <[email protected]> wrote:
>>>>>>
>>>>>>> On 8/24/12 11:11 PM, Linda Dunbar wrote:
>>>>>>> [...]
>>>>>>>
>>>>>>>> But most, if not all, data centers today don't have hypervisors
>>>>>>>> that can encapsulate the NVO3-defined header. The deployment of
>>>>>>>> 100% NVO3-header-based servers won't happen overnight. One
>>>>>>>> thing is for sure: you will see data centers with mixed types
>>>>>>>> of servers for a very long time.
>>>>>>>>
>>>>>>>> If NVEs are in the ToR, you will see a mixed scenario of blade
>>>>>>>> servers, servers with simple virtual switches, or even IEEE
>>>>>>>> 802.1Qbg's VEPA. So it is necessary for NVO3 to deal with the
>>>>>>>> "L2 Site" defined in this draft.
>>>>>>>
>>>>>>> There are two hypothetical ways of implementing NVO3: existing
>>>>>>> layer-2 technologies (with the well-known scaling properties
>>>>>>> that prompted the creation of the NVO3 working group) or
>>>>>>> something-over-IP encapsulation.
>>>>>>>
>>>>>>> I might be myopic, but from what I see, most data centers today
>>>>>>> (at least based on the market shares of individual vendors)
>>>>>>> don't have ToR switches that would be able to encapsulate MAC
>>>>>>> frames or IP datagrams in UDP, GRE or MPLS envelopes. I am not
>>>>>>> familiar enough with the commonly used merchant silicon to
>>>>>>> understand whether that's a software or hardware limitation. In
>>>>>>> any case, I wouldn't expect switch vendors to roll out NVO3-like
>>>>>>> something-over-IP solutions any time soon.
>>>>>>>
>>>>>>> On the hypervisor front, VXLAN has been shipping for months,
>>>>>>> NVGRE is included in the next version of Hyper-V, and
>>>>>>> MAC-over-GRE is available (with Open vSwitch) for both KVM and
>>>>>>> Xen.
>>>>>>> Open vSwitch is also part of the standard Linux kernel
>>>>>>> distribution and thus available to any other Linux-based
>>>>>>> hypervisor product.
>>>>>>>
>>>>>>> So: all major hypervisors have MAC-over-IP solutions, each one
>>>>>>> using a proprietary encapsulation because there's no standard
>>>>>>> way of doing it, and yet we're spending time discussing and
>>>>>>> documenting the history of the evolution of virtual networking.
>>>>>>> Maybe we should be a bit more forward-looking, acknowledge the
>>>>>>> world has changed, and come up with a relevant hypervisor-based
>>>>>>> solution.
>>>>>>
>>>>>> Correct, and here is where the IETF as a standards body fails.
>>>>>> There is no easy way (any time soon) for a VXLAN-based solution
>>>>>> to talk to an NVGRE, MAC/GRE, CloudStack MAC/GRE, or STT (you
>>>>>> forgot this one) based solution. Proprietary approaches drive
>>>>>> enterprises to vendor lock-in. And instead of trying to address
>>>>>> the first problem, which is about "interoperability",
>>>
>>> _______________________________________________
>>> nvo3 mailing list
>>> [email protected]
>>> https://www.ietf.org/mailman/listinfo/nvo3
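The UDP-based MAC-over-IP encapsulations argued about in this thread all share the same outer-header-plus-inner-frame shape. As one concrete illustration, a minimal sketch of the VXLAN framing (later documented in RFC 7348); the inner frame and VNI below are made-up example values:

```python
import struct

VXLAN_PORT = 4789  # IANA-assigned UDP destination port for VXLAN

def vxlan_encap(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the 8-byte VXLAN header to an inner Ethernet frame.

    Header layout: flags(1) | reserved(3) | VNI(3) | reserved(1).
    Flags byte 0x08 marks the VNI field as valid; the 24-bit VNI gives
    roughly 16M tenant segments versus 4094 usable VLAN IDs.
    """
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    header = struct.pack("!B3s3sB", 0x08, b"\x00" * 3,
                         vni.to_bytes(3, "big"), 0)
    return header + inner_frame  # carried as the payload of a UDP datagram

# Hypothetical inner frame: dst MAC, src MAC, EtherType, 46-byte payload.
frame = bytes(6) + bytes.fromhex("020000000001") + b"\x08\x00" + bytes(46)
pkt = vxlan_encap(frame, vni=5000)
print(len(pkt), hex(pkt[0]), int.from_bytes(pkt[4:7], "big"))
```

NVGRE and STT differ in the outer header they prepend (a GRE key field, a TCP-like header), not in this basic wrap-the-frame structure, which is exactly why the lack of a single standard encapsulation was the interoperability complaint above.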
