On Thu, Oct 18, 2012 at 12:42 AM, Marcus Sorensen <shadow...@gmail.com> wrote:
> Sorry, I've been up to my ears. I've attached the simple patch that
> makes this all happen, if anyone wants to take a look. This is the
> code that looks for physical devices. It's passed a bridge, determines
> the parent of that bridge, then checks whether that parent is a tagged
> device, and if so goes one more step and finds its parent. The patch
> just circumvents that last lookup if the parent of the bridge is a
> "vlan" device (single-tagged, e.g. vlan100) but not a double-tagged
> one (e.g. vlan100.10). The rest of CloudStack then treats vlan100 as
> though it were a physical device, creates tagged bridges on it if it
> has guest traffic type, etc. I've been using it in our test bed for
> about a month, and have only run into the MTU issue.
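If I'm reading that right, the short-circuit is roughly this (a sketch with hypothetical helper names and a simplified device model, not the actual patch):

```python
def resolve_physical_dev(bridge, parent_of, is_vlan_dev):
    """Sketch of the device lookup described above (hypothetical names).

    parent_of maps a device to its parent device; is_vlan_dev reports
    whether a device is a Linux vlan device (vlan100, vlan100.10, ...).
    """
    dev = parent_of[bridge]
    if is_vlan_dev(dev):
        # The patch: a single-tagged vlan# device (vlan100, but not
        # vlan100.10) is treated as the "physical" device itself.
        if dev.startswith("vlan") and "." not in dev:
            return dev
        # Original behavior: one more hop to find the real NIC.
        return parent_of[dev]
    return dev
```

So a bridge on vlan100 resolves to vlan100 itself, while a bridge on vlan100.10 still resolves one hop further up, as before.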

Hey Marcus,

Attachments get stripped.  Can you post it somewhere?

> If people still think it's a good idea, I'll create a functional spec
> and additional info on how it works.
>
>  I've also got a small patch to modifyvlans.sh, but I'm debating
> whether or not it's necessary. It detects whether the "physical
> interface" is actually a vlan-tagged interface, and if so it subtracts
> the necessary bytes from the MTU when it sets up the double-tagged
> bridges. It's technically not necessary, since what matters is whether
> the guest MTUs fit inside the MTU the switch allows once the extra tag
> is added; it just makes what's needed a bit more obvious. However, it
> also takes away the admin's ability to bump the switch MTUs up just a
> bit, say to 1532, to account for the excess without having to go all
> the way up to 9000 or full jumbo. If anyone is a network guru and has
> any feedback it would be appreciated, but I'm inclined to leave the
> MTUs alone and write it into the functional spec that a switch with a
> 1500 MTU supports double tags up to 1468, and a switch with a 9000 MTU
> supports VM guest networks up to 8968 MTU.
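For what it's worth, the headroom implied by all three figures above is a consistent 32 bytes (1500 → 1468, 9000 → 8968, and the 1532 switch-MTU example), so the bookkeeping for the spec could be as simple as this illustrative helper:

```python
# 32 bytes of headroom, taken directly from the 1500 -> 1468 and
# 9000 -> 8968 figures in the message above (not derived independently).
QINQ_HEADROOM = 32

def max_guest_mtu(switch_mtu: int) -> int:
    """Largest guest MTU that fits once the extra-tag headroom is reserved."""
    return switch_mtu - QINQ_HEADROOM

def required_switch_mtu(guest_mtu: int) -> int:
    """Smallest switch MTU needed for a given guest MTU (e.g. 1500 -> 1532)."""
    return guest_mtu + QINQ_HEADROOM
```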
>
> On Mon, Oct 15, 2012 at 1:43 PM, Marcus Sorensen <shadow...@gmail.com> wrote:
>> Ok, I'll pull out the changes and let people see them. CloudStack
>> seems to let me put the same vlan ranges on multiple physicals, though
>> I haven't done much actual testing with large numbers of vlans. I
>> imagine there would be other bottlenecks if they all needed to be up
>> on the same host at once. Luckily we only create bridges for the
>> actual VMs on the box, so it should scale reasonably.
>>
>> The only caveat I've run into so far is that you either need to be
>> running jumbo frames on your switches, or turn down the MTU on the
>> guests a bit to accommodate the space taken by the extra tag. If you
>> wanted to run jumbo frames on the guests as well, you'd run into the
>> same situation and have to use slightly less than 9000 (although the
>> virtual router would require a patch too for the new size).
>>
>> On Mon, Oct 15, 2012 at 9:56 AM, Ahmad Emneina <ahmad.emne...@citrix.com> 
>> wrote:
>>> On 10/15/12 8:35 AM, "Kelceydamage@bbits" <kel...@bbits.ca> wrote:
>>>
>>>>That's a far more elegant way than I tried, which was creating tagged
>>>>interfaces within guests.
>>>>
>>>>On Oct 15, 2012, at 12:54 AM, Chiradeep Vittal
>>>><chiradeep.vit...@citrix.com> wrote:
>>>>
>>>>> This sounds like it can be modeled as multiple physical networks? That
>>>>>is,
>>>>> each "outer" vlan (400, 401, etc) is a separate physical network in the
>>>>> same zone. That could work, although it is probable that the zone
>>>>> configuration API bits prevent more than 4k VLANs per zone (that can be
>>>>> changed to per physical network).
>>>>>
>>>>> As long as communication between guests on different physical networks
>>>>> happens via the public network, it should be OK.
>>>>> I'd like to see the patch.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On 10/12/12 1:09 AM, "Marcus Sorensen" <shadow...@gmail.com> wrote:
>>>>>
>>>>>> Guys, in looking for a free and scalable way to provide private
>>>>>> networks for customers, I've been running a QinQ setup that has
>>>>>> been working quite well. I've sort of laid the groundwork for it
>>>>>> already in changing the bridge naming conventions for KVM about a
>>>>>> month ago (to names that won't collide if the same vlan is used
>>>>>> twice on different physical interfaces).
>>>>>>
>>>>>> Basically the way it works is like this. Linux has two ways of
>>>>>> creating tagged networks: the eth#.# devices and the less-used
>>>>>> vlan# network devices. I have a tiny patch that causes CloudStack
>>>>>> to treat vlan# devs as though they were physical NICs. In this way,
>>>>>> you can do something like physical devices eth0, eth1, and vlan400:
>>>>>> management traffic on eth0's bridge, storage on eth1.102's bridge,
>>>>>> maybe eth1.103 for public/guest, then create say a vlan400 that is
>>>>>> tag 400 on eth1. You add a traffic type of guest to it and give it
>>>>>> a vlan range, say 10-4000. Then you end up with CloudStack handing
>>>>>> out vlan400.10, vlan400.11, etc. for guest networks. Works great
>>>>>> for network isolation without burning through a bunch of your
>>>>>> "real" vlans. In the unlikely event that you run out, you just
>>>>>> create a physical vlan401 and start over with the vlan numbers.
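On stock Linux, one double-tagged guest device in that layout corresponds to `ip link` invocations along these lines; a sketch that just generates the command strings (the device names follow the naming in the message, but the helper itself is hypothetical):

```python
def qinq_guest_dev_cmds(phys="eth1", outer=400, inner=10):
    """ip(8) commands that would build one double-tagged guest device
    in the layout described above (illustrative sketch only)."""
    outer_dev = f"vlan{outer}"          # e.g. vlan400: tag 400 on eth1,
                                        # treated as "physical" by the patch
    guest_dev = f"{outer_dev}.{inner}"  # e.g. vlan400.10: the inner guest tag
    return [
        f"ip link add link {phys} name {outer_dev} type vlan id {outer}",
        f"ip link add link {outer_dev} name {guest_dev} type vlan id {inner}",
        f"ip link set {outer_dev} up",
        f"ip link set {guest_dev} up",
    ]
```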
>>>>>>
>>>>>> In theory, all-you-can-eat isolated networks without having to
>>>>>> configure hundreds of vlans on your networking equipment. This may
>>>>>> require additional config on any upstream switches to pass the
>>>>>> double tags around, but in general from what I've seen the inner
>>>>>> tags just pass through on anything layer 2; it should only get
>>>>>> tricky if you try to tunnel, route, or strip tags.
>>>>>>
>>>>>> This is especially nice with system VM routers and VPC (CloudStack
>>>>>> takes care of everything), but admittedly external routers will
>>>>>> probably have spotty support for routing double-tagged traffic.
>>>>>> I'm also a bit afraid that if I got it merged in, it would just
>>>>>> become this undocumented hack thing that few know about and nobody
>>>>>> uses. So I'm looking for feedback on whether this sounds useful
>>>>>> enough to commit, how it should be documented, and whether it makes
>>>>>> sense to hint at this in the GUI somehow.
>>>>>
>>>>
>>>
>>> +1
>>>
>>> This actually sounds amazing Marcus. I'd love to see and use this
>>> implementation.
>>>
>>> --
>>> Æ
>>>
>>>
>>>
