Here's my rough draft:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Q-in-Q+for+isolated+networks+functional+spec

On Sun, Oct 21, 2012 at 11:41 PM, Chiradeep Vittal
<chiradeep.vit...@citrix.com> wrote:
> +1 on the FS.
>
> On 10/20/12 10:52 PM, "Marcus Sorensen" <shadow...@gmail.com> wrote:
>
>>The admin does have to create a new physical network, the patch just
>>allows you to use a tagged network as that physical network rather
>>than a real eth device. It is true that cloudstack doesn't know about
>>q-in-q per se, but it is the one creating the q-in-q vlans. The admin
>>does have to create any "vlan#" devs to be used, but I think that
>>makes sense since cloudstack doesn't manage any of your physical
>>network devices. Perhaps I need to write a bit of a functional spec
>>just to describe it in more detail.
>>
>>I haven't done anything with it in regards to xen, of course that
>>would also be a different patch since it hits different code. If
>>someone knows that code well maybe they can help. This is a simple
>>patch, but it's made possible by a previous patch that reworks how the
>>bridges are named, so enabling it for xen might not be as simple as
>>this makes it look.
>>
>>On Sat, Oct 20, 2012 at 10:57 PM, Chiradeep Vittal
>><chiradeep.vit...@citrix.com> wrote:
>>> It looks like your patch does not require the admin to configure
>>> anything wrt physical networks. The admin knows the list of "outer"
>>> VLANs and CloudStack is blissfully unaware of the QinQ stuff.
>>> This requires the hypervisors to be independently configured
>>> (out-of-band) with the outer VLAN bridges?
>>> It also looks like this is a KVM-only solution.
>>> Have you tried this with XS?
>>>
>>> On 10/18/12 6:21 PM, "Marcus Sorensen" <shadow...@gmail.com> wrote:
>>>
>>>>Ah, well it's pretty simple, so I'll just paste it here. Again,
>>>>perhaps more should be implemented regarding the MTU (like
>>>>functionality to configure MTU on the virtual router), but if you know
>>>>what to do it can all work via switch configs.
>>>>
>>>>diff --git a/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java b/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
>>>>index 1bc70fa..70de3db 100755
>>>>--- a/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
>>>>+++ b/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
>>>>@@ -800,7 +800,7 @@ public class LibvirtComputingResource extends ServerResourceBase implements
>>>>         String pif = Script.runSimpleBashScript("brctl show | grep " + bridge + " | awk '{print $4}'");
>>>>         String vlan = Script.runSimpleBashScript("ls /proc/net/vlan/" + pif);
>>>>
>>>>-        if (vlan != null && !vlan.isEmpty()) {
>>>>+        if (vlan != null && !vlan.isEmpty() && (!pif.startsWith("vlan") || pif.matches("vlan\\d+\\.\\d+"))) {
>>>>                 pif = Script.runSimpleBashScript("grep ^Device\\: /proc/net/vlan/" + pif + " | awk {'print $2'}");
>>>>         }
>>>>
>>>>On Thu, Oct 18, 2012 at 8:05 AM, Chip Childers
>>>><chip.child...@sungard.com> wrote:
>>>>> On Thu, Oct 18, 2012 at 12:42 AM, Marcus Sorensen <shadow...@gmail.com> wrote:
>>>>>> Sorry, I've been up to my ears. I've attached the simple patch that
>>>>>> makes this all happen, if anyone wants to take a look. This is the
>>>>>> code that looks for physical devices. It's passed a bridge and
>>>>>> determines the parent of that bridge, then checks whether that
>>>>>> parent is a tagged device and, if so, goes one more step and finds
>>>>>> its parent. The patch just circumvents that last lookup if the
>>>>>> parent of the bridge is a "vlan" device (single tagged, e.g.
>>>>>> vlan100) but not a double-tagged one (e.g. vlan100.10), and the
>>>>>> rest of cloudstack treats vlan100 as though it were a physical
>>>>>> device, creates tagged bridges on it if it has guest traffic type,
>>>>>> etc. I've been using it in our test bed for about a month, and have
>>>>>> only run into the MTU issue.
>>>>>
>>>>> Hey Marcus,
>>>>>
>>>>> Attachments get stripped.  Can you post it somewhere?
>>>>>
>>>>>> If people still think it's a good idea, I'll create a functional spec
>>>>>> and additional info on how it works.
>>>>>>
>>>>>> I've also got a small patch to modifyvlans.sh, but I'm debating
>>>>>> whether or not it's necessary. It detects whether the "physical
>>>>>> interface" is actually a vlan tagged interface, and if so it
>>>>>> subtracts the necessary bytes from the MTU when it sets up the
>>>>>> double-tagged bridges. It's technically not necessary, as the
>>>>>> important part is whether the guest MTUs fit inside the MTU that
>>>>>> the switch allows once the extra tag is added. But it just makes it
>>>>>> a bit more obvious as to what's needed. However it also breaks the
>>>>>> admin's ability to bump the switch MTUs up just a bit, say 1532, to
>>>>>> account for the excess without having to go up to 9000 or full
>>>>>> jumbo. If anyone is a network guru and has any feedback it would be
>>>>>> appreciated, but I'm inclined to leave the MTUs alone and write it
>>>>>> into the functional spec that a switch with a 1500 MTU supports
>>>>>> double tags up to 1468, and a switch with a 9000 MTU supports VM
>>>>>> guest networks up to 8968 MTU.
>>>>>>
>>>>>> On Mon, Oct 15, 2012 at 1:43 PM, Marcus Sorensen <shadow...@gmail.com> wrote:
>>>>>>> Ok, I'll pull out the changes and let people see them. Cloudstack
>>>>>>> seems to let me put the same vlan ranges on multiple physicals,
>>>>>>> though I haven't done much actual testing with large numbers of
>>>>>>> vlans. I imagine there would be other bottlenecks if they all
>>>>>>> needed to be up on the same host at once. Luckily we only create
>>>>>>> bridges for the actual VMs on the box so it should scale
>>>>>>> reasonably.
>>>>>>>
>>>>>>> The only caveat I've run into so far is that you either need to be
>>>>>>> running jumbo frames on your switches, or turn down the MTU on the
>>>>>>> guests a bit to accommodate the space taken by the extra tag.  If
>>>>>>> you wanted to run jumbo frames on the guests as well, you'd run
>>>>>>> into the same situation and have to use slightly less than the
>>>>>>> 9000 (although the virtual router would require a patch too for
>>>>>>> the new size).
>>>>>>>
>>>>>>> On Mon, Oct 15, 2012 at 9:56 AM, Ahmad Emneina
>>>>>>><ahmad.emne...@citrix.com> wrote:
>>>>>>>> On 10/15/12 8:35 AM, "Kelceydamage@bbits" <kel...@bbits.ca> wrote:
>>>>>>>>
>>>>>>>>>That's a far more elegant way than I tried, which was creating
>>>>>>>>>tagged interfaces within guests.
>>>>>>>>>
>>>>>>>>>Sent from my iPhone
>>>>>>>>>
>>>>>>>>>On Oct 15, 2012, at 12:54 AM, Chiradeep Vittal
>>>>>>>>><chiradeep.vit...@citrix.com> wrote:
>>>>>>>>>
>>>>>>>>>> This sounds like it can be modeled as multiple physical
>>>>>>>>>> networks? That is, each "outer" vlan (400, 401, etc) is a
>>>>>>>>>> separate physical network in the same zone. That could work,
>>>>>>>>>> although it is probable that the zone configuration API bits
>>>>>>>>>> prevent more than 4k VLANs per zone (that can be changed to per
>>>>>>>>>> physical network).
>>>>>>>>>>
>>>>>>>>>> As long as communication between guests on different physical
>>>>>>>>>> networks happens via the public network, it should be Ok.
>>>>>>>>>> I'd like to see the patch.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> On 10/12/12 1:09 AM, "Marcus Sorensen" <shadow...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Guys, in looking for a free and scalable way to provide
>>>>>>>>>>> private networks for customers I've been running a QinQ setup
>>>>>>>>>>> that has been working quite well. I've sort of laid the
>>>>>>>>>>> groundwork for it already in changing the bridge naming
>>>>>>>>>>> conventions about a month ago for KVM (to names that won't
>>>>>>>>>>> collide if the same vlan is used twice on different phys).
>>>>>>>>>>>
>>>>>>>>>>> Basically the way it works is like this. Linux has two ways of
>>>>>>>>>>> creating tagged networks, the eth#.# and the less used vlan#
>>>>>>>>>>> network devices. I have a tiny patch that causes cloudstack to
>>>>>>>>>>> treat vlan# devs as though they were physical NICs. In this
>>>>>>>>>>> way, you can do something like physical devices eth0, eth1,
>>>>>>>>>>> and vlan400. Management traffic on eth0's bridge, storage on
>>>>>>>>>>> eth1.102's bridge, maybe eth1.103 for public/guest, then
>>>>>>>>>>> create say a vlan400 that is tag 400 on eth1. You add a
>>>>>>>>>>> traffic type of guest to it and give it a vlan range, say
>>>>>>>>>>> 10-4000. Then you end up with cloudstack handing out
>>>>>>>>>>> vlan400.10, vlan400.11, etc for guest networks. Works great
>>>>>>>>>>> for network isolation without burning through a bunch of your
>>>>>>>>>>> "real" vlans. In the unlikely event that you run out, you just
>>>>>>>>>>> create a physical vlan401 and start over with the vlan
>>>>>>>>>>> numbers.
>>>>>>>>>>>
>>>>>>>>>>> In theory all-you-can-eat isolated networks without having to
>>>>>>>>>>> configure hundreds of vlans on your networking equipment. This
>>>>>>>>>>> may require additional config on any upstream switches to pass
>>>>>>>>>>> the double tags around, but in general from what I've seen the
>>>>>>>>>>> inner tags just pass through on anything layer 2; it should
>>>>>>>>>>> only get tricky if you try to tunnel, route or strip tags.
>>>>>>>>>>>
>>>>>>>>>>> This is especially nice with system VM routers and VPC
>>>>>>>>>>> (cloudstack takes care of everything), but admittedly external
>>>>>>>>>>> routers probably will have spotty support for being able to
>>>>>>>>>>> route double tagged stuff. I'm also a bit afraid that if I
>>>>>>>>>>> were to get it merged in that it would just become this
>>>>>>>>>>> undocumented hack thing that few know about and nobody uses.
>>>>>>>>>>> So I'm looking for feedback on whether this sounds useful
>>>>>>>>>>> enough to commit, how it should be documented, and whether it
>>>>>>>>>>> makes sense to hint at this in the GUI somehow.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> +1
>>>>>>>>
>>>>>>>> This actually sounds amazing Marcus. I'd love to see and use this
>>>>>>>> implementation.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Æ
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>
>
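
For reference, the condition in the patch quoted above can be restated as a
small standalone sketch. This is only an illustration of the decision, not the
CloudStack code itself: the class and method names are hypothetical, and the
boolean parameter stands in for the "ls /proc/net/vlan/<pif>" check so that
nothing here shells out.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of CloudStack) mirroring the patched condition.
class VlanParentLookup {
    // Full-string match for a double-tagged device name such as "vlan100.10".
    private static final Pattern DOUBLE_TAGGED = Pattern.compile("vlan\\d+\\.\\d+");

    /**
     * Decides whether the bridge's parent interface (pif) should be resolved
     * one level further to the device underneath it.
     *
     * @param pif          parent interface of the bridge, e.g. "eth1.103"
     * @param hasVlanEntry assumed to mean /proc/net/vlan/<pif> exists
     */
    static boolean shouldResolveParent(String pif, boolean hasVlanEntry) {
        if (!hasVlanEntry) {
            return false; // untagged device: it already is the physical interface
        }
        // A single-tagged "vlan#" device is treated as physical, so it is NOT
        // resolved further; eth#.# and vlan#.# devices still are.
        return !pif.startsWith("vlan") || DOUBLE_TAGGED.matcher(pif).matches();
    }

    public static void main(String[] args) {
        for (String pif : new String[] {"eth1.103", "vlan400", "vlan400.10"}) {
            System.out.println(pif + " -> " + shouldResolveParent(pif, true));
        }
    }
}
```

With this condition, vlan400 stays as the "physical" device, while vlan400.10
is still resolved back to vlan400, which is the behaviour described in the
thread.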

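
The MTU figures mentioned in the thread (1468 under a 1500 switch MTU, 8968
under 9000, or raising the switch to 1532) all correspond to setting aside 32
bytes. A minimal sketch of that arithmetic follows; the class, method, and
constant names are hypothetical. Note that a single 802.1Q tag is 4 bytes, so
32 is a margin rather than a strict minimum, which is why the reservation is
left as a parameter.

```java
// Hypothetical helper illustrating the guest-MTU arithmetic from the thread.
class QinqMtu {
    // Size of one 802.1Q VLAN tag in bytes (2-byte TPID + 2-byte TCI).
    static final int DOT1Q_TAG_BYTES = 4;

    // Margin implied by the thread's figures (1500 -> 1468, 9000 -> 8968).
    static final int THREAD_RESERVED_BYTES = 32;

    /**
     * Largest guest MTU that still fits once reservedBytes are set aside
     * from the MTU the switch allows.
     */
    static int maxGuestMtu(int switchMtu, int reservedBytes) {
        if (reservedBytes < 0 || reservedBytes >= switchMtu) {
            throw new IllegalArgumentException("reservedBytes out of range");
        }
        return switchMtu - reservedBytes;
    }

    public static void main(String[] args) {
        System.out.println(maxGuestMtu(1500, THREAD_RESERVED_BYTES));
        System.out.println(maxGuestMtu(9000, THREAD_RESERVED_BYTES));
        System.out.println(maxGuestMtu(1500, DOT1Q_TAG_BYTES));
    }
}
```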