Re: [ovs-dev] MPLS and VLAN QinQ patch

ravi kerur Wed, 13 Jun 2012 09:32:30 -0700

On Wed, Jun 13, 2012 at 8:49 AM, Jesse Gross <je...@nicira.com> wrote:
> On Thu, Jun 14, 2012 at 12:04 AM, ravi kerur <rke...@gmail.com> wrote:
>> On Wed, Jun 13, 2012 at 5:55 AM, Jesse Gross <je...@nicira.com> wrote:
>>> On Jun 13, 2012, at 12:12 PM, ravi kerur <rke...@gmail.com> wrote:
>>>> On Tue, Jun 12, 2012 at 5:31 PM, Pravin Shelar <pshe...@nicira.com> wrote:
>>>>> TTL actions:
>>>>> As Jesse has mentioned before there could be one Set action which
>>>>> should be able to handle three different ttl related actions added.
>>>>> e.g. OVS_ACTION_ATTR_COPY_TTL_IN can generate set-ipv4 action from
>>>>> userspace. OVS_ACTION_ATTR_COPY_TTL_OUT and
>>>>> OVS_ACTION_ATTR_DEC_MPLS_TTL can be set mpls action. This works for
>>>>> single label case. But ttl_in and ttl_out also needs mpls inner packet
>>>>> parsing in flow_extract to get exact flow in kernel datapath.
>>>>>
>>>>
>>>> <rk> I had disagreed with Jesse for this approach and unfortunately
>>>> our discussion didn't continue since he was busy. My understanding is
>>>> that you use set actions when you are modifying multiple fields in a
>>>> single header and it makes perfect sense to me. In case of mpls we are
>>>> dealing with 2 headers ip and mpls
>>>
>>> It does save some time if you modify multiple header fields but that's not 
>>> the main goal. The primary objective is to push as much complexity as 
>>> possible to userspace and flow setup time. The kernel should just expose 
>>> the simplest building blocks possible.
>>>
>>>> for e.g.
>>>>
>>>> copy_ttl_in could be mpls-outer-header to mpls-inner-header (in case
>>>> of stacked labels) and mpls-header to ip for single label.
>>>>
>>>> copy_ttl_out could be mpls-inner-header to mpls-outer-header(stacked
>>>> labels) and ip to mpls-header for signle label
>>>>
>>>> dec_mpls_ttl, single header operation.
>>>>
>>>> hence I think it should be left as is. Let me know if I have
>>>> misunderstood set operations.
>>>
>>> I spent a fair amount of time thinking about this today and concluded that 
>>> neither option is very attractive as is.
>>>
>>> If we go with what you have now, I'm fairly confident that we will regret 
>>> it in the future.  The kernel code used to more directly implement various 
>>> features and prior to upstreaming we broke many of them down.  I'm happy 
>>> with the results of that but this time we won't have the benefit of 
>>> revising things later. This is particularly bad because it deviates from 
>>> our usual model of userspace controlling everything and here userspace 
>>> won't even know what the flow looks like. The effects of this tend to 
>>> metastasize because when userspace doesn't know what the packet looks like 
>>> it can't implement things that it might overwise be able to do and more and 
>>> more ends up in the kernel. The other thing, which is specific to MPLS, is 
>>> that there is no inherent way to know the type of the payload. Userspace is 
>>> vastly more likely to have this information in the event that we want to do 
>>> something with the inner packet.  In your patch the kernel is basically 
>>> assuming that the type is IP (OpenFlow doesn't give us any additional 
>>> information but it doesn't seem like a good idea in general).
>>
>> <rk> No both kernel and userspace takes care of ipv4/ipv6 and non-ip
>> as well. Motivation for MPLS is to be unaware of underneath l2/l3
>> protocol and allow fast switching based on just top mpls label.
>
> I'm talking specifically about copying the TTL.  If you are copying
> to/from an inner IP packet then you are clearly not agnostic to the
> protocol being carried.
>


<rk> I think we are talking about 2 different cases here. In core mpls
network, routers/l3 switches are agnostic to underneath layers. For
mpls edge, yes I agree they need to be aware of underneath layer esp.
for ttl and tos.

>>>
>>> Using the approach that I had suggested largely avoids these problems and 
>>> if we could go that path by adding say, the ability to match on another 
>>> tag, then I think that would be reasonable. However, it still has a 
>>> variation of the last problem, which is that although it may know what the 
>>> inner type is, unless it can tell that to the kernel there is no way to get 
>>> to the inner flow. This makes copy inwards to IP header difficult. Perhaps 
>>> we could add a mechanism for userspace to tell the kernel the type of the 
>>> inner packet, although it might not know either.
>>
>> <rk> Note that we are also increasing the size of flow-key and flow
>> here which is not required in the first case. Is it good to increase
>> memory footprint if majority of the flows handled by ovs are non-MPLS.
>> It can be justified if ovs is in the core and handles only MPLS
>> traffic.
>
> If you don't try to parse anything inside of the MPLS payload then you
> can union the MPLS tags with L3/L4 information and not take any
> additional space.  At this point, I'm trying to find the correct
> solution and I'm less concerned about optimizations.  Once we have
> something that makes sense we can figure out if there is an impact on
> performance and ways to avoid that.

<rk> agreed. I just thought of bringing it up.
>
>>>
>>> Potentially a third approach is to add a more native way to recirculate 
>>> through the pipeline (somewhat like a patch port but lighter weight and 
>>> with a different intention). In addition to the TTL issues it could be 
>>> useful in situations where you want to parse deeper into the packet than 
>>> the number of MPLS or vlan tags supported.  Often times people might want 
>>> to do L3 operations after popping off an MPLS tag and in that case you 
>>> almost certainly end up doing something like this.
>>
>> <rk> this is the case when packet has single mpls label and action
>> configured is pop_mpls, in addition to mpls label lookup, l3 lookup
>> happens as well.
>
> Yes, that's the use case that I'm referring to.  Right now in order to
> handle this you would need to do some kind of manual recirculation
> because OVS doesn't parse beyond the MPLS header.

<rk> yes, I did try with resubmit option and I wasn't successful( I
did it at very early stage of development)

>
>>>
>>> Ben, do you have any thoughts?
>>>
>>>> I am not sure whether we should do take inner header parsing
>>>> approach in kernel because it deviates from the spec. The only
>>>> deviation so far in the code v/s spec is handling of ttl and tos bits
>>>> for inner-most mpls header and it is done that way because of
>>>> practical deployed use-case.
>>>
>>> The kernel doesn't implement OpenFlow, so it doesn't matter what the spec 
>>> says. The only thing that needs to be consistent is the interface that 
>>> userspace exposes to the outside world. Everything else is an internal 
>>> implementation detail.
>>
>> <rk> I was referring to struct sw_flow_key and atleast I thought it is
>> motivated by openflow.
>
> It's motivated by OpenFlow but there are plenty of other things in
> there.  It's really like any other internal data structure in OVS.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] MPLS and VLAN QinQ patch

Reply via email to