On Thu, Jan 8, 2015 at 9:48 PM, Fan Du <fengyuleidian0...@gmail.com> wrote:
> On 2015-01-09 03:55, Jesse Gross wrote:
>> On Thu, Jan 8, 2015 at 1:39 AM, Fan Du<fengyuleidian0...@gmail.com>
>> wrote:
>>> >On 2015-01-08 04:52, Jesse Gross wrote:
>>>>> >>>
>>>>> >>>My understanding is:
>>>>> >>>the controller installs the forwarding rules into the kernel datapath,
>>>>> >>>and any flow not matching the rules is thrown to the controller by
>>>>> >>>upcall. Once the controller has made the rule decision, the packet of
>>>>> >>>this flow is pushed down to the datapath to be forwarded again
>>>>> >>>according to the new rule.
>>>>> >>>
>>>>> >>>So I'm not sure whether pushing the over-MTU-sized packet, or pushing
>>>>> >>>the forged ICMP without encapsulation, to the controller is required by
>>>>> >>>the current ovs implementation. By doing so, such an over-MTU-sized
>>>>> >>>packet is treated as an event for the controller to take care of.
>>>> >>
>>>> >>If flows are implementing routing (again, they are doing things like
>>>> >>decrementing the TTL) then it is necessary for them to also handle
>>>> >>this situation using some potentially new primitives (like a size
>>>> >>check). Otherwise you end up with issues like the ones that I
>>>> >>mentioned above like needing to forge addresses because you don't know
>>>> >>what the correct ones are.
>>> >
>>> >
>>> >Thanks for explaining, Jesse!
>>> >
>>> >btw, I don't understand the point about "forging addresses": building an
>>> >ICMP message from the Guest packet doesn't require forging an address
>>> >when the ICMP message is not encapsulated with outer headers.
>> Your patch has things like this (for the inner IP header):
>> +                               new_ip->saddr = orig_ip->daddr;
>> +                               new_ip->daddr = orig_ip->saddr;
>> These addresses are owned by the endpoints, not the host generating
>> the ICMP message, so I would consider that to be forging addresses.
>>>> >>If the flows aren't doing things to implement routing, then you really
>>>> >>have a flat L2 network and you shouldn't be doing this type of behavior
>>>> >>at all as I described in the original plan.
>>> >
>>> >
>>> >For the scenario where flows implement routing:
>>> >First of all, an over-MTU-sized packet can only be detected once the flow
>>> >has been consulted (each port could implement a 'check' hook to do this),
>>> >and just before sending to the actual port.
>>> >
>>> >Then the over-MTU-sized packet is pushed back to the controller, and it is
>>> >the controller that decides whether to build an ICMP message, or whatever
>>> >routing behaviour it may take, and sends it back with the port
>>> >information. This ICMP message will travel back to the Guest.
>>> >
>>> >Why does the flow have to use a primitive like "check size"? "check size"
>>> >will only take effect after do_output. This approach is not very clear
>>> >to me.
>> Checking the size obviously needs to be an action that would take
>> place before outputting in order for it to have any effect. Attaching
>> a check to a port does not fit in very well with the other primitives
>> of OVS, so I think an action is the obvious place to put it.
>>> >And not every scenario involves flows with routing behaviour: one may just
>>> >set up a vxlan tunnel and attach a KVM guest or Docker onto it for playing
>>> >or developing.
>>> >This shouldn't necessarily require the user to set additional specific
>>> >flows to make over-MTU-sized packets pass through the tunnel correctly. In
>>> >such a scenario, I think the original patch in this thread to fragment
>>> >tunnel packets is still needed, OR we work out a generic component to
>>> >build ICMP for all tunnel types at the L2 level.
>>> >Both of those would act as a backup plan, as there is no such specific
>>> >flow by default.
>> In these cases, we should find a way to adjust the MTU, preferably
>> automatically using virtio.
> I'm going to argue this a bit more here.
> virtio_net poses no limit on the MTU of its simulated net device; it can
> actually fall anywhere between 68 and 65535. Most importantly, virtio_net
> just simulates a NIC; it can't assume there is an encapsulating port
> downstream of it.
> How should virtio automatically adjust the MTU of the guest above it?

There are at least two parts to this:
 * Calculating the right MTU for the guest device.
 * Transferring the MTU from the host to the guest.

The first would presumably involve exposing some kind of API that the
component that does know the right value could program. In this case,
that component could be OVS using the same type of information that
you just described in the earlier post about L3. The API could simply
be to set the MTU of the device in the host, which then gets mirrored
to the guest.
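
As a rough illustration of the first part only, here is a minimal sketch
of the calculation, assuming a VXLAN-over-IPv4 tunnel with no VLAN tags
and no IP options. The helper name guest_mtu_for_tunnel and the two
constants are hypothetical and are not existing OVS or kernel code:

```c
/* Sketch only: names below are hypothetical, not part of OVS or the kernel. */

/* Per-packet encapsulation overhead for VXLAN over IPv4, in bytes:
 * outer IPv4 header (20) + outer UDP header (8) + VXLAN header (8)
 * + inner Ethernet header (14). Assumes no IP options and no VLAN tag. */
enum { VXLAN_IPV4_OVERHEAD = 20 + 8 + 8 + 14 };

/* Lower bound for a usable IPv4 MTU (the 68 mentioned earlier in the thread). */
enum { MIN_IPV4_MTU = 68 };

/* Returns the largest guest MTU whose packets still fit in the underlay
 * MTU after encapsulation, or -1 if the inputs are invalid or the result
 * would fall below MIN_IPV4_MTU. */
int guest_mtu_for_tunnel(int underlay_mtu, int encap_overhead)
{
    int mtu;

    if (underlay_mtu <= 0 || encap_overhead < 0)
        return -1;

    mtu = underlay_mtu - encap_overhead;
    return mtu >= MIN_IPV4_MTU ? mtu : -1;
}
```

With a 1500-byte underlay this gives 1450 for the guest device. The host
side could then apply the value with the standard SIOCSIFMTU ioctl (or
"ip link set dev <name> mtu <value>"); mirroring it into the guest is the
second part below.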

The second part I guess is probably a fairly straightforward extension
to virtio but I don't know the details.