On 2015-01-09 03:55, Jesse Gross wrote:
On Thu, Jan 8, 2015 at 1:39 AM, Fan Du <fengyuleidian0...@gmail.com> wrote:
On 2015-01-08 04:52, Jesse Gross wrote:

My understanding is:
The controller installs the forwarding rules into the kernel datapath; any
flow that does not match the rules is thrown to the controller by an upcall.
Once the controller has made the rule decision, the packet for this flow is
pushed back down to the datapath to be forwarded again according to the new
rule.
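
The miss/upcall cycle described above can be sketched roughly as follows. This
is a toy model, not OVS code: the struct and function names (flow_key,
flow_install, datapath_receive) are hypothetical, and the key has only two
fields where a real datapath key has many more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified flow key. */
struct flow_key { uint32_t in_port; uint32_t ip_dst; };

struct flow_rule { struct flow_key key; uint32_t out_port; int in_use; };

#define TABLE_SIZE 8
struct flow_table { struct flow_rule rules[TABLE_SIZE]; };

enum verdict { FORWARDED, UPCALL };

/* Linear exact-match lookup; returns NULL on a miss. */
static const struct flow_rule *flow_lookup(const struct flow_table *t,
                                           const struct flow_key *k)
{
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        const struct flow_rule *r = &t->rules[i];
        if (r->in_use && r->key.in_port == k->in_port &&
            r->key.ip_dst == k->ip_dst)
            return r;
    }
    return NULL;
}

/* What the controller side does after an upcall: install a rule.
 * Returns 0 on success, -1 if the table is full. */
static int flow_install(struct flow_table *t, const struct flow_key *k,
                        uint32_t out_port)
{
    assert(t != NULL && k != NULL);
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (!t->rules[i].in_use) {
            t->rules[i].key = *k;
            t->rules[i].out_port = out_port;
            t->rules[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Datapath receive path: forward on a hit, ask for an upcall on a miss. */
static enum verdict datapath_receive(const struct flow_table *t,
                                     const struct flow_key *k,
                                     uint32_t *out_port)
{
    const struct flow_rule *r = flow_lookup(t, k);
    if (r == NULL)
        return UPCALL;
    *out_port = r->out_port;
    return FORWARDED;
}
```

In this model the first packet of a flow returns UPCALL; once flow_install()
has been called for its key, the same packet pushed down again is forwarded
to the installed port.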

So I'm not sure whether pushing the over-MTU-sized packet, or pushing the
forged ICMP message without encapsulation, to the controller is required by
the current OVS implementation. By doing so, such an over-MTU-sized packet is
treated as an event for the controller to take care of.

If flows are implementing routing (again, they are doing things like
decrementing the TTL) then it is necessary for them to also handle
this situation using some potentially new primitives (like a size
check). Otherwise you end up with issues like the ones that I
mentioned above like needing to forge addresses because you don't know
what the correct ones are.


Thanks for explaining, Jesse!

btw, I don't understand the "to forge addresses" part: building an ICMP
message from the guest packet doesn't require forging addresses when the ICMP
message is not encapsulated with outer headers.

Your patch has things like this (for the inner IP header):

+                               new_ip->saddr = orig_ip->daddr;
+                               new_ip->daddr = orig_ip->saddr;

These addresses are owned by the endpoints, not the host generating
the ICMP message, so I would consider that to be forging addresses.
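
The difference can be illustrated with a small sketch. The struct and both
helper names are hypothetical; the first helper mirrors the two quoted patch
lines, and the second shows what a device doing real L3 forwarding would
normally do, assuming it has an address of its own (router_addr) to use as
the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the two address fields matter for this illustration. */
struct ip_addrs { uint32_t saddr; uint32_t daddr; };

/* Mirrors the quoted patch: the ICMP error claims to come from the
 * original destination, an address the generating host does not own. */
static struct ip_addrs reply_addrs_swapped(const struct ip_addrs *orig)
{
    assert(orig != NULL);
    struct ip_addrs r = { .saddr = orig->daddr, .daddr = orig->saddr };
    return r;
}

/* Router-style alternative: the ICMP error is sourced from an address
 * owned by the node that generates it. */
static struct ip_addrs reply_addrs_router(const struct ip_addrs *orig,
                                          uint32_t router_addr)
{
    assert(orig != NULL);
    struct ip_addrs r = { .saddr = router_addr, .daddr = orig->saddr };
    return r;
}
```

Without routing state in the flows there is no router_addr to pass in, which
is why the patch falls back to the swapped form.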

If the flows aren't doing things to implement routing, then you really
have a flat L2 network and you shouldn't be doing this type of behavior
at all as I described in the original plan.


For the scenario where flows implement routing:
First of all, an over-MTU-sized packet can only be detected once the flow
has been consulted (each port could implement a 'check' hook to do this),
and just before it is sent to the actual port.

Then the over-MTU-sized packet is pushed back to the controller; it's the
controller that will decide whether to build an ICMP message, or whatever
routing behaviour it may take, and send it back with the port information.
This ICMP message will travel back to the guest.

Why does the flow have to use a primitive like a "check size"? "check size"
will only take effect after do_output. I'm not very clear about this
approach.

Checking the size obviously needs to be an action that would take
place before outputting in order for it to have any effect. Attaching
a check to a port does not fit in very well with the other primitives
of OVS, so I think an action is the obvious place to put it.

If flow is defined as:

        CHECK_SIZE -> OUTPUT

Then traversing the actions at CHECK_SIZE needs to find the exact OUTPUT port,
and thus get its underlay encapsulation method as well as a valid route for the
physical NIC MTU; with that information it can calculate whether GSOed packets
exceed the physical MTU. This looks feasible at first glance. After this,
it's the controller's responsibility to handle such an event.

If CHECK_SIZE returns positive (over-MTU-sized packets show up), then call
output_userspace to push this packet upwards.

I'm not sure this vague idea is the expected behaviour as required by "L3 
processing".
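
The CHECK_SIZE -> OUTPUT idea above could be sketched roughly like this. It is
only an illustration under stated assumptions: none of these types or names
exist in OVS, each port is assumed to know its underlay MTU and encapsulation
overhead up front (so no route lookup is modelled), and pkt_len is taken to be
the per-segment inner IP packet length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum action_type { ACT_CHECK_SIZE, ACT_OUTPUT };

struct action { enum action_type type; uint32_t port; };

/* Hypothetical per-port data, indexed by port number. */
struct port_info { uint32_t underlay_mtu; uint32_t encap_overhead; };

enum result { RES_OUTPUT, RES_USERSPACE, RES_DROP };

static enum result execute_actions(const struct action *acts, size_t n,
                                   const struct port_info *ports,
                                   size_t n_ports, uint32_t pkt_len,
                                   uint32_t *out_port)
{
    assert(acts != NULL && ports != NULL && out_port != NULL);
    for (size_t i = 0; i < n; i++) {
        switch (acts[i].type) {
        case ACT_CHECK_SIZE:
            /* Look ahead for the next OUTPUT to learn the egress port. */
            for (size_t j = i + 1; j < n; j++) {
                if (acts[j].type != ACT_OUTPUT)
                    continue;
                uint32_t p = acts[j].port;
                if (p >= n_ports)
                    return RES_DROP;
                /* Too big once encapsulated: hand it to userspace. */
                if ((uint64_t)pkt_len + ports[p].encap_overhead >
                    ports[p].underlay_mtu)
                    return RES_USERSPACE;
                break;
            }
            break;
        case ACT_OUTPUT:
            if (acts[i].port >= n_ports)
                return RES_DROP;
            *out_port = acts[i].port;
            return RES_OUTPUT;
        }
    }
    return RES_DROP; /* no OUTPUT action in the list */
}
```

With an underlay MTU of 1500 and 50 bytes of overhead (the usual figure for
VXLAN over IPv4), a 1450-byte inner packet is output while a 1451-byte one is
sent to userspace, where the controller would decide what to do with it.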

And not every scenario involves flows with routing behaviour: one may just set
up a VXLAN tunnel and attach a KVM guest or Docker container to it for playing
or developing. This shouldn't necessarily require the user to set additional
specific flows to make over-MTU-sized packets pass through the tunnel
correctly. In such a scenario, I think the original patch in this thread to
fragment tunnel packets is still needed, OR we work out a generic component to
build ICMP for all tunnel types at the L2 level. Either of those would act as a
backup plan, as there is no such specific flow by default.

In these cases, we should find a way to adjust the MTU, preferably
automatically using virtio.



--
No zuo no die but I have to try.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
