I think that if we come up with a good overall design, it should be
able to handle different MTUs without needing to special case them -
after all, we're already talking about 2 different MTUs (encapsulated
and not) - so I don't think that having more would really make a
significant difference. I would encourage you to read through this
thread if you haven't already:
https://www.spinics.net/lists/netdev/msg257830.html
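
To make the "no special casing" point concrete, here is a rough sketch
(not OVS or OVN code) of the length check proposed in the quoted
message below, parameterized by encapsulation overhead so that the
encapsulated and non-encapsulated cases go through the same path. The
function names and the "forward"/"punt" results are hypothetical. The
breakdown of the 58-byte Geneve/IPv4 figure is my assumption; the
thread only gives the total.

```python
# Hypothetical sketch; none of these names exist in OVS or OVN.

# Assumed breakdown of the 58-byte Geneve-over-IPv4 overhead quoted in
# the thread (the thread states only the total).
GENEVE_IPV4_OVERHEAD = (
    20    # outer IPv4 header
    + 8   # outer UDP header
    + 8   # Geneve base header
    + 8   # one Geneve option: 4-byte option header + 4 bytes of data
    + 14  # inner Ethernet header carried inside the tunnel
)


def inner_mtu(egress_mtu, encap_overhead=0):
    """Largest inner IP packet that still fits the egress device MTU."""
    if encap_overhead < 0:
        raise ValueError("encapsulation overhead cannot be negative")
    if egress_mtu <= encap_overhead:
        raise ValueError("egress MTU leaves no room for a payload")
    return egress_mtu - encap_overhead


def check_length(ip_len, egress_mtu, encap_overhead=0):
    """Return "forward" if the packet fits, else "punt".

    "punt" stands in for handing the packet to the controller so that
    it can generate an ICMP "fragmentation needed" reply.
    """
    limit = inner_mtu(egress_mtu, encap_overhead)
    return "forward" if ip_len <= limit else "punt"


if __name__ == "__main__":
    # Same code path for a tunneled and a non-tunneled network.
    print(inner_mtu(1500, GENEVE_IPV4_OVERHEAD))
    print(check_length(1500, 1500, GENEVE_IPV4_OVERHEAD))
    print(check_length(1500, 1500))
```

The overhead is just a parameter here, so a third or fourth MTU is one
more (egress MTU, overhead) pair rather than a new special case. The
hard part discussed in the thread, finding the egress device for a
tunnel endpoint, is deliberately left out.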

On Fri, Jun 10, 2016 at 6:50 AM, Matt Kassawara <mkassaw...@gmail.com> wrote:
> If it helps any, the entire underlying physical network should use the same
> MTU, at least in the simplest case. In other words, all provider networks
> use the native MTU and all self-service/project networks use the native MTU
> minus overlay protocol overhead, or 58 bytes for Geneve with IPv4 endpoints.
> Corner cases exist, but are rare and we can worry about them later.
>
> On Thu, Jun 9, 2016 at 7:06 PM, Jesse Gross <je...@kernel.org> wrote:
>>
>> In my previous message, this is what I mentioned (reproducing it here
>> just because it doesn't appear in the quoted conversation below):
>>
>> "One possible solution is to introduce an action in the kernel that
>> would check packets flowing through the switch against a length
>> specified by the user (where the 'user' is OVS userspace/OVN in this
>> case). To use this, we would do a route lookup of the tunnel endpoint
>> to find the outgoing device, subtract the encapsulation overhead, and
>> install a flow that checks this length and punts the packet to OVN to
>> generate an ICMP message."
>>
>> A possible way of getting the MTU to OVS userspace would be through a
>> configuration option. I don't think that this is really the hard part
>> though, and so the rest of the discussion around this should still apply.
>> In particular, it's not really that hard for OVS userspace to do a
>> route lookup, so if we are totally sure that the MTU is static we
>> could just have OVS fetch it. I'm not sure that either approach is all
>> that generic though.
>>
>> On Thu, Jun 9, 2016 at 10:27 AM, Matt Kassawara <mkassaw...@gmail.com> wrote:
>> > Jesse,
>> >
>> > I know this sounds too easy, but can we just tell OVS about the underlying
>> > physical network MTU via config option?
>> >
>> > On Fri, May 6, 2016 at 1:08 PM, Jesse Gross <je...@kernel.org> wrote:
>> >>
>> >> On Fri, May 6, 2016 at 11:53 AM, Ryan Moats <rmo...@us.ibm.com> wrote:
>> >> > Jesse Gross <je...@kernel.org> wrote on 05/06/2016 11:11:10 AM:
>> >> >
>> >> >> From: Jesse Gross <je...@kernel.org>
>> >> >> To: Ryan Moats/Omaha/IBM@IBMUS
>> >> >> Cc: Matt Kassawara <mkassaw...@gmail.com>, discuss
>> >> >> <discuss@openvswitch.org>, Thomas Graf <tg...@suug.ch>
>> >> >> Date: 05/06/2016 11:11 AM
>> >> >> Subject: Re: [ovs-discuss] MTU considerations for OVN
>> >> >>
>> >> >> On Fri, May 6, 2016 at 8:40 AM, Ryan Moats <rmo...@us.ibm.com> wrote:
>> >> >> > "discuss" <discuss-boun...@openvswitch.org> wrote on 05/04/2016 06:09:04 PM:
>> >> >> >
>> >> >> >> From: Jesse Gross <je...@kernel.org>
>> >> >> >> To: Matt Kassawara <mkassaw...@gmail.com>
>> >> >> >> Cc: discuss <discuss@openvswitch.org>
>> >> >> >> Date: 05/04/2016 06:09 PM
>> >> >> >> Subject: Re: [ovs-discuss] MTU considerations for OVN
>> >> >> >> Sent by: "discuss" <discuss-boun...@openvswitch.org>
>> >> >> >>
>> >> >> >> On Tue, May 3, 2016 at 3:50 PM, Matt Kassawara <mkassaw...@gmail.com> wrote:
>> >> >> >> > Jesse,
>> >> >> >> >
>> >> >> >> > I'm resurrecting this thread after a fairly lengthy discussion of MTU with
>> >> >> >> > Ben at the recent OpenStack summit. Have you given the topic any further
>> >> >> >> > thought toward implementation in a reasonable way? Can you elaborate on the
>> >> >> >> > architectural limitations? At the moment, the OpenStack implementation of
>> >> >> >> > OVN doesn't use DPDK.
>> >> >> >>
>> >> >> >> The issue that I alluded to before is that when OVS (and by extension
>> >> >> >> OVN) does L3 processing the packets aren't traversing the Linux IP
>> >> >> >> stack and so the usual MTU checks don't apply. Instead OVS just does a
>> >> >> >> single combined lookup for all flow processing and then applies some
>> >> >> >> actions like set SMAC/DMAC and decrement TTL. Not only is there no
>> >> >> >> code to check the outgoing MTU but there's no obvious outgoing device
>> >> >> >> to fetch the desired MTU from.
>> >> >> >
>> >> >> > I'm not 100% sure why this would be an issue - IIRC (based on my scanning
>> >> >> > the code) when a packet is going to be output, it looks like the MTU of the
>> >> >> > physical device is checked and a fragmentation decision made. Isn't that
>> >> >> > good enough for our purposes?
>> >> >>
>> >> >> Which check in particular do you have in mind?
>> >> >>
>> >> >> There are two possibilities that I can think of:
>> >> >>  * ovs_vport_send() has one but the device it looks at for the MTU is
>> >> >> a tunnel device, which has an essentially infinite MTU. The real MTU
>> >> >> that we would need to check also depends on the destination IP address
>> >> >> of the tunnel but we haven't done a route lookup at this point.
>> >> >>  * ip_finish_output() in the IP stack. This one does have the
>> >> >> information that we need but it is outside of the tunnel. Any ICMP
>> >> >> packets that are generated will be processed through the hypervisor's
>> >> >> IP stack and won't make it back to the VM. In addition, this check
>> >> >> doesn't handle GSO packets.
>> >> >
>> >> > I see, I was misreading code... my mistake.
>> >> >
>> >> > I certainly dislike the idea of separating the MTU calculation from the
>> >> > datapath. What I was hoping to find was that it would be possible to do the
>> >> > fragmentation check on the tunnel after the route has been looked up and
>> >> > the outgoing device is known, but looking through this, I'm not seeing
>> >> > a good way to do this cleanly (yet) ...
>> >>
>> >> I agree.
>> >>
>> >> There was a thread a while back on the netdev mailing list related
>> >> to this, but it reached no real conclusion:
>> >> https://www.spinics.net/lists/netdev/msg257830.html
>> >
>> >
>
>
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
