On Tue, Sep 29, 2015 at 03:21:16PM -0700, Ben Pfaff wrote: > On Tue, Sep 29, 2015 at 03:19:47PM -0700, Ben Pfaff wrote: > > On Mon, Sep 21, 2015 at 06:06:16PM -0700, Justin Pettit wrote: > > > > + <pre> > > > > +ip4.dst = ip4.src; > > > > +ip4.src = <var>S</var>; > > > > +ip4.ttl = 255; > > > > +icmp4.type = 0; > > > > +reg0 = ip4.dst; > > > > > > Finally, should it always be the original source IP? If we have > > > multiple routers in between, shouldn't it be the gateway's address? > > > (This question holds for all the places where reg0 is being set.) > > > > XXXXXXXXXXXXx > > Um, that was meant to be an actual reply. I'll get to that.
It's a real problem. To resolve it, I think we need to break logical router ingress table 1 into a pair of tables, one for IP input, one for IP routing, and enhance the "next" action to allow recirculation within a logical flow table. Here's an incremental. I'll post a full revision soon. diff --git a/ovn/TODO b/ovn/TODO index c914c10..77f5716 100644 --- a/ovn/TODO +++ b/ovn/TODO @@ -75,6 +75,13 @@ See description in ovn-northd(8). ** New OVN logical actions +*** enhanced "next" action. + +OVN logical router flows need to be able to revisit a single logical +flow table, so that ICMP "destination unreachable" errors generated by +a logical router can themselves be routed. One way to do this is to +enhance the "next" action to take an optional flow table index. + *** arp Generates an ARP packet based on the current IPv4 packet and allows it diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml index c822dbb..a5abc3b 100644 --- a/ovn/northd/ovn-northd.8.xml +++ b/ovn/northd/ovn-northd.8.xml @@ -236,7 +236,7 @@ Other packets are implicitly dropped. </p> - <h3>Ingress Table 1: IP Routing</h3> + <h3>Ingress Table 1: IP Input</h3> <p> This table is the core of the logical router datapath functionality. It @@ -244,12 +244,6 @@ functionality. </p> - <p> - Logical flows in this table that advance to the next table set - <code>reg0</code> to the next-hop IP address. (<code>ip4.dst</code>, the - final destination, is left unchanged.) - </p> - <ul> <li> <p> @@ -300,7 +294,6 @@ ip4.dst = ip4.src; ip4.src = <var>S</var>; ip4.ttl = 255; icmp4.type = 0; -reg0 = ip4.dst; // Route back to original IP source. next; </pre> @@ -428,12 +421,27 @@ icmp4 { ip4.dst = ip4.src; ip4.src = <var>A</var>; ip4.ttl = 255; - reg0 = ip4.dst; next; }; </pre> </li> + </ul> + + <h3>Ingress Table 2: IP Routing</h3> + <p> + A packet that arrives at this table is an IP packet that should be routed + to the address in <code>ip4.dst</code>. This table implements IP + routing, setting <code>reg0</code> to the next-hop IP address (leaving + <code>ip4.dst</code>, the packet's final destination, unchanged) and + advances to the next table for ARP resolution. + </p> + + <p> + This table contains the following logical flows: + </p> + + <ul> <li> <p> Routing table. For each route to IPv4 network <var>N</var> with @@ -449,6 +457,11 @@ next; </pre> <p> + (Ingress table 1 already verified that <code>ip4.ttl--;</code> will + not yield a TTL exceeded error.) + </p> + + <p> If the route has a gateway, <var>G</var> is the gateway IP address, otherwise it is <code>ip4.dst</code>. </p> @@ -458,8 +471,8 @@ next; <p> Destination unreachable. For each router port <var>P</var>, which owns IP address <var>A</var>, a priority-0 logical flow with match - <code>in_port == <var>P</var> && !ip.later_frag</code> has - the following actions: + <code>in_port == <var>P</var> && !ip.later_frag && + !icmp</code> has the following actions: </p> <pre> @@ -469,19 +482,23 @@ icmp4 { ip4.dst = ip4.src; ip4.src = <var>A</var>; ip4.ttl = 255; - reg0 = ip4.dst; - next; + next(2); }; </pre> <p> + (The <code>!icmp</code> check prevents recursion if the destination + unreachable message itself cannot be routed.) + </p> + + <p> These flows are omitted if the logical router has a default route, that is, a route with netmask 0.0.0.0. </p> </li> </ul> - <h3>Ingress Table 2: ARP Resolution</h3> + <h3>Ingress Table 3: ARP Resolution</h3> <p> Any packet that reaches this table is an IP packet whose next-hop IP _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev