> On Sep 17, 2015, at 10:11 AM, Ben Pfaff <b...@nicira.com> wrote:


Thanks for writing this up!  I'm still digesting it, but here are some initial 
comments:

> +        <p>
> +          L3 admission control: A priority-220 flow drops packets that match
> +          any of the following:
> +        </p>
> +
> +        <ul>
> +          <li>
> +            <code>ip.src[28..31] == 0xe</code> (multicast source)
> +          </li>
> +          <li>
> +            <code>ip.src == 255.255.255.255</code> (broadcast source)
> +          </li>
> +          <li>
> +            <code>ip.src == 127.0.0.0/8 || ip.dst == 127.0.0.0/8</code>
> +            (localhost source or destination)
> +          </li>
> +          <li>
> +            <code>ip.src == 0.0.0.0/8 || ip.dst == 0.0.0.0/8</code> (zero
> +            network source or destination)
> +          </li>

I assume these should all be "ip4.src".

> +          <li>
> +            <code>ip.src</code> is any IP address owned by the router.
> +          </li>
> +          <li>
> +            <code>ip.src</code> is the broadcast address of any IP network
> +            known to the router.

These will be IPv4 or IPv6 specific, but may be good enough for illustrative 
purposes here.

> +        <p>
> +          ICMP echo reply.  These flows reply to ICMP echo requests received
> +          for the router's IP address.  Let <var>A</var> be an IP address 
> owned
> +          by the router or the broadcast address for one of these IP 
> address's
> +          networks.  Then, for each <var>A</var>, a priority-210 flow matches
> +          on <code>ip.dst == <var>A</var></code> and <code>icmp4.type == 8
> +          &amp;&amp; icmp4.code == 0</code> (ICMP echo request).  These flows
> +          use the following actions where, if <var>A</var> is unicast, then
> +          <var>S</var> is <var>A</var>, and if <var>A</var> is broadcast,
> +          <var>S</var> is the router's IP address in <var>A</var>'s network:
> +        </p>
> +
> +        <pre>
> +ip4.dst = ip4.src;
> +ip4.src = <var>S</var>;
> +ip4.ttl = 255;
> +icmp4.type = 0;
> +reg0 = ip4.dst;

Later on, it becomes clear why reg0 is being used, but it would be nice to be 
explicit about it earlier.

Maybe it's obvious to others, but my first thought was that this was the 
original "ip.dst", but now that I look at it more carefully, my guess is that 
it is the original "ip.src".  It might be good to add a comment to clarify.

Finally, should it always be the original source IP?  If we have multiple 
routers in between, shouldn't it be the gateway's address?  (This question 
holds for all the places where reg0 is being set.)

> +next;
> +</pre>

Is it possible in these examples to have "<pre>" and "</pre>"  line up?

> +      <li>
> +        <p>
> +          TCP reset.  These flows generate TCP reset messages in reply to TCP
> +          datagrams directed to the router's IP address.  The logical router
> +          doesn't accept any TCP traffic so it always generates such a reply.
> +        </p>

Do you want to add the comment about not matching on IP fragments with nonzero 
offset like you did for UDP and protocol unreachable?  (TCP shouldn't generate 
IP fragments, but it can happen.)

> +        <p>
> +          Protocol unreachable.  These flows generate ICMP protocol 
> unreachable
> +          messages in reply to packets directed to the router's IP address on
> +          IP protocols other than UDP, TCP, and ICMP.
> +        </p>

I think for all of these error generators, we should consider some sort of 
rate-limiting.  Obviously, this is a little complicated if we want to do it in 
ovs-vswitchd--especially in the fast path.

> +      <li>
> +        Ethernet local broadcast.  A priority-190 flow with match 
> <code>eth.dst
> +        == ff:ff:ff:ff:ff:ff</code> drops traffic destined to the local
> +        Ethernet broadcast address.  By definition this traffic should not be
> +        forwarded.
> +      </li>
> +
> +      <li>
> +        Drop IP multicast.  A priority-190 flow with match 
> <code>ip.dst[28..31]
> +        == 0xe</code> drops IP multicast traffic.
> +      </li>

We may want to drop traffic sent to an IP broadcast address to prevent things 
like Smurf attacks.

> +        <p>
> +          Unknown MAC bindings.  For each non-gateway route to IPv4 network
> +          <var>N</var> with netmask <var>M</var> on router port <var>P</var>
> +          that owns IP address <var>A</var> and Ethernet address 
> <var>E</var>,
> +          a logical flow with match <code>ip.dst ==
> +          <var>N</var>/<var>M</var></code>, whose priority is the number of
> +          1-bits in <var>M</var>, has the following actions:
> +        </p>
> +
> +        <pre>
> +ratelimit;

I don't think "ratelimit" is defined anywhere.

> +arp {
> +    eth.dst = ff:ff:ff:ff:ff:ff;
> +    eth.src = <var>E</var>;
> +    arp.sha = <var>E</var>;
> +    arp.tha = 00:00:00:00:00:00;
> +    arp.spa = <var>A</var>;
> +    arp.tpa = ip.dst;
> +    outport = <var>P</var>;
> +    output;

Should you set the ARP opcode?

> +        <dt><code>ip4.ttl--;</code></dt>
> +        <dd>
> +          <p>
> +            Decrements the IPv4 TTL.  If this would make the TTL zero or
> +            negative, then processing of the packet halts; no further actions
> +            are processed.  (To properly handle such cases, a higher-priority
> +            flow should match on <code>ip.ttl &lt; 2</code>.)
> +          </p>
> +
> +          <p><b>Prerequisite:</b> <code>ip4</code></p>

Is IPv6 that different?

>         </dd>
> 
> -        <dt><code>learn</code></dt>
> +        <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
> +        <dd>
> +          <p>
> +            Temporarily replaces the IPv4 packet being processed by an ARP
> +            packet and executes each nested <var>action</var> on the ARP
> +            packet.  Actions following the <var>arp</var> action, if any, 
> apply
> +            to the original, unmodified packet.
> +          </p>

So what would we normally do with the original packet?  Do we just drop it?  
That seems kind of unfortunate.

> 
> -        <dt><code>dec_ttl { <var>action</var>, </code>...<code> } { 
> <var>action</var>; </code>...<code>};</code></dt>
> +        <dt><code>icmp4 { <var>action</var>; </code>...<code> };</code></dt>
>         <dd>
> -          decrement TTL; execute first set of actions if
> -          successful, second set if TTL decrement fails
> +          <p>
> +            Temporarily replaces the IPv4 packet being processed by an ICMPv4
> +            packet and executes each nested <var>action</var> on the ARP

Do you mean IPv4 instead of ARP?

> +            packet.  Actions following the <var>icmp4</var> action, if any,
> +            apply to the original, unmodified packet.
> +          </p>
> +
> +          <p>
> +            The ICMPv4 packet that this action operates on is initialized 
> based
> +            on the IPv4 packet being processed, as follows.  Ethernet and 
> IPv4
> +            fields not listed here are not changed:
> +          </p>
> +
> +          <ul>
> +            <li><code>ip.proto = 1</code> (ICMPv4)</li>
> +            <li><code>ip.frag = 0</code> (not a fragment)</li>
> +            <li><code>icmp4.type = 3</code> (destination unreachable)</li>
> +            <li><code>icmp4.code = 1</code> (host unreachable)</li>

I assume this is just an example since we support other types and codes, so may 
want to mention that in the description.

I'm going to spend some more time thinking about it, and then I'll follow up 
with some additional feedback.

Thanks again.

--Justin


_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to