Tue, Dec 06, 2016 at 11:51:46AM CET, simon.hor...@netronome.com wrote:
>On Mon, Dec 05, 2016 at 09:29:45AM -0800, Tom Herbert wrote:
>> On Mon, Dec 5, 2016 at 6:23 AM, Simon Horman <simon.hor...@netronome.com> 
>> wrote:
>> > On Sat, Dec 03, 2016 at 10:52:32AM -0800, Tom Herbert wrote:
>> >> On Sat, Dec 3, 2016 at 2:49 AM, Jiri Pirko <j...@resnulli.us> wrote:
>> >> > Fri, Dec 02, 2016 at 09:31:41PM CET, simon.hor...@netronome.com wrote:
>> >> >>Allow dissection of ICMP(V6) type and code. This re-uses transport layer
>> >> >>port dissection code as although ICMP is not a transport protocol and 
>> >> >>their
>> >> >>type and code are not ports this allows sharing of both code and 
>> >> >>storage.
>> >> >>
>> >> >>Signed-off-by: Simon Horman <simon.hor...@netronome.com>
>> >
>> > ...
>> >
>> >> > Digging into this a bit more. I think it would be much nice not to mix
>> >> > up l4 ports and icmp stuff.
>> >> >
>> >> > How about to have FLOW_DISSECTOR_KEY_ICMP
>> >> > and
>> >> > struct flow_dissector_key_icmp {
>> >> >         u8 type;
>> >> >         u8 code;
>> >> > };
>> >> >
>> >> > The you can make this structure and struct flow_dissector_key_ports into
>> >> > an union in struct flow_keys.
>> >> >
>> >> > Looks much cleaner to me.
>> >
>> > Hi Jiri,
>> >
>> > I wondered about that too. The main reason that I didn't do this the first
>> > time around is that I wasn't sure what to do to differentiate between ports
>> > and icmp in fl_init_dissector() given that using a union would could to
>> > mask bits being set in both if they are set in either.
>> >
>> > I've taken a stab at addressing that below along with implementing your
>> > suggestion. If the treatment in fl_init_dissector() is acceptable
>> > perhaps something similar should be done for 
>> > FLOW_DISSECTOR_KEY_IPV[46]_ADDRS
>> > too?
>> >
>> >> I agree, this patch adds to many conditionals into the fast path for
>> >> ICMP handling. Neither is there much point in using type and code as
>> >> input to the packet hash.
>> >
>> > Hi Tom,
>> >
>> > I'm not sure that I follow this. A packet may be ICMP or not regardless of
>> > if FLOW_DISSECTOR_KEY_PORT (and now FLOW_DISSECTOR_KEY_ICMP) port is set or
>> > not. I'd appreciate some guidance here.
>> >
>> > First-pass at implementing Jiri's suggestion follows:
>> >
>
>...
>
>> > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>> > index 0584b4bb4390..33e5fa2b3e87 100644
>> > --- a/net/core/flow_dissector.c
>> > +++ b/net/core/flow_dissector.c
>
>...
>
>> > @@ -975,6 +983,10 @@ static const struct flow_dissector_key 
>> > flow_keys_dissector_keys[] = {
>> >                 .offset = offsetof(struct flow_keys, ports),
>> >         },
>> >         {
>> > +               .key_id = FLOW_DISSECTOR_KEY_ICMP,
>> > +               .offset = offsetof(struct flow_keys, icmp),
>> > +       },
>> 
>> This is the problem I was referring to. This adds ICMP to the keys for
>> computing the skb hash and the ICMP type and code are in a union with
>> port numbers so they are in the range over which the hash is computed.
>> This means that we are treating type and code as some sort of flow and
>> and so different type/code between the same src/dst could follow
>> different paths in ECMP. For the purposes of computing a packet hash I
>> don't think we want ICMP information included. This might be a matter
>> of not putting FLOW_DISSECTOR_KEY_ICMP in flow_keys_dissector_keys,
>> where we need this information we could define another structure that
>> has all the same fields as in flow_keys_dissector_keys and include
>> FLOW_DISSECTOR_KEY_ICMP. Alternatively, the flow_dissector_key_icmp
>> could also be outside of the area that is used in the hash: that is no
>> in flow_keys_hash_start(keys) to flow_keys_hash_length(keys), keyval);
>
>...
>
>> > diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>> > index c96b2a349779..f4ba69bd362f 100644
>> > --- a/net/sched/cls_flower.c
>> > +++ b/net/sched/cls_flower.c
>> > @@ -38,7 +38,10 @@ struct fl_flow_key {
>> >                 struct flow_dissector_key_ipv4_addrs ipv4;
>> >                 struct flow_dissector_key_ipv6_addrs ipv6;
>> >         };
>> > -       struct flow_dissector_key_ports tp;
>> > +       union {
>> > +               struct flow_dissector_key_ports tp;
>> > +               struct flow_dissector_key_icmp icmp;
>> > +       };
>> 
>> When an ICMP packet is encapsulated within UDP then icmp overwrites
>> valid port information with is a stronger signal of the flow (unless
>> FLOW_DISSECTOR_F_STOP_AT_ENCAP is set). This is another reason not to
>> use a union with ports.
>
>...
>
>Hi Tom,
>
>thanks for your explanation. I think I have a clearer picture now.
>
>I have reworked things to try to address your concerns.
>In particular:
>
>* FLOW_DISSECTOR_KEY_ICMP is no longer added to flow_keys_dissector_keys.
>  I don't think it is needed at all for the use-case I currently have in
>  mind, which is classifying packets using tc_flower. And adding it there
>  was an error on my part.

I was just about to ask why you are adding it there. Good.


>
>* I stopped using a union for ports and icmp. At the very least this seems
>  to make it easier to reason about things as we no longer need to consider
>  that a port value may in fact be an ICMP type or code.

This should be within csl_flower code. You can easily have it as a union
in struct fl_flow_key. 


>
>  This seems to allow us avoid adding a number of is_icmp checks (as I believe
>  you pointed out earlier). And should also address the problem you state
>  immediately above.

I don't understand the check you are talking about. The union just allow
to share the mem. I don't see any checks needed.


>
>* I have also placed icmp outside of the range flow_keys_hash_start(keys)
>  to flow_keys_hash_length(keys), keyval). This is because I now see no
>  value of having it inside that range and it both avoids any chance of
>  contaminating the hash with ICMP values and hashing over unwanted
>  (hopefully zero) values.

Okay, now I'm confused again. You don't want to add this to
flow_keys_dissector_keys. Why would you?


>
>  This required an update to flow_keys_hash_length() as the bound
>  of the end of fields the hash is no longer the end of struct flow_keys.
>  It seems clean but I wonder if there are similar assumptions lurking
>  elsewhere.
>
>I have lightly tested this for the tc_flower case (only).
>
>Incremental patch on top of this series:
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index a6f75cfb2bf7..8029dd4912b6 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -3181,8 +3181,7 @@ static bool bond_flow_dissect(struct bonding *bond, 
>struct sk_buff *skb,
>       } else {
>               return false;
>       }
>-      if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34 &&
>-          proto >= 0 && !skb_flow_is_icmp_any(skb, proto))
>+      if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34 && proto >= 0)
>               fk->ports.ports = skb_flow_get_ports(skb, noff, proto);
> 
>       return true;
>@@ -3210,8 +3209,7 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff 
>*skb)
>               return bond_eth_hash(skb);
> 
>       if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER23 ||
>-          bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP23 ||
>-          flow_keys_are_icmp_any(&flow))
>+          bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP23)
>               hash = bond_eth_hash(skb);
>       else
>               hash = (__force u32)flow.ports.ports;
>diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>index 44a8f69a9198..9c535fbccf2c 100644
>--- a/include/linux/skbuff.h
>+++ b/include/linux/skbuff.h
>@@ -1094,11 +1094,6 @@ u32 __skb_get_poff(const struct sk_buff *skb, void 
>*data,
> __be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
>                           void *data, int hlen_proto);
> 
>-static inline bool skb_flow_is_icmp_any(const struct sk_buff *skb, u8 
>ip_proto)
>-{
>-      return flow_protos_are_icmp_any(skb->protocol, ip_proto);
>-}
>-
> static inline __be32 skb_flow_get_ports(const struct sk_buff *skb,
>                                       int thoff, u8 ip_proto)
> {
>diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
>index 5540dfa18872..058c9df8a4d8 100644
>--- a/include/net/flow_dissector.h
>+++ b/include/net/flow_dissector.h
>@@ -91,14 +91,10 @@ struct flow_dissector_key_addrs {
> 
> /**
>  * flow_dissector_key_ports:
>- *    @ports: port numbers of Transport header or
>- *            type and code of ICMP header
>+ *    @ports: port numbers of Transport header
>  *            ports: source (high) and destination (low) port numbers
>  *            src: source port number
>  *            dst: destination port number
>- *            icmp: ICMP type (high) and code (low)
>- *            type: ICMP type
>- *            type: ICMP code
>  */
> struct flow_dissector_key_ports {
>       union {
>@@ -107,6 +103,18 @@ struct flow_dissector_key_ports {
>                       __be16 src;
>                       __be16 dst;
>               };
>+      };
>+};
>+
>+/**
>+ * flow_dissector_key_icmp:
>+ *    @ports: type and code of ICMP header
>+ *            icmp: ICMP type (high) and code (low)
>+ *            type: ICMP type
>+ *            code: ICMP code
>+ */
>+struct flow_dissector_key_icmp {
>+      union {
>               __be16 icmp;
>               struct {
>                       u8 type;
>@@ -115,7 +123,6 @@ struct flow_dissector_key_ports {
>       };
> };
> 
>-
> /**
>  * struct flow_dissector_key_eth_addrs:
>  * @src: source Ethernet address
>@@ -133,6 +140,7 @@ enum flow_dissector_key_id {
>       FLOW_DISSECTOR_KEY_IPV4_ADDRS, /* struct flow_dissector_key_ipv4_addrs 
> */
>       FLOW_DISSECTOR_KEY_IPV6_ADDRS, /* struct flow_dissector_key_ipv6_addrs 
> */
>       FLOW_DISSECTOR_KEY_PORTS, /* struct flow_dissector_key_ports */
>+      FLOW_DISSECTOR_KEY_ICMP, /* struct flow_dissector_key_icmp */
>       FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */
>       FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs 
> */
>       FLOW_DISSECTOR_KEY_VLAN, /* struct flow_dissector_key_flow_vlan */
>@@ -173,11 +181,16 @@ struct flow_keys {
>       struct flow_dissector_key_keyid keyid;
>       struct flow_dissector_key_ports ports;
>       struct flow_dissector_key_addrs addrs;
>+#define FLOW_KEYS_HASH_END_FIELD addrs
>+      struct flow_dissector_key_icmp icmp;
> };
> 
> #define FLOW_KEYS_HASH_OFFSET         \
>       offsetof(struct flow_keys, FLOW_KEYS_HASH_START_FIELD)
> 
>+#define FLOW_KEYS_HASH_END            \
>+      offsetofend(struct flow_keys, FLOW_KEYS_HASH_END_FIELD)
>+
> __be32 flow_get_u32_src(const struct flow_keys *flow);
> __be32 flow_get_u32_dst(const struct flow_keys *flow);
> 
>@@ -233,8 +246,7 @@ static inline bool flow_keys_are_icmp_any(const struct 
>flow_keys *keys)
> 
> static inline bool flow_keys_have_l4(const struct flow_keys *keys)
> {
>-      return (!flow_keys_are_icmp_any(keys) && keys->ports.ports) ||
>-              keys->tags.flow_label;
>+      return (keys->ports.ports || keys->tags.flow_label);
> }
> 
> u32 flow_hash_from_keys(struct flow_keys *keys);
>diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>index 0584b4bb4390..ed6d46475343 100644
>--- a/net/core/flow_dissector.c
>+++ b/net/core/flow_dissector.c
>@@ -139,6 +139,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>       struct flow_dissector_key_basic *key_basic;
>       struct flow_dissector_key_addrs *key_addrs;
>       struct flow_dissector_key_ports *key_ports;
>+      struct flow_dissector_key_icmp *key_icmp;
>       struct flow_dissector_key_tags *key_tags;
>       struct flow_dissector_key_vlan *key_vlan;
>       struct flow_dissector_key_keyid *key_keyid;
>@@ -559,18 +560,25 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>               break;
>       }
> 
>-      if (dissector_uses_key(flow_dissector,
>-                             FLOW_DISSECTOR_KEY_PORTS)) {
>-              key_ports = skb_flow_dissector_target(flow_dissector,
>-                                                    FLOW_DISSECTOR_KEY_PORTS,
>-                                                    target_container);
>-              if (flow_protos_are_icmp_any(proto, ip_proto))
>-                      key_ports->icmp = skb_flow_get_be16(skb, nhoff, data,
>-                                                          hlen);
>-              else
>+      if (flow_protos_are_icmp_any(proto, ip_proto)) {
>+              if (dissector_uses_key(flow_dissector,
>+                                     FLOW_DISSECTOR_KEY_ICMP)) {
>+                      key_icmp = skb_flow_dissector_target(flow_dissector,
>+                                                           
>FLOW_DISSECTOR_KEY_ICMP,
>+                                                           target_container);
>+                      key_icmp->icmp = skb_flow_get_be16(skb, nhoff, data,
>+                                                         hlen);
>+              }
>+      } else {
>+              if (dissector_uses_key(flow_dissector,
>+                                     FLOW_DISSECTOR_KEY_PORTS)) {
>+                      key_ports = skb_flow_dissector_target(flow_dissector,
>+                                                            
>FLOW_DISSECTOR_KEY_PORTS,
>+                                                            target_container);
>                       key_ports->ports = __skb_flow_get_ports(skb, nhoff,
>                                                               ip_proto, data,
>                                                               hlen);
>+              }
>       }
> 
> out_good:
>@@ -613,9 +621,10 @@ static inline const u32 *flow_keys_hash_start(const 
>struct flow_keys *flow)
> static inline size_t flow_keys_hash_length(const struct flow_keys *flow)
> {
>       size_t diff = FLOW_KEYS_HASH_OFFSET + sizeof(flow->addrs);
>-      BUILD_BUG_ON((sizeof(*flow) - FLOW_KEYS_HASH_OFFSET) % sizeof(u32));
>+      BUILD_BUG_ON((FLOW_KEYS_HASH_END - FLOW_KEYS_HASH_OFFSET) %
>+                   sizeof(u32));
>       BUILD_BUG_ON(offsetof(typeof(*flow), addrs) !=
>-                   sizeof(*flow) - sizeof(flow->addrs));
>+                   FLOW_KEYS_HASH_END - sizeof(flow->addrs));
> 
>       switch (flow->control.addr_type) {
>       case FLOW_DISSECTOR_KEY_IPV4_ADDRS:
>@@ -628,7 +637,7 @@ static inline size_t flow_keys_hash_length(const struct 
>flow_keys *flow)
>               diff -= sizeof(flow->addrs.tipcaddrs);
>               break;
>       }
>-      return (sizeof(*flow) - diff) / sizeof(u32);
>+      return (FLOW_KEYS_HASH_END - diff) / sizeof(u32);
> }
> 
> __be32 flow_get_u32_src(const struct flow_keys *flow)
>@@ -745,8 +754,7 @@ void make_flow_keys_digest(struct flow_keys_digest *digest,
> 
>       data->n_proto = flow->basic.n_proto;
>       data->ip_proto = flow->basic.ip_proto;
>-      if (flow_keys_have_l4(flow))
>-              data->ports = flow->ports.ports;
>+      data->ports = flow->ports.ports;
>       data->src = flow->addrs.v4addrs.src;
>       data->dst = flow->addrs.v4addrs.dst;
> }
>diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
>index 408313e33bbe..6575aba87630 100644
>--- a/net/sched/cls_flow.c
>+++ b/net/sched/cls_flow.c
>@@ -96,7 +96,7 @@ static u32 flow_get_proto(const struct sk_buff *skb,
> static u32 flow_get_proto_src(const struct sk_buff *skb,
>                             const struct flow_keys *flow)
> {
>-      if (!flow_keys_are_icmp_any(flow) && flow->ports.ports)
>+      if (flow->ports.ports)
>               return ntohs(flow->ports.src);
> 
>       return addr_fold(skb->sk);
>@@ -105,7 +105,7 @@ static u32 flow_get_proto_src(const struct sk_buff *skb,
> static u32 flow_get_proto_dst(const struct sk_buff *skb,
>                             const struct flow_keys *flow)
> {
>-      if (!flow_keys_are_icmp_any(flow) && flow->ports.ports)
>+      if (flow->ports.ports)
>               return ntohs(flow->ports.dst);
> 
>       return addr_fold(skb_dst(skb)) ^ (__force u16) tc_skb_protocol(skb);
>diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>index c96b2a349779..56df0368125a 100644
>--- a/net/sched/cls_flower.c
>+++ b/net/sched/cls_flower.c
>@@ -39,6 +39,7 @@ struct fl_flow_key {
>               struct flow_dissector_key_ipv6_addrs ipv6;
>       };
>       struct flow_dissector_key_ports tp;
>+      struct flow_dissector_key_icmp icmp;
>       struct flow_dissector_key_keyid enc_key_id;
>       union {
>               struct flow_dissector_key_ipv4_addrs enc_ipv4;
>@@ -511,19 +512,23 @@ static int fl_set_key(struct net *net, struct nlattr 
>**tb,
>                              &mask->tp.dst, TCA_FLOWER_KEY_SCTP_DST_MASK,
>                              sizeof(key->tp.dst));
>       } else if (flow_basic_key_is_icmpv4(&key->basic)) {
>-              fl_set_key_val(tb, &key->tp.type, TCA_FLOWER_KEY_ICMPV4_TYPE,
>-                             &mask->tp.type, TCA_FLOWER_KEY_ICMPV4_TYPE_MASK,
>-                             sizeof(key->tp.type));
>-              fl_set_key_val(tb, &key->tp.code, TCA_FLOWER_KEY_ICMPV4_CODE,
>-                             &mask->tp.code, TCA_FLOWER_KEY_ICMPV4_CODE_MASK,
>-                             sizeof(key->tp.code));
>+              fl_set_key_val(tb, &key->icmp.type, TCA_FLOWER_KEY_ICMPV4_TYPE,
>+                             &mask->icmp.type,
>+                             TCA_FLOWER_KEY_ICMPV4_TYPE_MASK,
>+                             sizeof(key->icmp.type));
>+              fl_set_key_val(tb, &key->icmp.code, TCA_FLOWER_KEY_ICMPV4_CODE,
>+                             &mask->icmp.code,
>+                             TCA_FLOWER_KEY_ICMPV4_CODE_MASK,
>+                             sizeof(key->icmp.code));
>       } else if (flow_basic_key_is_icmpv6(&key->basic)) {
>-              fl_set_key_val(tb, &key->tp.type, TCA_FLOWER_KEY_ICMPV6_TYPE,
>-                             &mask->tp.type, TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,
>-                             sizeof(key->tp.type));
>-              fl_set_key_val(tb, &key->tp.code, TCA_FLOWER_KEY_ICMPV4_CODE,
>-                             &mask->tp.code, TCA_FLOWER_KEY_ICMPV4_CODE_MASK,
>-                             sizeof(key->tp.code));
>+              fl_set_key_val(tb, &key->icmp.type, TCA_FLOWER_KEY_ICMPV6_TYPE,
>+                             &mask->icmp.type,
>+                             TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,
>+                             sizeof(key->icmp.type));
>+              fl_set_key_val(tb, &key->icmp.code, TCA_FLOWER_KEY_ICMPV4_CODE,
>+                             &mask->icmp.code,
>+                             TCA_FLOWER_KEY_ICMPV4_CODE_MASK,
>+                             sizeof(key->icmp.code));
>       }
> 
>       if (tb[TCA_FLOWER_KEY_ENC_IPV4_SRC] ||
>@@ -634,6 +639,8 @@ static void fl_init_dissector(struct cls_fl_head *head,
>       FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
>                            FLOW_DISSECTOR_KEY_PORTS, tp);
>       FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
>+                           FLOW_DISSECTOR_KEY_ICMP, icmp);
>+      FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
>                            FLOW_DISSECTOR_KEY_VLAN, vlan);
>       FL_KEY_SET_IF_MASKED(&mask->key, keys, cnt,
>                            FLOW_DISSECTOR_KEY_ENC_KEYID, enc_key_id);
>@@ -1000,24 +1007,24 @@ static int fl_dump(struct net *net, struct tcf_proto 
>*tp, unsigned long fh,
>                                 sizeof(key->tp.dst))))
>               goto nla_put_failure;
>       else if (flow_basic_key_is_icmpv4(&key->basic) &&
>-               (fl_dump_key_val(skb, &key->tp.type,
>-                                TCA_FLOWER_KEY_ICMPV4_TYPE, &mask->tp.type,
>+               (fl_dump_key_val(skb, &key->icmp.type,
>+                                TCA_FLOWER_KEY_ICMPV4_TYPE, &mask->icmp.type,
>                                 TCA_FLOWER_KEY_ICMPV4_TYPE_MASK,
>-                                sizeof(key->tp.type)) ||
>-                fl_dump_key_val(skb, &key->tp.code,
>-                                TCA_FLOWER_KEY_ICMPV4_CODE, &mask->tp.code,
>+                                sizeof(key->icmp.type)) ||
>+                fl_dump_key_val(skb, &key->icmp.code,
>+                                TCA_FLOWER_KEY_ICMPV4_CODE, &mask->icmp.code,
>                                 TCA_FLOWER_KEY_ICMPV4_CODE_MASK,
>-                                sizeof(key->tp.code))))
>+                                sizeof(key->icmp.code))))
>               goto nla_put_failure;
>       else if (flow_basic_key_is_icmpv6(&key->basic) &&
>-               (fl_dump_key_val(skb, &key->tp.type,
>-                                TCA_FLOWER_KEY_ICMPV6_TYPE, &mask->tp.type,
>+               (fl_dump_key_val(skb, &key->icmp.type,
>+                                TCA_FLOWER_KEY_ICMPV6_TYPE, &mask->icmp.type,
>                                 TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,
>-                                sizeof(key->tp.type)) ||
>-                fl_dump_key_val(skb, &key->tp.code,
>-                                TCA_FLOWER_KEY_ICMPV6_CODE, &mask->tp.code,
>+                                sizeof(key->icmp.type)) ||
>+                fl_dump_key_val(skb, &key->icmp.code,
>+                                TCA_FLOWER_KEY_ICMPV6_CODE, &mask->icmp.code,
>                                 TCA_FLOWER_KEY_ICMPV6_CODE_MASK,
>-                                sizeof(key->tp.code))))
>+                                sizeof(key->icmp.code))))
>               goto nla_put_failure;
> 
>       if (key->enc_control.addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS &&

Reply via email to