Re: [ovs-dev] MPLS important comments (was: Re: Patch for MPLS)

Ravi.Kerur Tue, 13 Mar 2012 11:39:22 -0700

Thanks Ben, on what Linux version you are testing this? When I sent out the 
patch yesterday I had tested "make C=1" on Ubuntu 11.10(3.0 kernel) and CentOS 
6.2(2.6 kernel) and had not seen sparse issues. I re-ran "make C=1" again today 
with the patch I sent out yesterday and didn't see any warnings. Please let me 
know on what version you are running this I will fix it.


Thanks
Ravi

-----Original Message-----
From: Ben Pfaff [mailto:b...@nicira.com]
Sent: Tuesday, March 13, 2012 10:36 AM
To: Kerur, Ravi
Cc: je...@nicira.com; dev@openvswitch.org
Subject: Re: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS)

Thanks for the patch, it applies cleanly against current "master".  (I
had missed it on the first pass through my inbox this morning.)

I see the following "sparse" warnings for the kernel code.  It
otherwise builds cleanly, and I can confirm that the unit tests pass
for me.

actions.c:244:36: warning: incorrect type in assignment (different base types)
actions.c:244:36:    expected unsigned int [unsigned] [usertype] <noident>
actions.c:244:36:    got restricted __be32 [usertype] <noident>
actions.c:265:32: warning: cast to restricted __be32
actions.c:265:32: warning: cast to restricted __be32
actions.c:265:32: warning: cast to restricted __be32
actions.c:265:32: warning: cast to restricted __be32
actions.c:265:32: warning: cast to restricted __be32
actions.c:265:32: warning: cast to restricted __be32
actions.c:325:32: warning: cast to restricted __be32
actions.c:325:32: warning: cast to restricted __be32
actions.c:325:32: warning: cast to restricted __be32
actions.c:325:32: warning: cast to restricted __be32
actions.c:325:32: warning: cast to restricted __be32
actions.c:325:32: warning: cast to restricted __be32
actions.c:346:45: warning: incorrect type in assignment (different base types)
actions.c:346:45:    expected unsigned int [unsigned] [usertype] <noident>
actions.c:346:45:    got restricted __be32 [usertype] <noident>
actions.c:362:32: warning: cast to restricted __be32
actions.c:362:32: warning: cast to restricted __be32
actions.c:362:32: warning: cast to restricted __be32
actions.c:362:32: warning: cast to restricted __be32
actions.c:362:32: warning: cast to restricted __be32
actions.c:362:32: warning: cast to restricted __be32
actions.c:369:45: warning: incorrect type in assignment (different base types)
actions.c:369:45:    expected unsigned int [unsigned] [usertype] <noident>
actions.c:369:45:    got restricted __be32 [usertype] <noident>
actions.c:386:32: warning: cast to restricted __be32
actions.c:386:32: warning: cast to restricted __be32
actions.c:386:32: warning: cast to restricted __be32
actions.c:386:32: warning: cast to restricted __be32
actions.c:386:32: warning: cast to restricted __be32
actions.c:386:32: warning: cast to restricted __be32
actions.c:393:41: warning: cast to restricted __be32
actions.c:393:41: warning: cast to restricted __be32
actions.c:393:41: warning: cast to restricted __be32
actions.c:393:41: warning: cast to restricted __be32
actions.c:393:41: warning: cast to restricted __be32
actions.c:393:41: warning: cast to restricted __be32
actions.c:396:58: warning: incorrect type in assignment (different base types)
actions.c:396:58:    expected unsigned int [unsigned] [usertype] <noident>
actions.c:396:58:    got restricted __be32 [usertype] <noident>
actions.c:412:32: warning: cast to restricted __be32
actions.c:412:32: warning: cast to restricted __be32
actions.c:412:32: warning: cast to restricted __be32
actions.c:412:32: warning: cast to restricted __be32
actions.c:412:32: warning: cast to restricted __be32
actions.c:412:32: warning: cast to restricted __be32
actions.c:419:53: warning: incorrect type in assignment (different base types)
actions.c:419:53:    expected unsigned int [unsigned] [usertype] <noident>
actions.c:419:53:    got restricted __be32 [usertype] <noident>
actions.c:421:41: warning: cast to restricted __be32
actions.c:421:41: warning: cast to restricted __be32
actions.c:421:41: warning: cast to restricted __be32
actions.c:421:41: warning: cast to restricted __be32
actions.c:421:41: warning: cast to restricted __be32
actions.c:421:41: warning: cast to restricted __be32
actions.c:425:53: warning: incorrect type in assignment (different base types)
actions.c:425:53:    expected unsigned int [unsigned] [usertype] <noident>
actions.c:425:53:    got restricted __be32 [usertype] <noident>

Thanks,

Ben.

On Tue, Mar 13, 2012 at 12:39:02AM +0100, ravi.ke...@telekom.com wrote:
> Attached latest patch after incorporating code review comments. I
> will go through code review comments one more time to make sure
> nothing is missed.
>
> Jesse: If you could please review kernel changes it will be helpful.
>
> I have tested code after changes on CentOS 6.2 and Ubuntu 11.10.
>
> "make check" and "make distcheck" matches latest git code results.
>
> Thanks
> Ravi
>
> -----Original Message-----
> From: Kerur, Ravi
> Sent: Monday, March 12, 2012 1:58 PM
> To: Ben Pfaff
> Cc: dev@openvswitch.org
> Subject: RE: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS)
>
> Thanks Ben for the review. Please see inline for <rk>, mostly agreeing to 
> changes, but some requires inputs/clarifications.
>
> -----Original Message-----
> From: Ben Pfaff [mailto:b...@nicira.com]
> Sent: Friday, March 09, 2012 11:20 AM
> To: Kerur, Ravi
> Cc: dev@openvswitch.org
> Subject: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS)
>
> Thank you for the patch!  I have some comments.  I'm dividing them
> into two groups.  The first group, in this email, is comments that
> point out likely bugs.  The second group, which I will send
> separately, is comments that point out less important issues.
>
> I didn't review the kernel code.  That will have to wait for Jesse.
>
>
> include/linux/openvswitch.h
> ---------------------------
>
> We don't usually create structs for the arguments when the arguments
> are simple types such as a single integer.  Instead, just indicate in
> the comment on the side the type and in the big comment on "enum
> ovs_action_attr" the effect of the action.  Actually every new action
> needs to be explained in the big comment.
>
> Your change renumbers OVS_ACTION_ATTR_SAMPLE but we can't do that
> because it's part of the ABI.
>
> <rk> When I started implementation I thought it will be good to store it in a 
> variable and might have to reuse in some other part of the code. On a second 
> look it doesn't look like that's the case. I have made necessary changes.
>
> lib/dpif-netdev.c
> -----------------
>
> OVS_ACTION_ATTR_SET_MPLS_LABEL takes an entire MPLS LSE as its
> argument, and the odp-util.c code parses an entire MPLS LSE (except
> the bottom-of-stack bit, which I've commented on elsewhere), but the
> dpif-netdev.c code only sets the "label" bits from the LSE into the
> packet.  The "datapath" version of the code, on the other hand, does
> something more complicated (it appears to set all the fields within
> the LSE that are nonzero into the packet).  This looks very
> inconsistent and probably wrong.
>
> <rk> OVS_ACTION_ATTR_SET_MPLS_LABEL was a misnomer. Changed it to 
> OVS_ACTION_ATTR_SET_MPLS_LSE. Moreover, userspace code was not tested until 
> after potential fix was sent out. It's now inline with kernel and rest of the 
> code base.
>
> lib/flow.c
> ----------
>
> This adds some support for 802.1ad but I'd prefer to keep that
> separate from MPLS support, so please remove the 802.1ad stuff from
> this patch.
>
> <rk> Done.
>
> flow_extract() now has code to skip over inner VLAN tags.  What's the
> rationale for that?  It seems unwise to me, because it makes it
> impossible for a controller to distinguish a packet with one VLAN tag
> from one with a dozen.  The implementation is also not correct as far
> as I can tell, because it can overrun the end of the packet.
>
> For the same reason I wonder whether we should really skip over all
> the MPLS labels?
>
> <rk> I don't know of any use-case where a packet can carry more than 2 vlan 
> tags. Idea is to parse all vlan_tags/mpls_labels such that packet->l3 is 
> correctly set after the parse. While parsing multiple tags/labels update 
> flow->vlan_tci and flow->mpls_lse only after parsing first tag/label. If this 
> is not the right way to do it let me know how you plan to handle this. I will 
> incorporate it.
>
> I found the flow_extract() treatment confusing.  It skips past all the
> MPLS labels, which would imply that we are going to look at packet
> data beyond them, but that will in fact never happen (because the
> dl_type is always MPLS).  We do want to skip the MPLS header so that
> we can set the l3 pointer properly.
>
> <rk> Please see above response.
>
> flow_is_mpls_label_valid() looks wrong to me, because I think that
> the left-shift in computing 'tmp_label' could discard some high 1-bits
> that aren't really valid.  I would consider writing it as:
>     return !(mpls_label & htonl(MPLS_LABEL_MASK >> MPLS_LABEL_SHIFT));
> which I think has the desired effect.
>
> <rk> Fixed this.
>
> flow_format() doesn't show the "bottom of stack" bit.  Should it?  (It
> looks like the kernel populates it from the packet, so I would expect
> so.)
>
> <rk> Good to have it. Added stack information.
>
> I have a concern about flow_hash_symmetric_l4().  This function is not
> well-documented, but it's intent is that packets in either direction
> in a pair of related flows (e.g. the two flows that make up a TCP
> connection) will have the same hash.  Will two flows related in that
> way have the same MPLS LSE?  This function should only hash the
> mpls_lse if the answer is "yes".
>
> <rk> Theoretically the answer is "yes" but I am not 100% on it. I would like 
> to leave it there unless I see it's not needed at all.
>
> lib/meta-flow.c
> ---------------
>
> I guess that MFF_MPLS_LABEL and MFP_MPLS_TC should have a
> prerequisite, because the MPLS label may only be matched if the
> Ethernet type indicates that the packet is a MPLS packet (right?).  So
> I would add an MFP_MPLS and appropriate support for it.
>
> <rk> Fixed it.
>
> I think that the definitions in mf_is_value_valid() and mf_get_value()
> are contradictory.  mf_is_value_valid() insists that the label be in
> the least significant 20 bits of value->be32, but mf_get_value() puts
> it in the most significant 20 bits.  I suggest that both should use
> the least significant 20 bits.
>
> I think that mf_random_value() has the same problem.
>
> <rk> Code is written with flow->mpls_lse following MPLS format because it is 
> easier for anyone debugging it. Issue is when a user configures for a match 
> based on a label it's in lower 20 bits. In mf_set_value, flow->mpls_lse is 
> updated and it updates according to MPLS format. For nx_put_match, it follows 
> the spec by putting the label in lower 20bits. mf_is_value_valid is called 
> from nx_pull_match in which case label is in lower 20bits and hence the 
> deviation.
>
> mf_set_value, mf_get_value and mf_random_value follow the same format. I am 
> bit reluctant to make changes now since it requires complete code changes and 
> I am not seeing any added value in doing it.
>
>
> lib/nx-match.c
> --------------
>
> Like flow_format(), I would expect format_mpls_label() to print the
> "bottom of stack" bit.  Similarly with the parsing code in
> parse_odp_actions() and parse_odp_key_attr().
>
> format_odp_action() and other code expects that
> OVS_ACTION_ATTR_POP_MPLS has an ethertype argument, but
> parse_odp_action() doesn't supply one.
>
> <rk> Fixed it.
>
> lib/odp-util.h
> --------------
>
> In parse_odp_action(), the number here should be 16, not 11, in two
> places:
> +    if (!strncmp(s, "copy_mpls_ttl_in", 11)) {
> +        nl_msg_put_flag(actions, OVS_ACTION_ATTR_COPY_TTL_IN);
> +        return 11;
> +    }
>
> and here it should be 17, not 12, in two places:
> +    if (!strncmp(s, "copy_mpls_ttl_out", 12)) {
> +        nl_msg_put_flag(actions, OVS_ACTION_ATTR_COPY_TTL_OUT);
> +        return 12;
> +    }
>
> <rk> Initially it was just "copy_ttl_in/out", later on I changed it to 
> "copy_mpls_ttl_in/out", but forgot to update it here. Fixed it now.
>
> lib/ofp-parse.c
> ---------------
>
> In parse_protocol(), there are two entries for "mpls".  Only the first
> one will have an effect, the second one would need a different name.
>
> <rk> Agreed. Have changed it i.e mpls -> unicast, mplsm -> multicast
>
> lib/ofp-print.c
> ---------------
>
> In ofp_print_action(), please put "0x" into the following formats.
> Otherwise, the values printed will be parsed as decimal if they are
> fed back into the parser:
> +   case OFPUTIL_NXAST_PUSH_MPLS:
> +        nampush = (const struct nx_action_push_mpls *) a;
> +        ds_put_format(s, "push_mpls:%"PRIx16, ntohs(nampush->ethertype));
> +        break;
> +
> +    case OFPUTIL_NXAST_POP_MPLS:
> +        nampop = (const struct nx_action_pop_mpls *) a;
> +        ds_put_format(s, "pop_mpls:%"PRIx16, ntohs(nampop->ethertype));
> +        break;
>
> <rk> Fixed.
>
>
> In ofp_match_to_string(), I don't think it's a good idea to abbreviate
> both MPLS ethertypes the same way.  Otherwise, the shorthand loses
> information.
>
> <rk> Fixed.
>
>
> lib/packets.c
> -------------
>
> dec_mpls_ttl() will decrement a ttl of 0 to 255.  Is that acceptable
> behavior?  Perhaps the comment should mention it.
>
> copy_mpls_ttl_in() and copy_mpls_ttl_out() assume that the l3 header
> is present and that it is IPv4, or alternatively that an inner MPLS
> header is present, without checking.
>
> set_mpls_lse_tc() is buggy.  It needs to do the shift before the mask.
>
> set_mps_lse_stack() is buggy the same way.
>
> push_mpls() reads the IP header without first checking that it is
> present.
>
> <rk> As mentioned earlier, userspace was not tested at all until last week. 
> Taken care of it after unit-testing.
>
> It appears that push_mpls() refuses to add an MPLS header if there is
> a VLAN tag and the inner Ethertype is not IPv4.  Why?
>
> <rk> I didn't think any other L3 protocol would fit into the use-case. For 
> IPV6, it has its own label and I am not sure there would be a case for 
> <VLAN/IPV6> packet to push a new MPLS label. Make sense? Or are you referring 
> to other L3 protocol aka IPX...?
>
> ofproto/ofproto-dpif.c
> ----------------------
>
> Did you carefully work out how the actions are implemented?  Does the
> right thing happen (a no-op) if an OpenFlow flow has a "push mpls"
> action followed by a "pop mpls" action?  etc.
>
> <rk> I don't think push_mpls followed by pop_mpls as a nop is a valid use 
> case. However, pop_mpls followed by push_mpls is a valid use case for label 
> swap. I have tested code for both scenarios with latest code and it works 
> fine.
>
> I'm not sure, in fact, that it makes sense to try to delay committing
> MPLS actions.  It seems tricky, to me, at least.  I'd be tempted to
> write them to the ODP actions immediately rather than trying to defer
> it.  The code would be simpler and more obviously correct, at any rate.
>
> <rk> When I started off implementation I wanted to take similar approach of 
> the existing code since there is nothing special in MPLS. Is there a specific 
> reason why existing code does it? I have made necessary changes not to delay 
> mpls actions.
>
>
> tests/test-mpls.c
> -----------------
>
> I don't understand why this program is here.  You should be able to
> write unit tests for this feature without actually opening raw
> sockets, etc.  We have sufficient tools to test packets through the
> "dummy" interfaces, which some of the existing unit tests.  At any
> rate I believe that this test program requires "root" privileges to
> work.
>
> <rk> test-mpls.c is the main unit-test function I have used for all mpls and 
> regression testing. For "root" privileges, I believe even "make install" or 
> "make check" requires it, so I am not sure why this is a special case?
>
> You didn't add any unit tests, just this test program.  That means
> that "make check" doesn't actually test anything MPLS related.  We
> need at least basic unit tests.
>
> <rk> I will add unit-test cases for mpls. Since I need to do this for vlan 
> qinq as well, I would like to combine them and send it as a separate patch.
>
> <rk> Now, a request. It's really hard to carry around this huge change for a 
> long time, because everytime I merge I see conflict in different areas of 
> code and have to retest it each time. If you could let me push the modified 
> code(I will send out updated patch shortly) into targeted release it would 
> really help me. I will work on unit-tests shortly after the push and get back 
> to you at the earliest. I have made sure "make check" matches latest code in 
> master. Since new code is not breaking existing functionality it would really 
> help me if it finds a home.
>
> Thanks,
> Ravi
>
> Thanks,
>
> Ben.


_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] MPLS important comments (was: Re: Patch for MPLS)

Reply via email to