Re: [ovs-dev] MPLS important comments (was: Re: Patch for MPLS)

Ravi.Kerur Tue, 13 Mar 2012 12:14:28 -0700

Ok thanks. I will test it out and send another patch shortly.

-----Original Message-----
From: Ben Pfaff [mailto:[email protected]]
Sent: Tuesday, March 13, 2012 11:41 AM
To: Kerur, Ravi
Cc: [email protected]; [email protected]
Subject: Re: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS)


Try using C=1 CF="-Wsparse-all -D__CHECK_ENDIAN__", which will uncover
more issues.

I'm using Linux 3.2.

On Tue, Mar 13, 2012 at 07:38:11PM +0100, [email protected] wrote:
> Thanks Ben, on what Linux version you are testing this? When I sent out the 
> patch yesterday I had tested "make C=1" on Ubuntu 11.10(3.0 kernel) and 
> CentOS 6.2(2.6 kernel) and had not seen sparse issues. I re-ran "make C=1" 
> again today with the patch I sent out yesterday and didn't see any warnings. 
> Please let me know on what version you are running this I will fix it.
>
> Thanks
> Ravi
>
> -----Original Message-----
> From: Ben Pfaff [mailto:[email protected]]
> Sent: Tuesday, March 13, 2012 10:36 AM
> To: Kerur, Ravi
> Cc: [email protected]; [email protected]
> Subject: Re: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS)
>
> Thanks for the patch, it applies cleanly against current "master".  (I
> had missed it on the first pass through my inbox this morning.)
>
> I see the following "sparse" warnings for the kernel code.  It
> otherwise builds cleanly, and I can confirm that the unit tests pass
> for me.
>
> actions.c:244:36: warning: incorrect type in assignment (different base types)
> actions.c:244:36:    expected unsigned int [unsigned] [usertype] <noident>
> actions.c:244:36:    got restricted __be32 [usertype] <noident>
> actions.c:265:32: warning: cast to restricted __be32
> actions.c:265:32: warning: cast to restricted __be32
> actions.c:265:32: warning: cast to restricted __be32
> actions.c:265:32: warning: cast to restricted __be32
> actions.c:265:32: warning: cast to restricted __be32
> actions.c:265:32: warning: cast to restricted __be32
> actions.c:325:32: warning: cast to restricted __be32
> actions.c:325:32: warning: cast to restricted __be32
> actions.c:325:32: warning: cast to restricted __be32
> actions.c:325:32: warning: cast to restricted __be32
> actions.c:325:32: warning: cast to restricted __be32
> actions.c:325:32: warning: cast to restricted __be32
> actions.c:346:45: warning: incorrect type in assignment (different base types)
> actions.c:346:45:    expected unsigned int [unsigned] [usertype] <noident>
> actions.c:346:45:    got restricted __be32 [usertype] <noident>
> actions.c:362:32: warning: cast to restricted __be32
> actions.c:362:32: warning: cast to restricted __be32
> actions.c:362:32: warning: cast to restricted __be32
> actions.c:362:32: warning: cast to restricted __be32
> actions.c:362:32: warning: cast to restricted __be32
> actions.c:362:32: warning: cast to restricted __be32
> actions.c:369:45: warning: incorrect type in assignment (different base types)
> actions.c:369:45:    expected unsigned int [unsigned] [usertype] <noident>
> actions.c:369:45:    got restricted __be32 [usertype] <noident>
> actions.c:386:32: warning: cast to restricted __be32
> actions.c:386:32: warning: cast to restricted __be32
> actions.c:386:32: warning: cast to restricted __be32
> actions.c:386:32: warning: cast to restricted __be32
> actions.c:386:32: warning: cast to restricted __be32
> actions.c:386:32: warning: cast to restricted __be32
> actions.c:393:41: warning: cast to restricted __be32
> actions.c:393:41: warning: cast to restricted __be32
> actions.c:393:41: warning: cast to restricted __be32
> actions.c:393:41: warning: cast to restricted __be32
> actions.c:393:41: warning: cast to restricted __be32
> actions.c:393:41: warning: cast to restricted __be32
> actions.c:396:58: warning: incorrect type in assignment (different base types)
> actions.c:396:58:    expected unsigned int [unsigned] [usertype] <noident>
> actions.c:396:58:    got restricted __be32 [usertype] <noident>
> actions.c:412:32: warning: cast to restricted __be32
> actions.c:412:32: warning: cast to restricted __be32
> actions.c:412:32: warning: cast to restricted __be32
> actions.c:412:32: warning: cast to restricted __be32
> actions.c:412:32: warning: cast to restricted __be32
> actions.c:412:32: warning: cast to restricted __be32
> actions.c:419:53: warning: incorrect type in assignment (different base types)
> actions.c:419:53:    expected unsigned int [unsigned] [usertype] <noident>
> actions.c:419:53:    got restricted __be32 [usertype] <noident>
> actions.c:421:41: warning: cast to restricted __be32
> actions.c:421:41: warning: cast to restricted __be32
> actions.c:421:41: warning: cast to restricted __be32
> actions.c:421:41: warning: cast to restricted __be32
> actions.c:421:41: warning: cast to restricted __be32
> actions.c:421:41: warning: cast to restricted __be32
> actions.c:425:53: warning: incorrect type in assignment (different base types)
> actions.c:425:53:    expected unsigned int [unsigned] [usertype] <noident>
> actions.c:425:53:    got restricted __be32 [usertype] <noident>
>
> Thanks,
>
> Ben.
>
> On Tue, Mar 13, 2012 at 12:39:02AM +0100, [email protected] wrote:
> > Attached latest patch after incorporating code review comments. I
> > will go through code review comments one more time to make sure
> > nothing is missed.
> >
> > Jesse: If you could please review kernel changes it will be helpful.
> >
> > I have tested code after changes on CentOS 6.2 and Ubuntu 11.10.
> >
> > "make check" and "make distcheck" matches latest git code results.
> >
> > Thanks
> > Ravi
> >
> > -----Original Message-----
> > From: Kerur, Ravi
> > Sent: Monday, March 12, 2012 1:58 PM
> > To: Ben Pfaff
> > Cc: [email protected]
> > Subject: RE: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS)
> >
> > Thanks Ben for the review. Please see inline for <rk>, mostly agreeing to 
> > changes, but some requires inputs/clarifications.
> >
> > -----Original Message-----
> > From: Ben Pfaff [mailto:[email protected]]
> > Sent: Friday, March 09, 2012 11:20 AM
> > To: Kerur, Ravi
> > Cc: [email protected]
> > Subject: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS)
> >
> > Thank you for the patch!  I have some comments.  I'm dividing them
> > into two groups.  The first group, in this email, is comments that
> > point out likely bugs.  The second group, which I will send
> > separately, is comments that point out less important issues.
> >
> > I didn't review the kernel code.  That will have to wait for Jesse.
> >
> >
> > include/linux/openvswitch.h
> > ---------------------------
> >
> > We don't usually create structs for the arguments when the arguments
> > are simple types such as a single integer.  Instead, just indicate in
> > the comment on the side the type and in the big comment on "enum
> > ovs_action_attr" the effect of the action.  Actually every new action
> > needs to be explained in the big comment.
> >
> > Your change renumbers OVS_ACTION_ATTR_SAMPLE but we can't do that
> > because it's part of the ABI.
> >
> > <rk> When I started implementation I thought it will be good to store it in 
> > a variable and might have to reuse in some other part of the code. On a 
> > second look it doesn't look like that's the case. I have made necessary 
> > changes.
> >
> > lib/dpif-netdev.c
> > -----------------
> >
> > OVS_ACTION_ATTR_SET_MPLS_LABEL takes an entire MPLS LSE as its
> > argument, and the odp-util.c code parses an entire MPLS LSE (except
> > the bottom-of-stack bit, which I've commented on elsewhere), but the
> > dpif-netdev.c code only sets the "label" bits from the LSE into the
> > packet.  The "datapath" version of the code, on the other hand, does
> > something more complicated (it appears to set all the fields within
> > the LSE that are nonzero into the packet).  This looks very
> > inconsistent and probably wrong.
> >
> > <rk> OVS_ACTION_ATTR_SET_MPLS_LABEL was a misnomer. Changed it to 
> > OVS_ACTION_ATTR_SET_MPLS_LSE. Moreover, userspace code was not tested until 
> > after potential fix was sent out. It's now inline with kernel and rest of 
> > the code base.
> >
> > lib/flow.c
> > ----------
> >
> > This adds some support for 802.1ad but I'd prefer to keep that
> > separate from MPLS support, so please remove the 802.1ad stuff from
> > this patch.
> >
> > <rk> Done.
> >
> > flow_extract() now has code to skip over inner VLAN tags.  What's the
> > rationale for that?  It seems unwise to me, because it makes it
> > impossible for a controller to distinguish a packet with one VLAN tag
> > from one with a dozen.  The implementation is also not correct as far
> > as I can tell, because it can overrun the end of the packet.
> >
> > For the same reason I wonder whether we should really skip over all
> > the MPLS labels?
> >
> > <rk> I don't know of any use-case where a packet can carry more than 2 vlan 
> > tags. Idea is to parse all vlan_tags/mpls_labels such that packet->l3 is 
> > correctly set after the parse. While parsing multiple tags/labels update 
> > flow->vlan_tci and flow->mpls_lse only after parsing first tag/label. If 
> > this is not the right way to do it let me know how you plan to handle this. 
> > I will incorporate it.
> >
> > I found the flow_extract() treatment confusing.  It skips past all the
> > MPLS labels, which would imply that we are going to look at packet
> > data beyond them, but that will in fact never happen (because the
> > dl_type is always MPLS).  We do want to skip the MPLS header so that
> > we can set the l3 pointer properly.
> >
> > <rk> Please see above response.
> >
> > flow_is_mpls_label_valid() looks wrong to me, because I think that
> > the left-shift in computing 'tmp_label' could discard some high 1-bits
> > that aren't really valid.  I would consider writing it as:
> >     return !(mpls_label & htonl(MPLS_LABEL_MASK >> MPLS_LABEL_SHIFT));
> > which I think has the desired effect.
> >
> > <rk> Fixed this.
> >
> > flow_format() doesn't show the "bottom of stack" bit.  Should it?  (It
> > looks like the kernel populates it from the packet, so I would expect
> > so.)
> >
> > <rk> Good to have it. Added stack information.
> >
> > I have a concern about flow_hash_symmetric_l4().  This function is not
> > well-documented, but it's intent is that packets in either direction
> > in a pair of related flows (e.g. the two flows that make up a TCP
> > connection) will have the same hash.  Will two flows related in that
> > way have the same MPLS LSE?  This function should only hash the
> > mpls_lse if the answer is "yes".
> >
> > <rk> Theoretically the answer is "yes" but I am not 100% on it. I would 
> > like to leave it there unless I see it's not needed at all.
> >
> > lib/meta-flow.c
> > ---------------
> >
> > I guess that MFF_MPLS_LABEL and MFP_MPLS_TC should have a
> > prerequisite, because the MPLS label may only be matched if the
> > Ethernet type indicates that the packet is a MPLS packet (right?).  So
> > I would add an MFP_MPLS and appropriate support for it.
> >
> > <rk> Fixed it.
> >
> > I think that the definitions in mf_is_value_valid() and mf_get_value()
> > are contradictory.  mf_is_value_valid() insists that the label be in
> > the least significant 20 bits of value->be32, but mf_get_value() puts
> > it in the most significant 20 bits.  I suggest that both should use
> > the least significant 20 bits.
> >
> > I think that mf_random_value() has the same problem.
> >
> > <rk> Code is written with flow->mpls_lse following MPLS format because it 
> > is easier for anyone debugging it. Issue is when a user configures for a 
> > match based on a label it's in lower 20 bits. In mf_set_value, 
> > flow->mpls_lse is updated and it updates according to MPLS format. For 
> > nx_put_match, it follows the spec by putting the label in lower 20bits. 
> > mf_is_value_valid is called from nx_pull_match in which case label is in 
> > lower 20bits and hence the deviation.
> >
> > mf_set_value, mf_get_value and mf_random_value follow the same format. I am 
> > bit reluctant to make changes now since it requires complete code changes 
> > and I am not seeing any added value in doing it.
> >
> >
> > lib/nx-match.c
> > --------------
> >
> > Like flow_format(), I would expect format_mpls_label() to print the
> > "bottom of stack" bit.  Similarly with the parsing code in
> > parse_odp_actions() and parse_odp_key_attr().
> >
> > format_odp_action() and other code expects that
> > OVS_ACTION_ATTR_POP_MPLS has an ethertype argument, but
> > parse_odp_action() doesn't supply one.
> >
> > <rk> Fixed it.
> >
> > lib/odp-util.h
> > --------------
> >
> > In parse_odp_action(), the number here should be 16, not 11, in two
> > places:
> > +    if (!strncmp(s, "copy_mpls_ttl_in", 11)) {
> > +        nl_msg_put_flag(actions, OVS_ACTION_ATTR_COPY_TTL_IN);
> > +        return 11;
> > +    }
> >
> > and here it should be 17, not 12, in two places:
> > +    if (!strncmp(s, "copy_mpls_ttl_out", 12)) {
> > +        nl_msg_put_flag(actions, OVS_ACTION_ATTR_COPY_TTL_OUT);
> > +        return 12;
> > +    }
> >
> > <rk> Initially it was just "copy_ttl_in/out", later on I changed it to 
> > "copy_mpls_ttl_in/out", but forgot to update it here. Fixed it now.
> >
> > lib/ofp-parse.c
> > ---------------
> >
> > In parse_protocol(), there are two entries for "mpls".  Only the first
> > one will have an effect, the second one would need a different name.
> >
> > <rk> Agreed. Have changed it i.e mpls -> unicast, mplsm -> multicast
> >
> > lib/ofp-print.c
> > ---------------
> >
> > In ofp_print_action(), please put "0x" into the following formats.
> > Otherwise, the values printed will be parsed as decimal if they are
> > fed back into the parser:
> > +   case OFPUTIL_NXAST_PUSH_MPLS:
> > +        nampush = (const struct nx_action_push_mpls *) a;
> > +        ds_put_format(s, "push_mpls:%"PRIx16, ntohs(nampush->ethertype));
> > +        break;
> > +
> > +    case OFPUTIL_NXAST_POP_MPLS:
> > +        nampop = (const struct nx_action_pop_mpls *) a;
> > +        ds_put_format(s, "pop_mpls:%"PRIx16, ntohs(nampop->ethertype));
> > +        break;
> >
> > <rk> Fixed.
> >
> >
> > In ofp_match_to_string(), I don't think it's a good idea to abbreviate
> > both MPLS ethertypes the same way.  Otherwise, the shorthand loses
> > information.
> >
> > <rk> Fixed.
> >
> >
> > lib/packets.c
> > -------------
> >
> > dec_mpls_ttl() will decrement a ttl of 0 to 255.  Is that acceptable
> > behavior?  Perhaps the comment should mention it.
> >
> > copy_mpls_ttl_in() and copy_mpls_ttl_out() assume that the l3 header
> > is present and that it is IPv4, or alternatively that an inner MPLS
> > header is present, without checking.
> >
> > set_mpls_lse_tc() is buggy.  It needs to do the shift before the mask.
> >
> > set_mps_lse_stack() is buggy the same way.
> >
> > push_mpls() reads the IP header without first checking that it is
> > present.
> >
> > <rk> As mentioned earlier, userspace was not tested at all until last week. 
> > Taken care of it after unit-testing.
> >
> > It appears that push_mpls() refuses to add an MPLS header if there is
> > a VLAN tag and the inner Ethertype is not IPv4.  Why?
> >
> > <rk> I didn't think any other L3 protocol would fit into the use-case. For 
> > IPV6, it has its own label and I am not sure there would be a case for 
> > <VLAN/IPV6> packet to push a new MPLS label. Make sense? Or are you 
> > referring to other L3 protocol aka IPX...?
> >
> > ofproto/ofproto-dpif.c
> > ----------------------
> >
> > Did you carefully work out how the actions are implemented?  Does the
> > right thing happen (a no-op) if an OpenFlow flow has a "push mpls"
> > action followed by a "pop mpls" action?  etc.
> >
> > <rk> I don't think push_mpls followed by pop_mpls as a nop is a valid use 
> > case. However, pop_mpls followed by push_mpls is a valid use case for label 
> > swap. I have tested code for both scenarios with latest code and it works 
> > fine.
> >
> > I'm not sure, in fact, that it makes sense to try to delay committing
> > MPLS actions.  It seems tricky, to me, at least.  I'd be tempted to
> > write them to the ODP actions immediately rather than trying to defer
> > it.  The code would be simpler and more obviously correct, at any rate.
> >
> > <rk> When I started off implementation I wanted to take similar approach of 
> > the existing code since there is nothing special in MPLS. Is there a 
> > specific reason why existing code does it? I have made necessary changes 
> > not to delay mpls actions.
> >
> >
> > tests/test-mpls.c
> > -----------------
> >
> > I don't understand why this program is here.  You should be able to
> > write unit tests for this feature without actually opening raw
> > sockets, etc.  We have sufficient tools to test packets through the
> > "dummy" interfaces, which some of the existing unit tests.  At any
> > rate I believe that this test program requires "root" privileges to
> > work.
> >
> > <rk> test-mpls.c is the main unit-test function I have used for all mpls 
> > and regression testing. For "root" privileges, I believe even "make 
> > install" or "make check" requires it, so I am not sure why this is a 
> > special case?
> >
> > You didn't add any unit tests, just this test program.  That means
> > that "make check" doesn't actually test anything MPLS related.  We
> > need at least basic unit tests.
> >
> > <rk> I will add unit-test cases for mpls. Since I need to do this for vlan 
> > qinq as well, I would like to combine them and send it as a separate patch.
> >
> > <rk> Now, a request. It's really hard to carry around this huge change for 
> > a long time, because everytime I merge I see conflict in different areas of 
> > code and have to retest it each time. If you could let me push the modified 
> > code(I will send out updated patch shortly) into targeted release it would 
> > really help me. I will work on unit-tests shortly after the push and get 
> > back to you at the earliest. I have made sure "make check" matches latest 
> > code in master. Since new code is not breaking existing functionality it 
> > would really help me if it finds a home.
> >
> > Thanks,
> > Ravi
> >
> > Thanks,
> >
> > Ben.
>
>
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] MPLS important comments (was: Re: Patch for MPLS)

Reply via email to