Hi Ben, I don't have 3.2 kernel handy. When I ran exact command on 3.0 kernel I don't see any warnings. I have fixed the issues but no way to verify currently. I will do some basic testing and send out new patch for you/Jesse to review kernel code.
Thanks Ravi -----Original Message----- From: dev-boun...@openvswitch.org [mailto:dev-boun...@openvswitch.org] On Behalf Of Kerur, Ravi Sent: Tuesday, March 13, 2012 12:13 PM To: b...@nicira.com Cc: dev@openvswitch.org Subject: Re: [ovs-dev] MPLS important comments (was: Re: Patch for MPLS) Ok thanks. I will test it out and send another patch shortly. -----Original Message----- From: Ben Pfaff [mailto:b...@nicira.com] Sent: Tuesday, March 13, 2012 11:41 AM To: Kerur, Ravi Cc: je...@nicira.com; dev@openvswitch.org Subject: Re: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS) Try using C=1 CF="-Wsparse-all -D__CHECK_ENDIAN__", which will uncover more issues. I'm using Linux 3.2. On Tue, Mar 13, 2012 at 07:38:11PM +0100, ravi.ke...@telekom.com wrote: > Thanks Ben, on what Linux version you are testing this? When I sent out the > patch yesterday I had tested "make C=1" on Ubuntu 11.10(3.0 kernel) and > CentOS 6.2(2.6 kernel) and had not seen sparse issues. I re-ran "make C=1" > again today with the patch I sent out yesterday and didn't see any warnings. > Please let me know on what version you are running this I will fix it. > > Thanks > Ravi > > -----Original Message----- > From: Ben Pfaff [mailto:b...@nicira.com] > Sent: Tuesday, March 13, 2012 10:36 AM > To: Kerur, Ravi > Cc: je...@nicira.com; dev@openvswitch.org > Subject: Re: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS) > > Thanks for the patch, it applies cleanly against current "master". (I > had missed it on the first pass through my inbox this morning.) > > I see the following "sparse" warnings for the kernel code. It > otherwise builds cleanly, and I can confirm that the unit tests pass > for me. > > actions.c:244:36: warning: incorrect type in assignment (different base types) > actions.c:244:36: expected unsigned int [unsigned] [usertype] <noident> > actions.c:244:36: got restricted __be32 [usertype] <noident> > actions.c:265:32: warning: cast to restricted __be32 > actions.c:265:32: warning: cast to restricted __be32 > actions.c:265:32: warning: cast to restricted __be32 > actions.c:265:32: warning: cast to restricted __be32 > actions.c:265:32: warning: cast to restricted __be32 > actions.c:265:32: warning: cast to restricted __be32 > actions.c:325:32: warning: cast to restricted __be32 > actions.c:325:32: warning: cast to restricted __be32 > actions.c:325:32: warning: cast to restricted __be32 > actions.c:325:32: warning: cast to restricted __be32 > actions.c:325:32: warning: cast to restricted __be32 > actions.c:325:32: warning: cast to restricted __be32 > actions.c:346:45: warning: incorrect type in assignment (different base types) > actions.c:346:45: expected unsigned int [unsigned] [usertype] <noident> > actions.c:346:45: got restricted __be32 [usertype] <noident> > actions.c:362:32: warning: cast to restricted __be32 > actions.c:362:32: warning: cast to restricted __be32 > actions.c:362:32: warning: cast to restricted __be32 > actions.c:362:32: warning: cast to restricted __be32 > actions.c:362:32: warning: cast to restricted __be32 > actions.c:362:32: warning: cast to restricted __be32 > actions.c:369:45: warning: incorrect type in assignment (different base types) > actions.c:369:45: expected unsigned int [unsigned] [usertype] <noident> > actions.c:369:45: got restricted __be32 [usertype] <noident> > actions.c:386:32: warning: cast to restricted __be32 > actions.c:386:32: warning: cast to restricted __be32 > actions.c:386:32: warning: cast to restricted __be32 > actions.c:386:32: warning: cast to restricted __be32 > actions.c:386:32: warning: cast to restricted __be32 > actions.c:386:32: warning: cast to restricted __be32 > actions.c:393:41: warning: cast to restricted __be32 > actions.c:393:41: warning: cast to restricted __be32 > actions.c:393:41: warning: cast to restricted __be32 > actions.c:393:41: warning: cast to restricted __be32 > actions.c:393:41: warning: cast to restricted __be32 > actions.c:393:41: warning: cast to restricted __be32 > actions.c:396:58: warning: incorrect type in assignment (different base types) > actions.c:396:58: expected unsigned int [unsigned] [usertype] <noident> > actions.c:396:58: got restricted __be32 [usertype] <noident> > actions.c:412:32: warning: cast to restricted __be32 > actions.c:412:32: warning: cast to restricted __be32 > actions.c:412:32: warning: cast to restricted __be32 > actions.c:412:32: warning: cast to restricted __be32 > actions.c:412:32: warning: cast to restricted __be32 > actions.c:412:32: warning: cast to restricted __be32 > actions.c:419:53: warning: incorrect type in assignment (different base types) > actions.c:419:53: expected unsigned int [unsigned] [usertype] <noident> > actions.c:419:53: got restricted __be32 [usertype] <noident> > actions.c:421:41: warning: cast to restricted __be32 > actions.c:421:41: warning: cast to restricted __be32 > actions.c:421:41: warning: cast to restricted __be32 > actions.c:421:41: warning: cast to restricted __be32 > actions.c:421:41: warning: cast to restricted __be32 > actions.c:421:41: warning: cast to restricted __be32 > actions.c:425:53: warning: incorrect type in assignment (different base types) > actions.c:425:53: expected unsigned int [unsigned] [usertype] <noident> > actions.c:425:53: got restricted __be32 [usertype] <noident> > > Thanks, > > Ben. > > On Tue, Mar 13, 2012 at 12:39:02AM +0100, ravi.ke...@telekom.com wrote: > > Attached latest patch after incorporating code review comments. I > > will go through code review comments one more time to make sure > > nothing is missed. > > > > Jesse: If you could please review kernel changes it will be helpful. > > > > I have tested code after changes on CentOS 6.2 and Ubuntu 11.10. > > > > "make check" and "make distcheck" matches latest git code results. > > > > Thanks > > Ravi > > > > -----Original Message----- > > From: Kerur, Ravi > > Sent: Monday, March 12, 2012 1:58 PM > > To: Ben Pfaff > > Cc: dev@openvswitch.org > > Subject: RE: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS) > > > > Thanks Ben for the review. Please see inline for <rk>, mostly agreeing to > > changes, but some requires inputs/clarifications. > > > > -----Original Message----- > > From: Ben Pfaff [mailto:b...@nicira.com] > > Sent: Friday, March 09, 2012 11:20 AM > > To: Kerur, Ravi > > Cc: dev@openvswitch.org > > Subject: MPLS important comments (was: Re: [ovs-dev] Patch for MPLS) > > > > Thank you for the patch! I have some comments. I'm dividing them > > into two groups. The first group, in this email, is comments that > > point out likely bugs. The second group, which I will send > > separately, is comments that point out less important issues. > > > > I didn't review the kernel code. That will have to wait for Jesse. > > > > > > include/linux/openvswitch.h > > --------------------------- > > > > We don't usually create structs for the arguments when the arguments > > are simple types such as a single integer. Instead, just indicate in > > the comment on the side the type and in the big comment on "enum > > ovs_action_attr" the effect of the action. Actually every new action > > needs to be explained in the big comment. > > > > Your change renumbers OVS_ACTION_ATTR_SAMPLE but we can't do that > > because it's part of the ABI. > > > > <rk> When I started implementation I thought it will be good to store it in > > a variable and might have to reuse in some other part of the code. On a > > second look it doesn't look like that's the case. I have made necessary > > changes. > > > > lib/dpif-netdev.c > > ----------------- > > > > OVS_ACTION_ATTR_SET_MPLS_LABEL takes an entire MPLS LSE as its > > argument, and the odp-util.c code parses an entire MPLS LSE (except > > the bottom-of-stack bit, which I've commented on elsewhere), but the > > dpif-netdev.c code only sets the "label" bits from the LSE into the > > packet. The "datapath" version of the code, on the other hand, does > > something more complicated (it appears to set all the fields within > > the LSE that are nonzero into the packet). This looks very > > inconsistent and probably wrong. > > > > <rk> OVS_ACTION_ATTR_SET_MPLS_LABEL was a misnomer. Changed it to > > OVS_ACTION_ATTR_SET_MPLS_LSE. Moreover, userspace code was not tested until > > after potential fix was sent out. It's now inline with kernel and rest of > > the code base. > > > > lib/flow.c > > ---------- > > > > This adds some support for 802.1ad but I'd prefer to keep that > > separate from MPLS support, so please remove the 802.1ad stuff from > > this patch. > > > > <rk> Done. > > > > flow_extract() now has code to skip over inner VLAN tags. What's the > > rationale for that? It seems unwise to me, because it makes it > > impossible for a controller to distinguish a packet with one VLAN tag > > from one with a dozen. The implementation is also not correct as far > > as I can tell, because it can overrun the end of the packet. > > > > For the same reason I wonder whether we should really skip over all > > the MPLS labels? > > > > <rk> I don't know of any use-case where a packet can carry more than 2 vlan > > tags. Idea is to parse all vlan_tags/mpls_labels such that packet->l3 is > > correctly set after the parse. While parsing multiple tags/labels update > > flow->vlan_tci and flow->mpls_lse only after parsing first tag/label. If > > this is not the right way to do it let me know how you plan to handle this. > > I will incorporate it. > > > > I found the flow_extract() treatment confusing. It skips past all the > > MPLS labels, which would imply that we are going to look at packet > > data beyond them, but that will in fact never happen (because the > > dl_type is always MPLS). We do want to skip the MPLS header so that > > we can set the l3 pointer properly. > > > > <rk> Please see above response. > > > > flow_is_mpls_label_valid() looks wrong to me, because I think that > > the left-shift in computing 'tmp_label' could discard some high 1-bits > > that aren't really valid. I would consider writing it as: > > return !(mpls_label & htonl(MPLS_LABEL_MASK >> MPLS_LABEL_SHIFT)); > > which I think has the desired effect. > > > > <rk> Fixed this. > > > > flow_format() doesn't show the "bottom of stack" bit. Should it? (It > > looks like the kernel populates it from the packet, so I would expect > > so.) > > > > <rk> Good to have it. Added stack information. > > > > I have a concern about flow_hash_symmetric_l4(). This function is not > > well-documented, but it's intent is that packets in either direction > > in a pair of related flows (e.g. the two flows that make up a TCP > > connection) will have the same hash. Will two flows related in that > > way have the same MPLS LSE? This function should only hash the > > mpls_lse if the answer is "yes". > > > > <rk> Theoretically the answer is "yes" but I am not 100% on it. I would > > like to leave it there unless I see it's not needed at all. > > > > lib/meta-flow.c > > --------------- > > > > I guess that MFF_MPLS_LABEL and MFP_MPLS_TC should have a > > prerequisite, because the MPLS label may only be matched if the > > Ethernet type indicates that the packet is a MPLS packet (right?). So > > I would add an MFP_MPLS and appropriate support for it. > > > > <rk> Fixed it. > > > > I think that the definitions in mf_is_value_valid() and mf_get_value() > > are contradictory. mf_is_value_valid() insists that the label be in > > the least significant 20 bits of value->be32, but mf_get_value() puts > > it in the most significant 20 bits. I suggest that both should use > > the least significant 20 bits. > > > > I think that mf_random_value() has the same problem. > > > > <rk> Code is written with flow->mpls_lse following MPLS format because it > > is easier for anyone debugging it. Issue is when a user configures for a > > match based on a label it's in lower 20 bits. In mf_set_value, > > flow->mpls_lse is updated and it updates according to MPLS format. For > > nx_put_match, it follows the spec by putting the label in lower 20bits. > > mf_is_value_valid is called from nx_pull_match in which case label is in > > lower 20bits and hence the deviation. > > > > mf_set_value, mf_get_value and mf_random_value follow the same format. I am > > bit reluctant to make changes now since it requires complete code changes > > and I am not seeing any added value in doing it. > > > > > > lib/nx-match.c > > -------------- > > > > Like flow_format(), I would expect format_mpls_label() to print the > > "bottom of stack" bit. Similarly with the parsing code in > > parse_odp_actions() and parse_odp_key_attr(). > > > > format_odp_action() and other code expects that > > OVS_ACTION_ATTR_POP_MPLS has an ethertype argument, but > > parse_odp_action() doesn't supply one. > > > > <rk> Fixed it. > > > > lib/odp-util.h > > -------------- > > > > In parse_odp_action(), the number here should be 16, not 11, in two > > places: > > + if (!strncmp(s, "copy_mpls_ttl_in", 11)) { > > + nl_msg_put_flag(actions, OVS_ACTION_ATTR_COPY_TTL_IN); > > + return 11; > > + } > > > > and here it should be 17, not 12, in two places: > > + if (!strncmp(s, "copy_mpls_ttl_out", 12)) { > > + nl_msg_put_flag(actions, OVS_ACTION_ATTR_COPY_TTL_OUT); > > + return 12; > > + } > > > > <rk> Initially it was just "copy_ttl_in/out", later on I changed it to > > "copy_mpls_ttl_in/out", but forgot to update it here. Fixed it now. > > > > lib/ofp-parse.c > > --------------- > > > > In parse_protocol(), there are two entries for "mpls". Only the first > > one will have an effect, the second one would need a different name. > > > > <rk> Agreed. Have changed it i.e mpls -> unicast, mplsm -> multicast > > > > lib/ofp-print.c > > --------------- > > > > In ofp_print_action(), please put "0x" into the following formats. > > Otherwise, the values printed will be parsed as decimal if they are > > fed back into the parser: > > + case OFPUTIL_NXAST_PUSH_MPLS: > > + nampush = (const struct nx_action_push_mpls *) a; > > + ds_put_format(s, "push_mpls:%"PRIx16, ntohs(nampush->ethertype)); > > + break; > > + > > + case OFPUTIL_NXAST_POP_MPLS: > > + nampop = (const struct nx_action_pop_mpls *) a; > > + ds_put_format(s, "pop_mpls:%"PRIx16, ntohs(nampop->ethertype)); > > + break; > > > > <rk> Fixed. > > > > > > In ofp_match_to_string(), I don't think it's a good idea to abbreviate > > both MPLS ethertypes the same way. Otherwise, the shorthand loses > > information. > > > > <rk> Fixed. > > > > > > lib/packets.c > > ------------- > > > > dec_mpls_ttl() will decrement a ttl of 0 to 255. Is that acceptable > > behavior? Perhaps the comment should mention it. > > > > copy_mpls_ttl_in() and copy_mpls_ttl_out() assume that the l3 header > > is present and that it is IPv4, or alternatively that an inner MPLS > > header is present, without checking. > > > > set_mpls_lse_tc() is buggy. It needs to do the shift before the mask. > > > > set_mps_lse_stack() is buggy the same way. > > > > push_mpls() reads the IP header without first checking that it is > > present. > > > > <rk> As mentioned earlier, userspace was not tested at all until last week. > > Taken care of it after unit-testing. > > > > It appears that push_mpls() refuses to add an MPLS header if there is > > a VLAN tag and the inner Ethertype is not IPv4. Why? > > > > <rk> I didn't think any other L3 protocol would fit into the use-case. For > > IPV6, it has its own label and I am not sure there would be a case for > > <VLAN/IPV6> packet to push a new MPLS label. Make sense? Or are you > > referring to other L3 protocol aka IPX...? > > > > ofproto/ofproto-dpif.c > > ---------------------- > > > > Did you carefully work out how the actions are implemented? Does the > > right thing happen (a no-op) if an OpenFlow flow has a "push mpls" > > action followed by a "pop mpls" action? etc. > > > > <rk> I don't think push_mpls followed by pop_mpls as a nop is a valid use > > case. However, pop_mpls followed by push_mpls is a valid use case for label > > swap. I have tested code for both scenarios with latest code and it works > > fine. > > > > I'm not sure, in fact, that it makes sense to try to delay committing > > MPLS actions. It seems tricky, to me, at least. I'd be tempted to > > write them to the ODP actions immediately rather than trying to defer > > it. The code would be simpler and more obviously correct, at any rate. > > > > <rk> When I started off implementation I wanted to take similar approach of > > the existing code since there is nothing special in MPLS. Is there a > > specific reason why existing code does it? I have made necessary changes > > not to delay mpls actions. > > > > > > tests/test-mpls.c > > ----------------- > > > > I don't understand why this program is here. You should be able to > > write unit tests for this feature without actually opening raw > > sockets, etc. We have sufficient tools to test packets through the > > "dummy" interfaces, which some of the existing unit tests. At any > > rate I believe that this test program requires "root" privileges to > > work. > > > > <rk> test-mpls.c is the main unit-test function I have used for all mpls > > and regression testing. For "root" privileges, I believe even "make > > install" or "make check" requires it, so I am not sure why this is a > > special case? > > > > You didn't add any unit tests, just this test program. That means > > that "make check" doesn't actually test anything MPLS related. We > > need at least basic unit tests. > > > > <rk> I will add unit-test cases for mpls. Since I need to do this for vlan > > qinq as well, I would like to combine them and send it as a separate patch. > > > > <rk> Now, a request. It's really hard to carry around this huge change for > > a long time, because everytime I merge I see conflict in different areas of > > code and have to retest it each time. If you could let me push the modified > > code(I will send out updated patch shortly) into targeted release it would > > really help me. I will work on unit-tests shortly after the push and get > > back to you at the earliest. I have made sure "make check" matches latest > > code in master. Since new code is not breaking existing functionality it > > would really help me if it finds a home. > > > > Thanks, > > Ravi > > > > Thanks, > > > > Ben. > > _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev