Re: [ovs-dev] [PATCH] db-ctl-base: Allow print rows that weak reference to table in 'cmd_show_table'.

2015-08-13 Thread Alex Wang
Hey Justin,

Thx for the responsive review~

On Wed, Aug 12, 2015 at 11:58 PM, Justin Pettit  wrote:

> I should have added that if you use the attached patch, you'll need to
> update the ovn-sbctl unit test you were nice enough to include in this
> patch.
>
> --Justin
>
>
> > On Aug 12, 2015, at 11:52 PM, Justin Pettit  wrote:
> >
> >
> >> On Aug 12, 2015, at 9:49 PM, Alex Wang  wrote:
> >>
> >> diff --git a/lib/db-ctl-base.c b/lib/db-ctl-base.c
> >> index 3028c09..cbafff1 100644
> >> --- a/lib/db-ctl-base.c
> >> +++ b/lib/db-ctl-base.c
> >> @@ -1620,6 +1620,12 @@ pre_cmd_show(struct ctl_context *ctx)
> >>ovsdb_idl_add_column(ctx->idl, column);
> >>}
> >>}
> >> +if (show->wref_table.name_column) {
> >> +ovsdb_idl_add_column(ctx->idl,
> show->wref_table.name_column);
> >> +}
> >> +if (show->wref_table.wref_column) {
> >> +ovsdb_idl_add_column(ctx->idl,
> show->wref_table.wref_column);
> >> +}
> >
> > Do you need to do a ovsdb_idl_add_table() on 'wref_table.table'?
> >
>

Yeah, I should do that,


> >> +/*  Prints table entries that weak reference the 'cur_row'. */
> >> +static void
> >> +cmd_show_weak_ref(struct ctl_context *ctx, const struct cmd_show_table
> *show,
> >> +  const struct ovsdb_idl_row *cur_row, int level)
> >> +{
> >> +const struct ovsdb_idl_row *row_wref;
> >> +const struct ovsdb_idl_table_class *table = show->wref_table.table;
> >> +const struct ovsdb_idl_column *name_column
> >> += show->wref_table.name_column;
> >> +const struct ovsdb_idl_column *wref_column
> >> += show->wref_table.wref_column;
> >> +
> >> +if (!table || !name_column || !wref_column) {
> >> +return;
> >> +}
> >> +
> >> +ds_put_char_multiple(&ctx->output, ' ', (level + 1) * 4);
> >> +ds_put_format(&ctx->output, "%s", table->name);
> >> +ds_put_char(&ctx->output, '\n');
> >> +
> >> +for (row_wref = ovsdb_idl_first_row(ctx->idl, table); row_wref;
> >> + row_wref = ovsdb_idl_next_row(row_wref)) {
> >> +const struct ovsdb_datum *wref_datum
> >> += ovsdb_idl_read(row_wref, wref_column);
> >> +/* If weak reference refers to the 'cur_row', prints it. */
> >> +if (wref_datum->n
> >> +&& uuid_equals(&cur_row->uuid, &wref_datum->keys[0].uuid))
> {
> >> +const struct ovsdb_datum *name_datum
> >> += ovsdb_idl_read(row_wref, name_column);
> >> +
> >> +ds_put_char_multiple(&ctx->output, ' ', (level + 2) * 4);
> >> +ds_put_format(&ctx->output, "%s: ", name_column->name);
> >> +ovsdb_datum_to_string(name_datum, &name_column->type,
> &ctx->output);
> >> +ds_put_char(&ctx->output, '\n');
> >> +}
> >> +}
> >> +}
> >
> > I think this causes the "ovn-controller-vtep" test case to fail due to
> the text printing before the "for" loop; that gets printed regardless of
> whether there are any matches or not.  I originally wrote a patch to just
> not print that preamble if there were no matches.  However, I think the
> attached patch might be more consistent with commands like "ovs-vsctl show"
> which print the root name for a row on the same line as the table name.  It
> also passes the unit tests.
> >
>

Sorry forget to run all tests,



> > For example, here's the output of the original patch:
> >
> > -=-=-=-=-=-=-=-=-=-=-
> > Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
> >Encap geneve
> >ip: "127.0.0.1"
> >Port_Binding
> >logical_port: "sw0-port0"
> >logical_port: "sw0-port1"
> > -=-=-=-=-=-=-=-=-=-=-
> >
> > Here it is with the attached patch:
> >
> > -=-=-=-=-=-=-=-=-=-=-
> > Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
> >Encap geneve
> >ip: "127.0.0.1"
> >Port_Binding "sw0-port0"
> >Port_Binding "sw0-port1"
> > -=-=-=-=-=-=-=-=-=-=-
> >
> > Does that seem reasonable?  If we later want to add other columns from
> Port_Binding, they can then be placed under each entry.
>

Yeah, totally makes sense,

>
> >> @@ -157,11 +168,16 @@ struct ctl_command *ctl_parse_commands(int argc,
> char *argv[],
> >> *
> >> * - 'columns[]' allows user to specify the print of additional columns
> >> *   in 'table'.
> >> + *
> >> + * - if 'wref_table' is filled, print the 'wref_table.table' rows that
> refer
> >> + *   to the 'table'.
> >
> > What do you think of this phrasing instead?  I think it's a bit more
> descriptive.
> >
> > * - if 'wref_table' is populated, print 'wref_table.name_column' for
> > *   each row in table 'wref_table.table' that has a reference to 'table'
> > *   in 'wref_table.wref_column'.  Every field must be populated.
> >
>

This is more comprehensive, thx, will adopt it,



> > Thanks a lot for doing this on short notice!
> >
> > Acked-by: Justin Pettit 
> >
> > --Justin
> >
> >
> > -=-=-=-=-=-=-=-=-=-=-=-=-
> >
> > diff --git a/lib/db-ctl-base.c b/lib/db-ctl-base.c
> >

[ovs-dev] [PATCH/RFC v2] flow: Ignore invalid ICMPv6 fields when parsing packets

2015-08-13 Thread Simon Horman
There is a miss-match between the handling of invalid ICMPv6 fields in the
implementations of parse_icmpv6() in user-space and in the kernel datapath.

This patch addresses that by modifying the user-space implementation to
match that of the kernel datapath; processing is terminated without
rather than with an error and partial information is cleared.

With these changes the user-space implementation of parse_icmpv6()
never returns an error. Accordingly the return type and caller have been
updated.

The original motivation for this is to allow matching the ICMPv6 type and
code of packets with invalid neighbour discovery options although only the
change around the '(!opt_len || opt_len > *sizep)' conditional is necessary
to achieve that goal.

Signed-off-by: Simon Horman 

---
v2
* Harmonise the entire user-space implementation of parse_icmpv6() with
  kernel code rather than just the area around the '(!opt_len || opt_len >
  *sizep)' conditional. As suggested by Jesse Gross.
* Update patch subject from
  "flow: Ignore miss-sized IPv6 neighbour discovery options" to
  "flow: Ignore invalid ICMPv6 fields"
---
 lib/flow.c | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/lib/flow.c b/lib/flow.c
index fdca1e0743ea..1b7a3d5b7889 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -366,7 +366,7 @@ parse_ethertype(const void **datap, size_t *sizep)
 return htons(FLOW_DL_TYPE_NONE);
 }
 
-static inline bool
+static inline void
 parse_icmpv6(const void **datap, size_t *sizep, const struct icmp6_hdr *icmp,
  const struct in6_addr **nd_target,
  uint8_t arp_buf[2][ETH_ADDR_LEN])
@@ -377,7 +377,7 @@ parse_icmpv6(const void **datap, size_t *sizep, const 
struct icmp6_hdr *icmp,
 
 *nd_target = data_try_pull(datap, sizep, sizeof **nd_target);
 if (OVS_UNLIKELY(!*nd_target)) {
-return false;
+return;
 }
 
 while (*sizep >= 8) {
@@ -387,7 +387,7 @@ parse_icmpv6(const void **datap, size_t *sizep, const 
struct icmp6_hdr *icmp,
 int opt_len = nd_opt->nd_opt_len * 8;
 
 if (!opt_len || opt_len > *sizep) {
-goto invalid;
+return;
 }
 
 /* Store the link layer address if the appropriate option is
@@ -410,15 +410,18 @@ parse_icmpv6(const void **datap, size_t *sizep, const 
struct icmp6_hdr *icmp,
 }
 
 if (OVS_UNLIKELY(!data_try_pull(datap, sizep, opt_len))) {
-goto invalid;
+return;
 }
 }
 }
 
-return true;
+return;
 
 invalid:
-return false;
+*nd_target = NULL;
+memset(arp_buf[0], 0, ETH_ADDR_LEN);
+memset(arp_buf[1], 0, ETH_ADDR_LEN);
+return;
 }
 
 /* Initializes 'flow' members from 'packet' and 'md'
@@ -764,18 +767,16 @@ miniflow_extract(struct dp_packet *packet, struct 
miniflow *dst)
 const struct icmp6_hdr *icmp = data_pull(&data, &size,
  sizeof *icmp);
 memset(arp_buf, 0, sizeof arp_buf);
-if (OVS_LIKELY(parse_icmpv6(&data, &size, icmp, &nd_target,
-arp_buf))) {
-if (nd_target) {
-miniflow_push_words(mf, nd_target, nd_target,
-sizeof *nd_target / 8);
-}
-miniflow_push_macs(mf, arp_sha, arp_buf);
-miniflow_pad_to_64(mf, tcp_flags);
-miniflow_push_be16(mf, tp_src, htons(icmp->icmp6_type));
-miniflow_push_be16(mf, tp_dst, htons(icmp->icmp6_code));
-miniflow_pad_to_64(mf, igmp_group_ip4);
+parse_icmpv6(&data, &size, icmp, &nd_target, arp_buf);
+if (nd_target) {
+miniflow_push_words(mf, nd_target, nd_target,
+sizeof *nd_target / 8);
 }
+miniflow_push_macs(mf, arp_sha, arp_buf);
+miniflow_pad_to_64(mf, tcp_flags);
+miniflow_push_be16(mf, tp_src, htons(icmp->icmp6_type));
+miniflow_push_be16(mf, tp_dst, htons(icmp->icmp6_code));
+miniflow_pad_to_64(mf, igmp_group_ip4);
 }
 }
 }
-- 
2.1.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] db-ctl-base: Allow print rows that weak reference to table in 'cmd_show_table'.

2015-08-13 Thread Alex Wang
Adopted all suggestions and fixed unittest,

Applied to master~

Thanks,
Alex Wang,

On Thu, Aug 13, 2015 at 12:17 AM, Alex Wang  wrote:

> Hey Justin,
>
> Thx for the responsive review~
>
> On Wed, Aug 12, 2015 at 11:58 PM, Justin Pettit 
> wrote:
>
>> I should have added that if you use the attached patch, you'll need to
>> update the ovn-sbctl unit test you were nice enough to include in this
>> patch.
>>
>> --Justin
>>
>>
>> > On Aug 12, 2015, at 11:52 PM, Justin Pettit  wrote:
>> >
>> >
>> >> On Aug 12, 2015, at 9:49 PM, Alex Wang  wrote:
>> >>
>> >> diff --git a/lib/db-ctl-base.c b/lib/db-ctl-base.c
>> >> index 3028c09..cbafff1 100644
>> >> --- a/lib/db-ctl-base.c
>> >> +++ b/lib/db-ctl-base.c
>> >> @@ -1620,6 +1620,12 @@ pre_cmd_show(struct ctl_context *ctx)
>> >>ovsdb_idl_add_column(ctx->idl, column);
>> >>}
>> >>}
>> >> +if (show->wref_table.name_column) {
>> >> +ovsdb_idl_add_column(ctx->idl,
>> show->wref_table.name_column);
>> >> +}
>> >> +if (show->wref_table.wref_column) {
>> >> +ovsdb_idl_add_column(ctx->idl,
>> show->wref_table.wref_column);
>> >> +}
>> >
>> > Do you need to do a ovsdb_idl_add_table() on 'wref_table.table'?
>> >
>>
>
> Yeah, I should do that,
>
>
>> >> +/*  Prints table entries that weak reference the 'cur_row'. */
>> >> +static void
>> >> +cmd_show_weak_ref(struct ctl_context *ctx, const struct
>> cmd_show_table *show,
>> >> +  const struct ovsdb_idl_row *cur_row, int level)
>> >> +{
>> >> +const struct ovsdb_idl_row *row_wref;
>> >> +const struct ovsdb_idl_table_class *table =
>> show->wref_table.table;
>> >> +const struct ovsdb_idl_column *name_column
>> >> += show->wref_table.name_column;
>> >> +const struct ovsdb_idl_column *wref_column
>> >> += show->wref_table.wref_column;
>> >> +
>> >> +if (!table || !name_column || !wref_column) {
>> >> +return;
>> >> +}
>> >> +
>> >> +ds_put_char_multiple(&ctx->output, ' ', (level + 1) * 4);
>> >> +ds_put_format(&ctx->output, "%s", table->name);
>> >> +ds_put_char(&ctx->output, '\n');
>> >> +
>> >> +for (row_wref = ovsdb_idl_first_row(ctx->idl, table); row_wref;
>> >> + row_wref = ovsdb_idl_next_row(row_wref)) {
>> >> +const struct ovsdb_datum *wref_datum
>> >> += ovsdb_idl_read(row_wref, wref_column);
>> >> +/* If weak reference refers to the 'cur_row', prints it. */
>> >> +if (wref_datum->n
>> >> +&& uuid_equals(&cur_row->uuid,
>> &wref_datum->keys[0].uuid)) {
>> >> +const struct ovsdb_datum *name_datum
>> >> += ovsdb_idl_read(row_wref, name_column);
>> >> +
>> >> +ds_put_char_multiple(&ctx->output, ' ', (level + 2) * 4);
>> >> +ds_put_format(&ctx->output, "%s: ", name_column->name);
>> >> +ovsdb_datum_to_string(name_datum, &name_column->type,
>> &ctx->output);
>> >> +ds_put_char(&ctx->output, '\n');
>> >> +}
>> >> +}
>> >> +}
>> >
>> > I think this causes the "ovn-controller-vtep" test case to fail due to
>> the text printing before the "for" loop; that gets printed regardless of
>> whether there are any matches or not.  I originally wrote a patch to just
>> not print that preamble if there were no matches.  However, I think the
>> attached patch might be more consistent with commands like "ovs-vsctl show"
>> which print the root name for a row on the same line as the table name.  It
>> also passes the unit tests.
>> >
>>
>
> Sorry forget to run all tests,
>
>
>
>> > For example, here's the output of the original patch:
>> >
>> > -=-=-=-=-=-=-=-=-=-=-
>> > Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
>> >Encap geneve
>> >ip: "127.0.0.1"
>> >Port_Binding
>> >logical_port: "sw0-port0"
>> >logical_port: "sw0-port1"
>> > -=-=-=-=-=-=-=-=-=-=-
>> >
>> > Here it is with the attached patch:
>> >
>> > -=-=-=-=-=-=-=-=-=-=-
>> > Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
>> >Encap geneve
>> >ip: "127.0.0.1"
>> >Port_Binding "sw0-port0"
>> >Port_Binding "sw0-port1"
>> > -=-=-=-=-=-=-=-=-=-=-
>> >
>> > Does that seem reasonable?  If we later want to add other columns from
>> Port_Binding, they can then be placed under each entry.
>>
>
> Yeah, totally makes sense,
>
> >
>> >> @@ -157,11 +168,16 @@ struct ctl_command *ctl_parse_commands(int argc,
>> char *argv[],
>> >> *
>> >> * - 'columns[]' allows user to specify the print of additional columns
>> >> *   in 'table'.
>> >> + *
>> >> + * - if 'wref_table' is filled, print the 'wref_table.table' rows
>> that refer
>> >> + *   to the 'table'.
>> >
>> > What do you think of this phrasing instead?  I think it's a bit more
>> descriptive.
>> >
>> > * - if 'wref_table' is populated, print 'wref_table.name_column' for
>> > *   each row in table 'wref_table.table' that has a reference to 'table'
>> > *   in 'wref_table.wref_column'

Re: [ovs-dev] [PATCH] db-ctl-base: Allow print rows that weak reference to table in 'cmd_show_table'.

2015-08-13 Thread Justin Pettit
Awesome!  I tested it, and it looks good.  Thanks so much.

--Justin


> On Aug 13, 2015, at 12:55 AM, Alex Wang  wrote:
> 
> Adopted all suggestions and fixed unittest,
> 
> Applied to master~
> 
> Thanks,
> Alex Wang,
> 
> On Thu, Aug 13, 2015 at 12:17 AM, Alex Wang  wrote:
> Hey Justin,
> 
> Thx for the responsive review~
> 
> On Wed, Aug 12, 2015 at 11:58 PM, Justin Pettit  wrote:
> I should have added that if you use the attached patch, you'll need to update 
> the ovn-sbctl unit test you were nice enough to include in this patch.
> 
> --Justin
> 
> 
> > On Aug 12, 2015, at 11:52 PM, Justin Pettit  wrote:
> >
> >
> >> On Aug 12, 2015, at 9:49 PM, Alex Wang  wrote:
> >>
> >> diff --git a/lib/db-ctl-base.c b/lib/db-ctl-base.c
> >> index 3028c09..cbafff1 100644
> >> --- a/lib/db-ctl-base.c
> >> +++ b/lib/db-ctl-base.c
> >> @@ -1620,6 +1620,12 @@ pre_cmd_show(struct ctl_context *ctx)
> >>ovsdb_idl_add_column(ctx->idl, column);
> >>}
> >>}
> >> +if (show->wref_table.name_column) {
> >> +ovsdb_idl_add_column(ctx->idl, show->wref_table.name_column);
> >> +}
> >> +if (show->wref_table.wref_column) {
> >> +ovsdb_idl_add_column(ctx->idl, show->wref_table.wref_column);
> >> +}
> >
> > Do you need to do a ovsdb_idl_add_table() on 'wref_table.table'?
> >
> 
> Yeah, I should do that,
>  
> >> +/*  Prints table entries that weak reference the 'cur_row'. */
> >> +static void
> >> +cmd_show_weak_ref(struct ctl_context *ctx, const struct cmd_show_table 
> >> *show,
> >> +  const struct ovsdb_idl_row *cur_row, int level)
> >> +{
> >> +const struct ovsdb_idl_row *row_wref;
> >> +const struct ovsdb_idl_table_class *table = show->wref_table.table;
> >> +const struct ovsdb_idl_column *name_column
> >> += show->wref_table.name_column;
> >> +const struct ovsdb_idl_column *wref_column
> >> += show->wref_table.wref_column;
> >> +
> >> +if (!table || !name_column || !wref_column) {
> >> +return;
> >> +}
> >> +
> >> +ds_put_char_multiple(&ctx->output, ' ', (level + 1) * 4);
> >> +ds_put_format(&ctx->output, "%s", table->name);
> >> +ds_put_char(&ctx->output, '\n');
> >> +
> >> +for (row_wref = ovsdb_idl_first_row(ctx->idl, table); row_wref;
> >> + row_wref = ovsdb_idl_next_row(row_wref)) {
> >> +const struct ovsdb_datum *wref_datum
> >> += ovsdb_idl_read(row_wref, wref_column);
> >> +/* If weak reference refers to the 'cur_row', prints it. */
> >> +if (wref_datum->n
> >> +&& uuid_equals(&cur_row->uuid, &wref_datum->keys[0].uuid)) {
> >> +const struct ovsdb_datum *name_datum
> >> += ovsdb_idl_read(row_wref, name_column);
> >> +
> >> +ds_put_char_multiple(&ctx->output, ' ', (level + 2) * 4);
> >> +ds_put_format(&ctx->output, "%s: ", name_column->name);
> >> +ovsdb_datum_to_string(name_datum, &name_column->type, 
> >> &ctx->output);
> >> +ds_put_char(&ctx->output, '\n');
> >> +}
> >> +}
> >> +}
> >
> > I think this causes the "ovn-controller-vtep" test case to fail due to the 
> > text printing before the "for" loop; that gets printed regardless of 
> > whether there are any matches or not.  I originally wrote a patch to just 
> > not print that preamble if there were no matches.  However, I think the 
> > attached patch might be more consistent with commands like "ovs-vsctl show" 
> > which print the root name for a row on the same line as the table name.  It 
> > also passes the unit tests.
> >
> 
> Sorry forget to run all tests,
> 
>  
> > For example, here's the output of the original patch:
> >
> > -=-=-=-=-=-=-=-=-=-=-
> > Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
> >Encap geneve
> >ip: "127.0.0.1"
> >Port_Binding
> >logical_port: "sw0-port0"
> >logical_port: "sw0-port1"
> > -=-=-=-=-=-=-=-=-=-=-
> >
> > Here it is with the attached patch:
> >
> > -=-=-=-=-=-=-=-=-=-=-
> > Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
> >Encap geneve
> >ip: "127.0.0.1"
> >Port_Binding "sw0-port0"
> >Port_Binding "sw0-port1"
> > -=-=-=-=-=-=-=-=-=-=-
> >
> > Does that seem reasonable?  If we later want to add other columns from 
> > Port_Binding, they can then be placed under each entry.
> 
> Yeah, totally makes sense,
> 
> >
> >> @@ -157,11 +168,16 @@ struct ctl_command *ctl_parse_commands(int argc, 
> >> char *argv[],
> >> *
> >> * - 'columns[]' allows user to specify the print of additional columns
> >> *   in 'table'.
> >> + *
> >> + * - if 'wref_table' is filled, print the 'wref_table.table' rows that 
> >> refer
> >> + *   to the 'table'.
> >
> > What do you think of this phrasing instead?  I think it's a bit more 
> > descriptive.
> >
> > * - if 'wref_table' is populated, print 'wref_table.name_column' for
> > *   each row in table 'wref_table.table' that has a reference

Re: [ovs-dev] [PATCH 1/2] dpif-netdev: Purge all ukeys when reconfigure pmd.

2015-08-13 Thread Ilya Maximets
Alex,
I have lost at least 32 packets while testing this patch yesterday(~15 
configuration changes).
Is it bad? What is worse: loosing of ~1-2 bursts of packets or a little stats 
about threads?

Best regards, Ilya Maximets.

On 12.08.2015 23:09, Alex Wang wrote:
> 
> 
> On Wed, Aug 12, 2015 at 1:10 AM, Ilya Maximets  > wrote:
> 
> Ok, but why don't just put dp->dp_purge_cb(dp->dp_purge_aux) right after
> dp_netdev_destroy_all_pmds(dp) in dpif_netdev_pmd_set() ? In that case,
> disabling of upcalls will not be necessary.
> 
> 
> Hey Ilya,
> 
> Calling it before 'dp_netdev_destroy_all_pmds()' can also allow the update
> of 'dpif-netdev' flows' stats before they are gone.
> 
> Also, since the pmd thread recreation change is deemed big configuration
> change and enable/disable upcall is only one locking operation each, so I
> think it will not matter very much.  Would like to hear your concerns,
> 
> Thanks,
> Alex Wang,
> 
> 
>  
> 
> P.S. The second letter of my name is 'L'.
> 
> 
> Apology~  for keeping using it,
> 
> 
> 
>  
> 
> On 11.08.2015 21:06, Alex Wang wrote:
> > Thx, IIya for the review, please see my reply below,
> >
> >
> >
> > On Tue, Aug 11, 2015 at 4:41 AM, Ilya Maximets    >> wrote:
> >
> > On 11.08.2015 05:47, Alex Wang wrote:
> > > When dpdk configuration changes, all pmd threads are recreated
> > > and rx queues of each port are reloaded.  After this process,
> > > rx queue could be mapped to a different pmd thread other than
> > > the one before reconfiguration.  However, this is totally
> > > transparent to ofproto layer modules.  So, if the 
> ofproto-dpif-upcall
> > > module still holds ukeys generated before pmd thread recreation,
> > > this old ukey will collide with the ukey for the new upcalls
> > > from same traffic flow, causing flow installation failure.
> > >
> > > To fix the bug, this commit adds a new call-back function
> > > in dpif layer for notifying upper layer the purging of datapath
> > > (e.g. pmd threads recreation in dpif-netdev).  So, the
> > > ofproto-dpif-upcall module can react properly with deleting
> > > all the ukeys and with collecting all flows' last stats.
> > >
> > > Reported-by: Ilya Maximets    >>
> > > Signed-off-by: Alex Wang   >>
> > > ---
> > >  lib/dpif-netdev.c |   25 +
> > >  lib/dpif-netlink.c|1 +
> > >  lib/dpif-provider.h   |   11 ++-
> > >  lib/dpif.c|8 
> > >  lib/dpif.h|   12 
> > >  ofproto/ofproto-dpif-upcall.c |   16 
> > >  6 files changed, 72 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > > index c144352..a54853f 100644
> > > --- a/lib/dpif-netdev.c
> > > +++ b/lib/dpif-netdev.c
> > > @@ -203,6 +203,11 @@ struct dp_netdev {
> > >  upcall_callback *upcall_cb;  /* Callback function for 
> executing upcalls. */
> > >  void *upcall_aux;
> > >
> > > +/* Callback function for notifying the purging of dp flows 
> (during
> > > + * reseting pmd threads). */
> > > +dp_purge_callback *dp_purge_cb;
> > > +void *dp_purge_aux;
> > > +
> > >  /* Stores all 'struct dp_netdev_pmd_thread's. */
> > >  struct cmap poll_threads;
> > >
> > > @@ -223,6 +228,8 @@ struct dp_netdev {
> > >
> > >  static struct dp_netdev_port *dp_netdev_lookup_port(const struct 
> dp_netdev *dp,
> > >  odp_port_t);
> > > +static void dp_netdev_disable_upcall(struct dp_netdev *);
> > > +static void dp_netdev_enable_upcall(struct dp_netdev *);
> > >
> > >  enum dp_stat_type {
> > >  DP_STAT_EXACT_HIT,  /* Packets that had an exact 
> match (emc). */
> > > @@ -2440,10 +2447,18 @@ dpif_netdev_pmd_set(struct dpif *dpif, 
> unsigned int n_rxqs, const char *cmask)
> > >  free(dp->pmd_cmask);
> > >  dp->pmd_cmask = cmask ? xstrdup(cmask) : NULL;
> > >
> > > +/* Disables upcall and notifies higher layer about the 
> incoming
> > > + * datapath purging. */
> > > +dp_netdev_disable_upcall(dp);
> > > +dp->dp_purge_cb(dp->dp_purge_aux);
> > >

[ovs-dev] Rate limiting on unknown unicast packet

2015-08-13 Thread abhishek.labh
Hi All,

I want to apply rate limiting on unknown unicast packet. When destination 
address is not found in the mac table flooding will occurs in ovs. Can we apply 
any policy/rule to rate limit the traffic for unknown packet/traffic.

Thanks & Regards,
Abhishek
The information contained in this electronic message and any attachments to 
this message are intended for the exclusive use of the addressee(s) and may 
contain proprietary, confidential or privileged information. If you are not the 
intended recipient, you should not disseminate, distribute or copy this e-mail. 
Please notify the sender immediately and destroy all copies of this message and 
any attachments. WARNING: Computer viruses can be transmitted via email. The 
recipient should check this email and any attachments for the presence of 
viruses. The company accepts no liability for any damage caused by any virus 
transmitted by this email. www.wipro.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] dpif-netdev: proper tx queue id

2015-08-13 Thread Ilya Maximets
ping.

On 06.08.2015 17:32, Ilya Maximets wrote:
> Currently tx_qid is equal to pmd->core_id. This leads to unexpected
> behavior if pmd-cpu-mask different from '/(0*)(1|3|7)?(f*)/',
> e.g. if core_ids are not sequential, or doesn't start from 0, or both.
> 
> Example:
>   starting 2 pmd threads with 1 port, 2 rxqs per port,
>   pmd-cpu-mask = 0014 and let dev->real_n_txq = 2
> 
>   It that case pmd_1->tx_qid = 2, pmd_2->tx_qid = 4 and
>   txq_needs_locking = true (if device hasn't ovs_numa_get_n_cores()+1
>   queues).
> 
>   In that case, after truncating in netdev_dpdk_send__():
>   'qid = qid % dev->real_n_txq;'
>   pmd_1: qid = 2 % 2 = 0
>   pmd_2: qid = 4 % 2 = 0
> 
>   So, both threads will call dpdk_queue_pkts() with same qid = 0.
>   This is unexpected behavior if there is 2 tx queues in device.
>   Queue #1 will not be used and both threads will lock queue #0
>   on each send.
> 
> Fix that by introducing per pmd thread hash map 'tx_queues', where will
> be stored all available tx queues for that pmd thread with
> port_no as a key(hash). All tx_qid-s will be unique per port and
> sequential to prevent described unexpected mapping after truncating.
> 
> Implemented infrastructure can be used in the future to choose
> between all tx queues available for that pmd thread.
> 
> Signed-off-by: Ilya Maximets 
> ---
>  lib/dpif-netdev.c | 110 
> --
>  1 file changed, 91 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index c144352..309994c 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -365,6 +365,13 @@ struct dp_netdev_pmd_cycles {
>  atomic_ullong n[PMD_N_CYCLES];
>  };
>  
> +struct dp_netdev_pmd_txq {
> +struct cmap_node node;/* In owning dp_netdev_pmd_thread's */
> +  /* 'tx_queues'. */
> +struct dp_netdev_port *port;
> +int tx_qid;
> +};
> +
>  /* PMD: Poll modes drivers.  PMD accesses devices via polling to eliminate
>   * the performance overhead of interrupt processing.  Therefore netdev can
>   * not implement rx-wait for these devices.  dpif-netdev needs to poll
> @@ -420,8 +427,8 @@ struct dp_netdev_pmd_thread {
>  /* threads on same numa node. */
>  unsigned core_id;   /* CPU core id of this pmd thread. */
>  int numa_id;/* numa node id of this pmd thread. */
> -int tx_qid; /* Queue id used by this pmd thread to
> - * send packets on all netdevs */
> +struct cmap tx_queues;  /* Queue ids used by this pmd thread to
> + * send packets to ports */
>  
>  /* Only a pmd thread can write on its own 'cycles' and 'stats'.
>   * The main thread keeps 'stats_zero' and 'cycles_zero' as base
> @@ -2602,6 +2609,46 @@ dpif_netdev_wait(struct dpif *dpif)
>  seq_wait(tnl_conf_seq, dp->last_tnl_conf_seq);
>  }
>  
> +static void
> +pmd_flush_tx_queues(struct dp_netdev_pmd_thread *pmd)
> +{
> +struct dp_netdev_pmd_txq *txq;
> +
> +CMAP_FOR_EACH (txq, node, &pmd->tx_queues) {
> +cmap_remove(&pmd->tx_queues, &txq->node,
> +hash_port_no(txq->port->port_no));
> +port_unref(txq->port);
> +free(txq);
> +}
> +}
> +
> +static void OVS_UNUSED
> +pmd_tx_queues_print(struct dp_netdev_pmd_thread *pmd)
> +{
> +struct dp_netdev_pmd_txq *txq;
> +
> +CMAP_FOR_EACH (txq, node, &pmd->tx_queues) {
> +VLOG_INFO("Core_id: %d, Port: %s, tx_qid: %d\n",
> +   pmd->core_id, netdev_get_name(txq->port->netdev),
> +   txq->tx_qid);
> +}
> +}
> +
> +static struct dp_netdev_pmd_txq *
> +pmd_lookup_txq(const struct dp_netdev_pmd_thread *pmd,
> +   const struct dp_netdev_port *port)
> +{
> +struct dp_netdev_pmd_txq *txq;
> +
> +CMAP_FOR_EACH_WITH_HASH (txq, node, hash_port_no(port->port_no),
> + &pmd->tx_queues) {
> +if (txq->port == port) {
> +return txq;
> +}
> +}
> +return NULL;
> +}
> +
>  struct rxq_poll {
>  struct dp_netdev_port *port;
>  struct netdev_rxq *rx;
> @@ -2613,16 +2660,20 @@ pmd_load_queues(struct dp_netdev_pmd_thread *pmd,
>  {
>  struct rxq_poll *poll_list = *ppoll_list;
>  struct dp_netdev_port *port;
> -int n_pmds_on_numa, index, i;
> +struct dp_netdev_pmd_txq *txq;
> +int n_pmds_on_numa, rx_index, tx_index, i;
>  
>  /* Simple scheduler for netdev rx polling. */
> +pmd_flush_tx_queues(pmd);
> +
>  for (i = 0; i < poll_cnt; i++) {
>  port_unref(poll_list[i].port);
>  }
>  
>  poll_cnt = 0;
>  n_pmds_on_numa = get_n_pmd_threads_on_numa(pmd->dp, pmd->numa_id);
> -index = 0;
> +rx_index = 0;
> +tx_index = 0;
>  
>  CMAP_FOR_EACH (port, node,

Re: [ovs-dev] [PATCH RFC] coverage: fix coverage accounting for pmd threads

2015-08-13 Thread Ilya Maximets
Ok, thanks for explanation.
In that case, I think, try-lock solution will be good.

Best regards, Ilya Maximets.

On 12.08.2015 23:52, Alex Wang wrote:
> Hey Ilya,
> 
> Thanks for the reply please see my comments inline,
> 
> Thanks,
> Alex Wang,
> 
> On Wed, Aug 12, 2015 at 4:19 AM, Ilya Maximets  > wrote:
> 
> I agree, that this solution is much more simple, but I don't like this 
> race for
> coverage_mutex.
> Why this all done so?
> 
> 
> Before implementing pmd threads, waiting on the ‘coverage_mutex’ is 
> affordable for all other threads.
> 
> 
>  
> 
> I mean, why we should run coverage_clear every second? 
> 
> 
> Because, we (the main thread) need to calculate the per second/minute/hour 
> rate..
> 
>  
> 
> What if we change type of
> thread_local unsigned int counter_##COUNTER from unsigned int to unsigned 
> long long
> and will call coverage_clear() in coverage_read() ? (May be possible with 
> my patch
> applied) Also, we may collect stats for all threads separately.
> 
>  
> There is one major issue with you change to 'COVERAGE_DEFNE()'
> 
>>  /* Defines COUNTER.  There must be exactly one such definition at 
>> file scope
>>   * within a program. */
>>  #define COVERAGE_DEFINE(COUNTER)
>> \
>>  DEFINE_STATIC_PER_THREAD_DATA(unsigned int, 
>> \
>>counter_##COUNTER, 0);
>> \
>> -static unsigned int COUNTER##_count(void)   
>> \
>> +static unsigned int COUNTER##_count(unsigned int i) 
>> \
>>  {   
>> \
>> -unsigned int *countp = counter_##COUNTER##_get();   
>> \
>> +volatile unsigned int *countp = coverage_val[i];
>> \
>>  unsigned int count = *countp;   
>> \
>>  *countp = 0;
>> \
>>  return count;   
>> \
>> @@ -72,11 +77,17 @@ void coverage_counter_register(struct 
>> coverage_counter*);
>>  {   
>> \
>>  *counter_##COUNTER##_get() += n;
>> \
>>  }   
>> \
>> -extern struct coverage_counter counter_##COUNTER;   
>> \
>> -struct coverage_counter counter_##COUNTER   
>> \
>> +extern thread_local struct coverage_counter 
>> counter_##COUNTER;  \
>> +thread_local struct coverage_counter counter_##COUNTER  
>> \
>>  = { #COUNTER, COUNTER##_count, 0, 0, {0}, {0} };
>>
> 
> 
> Here a minor issue is that 'thread_local' is not portable.  You need to define
> it using the ovs wrappers documented in lib/ovs-thread.h
> 
> 
> 
>> -OVS_CONSTRUCTOR(COUNTER##_init) {   
>> \
>> +static void COUNTER##_initializer(void) {   
>> \
>>  coverage_counter_register(&counter_##COUNTER);  
>> \
>> +coverage_val[n_coverage_counters - 1] = 
>> \
>> +  counter_##COUNTER##_get();
>>\
>> +}   
>> \
>> +OVS_CONSTRUCTOR(COUNTER##_init) {   
>> \
>> +coverage_initializer_register(&COUNTER##_initializer);  
>> \
>> +COUNTER##_initializer();
>> \
>>  }
>>
> 
> We really should not have main thread accessing the per thread data of other
> thread.  As mentioned in the comments in lib/ovs-thread.h,
> 
> """
>  * - The thread_local feature newly defined in C11  works with
>  *   any data type and initializer, and it is fast.  thread_local does not
>  *   require once-only initialization like pthread_key_t. * C11 does not*
> * *   define what happens if one attempts to access a thread_local object*
> * *   from a thread other than the one to which that object belongs.*  
> There
>  *   is no provision to call a user-specified destructor when a thread
>  *   ends.  Typical implementations allow for an arbitrary amount of
>  *   thread_local storage, but statically allocated only.
> """
> 
> We really should not play with any undefined behavior.  (since has no way
> to guarantee the portability).  In other words, we really should not use 
> thread local vari

Re: [ovs-dev] [PATCH v4 1/2] ofproto-dpif-upcall: Make ukey actions modifiable with RCU.

2015-08-13 Thread Jarno Rajahalme
Acked-by: Jarno Rajahalme 

  Jarno

> On Aug 12, 2015, at 6:14 PM, Ethan J. Jackson  wrote:
> 
> From: Ethan Jackson 
> 
> Future patches will need to modify ukey actions in some instances.
> This patch makes this possible by protecting them with RCU.  It also
> adds thread safety checks to enforce the new protection mechanism.
> 
> Signed-off-by: Ethan J. Jackson 
> ---
> ofproto/ofproto-dpif-upcall.c | 42 +-
> 1 file changed, 33 insertions(+), 9 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index e604aa3..2151831 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -215,7 +215,6 @@ struct udpif_key {
> size_t key_len;/* Length of 'key'. */
> const struct nlattr *mask; /* Datapath flow mask. */
> size_t mask_len;   /* Length of 'mask'. */
> -struct ofpbuf *actions;/* Datapath flow actions as nlattrs. */
> ovs_u128 ufid; /* Unique flow identifier. */
> bool ufid_present; /* True if 'ufid' is in datapath. */
> uint32_t hash; /* Pre-computed hash for 'key'. */
> @@ -228,6 +227,9 @@ struct udpif_key {
> uint64_t reval_seq OVS_GUARDED;   /* Tracks udpif->reval_seq. */
> bool flow_exists OVS_GUARDED; /* Ensures flows are only 
> deleted
>  once. */
> +/* Datapath flow actions as nlattrs.  Protected by RCU.  Read with
> + * ukey_get_actions(), and write with ukey_set_actions(). */
> +OVSRCU_TYPE(struct ofpbuf *) actions;
> 
> struct xlate_cache *xcache OVS_GUARDED;   /* Cache for xlate entries that
>* are affected by this ukey.
> @@ -287,6 +289,8 @@ static struct udpif_key *ukey_create_from_upcall(struct 
> upcall *,
> static int ukey_create_from_dpif_flow(const struct udpif *,
>   const struct dpif_flow *,
>   struct udpif_key **);
> +static void ukey_get_actions(struct udpif_key *, const struct nlattr 
> **actions,
> + size_t *size);
> static bool ukey_install_start(struct udpif *, struct udpif_key *ukey);
> static bool ukey_install_finish(struct udpif_key *ukey, int error);
> static bool ukey_install(struct udpif *udpif, struct udpif_key *ukey);
> @@ -1131,7 +1135,7 @@ process_upcall(struct udpif *udpif, struct upcall 
> *upcall,
> if (upcall->sflow) {
> union user_action_cookie cookie;
> const struct nlattr *actions;
> -int actions_len = 0;
> +size_t actions_len = 0;
> struct dpif_sflow_actions sflow_actions;
> memset(&sflow_actions, 0, sizeof sflow_actions);
> memset(&cookie, 0, sizeof cookie);
> @@ -1149,8 +1153,7 @@ process_upcall(struct udpif *udpif, struct upcall 
> *upcall,
> /* Lookup actions in userspace cache. */
> struct udpif_key *ukey = ukey_lookup(udpif, upcall->ufid);
> if (ukey) {
> -actions = ukey->actions->data;
> -actions_len = ukey->actions->size;
> +ukey_get_actions(ukey, &actions, &actions_len);
> dpif_sflow_read_actions(flow, actions, actions_len,
> &sflow_actions);
> }
> @@ -1277,8 +1280,8 @@ handle_upcalls(struct udpif *udpif, struct upcall 
> *upcalls,
> op->dop.u.flow_put.mask_len = ukey->mask_len;
> op->dop.u.flow_put.ufid = upcall->ufid;
> op->dop.u.flow_put.stats = NULL;
> -op->dop.u.flow_put.actions = ukey->actions->data;
> -op->dop.u.flow_put.actions_len = ukey->actions->size;
> +ukey_get_actions(ukey, &op->dop.u.flow_put.actions,
> + &op->dop.u.flow_put.actions_len);
> }
> 
> if (upcall->odp_actions.size) {
> @@ -1344,6 +1347,24 @@ ukey_lookup(struct udpif *udpif, const ovs_u128 *ufid)
> return NULL;
> }
> 
> +/* Provides safe lockless access of RCU protected 'ukey->actions'.  Callers 
> may
> + * alternatively access the field directly if they take 'ukey->mutex'. */
> +static void
> +ukey_get_actions(struct udpif_key *ukey, const struct nlattr **actions, 
> size_t *size)
> +{
> +const struct ofpbuf *buf = ovsrcu_get(struct ofpbuf *, &ukey->actions);
> +*actions = buf->data;
> +*size = buf->size;
> +}
> +
> +static void
> +ukey_set_actions(struct udpif_key *ukey, const struct ofpbuf *actions)
> +{
> +ovsrcu_postpone(ofpbuf_delete,
> +ovsrcu_get_protected(struct ofpbuf *, &ukey->actions));
> +ovsrcu_set(&ukey->actions, ofpbuf_clone(actions));
> +}
> +
> static struct udpif_key *
> ukey_create__(const struct nlattr *key, size_t key_len,
>   const struct nlattr *mask, 

[ovs-dev] [PATCH] rhel: define rundir for older distros

2015-08-13 Thread Flavio Leitner
Some older distros might not define _rundir yet so in this
case the RPM build breaks.  This patch defines it to the
Fedora's default.

Signed-off-by: Flavio Leitner 
---
 rhel/openvswitch-fedora.spec.in | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in
index 48c34de..68810e2 100644
--- a/rhel/openvswitch-fedora.spec.in
+++ b/rhel/openvswitch-fedora.spec.in
@@ -16,6 +16,12 @@
 # Enable PIE, bz#955181
 %global _hardened_build 1
 
+# some distros (e.g: RHEL-7) don't define _rundir macro yet
+# Fedora 15 onwards uses /run as _rundir
+%if 0%{!?_rundir:1}
+%define _rundir /run
+%endif
+
 Name: openvswitch
 Summary: Open vSwitch
 Group: System Environment/Daemons
-- 
2.1.0

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] coverage: Add coverage_try_clear() for performance-critical threads.

2015-08-13 Thread Alex Wang
For performance-critical threads like pmd threads, we currently make them
never call coverage_clear() to avoid contention over the global mutex
'coverage_mutex'.  So, even though pmd thread still keeps updating their
thread-local coverage count, the count is never attributed to the global
total.  But it is useful to have them available.

This commit makes this happen by implementing a non-contending version
of the clear function, coverage_try_clear().  The function will use
the ovs_mutex_trylock() and return immediately if the mutex cannot
be acquired.  Since threads like pmd thread are always busy-looping,
the lock will eventually be acquired.

Requested-by: Ilya Maximets 
Signed-off-by: Alex Wang 
---
 lib/coverage.c|   33 +
 lib/coverage.h|1 +
 lib/dpif-netdev.c |2 ++
 3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/lib/coverage.c b/lib/coverage.c
index 6121956..9584cac 100644
--- a/lib/coverage.c
+++ b/lib/coverage.c
@@ -245,9 +245,14 @@ coverage_read(struct svec *lines)
 
 /* Runs approximately every COVERAGE_CLEAR_INTERVAL amount of time to
  * synchronize per-thread counters with global counters. Every thread maintains
- * a separate timer to ensure all counters are periodically aggregated. */
-void
-coverage_clear(void)
+ * a separate timer to ensure all counters are periodically aggregated.
+ *
+ * Uses 'ovs_mutex_trylock()' if 'trylock' is true.  This is to prevent
+ * multiple performance-critical threads contending over the 'coverage_mutex'.
+ *
+ * */
+static void
+coverage_clear__(bool trylock)
 {
 long long int now, *thread_time;
 
@@ -262,7 +267,15 @@ coverage_clear(void)
 if (now >= *thread_time) {
 size_t i;
 
-ovs_mutex_lock(&coverage_mutex);
+if (trylock) {
+/* Returns if cannot acquire lock. */
+if (ovs_mutex_trylock(&coverage_mutex)) {
+return;
+}
+} else {
+ovs_mutex_lock(&coverage_mutex);
+}
+
 for (i = 0; i < n_coverage_counters; i++) {
 struct coverage_counter *c = coverage_counters[i];
 c->total += c->count();
@@ -272,6 +285,18 @@ coverage_clear(void)
 }
 }
 
+void
+coverage_clear(void)
+{
+coverage_clear__(false);
+}
+
+void
+coverage_try_clear(void)
+{
+coverage_clear__(true);
+}
+
 /* Runs approximately every COVERAGE_RUN_INTERVAL amount of time to update the
  * coverage counters' 'min' and 'hr' array.  'min' array is for cumulating
  * per second counts into per minute count.  'hr' array is for cumulating per
diff --git a/lib/coverage.h b/lib/coverage.h
index 832c433..34a04aa 100644
--- a/lib/coverage.h
+++ b/lib/coverage.h
@@ -88,6 +88,7 @@ void coverage_counter_register(struct coverage_counter*);
 void coverage_init(void);
 void coverage_log(void);
 void coverage_clear(void);
+void coverage_try_clear(void);
 void coverage_run(void);
 
 #endif /* coverage.h */
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index c144352..f4ff8cb 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -42,6 +42,7 @@
 #include "fat-rwlock.h"
 #include "flow.h"
 #include "cmap.h"
+#include "coverage.h"
 #include "latch.h"
 #include "list.h"
 #include "match.h"
@@ -2696,6 +2697,7 @@ reload:
 lc = 0;
 
 emc_cache_slow_sweep(&pmd->flow_cache);
+coverage_try_clear();
 ovsrcu_quiesce();
 
 atomic_read_relaxed(&pmd->change_seq, &seq);
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC] coverage: fix coverage accounting for pmd threads

2015-08-13 Thread Alex Wang
On Thu, Aug 13, 2015 at 8:26 AM, Ilya Maximets 
wrote:

> Ok, thanks for explanation.
> In that case, I think, try-lock solution will be good.
>
> Best regards, Ilya Maximets.
>


Thx Ilya, for the suggestions and discussion

I posted a patch here:

http://openvswitch.org/pipermail/dev/2015-August/058855.html



> On 12.08.2015 23:52, Alex Wang wrote:
> > Hey Ilya,
> >
> > Thanks for the reply please see my comments inline,
> >
> > Thanks,
> > Alex Wang,
> >
> > On Wed, Aug 12, 2015 at 4:19 AM, Ilya Maximets  > wrote:
> >
> > I agree, that this solution is much more simple, but I don't like
> this race for
> > coverage_mutex.
> > Why this all done so?
> >
> >
> > Before implementing pmd threads, waiting on the ‘coverage_mutex’ is
> > affordable for all other threads.
> >
> >
> >
> >
> > I mean, why we should run coverage_clear every second?
> >
> >
> > Because, we (the main thread) need to calculate the per
> second/minute/hour
> > rate..
> >
> >
> >
> > What if we change type of
> > thread_local unsigned int counter_##COUNTER from unsigned int to
> unsigned long long
> > and will call coverage_clear() in coverage_read() ? (May be possible
> with my patch
> > applied) Also, we may collect stats for all threads separately.
> >
> >
> > There is one major issue with you change to 'COVERAGE_DEFNE()'
> >
> >>  /* Defines COUNTER.  There must be exactly one such definition
> at file scope
> >>   * within a program. */
> >>  #define COVERAGE_DEFINE(COUNTER)
>   \
> >>  DEFINE_STATIC_PER_THREAD_DATA(unsigned int,
>  \
> >>counter_##COUNTER, 0);
>   \
> >> -static unsigned int COUNTER##_count(void)
>  \
> >> +static unsigned int COUNTER##_count(unsigned int i)
>  \
> >>  {
>  \
> >> -unsigned int *countp = counter_##COUNTER##_get();
>  \
> >> +volatile unsigned int *countp = coverage_val[i];
>   \
> >>  unsigned int count = *countp;
>  \
> >>  *countp = 0;
>   \
> >>  return count;
>  \
> >> @@ -72,11 +77,17 @@ void coverage_counter_register(struct
> coverage_counter*);
> >>  {
>  \
> >>  *counter_##COUNTER##_get() += n;
>   \
> >>  }
>  \
> >> -extern struct coverage_counter counter_##COUNTER;
>  \
> >> -struct coverage_counter counter_##COUNTER
>  \
> >> +extern thread_local struct coverage_counter
> counter_##COUNTER;  \
> >> +thread_local struct coverage_counter
> counter_##COUNTER  \
> >>  = { #COUNTER, COUNTER##_count, 0, 0, {0}, {0} };
> >
> >
> > Here a minor issue is that 'thread_local' is not portable.  You need to
> define
> > it using the ovs wrappers documented in lib/ovs-thread.h
> >
> >
> >
> >> -OVS_CONSTRUCTOR(COUNTER##_init) {
>  \
> >> +static void COUNTER##_initializer(void) {
>  \
> >>  coverage_counter_register(&counter_##COUNTER);
>   \
> >> +coverage_val[n_coverage_counters - 1] =
>  \
> >> +
> counter_##COUNTER##_get();   \
> >> +}
>  \
> >> +OVS_CONSTRUCTOR(COUNTER##_init) {
>  \
> >> +
> coverage_initializer_register(&COUNTER##_initializer);  \
> >> +COUNTER##_initializer();
>   \
> >>  }
> >>
> >
> > We really should not have main thread accessing the per thread data of
> other
> > thread.  As mentioned in the comments in lib/ovs-thread.h,
> >
> > """
> >  * - The thread_local feature newly defined in C11  works
> with
> >  *   any data type and initializer, and it is fast.  thread_local
> does not
> >  *   require once-only initialization like pthread_key_t. * C11 does
> not*
> > * *   define what happens if one attempts to access a thread_local
> object*
> > * *   from a thread other than the one to which that object
> belongs.*  There
> >  *   is no provision to call a user-specified destructor when a
> thread
> >  *   ends.  Typical implementations allow for an arbitrary amount of
> >  *   thread_local storage, but statically allocated only.
> > """
> >
> > We really should not play with any undefined behavior.  (since has no way
> > to guarantee the portability).  In other words, we really should not use
> > thread local variables, if we ever want to re-architect the coverage
> module
> > to have main thread taking care of collecting the stats.
> >
> > So, we think the try lock solution should do good now,
> >
> > What do you think?
> >
> > Thanks,
> > Alex Wang,
> >

[ovs-dev] [PATCH] rhel: add installed but not packaged OVN tools

2015-08-13 Thread Flavio Leitner
This patch adds the following to OVN %files:
   /usr/bin/ovn-controller-vtep
   /usr/bin/ovn-sbctl
   /usr/share/man/man8/ovn-controller-vtep.8.gz
   /usr/share/man/man8/ovn-sbctl.8.gz

Signed-off-by: Flavio Leitner 
---
 rhel/openvswitch-fedora.spec.in | 4 
 1 file changed, 4 insertions(+)

diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in
index 48c34de..5198dbe 100644
--- a/rhel/openvswitch-fedora.spec.in
+++ b/rhel/openvswitch-fedora.spec.in
@@ -317,16 +317,20 @@ rm -rf $RPM_BUILD_ROOT
 
 %files ovn
 %{_bindir}/ovn-controller
+%{_bindir}/ovn-controller-vtep
 %{_bindir}/ovn-nbctl
 %{_bindir}/ovn-northd
+%{_bindir}/ovn-sbctl
 %{_datadir}/openvswitch/scripts/ovn-ctl
 %{_mandir}/man8/ovs-testcontroller.8*
 %{_mandir}/man5/ovn-nb.5*
 %{_mandir}/man5/ovn-sb.5*
 %{_mandir}/man7/ovn-architecture.7*
 %{_mandir}/man8/ovn-controller.8*
+%{_mandir}/man8/ovn-controller-vtep.8*
 %{_mandir}/man8/ovn-ctl.8*
 %{_mandir}/man8/ovn-nbctl.8*
+%{_mandir}/man8/ovn-sbctl.8*
 %config %{_datadir}/openvswitch/ovn-nb.ovsschema
 %config %{_datadir}/openvswitch/ovn-sb.ovsschema
 %{_unitdir}/ovn-controller.service
-- 
2.1.0

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH/RFC v2] flow: Ignore invalid ICMPv6 fields when parsing packets

2015-08-13 Thread Jesse Gross
On Thu, Aug 13, 2015 at 12:55 AM, Simon Horman
 wrote:
> There is a miss-match between the handling of invalid ICMPv6 fields in the
> implementations of parse_icmpv6() in user-space and in the kernel datapath.
>
> This patch addresses that by modifying the user-space implementation to
> match that of the kernel datapath; processing is terminated without
> rather than with an error and partial information is cleared.
>
> With these changes the user-space implementation of parse_icmpv6()
> never returns an error. Accordingly the return type and caller have been
> updated.
>
> The original motivation for this is to allow matching the ICMPv6 type and
> code of packets with invalid neighbour discovery options although only the
> change around the '(!opt_len || opt_len > *sizep)' conditional is necessary
> to achieve that goal.
>
> Signed-off-by: Simon Horman 

Thanks, applied to master, branch-2.4, and branch-2.3.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] fix match issue for decap when the remote_ip=flow in userspace implementation

2015-08-13 Thread Mengke
From: mengke 

In the test the bridge is configured with type "netdev" and the VXLAN port is 
configured with "options: remote_ip=flow options: key=flow", the VXLAN packets 
can't be matched for the rule (ovs-ofctl add-flow br-int 
"priority=200,in_port=2,tun_src=200.2.0.101, ip, actions= drop").

After looking into the code, I found the reason is that when tunnel port is 
added, the match.wc.masks.nw_src argument is set "OVS_BE32_MAX" in 
"tnl_port_map_insert" function whether the "options: remote_ip" is set "flow" 
or constant IP_ADDR. This indicates the remote_ip is compared anyway, but the 
correct way is that the remote_ip should not be matched in 
"tnl_port_map_lookup" function when the "options: remote_ip" is set "flow".

The patch with unit test is as following:

---
 lib/tnl-ports.c |  3 ++-
 tests/ofproto-macros.at | 21 +
 tests/tunnel.at | 50 +
 3 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/lib/tnl-ports.c b/lib/tnl-ports.c
index a0a73c8..dc1ab01 100644
--- a/lib/tnl-ports.c
+++ b/lib/tnl-ports.c
@@ -97,7 +97,8 @@ tnl_port_map_insert(odp_port_t port, ovs_be32 ip_dst, 
ovs_be16 udp_port,
 match.wc.masks.nw_proto = 0xff;
 match.wc.masks.nw_frag = 0xff;  /* XXX: No fragments support. */
 match.wc.masks.tp_dst = OVS_BE16_MAX;
-match.wc.masks.nw_src = OVS_BE32_MAX;
+if(ip_dst)
+match.wc.masks.nw_src = OVS_BE32_MAX;
 
 cls_rule_init(&p->cr, &match, 0, CLS_MIN_VERSION); /* Priority == 0. */
 ovs_refcount_init(&p->ref_cnt);
diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
index 74b02b7..b08cd26 100644
--- a/tests/ofproto-macros.at
+++ b/tests/ofproto-macros.at
@@ -115,6 +115,27 @@ m4_define([OVS_VSWITCHD_START],
AT_CHECK([ovs-vsctl -- add-br br0 -- set bridge br0 datapath-type=dummy 
other-config:datapath-id=fedcba9876543210 other-config:hwaddr=aa:55:aa:55:00:00 
protocols=[[OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13,OpenFlow14,OpenFlow15]] 
fail-mode=secure -- $1 m4_if([$2], [], [], [| ${PERL} $srcdir/uuidfilt.pl])], 
[0], [$2])
 ])
 
+# OVS_VSWITCHD_START_NETDEV([vsctl-args], [vsctl-output], [=override])
+#
+# Creates a database and starts ovsdb-server, starts ovs-vswitchd
+# connected to that database, calls ovs-vsctl to create a bridge named
+# br0 with predictable settings, passing 'vsctl-args' as additional
+# commands to ovs-vsctl.  If 'vsctl-args' causes ovs-vsctl to provide
+# output (e.g. because it includes "create" commands) then 'vsctl-output'
+# specifies the expected output after filtering through uuidfilt.pl.
+#
+# If a test needs to use "system" devices (as dummies), then specify
+# =override (literally) as the third argument.  Otherwise, system devices
+# won't work at all (which makes sense because tests should not access a
+# system's real Ethernet devices).
+m4_define([OVS_VSWITCHD_START_NETDEV],
+  [_OVS_VSWITCHD_START([--disable-system])
+
+   dnl Add bridges, ports, etc.
+   AT_CHECK([ovs-vsctl -- add-br br0 -- set bridge br0 datapath-type=netdev 
protocols=[[OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13,OpenFlow14,OpenFlow15]] 
fail-mode=secure -- $1 m4_if([$2], [], [], [| ${PERL} $srcdir/uuidfilt.pl])], 
[0], [$2])
+])
+
+
 m4_divert_push([PREPARE_TESTS])
 check_logs () {
 sed -n "$1
diff --git a/tests/tunnel.at b/tests/tunnel.at
index 7ff1ba4..3bab497 100644
--- a/tests/tunnel.at
+++ b/tests/tunnel.at
@@ -411,3 +411,53 @@ AT_CHECK([tail -1 stdout], [0],
 ])
 OVS_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([userspace - tunnel-test for remote_ip=flow])
+OVS_VSWITCHD_START_NETDEV([dnl
+add-port br0 p1 -- set Interface p1 type=vxlan options:key=flow \
+options:remote_ip=flow ofport_request=1])
+
+AT_DATA([flows.txt], [dnl
+in_port=91 actions=local
+])
+AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
+AT_CHECK([ovs-appctl ofproto/trace br0 
'in_port=91,dl_src=50:54:00:00:00:05,dl_dst=50:54:00:00:00:07,dl_type=0x0800,nw_src=5.5.5.5,nw_dst=200.2.0.100,nw_proto=17,nw_tos=0,nw_ttl=128,udp_src=50031,udp_dst=4789'],
 [0], [stdout])
+
+AT_CHECK([tail -1 stdout], [0],
+  [Datapath actions: tnl_pop(1) 
+])
+AT_CHECK([ovs-vsctl del-br br0])
+OVS_VSWITCHD_STOP(["/The Open vSwitch kernel module is probably not loaded/d"])
+AT_CLEANUP
+
+AT_SETUP([userspace ofproto-dpif - set_field - tun_src/tun_dst/tun_id])
+OVS_VSWITCHD_START_NETDEV([dnl
+add-port br0 p1 -- set Interface p1 type=vxlan options:key=flow \
+options:remote_ip=1.1.1.1 ofport_request=1 \
+-- add-port br0 p2 -- set Interface p2 type=vxlan options:key=flow \
+options:remote_ip=flow ofport_request=2 \
+-- add-port br0 p3 -- set Interface p3 type=vxlan options:key=flow \
+options:remote_ip=flow options:local_ip=flow ofport_request=3 \
+-- add-port br0 p4 -- set Interface p4 type=vxlan options:key=3 \
+options:remote_ip=flow ofport_request=4 \
+-- add-port br0 p5 -- set Interface p5 type

Re: [ovs-dev] [PATCH] fix match issue for decap when the remote_ip=flow in userspace implementation

2015-08-13 Thread Jesse Gross
On Thu, Aug 13, 2015 at 6:19 PM, Mengke  wrote:
> From: mengke 
>
> In the test the bridge is configured with type "netdev" and the VXLAN port is 
> configured with "options: remote_ip=flow options: key=flow", the VXLAN 
> packets can't be matched for the rule (ovs-ofctl add-flow br-int 
> "priority=200,in_port=2,tun_src=200.2.0.101, ip, actions= drop").
>
> After looking into the code, I found the reason is that when tunnel port is 
> added, the match.wc.masks.nw_src argument is set "OVS_BE32_MAX" in 
> "tnl_port_map_insert" function whether the "options: remote_ip" is set "flow" 
> or constant IP_ADDR. This indicates the remote_ip is compared anyway, but the 
> correct way is that the remote_ip should not be matched in 
> "tnl_port_map_lookup" function when the "options: remote_ip" is set "flow".
>
> The patch with unit test is as following:

This has already been fixed. I guess you need to update your git repository?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] fix match issue for decap when the remote_ip=flow in userspace implementation

2015-08-13 Thread Liu, Mengke
OK , Thanks.

-Original Message-
From: Jesse Gross [mailto:je...@nicira.com] 
Sent: Friday, August 14, 2015 9:17 AM
To: Liu, Mengke
Cc: dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH] fix match issue for decap when the 
remote_ip=flow in userspace implementation

On Thu, Aug 13, 2015 at 6:19 PM, Mengke  wrote:
> From: mengke 
>
> In the test the bridge is configured with type "netdev" and the VXLAN port is 
> configured with "options: remote_ip=flow options: key=flow", the VXLAN 
> packets can't be matched for the rule (ovs-ofctl add-flow br-int 
> "priority=200,in_port=2,tun_src=200.2.0.101, ip, actions= drop").
>
> After looking into the code, I found the reason is that when tunnel port is 
> added, the match.wc.masks.nw_src argument is set "OVS_BE32_MAX" in 
> "tnl_port_map_insert" function whether the "options: remote_ip" is set "flow" 
> or constant IP_ADDR. This indicates the remote_ip is compared anyway, but the 
> correct way is that the remote_ip should not be matched in 
> "tnl_port_map_lookup" function when the "options: remote_ip" is set "flow".
>
> The patch with unit test is as following:

This has already been fixed. I guess you need to update your git repository?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH/RFC] openvswitch: Retain parsed IPv6 header fields in flow on error skipping extension headers

2015-08-13 Thread Simon Horman
When an error occurs skipping IPv6 extension headers retain the already
parsed IP protocol and IPv6 addresses in the flow. Also assume that the
packet is not a fragment in the absence of information to the contrary;
that is always use the frag_off value set by ipv6_skip_exthdr().

This allows matching on the IP protocol and IPv6 addresses of packets
with malformed extension headers.

Signed-off-by: Simon Horman 

---

* Some consideration should be given to unwanted side effects of this patch
  as it affects the handling of malformed packets.

Signed-off-by: Simon Horman 
---
 net/openvswitch/flow.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 8db22ef73626..3c5336c5d4ea 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -271,8 +271,6 @@ static int parse_ipv6hdr(struct sk_buff *skb, struct 
sw_flow_key *key)
key->ipv6.addr.dst = nh->daddr;
 
payload_ofs = ipv6_skip_exthdr(skb, payload_ofs, &nexthdr, &frag_off);
-   if (unlikely(payload_ofs < 0))
-   return -EINVAL;
 
if (frag_off) {
if (frag_off & htons(~0x7))
@@ -283,6 +281,13 @@ static int parse_ipv6hdr(struct sk_buff *skb, struct 
sw_flow_key *key)
key->ip.frag = OVS_FRAG_TYPE_NONE;
}
 
+   /* Delayed handling of error in ipv6_skip_exthdr() as it
+* always sets frag_off to a valid value which may be
+* used to set key->ip.frag above.
+*/
+   if (unlikely(payload_ofs < 0))
+   return -EINVAL;
+
nh_len = payload_ofs - nh_ofs;
skb_set_transport_header(skb, nh_ofs + nh_len);
key->ip.proto = nexthdr;
@@ -622,8 +627,6 @@ static int key_extract(struct sk_buff *skb, struct 
sw_flow_key *key)
 
nh_len = parse_ipv6hdr(skb, key);
if (unlikely(nh_len < 0)) {
-   memset(&key->ip, 0, sizeof(key->ip));
-   memset(&key->ipv6.addr, 0, sizeof(key->ipv6.addr));
if (nh_len == -EINVAL) {
skb->transport_header = skb->network_header;
error = 0;
-- 
2.1.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] tun-metadata: Always set option present when copying data.

2015-08-13 Thread Jesse Gross
Whenever we write into a tunnel option field, we also need to mark
it as significant. If we don't, then the data will later be ignored.

We currently do this in every case except for flow metadata. This causes
us to not correctly serialize the tunnel metadata for Packet Ins to the
controller.

Rather than separately writing the data and marking the options as present,
it is better to combine the two steps to ensure that one can never be
done without the other.

Signed-off-by: Jesse Gross 
---
 lib/tun-metadata.c   | 27 ++-
 tests/tunnel-push-pop.at | 16 +---
 2 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c
index 216d5e4..acaacff 100644
--- a/lib/tun-metadata.c
+++ b/lib/tun-metadata.c
@@ -60,7 +60,8 @@ static enum ofperr tun_metadata_add_entry(struct tun_table 
*map, uint8_t idx,
 static void tun_metadata_del_entry(struct tun_table *map, uint8_t idx)
 OVS_REQUIRES(tab_mutex);
 static void memcpy_to_metadata(struct tun_metadata *dst, const void *src,
-   const struct tun_metadata_loc *);
+   const struct tun_metadata_loc *,
+   unsigned int idx);
 static void memcpy_from_metadata(void *dst, const struct tun_metadata *src,
  const struct tun_metadata_loc *);
 
@@ -273,10 +274,8 @@ tun_metadata_write(struct flow_tnl *tnl,
 }
 
 loc = &map->entries[idx].loc;
-
-ULLONG_SET1(tnl->metadata.present.map, idx);
 memcpy_to_metadata(&tnl->metadata,
-   value->tun_metadata + mf->n_bytes - loc->len, loc);
+   value->tun_metadata + mf->n_bytes - loc->len, loc, idx);
 }
 
 static const struct tun_metadata_loc *
@@ -355,8 +354,8 @@ tun_metadata_set_match(const struct mf_field *mf, const 
union mf_value *value,
mask->tun_metadata[data_offset + i];
 }
 }
-ULLONG_SET1(match->flow.tunnel.metadata.present.map, idx);
-memcpy_to_metadata(&match->flow.tunnel.metadata, data.tun_metadata, loc);
+memcpy_to_metadata(&match->flow.tunnel.metadata, data.tun_metadata,
+   loc, idx);
 
 if (!value) {
 memset(data.tun_metadata, 0, loc->len);
@@ -365,8 +364,8 @@ tun_metadata_set_match(const struct mf_field *mf, const 
union mf_value *value,
 } else {
 memcpy(data.tun_metadata, mask->tun_metadata + data_offset, loc->len);
 }
-ULLONG_SET1(match->wc.masks.tunnel.metadata.present.map, idx);
-memcpy_to_metadata(&match->wc.masks.tunnel.metadata, data.tun_metadata, 
loc);
+memcpy_to_metadata(&match->wc.masks.tunnel.metadata, data.tun_metadata,
+   loc, idx);
 }
 
 static bool
@@ -427,11 +426,11 @@ tun_metadata_get_fmd(const struct flow_tnl *tnl, struct 
match *flow_metadata)
 
 memcpy_from_metadata(opts.tun_metadata, &flow.metadata, old_loc);
 memcpy_to_metadata(&flow_metadata->flow.tunnel.metadata,
-   opts.tun_metadata, new_loc);
+   opts.tun_metadata, new_loc, i);
 
 memset(opts.tun_metadata, 0xff, old_loc->len);
 memcpy_to_metadata(&flow_metadata->wc.masks.tunnel.metadata,
-   opts.tun_metadata, new_loc);
+   opts.tun_metadata, new_loc, i);
 }
 }
 
@@ -456,7 +455,7 @@ tun_meta_find_key(const struct hmap *hmap, uint32_t key)
 
 static void
 memcpy_to_metadata(struct tun_metadata *dst, const void *src,
-   const struct tun_metadata_loc *loc)
+   const struct tun_metadata_loc *loc, unsigned int idx)
 {
 const struct tun_metadata_loc_chain *chain = &loc->c;
 int addr = 0;
@@ -467,6 +466,8 @@ memcpy_to_metadata(struct tun_metadata *dst, const void 
*src,
 addr += chain->len;
 chain = chain->next;
 }
+
+ULLONG_SET1(dst->present.map, idx);
 }
 
 static void
@@ -654,8 +655,8 @@ tun_metadata_from_geneve__(const struct tun_metadata 
*flow_metadata,
flow_opt->type));
 if (entry) {
 if (entry->loc.len == flow_opt->length * 4) {
-memcpy_to_metadata(metadata, opt + 1, &entry->loc);
-ULLONG_SET1(metadata->present.map, entry - map->entries);
+memcpy_to_metadata(metadata, opt + 1, &entry->loc,
+   entry - map->entries);
 } else {
 return EINVAL;
 }
diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
index 0f1724a..a34af3f 100644
--- a/tests/tunnel-push-pop.at
+++ b/tests/tunnel-push-pop.at
@@ -123,16 +123,26 @@ AT_CHECK([ovs-ofctl dump-ports int-br | grep 'port  3'], 
[0], [dnl
 ])
 
 dnl Check decapsulation of Geneve packet with options
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_CHECK([ovs-ofctl monitor int-br 65534 --detach --no-chdir --pidfile 2> 
ofctl_mon