Re: [ovs-dev] RFC: OVN database options
On 03/10/2016 06:50 PM, Ben Pfaff wrote: I've been a fan of Postgres since I used it in the 1990s for a web-based application. It didn't occur to me that it was appropriate here. Julien, thanks so much for joining the discussion.

So yes, it has everything OVN needs. It can push notifications to clients via the NOTIFY¹ command (which you can use in any procedure/trigger). For example, you could imagine creating a trigger that sends a JSON payload for each new update/insert in the database. That's literally 10 lines of PL/SQL.

That's good to know. I hadn't figured out how to do this kind of thing with SQL-based systems.

¹ http://www.postgresql.org/docs/9.5/static/sql-notify.html

I think that PostgreSQL would be the safer bet in this move, as:
- building something on top of etcd would seem weak w.r.t. your schema/table requirements
- investing in OVSDB (though keep in mind I don't know it :-) would probably end up in redoing a job the PostgreSQL people have already done better than you would ;-)

The only question that this raises for me is whether PostgreSQL is too large/complex to deploy for OVN. Seeing the list of candidates that were evaluated, I wouldn't think so, but there can be a lot of different opinions on that based on different perceptions of PostgreSQL. And since you're targeting a network DB, you definitely need a daemon configured and set up, so I'm only partially worried here. :)

Hi there, Russell Bryant invited me to this list to chime in on this discussion. If it were me, I *might* not build out based on NOTIFY as the core system for notifying clients; I'd likely stick with a tool that's designed for cluster communication, and in this case the custom service that's already there seems like it might be the best bet. I'd actually build out the service and use RAFT to keep it in sync with itself.

The reason is that PostgreSQL does not supply you with an easy out-of-the-box HA component in any case (Galera does, but then you don't get NOTIFY), so you're going to have to build out something like RAFT or similar on the PG side anyway in order to handle failover. PostgreSQL's HA story is not very good right now; it's very much roll-your-own, and it is nowhere near the sophistication of Galera's multi-master approach, which would be an enormous multi-year undertaking to recreate on PostgreSQL. IMO building out the HA part from scratch is the difficult part; being able to send events to clients is pretty easy from any kind of custom service. Since to do HA in PG you'd have to build your own event-dispatch system anyway (e.g. to determine that a node is down and send out the call to pick a new master node, as well as some method to get all the clients to send data updates to this node), you might as well just build your custom service to do exactly the thing you need.

___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
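[Editor's note] As an illustration of the NOTIFY mechanism discussed above: the trigger side is the "10 lines of PL/SQL" Julien mentions, and the consumer side is similarly small in C (the language OVN itself is written in). The sketch below is purely illustrative and assumes a hypothetical channel name "ovn_updates" and connection string; it is not part of any proposed design, only a demonstration that libpq's LISTEN plumbing is straightforward.

/* Minimal libpq LISTEN consumer sketch (illustrative only).  Assumes a
 * hypothetical channel "ovn_updates" that some trigger NOTIFYs with a JSON
 * payload; the channel name and connection string are placeholders. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/select.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn *conn = PQconnectdb("dbname=ovn_nb");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return EXIT_FAILURE;
    }

    PGresult *res = PQexec(conn, "LISTEN ovn_updates");
    if (PQresultStatus(res) != PGRES_COMMAND_OK) {
        fprintf(stderr, "LISTEN failed: %s", PQerrorMessage(conn));
        PQclear(res);
        PQfinish(conn);
        return EXIT_FAILURE;
    }
    PQclear(res);

    for (;;) {
        /* Sleep until the server socket becomes readable. */
        int sock = PQsocket(conn);
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(sock, &readable);
        if (select(sock + 1, &readable, NULL, NULL, NULL) < 0) {
            break;
        }

        PQconsumeInput(conn);
        PGnotify *note;
        while ((note = PQnotifies(conn)) != NULL) {
            /* note->extra carries the payload, e.g. a JSON document. */
            printf("channel=%s payload=%s\n", note->relname, note->extra);
            PQfreemem(note);
        }
    }

    PQfinish(conn);
    return EXIT_SUCCESS;
}

Note that this does not address the HA concerns raised later in the thread: LISTEN/NOTIFY only covers event delivery from a single live server, not failover.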
Re: [ovs-dev] [PATCH] netdev-linux: Don't restrict policing to IPv4 and don't call "tc".
I'm certainly very happy with a re-write: this seems like a much nicer way of doing things. -Mike. > -Original Message- > From: Justin Pettit [mailto:jpet...@nicira.com] > Sent: 05 December 2011 00:57 > To: dev@openvswitch.org > Cc: Mike Bursell; Jamal Hadi Salim > Subject: [PATCH] netdev-linux: Don't restrict policing to IPv4 and don't call > "tc". > > Mike Bursell pointed out that our policer only works on IPv4 traffic--and > specifically not IPv6. By using the "basic" filter, we can enforce policing > on all > traffic for a particular interface. > > Jamal Hadi Salim pointed out that calling "tc" directly with system() is > pretty > ugly. This commit switches our remaining "tc" calls to directly sending the > appropriate netlink messages. > > Suggested-by: Mike Bursell > Suggested-by: Jamal Hadi Salim > --- > AUTHORS|2 + > INSTALL.Linux |6 +- > lib/netdev-linux.c | 191 +++--- > - > 3 files changed, 136 insertions(+), 63 deletions(-) > > diff --git a/AUTHORS b/AUTHORS > index 6cf99da..964e32d 100644 > --- a/AUTHORS > +++ b/AUTHORS > @@ -78,6 +78,7 @@ Hassan Khan hassan.k...@seecs.edu.pk > Hector Oron hector.o...@gmail.com > Henrik Amrenhen...@nicira.com > Jad Naous jna...@gmail.com > +Jamal Hadi Salimh...@cyberus.ca > Jan Medved jmed...@juniper.net > Janis Hamme janis.ha...@student.kit.edu > Jari Sundellsundell.softw...@gmail.com > @@ -90,6 +91,7 @@ Krishna Miriyalakris...@nicira.com > Luiz Henrique Ozaki luiz.oz...@gmail.com > Michael Hu m...@nicira.com > Michael Mao m...@nicira.com > +Mike Bursellmike.burs...@citrix.com > Murphy McCauley murphy.mccau...@gmail.com > Mikael Doverhag mdover...@nicira.com > Niklas Anderssonnanders...@nicira.com > diff --git a/INSTALL.Linux b/INSTALL.Linux index 4477a60..7a55ccd 100644 > --- a/INSTALL.Linux > +++ b/INSTALL.Linux > @@ -46,9 +46,9 @@ INSTALL.userspace for more information. >bridge") before starting the datapath. > >For optional support of ingress policing, you must enable kernel > - configuration options NET_CLS_ACT, NET_CLS_U32, NET_SCH_INGRESS, > - and NET_ACT_POLICE, either built-in or as modules. > - (NET_CLS_POLICE is obsolete and not needed.) > + configuration options NET_CLS_BASIC, NET_SCH_INGRESS, and > + NET_ACT_POLICE, either built-in or as modules. (NET_CLS_POLICE is > + obsolete and not needed.) > >If GRE tunneling is being used it is recommended that the kernel >be compiled with IPv6 support (CONFIG_IPV6). 
This allows for diff > --git > a/lib/netdev-linux.c b/lib/netdev-linux.c index 90e88c7..8293bb1 100644 > --- a/lib/netdev-linux.c > +++ b/lib/netdev-linux.c > @@ -30,6 +30,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -326,6 +327,9 @@ static unsigned int tc_buffer_per_jiffy(unsigned int > rate); static struct tcmsg *tc_make_request(const struct netdev *, int type, > unsigned int flags, struct ofpbuf *); > static int > tc_transact(struct ofpbuf *request, struct ofpbuf **replyp); > +static int tc_add_del_ingress_qdisc(struct netdev *netdev, bool add); > +static int tc_add_policer(struct netdev *netdev, int kbits_rate, > + int kbits_burst); > > static int tc_parse_qdisc(const struct ofpbuf *, const char **kind, >struct nlattr **options); @@ -1564,50 +1568,8 @@ > netdev_linux_set_advertisements(struct netdev *netdev, uint32_t > advertise) > ETHTOOL_SSET, "ETHTOOL_SSET"); } > > -#define POLICE_ADD_CMD "/sbin/tc qdisc add dev %s handle : ingress" > -#define POLICE_CONFIG_CMD "/sbin/tc filter add dev %s parent : > protocol ip prio 50 u32 match ip src 0.0.0.0/0 police rate %dkbit burst %dk > mtu > 65535 drop flowid :1" > - > -/* Remove ingress policing from 'netdev'. Returns 0 if successful, otherwise > a > - * positive errno value. > - * > - * This function is equivalent to running > - * /sbin/tc qdisc del dev %s handle : ingress > - * but it is much, much faster. > - */ > -static int > -netdev_linux_remove_policing(struct netdev *netdev) -{ > -struct netdev_dev_linux *netdev_dev = > -netdev_dev_linux_cast(netdev_get_dev(netdev)); > -const char *netdev_name = netdev_get_n
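[Editor's note] The bodies of the new netlink-based helpers are truncated in the quoted patch above. As a rough, illustrative sketch only (not the actual patch code), the ingress-qdisc helper declared in the diff might take roughly this shape, reusing the tc_make_request() and tc_transact() helpers whose prototypes also appear above; the constants come from <linux/rtnetlink.h> and <linux/pkt_sched.h>, and error handling/cleanup is elided.

/* Illustrative sketch, not the patch code: add or delete the ingress qdisc
 * ("tc qdisc add dev <dev> handle ffff: ingress") with a direct rtnetlink
 * request instead of exec'ing /sbin/tc. */
static int
tc_add_del_ingress_qdisc(struct netdev *netdev, bool add)
{
    struct ofpbuf request;
    struct tcmsg *tcmsg;
    int type = add ? RTM_NEWQDISC : RTM_DELQDISC;
    int flags = add ? NLM_F_EXCL | NLM_F_CREATE : 0;

    tcmsg = tc_make_request(netdev, type, flags, &request);
    if (!tcmsg) {
        return ENODEV;
    }
    tcmsg->tcm_handle = 0xffff0000;     /* "handle ffff:" */
    tcmsg->tcm_parent = TC_H_INGRESS;   /* attach as the ingress qdisc */
    nl_msg_put_string(&request, TCA_KIND, "ingress");

    /* A deletion may legitimately find no qdisc present; a real
     * implementation would likely ignore that case. */
    return tc_transact(&request, NULL);
}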
[ovs-dev] [Patch] Fixes DPDK Queue size for IVSHMEM VM communications
Separates loop list process size from the endpoint DPDK queue size. Corrected DPDK queue size to be a power of 2 which allows dpdkr interface to be created. Increased queue size to improve zero loss data rate. Changed NIC queue size comment to make NIC queue size formula more clear. Signed-off-by: Mike A. Polehn diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c old mode 100644 new mode 100755 index 6ee9803..3a19db0 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -71,8 +71,11 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20); #define NON_PMD_THREAD_TX_QUEUE 0 -#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue, Max (n+32<=4096)*/ -#define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue, Max (n+32<=4096)*/ +#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue, Size: (x*32<=4064)*/ +#define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue, Size: (x*32<=4064)*/ + +#define DPDK_RX_Q_SIZE 2048 /* Size of DPDK RX Client Queue, Size: (x**2)*/ +#define DPDK_TX_Q_SIZE 2048 /* Size of DPDK TX Client Queue, Size: (x**2)*/ /* XXX: Needs per NIC value for these constants. */ #define RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. */ @@ -1236,7 +1239,7 @@ dpdk_ring_create(const char dev_name[], unsigned int port_no, return -err; } -ivshmem->cring_tx = rte_ring_create(ring_name, MAX_RX_QUEUE_LEN, SOCKET0, 0); +ivshmem->cring_tx = rte_ring_create(ring_name, DPDK_TX_Q_SIZE, SOCKET0, 0); if (ivshmem->cring_tx == NULL) { rte_free(ivshmem); return ENOMEM; @@ -1247,7 +1250,7 @@ dpdk_ring_create(const char dev_name[], unsigned int port_no, return -err; } -ivshmem->cring_rx = rte_ring_create(ring_name, MAX_RX_QUEUE_LEN, SOCKET0, 0); +ivshmem->cring_rx = rte_ring_create(ring_name, DPDK_RX_Q_SIZE, SOCKET0, 0); if (ivshmem->cring_rx == NULL) { rte_free(ivshmem); return ENOMEM; ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
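[Editor's note] One small, optional hardening suggestion related to the power-of-two requirement stated in the commit message (a reviewer-style suggestion, not part of the patch): since rte_ring_create() rejects ring sizes that are not a power of two, the constraint on the new defines could be enforced at compile time with the BUILD_ASSERT_DECL()/IS_POW2() helpers that OVS already uses elsewhere (e.g. for MAX_QUEUE_LEN in dpif-netdev.c).

/* Hypothetical compile-time guard for the new DPDK client ring sizes;
 * BUILD_ASSERT_DECL() and IS_POW2() come from OVS's lib/util.h, and the
 * defines below simply mirror the ones added by the patch. */
#include "util.h"

#define DPDK_RX_Q_SIZE 2048 /* Size of DPDK RX Client Queue, power of 2. */
#define DPDK_TX_Q_SIZE 2048 /* Size of DPDK TX Client Queue, power of 2. */

BUILD_ASSERT_DECL(IS_POW2(DPDK_RX_Q_SIZE));
BUILD_ASSERT_DECL(IS_POW2(DPDK_TX_Q_SIZE));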
[ovs-dev] [Patch] Documentation for DPDK IVSHMEM VM Communications
Adds documentation on how to run IVSHMEM communication through VM. Signed-off-by: Mike A. Polehn diff --git a/INSTALL.DPDK b/INSTALL.DPDK index 4551f4c..8d866e9 100644 --- a/INSTALL.DPDK +++ b/INSTALL.DPDK @@ -19,10 +19,14 @@ Recommended to use DPDK 1.6. DPDK: Set dir i.g.: export DPDK_DIR=/usr/src/dpdk-1.6.0r2 cd $DPDK_DIR -update config/defconfig_x86_64-default-linuxapp-gcc so that dpdk generate single lib file. +update config/defconfig_x86_64-default-linuxapp-gcc so that dpdk generate +single lib file (modification also required for IVSHMEM build). CONFIG_RTE_BUILD_COMBINE_LIBS=y -make install T=x86_64-default-linuxapp-gcc +For default install without IVSHMEM (old): + make install T=x86_64-default-linuxapp-gcc +To include IVSHMEM (shared memory): + make install T=x86_64-ivshmem-linuxapp-gcc For details refer to http://dpdk.org/ Linux kernel: @@ -32,7 +36,10 @@ DPDK kernel requirement. OVS: cd $(OVS_DIR)/openvswitch ./boot.sh -export DPDK_BUILD=/usr/src/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc +Without IVSHMEM + export DPDK_BUILD=/usr/src/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc +With IVSHMEM: + export DPDK_BUILD=/usr/src/dpdk-1.6.0r2/x86_64-ivshmem-linuxapp-gcc ./configure --with-dpdk=$DPDK_BUILD make @@ -44,12 +51,18 @@ Using the DPDK with ovs-vswitchd: Setup system boot: kernel bootline, add: default_hugepagesz=1GB hugepagesz=1G hugepages=1 +To include 3 GB memory for VM (2 socket system, half on each NUMA node) + kernel bootline, add: default_hugepagesz=1GB hugepagesz=1G hugepages=8 First setup DPDK devices: - insert uio.ko e.g. modprobe uio - - insert igb_uio.ko + + - insert igb_uio.ko (non-IVSHMEM case) e.g. insmod DPDK/x86_64-default-linuxapp-gcc/kmod/igb_uio.ko + - insert igb_uio.ko (IVSHMEM case) +e.g. insmod DPDK/x86_64-ivshmem-linuxapp-gcc/kmod/igb_uio.ko + - Bind network device to ibg_uio. e.g. DPDK/tools/pci_unbind.py --bind=igb_uio eth1 Alternate binding method: @@ -73,7 +86,7 @@ First setup DPDK devices: Prepare system: - mount hugetlbfs -e.g. mount -t hugetlbfs -o pagesize=1G none /mnt/huge/ +e.g. mount -t hugetlbfs -o pagesize=1G none /dev/hugepages Ref to http://www.dpdk.org/doc/quick-start for verifying DPDK setup. @@ -91,7 +104,7 @@ Start ovsdb-server as discussed in INSTALL doc: ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \ --remote=db:Open_vSwitch,Open_vSwitch,manager_options \ --private-key=db:Open_vSwitch,SSL,private_key \ - --certificate=dbitch,SSL,certificate \ + --certificate=Open_vSwitch,SSL,certificate \ --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach First time after db creation, initialize: cd $OVS_DIR @@ -105,12 +118,13 @@ for dpdk initialization. e.g. export DB_SOCK=/usr/local/var/run/openvswitch/db.sock - ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach + ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach -If allocated more than 1 GB huge pages, set amount and use NUMA node 0 memory: +If allocated more than one 1 GB hugepage (as for IVSHMEM), set amount and use NUMA +node 0 memory: ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,0 \ - -- unix:$DB_SOCK --pidfile --detach + -- unix:$DB_SOCK --pidfile --detach To use ovs-vswitchd with DPDK, create a bridge with datapath_type "netdev" in the configuration database. For example: @@ -136,9 +150,7 @@ Test flow script across NICs (assuming ovs in /usr/src/ovs): # Script: #! /bin/sh - # Move to command directory - cd /usr/src/ovs/utilities/ # Clear current flows @@ -158,7 +170,8 @@ help. 
At this time all ovs-vswitchd tasks end up being affinitized to cpu core 0 but this may change. Lets pick a target core for 100% task to run on, i.e. core 7. -Also assume a dual 8 core sandy bridge system with hyperthreading enabled. +Also assume a dual 8 core sandy bridge system with hyperthreading enabled +where CPU1 has cores 0,...,7 and 16,...,23 & CPU2 cores 8,...,15 & 24,...,31. (A different cpu configuration will have different core mask requirements). To give better ownership of 100%, isolation maybe useful. @@ -178,11 +191,11 @@ taskset -p 080 1762 pid 1762's new affinity mask: 80 Assume that all other ovs-vswitchd threads to be on other socket 0 cores. -Affinitize the rest of the ovs-vswitchd thread ids to 0x0FF007F +Affinitize the rest of the ovs-vswitchd thread ids to 0x07F007F -taskset -p 0x0FF007F {thread pid, e.g 1738} +taskset -p 0x07F007F {thread pid, e.g 1738} pid 1738's current affinity mask: 1 - pid 1738's new affinity mask: ff007f + pid 1738's new affinity mask: 7f007f . . . The core 23 is left idle, which allows core 7 to run at full rate. @@ -207,8 +220,8 @@ with the ring naming used within ovs. l
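[Editor's note] The affinity masks used above (0x080 for the 100% polling core, 0x07F007F for the remaining ovs-vswitchd threads) are easy to get wrong for a different core layout. A tiny standalone helper such as the following hypothetical snippet (not part of the patch) can be used to double-check which cores a taskset-style hex mask actually selects.

/* Hypothetical helper: print the CPU cores selected by a hex affinity mask.
 * For example, "./maskcheck 07F007F" should print cores 0-6 and 16-22,
 * leaving cores 7 and 23 free as described in the text above. */
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    unsigned long long mask = argc > 1 ? strtoull(argv[1], NULL, 16)
                                       : 0x07F007FULL;
    for (int cpu = 0; cpu < 64; cpu++) {
        if (mask & (1ULL << cpu)) {
            printf("core %d\n", cpu);
        }
    }
    return 0;
}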
Re: [ovs-dev] [Patch] Documentation for DPDK IVSHMEM VM Communications
The setup for packet transfer between the switch and a VM over shared memory (IVSHMEM) is moderately complex, and most of the details are not easily found. It is also a different transfer method than user-side vhost, which copies between the separate memory spaces at the cost of a slower packet rate or higher CPU core load(s). Shared-memory transfer is a much more efficient method since it only copies packet pointers, not packet data. However, it lacks security, since the VM can see the entire packet buffer memory space at all times. Efficiency versus security is something only the user of the system can determine, since they know whether the system is a closed environment or not.

I put in enough detail to allow someone who has not been intimately involved in IVSHMEM packet processing work to set up and get shared-memory transfer working in a short time, given today's build state. Including the system packages needed for a proper build (qemu in particular) may be overkill, but figuring out those required packages can be very time consuming. Unless it is working, you cannot experiment with or test IVSHMEM shared-memory operation, or even move the method forward as an alternative setup; the correct information is just not readily available, and weeks can easily be spent (and only if you are very determined to get it to work). The INSTALL.DPDK also needs to be updated for DPDK 1.7 ...

Would you like to have this put in as a separate doc, INSTALL.DPDK.IVSHMEM? Mike

-Original Message- From: Pravin Shelar [mailto:pshe...@nicira.com] Sent: Friday, August 29, 2014 3:54 PM To: Polehn, Mike A Cc: d...@openvswitch.com Subject: Re: [ovs-dev] [Patch] Documentation for DPDK IVSHMEM VM Communications

On Fri, Aug 15, 2014 at 7:07 AM, Polehn, Mike A wrote: > Adds documentation on how to run IVSHMEM communication through VM. >

I think INSTALL.DPDK is getting rather large and hard to understand with all the details, so I dropped the "Alternative method to get QEMU, download and build from OVDK" section. We can add this documentation to a separate file once vhost support is added. Thanks.

___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] Patch [ 1/1] User space dpdk setup documentation addition
Added details to dpdk poll mode setup and use to make it easier for some not familiar to get it operating. Signed-off-by: Mike A. Polehn diff --git a/INSTALL.DPDK b/INSTALL.DPDK index 3e0247a..f55ae8b 100644 --- a/INSTALL.DPDK +++ b/INSTALL.DPDK @@ -17,7 +17,8 @@ Building and Installing: Recommended to use DPDK 1.6. DPDK: -cd DPDK +Set dir i.g.: export DPDK_DIR=/usr/src/dpdk-1.6.0r2 +cd $DPDK_DIR update config/defconfig_x86_64-default-linuxapp-gcc so that dpdk generate single lib file. CONFIG_RTE_BUILD_COMBINE_LIBS=y @@ -31,7 +32,8 @@ DPDK kernel requirement. OVS: cd $(OVS_DIR)/openvswitch ./boot.sh -./configure --with-dpdk=$(DPDK_BUILD) +export DPDK_BUILD=/usr/src/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc +./configure --with-dpdk=$DPDK_BUILD make Refer to INSTALL.userspace for general requirements of building @@ -40,25 +42,77 @@ userspace OVS. Using the DPDK with ovs-vswitchd: - +Setup system boot: + kernel bootline, add: default_hugepagesz=1GB hugepagesz=1G hugepages=1 + First setup DPDK devices: - insert uio.ko +e.g. modprobe uio - insert igb_uio.ko e.g. insmod DPDK/x86_64-default-linuxapp-gcc/kmod/igb_uio.ko - - mount hugetlbfs -e.g. mount -t hugetlbfs -o pagesize=1G none /mnt/huge/ - Bind network device to ibg_uio. e.g. DPDK/tools/pci_unbind.py --bind=igb_uio eth1 +Alternate binding method: + Find target Ethernet devices + lspci -nn|grep Ethernet + Bring Down (e.g. eth2, eth3) + ifconfig eth2 down + ifconfig eth3 down + Look at current devices (e.g ixgbe devices) + ls /sys/bus/pci/drivers/ixgbe/ + :02:00.0 :02:00.1 bind module new_id remove_id uevent unbind + Unbind target pci devices from current driver (e.g. 02:00.0 ...) + echo :02:00.0 > /sys/bus/pci/drivers/ixgbe/unbind + echo :02:00.1 > /sys/bus/pci/drivers/ixgbe/unbind + Bind to target driver (e.g. igb_uio) + echo :02:00.0 > /sys/bus/pci/drivers/igb_uio/bind + echo :02:00.1 > /sys/bus/pci/drivers/igb_uio/bind + Check binding for listed devices + ls /sys/bus/pci/drivers/igb_uio + :02:00.0 :02:00.1 bind module new_id remove_id uevent unbind + +Prepare system: + - load ovs kernel module +e.g modprobe openvswitch + - mount hugetlbfs +e.g. mount -t hugetlbfs -o pagesize=1G none /mnt/huge/ Ref to http://www.dpdk.org/doc/quick-start for verifying DPDK setup. +Start vsdb-server as discussed in INSTALL doc: + Summary e.g.: +First time only db creation (or clearing): + mkdir -p /usr/local/etc/openvswitch + mkdir -p /usr/local/var/run/openvswitch + rm /usr/local/etc/openvswitch/conf.db + cd $OVS_DIR + ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db \ +./vswitchd/vswitch.ovsschema +start vsdb-server + cd $OVS_DIR + ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \ + --remote=db:OpenOpen_vSwitch,manager_options \ + --private-key=db:Open_vSwitch,SSL,private_key \ + --certificate=dbitch,SSL,certificate \ + --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach +First time after db creation, initialize: + cd $OVS_DIR + ./utilities/ovs-vsctl --no-wait init + Start vswitchd: DPDK configuration arguments can be passed to vswitchd via `--dpdk` -argument. dpdk arg -c is ignored by ovs-dpdk, but it is required parameter +argument. dpdk arg -c is ignored by ovs-dpdk, but it is a required parameter for dpdk initialization. e.g. 
+ export DB_SOCK=/usr/local/var/run/openvswitch/db.sock ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach +If allocated more than 1 GB huge pages, set amount and use NUMA node 0 memory: + + ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,0 \ + -- unix:$DB_SOCK --pidfile --detach + To use ovs-vswitchd with DPDK, create a bridge with datapath_type "netdev" in the configuration database. For example: @@ -69,11 +123,72 @@ Now you can add dpdk devices. OVS expect DPDK device name start with dpdk and end with portid. vswitchd should print number of dpdk devices found. ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk +ovs-vsctl add-port br0 dpdki -- set Interface dpdk1 type=dpdk -Once first DPDK port is added vswitchd, it creates Polling thread and +Once first DPDK port is added to vswitchd, it creates a Polling thread and polls dpdk device in continuous loop. Therefore CPU utilization for that thread is always 100%. +Test flow script across NICs (assuming ovs in /usr/src/ovs): + Assume 1.1.1.1 on NIC port 1 (dpdk0) + Assume 1.1.1.2 on NIC port 2 (dpdk1) + Execute script: + +# Script: + +#! /bin/sh + +# Move to command directory + +cd /usr/src/ovs/utilities/ + +# Clear current flows +./ovs-ofctl del-flows br0
[ovs-dev] Patch [ 1/1] User space dpdk setup documentation addition, rev 1
Added details to dpdk poll mode setup and use to make it easier for some not familiar to get it operating. Signed-off-by: Mike A. Polehn diff --git a/INSTALL.DPDK b/INSTALL.DPDK index 3e0247a..689d95d 100644 --- a/INSTALL.DPDK +++ b/INSTALL.DPDK @@ -17,7 +17,8 @@ Building and Installing: Recommended to use DPDK 1.6. DPDK: -cd DPDK +Set dir i.g.: export DPDK_DIR=/usr/src/dpdk-1.6.0r2 +cd $DPDK_DIR update config/defconfig_x86_64-default-linuxapp-gcc so that dpdk generate single lib file. CONFIG_RTE_BUILD_COMBINE_LIBS=y @@ -31,7 +32,8 @@ DPDK kernel requirement. OVS: cd $(OVS_DIR)/openvswitch ./boot.sh -./configure --with-dpdk=$(DPDK_BUILD) +export DPDK_BUILD=/usr/src/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc +./configure --with-dpdk=$DPDK_BUILD make Refer to INSTALL.userspace for general requirements of building @@ -40,25 +42,77 @@ userspace OVS. Using the DPDK with ovs-vswitchd: - +Setup system boot: + kernel bootline, add: default_hugepagesz=1GB hugepagesz=1G hugepages=1 + First setup DPDK devices: - insert uio.ko +e.g. modprobe uio - insert igb_uio.ko e.g. insmod DPDK/x86_64-default-linuxapp-gcc/kmod/igb_uio.ko - - mount hugetlbfs -e.g. mount -t hugetlbfs -o pagesize=1G none /mnt/huge/ - Bind network device to ibg_uio. e.g. DPDK/tools/pci_unbind.py --bind=igb_uio eth1 +Alternate binding method: + Find target Ethernet devices + lspci -nn|grep Ethernet + Bring Down (e.g. eth2, eth3) + ifconfig eth2 down + ifconfig eth3 down + Look at current devices (e.g ixgbe devices) + ls /sys/bus/pci/drivers/ixgbe/ + :02:00.0 :02:00.1 bind module new_id remove_id uevent unbind + Unbind target pci devices from current driver (e.g. 02:00.0 ...) + echo :02:00.0 > /sys/bus/pci/drivers/ixgbe/unbind + echo :02:00.1 > /sys/bus/pci/drivers/ixgbe/unbind + Bind to target driver (e.g. igb_uio) + echo :02:00.0 > /sys/bus/pci/drivers/igb_uio/bind + echo :02:00.1 > /sys/bus/pci/drivers/igb_uio/bind + Check binding for listed devices + ls /sys/bus/pci/drivers/igb_uio + :02:00.0 :02:00.1 bind module new_id remove_id uevent unbind + +Prepare system: + - load ovs kernel module +e.g modprobe openvswitch + - mount hugetlbfs +e.g. mount -t hugetlbfs -o pagesize=1G none /mnt/huge/ Ref to http://www.dpdk.org/doc/quick-start for verifying DPDK setup. +Start vsdb-server as discussed in INSTALL doc: + Summary e.g.: +First time only db creation (or clearing): + mkdir -p /usr/local/etc/openvswitch + mkdir -p /usr/local/var/run/openvswitch + rm /usr/local/etc/openvswitch/conf.db + cd $OVS_DIR + ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db \ +./vswitchd/vswitch.ovsschema +start vsdb-server + cd $OVS_DIR + ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \ + --remote=db:OpenOpen_vSwitch,manager_options \ + --private-key=db:Open_vSwitch,SSL,private_key \ + --certificate=dbitch,SSL,certificate \ + --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach +First time after db creation, initialize: + cd $OVS_DIR + ./utilities/ovs-vsctl --no-wait init + Start vswitchd: DPDK configuration arguments can be passed to vswitchd via `--dpdk` -argument. dpdk arg -c is ignored by ovs-dpdk, but it is required parameter +argument. dpdk arg -c is ignored by ovs-dpdk, but it is a required parameter for dpdk initialization. e.g. 
+ export DB_SOCK=/usr/local/var/run/openvswitch/db.sock ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach +If allocated more than 1 GB huge pages, set amount and use NUMA node 0 memory: + + ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,0 \ + -- unix:$DB_SOCK --pidfile --detach + To use ovs-vswitchd with DPDK, create a bridge with datapath_type "netdev" in the configuration database. For example: @@ -69,11 +123,72 @@ Now you can add dpdk devices. OVS expect DPDK device name start with dpdk and end with portid. vswitchd should print number of dpdk devices found. ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk +ovs-vsctl add-port br0 dpdki -- set Interface dpdk1 type=dpdk -Once first DPDK port is added vswitchd, it creates Polling thread and +Once first DPDK port is added to vswitchd, it creates a Polling thread and polls dpdk device in continuous loop. Therefore CPU utilization for that thread is always 100%. +Test flow script across NICs (assuming ovs in /usr/src/ovs): + Assume 1.1.1.1 on NIC port 1 (dpdk0) + Assume 1.1.1.2 on NIC port 2 (dpdk1) + Execute script: + +# Script: + +#! /bin/sh + +# Move to command directory + +cd /usr/src/ovs/utilities/ + +# Clear current flows +./ovs-ofctl del-flows br0
Re: [ovs-dev] Patch [ 1/1] User space dpdk setup documentation addition
Yes, good catch! Mike Polehn

-Original Message- From: John W. Linville [mailto:linvi...@tuxdriver.com] Sent: Monday, June 09, 2014 8:16 AM To: Polehn, Mike A Cc: dev@openvswitch.org Subject: Re: [ovs-dev] Patch [ 1/1] User space dpdk setup documentation addition

On Mon, Jun 09, 2014 at 02:47:05PM +, Polehn, Mike A wrote: > @@ -69,11 +123,72 @@ Now you can add dpdk devices. OVS expect DPDK > device name start with dpdk and end with portid. vswitchd should print > number of dpdk devices found. > > ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk > +ovs-vsctl add-port br0 dpdki -- set Interface dpdk1 type=dpdk

Drive-by comment -- is that supposed to be "dpdki"? Or should it be "dpdk1"?

-- John W. Linville <linvi...@tuxdriver.com> -- Someday the world will need a hero, and you might be all we have. Be ready.

___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] Patch [ 1/1] User space dpdk setup documentation addition, rev 2
Added details to dpdk poll mode setup and use to make it easier for some not familiar to get it operating. Signed-off-by: Mike A. Polehn diff --git a/INSTALL.DPDK b/INSTALL.DPDK index 3e0247a..df497fb 100644 --- a/INSTALL.DPDK +++ b/INSTALL.DPDK @@ -17,7 +17,8 @@ Building and Installing: Recommended to use DPDK 1.6. DPDK: -cd DPDK +Set dir i.g.: export DPDK_DIR=/usr/src/dpdk-1.6.0r2 +cd $DPDK_DIR update config/defconfig_x86_64-default-linuxapp-gcc so that dpdk generate single lib file. CONFIG_RTE_BUILD_COMBINE_LIBS=y @@ -31,7 +32,8 @@ DPDK kernel requirement. OVS: cd $(OVS_DIR)/openvswitch ./boot.sh -./configure --with-dpdk=$(DPDK_BUILD) +export DPDK_BUILD=/usr/src/dpdk-1.6.0r2/x86_64-default-linuxapp-gcc +./configure --with-dpdk=$DPDK_BUILD make Refer to INSTALL.userspace for general requirements of building @@ -40,25 +42,77 @@ userspace OVS. Using the DPDK with ovs-vswitchd: - +Setup system boot: + kernel bootline, add: default_hugepagesz=1GB hugepagesz=1G hugepages=1 + First setup DPDK devices: - insert uio.ko +e.g. modprobe uio - insert igb_uio.ko e.g. insmod DPDK/x86_64-default-linuxapp-gcc/kmod/igb_uio.ko - - mount hugetlbfs -e.g. mount -t hugetlbfs -o pagesize=1G none /mnt/huge/ - Bind network device to ibg_uio. e.g. DPDK/tools/pci_unbind.py --bind=igb_uio eth1 +Alternate binding method: + Find target Ethernet devices + lspci -nn|grep Ethernet + Bring Down (e.g. eth2, eth3) + ifconfig eth2 down + ifconfig eth3 down + Look at current devices (e.g ixgbe devices) + ls /sys/bus/pci/drivers/ixgbe/ + :02:00.0 :02:00.1 bind module new_id remove_id uevent unbind + Unbind target pci devices from current driver (e.g. 02:00.0 ...) + echo :02:00.0 > /sys/bus/pci/drivers/ixgbe/unbind + echo :02:00.1 > /sys/bus/pci/drivers/ixgbe/unbind + Bind to target driver (e.g. igb_uio) + echo :02:00.0 > /sys/bus/pci/drivers/igb_uio/bind + echo :02:00.1 > /sys/bus/pci/drivers/igb_uio/bind + Check binding for listed devices + ls /sys/bus/pci/drivers/igb_uio + :02:00.0 :02:00.1 bind module new_id remove_id uevent unbind + +Prepare system: + - load ovs kernel module +e.g modprobe openvswitch + - mount hugetlbfs +e.g. mount -t hugetlbfs -o pagesize=1G none /mnt/huge/ Ref to http://www.dpdk.org/doc/quick-start for verifying DPDK setup. +Start vsdb-server as discussed in INSTALL doc: + Summary e.g.: +First time only db creation (or clearing): + mkdir -p /usr/local/etc/openvswitch + mkdir -p /usr/local/var/run/openvswitch + rm /usr/local/etc/openvswitch/conf.db + cd $OVS_DIR + ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db \ +./vswitchd/vswitch.ovsschema +start vsdb-server + cd $OVS_DIR + ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \ + --remote=db:OpenOpen_vSwitch,manager_options \ + --private-key=db:Open_vSwitch,SSL,private_key \ + --certificate=dbitch,SSL,certificate \ + --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach +First time after db creation, initialize: + cd $OVS_DIR + ./utilities/ovs-vsctl --no-wait init + Start vswitchd: DPDK configuration arguments can be passed to vswitchd via `--dpdk` -argument. dpdk arg -c is ignored by ovs-dpdk, but it is required parameter +argument. dpdk arg -c is ignored by ovs-dpdk, but it is a required parameter for dpdk initialization. e.g. 
+ export DB_SOCK=/usr/local/var/run/openvswitch/db.sock ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach +If allocated more than 1 GB huge pages, set amount and use NUMA node 0 memory: + + ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,0 \ + -- unix:$DB_SOCK --pidfile --detach + To use ovs-vswitchd with DPDK, create a bridge with datapath_type "netdev" in the configuration database. For example: @@ -69,11 +123,72 @@ Now you can add dpdk devices. OVS expect DPDK device name start with dpdk and end with portid. vswitchd should print number of dpdk devices found. ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk +ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk -Once first DPDK port is added vswitchd, it creates Polling thread and +Once first DPDK port is added to vswitchd, it creates a Polling thread and polls dpdk device in continuous loop. Therefore CPU utilization for that thread is always 100%. +Test flow script across NICs (assuming ovs in /usr/src/ovs): + Assume 1.1.1.1 on NIC port 1 (dpdk0) + Assume 1.1.1.2 on NIC port 2 (dpdk1) + Execute script: + +# Script: + +#! /bin/sh + +# Move to command directory + +cd /usr/src/ovs/utilities/ + +# Clear current flows +./ovs-ofctl del-flows br0
Re: [ovs-dev] [PATCH v2] dpif-netdev: Polling threads directly call ofproto upcall functions.
A good reason to offload an ofproto upcall function in polling mode is to allow a different CPU to do the time-consuming inexact rule matches while the polling thread maintains fast packet switching. At low data and packet rates, or on a low-rate Ethernet interface (1 GbE and lower), this does not matter; however, once higher packet rates are reached it becomes critical, since the input queue is easily overrun at 10 GbE rates by even moderate delays, especially with smaller packet sizes (at 10 GbE line rate with 64-byte packets, roughly 14.88 million packets per second, a 2048-entry RX queue absorbs only about 140 microseconds of stall).

At this time, for polling-mode DPDK, all threads have a default affinitization to only one core. Since the ofproto upcalls are already run on that same core, a change to call the ofproto upcall functions directly will show little or no performance difference, and may even show a packet rate gain since no Linux process scheduling overhead will be present. Currently, changing the affinitization to allow the polling thread to execute on different CPU cores than the rest of ovs-vswitchd results in occasional polling halts/hangs, which gives very unpredictable zero-loss performance, resulting in poorer zero-loss operation than with all ovs-vswitchd threads affinitized to one core. However, this is an SMP synchronization issue (or issues) that will hopefully eventually get solved.

Calling some ofproto upcall functions directly has potential benefits. It is very desirable to set up exact-match flow entries while sequentially processing packets, provided this can be done in just a few microseconds at most, the NIC RX queue has room to absorb and average this time across multiple loops, and the loop processes packets faster than the average input rate. This would require very optimized inexact rule lookup code. A trip to an OpenFlow controller would still need to be offloaded to a non-realtime thread. Directly calling ofproto upcall functions before the inexact rule lookup code is highly optimized for lookup speed with large numbers of rules would make it more difficult to get the DPDK packet processing rate up, and also to test and verify fast packet processing rates.

Mike Polehn

-Original Message- From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Ryan Wilson Sent: Monday, June 16, 2014 11:45 PM To: dev@openvswitch.org Subject: [ovs-dev] [PATCH v2] dpif-netdev: Polling threads directly call ofproto upcall functions.

Typically, kernel datapath threads send upcalls to userspace where handler threads process the upcalls. For TAP and DPDK devices, the datapath threads operate in userspace, so there is no need for separate handler threads. This patch allows userspace datapath threads to directly call the ofproto upcall functions, eliminating the need for handler threads for datapaths of type 'netdev'. Signed-off-by: Ryan Wilson --- v2: Fix race condition found during perf test --- lib/dpif-netdev.c | 327 +++-- lib/dpif-netdev.h |7 + lib/dpif.c| 68 ++--- lib/dpif.h|1 + ofproto/ofproto-dpif-upcall.c | 227 +--- ofproto/ofproto-dpif-upcall.h |4 + 6 files changed, 258 insertions(+), 376 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 6c281fe..626b3e6 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -15,8 +15,6 @@ */ #include -#include "dpif.h" - #include #include #include @@ -35,6 +33,7 @@ #include "cmap.h" #include "csum.h" #include "dpif.h" +#include "dpif-netdev.h" #include "dpif-provider.h" #include "dummy.h" #include "dynamic-string.h" @@ -76,10 +75,7 @@ DEFINE_STATIC_PER_THREAD_DATA(uint32_t, recirc_depth, 0) /* Configuration parameters. */ enum { MAX_FLOWS = 65536 }; /* Maximum number of flows in flow table.
*/ -/* Queues. */ -enum { MAX_QUEUE_LEN = 128 }; /* Maximum number of packets per queue. */ -enum { QUEUE_MASK = MAX_QUEUE_LEN - 1 }; -BUILD_ASSERT_DECL(IS_POW2(MAX_QUEUE_LEN)); +static exec_upcall_func *exec_upcall_cb = NULL; /* Protects against changes to 'dp_netdevs'. */ static struct ovs_mutex dp_netdev_mutex = OVS_MUTEX_INITIALIZER; @@ -88,27 +84,6 @@ static struct ovs_mutex dp_netdev_mutex = OVS_MUTEX_INITIALIZER; static struct shash dp_netdevs OVS_GUARDED_BY(dp_netdev_mutex) = SHASH_INITIALIZER(&dp_netdevs); -struct dp_netdev_upcall { -struct dpif_upcall upcall; /* Queued upcall information. */ -struct ofpbuf buf; /* ofpbuf instance for upcall.packet. */ -}; - -/* A queue passing packets from a struct dp_netdev to its clients (handlers). - * - * - * Thread-safety - * = - * - * Any access at all requires the owning 'dp_netdev''s queue_rwlock and - * its own mutex. */ -struct dp_netdev_queue { -struct ovs_mutex mutex; -struct seq *seq; /* Incremented whenever a packet is queued. */ -struct dp_netdev_upcall upcalls
[ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size
Large TX and RX queues are needed for high speed 10 GbE physical NICS. Signed-off-by: Mike A. Polehn diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index fbdb6b3..d1bcc73 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -70,6 +70,9 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20); #define NON_PMD_THREAD_TX_QUEUE 0 +#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue (n*32<4096)*/ +#define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue (n*32<4096)*/ + /* TODO: Needs per NIC value for these constants. */ #define RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. */ #define RX_HTHRESH 32 /* Default values of RX host threshold reg. */ @@ -369,7 +372,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) OVS_REQUIRES(dpdk_mutex) } for (i = 0; i < NR_QUEUE; i++) { -diag = rte_eth_tx_queue_setup(dev->port_id, i, MAX_TX_QUEUE_LEN, +diag = rte_eth_tx_queue_setup(dev->port_id, i, NIC_PORT_TX_Q_SIZE, dev->socket_id, &tx_conf); if (diag) { VLOG_ERR("eth dev tx queue setup error %d",diag); @@ -378,7 +381,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) OVS_REQUIRES(dpdk_mutex) } for (i = 0; i < NR_QUEUE; i++) { -diag = rte_eth_rx_queue_setup(dev->port_id, i, MAX_RX_QUEUE_LEN, +diag = rte_eth_rx_queue_setup(dev->port_id, i, NIC_PORT_RX_Q_SIZE, dev->socket_id, &rx_conf, dev->dpdk_mp->mp); if (diag) { ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size
I was coming from an earlier version of the patch in which the value was first set up as a number and then used in several places, including the TX cache size, and I didn't catch that the new third definition was being used as I moved the patch forward onto the latest git updates before sending. There is also a queue sizing formula in the comment that is not obvious.

Mike Polehn

-Original Message- From: Ethan Jackson [mailto:et...@nicira.com] Sent: Thursday, June 19, 2014 10:21 AM To: Polehn, Mike A Cc: dev@openvswitch.org Subject: Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size

One question: why not just increase MAX_RX_QUEUE_LEN and MAX_TX_QUEUE_LEN instead of creating new #defines? Just a thought. I'd like Pravin to review this as I don't know this code as well as him. Ethan

On Thu, Jun 19, 2014 at 9:59 AM, Polehn, Mike A wrote: > Large TX and RX queues are needed for high speed 10 GbE physical NICS. > > Signed-off-by: Mike A. Polehn > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index > fbdb6b3..d1bcc73 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -70,6 +70,9 @@ static struct vlog_rate_limit rl = > VLOG_RATE_LIMIT_INIT(5, 20); > > #define NON_PMD_THREAD_TX_QUEUE 0 > > +#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue > +(n*32<4096)*/ #define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical > +NIC TX Queue (n*32<4096)*/ > + > /* TODO: Needs per NIC value for these constants. */ #define > RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. */ > #define RX_HTHRESH 32 /* Default values of RX host threshold reg. */ > @@ -369,7 +372,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) > OVS_REQUIRES(dpdk_mutex) > } > > for (i = 0; i < NR_QUEUE; i++) { > -diag = rte_eth_tx_queue_setup(dev->port_id, i, MAX_TX_QUEUE_LEN, > +diag = rte_eth_tx_queue_setup(dev->port_id, i, > + NIC_PORT_TX_Q_SIZE, >dev->socket_id, &tx_conf); > if (diag) { > VLOG_ERR("eth dev tx queue setup error %d",diag); @@ > -378,7 +381,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) > OVS_REQUIRES(dpdk_mutex) > } > > for (i = 0; i < NR_QUEUE; i++) { > -diag = rte_eth_rx_queue_setup(dev->port_id, i, MAX_RX_QUEUE_LEN, > +diag = rte_eth_rx_queue_setup(dev->port_id, i, > + NIC_PORT_RX_Q_SIZE, >dev->socket_id, >&rx_conf, dev->dpdk_mp->mp); > if (diag) { > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
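[Editor's note] Regarding the "queue sizing formula in the comment that is not obvious": reading the comment variants in this archive together ("(n*32<4096)" here, and "Max (n+32<=4096)" / "Size: (x*32<=4064)" in the later IVSHMEM patch), the apparent intent is that the descriptor count must be a multiple of 32 and must leave 32 descriptors of headroom under the 4096 hardware maximum, i.e. size % 32 == 0 and size + 32 <= 4096. That reading is an interpretation, not something the patches state explicitly; if it is correct, it could be spelled out with compile-time checks (using OVS's BUILD_ASSERT_DECL() from lib/util.h) rather than a terse comment.

/* Hypothetical restatement of the NIC queue sizing rule; the size defines
 * mirror the ones in the patch. */
#define NIC_PORT_Q_GRANULARITY 32    /* Descriptor count granularity. */
#define NIC_PORT_Q_HW_MAX      4096  /* Hardware maximum descriptor count. */

#define NIC_PORT_RX_Q_SIZE 2048
#define NIC_PORT_TX_Q_SIZE 2048

BUILD_ASSERT_DECL(NIC_PORT_RX_Q_SIZE % NIC_PORT_Q_GRANULARITY == 0);
BUILD_ASSERT_DECL(NIC_PORT_RX_Q_SIZE + NIC_PORT_Q_GRANULARITY <= NIC_PORT_Q_HW_MAX);
BUILD_ASSERT_DECL(NIC_PORT_TX_Q_SIZE % NIC_PORT_Q_GRANULARITY == 0);
BUILD_ASSERT_DECL(NIC_PORT_TX_Q_SIZE + NIC_PORT_Q_GRANULARITY <= NIC_PORT_Q_HW_MAX);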
Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size
There is an improvement in 2544 zero loss measurements, but it takes another patch to actually be able to get a reasonable measurement with standard test equipment. Should I redo it with the new enum change. I am not sure of using an enum for a single constant. Mike Polehn -Original Message- From: Ethan Jackson [mailto:et...@nicira.com] Sent: Thursday, June 19, 2014 2:54 PM To: Polehn, Mike A Cc: dev@openvswitch.org Subject: Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size Also another question. Does this patch result in a measurable improvement in any benchmarks? If so, would you please note it in the commit message? If not, I'm not sure we should merge this yet. Ethan On Thu, Jun 19, 2014 at 2:45 PM, Polehn, Mike A wrote: > I coming from an earlier version that had the arguments first setup > was as a number, then used in several places including the tx cache > size and didn't catch that new 3rd definition were used as I moved the patch > forward to try on the latest git updates before sending. > > There is also a queue sizing formula in the comment that is not obvious. > > Mike Polehn > > -Original Message- > From: Ethan Jackson [mailto:et...@nicira.com] > Sent: Thursday, June 19, 2014 10:21 AM > To: Polehn, Mike A > Cc: dev@openvswitch.org > Subject: Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue > size > > One question: why not just increase MAX_RX_QUEUE_LEN and MAX_TX_QUEUE_LEN > instead of creating new #defines? > > Just a thought. I'd like Pravin to review this as I don't know this code as > well as him. > > Ethan > > On Thu, Jun 19, 2014 at 9:59 AM, Polehn, Mike A > wrote: >> Large TX and RX queues are needed for high speed 10 GbE physical NICS. >> >> Signed-off-by: Mike A. Polehn >> >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index >> fbdb6b3..d1bcc73 100644 >> --- a/lib/netdev-dpdk.c >> +++ b/lib/netdev-dpdk.c >> @@ -70,6 +70,9 @@ static struct vlog_rate_limit rl = >> VLOG_RATE_LIMIT_INIT(5, 20); >> >> #define NON_PMD_THREAD_TX_QUEUE 0 >> >> +#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue >> +(n*32<4096)*/ #define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical >> +NIC TX Queue (n*32<4096)*/ >> + >> /* TODO: Needs per NIC value for these constants. */ #define >> RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. */ >> #define RX_HTHRESH 32 /* Default values of RX host threshold reg. */ >> @@ -369,7 +372,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) >> OVS_REQUIRES(dpdk_mutex) >> } >> >> for (i = 0; i < NR_QUEUE; i++) { >> -diag = rte_eth_tx_queue_setup(dev->port_id, i, MAX_TX_QUEUE_LEN, >> +diag = rte_eth_tx_queue_setup(dev->port_id, i, >> + NIC_PORT_TX_Q_SIZE, >>dev->socket_id, &tx_conf); >> if (diag) { >> VLOG_ERR("eth dev tx queue setup error %d",diag); @@ >> -378,7 +381,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) >> OVS_REQUIRES(dpdk_mutex) >> } >> >> for (i = 0; i < NR_QUEUE; i++) { >> -diag = rte_eth_rx_queue_setup(dev->port_id, i, MAX_RX_QUEUE_LEN, >> +diag = rte_eth_rx_queue_setup(dev->port_id, i, >> + NIC_PORT_RX_Q_SIZE, >>dev->socket_id, >>&rx_conf, dev->dpdk_mp->mp); >> if (diag) { >> ___ >> dev mailing list >> dev@openvswitch.org >> http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size
Checked out the code that was modified by the patch and found that both MAX_RX_QUEUE_LEN and MAX_TX_QUEUE_LEN definitions are dually used for different meaning. Also the name implies something different then a set NIC queue size. Resubmitting patch with zero loss gain in comment following this. -Original Message- From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Polehn, Mike A Sent: Thursday, June 19, 2014 2:45 PM To: Ethan Jackson Cc: dev@openvswitch.org Subject: Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size I coming from an earlier version that had the arguments first setup was as a number, then used in several places including the tx cache size and didn't catch that new 3rd definition were used as I moved the patch forward to try on the latest git updates before sending. There is also a queue sizing formula in the comment that is not obvious. Mike Polehn -Original Message- From: Ethan Jackson [mailto:et...@nicira.com] Sent: Thursday, June 19, 2014 10:21 AM To: Polehn, Mike A Cc: dev@openvswitch.org Subject: Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size One question: why not just increase MAX_RX_QUEUE_LEN and MAX_TX_QUEUE_LEN instead of creating new #defines? Just a thought. I'd like Pravin to review this as I don't know this code as well as him. Ethan On Thu, Jun 19, 2014 at 9:59 AM, Polehn, Mike A wrote: > Large TX and RX queues are needed for high speed 10 GbE physical NICS. > > Signed-off-by: Mike A. Polehn > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index > fbdb6b3..d1bcc73 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -70,6 +70,9 @@ static struct vlog_rate_limit rl = > VLOG_RATE_LIMIT_INIT(5, 20); > > #define NON_PMD_THREAD_TX_QUEUE 0 > > +#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue > +(n*32<4096)*/ #define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical > +NIC TX Queue (n*32<4096)*/ > + > /* TODO: Needs per NIC value for these constants. */ #define > RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. */ > #define RX_HTHRESH 32 /* Default values of RX host threshold reg. */ > @@ -369,7 +372,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) > OVS_REQUIRES(dpdk_mutex) > } > > for (i = 0; i < NR_QUEUE; i++) { > -diag = rte_eth_tx_queue_setup(dev->port_id, i, MAX_TX_QUEUE_LEN, > +diag = rte_eth_tx_queue_setup(dev->port_id, i, > + NIC_PORT_TX_Q_SIZE, >dev->socket_id, &tx_conf); > if (diag) { > VLOG_ERR("eth dev tx queue setup error %d",diag); @@ > -378,7 +381,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) > OVS_REQUIRES(dpdk_mutex) > } > > for (i = 0; i < NR_QUEUE; i++) { > -diag = rte_eth_rx_queue_setup(dev->port_id, i, MAX_RX_QUEUE_LEN, > +diag = rte_eth_rx_queue_setup(dev->port_id, i, > + NIC_PORT_RX_Q_SIZE, >dev->socket_id, >&rx_conf, dev->dpdk_mp->mp); > if (diag) { > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size, resubmit
Large TX and RX queues are needed for high speed 10 GbE physical NICS. Observed a 250% zero loss improvement over small NIC queue test for A port to port flow test. Signed-off-by: Mike A. Polehn diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index fbdb6b3..d1bcc73 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -70,6 +70,9 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20); #define NON_PMD_THREAD_TX_QUEUE 0 +#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue (n*32<4096)*/ +#define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue (n*32<4096)*/ + /* TODO: Needs per NIC value for these constants. */ #define RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. */ #define RX_HTHRESH 32 /* Default values of RX host threshold reg. */ @@ -369,7 +372,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) OVS_REQUIRES(dpdk_mutex) } for (i = 0; i < NR_QUEUE; i++) { -diag = rte_eth_tx_queue_setup(dev->port_id, i, MAX_TX_QUEUE_LEN, +diag = rte_eth_tx_queue_setup(dev->port_id, i, NIC_PORT_TX_Q_SIZE, dev->socket_id, &tx_conf); if (diag) { VLOG_ERR("eth dev tx queue setup error %d",diag); @@ -378,7 +381,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) OVS_REQUIRES(dpdk_mutex) } for (i = 0; i < NR_QUEUE; i++) { -diag = rte_eth_rx_queue_setup(dev->port_id, i, MAX_RX_QUEUE_LEN, +diag = rte_eth_rx_queue_setup(dev->port_id, i, NIC_PORT_RX_Q_SIZE, dev->socket_id, &rx_conf, dev->dpdk_mp->mp); if (diag) { ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [Backport] Backport max-idle to branch-1.10 - branch-2.1.
The idle timeout of 1.5 seconds for exact-match flows creates big problems for testing. I suspect that, other than for internally generated flows within the system, 1.5 seconds is not a reasonable timeout. I worry about thrashing of the flows in a congested setting where the load is so high that packets are discarded because there is not enough time to set up flows, and the discarded packets cause flows to be deleted since the packets that would maintain those flows are never seen. This type of issue, combined with other similar issues, can greatly reduce the overall network performance. One way to approach this is to get the network performance working well first and then retune the parameters that could have an impact.

Imagine someone using an ssh terminal: there would be new flows constantly being created and destroyed for each small user hesitation. Imagine that a lot of the communications through the system happen to be through ssh terminals and the system is a gateway for these communications to a lot of systems. The network performance for this case would probably degrade down to the flow setup performance. This is just one example of many that could exist, while there are of course scenarios where 1.5 seconds would be ideal.

I think the setting of 1.5 seconds is due to inexperience and needs to be drastically changed. If a flow timeout is specified on the OpenFlow command, the exact-match flow timeout should use the OpenFlow-set timeout and not an arbitrary value, since there was probably a reason for setting that particular value.

Currently I use a patch for testing that sets the idle timeout to 15 seconds, and it solves a big issue of OVS not being able to set up flows fast enough for high-speed flows: my equipment (which is industry-standard network test equipment that I cannot modify) can be set to send a small set of packets at a low packet rate before sending the high-speed flows, but there is a moderate time gap of 11 seconds between the two. It is not obvious to me how to set the OVS idle timeout from an external interface. Can an example or a documentation update be provided to indicate exactly how to set this new idle timeout parameter? Hopefully this is a global setting and not a flow-specific setting.

Mike Polehn

-Original Message- From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Alex Wang Sent: Thursday, June 19, 2014 9:20 PM To: dev@openvswitch.org Subject: [ovs-dev] [Backport] Backport max-idle to branch-1.10 - branch-2.1.

This series backports the commit 72310b04 (upcall: Configure datapath max-idle through ovs-vsctl.) to branch-1.10 - branch-2.1, for testing purpose. -- 1.7.9.5

___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [Backport] Backport max-idle to branch-1.10 - branch-2.1.
I had done an internal patch on OVS 2.0 code, does not seem like years ago, but the default timeout was 5 seconds for flow counts less than flow_eviction_threshold. The histogram as written had some algorithm issues, so had the potential to thrash the system for excessive flow removal counts per loop that exceeded flow_eviction_threshold. It might have been this discard of 99% (total / 100) of the flows (in reality all flows over flow_eviction_threshold) each revalidator loop could have been causing the thrashing problem that was observed with the earlier OVS versions. Maybe making a more efficient revalidator might be another way to help this issue. Mike diff --git a/openvswitch/ofproto/ofproto-dpif.c b/openvswitch/ofproto/ofproto-dpif.c index 92f3262..3dd297f 100644 --- a/openvswitch/ofproto/ofproto-dpif.c +++ b/openvswitch/ofproto/ofproto-dpif.c @@ -76,6 +76,19 @@ COVERAGE_DEFINE(subfacet_install_fail); COVERAGE_DEFINE(packet_in_overflow); COVERAGE_DEFINE(flow_mod_overflow); +/* Flow IDLE Timeout definitions */ + +/* Millseconds to timeout flow, original 5000 */ +#define IDLE_FLOW_TIMEOUT 3 +/* Millseconds to timeout flow minimum, original 100 */ +#define IDLE_FLOW_TIMEOUT_MIN 5000 +/* Millseconds to timeout special flow, original 1 */ +#define IDLE_FLOW_TIMEOUT_SPECIAL 4 +/* Idle histogram bucket width (to keep same number of buckets), original 100 */ +#define IDLE_HIST_TIME_WIDTH 500 +/* Idle upper amount to keep each discard cycle, original 0.01 */ +#define IDLE_EVICTION_KEEP_RATE0.9 + /* Number of implemented OpenFlow tables. */ enum { N_TABLES = 255 }; enum { TBL_INTERNAL = N_TABLES - 1 };/* Used for internal hidden rules. */ @@ -3786,17 +3799,17 @@ subfacet_max_idle(const struct dpif_backer *backer) * pass made by update_stats(), because the former function never looks at * uninstallable subfacets. */ -enum { BUCKET_WIDTH = ROUND_UP(100, TIME_UPDATE_INTERVAL) }; -enum { N_BUCKETS = 5000 / BUCKET_WIDTH }; +enum { BUCKET_WIDTH = ROUND_UP(IDLE_HIST_TIME_WIDTH, TIME_UPDATE_INTERVAL) }; +enum { N_BUCKETS = IDLE_FLOW_TIMEOUT / BUCKET_WIDTH }; int buckets[N_BUCKETS] = { 0 }; -int total, subtotal, bucket; +int total, subtotal, bucket, keep, idle_timeout; struct subfacet *subfacet; long long int now; int i; total = hmap_count(&backer->subfacets); if (total <= flow_eviction_threshold) { -return N_BUCKETS * BUCKET_WIDTH; +return IDLE_FLOW_TIMEOUT; } /* Build histogram. */ @@ -3810,11 +3823,13 @@ subfacet_max_idle(const struct dpif_backer *backer) } /* Find the first bucket whose flows should be expired. */ -subtotal = bucket = 0; +keep = MAX(flow_eviction_threshold, (int)(total * IDLE_EVICTION_KEEP_RATE)); +subtotal = bucket = idle_timeout = 0; do { subtotal += buckets[bucket++]; + idle_timeout += BUCKET_WIDTH; } while (bucket < N_BUCKETS && - subtotal < MAX(flow_eviction_threshold, total / 100)); + (subtotal < keep || idle_timeout < IDLE_FLOW_TIMEOUT_MIN)); if (VLOG_IS_DBG_ENABLED()) { struct ds s; @@ -3833,7 +3848,7 @@ subfacet_max_idle(const struct dpif_backer *backer) ds_destroy(&s); } -return bucket * BUCKET_WIDTH; +return idle_timeout; } static void @@ -3844,7 +3859,7 @@ expire_subfacets(struct dpif_backer *backer, int dp_max_idle) /* We really want to keep flows for special protocols around, so use a more * conservative cutoff. 
*/ -long long int special_cutoff = time_msec() - 1; +long long int special_cutoff = time_msec() - IDLE_FLOW_TIMEOUT_SPECIAL; struct subfacet *subfacet, *next_subfacet; struct subfacet *batch[SUBFACET_DESTROY_MAX_BATCH]; -Original Message- From: Ethan Jackson [mailto:et...@nicira.com] Sent: Friday, June 20, 2014 11:11 AM To: Polehn, Mike A Cc: Alex Wang; dev@openvswitch.org Subject: Re: [ovs-dev] [Backport] Backport max-idle to branch-1.10 - branch-2.1. > I think the setting of 1.5 seconds is due to inexperience and needs to be > drastically changed. If flow timeout is specified on the OpenFlow command, > the exact match flow timeout should use the OpenFlow set timeout and not an > arbitrary value since there was probably a reason for setting the particular > value. The 1.5 second number does not come from inexperience, in fact exactly the opposite. Over years of running Open vSwitch in multiple production deployments, we've found that a key factor in maintaining reasonable performance is management of the datapath flow cache. If the idle timeout is too large, then the datapath fills up with unused flows which stress the revalidators and take up space that newer more useful flows could occupy. I can see that when doing performance testing a larger nu
[ovs-dev] [PATCH 1/1] PMD dpdk TX output SMP dpdk queue use and packet surge absorption.
Put in a DPDK queue to receive from multiple core SMP input from vSwitch for NIC TX output. Eliminated the inside polling loop SMP TX output lock (DPDK queue handles SMP). Added a SMP lock for non-polling operation to allow TX output by the non-polling thread when interface not being polled. Lock accessed only when polling is not enabled. Added new netdev subroutine to control polling lock and enable and disable flag. Packets do not get discarded between TX pre-queue and NIC queue to handle surges. Measured improved average PMD port to port 2544 zero loss packet rate of 268,000 for packets 512 bytes and smaller. Predict double that when using 1 cpu core/interface. Observed better persistence of obtaining 100% 10 GbE for larger packets with the added DPDK queue, consistent with other tests outside of OVS where large surges from fast path interfaces transferring larger sized packets from VMs were being absorbed in the NIC TX pre-queue and TX queue and packet loss was suppressed. Signed-off-by: Mike A. Polehn diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 6c281fe..478a0d9 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -1873,6 +1873,10 @@ reload: poll_cnt = pmd_load_queues(f, &poll_list, poll_cnt); atomic_read(&f->change_seq, &port_seq); +/* get poll ownership */ +for (i = 0; i < poll_cnt; i++) + netdev_rxq_do_polling(poll_list[i].rx, true); + for (;;) { unsigned int c_port_seq; int i; @@ -1895,6 +1899,10 @@ reload: } } +/* release poll ownership */ +for (i = 0; i < poll_cnt; i++) + netdev_rxq_do_polling(poll_list[i].rx, false); + if (!latch_is_set(&f->dp->exit_latch)){ goto reload; } diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c index 35a8da4..0f61777 100644 --- a/lib/netdev-bsd.c +++ b/lib/netdev-bsd.c @@ -1596,6 +1596,7 @@ netdev_bsd_update_flags(struct netdev *netdev_, enum netdev_flags off, netdev_bsd_rxq_recv, \ netdev_bsd_rxq_wait, \ netdev_bsd_rxq_drain,\ +NULL, /* rxq_do_polling */ \ } const struct netdev_class netdev_bsd_class = diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index d1bcc73..78f0329 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -73,6 +73,9 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20); #define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue */ #define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue */ +#define NIC_TX_PRE_Q_SIZE 4096 /* Size of Physical NIC TX Pre Queue (2**n)*/ +#define NIC_TX_PRE_Q_TRANS 64 /* Pre Queue to Physical NIC Transfer */ + /* TODO: Needs per NIC value for these constants. */ #define RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. */ #define RX_HTHRESH 32 /* Default values of RX host threshold reg. 
*/ @@ -122,8 +125,6 @@ static const struct rte_eth_txconf tx_conf = { }; enum { MAX_RX_QUEUE_LEN = 64 }; -enum { MAX_TX_QUEUE_LEN = 64 }; -enum { DRAIN_TSC = 20ULL }; static int rte_eal_init_ret = ENODEV; @@ -145,10 +146,12 @@ struct dpdk_mp { }; struct dpdk_tx_queue { -rte_spinlock_t tx_lock; +bool is_polled; +int port_id; int count; -uint64_t tsc; -struct rte_mbuf *burst_pkts[MAX_TX_QUEUE_LEN]; +struct rte_mbuf *tx_trans[NIC_TX_PRE_Q_TRANS]; +struct rte_ring *tx_preq; +rte_spinlock_t tx_lock; }; struct netdev_dpdk { @@ -360,6 +363,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) OVS_REQUIRES(dpdk_mutex) struct ether_addr eth_addr; int diag; int i; +char qname[32]; if (dev->port_id < 0 || dev->port_id >= rte_eth_dev_count()) { return -ENODEV; @@ -372,12 +376,21 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) OVS_REQUIRES(dpdk_mutex) } for (i = 0; i < NR_QUEUE; i++) { +dev->tx_q[i].port_id = dev->port_id; diag = rte_eth_tx_queue_setup(dev->port_id, i, NIC_PORT_TX_Q_SIZE, dev->socket_id, &tx_conf); if (diag) { VLOG_ERR("eth dev tx queue setup error %d",diag); return diag; } + +snprintf(qname, sizeof(qname),"NIC_TX_Pre_Q_%u_%u", dev->port_id, i); +dev->tx_q[i].tx_preq = rte_ring_create(qname, NIC_TX_PRE_Q_SIZE, + dev->socket_id, RING_F_SC_DEQ); +if (NULL == dev->tx_q[i].tx_preq) { +VLOG_ERR("eth dev tx pre-queue alloc error"); +return -ENOMEM; +} } for (i = 0; i < NR_QUEUE; i++) { @@ -451,6 +464,7 @@ netdev_dpdk_construct(struct netdev *netdev_) port_no = strtol(cport, 0, 0); /* string must be null terminated */ for (i = 0; i < NR_QUEUE; i++) { +netdev
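The diff above is cut off here, but the core of the change is a multi-producer/single-consumer rte_ring placed in front of the NIC TX queue. Below is a compacted sketch of that pattern using the same DPDK 1.x/2.x-era calls that appear in the patch (rte_ring_create with RING_F_SC_DEQ, rte_ring_sc_dequeue_burst, rte_eth_tx_burst). The struct and function names, the constants, and the drop-on-full behaviour in the send helper are illustrative assumptions rather than the actual OVS symbols or semantics.

/* Sketch of an SMP TX pre-queue in front of a NIC TX queue. */
#include <stdio.h>
#include <string.h>
#include <stdint.h>

#include <rte_config.h>
#include <rte_ring.h>
#include <rte_mbuf.h>
#include <rte_ethdev.h>

#define TX_PRE_Q_SIZE  4096   /* Ring size; must be a power of two. */
#define TX_PRE_Q_TRANS 64     /* Mbufs moved toward the NIC per drain call. */

struct tx_pre_queue {
    uint8_t port_id;
    uint16_t queue_id;
    int count;                                /* Mbufs staged in 'trans'. */
    struct rte_mbuf *trans[TX_PRE_Q_TRANS];   /* Staging buffer for the NIC. */
    struct rte_ring *ring;                    /* Multi-producer, single-consumer. */
};

/* Creates the pre-queue ring: multi-producer enqueue, single-consumer
 * dequeue, because only the polling thread drains it. */
static struct rte_ring *
tx_pre_queue_create(uint8_t port_id, uint16_t queue_id, int socket_id)
{
    char name[32];

    snprintf(name, sizeof name, "NIC_TX_Pre_Q_%u_%u", port_id, queue_id);
    return rte_ring_create(name, TX_PRE_Q_SIZE, socket_id, RING_F_SC_DEQ);
}

/* Any core may call this; the ring serializes concurrent producers. */
static inline void
tx_pre_queue_send(struct tx_pre_queue *q, struct rte_mbuf *pkt)
{
    if (rte_ring_enqueue(q->ring, pkt) != 0) {
        rte_pktmbuf_free(pkt);                /* Pre-queue full: drop here. */
    }
}

/* Only the polling core calls this: single-consumer dequeue, then a burst to
 * the NIC.  Unsent mbufs stay staged so surges are absorbed, not dropped. */
static inline void
tx_pre_queue_drain(struct tx_pre_queue *q)
{
    if (q->count == 0) {
        q->count = rte_ring_sc_dequeue_burst(q->ring, (void **) q->trans,
                                             TX_PRE_Q_TRANS);
    }
    if (q->count != 0) {
        unsigned sent = rte_eth_tx_burst(q->port_id, q->queue_id,
                                         q->trans, q->count);

        q->count -= sent;
        if (q->count != 0 && sent > 0) {
            /* Move the unsent mbufs to the front for the next attempt. */
            memmove(&q->trans[0], &q->trans[sent],
                    sizeof(struct rte_mbuf *) * q->count);
        }
    }
}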
Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size, resubmit
(n*32<4096) is the rule for valid queue sizes. The maximum size is not 4096 but 4096 - 32 = 4064. I ran into this on a different project where I needed to use the largest buffer size available. Often 2**n is used for queue sizing, but that is not the case for the PMD driver. Mike Polehn -Original Message- From: Pravin Shelar [mailto:pshe...@nicira.com] Sent: Wednesday, June 25, 2014 4:19 PM To: Polehn, Mike A Cc: dev@openvswitch.org Subject: Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size, resubmit On Thu, Jun 19, 2014 at 3:58 PM, Polehn, Mike A wrote: > Large TX and RX queues are needed for high speed 10 GbE physical NICS. > Observed a 250% zero loss improvement over small NIC queue test for A > port to port flow test. > > Signed-off-by: Mike A. Polehn I am fine with the > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index > fbdb6b3..d1bcc73 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -70,6 +70,9 @@ static struct vlog_rate_limit rl = > VLOG_RATE_LIMIT_INIT(5, 20); > > #define NON_PMD_THREAD_TX_QUEUE 0 > > +#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue > +(n*32<4096)*/ I am not sure what does "(n*32<4096)" means. Can you elaborate it bit? > +#define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue > +(n*32<4096)*/ > + > /* TODO: Needs per NIC value for these constants. */ #define > RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. */ > #define RX_HTHRESH 32 /* Default values of RX host threshold reg. */ > @@ -369,7 +372,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) > OVS_REQUIRES(dpdk_mutex) > } > > for (i = 0; i < NR_QUEUE; i++) { > -diag = rte_eth_tx_queue_setup(dev->port_id, i, MAX_TX_QUEUE_LEN, > +diag = rte_eth_tx_queue_setup(dev->port_id, i, > + NIC_PORT_TX_Q_SIZE, >dev->socket_id, &tx_conf); > if (diag) { > VLOG_ERR("eth dev tx queue setup error %d",diag); @@ > -378,7 +381,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) > OVS_REQUIRES(dpdk_mutex) > } > > for (i = 0; i < NR_QUEUE; i++) { > -diag = rte_eth_rx_queue_setup(dev->port_id, i, MAX_RX_QUEUE_LEN, > +diag = rte_eth_rx_queue_setup(dev->port_id, i, > + NIC_PORT_RX_Q_SIZE, >dev->socket_id, >&rx_conf, dev->dpdk_mp->mp); > if (diag) { > > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
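A short standalone sketch of the rule as explained above: a size is valid if it is n*32 for some n and strictly below 4096, making 4064 the ceiling. The helper name valid_nic_q_size() and the use of assert() are illustrative, and the multiple-of-32 reading of "n*32" is an interpretation of the formula given in this thread, not something taken from the DPDK documentation.

/* Standalone check of the queue-size rule discussed above: the descriptor
 * count must be a multiple of 32 and strictly less than 4096, which makes
 * 4064 the largest legal value. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NIC_PORT_RX_Q_SIZE 2048
#define NIC_PORT_TX_Q_SIZE 2048

static bool
valid_nic_q_size(int n)
{
    return n > 0 && n % 32 == 0 && n < 4096;
}

int
main(void)
{
    assert(valid_nic_q_size(NIC_PORT_RX_Q_SIZE));
    assert(valid_nic_q_size(NIC_PORT_TX_Q_SIZE));
    assert(valid_nic_q_size(4064));      /* Largest size the rule allows. */
    assert(!valid_nic_q_size(4096));     /* 4096 itself is rejected. */
    printf("configured queue sizes are valid\n");
    return 0;
}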
Re: [ovs-dev] [PATCH 1/1] PMD dpdk TX output SMP dpdk queue use and packet surge absorption.
I'll rebase, study the OVS coding style, make the corrections, and repost. There is a very good reason for putting constants on the left-hand side of a comparison. For example, if (NULL = x) will be a compiler error, while the following will compile and need debugging: if (x = NULL). Although I try not to make that comparison mistake, I recently made exactly that mistake and had to debug it. If I had put the constant on the left, the compiler would have reported an error and saved the debugging time. Mike Polehn -Original Message- From: Ben Pfaff [mailto:b...@nicira.com] Sent: Wednesday, June 25, 2014 1:48 PM To: Polehn, Mike A Cc: dev@openvswitch.org Subject: Re: [ovs-dev] [PATCH 1/1] PMD dpdk TX output SMP dpdk queue use and packet surge absorption. On Fri, Jun 20, 2014 at 10:24:33PM +, Polehn, Mike A wrote: > Put in a DPDK queue to receive from multiple core SMP input from vSwitch for NIC TX output. > Eliminated the inside polling loop SMP TX output lock (DPDK queue handles SMP). > Added a SMP lock for non-polling operation to allow TX output by the > non-polling thread > when interface not being polled. Lock accessed only when polling is not > enabled. > Added new netdev subroutine to control polling lock and enable and disable > flag. > Packets do not get discarded between TX pre-queue and NIC queue to handle > surges. > > Measured improved average PMD port to port 2544 zero loss packet rate > of 268,000 for packets 512 bytes and smaller. Predict double that when using > 1 cpu core/interface. > > Observed better persistence of obtaining 100% 10 GbE for larger > packets with the added DPDK queue, consistent with other tests outside > of OVS where large surges from fast path interfaces transferring > larger sized packets from VMs were being absorbed in the NIC TX pre-queue and > TX queue and packet loss was suppressed. > > Signed-off-by: Mike A. Polehn This doesn't apply to the current tree. You'll need to rebase and repost it. I have some stylistic comments. Most of the following are cut-and-paste from CONTRIBUTING or CodingStyle (please read both). Many of them apply in multiple places, but I only pointed them out once. Please limit lines in the commit message to 79 characters in width. Comments should be written as full sentences that start with a capital letter and end with a period: > +/* get poll ownership */ Enclose single statements in braces: if (a > b) { return a; } else { return b; } > +for (i = 0; i < poll_cnt; i++) > + netdev_rxq_do_polling(poll_list[i].rx, true); > + > for (;;) { > unsigned int c_port_seq; > int i; When using a relational operator like "<" or "==", put an expression or variable argument on the left and a constant argument on the right, e.g. 
"x == 0", *not* "0 == x": > +if (NULL == dev->tx_q[i].tx_preq) { > +VLOG_ERR("eth dev tx pre-queue alloc error"); > +return -ENOMEM; > +} > } We don't generally put "inline" on functions in C files, since it suppresses otherwise useful "function not used" warnings and doesn't usually help code generation: > inline static void > -dpdk_queue_flush(struct netdev_dpdk *dev, int qid) > +dpdk_port_out(struct dpdk_tx_queue *tx_q, int qid) > { > -struct dpdk_tx_queue *txq = &dev->tx_q[qid]; > -uint32_t nb_tx; > +/* get packets from NIC tx staging queue */ > +if (likely(tx_q->count == 0)) > +tx_q->count = rte_ring_sc_dequeue_burst(tx_q->tx_preq, > +(void **)&tx_q->tx_trans[0], NIC_TX_PRE_Q_TRANS); > + > +/* send packets to NIC tx queue */ > +if (likely(tx_q->count != 0)) { > +unsigned sent = rte_eth_tx_burst(tx_q->port_id, qid, tx_q->tx_trans, > + tx_q->count); > +tx_q->count -= sent; > + > +if (unlikely((tx_q->count != 0) && (sent > 0))) > +/* move unsent packets to front of list */ > +memmove(&tx_q->tx_trans[0], &tx_q->tx_trans[sent], > +(sizeof(struct rte_mbuf *) * tx_q->count)); > +} > +} > > -if (txq->count == 0) { > -return; Put the return type, function name, and the braces that surround the function's code on separate lines, all starting in column 0: > +static void netdev_dpdk_do_poll(struct netdev_rxq *rxq_, unsigned > +enable) { > +struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq_); > +struct netdev *netdev = rx->up.netdev; > +struct netdev_dpdk *dev = netdev_dpdk_cast
Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size, resubmit
If someone wants to change the queue size, maybe bigger, this provides a formula to allow this. There are advantages and also some disadvantages on using a larger queue. Doubling the size does not work since 4096 is invalid and will fail compilation. Unless they researched this carefully, they may think that 2048 is the largest size possible. This is to give a hint of what values the defined value can be set to. Mike Polehn -Original Message- From: Pravin Shelar [mailto:pshe...@nicira.com] Sent: Thursday, June 26, 2014 2:18 PM To: Polehn, Mike A Cc: dev@openvswitch.org Subject: Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue size, resubmit On Thu, Jun 26, 2014 at 7:08 AM, Polehn, Mike A wrote: > (n*32<4096) is the formula for calculating the buffer size. The > maximum size is not > 4096 but 4096 - 32 = 4064. I ran into this issue on a different > project where I needed to use the largest buffers size available. > Often 2**n is used for queue sizing, but is not the case for the PMD driver. > I still not sure how is related to the patch where you set queue size of 2048. > Mike Polehn > > -Original Message- > From: Pravin Shelar [mailto:pshe...@nicira.com] > Sent: Wednesday, June 25, 2014 4:19 PM > To: Polehn, Mike A > Cc: dev@openvswitch.org > Subject: Re: [ovs-dev] PATCH [1/1] High speed PMD physical NIC queue > size, resubmit > > On Thu, Jun 19, 2014 at 3:58 PM, Polehn, Mike A > wrote: >> Large TX and RX queues are needed for high speed 10 GbE physical NICS. >> Observed a 250% zero loss improvement over small NIC queue test for A >> port to port flow test. >> >> Signed-off-by: Mike A. Polehn > I am fine with the >> >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index >> fbdb6b3..d1bcc73 100644 >> --- a/lib/netdev-dpdk.c >> +++ b/lib/netdev-dpdk.c >> @@ -70,6 +70,9 @@ static struct vlog_rate_limit rl = >> VLOG_RATE_LIMIT_INIT(5, 20); >> >> #define NON_PMD_THREAD_TX_QUEUE 0 >> >> +#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue >> +(n*32<4096)*/ > > I am not sure what does "(n*32<4096)" means. Can you elaborate it bit? > >> +#define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue >> +(n*32<4096)*/ >> + >> /* TODO: Needs per NIC value for these constants. */ #define >> RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. */ >> #define RX_HTHRESH 32 /* Default values of RX host threshold reg. */ >> @@ -369,7 +372,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) >> OVS_REQUIRES(dpdk_mutex) >> } >> >> for (i = 0; i < NR_QUEUE; i++) { >> -diag = rte_eth_tx_queue_setup(dev->port_id, i, MAX_TX_QUEUE_LEN, >> +diag = rte_eth_tx_queue_setup(dev->port_id, i, >> + NIC_PORT_TX_Q_SIZE, >>dev->socket_id, &tx_conf); >> if (diag) { >> VLOG_ERR("eth dev tx queue setup error %d",diag); @@ >> -378,7 +381,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) >> OVS_REQUIRES(dpdk_mutex) >> } >> >> for (i = 0; i < NR_QUEUE; i++) { >> -diag = rte_eth_rx_queue_setup(dev->port_id, i, MAX_RX_QUEUE_LEN, >> +diag = rte_eth_rx_queue_setup(dev->port_id, i, >> + NIC_PORT_RX_Q_SIZE, >>dev->socket_id, >>&rx_conf, dev->dpdk_mp->mp); >> if (diag) { >> >> >> ___ >> dev mailing list >> dev@openvswitch.org >> http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
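Since the worry above is that someone doubles the define to 4096 and only finds out when the build or the queue setup fails, the n*32 < 4096 rule can also be encoded as a compile-time check so the failure is immediate and self-explanatory. The sketch below uses a portable negative-array-size trick; the macro name BUILD_CHECK is made up for the example (OVS itself has a BUILD_ASSERT_DECL macro that serves the same purpose).

/* Compile-time guard for the NIC queue-size rule: building with an invalid
 * value fails right here instead of at device setup time. */
#include <stdio.h>

#define NIC_PORT_TX_Q_SIZE 2048   /* Must satisfy n*32 < 4096, i.e. at most 4064. */

/* Static assertion via a negative array size, usable with compilers that
 * predate C11 _Static_assert. */
#define BUILD_CHECK(expr, name) typedef char name[(expr) ? 1 : -1]

BUILD_CHECK(NIC_PORT_TX_Q_SIZE % 32 == 0 && NIC_PORT_TX_Q_SIZE < 4096,
            nic_tx_q_size_is_valid);

int
main(void)
{
    printf("NIC_PORT_TX_Q_SIZE = %d\n", NIC_PORT_TX_Q_SIZE);
    return 0;
}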
[ovs-dev] [Patch v2 1/1] PMD SMP DPDK TX queue added with packet output surge absorption
Version 2: Changes: Rebased due to recent changes in code. Made coding style changes based on feedback from Ben Pfaff. Put in a DPDK queue to receive multiple SMP input from vSwitch for NIC TX output. Eliminated the inside polling loop SMP TX output lock (DPDK queue handles SMP). Reused SMP tx-lock for non-polling operation to allow TX output by a non-polling thread when interface not being polled. Lock only accessed only when polling is not enabled. Added new netdev subroutine to control polling lock and enable and disable flag. Packets do not get discarded between TX pre-queue and NIC queue to handle surges. Removed new code packet buffer leak. Measured improved port to port packet rates. Measured improved average PMD port to port 2544 zero loss packet rate of 299,830 for packets 256 bytes and smaller. Predict double that when using 1 cpu core/interface. Observed better persistence of obtaining 100% 10 GbE for larger packets with the added DPDK queue, consistent with other tests outside of OVS where large surges from fast path interfaces transferring larger sized packets from VMs were being absorbed in the NIC TX pre-queue and TX queue and packet loss was suppressed. Requires earlier patch: PATCH [1/1] High speed PMD physical NIC queue size Signed-off-by: Mike A. Polehn diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c old mode 100644 new mode 100755 index f490900..2a6d79f --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -1894,6 +1894,11 @@ reload: poll_cnt = pmd_load_queues(f, &poll_list, poll_cnt); atomic_read(&f->change_seq, &port_seq); +/* Get polling ownership of interfaces. */ +for (i = 0; i < poll_cnt; i++) { + netdev_rxq_set_polling(poll_list[i].rx, true); +} + for (;;) { unsigned int c_port_seq; int i; @@ -1916,6 +1921,11 @@ reload: } } +/* Release polling ownership of interfaces */ +for (i = 0; i < poll_cnt; i++) { + netdev_rxq_set_polling(poll_list[i].rx, false); +} + if (!latch_is_set(&f->dp->exit_latch)){ goto reload; } diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c index 65ae9f9..fc77b6a 100644 --- a/lib/netdev-bsd.c +++ b/lib/netdev-bsd.c @@ -1609,6 +1609,7 @@ netdev_bsd_update_flags(struct netdev *netdev_, enum netdev_flags off, netdev_bsd_rxq_recv, \ netdev_bsd_rxq_wait, \ netdev_bsd_rxq_drain,\ +NULL, /* rxq_set_polling */ \ } const struct netdev_class netdev_bsd_class = diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index eb06595..e26c6fb 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -73,6 +73,8 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20); #define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue (n*32<4096)*/ #define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue (n*32<4096)*/ +#define NIC_TX_PRE_Q_SIZE 4096 /* Size of Physical NIC TX Pre-Que (2**n)*/ +#define NIC_TX_PRE_Q_TRANS 64 /* Pre-Que to Physical NIC Que Transfer */ /* TODO: Needs per NIC value for these constants. */ #define RX_PTHRESH 32 /* Default values of RX prefetch threshold reg. 
*/ @@ -124,8 +126,6 @@ static const struct rte_eth_txconf tx_conf = { }; enum { MAX_RX_QUEUE_LEN = 64 }; -enum { MAX_TX_QUEUE_LEN = 64 }; -enum { DRAIN_TSC = 20ULL }; static int rte_eal_init_ret = ENODEV; @@ -147,10 +147,12 @@ struct dpdk_mp { }; struct dpdk_tx_queue { -rte_spinlock_t tx_lock; +bool is_polled; +int port_id; int count; -uint64_t tsc; -struct rte_mbuf *burst_pkts[MAX_TX_QUEUE_LEN]; +struct rte_mbuf *tx_trans[NIC_TX_PRE_Q_TRANS]; +struct rte_ring *tx_preq; +rte_spinlock_t tx_lock; }; struct netdev_dpdk { @@ -363,6 +365,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) OVS_REQUIRES(dpdk_mutex) struct ether_addr eth_addr; int diag; int i; +char qname[32]; if (dev->port_id < 0 || dev->port_id >= rte_eth_dev_count()) { return -ENODEV; @@ -375,12 +378,21 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) OVS_REQUIRES(dpdk_mutex) } for (i = 0; i < NR_QUEUE; i++) { +dev->tx_q[i].port_id = dev->port_id; diag = rte_eth_tx_queue_setup(dev->port_id, i, NIC_PORT_TX_Q_SIZE, dev->socket_id, &tx_conf); if (diag) { VLOG_ERR("eth dev tx queue setup error %d",diag); return diag; } + +snprintf(qname, sizeof(qname),"NIC_TX_Pre_Q_%u_%u", dev->port_id, i); +dev->tx_q[i].tx_preq = rte_ring_create(qname, NIC_TX_PRE_Q_SIZE, + dev->socket_id, RING_F_SC_DEQ); +if (dev->tx_q[i].tx_preq == NULL) { +VLOG_ERR("eth dev tx pre-queue alloc error&q
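The "lock only when not polled" behaviour in the commit message above can be condensed into a short sketch. The names tx_queue_ctl, tx_send(), and tx_set_polling(), the plain volatile flag, and the single-packet fallback path are all illustrative assumptions; the actual patch routes ownership through a new netdev rxq hook and has to be careful about ordering when ownership changes hands, which this sketch glosses over.

/* Sketch: producers use the multi-producer pre-queue while a PMD thread is
 * polling the port; when polling is disabled, a spinlock serializes direct
 * transmission instead. */
#include <stdbool.h>
#include <stdint.h>

#include <rte_config.h>
#include <rte_ring.h>
#include <rte_mbuf.h>
#include <rte_ethdev.h>
#include <rte_spinlock.h>

struct tx_queue_ctl {
    volatile bool is_polled;      /* Set and cleared by the polling thread. */
    uint8_t port_id;
    uint16_t queue_id;
    struct rte_ring *pre_q;       /* Multi-producer TX pre-queue. */
    rte_spinlock_t tx_lock;       /* Taken only on the non-polled path. */
};

void
tx_send(struct tx_queue_ctl *q, struct rte_mbuf *pkt)
{
    if (q->is_polled) {
        /* Polled: the PMD thread drains pre_q, so just enqueue (lock-free). */
        if (rte_ring_enqueue(q->pre_q, pkt) != 0) {
            rte_pktmbuf_free(pkt);
        }
    } else {
        /* Not polled: transmit directly, serialized by the spinlock. */
        rte_spinlock_lock(&q->tx_lock);
        if (rte_eth_tx_burst(q->port_id, q->queue_id, &pkt, 1) == 0) {
            rte_pktmbuf_free(pkt);
        }
        rte_spinlock_unlock(&q->tx_lock);
    }
}

/* Called when the PMD thread takes or releases ownership, in the spirit of
 * the netdev_rxq_set_polling() hook added by the patch. */
void
tx_set_polling(struct tx_queue_ctl *q, bool enable)
{
    q->is_polled = enable;
}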
Re: [ovs-dev] [PATCH v2 5/5] netdev-dpdk: Add OVS_UNLIKELY annotations in dpdk_do_tx_copy().
These are already in the git repository code. Mike Polehn -Original Message- From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Daniele Di Proietto Sent: Monday, June 30, 2014 10:00 AM To: Ryan Wilson Cc: dev@openvswitch.org Subject: Re: [ovs-dev] [PATCH v2 5/5] netdev-dpdk: Add OVS_UNLIKELY annotations in dpdk_do_tx_copy(). Acked-by: Daniele Di Proietto On Jun 26, 2014, at 7:17 PM, Ben Pfaff wrote: > Great, thanks. Looks good. > > I'll leave it to whoever reviews the series as a whole to push this. > On Jun 26, 2014 6:36 PM, "Ryan Wilson 76511" wrote: > >> Crap, its late in the day and I can't think / type apparently. Yes >> 0.04 million is what I meant. >> >> And I ran 2 more tests in the meantime with and without the patch and >> I got a 0.03 and 0.04 million PPS increase, respectively. >> Nonetheless, the increase is fairly consistent over 5 different tests. >> >> Ryan >> >> From: Ben Pfaff >> Date: Thursday, June 26, 2014 6:26 PM >> To: Ryan Wilson >> Cc: Ryan Wilson , "dev@openvswitch.org" < >> dev@openvswitch.org> >> Subject: Re: [ovs-dev] [PATCH v2 5/5] netdev-dpdk: Add OVS_UNLIKELY >> annotations in dpdk_do_tx_copy(). >> >> .4 million or .04 million? There's a big difference. >> On Jun 26, 2014 6:24 PM, "Ryan Wilson 76511" wrote: >> >>> Its between 0.2 - 0.6 million PPS increase after running 3 tests >>> with and without this patch. So I went with the average of 0.4 :) >>> >>> And we actually use these annotations elsewhere in >>> netdev_dpdk_send() where we measure size of packets and dropped >>> packets, so it would be nice to add these annotations for code consistency >>> as well. >>> >>> Ryan >>> >>> From: Ben Pfaff >>> Date: Thursday, June 26, 2014 6:20 PM >>> To: Ryan Wilson >>> Cc: "dev@openvswitch.org" >>> Subject: Re: [ovs-dev] [PATCH v2 5/5] netdev-dpdk: Add OVS_UNLIKELY >>> annotations in dpdk_do_tx_copy(). >>> >>> That's pretty impressive. Is the performance consistent enough to >>> be sure, then? >>> >>> In either case I don't object to the patch. >>> On Jun 26, 2014 6:17 PM, "Ryan Wilson" wrote: >>> >>>> Since dropped packets due to large packet size or lack of memory >>>> are unlikely, it is best to add OVS_UNLIKELY annotations to these >>>> conditions. >>>> >>>> With DPDK fast path forwarding test, this increased throughtput >>>> from 4.12 to 4.16 million packets per second. 
>>>> >>>> Signed-off-by: Ryan Wilson >>>> --- >>>> lib/netdev-dpdk.c |4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index >>>> 0aee14e..03f1e02 100644 >>>> --- a/lib/netdev-dpdk.c >>>> +++ b/lib/netdev-dpdk.c >>>> @@ -664,7 +664,7 @@ dpdk_do_tx_copy(struct netdev *netdev, struct >>>> dpif_packet ** pkts, int cnt) >>>> >>>> for (i = 0; i < cnt; i++) { >>>> int size = ofpbuf_size(&pkts[i]->ofpbuf); >>>> -if (size > dev->max_packet_len) { >>>> +if (OVS_UNLIKELY(size > dev->max_packet_len)) { >>>> VLOG_WARN_RL(&rl, "Too big size %d max_packet_len %d", >>>> (int)size , dev->max_packet_len); >>>> >>>> @@ -688,7 +688,7 @@ dpdk_do_tx_copy(struct netdev *netdev, struct >>>> dpif_packet ** pkts, int cnt) >>>> newcnt++; >>>> } >>>> >>>> -if (dropped) { >>>> +if (OVS_UNLIKELY(dropped)) { >>>> ovs_mutex_lock(&dev->mutex); >>>> dev->stats.tx_dropped += dropped; >>>> ovs_mutex_unlock(&dev->mutex); >>>> -- >>>> 1.7.9.5 >>>> >>>> ___ >>>> dev mailing list >>>> dev@openvswitch.org >>>> https://urldefense.proofpoint.com/v1/url?u=http://openvswitch.org/m >>>> ailman/listinfo/dev&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=MV9BdLjtF >>>> IdhBDBaw5z%2BU6SSA2gAfY4L%2F1HCy3VjlKU%3D%0A&m=B%2BD2KiuphwYDp1kjSp >>>> IP5KeaBvJJGWoiQ7P6URgnkvM%3D%0A&s=9ce118c52fc0ec372ba651cd20cfd5e5b >>>> 2f4692865c242bb3adea3834b82fb5f >>>> <https://urldefense.proofpoint.com/v1/url?u=http://openvswitch.org/ >>>> mailman/listinfo/dev&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=TfBS78Vw >>>> 3dzttvXidhbffg%3D%3D%0A&m=wtH3lN2ST0E5hR7ESg7AwzXseDogoZZdb1KOoAV5u >>>> Q0%3D%0A&s=1542518c0ff9ce83f83a308a7e942d661a79c78b4fbac3e67a27b268 >>>> c9d58df0> >>>> >>> > ___ > dev mailing list > dev@openvswitch.org > https://urldefense.proofpoint.com/v1/url?u=http://openvswitch.org/mail > man/listinfo/dev&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=MV9BdLjtFIdhBDB > aw5z%2BU6SSA2gAfY4L%2F1HCy3VjlKU%3D%0A&m=B%2BD2KiuphwYDp1kjSpIP5KeaBvJ > JGWoiQ7P6URgnkvM%3D%0A&s=9ce118c52fc0ec372ba651cd20cfd5e5b2f4692865c24 > 2bb3adea3834b82fb5f ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
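For anyone unfamiliar with the annotations being discussed, they are thin wrappers around the compiler's branch-prediction hint. The sketch below defines local LIKELY/UNLIKELY macros on top of __builtin_expect, which is also how OVS's OVS_LIKELY/OVS_UNLIKELY are defined for GCC-compatible compilers; the send_packet() function and its parameters are invented purely to show where the hint pays off.

/* Branch-prediction hints: tell the compiler which way a test usually goes
 * so it can lay out the hot path without taken branches. */
#include <stdio.h>

#ifdef __GNUC__
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

static int
send_packet(int size, int max_packet_len, int *dropped)
{
    /* Oversized packets are rare, so keep the common path branch-free. */
    if (UNLIKELY(size > max_packet_len)) {
        (*dropped)++;
        return -1;
    }
    return 0;                     /* Normal transmit path would go here. */
}

int
main(void)
{
    int dropped = 0;

    send_packet(64, 1518, &dropped);
    send_packet(9000, 1518, &dropped);
    printf("dropped: %d\n", dropped);
    return 0;
}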
[ovs-dev] [PATCH 1/1] netdev IVSHMEM shared memory usage documentation
This adds documentation for DPDK netdev to do an IVSHMEM shared memory to host app or VM app test using current OVS code. This example allows people to do learn how it is done, so that they can develop their own shared IVSHMEM memory applications. Also adds knowledge to better system setup for realtime task operation. Signed-off-by: Mike A. Polehn diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md index cdef6cf..64ae6f1 100644 --- a/INSTALL.DPDK.md +++ b/INSTALL.DPDK.md @@ -49,6 +49,41 @@ on Debian/Ubuntu) For further details refer to http://dpdk.org/ +1b. Alternative DPDK 2.0 install + 1. Get DPDK from git repository + + cd /usr/src + git clone git://dpdk.org/dpdk + cd /usr/src/dpdk + git checkout -b test_v2.0.0 v2.0.0 + export DPDK_DIR=/usr/src/dpdk + + 2 If DPDK already installed with different version or build parameters. + Ideally done before checking out a new version. + + cd $(DPDK_DIR) + make uninstall + + 3. Build DPDK with IVSHMEM and User Side vHost support. + Note: Split ring (optional CONFIG_RTE_RING_SPLIT_PROD_CONS=y) has + notably better performance for two simaltanious data sources, as in + the case of two simultaneous port tasks or threads, writing into an + IVSHMEM ring (in either host or VM) at the same time. However for just + one task or thread, for example one port of data being switched a full + rate into IVSHMEM ring buffer, little or no data rate difference will + be observed. + + cd $(DPDK_DIR) + make install T=x86_64-ivshmem-linuxapp-gcc CONFIG_RTE_LIBRTE_VHOST=y \ + CONFIG_RTE_BUILD_COMBINE_LIBS=y CONFIG_RTE_LIBRTE_VHOST_USER=n \ + CONFIG_RTE_RING_SPLIT_PROD_CONS=y + + Note: Any host or VM task using shared memory as in the case of IVSHMEM, + must have DPDK built and installed in exactly the same way for all + DPDK programs on the system. DPDK install in VM needs same DPDK source + and build. Any changes in DPDK build requires all apps, including OVS, + host apps, and VM apps to be rebuilt and relinked. + 2. Configure & build the Linux kernel: Refer to intel-dpdk-getting-started-guide.pdf for understanding @@ -85,9 +120,24 @@ Using the DPDK with ovs-vswitchd: - 1. Setup system boot - Add the following options to the kernel bootline: + Add the following options to the kernel bootline for both 1 GB and 2 MB support: - `default_hugepagesz=1GB hugepagesz=1G hugepages=1` + `default_hugepagesz=1GB hugepagesz=1GB hugepages=16 hugepagesz=2M hugepages=2048` + + For just 1 GB hugepage support: + + `default_hugepagesz=1GB hugepagesz=1GB hugepages=16` + + This kernel bootline will allocate half the hugepages on each NUMA node. For + the IVSHMEM test below, 4 GB of 1 GB huge pages is needed for the test (1GB + for OVS and 3 GB for VM). This requires at least 8 1 GB pages for 4 1 GB + pages of NUMA node 0 hugepage memory to be available since half will be + allocated on NUMA Node 1 (assuming 2 CPU socket system). If system has + limited amount of memory or only 1 NUMA node, may need to adjust. At this + time 1GB pages are required and 2 MB pages are optional but very desirable + to have both 1 GB and 2 MB hugepage memory available on host at same time. + Dual hugepage memory size in VM is also very desirable (see IVSHMEM VM + setup information below). 2. Setup DPDK devices: @@ -112,9 +162,14 @@ Using the DPDK with ovs-vswitchd: 3. Bind network device to vfio-pci: `$DPDK_DIR/tools/dpdk_nic_bind.py --bind=vfio-pci eth1` -3. Mount the hugetable filsystem - +3. 
Mount the hugetable filesystem + The following may or may not be needed depending on host OS: + `mkdir /dev/hugepages` + Mount for 1 GB hugepages `mount -t hugetlbfs -o pagesize=1G none /dev/hugepages` + For additional 2 MB hugepage support: + `mkdir /dev/hugepages_2mb` + `mount -t hugetlbfs nodev /dev/hugepages_2mb -o pagesize=2MB` Ref to http://www.dpdk.org/doc/quick-start for verifying DPDK setup. @@ -267,7 +322,7 @@ Using the DPDK with ovs-vswitchd: ovs-appctl dpif-netdev/pmd-stats-show ``` -DPDK Rings : +DPDK Rings: Following the steps above to create a bridge, you can now add dpdk rings @@ -299,12 +354,124 @@ The application simply receives an mbuf on the receive queue of the ethernet ring and then places that same mbuf on the transmit ring of the ethernet ring. It is a trivial loopback application. +DPDK Ring access on Host using IVSHMEM: +--- + +Use test program ring_client for IVSHMEM flow test. Requires DPDK to have +been built with IVSHMEM support. Rebuild DPDK and OVS with IVSHMEM support +(above) if not already. + +1: Move to directory with ring_client.c + + cd $(OVS_DIR)/tests/dpdk + + If desired, copy outside of OVS code tree and move to, to create + example of a DPDK host app with an IVSHMEM ring accessbile from OVS. + +2: Patch or
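The ring_client loopback referred to above boils down to a few DPDK calls. The outline below is a sketch, not the actual tests/dpdk/ring_client.c: in particular the ring names "dpdkr0_tx" and "dpdkr0_rx" are assumptions about what the OVS dpdkr port registers, so check the names your build actually creates before relying on them, and real code would use burst dequeue/enqueue rather than one packet at a time.

/* Minimal IVSHMEM/dpdkr loopback sketch: look up the shared rings created by
 * OVS and echo every packet from the RX side back to the TX side. */
#include <stdio.h>
#include <stdlib.h>

#include <rte_config.h>
#include <rte_eal.h>
#include <rte_ring.h>

int
main(int argc, char **argv)
{
    struct rte_ring *rx_ring, *tx_ring;
    void *pkt;

    if (rte_eal_init(argc, argv) < 0) {
        fprintf(stderr, "EAL initialization failed\n");
        return EXIT_FAILURE;
    }

    /* Assumed names for OVS port "dpdkr0": what OVS transmits we receive,
     * and what we enqueue OVS receives. */
    rx_ring = rte_ring_lookup("dpdkr0_tx");
    tx_ring = rte_ring_lookup("dpdkr0_rx");
    if (rx_ring == NULL || tx_ring == NULL) {
        fprintf(stderr, "shared rings not found; is the dpdkr port added?\n");
        return EXIT_FAILURE;
    }

    for (;;) {
        if (rte_ring_dequeue(rx_ring, &pkt) == 0) {
            while (rte_ring_enqueue(tx_ring, pkt) != 0) {
                ;                 /* TX ring full: retry rather than leak. */
            }
        }
    }
}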