[dpdk-dev] [PATCH 01/10] examples/tep_termination:initialize the VXLAN sample

2015-05-18 Thread Liu, Jijiang
Hi Stephen,

Thanks for reviewing the patch.

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, May 16, 2015 7:54 AM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 01/10] examples/tep_termination:initialize the
> VXLAN sample
> 
> I agree that this is great to see a real example of this
> 
> On Fri, 15 May 2015 14:08:52 +0800
> Jijiang Liu  wrote:
> 
> > +static unsigned
> > +check_ports_num(unsigned nb_ports)
> > +{
> > +   unsigned valid_nb_ports = nb_ports;
> > +   unsigned portid;
> > +
> > +   if (nb_ports > nb_ports) {
> > +   RTE_LOG(INFO, VHOST_PORT, "\nSpecified port number(%u)
> exceeds total system port number(%u)\n",
> > +   nb_ports, nb_ports);
> > +   nb_ports = nb_ports;
> 
> This looks repetitive, and wrong. Is it something to shut up a compiler
> warning, or something that happened as a result of a global replace?

Yes, it happened as a result of a global replace. I will fix this in the next patch
version.
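
For reference, a minimal sketch of what the check presumably intends once the two values are distinct again. The names requested_ports and system_ports are placeholders invented here, not the identifiers used in the patch, and RTE_LOG is replaced with plain fprintf so the snippet stands alone:

```c
#include <stdio.h>

/*
 * Clamp the number of ports requested by the user to the number of
 * ports the system actually has. Both parameter names are hypothetical.
 */
static unsigned
clamp_ports_num(unsigned requested_ports, unsigned system_ports)
{
	if (requested_ports > system_ports) {
		fprintf(stderr,
			"Specified port number(%u) exceeds total system port number(%u)\n",
			requested_ports, system_ports);
		requested_ports = system_ports;
	}
	return requested_ports;
}
```

With two different variables the comparison can actually be true, which is what the collapsed `nb_ports > nb_ports` version could never be.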




[dpdk-dev] [PATCH] fm10k: support XEN domain0

2015-05-18 Thread Liu, Jijiang
Hi guys,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen Hemminger
> Sent: Saturday, May 16, 2015 7:58 AM
> To: He, Shaopeng
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] fm10k: support XEN domain0
> 
> On Fri, 15 May 2015 16:56:02 +0800
> Shaopeng He  wrote:
> 
> > fm10k was failing to run in XEN domain0, as the physical memory for
> > DMA should be allocated and translated in a different way for XEN
> > domain0. So
> > rte_memzone_reserve_bounded() should be used for DMA memory
> > allocation, and rte_mem_phy2mch() should be used for DMA memory
> > address translation to support running fm10k PMD in XEN domain0.
> >
> > Signed-off-by: Shaopeng He 
> 
> I agree with Thomas that this code has spread everywhere and should be in a
> common spot.
> 
> Also, we discovered as part of the Xen net-front driver that it should be a
> runtime determination, not a config option!

I also agree that it should be in a common spot.
But it would be better to apply Stephen's patch below first. Shaopeng could then
just use the common function from that patch, which would be good.
http://dpdk.org/ml/archives/dev/2015-March/014992.html



[dpdk-dev] [PATCH 01/10] examples/tep_termination:initialize the VXLAN sample

2015-05-18 Thread Liu, Jijiang


> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, May 16, 2015 7:56 AM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 01/10] examples/tep_termination:initialize the
> VXLAN sample
> 
> On Fri, 15 May 2015 14:08:52 +0800
> Jijiang Liu  wrote:
> 
> > +   while (dev_ll != NULL) {
> > +   /*get virtio device ID*/
> 
> Really minor style nit. Please put whitespace in comments.
> Do this instead.
>   /* get virtio device ID */
> 
> Also, the name virtio is confusing since it can be confused with KVM virtio.

Ok, will fix them in next version.



[dpdk-dev] [PATCH 00/10] Add a VXLAN sample

2015-05-18 Thread Liu, Jijiang
Hi John,

Though it is an example, I think it is better to split these changes into
multiple patches, so that it is clear what I have changed in each one.
It should be clear and easy to review these changes on the mailing list:
http://dpdk.org/ml/archives/dev/2015-May/017693.html


Thanks
Jijiang Liu

> -----Original Message-----
> From: Mcnamara, John
> Sent: Friday, May 15, 2015 4:19 PM
> To: Liu, Jijiang; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 00/10] Add a VXLAN sample
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> > Sent: Friday, May 15, 2015 7:09 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH 00/10] Add a VXLAN sample
> 
> Hi,
> 
> There are multiple changes on the same (new) files within the patchset. These
> would be better rebased/squashed into one patch. That would make it easier to
> review as well.
> 
> John



[dpdk-dev] [PATCH v4 1/5] vhost: eventfd_link: moving ioctl to a function

2015-05-18 Thread Xie, Huawei
On 5/7/2015 9:17 PM, Pavel Boldin wrote:


On Thu, May 7, 2015 at 10:57 AM, Xie, Huawei <huawei.xie at intel.com> wrote:
On 4/3/2015 1:02 AM, Pavel Boldin wrote:
> Move ioctl `EVENTFD_COPY' handler code to an inline function.
Pavel:
There is no necessity to inline this function.
Xie, there is not even any necessity to split this into a five-piece patch series. I
did that solely for the purpose of clean reading.

There is no necessity to inline any single-use function as long as the compiler
is decent. But I prefer to instruct the compiler to do this explicitly so there is
no call/ret path in the generated code.

The purpose of inline is not friendlier reading. inline is for
performance only.
Pavel


/huawei




[dpdk-dev] Can't compile master branch with icc

2015-05-18 Thread Tetsuya Mukawa
Hi Helin,

It seems the master branch cannot be compiled with icc, as shown below.

$ T=x86_64-native-linuxapp-icc make install

...snip

dpdk/lib/librte_pmd_i40e/i40e/i40e_nvm.c(1022): error #188: enumerated
type mixed with another type
  hw->aq.asq_last_status = old_asq_status;


Bisecting suggests that the patch below may cause the issue.
3b7271f i40e/base: catch NVM write semaphore timeout and retry

Could you please check it?

Regards,
Tetsuya
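
icc reports error #188 when a value of another type is assigned to an enum-typed object. How old_asq_status is declared in the offending commit is not shown in this thread, so the following is only a sketch with made-up types. It shows the general pattern of keeping the saved copy in the same enum type as the field it is restored to:

```c
/* Hypothetical stand-ins for the driver's types; not the real i40e definitions. */
enum demo_aq_err {
	DEMO_AQ_RC_OK = 0,
	DEMO_AQ_RC_EBUSY = 12
};

struct demo_admin_queue {
	enum demo_aq_err asq_last_status;
};

/*
 * Save the last status, let an operation overwrite it, then restore it.
 * The saved copy has the same enum type as the field, so the restore is
 * an enum-to-enum assignment and no mixed-type diagnostic applies.
 */
static enum demo_aq_err
demo_restore_status(struct demo_admin_queue *aq, enum demo_aq_err new_status)
{
	enum demo_aq_err old_status = aq->asq_last_status;

	aq->asq_last_status = new_status; /* stands in for work that clobbers the status */
	aq->asq_last_status = old_status; /* restore the saved value */
	return aq->asq_last_status;
}
```

An explicit cast at the assignment would be another way to satisfy the compiler. Which of the two the eventual fix uses is not known from this thread.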


[dpdk-dev] [PATCH v2] vhost: flush used->idx update before reading avail->flags

2015-05-18 Thread Nikita Kalyazin
Ah, sorry. I looked at it without the context. Thanks.

-- 

Best regards,

Nikita Kalyazin,
n.kalyazin at samsung.com

Software Engineer
CE OS Group
Samsung R&D Institute Russia
Tel: +7 (495) 797-25-00 #3816
Tel: +7 (495) 797-25-03
Office #1501, 12-1, Dvintsev str.,
Moscow, 127018, Russia

On Fri, May 15, 2015 at 05:23:35PM +0200, Michael S. Tsirkin wrote:
> On Fri, May 15, 2015 at 04:43:33PM +0300, Nikita Kalyazin wrote:
> > Hi,
> > 
> > 
> > Maybe I missed a part of the discussion, but is there any special purpose 
> > for using rte_mb (both read and write fence) here rather than rte_wmb 
> > (write fence only)?
> 
> The fence is between write of used->idx and read of avail->flags, so
> rte_wmb won't do anything useful.
> 
> > -- 
> > 
> > Best regards,
> > 
> > Nikita Kalyazin,
> > n.kalyazin at samsung.com
> > 
> > Software Engineer
> > CE OS Group
> > Samsung R&D Institute Russia
> > Tel: +7 (495) 797-25-00 #3816
> > Tel: +7 (495) 797-25-03
> > Office #1501, 12-1, Dvintsev str.,
> > Moscow, 127018, Russia
> > 
> > On Wed, May 13, 2015 at 12:46:30PM +0200, Thomas Monjalon wrote:
> > > 2015-04-29 19:11, Huawei Xie:
> > > > update of used->idx and read of avail->flags could be reordered.
> > > > memory fence should be used to ensure the order, otherwise guest could 
> > > > see a stale used->idx value after it toggles the interrupt suppression 
> > > > flag.
> > > > After guest sets the interrupt suppression flag, it will check if there 
> > > > is more buffer to process through used->idx. If it sees a stale value, 
> > > > it will exit the processing while the host won't send an interrupt to the guest.
> > > > 
> > > > Signed-off-by: Huawei Xie 
> > > 
> > > Applied with following title, thanks
> > >   vhost: fix virtio freeze due to missed interrupt
> > > 
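
The ordering discussed above is a store followed by a load, which a write-only fence does not cover. A minimal sketch using C11 atomics rather than the DPDK barrier macros; the struct layout and the flag value are simplified stand-ins invented for illustration, not the real vhost definitions:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified ring layout; not the real vhost structures. */
struct demo_used_ring  { _Atomic uint16_t idx; };
struct demo_avail_ring { _Atomic uint16_t flags; };

#define DEMO_NO_INTERRUPT 1u /* stand-in for the guest's suppression flag */

/*
 * Publish a new used index, then decide whether to interrupt the guest.
 * The store has to be ordered before the load that follows it. A write
 * fence orders stores against later stores only, so a full fence is used.
 */
static bool
demo_publish_and_check(struct demo_used_ring *used,
		       struct demo_avail_ring *avail, uint16_t new_idx)
{
	atomic_store_explicit(&used->idx, new_idx, memory_order_relaxed);

	/* Full barrier, playing the role that rte_mb() plays in the patch. */
	atomic_thread_fence(memory_order_seq_cst);

	uint16_t flags = atomic_load_explicit(&avail->flags,
					      memory_order_relaxed);

	/* true means "send an interrupt to the guest" */
	return (flags & DEMO_NO_INTERRUPT) == 0;
}
```

Without the full fence, the load of the flags could be satisfied before the new index is visible to the guest, which is the missed-interrupt case the commit message describes.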


[dpdk-dev] Can't compile master branch with icc

2015-05-18 Thread Zhang, Helin
Thank you very much for the good catch on ICC! Please try gcc for now. Sorry 
for any inconvenience!
I will send out the patch soon.

Regards,
Helin

> -----Original Message-----
> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Monday, May 18, 2015 2:09 PM
> To: Zhang, Helin
> Cc: dev at dpdk.org
> Subject: Can't compile master branch with icc
> 
> Hi Helin,
> 
> It seems master branch cannot be compiled with icc like below.
> 
> $ T=x86_64-native-linuxapp-icc make install
> 
> ...snip
> 
> dpdk/lib/librte_pmd_i40e/i40e/i40e_nvm.c(1022): error #188:
> enumerated type mixed with another type
>   hw->aq.asq_last_status = old_asq_status;
> 
> 
> As a result of bisecting, below patch may cause the issue.
> 3b7271f i40e/base: catch NVM write semaphore timeout and retry
> 
> Could you please check it?
> 
> Regards,
> Tetsuya


[dpdk-dev] [PATCH 0/4] misc compilation fixes

2015-05-18 Thread Olivier Matz
This series contains compilation fixes.

Olivier Matz (4):
  examples/bond: fix compilation with clang
  examples/netmap: fix compilation for x86_x32-native-linuxapp-gcc
  pmds: fix 32 bits compilation with debug enabled
  examples/mk: add dependencies for timer and vm_power_manager

 examples/Makefile  |   4 +-
 examples/bond/main.c   |   2 +-
 examples/netmap_compat/lib/compat_netmap.c |   2 +-
 lib/librte_pmd_fm10k/fm10k_rxtx.c  |   5 +-
 lib/librte_pmd_i40e/i40e_ethdev.c  | 124 ++---
 lib/librte_pmd_i40e/i40e_rxtx.c|   2 +-
 lib/librte_pmd_virtio/virtio_ethdev.c  |   2 +-
 7 files changed, 72 insertions(+), 69 deletions(-)

-- 
2.1.4



[dpdk-dev] [PATCH 1/4] examples/bond: fix compilation with clang

2015-05-18 Thread Olivier Matz
Fix the following compilation error:

examples/bond/main.c:717:1: error: control reaches end of
  non-void function [-Werror,-Wreturn-type]

The prompt() function does not return anything, so fix its prototype
to be void.

Signed-off-by: Olivier Matz 
---
 examples/bond/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/bond/main.c b/examples/bond/main.c
index e90dc1d..4622283 100644
--- a/examples/bond/main.c
+++ b/examples/bond/main.c
@@ -705,7 +705,7 @@ cmdline_parse_ctx_t main_ctx[] = {
 };

 /* prompt function, called from main on MASTER lcore */
-static void *prompt(__attribute__((unused)) void *arg1)
+static void prompt(__attribute__((unused)) void *arg1)
 {
struct cmdline *cl;

-- 
2.1.4



[dpdk-dev] [PATCH 2/4] examples/netmap: fix compilation for x86_x32-native-linuxapp-gcc

2015-05-18 Thread Olivier Matz
Fix a cast issue:
examples/netmap_compat/lib/compat_netmap.c:827:10: error: cast to
  pointer from integer of different size [-Werror=int-to-pointer-cast]

Signed-off-by: Olivier Matz 
---
 examples/netmap_compat/lib/compat_netmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/netmap_compat/lib/compat_netmap.c 
b/examples/netmap_compat/lib/compat_netmap.c
index 1d86ef0..856ab6e 100644
--- a/examples/netmap_compat/lib/compat_netmap.c
+++ b/examples/netmap_compat/lib/compat_netmap.c
@@ -824,7 +824,7 @@ rte_netmap_mmap(void *addr, size_t length,
return (MAP_FAILED);
}

-   return ((void *)((uintptr_t)netmap.mem + offset));
+   return (void *)((uintptr_t)netmap.mem + (uintptr_t)offset);
 }

 /**
-- 
2.1.4
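
On an x32 target pointers are 32 bits wide while the offset type can be 64 bits, so adding the raw offset produces a 64-bit integer that is then cast to a 32-bit pointer. A minimal sketch of the pattern used in the fix; the helper name is invented and uint64_t merely stands in for the wider offset type, since the actual type of offset is not visible in the hunk:

```c
#include <stdint.h>

/*
 * Add a byte offset, held in a type that may be wider than a pointer,
 * to a base address. Converting the offset to uintptr_t first keeps the
 * sum pointer-sized, so the final cast is between same-sized types.
 */
static void *
demo_ptr_add_offset(void *base, uint64_t offset)
{
	return (void *)((uintptr_t)base + (uintptr_t)offset);
}
```

The conversion silently truncates offsets that do not fit in a pointer, which is acceptable here only because such an offset could not address valid memory anyway.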



[dpdk-dev] [PATCH 3/4] pmds: fix 32 bits compilation with debug enabled

2015-05-18 Thread Olivier Matz
When debug is enabled for 32 bits targets, it triggers some format
errors that are not visible in 64 bits. Fix them by using the proper
format from inttypes.h or the proper cast.
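
As a small illustration of the inttypes.h approach (a sketch with an invented helper name, using snprintf instead of the PMD log macros so that it stands alone):

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Format a 64-bit counter portably. "%lu" matches uint64_t only where
 * long is 64 bits wide; PRIu64 expands to the correct conversion
 * specifier for the target being compiled.
 */
static int
demo_format_counter(char *buf, size_t len, const char *name, uint64_t value)
{
	return snprintf(buf, len, "%s: %" PRIu64, name, value);
}
```

The macro is a string literal, so it is pasted between the adjacent pieces of the format string, which is why the patched lines look like "%"PRIu64"".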

Signed-off-by: Olivier Matz 
---
 lib/librte_pmd_fm10k/fm10k_rxtx.c |   5 +-
 lib/librte_pmd_i40e/i40e_ethdev.c | 124 +-
 lib/librte_pmd_i40e/i40e_rxtx.c   |   2 +-
 lib/librte_pmd_virtio/virtio_ethdev.c |   2 +-
 4 files changed, 68 insertions(+), 65 deletions(-)

diff --git a/lib/librte_pmd_fm10k/fm10k_rxtx.c 
b/lib/librte_pmd_fm10k/fm10k_rxtx.c
index 83bddfc..56df6cd 100644
--- a/lib/librte_pmd_fm10k/fm10k_rxtx.c
+++ b/lib/librte_pmd_fm10k/fm10k_rxtx.c
@@ -30,6 +30,9 @@
  *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
+
+#include <inttypes.h>
+
 #include 
 #include 
 #include "fm10k.h"
@@ -57,7 +60,7 @@ static inline void dump_rxd(union fm10k_rx_desc *rxd)
PMD_RX_LOG(DEBUG, "|   0x%08x   |   0x%08x   |", 0, rxd->d.rss);
PMD_RX_LOG(DEBUG, "+|+");
PMD_RX_LOG(DEBUG, "|TIME TAG |");
-   PMD_RX_LOG(DEBUG, "|   0x%016lx|", rxd->q.timestamp);
+   PMD_RX_LOG(DEBUG, "|   0x%016"PRIx64"|", rxd->q.timestamp);
PMD_RX_LOG(DEBUG, "+|+");
 }
 #endif
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 96700e4..ece88d9 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -1200,19 +1200,19 @@ i40e_update_vsi_stats(struct i40e_vsi *vsi)

PMD_DRV_LOG(DEBUG, "* VSI[%u] stats start 
***",
vsi->vsi_id);
-   PMD_DRV_LOG(DEBUG, "rx_bytes:%lu", nes->rx_bytes);
-   PMD_DRV_LOG(DEBUG, "rx_unicast:  %lu", nes->rx_unicast);
-   PMD_DRV_LOG(DEBUG, "rx_multicast:%lu", nes->rx_multicast);
-   PMD_DRV_LOG(DEBUG, "rx_broadcast:%lu", nes->rx_broadcast);
-   PMD_DRV_LOG(DEBUG, "rx_discards: %lu", nes->rx_discards);
-   PMD_DRV_LOG(DEBUG, "rx_unknown_protocol: %lu",
+   PMD_DRV_LOG(DEBUG, "rx_bytes:%"PRIu64"", nes->rx_bytes);
+   PMD_DRV_LOG(DEBUG, "rx_unicast:  %"PRIu64"", nes->rx_unicast);
+   PMD_DRV_LOG(DEBUG, "rx_multicast:%"PRIu64"", nes->rx_multicast);
+   PMD_DRV_LOG(DEBUG, "rx_broadcast:%"PRIu64"", nes->rx_broadcast);
+   PMD_DRV_LOG(DEBUG, "rx_discards: %"PRIu64"", nes->rx_discards);
+   PMD_DRV_LOG(DEBUG, "rx_unknown_protocol: %"PRIu64"",
nes->rx_unknown_protocol);
-   PMD_DRV_LOG(DEBUG, "tx_bytes:%lu", nes->tx_bytes);
-   PMD_DRV_LOG(DEBUG, "tx_unicast:  %lu", nes->tx_unicast);
-   PMD_DRV_LOG(DEBUG, "tx_multicast:%lu", nes->tx_multicast);
-   PMD_DRV_LOG(DEBUG, "tx_broadcast:%lu", nes->tx_broadcast);
-   PMD_DRV_LOG(DEBUG, "tx_discards: %lu", nes->tx_discards);
-   PMD_DRV_LOG(DEBUG, "tx_errors:   %lu", nes->tx_errors);
+   PMD_DRV_LOG(DEBUG, "tx_bytes:%"PRIu64"", nes->tx_bytes);
+   PMD_DRV_LOG(DEBUG, "tx_unicast:  %"PRIu64"", nes->tx_unicast);
+   PMD_DRV_LOG(DEBUG, "tx_multicast:%"PRIu64"", nes->tx_multicast);
+   PMD_DRV_LOG(DEBUG, "tx_broadcast:%"PRIu64"", nes->tx_broadcast);
+   PMD_DRV_LOG(DEBUG, "tx_discards: %"PRIu64"", nes->tx_discards);
+   PMD_DRV_LOG(DEBUG, "tx_errors:   %"PRIu64"", nes->tx_errors);
PMD_DRV_LOG(DEBUG, "* VSI[%u] stats end 
***",
vsi->vsi_id);
 }
@@ -1424,73 +1424,73 @@ i40e_dev_stats_get(struct rte_eth_dev *dev, struct 
rte_eth_stats *stats)
stats->ierrors  = stats->ibadcrc + stats->ibadlen + stats->imissed;

PMD_DRV_LOG(DEBUG, "* PF stats start 
***");
-   PMD_DRV_LOG(DEBUG, "rx_bytes:%lu", ns->eth.rx_bytes);
-   PMD_DRV_LOG(DEBUG, "rx_unicast:  %lu", ns->eth.rx_unicast);
-   PMD_DRV_LOG(DEBUG, "rx_multicast:%lu", ns->eth.rx_multicast);
-   PMD_DRV_LOG(DEBUG, "rx_broadcast:%lu", ns->eth.rx_broadcast);
-   PMD_DRV_LOG(DEBUG, "rx_discards: %lu", ns->eth.rx_discards);
-   PMD_DRV_LOG(DEBUG, "rx_unknown_protocol: %lu",
+   PMD_DRV_LOG(DEBUG, "rx_bytes:%"PRIu64"", ns->eth.rx_bytes);
+   PMD_DRV_LOG(DEBUG, "rx_unicast:  %"PRIu64"", 
ns->eth.rx_unicast);
+   PMD_DRV_LOG(DEBUG, "rx_multicast:%"PRIu64"", 
ns->eth.rx_multicast);
+   PMD_DRV_LOG(DEBUG, "rx_broadcast:%"PRIu64"", 
ns->eth.rx_broadcast);
+   PMD_DRV_LOG(DEBUG, "rx_discards: %"PRIu64"", 
ns->eth.rx_discards);
+   PMD_DRV_LOG(DEBUG, "rx_unknown_protocol: %"PRIu64"",
ns->eth.rx_unknown_protoc

[dpdk-dev] [PATCH 4/4] examples/mk: add dependencies for timer and vm_power_manager

2015-05-18 Thread Olivier Matz
Do not compile these examples if the related dpdk option is not
enabled, as is done for other examples. This allows building
the examples directory with a reduced dpdk configuration.

Signed-off-by: Olivier Matz 
---
 examples/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/Makefile b/examples/Makefile
index d549026..e659f6f 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -67,11 +67,11 @@ DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += qos_sched
 DIRS-y += quota_watermark
 DIRS-$(CONFIG_RTE_ETHDEV_RXTX_CALLBACKS) += rxtx_callbacks
 DIRS-y += skeleton
-DIRS-y += timer
+DIRS-$(CONFIG_RTE_LIBRTE_TIMER) += timer
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost
 DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen
 DIRS-y += vmdq
 DIRS-y += vmdq_dcb
-DIRS-y += vm_power_manager
+DIRS-$(CONFIG_RTE_LIBRTE_POWER) += vm_power_manager

 include $(RTE_SDK)/mk/rte.extsubdir.mk
-- 
2.1.4



[dpdk-dev] dpdk 2.0.0: Issue mapping mempool into guest using IVSHMEM

2015-05-18 Thread Mauricio Vásquez
Hi all,

I'm trying to map a mempool into a guest using the IVSHMEM library but the
mempool is not visible from the guest.

The code I'm running is quite simple, on the host I run a primary DPDK
process that creates the mempool, creates a metadata file and then adds the
mempool to it.

The code is:
...
int main(int argc, char * argv[])
{
int retval = 0;

/* Init EAL, parsing EAL args */
retval = rte_eal_init(argc, argv);
if (retval < 0)
return -1;

char cmdline[PATH_MAX] = {0};

struct rte_mempool *packets_pool;
//Create mempool
packets_pool = rte_mempool_create(
"packets",
NUM_PKTS,
MBUF_SIZE,
CACHE_SIZE, /* This is the size of the mempool cache */
sizeof(struct rte_pktmbuf_pool_private),
rte_pktmbuf_pool_init,
NULL,
rte_pktmbuf_init,
NULL,
rte_socket_id(),
0 /*NO_FLAGS*/);

if (packets_pool == NULL)
rte_exit(EXIT_FAILURE,"Cannot init the packets pool\n");

//Create metadata file
if (rte_ivshmem_metadata_create(metadata_name) < 0)
rte_exit(EXIT_FAILURE, "Cannot create metadata file\n");

//Add mempool to metadata file
if(rte_ivshmem_metadata_add_mempool(packets_pool, metadata_name) < 0)
rte_exit(EXIT_FAILURE, "Cannot add mempool metadata file\n");

//Get qemu command line
if (rte_ivshmem_metadata_cmdline_generate(cmdline, sizeof(cmdline),
metadata_name) < 0)
rte_exit(EXIT_FAILURE, "Failed generating command line for qemu\n");

RTE_LOG(INFO, APP, "Command line for qemu: %s\n", cmdline);
save_ivshmem_cmdline_to_file(cmdline);

//Avoids the application closes
char x = getchar();
(void) x;
return 0;
}

When I run it I can see clearly that the memzone is added:

EAL: Adding memzone 'MP_packets' at 0x7ffec0e8c1c0 to metadata vm_1
EAL: Adding memzone 'RG_MP_packets' at 0x7ffec0d8c140 to metadata vm_1
APP: Command line for qemu: -device
ivshmem,size=2048M,shm=fd:/dev/hugepages/rtemap_0:0x0:0x4000:/dev/zero:0x0:0x3fffc000:/var/run/.dpdk_ivshmem_metadata_vm_1:0x0:0x4000

I run the modified version of QEMU provided by dpdk-ovs using the command
line generated by the host application, then in the guest I run an even
simpler application:

...
void mempool_walk_f(const struct rte_mempool *r, void * arg)
{
RTE_LOG(INFO, APP, "Mempool: %s\n", r->name);
(void) arg;
}

int main(int argc, char *argv[])
{
int retval = 0;

if ((retval = rte_eal_init(argc, argv)) < 0)
return -1;

argc -= retval;
argv += retval;

struct rte_mempool * packets;

packets = rte_mempool_lookup("packets");

if(packets == NULL)
{
RTE_LOG(ERR, APP, "Failed to find mempool\n");
}

RTE_LOG(INFO, APP, "List of mempool: \n");
rte_mempool_walk(mempool_walk_f, NULL);

return 0;
}
...

I can see in the application output that the mem zones that were added are
found:

EAL: Found memzone: 'RG_MP_packets' at 0x7ffec0d8c140 (len 0x100080)
EAL: Found memzone: 'MP_packets' at 0x7ffec0e8c1c0 (len 0x3832100)

But the rte_mempool_lookup function returns NULL.
Using rte_mempool_walk, the program only prints a mempool called
log_history.

Do you have any suggestion?

Thank you very much for your help.


[dpdk-dev] [PATCH v4 1/5] vhost: eventfd_link: moving ioctl to a function

2015-05-18 Thread Pavel Boldin
On Mon, May 18, 2015 at 9:06 AM, Xie, Huawei  wrote:

> On 5/7/2015 9:17 PM, Pavel Boldin wrote:
>
>
> On Thu, May 7, 2015 at 10:57 AM, Xie, Huawei <huawei.xie at intel.com> wrote:
> On 4/3/2015 1:02 AM, Pavel Boldin wrote:
> > Move ioctl `EVENTFD_COPY' handler code to an inline function.
> Pavel:
> There is no necessity to inline this function.
> Xie, there is even no necessity to split this in a five piece patchseries.
> I did that solely for the purpose of clean reading.
>
> There is no necessity to inline any single-use function as long as the
> compiler is decent. But I prefer to instruct the compiler to do this explicitly
> so there is no call/ret path in the generated code.
>
> The purpose of inline or not is not for friendly reading. inline is for
> performance only.
>
Well, an optimizing compiler `inline's all the `static' functions that are
called only once in the file. So this `inline' is purely for readability
of the code. It makes the reader aware that the function will be `inline'd
anyway.

Pavel
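
The shape being debated can be sketched in plain userspace C. Every name below is invented for illustration; this is not the eventfd_link code:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command number and argument type, for illustration only. */
#define DEMO_CMD_COPY 1u

struct demo_copy_args {
	int source_fd;
	int target_fd;
};

/*
 * Handler body split out of the dispatch switch. It is static and has a
 * single caller, so an optimizing compiler will normally inline it with
 * or without the keyword; "inline" here mostly documents that intent.
 */
static inline long
demo_handle_copy(const struct demo_copy_args *args)
{
	if (args == NULL || args->source_fd < 0 || args->target_fd < 0)
		return -EINVAL;
	return 0; /* the real work would go here */
}

/* The dispatcher stays short because each command has its own helper. */
static long
demo_ioctl(unsigned int cmd, const struct demo_copy_args *args)
{
	switch (cmd) {
	case DEMO_CMD_COPY:
		return demo_handle_copy(args);
	default:
		return -ENOTTY;
	}
}
```

Whether the keyword changes the generated code depends on the compiler and optimization level. At -O0, for instance, the helper is typically emitted as a real call either way.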


>
>
>
> /huawei
>
>
>


[dpdk-dev] [PATCH 1/5] ixgbe: remove unnecessary casts

2015-05-18 Thread Bruce Richardson
On Fri, May 15, 2015 at 10:08:23AM -0700, Stephen Hemminger wrote:
> Don't do unnecessary casts when logging messages. Better to use
> the correct printf format code.
> 
> Signed-off-by: Stephen Hemminger 

+1 

Acked-by: Bruce Richardson 

> ---
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 25 -
>  1 file changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
> b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> index 5f9a1cf..a585151 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> @@ -568,8 +568,8 @@ ixgbe_dev_queue_stats_mapping_set(struct rte_eth_dev 
> *eth_dev,
>   (hw->mac.type != ixgbe_mac_X550EM_x))
>   return -ENOSYS;
>  
> - PMD_INIT_LOG(INFO, "Setting port %d, %s queue_id %d to stat index %d",
> -  (int)(eth_dev->data->port_id), is_rx ? "RX" : "TX",
> + PMD_INIT_LOG(INFO, "Setting port %u, %s queue_id %d to stat index %d",
> +  eth_dev->data->port_id, is_rx ? "RX" : "TX",
>queue_id, stat_idx);
>  
>   n = (uint8_t)(queue_id / NB_QMAP_FIELDS_PER_QSM_REG);
> @@ -594,8 +594,8 @@ ixgbe_dev_queue_stats_mapping_set(struct rte_eth_dev 
> *eth_dev,
>   else
>   stat_mappings->rqsmr[n] |= qsmr_mask;
>  
> - PMD_INIT_LOG(INFO, "Set port %d, %s queue_id %d to stat index %d",
> -  (int)(eth_dev->data->port_id), is_rx ? "RX" : "TX",
> + PMD_INIT_LOG(INFO, "Set port %u, %s queue_id %d to stat index %d",
> +  eth_dev->data->port_id, is_rx ? "RX" : "TX",
>queue_id, stat_idx);
>   PMD_INIT_LOG(INFO, "%s[%d] = 0x%08x", is_rx ? "RQSMR" : "TQSM", n,
>is_rx ? stat_mappings->rqsmr[n] : stat_mappings->tqsm[n]);
> @@ -889,11 +889,11 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev)
>  
>   if (ixgbe_is_sfp(hw) && hw->phy.sfp_type != ixgbe_sfp_type_not_present)
>   PMD_INIT_LOG(DEBUG, "MAC: %d, PHY: %d, SFP+: %d",
> -  (int) hw->mac.type, (int) hw->phy.type,
> -  (int) hw->phy.sfp_type);
> +  hw->mac.type, hw->phy.type,
> +  hw->phy.sfp_type);
>   else
>   PMD_INIT_LOG(DEBUG, "MAC: %d, PHY: %d",
> -  (int) hw->mac.type, (int) hw->phy.type);
> +  hw->mac.type, hw->phy.type);
>  
>   PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x",
>   eth_dev->data->port_id, pci_dev->id.vendor_id,
> @@ -2307,14 +2307,13 @@ ixgbe_dev_link_status_print(struct rte_eth_dev *dev)
>   memset(&link, 0, sizeof(link));
>   rte_ixgbe_dev_atomic_read_link_status(dev, &link);
>   if (link.link_status) {
> - PMD_INIT_LOG(INFO, "Port %d: Link Up - speed %u Mbps - %s",
> - (int)(dev->data->port_id),
> - (unsigned)link.link_speed,
> - link.link_duplex == ETH_LINK_FULL_DUPLEX ?
> + PMD_INIT_LOG(INFO, "Port %u: Link Up - speed %u Mbps - %s",
> +  dev->data->port_id, link.link_speed,
> +  link.link_duplex == ETH_LINK_FULL_DUPLEX ?
>   "full-duplex" : "half-duplex");
>   } else {
> - PMD_INIT_LOG(INFO, " Port %d: Link Down",
> - (int)(dev->data->port_id));
> + PMD_INIT_LOG(INFO, "Port %u: Link Down",
> + dev->data->port_id);
>   }
>   PMD_INIT_LOG(INFO, "PCI Address: %04d:%02d:%02d:%d",
>   dev->pci_dev->addr.domain,
> -- 
> 2.1.4
> 
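
A short sketch of why the casts can go. It assumes port_id is a uint8_t and link_speed a uint16_t, which is not shown in the hunks above, and uses snprintf in place of PMD_INIT_LOG so that it stands alone:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Small unsigned integer arguments are promoted to int when passed to a
 * variadic function, and their values are never negative, so "%u" prints
 * them correctly without an explicit cast at the call site.
 */
static int
demo_format_link_up(char *buf, size_t len, uint8_t port_id,
		    uint16_t link_speed)
{
	return snprintf(buf, len, "Port %u: Link Up - speed %u Mbps",
			port_id, link_speed);
}
```

The same reasoning does not extend to 64-bit fields, which need a matching length modifier or a macro from inttypes.h.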


[dpdk-dev] [PATCH 3/5] ixgbe: raise priority of significant events

2015-05-18 Thread Bruce Richardson
On Fri, May 15, 2015 at 10:08:25AM -0700, Stephen Hemminger wrote:
> The driver does lots of logging at INFO level, but some setup
> events are significant and should be at NOTICE or ERR level
> since they are problems that user should see.
> 
> Also never put tabs in log messages because they get mangled
> by syslog processing.
> 
> Signed-off-by: Stephen Hemminger 

Acked-by: Bruce Richardson 

One small nit below.

> ---
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
> b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> index fa335f4..7e75382 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> @@ -1054,8 +1054,8 @@ eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev)
>   eth_dev->data->mac_addrs = NULL;
>   return diag;
>   }
> - PMD_INIT_LOG(INFO, "\tVF MAC address not assigned by Host PF");
> - PMD_INIT_LOG(INFO, "\tAssign randomly generated MAC address "
> + PMD_INIT_LOG(NOTICE, "VF MAC address not assigned by Host PF");
> + PMD_INIT_LOG(NOTICE, "Assign randomly generated MAC address "
>"%02x:%02x:%02x:%02x:%02x:%02x",
>perm_addr->addr_bytes[0],
>perm_addr->addr_bytes[1],
> @@ -1248,7 +1248,7 @@ ixgbe_vlan_hw_strip_disable(struct rte_eth_dev *dev, 
> uint16_t queue)
>  
>   if (hw->mac.type == ixgbe_mac_82598EB) {
>   /* No queue level support */
> - PMD_INIT_LOG(INFO, "82598EB not support queue level hw strip");
> + PMD_INIT_LOG(ERR, "82598EB not support queue level hw strip");

Should we not fix the text here to be "82598EB does not support ..." while we
are making this change? (Same with the next chunk below too.)

>   return;
>   }
>   else {
> @@ -1272,7 +1272,7 @@ ixgbe_vlan_hw_strip_enable(struct rte_eth_dev *dev, 
> uint16_t queue)
>  
>   if (hw->mac.type == ixgbe_mac_82598EB) {
>   /* No queue level supported */
> - PMD_INIT_LOG(INFO, "82598EB not support queue level hw strip");
> + PMD_INIT_LOG(ERR, "82598EB not support queue level hw strip");
>   return;
>   }
>   else {
> @@ -2951,12 +2951,12 @@ ixgbevf_dev_configure(struct rte_eth_dev *dev)
>*/
>  #ifndef RTE_LIBRTE_IXGBE_PF_DISABLE_STRIP_CRC
>   if (!conf->rxmode.hw_strip_crc) {
> - PMD_INIT_LOG(INFO, "VF can't disable HW CRC Strip");
> + PMD_INIT_LOG(NOTICE, "VF can't disable HW CRC Strip");
>   conf->rxmode.hw_strip_crc = 1;
>   }
>  #else
>   if (conf->rxmode.hw_strip_crc) {
> - PMD_INIT_LOG(INFO, "VF can't enable HW CRC Strip");
> + PMD_INIT_LOG(NOTICE, "VF can't enable HW CRC Strip");
>   conf->rxmode.hw_strip_crc = 0;
>   }
>  #endif
> -- 
> 2.1.4
> 


[dpdk-dev] [PATCH 4/5] ixgbe: use RTE_LOG not rte_log

2015-05-18 Thread Bruce Richardson
On Fri, May 15, 2015 at 10:08:26AM -0700, Stephen Hemminger wrote:
> This driver should follow standard DPDK practice and use
> RTE_LOG macro which allows setting config option to remove
> the debug log messages.
> 
> Signed-off-by: Stephen Hemminger 

Acked-by: Bruce Richardson 

> ---
>  lib/librte_pmd_ixgbe/ixgbe_logs.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_logs.h 
> b/lib/librte_pmd_ixgbe/ixgbe_logs.h
> index 572e030..53ba42d 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_logs.h
> +++ b/lib/librte_pmd_ixgbe/ixgbe_logs.h
> @@ -35,8 +35,7 @@
>  #define _IXGBE_LOGS_H_
>  
>  #define PMD_INIT_LOG(level, fmt, args...) \
> - rte_log(RTE_LOG_ ## level, RTE_LOGTYPE_PMD, \
> - "PMD: %s(): " fmt "\n", __func__, ##args)
> + RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ##args)
>  
>  #ifdef RTE_LIBRTE_IXGBE_DEBUG_INIT
>  #define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
> -- 
> 2.1.4
> 
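
The benefit of going through a macro rather than calling the log function directly can be sketched as follows. This is an illustration of the general technique only. All DEMO_ names and level numbers are invented, and it is not the actual RTE_LOG definition:

```c
#include <stdio.h>

/* Hypothetical level numbers; a lower number is more severe. */
#define DEMO_LOG_ERR   4
#define DEMO_LOG_INFO  7
#define DEMO_LOG_DEBUG 8

#ifndef DEMO_LOG_LEVEL
#define DEMO_LOG_LEVEL DEMO_LOG_INFO /* would come from the build config */
#endif

static int demo_log_count; /* counts messages actually emitted */

/*
 * The level comparison is a compile-time constant, so the compiler can
 * drop the whole call for levels above DEMO_LOG_LEVEL.
 */
#define DEMO_LOG(level, ...) \
	do { \
		if (DEMO_LOG_ ## level <= DEMO_LOG_LEVEL) { \
			demo_log_count++; \
			fprintf(stderr, __VA_ARGS__); \
		} \
	} while (0)

/*
 * Per-driver wrapper that prefixes the function name. ##__VA_ARGS__ is a
 * GNU extension, in the same spirit as the ##args used in the patch.
 */
#define DEMO_INIT_LOG(level, fmt, ...) \
	DEMO_LOG(level, "%s(): " fmt "\n", __func__, ##__VA_ARGS__)
```

With the default ceiling above, a DEMO_INIT_LOG(DEBUG, ...) call compiles to nothing while ERR and INFO messages are still emitted. A direct function call would always pay for argument evaluation and the call itself.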


[dpdk-dev] [PATCH 5/5] ixgbe: silence noisy log messages

2015-05-18 Thread Bruce Richardson
On Fri, May 15, 2015 at 10:08:27AM -0700, Stephen Hemminger wrote:
> The ixgbe driver likes to be far too chatty in the system log
> which is good for the original developer but not good for a production
> product. All the normal messages should be changed from INFO to DEBUG.
> 
> Signed-off-by: Stephen Hemminger 

For the most part, this looks fine. However, I'm unsure about changing the log
level of the messages stating what the RX and TX burst functions in use are. I
would view this as important information that should generally be displayed as
the performance impacts of using a sub-optimal RX/TX code path are large.

/Bruce

> ---
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 14 +++---
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c   | 20 ++--
>  2 files changed, 17 insertions(+), 17 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
> b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> index 7e75382..bb24a17 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> @@ -568,7 +568,7 @@ ixgbe_dev_queue_stats_mapping_set(struct rte_eth_dev 
> *eth_dev,
>   (hw->mac.type != ixgbe_mac_X550EM_x))
>   return -ENOSYS;
>  
> - PMD_INIT_LOG(INFO, "Setting port %u, %s queue_id %d to stat index %d",
> + PMD_INIT_LOG(DEBUG, "Setting port %u, %s queue_id %d to stat index %d",
>eth_dev->data->port_id, is_rx ? "RX" : "TX",
>queue_id, stat_idx);
>  
> @@ -594,20 +594,20 @@ ixgbe_dev_queue_stats_mapping_set(struct rte_eth_dev 
> *eth_dev,
>   else
>   stat_mappings->rqsmr[n] |= qsmr_mask;
>  
> - PMD_INIT_LOG(INFO, "Set port %u, %s queue_id %d to stat index %d",
> + PMD_INIT_LOG(DEBUG, "Set port %u, %s queue_id %d to stat index %d",
>eth_dev->data->port_id, is_rx ? "RX" : "TX",
>queue_id, stat_idx);
> - PMD_INIT_LOG(INFO, "%s[%d] = 0x%08x", is_rx ? "RQSMR" : "TQSM", n,
> + PMD_INIT_LOG(DEBUG, "%s[%d] = 0x%08x", is_rx ? "RQSMR" : "TQSM", n,
>is_rx ? stat_mappings->rqsmr[n] : stat_mappings->tqsm[n]);
>  
>   /* Now write the mapping in the appropriate register */
>   if (is_rx) {
> - PMD_INIT_LOG(INFO, "Write 0x%x to RX IXGBE stat mapping reg:%d",
> + PMD_INIT_LOG(DEBUG, "Write 0x%x to RX IXGBE stat mapping 
> reg:%d",
>stat_mappings->rqsmr[n], n);
>   IXGBE_WRITE_REG(hw, IXGBE_RQSMR(n), stat_mappings->rqsmr[n]);
>   }
>   else {
> - PMD_INIT_LOG(INFO, "Write 0x%x to TX IXGBE stat mapping reg:%d",
> + PMD_INIT_LOG(DEBUG, "Write 0x%x to TX IXGBE stat mapping 
> reg:%d",
>stat_mappings->tqsm[n], n);
>   IXGBE_WRITE_REG(hw, IXGBE_TQSM(n), stat_mappings->tqsm[n]);
>   }
> @@ -751,7 +751,7 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev)
>   ixgbe_set_tx_function(eth_dev, txq);
>   } else {
>   /* Use default TX function if we get here */
> - PMD_INIT_LOG(INFO, "No TX queues configured yet. "
> + PMD_INIT_LOG(DEBUG, "No TX queues configured yet. "
>  "Using default TX function.");
>   }
>  
> @@ -2275,7 +2275,7 @@ ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev)
>  
>   /* read-on-clear nic registers here */
>   eicr = IXGBE_READ_REG(hw, IXGBE_EICR);
> - PMD_DRV_LOG(INFO, "eicr %x", eicr);
> + PMD_DRV_LOG(DEBUG, "eicr %x", eicr);
>  
>   intr->flags = 0;
>   if (eicr & IXGBE_EICR_LSC) {
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index 57c9430..08830bf 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> @@ -1871,23 +1871,23 @@ ixgbe_set_tx_function(struct rte_eth_dev *dev, struct 
> ixgbe_tx_queue *txq)
>   /* Use a simple Tx queue (no offloads, no multi segs) if possible */
>   if (((txq->txq_flags & IXGBE_SIMPLE_FLAGS) == IXGBE_SIMPLE_FLAGS)
>   && (txq->tx_rs_thresh >= RTE_PMD_IXGBE_TX_MAX_BURST)) {
> - PMD_INIT_LOG(INFO, "Using simple tx code path");
> + PMD_INIT_LOG(DEBUG, "Using simple tx code path");
>  #ifdef RTE_IXGBE_INC_VECTOR
>   if (txq->tx_rs_thresh <= RTE_IXGBE_TX_MAX_FREE_BUF_SZ &&
>   (rte_eal_process_type() != RTE_PROC_PRIMARY ||
>   ixgbe_txq_vec_setup(txq) == 0)) {
> - PMD_INIT_LOG(INFO, "Vector tx enabled.");
> + PMD_INIT_LOG(DEBUG, "Vector tx enabled.");
>   dev->tx_pkt_burst = ixgbe_xmit_pkts_vec;
>   } else
>  #endif
>   dev->tx_pkt_burst = ixgbe_xmit_pkts_simple;
>   } else {
> - PMD_INIT_LOG(INFO, "Using full-featured tx code path");
> - PMD_INIT_LO

[dpdk-dev] [PATCH 2/5] ixgbe: don't print PCI address on link change

2015-05-18 Thread Bruce Richardson
On Fri, May 15, 2015 at 10:08:24AM -0700, Stephen Hemminger wrote:
> Printing PCI information on link state change is unnecessary since
> the same information has already been displayed earlier in the log.
> 
> Signed-off-by: Stephen Hemminger 

Acked-by: Bruce Richardson 

> ---
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
> b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> index a585151..fa335f4 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> @@ -2315,11 +2315,6 @@ ixgbe_dev_link_status_print(struct rte_eth_dev *dev)
>   PMD_INIT_LOG(INFO, "Port %u: Link Down",
>   dev->data->port_id);
>   }
> - PMD_INIT_LOG(INFO, "PCI Address: %04d:%02d:%02d:%d",
> - dev->pci_dev->addr.domain,
> - dev->pci_dev->addr.bus,
> - dev->pci_dev->addr.devid,
> - dev->pci_dev->addr.function);
>  }
>  
>  /*
> -- 
> 2.1.4
> 


[dpdk-dev] No probed ethernet devices with shared library

2015-05-18 Thread Panu Matilainen
On 05/18/2015 12:55 AM, Stuart Andrews wrote:
> Hello,
>
> I've been trying to create an app which uses the DPDK shared library and
> therefore I have
>
> CONFIG_RTE_BUILD_SHARED_LIB=y
>
> However, when I try to run 'test-pmd' I get
>
> EAL: No probed ethernet devices
>
> This is strange because when I compile DPDK with
> CONFIG_RTE_BUILD_SHARED_LIB=n and run 'test-pmd' everything works fine.
>
> I'm using the IGB UIO module on a x86_64 Ubuntu OS running on a vm and I
> set everything up according to the documentation.
>
> Any help would be appreciated.

When building as a shared library, all the drivers are dynamically 
loadable plugins instead of the big pile o' everything you get when 
statically linking. For now, you need to manually load any drivers you 
need with the EAL -d option, e.g. if you use a virtio NIC in the VM you'd 
add this to testpmd: -d librte_pmd_virtio_uio.so

And yes, it's cumbersome. Doing something about it has been on my todo list 
for a while now, I've just been busy with other stuff.

- Panu -
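
As a rough illustration of the advice above, the sketch below assembles a testpmd command line with one -d option per driver plugin. Only the -d option and the library name come from the message; the helper name and its parameters are made up for this example, and any other EAL or application options would need to be filled in for a real setup.

```python
def build_testpmd_argv(drivers, eal_opts=(), app_opts=()):
    """Return an argv list that loads each driver plugin with the EAL -d option.

    Hypothetical helper: `drivers` is a list of PMD shared-library names,
    `eal_opts` are extra EAL options and `app_opts` are testpmd's own options,
    which go after the "--" separator.
    """
    argv = ["testpmd"]
    argv.extend(eal_opts)
    for lib in drivers:
        # One -d per plugin; nothing is probed unless its driver is loaded.
        argv.extend(["-d", lib])
    if app_opts:
        argv.append("--")
        argv.extend(app_opts)
    return argv


if __name__ == "__main__":
    print(" ".join(build_testpmd_argv(["librte_pmd_virtio_uio.so"])))
```

The resulting list can be handed to subprocess.call() or joined into a shell command line.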


[dpdk-dev] [PATCH v2 05/19] e1000: move e1000 pmd to drivers/net directory

2015-05-18 Thread Bruce Richardson
On Sat, May 16, 2015 at 02:11:14PM -0400, Thomas F Herbert wrote:
> On 5/15/15 11:56 AM, Bruce Richardson wrote:
> > Move e1000 pmd to drivers/net directory
> > As part of move, rename "e1000" subdirectory, which contains the code
> > from the "base driver", to "base".
> >
> > Signed-off-by: Bruce Richardson 
> Bruce,
> 
> Thanks!
> 
> I tried applying the series to master and everything was fine until I got to
> patch 5 which didn't apply. See below.
> 
> This patch for the e1000 seems extremely long. Is it trying to re-create all
> new files for the e1000  driver? It is 91000 lines long!!
> 
> --TFH

Yes, it's huge because the diff of moving a directory has chunks completely
deleting each line of each old file and other chunks adding each line of the
file in its new location. I'm not aware of any way to avoid this hugeness of
diff while doing renaming - which I think is why Thomas wanted all moves done in
one go.

As for the failure to apply - I'll double check things myself and see if I need
to resubmit any of the patches.

/Bruce

> 
> git apply 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_04_19__bond__Move_bonded_ethdev_pmd_to_drivers_net-20150515-1235382.txt
> [therbert at Fedora21 dpdk]$ git apply 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:322:
> trailing whitespace.
> 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:325:
> trailing whitespace.
> 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:329:
> trailing whitespace.
> 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:339:
> trailing whitespace.
> 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:28770:
> trailing whitespace.
>   Copyright (c) 2001-2014, Intel Corporation
> error: patch failed: lib/librte_pmd_e1000/e1000/e1000_phy.c:1
> error: lib/librte_pmd_e1000/e1000/e1000_phy.c: patch does not apply
> > ---
> >   drivers/net/Makefile   |2 +-
> >   drivers/net/e1000/Makefile |   99 +
> -- 
> Thomas F Herbert
> Principal Software Engineer
> Red Hat
> therbert at redhat.com


[dpdk-dev] [PATCH] virtio: Fix enqueue/dequeue can't handle chained vring descriptors.

2015-05-18 Thread Xie, Huawei
On 5/4/2015 2:27 PM, Ouyang Changchun wrote:
> Vring enqueue needs to consider two cases:
>  1. Vring descriptors chained together: the first one is for the virtio 
> header, the rest are for real data;
>  2. Only one descriptor: the virtio header and real data share one single 
> descriptor.
>
> The same applies to vring dequeue.
>
> Signed-off-by: Changchun Ouyang 
> ---
>  lib/librte_vhost/vhost_rxtx.c | 60 
> +++
>  1 file changed, 44 insertions(+), 16 deletions(-)
>
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 510ffe8..3135883 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -59,7 +59,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>   struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
>   uint64_t buff_addr = 0;
>   uint64_t buff_hdr_addr = 0;
> - uint32_t head[MAX_PKT_BURST], packet_len = 0;
> + uint32_t head[MAX_PKT_BURST];
>   uint32_t head_idx, packet_success = 0;
>   uint16_t avail_idx, res_cur_idx;
>   uint16_t res_base_idx, res_end_idx;
> @@ -113,6 +113,10 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>   rte_prefetch0(&vq->desc[head[packet_success]]);
>  
>   while (res_cur_idx != res_end_idx) {
> + uint32_t offset = 0;
> + uint32_t data_len, len_to_cpy;
> + uint8_t plus_hdr = 0;
> +
>   /* Get descriptor from available ring */
>   desc = &vq->desc[head[packet_success]];
>  
> @@ -125,7 +129,6 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  
>   /* Copy virtio_hdr to packet and increment buffer address */
>   buff_hdr_addr = buff_addr;
> - packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;
>  
>   /*
>* If the descriptors are chained the header and data are
> @@ -136,24 +139,44 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>   desc = &vq->desc[desc->next];
>   /* Buffer address translation. */
>   buff_addr = gpa_to_vva(dev, desc->addr);
> - desc->len = rte_pktmbuf_data_len(buff);
>   } else {
>   buff_addr += vq->vhost_hlen;
> - desc->len = packet_len;
> + plus_hdr = 1;
>   }
>  
> + data_len = rte_pktmbuf_data_len(buff);
> + len_to_cpy = RTE_MIN(data_len, desc->len);
> + do {
> + if (len_to_cpy > 0) {
> + /* Copy mbuf data to buffer */
> + rte_memcpy((void *)(uintptr_t)buff_addr,
> + (const void *)(rte_pktmbuf_mtod(buff, 
> const char *) + offset),
> + len_to_cpy);
> + PRINT_PACKET(dev, (uintptr_t)buff_addr,
> + len_to_cpy, 0);
> +
> + desc->len = len_to_cpy + (plus_hdr ? 
> vq->vhost_hlen : 0);

Do we really need to rewrite desc->len again and again? At most, only the
desc->len of the last descriptor could possibly need to change.

> + offset += len_to_cpy;
> + if (desc->flags & VRING_DESC_F_NEXT) {
> + desc = &vq->desc[desc->next];
> + buff_addr = gpa_to_vva(dev, desc->addr);
> + len_to_cpy = RTE_MIN(data_len - offset, 
> desc->len);
> + } else
> + break;

There are still two issues here.
a) If the data can't be fully copied into the chain of guest buffers, we
shouldn't do any copy.
b) Scattered (multi-segment) mbufs aren't considered.
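
To make the two points above concrete, here is a small Python model (not DPDK code) of an "all or nothing" copy: it first adds up the data capacity of the descriptor chain and only then plans the copy, walking every segment of the packet. The Desc class and the data_descriptors and plan_copy helpers are invented for this sketch; only the VRING_DESC_F_NEXT flag and the header handling (separate header descriptor when chained, shared descriptor otherwise) follow the quoted patch.

```python
from dataclasses import dataclass

VRING_DESC_F_NEXT = 1  # descriptor continues via its `next` field


@dataclass
class Desc:
    length: int      # size of the guest buffer (desc->len)
    flags: int = 0
    next: int = 0


def data_descriptors(table, head, hdr_len):
    """Return [(index, capacity)] for the descriptors that can hold packet data."""
    d = table[head]
    if not d.flags & VRING_DESC_F_NEXT:
        # Single descriptor: header and data share it.
        return [(head, max(d.length - hdr_len, 0))]
    # Chained: the first descriptor holds the header, the rest hold data.
    out = []
    idx = d.next
    for _ in range(len(table)):  # bounded walk, guards against a looping chain
        d = table[idx]
        out.append((idx, d.length))
        if not d.flags & VRING_DESC_F_NEXT:
            break
        idx = d.next
    return out


def plan_copy(segments, table, head, hdr_len):
    """Return [(index, bytes)] to write, or None if the packet does not fit."""
    payload = b"".join(segments)        # (b) take every segment into account
    descs = data_descriptors(table, head, hdr_len)
    if sum(cap for _, cap in descs) < len(payload):
        return None                     # (a) refuse before copying anything
    plan, offset = [], 0
    for idx, cap in descs:
        if offset == len(payload):
            break
        n = min(cap, len(payload) - offset)
        plan.append((idx, payload[offset:offset + n]))
        offset += n
    return plan
```

With a 12-byte header descriptor followed by two 4-byte data descriptors, a 5-byte packet split over two segments is planned as 4 + 1 bytes, while a 9-byte packet is rejected up front rather than partially copied.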

> + } else {
> + desc->len = 0;
> + if (desc->flags & VRING_DESC_F_NEXT)
> +desc = &vq->desc[desc->next];
> + else
> + break;
> + }
> + } while (1);
> +
>   /* Update used ring with desc information */
>   vq->used->ring[res_cur_idx & (vq->size - 1)].id =
>   head[packet_success];
> - vq->used->ring[res_cur_idx & (vq->size - 1)].len = packet_len;
> -
> - /* Copy mbuf data to buffer */
> - /* FIXME for sg mbuf and the case that desc couldn't hold the 
> mbuf data */
> - rte_memcpy((void *)(uintptr_t)buff_addr,
> - rte_pktmbuf_mtod(buff, const void *),
> - rte_pktmbuf_data_len(buff));
> - PRINT_PACKET(dev, (uintptr_t)buff_addr,
> - rte_pktmbuf_data_len(buff), 0);
> +

[dpdk-dev] How do you setup a VM in Promiscuous Mode using PCI Pass-Through (SR-IOV)?

2015-05-18 Thread Qiu, Michael
Hi, Sami

Would you mind supplying the syslog, especially the IOMMU-related parts?

You could also update QEMU or the kernel to see if this issue still exists.


Thanks,
Michael

On 5/16/2015 3:31 AM, Assaad, Sami (Sami) wrote:
> On Fri, May 15, 2015 at 12:54:19PM +, Assaad, Sami (Sami) wrote:
>> Thanks Bruce for your reply.
>>
>> Yes, your idea of bringing the PF into the VM looks like an option. However, 
>> how do you configure the physical interfaces within the VM supporting SRIOV?
>> I always believed that the VM needed to be associated with a 
>> virtual/emulated interface card. With your suggestion, I would actually 
>> configure the physical interface card/non-emulated within the VM.
>>
>> If you could provide me some example configuration commands, it would be 
>> really appreciated. 
>>
> You'd pass in the PF in the same way as the VF, just skip all the steps 
> creating the VF on the host. To the system and hypervisor, both are just PCI 
> devices!
>
> As for configuration, the setup and configuration of the PF in the guest is 
> exactly the same as on the host - it's the same hardware with the same PCI 
> bars.
> It's the IOMMU on your platform that takes care of memory isolation and 
> address translation and that should work with either PF or VF.
>
> Regards,
> /Bruce
>
>> Thanks in advance.
>>
>> Best Regards,
>> Sami.
>>
>> -Original Message-
>> From: Bruce Richardson [mailto:bruce.richardson at intel.com]
>> Sent: Friday, May 15, 2015 5:27 AM
>> To: Stephen Hemminger
>> Cc: Assaad, Sami (Sami); dev at dpdk.org
>> Subject: Re: [dpdk-dev] How do you setup a VM in Promiscuous Mode using PCI 
>> Pass-Through (SR-IOV)?
>>
>> On Thu, May 14, 2015 at 04:47:19PM -0700, Stephen Hemminger wrote:
>>> On Thu, 14 May 2015 21:38:24 +
>>> "Assaad, Sami (Sami)"  wrote:
>>>
>>>> Hello,
>>>>
>>>> My Hardware consists of the following:
>>>>   - DL380 Gen 9 Server supporting two Haswell Processors (Xeon CPU E5-2680 
>>>> v3 @ 2.50GHz)
>>>>   - An x540 Ethernet Controller Card supporting 2x10G ports.
>>>>
>>>> Software:
>>>>   - CentOS 7 (3.10.0-229.1.2.el7.x86_64)
>>>>   - DPDK 1.8
>>>>
>>>> I want all the network traffic received on the two 10G ports to be 
>>>> transmitted to my VM. The issue is that the Virtual Function / Physical 
>>>> Functions have setup the internal virtual switch to only route Ethernet 
>>>> packets with destination MAC address matching the VM virtual interface 
>>>> MAC. How can I configure my virtual environment to provide all network 
>>>> traffic to the VM...i.e. set the virtual functions for both PCI devices in 
>>>> Promiscuous mode?
>>>>
>>>> [ If a l2fwd-vf example exists, this would actually solve this 
>>>> problem ... Is there a DPDK l2fwd-vf example available? ]
>>>>
>>>>
>>>> Thanks in advance.
>>>>
>>>> Best Regards,
>>>> Sami Assaad.
>>> This is a host side (not DPDK) issue.
>>>
>>> Intel PF driver will not allow guest (VF) to go into promiscuous 
>>> mode since it would allow traffic stealing which is a security violation.
>> Could you maybe try passing the PF directly into the VM, rather than a VF 
>> based off it? Since you seem to want all traffic to go to the one VM, there 
>> seems little point in creating a VF on the device, and should let the VM 
>> control the whole NIC directly.
>>
>> Regards,
>> /Bruce
>
> Hi Bruce, 
>
> I was provided two options:
> 1. Pass the PF directly into the VM
> 2. Use ixgbe VF mirroring
>
> I decided to first try your proposal of passing the PF directly into the VM. 
> However, I ran into some issues. 
> But prior to providing the problem details, the following is my  server 
> environment:
> I'm using CentOS 7 KVM/QEMU
> [root at ni-nfvhost01 qemu]# uname -a
> Linux ni-nfvhost01 3.10.0-229.1.2.el7.x86_64 #1 SMP Fri Mar 27 03:04:26 UTC 
> 2015 x86_64 x86_64 x86_64 GNU/Linux
>
> [root at ni-nfvhost01 qemu]# lspci -n -s 04:00.0
> 04:00.0 0200: 8086:1528 (rev 01)
>
> [root at ni-nfvhost01 qemu]# lspci | grep -i eth
> 02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit 
> Ethernet PCIe (rev 01)
> 02:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit 
> Ethernet PCIe (rev 01)
> 02:00.2 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit 
> Ethernet PCIe (rev 01)
> 02:00.3 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit 
> Ethernet PCIe (rev 01)
> 04:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit 
> X540-AT2 (rev 01)
> 04:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit 
> X540-AT2 (rev 01)
>
> - The following is my grub execution:
> [root at ni-nfvhost01 qemu]# cat  /proc/cmdline 
> BOOT_IMAGE=/vmlinuz-3.10.0-229.1.2.el7.x86_64 root=/dev/mapper/centos-root ro 
> rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun17 rd.lvm.lv=centos/root 
> crashkernel=auto vconsole.keymap=us rhgb quiet iommu=pt intel_iommu=on 
> hugepages=8192
>
>
> This is the error I'm obtaining when the VM has one of the P

[dpdk-dev] [PATCH 0/2] doc: refactored fig and table nums into references

2015-05-18 Thread Mcnamara, John


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, May 13, 2015 8:08 PM
> To: Mcnamara, John
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/2] doc: refactored fig and table nums
> into references
> 
> 
> Unfortunately it gives this error:
>   _pickle.PicklingError: Can't pickle : attribute
> lookup XNumRefNode on builtins failed
> 

Hi Thomas,

I ran into similar pickle.py issues trying to subclass the node.reference 
classes within Sphinx.

I was able to work around it without a subclass. I'll submit a rebased V2 of the 
Figure/Table numbering patchset with updates to conf.py. You can review the 
workaround there.

John.


[dpdk-dev] [PATCH v2 05/19] e1000: move e1000 pmd to drivers/net directory

2015-05-18 Thread Bruce Richardson
On Sat, May 16, 2015 at 02:11:14PM -0400, Thomas F Herbert wrote:
> On 5/15/15 11:56 AM, Bruce Richardson wrote:
> > Move e1000 pmd to drivers/net directory
> > As part of move, rename "e1000" subdirectory, which contains the code
> > from the "base driver", to "base".
> >
> > Signed-off-by: Bruce Richardson 
> Bruce,
> 
> Thanks!
> 
> I tried applying the series to master and everything was fine until I got to
> patch 5 which didn't apply. See below.
> 
> This patch for the e1000 seems extremely long. Is it trying to re-create all
> new files for the e1000  driver? It is 91000 lines long!!
> 
> --TFH

The e1000 patch seems to apply ok to latest head in my testing. However, the
base driver code for i40e has been applied which prevents patch 8 from applying.

/Bruce

> 
> git apply 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_04_19__bond__Move_bonded_ethdev_pmd_to_drivers_net-20150515-1235382.txt
> [therbert at Fedora21 dpdk]$ git apply 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:322:
> trailing whitespace.
> 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:325:
> trailing whitespace.
> 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:329:
> trailing whitespace.
> 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:339:
> trailing whitespace.
> 
> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:28770:
> trailing whitespace.
>   Copyright (c) 2001-2014, Intel Corporation
> error: patch failed: lib/librte_pmd_e1000/e1000/e1000_phy.c:1
> error: lib/librte_pmd_e1000/e1000/e1000_phy.c: patch does not apply
> > ---
> >   drivers/net/Makefile   |2 +-
> >   drivers/net/e1000/Makefile |   99 +
> -- 
> Thomas F Herbert
> Principal Software Engineer
> Red Hat
> therbert at redhat.com


[dpdk-dev] [PATCH v2 0/3] doc: refactored fig and table nums into references

2015-05-18 Thread John McNamara
This patchset adds automatic figure and table references to the docs. The
figure and table numbers in the generated HTML and PDF docs can now be
automatically numbered by the build system.

It replaces all hardcoded figure/table numbers and references.

The numfig/numref feature requires Sphinx >= 1.3.1. For backward compatibility
with older versions, workaround handling is added to the Sphinx conf.py file in
patch 3/3.

The workaround replaces the :numref: reference with a "Figure" or "Table" link
to the target (for all Sphinx doc types). It doesn't number the figures or
tables. This produces reasonable documentation links for users with older
versions of sphinx while allowing automatic numbering support for newer
versions.

Tested with Sphinx 1.2.3 and 1.3.1.


John McNamara (3):
  doc: refactored figure numbers into references
  doc: refactored table numbers into references
  doc: add sphinx numref compatibility workaround

 doc/guides/conf.py |   82 ++
 doc/guides/nics/index.rst  |   18 +-
 doc/guides/nics/intel_vf.rst   |   37 +-
 doc/guides/nics/virtio.rst |   18 +-
 doc/guides/nics/vmxnet3.rst|   18 +-
 doc/guides/prog_guide/env_abstraction_layer.rst|8 +-
 doc/guides/prog_guide/index.rst|  162 +-
 doc/guides/prog_guide/ivshmem_lib.rst  |8 +-
 doc/guides/prog_guide/kernel_nic_interface.rst |   40 +-
 .../prog_guide/link_bonding_poll_mode_drv_lib.rst  |   43 +-
 doc/guides/prog_guide/lpm6_lib.rst |8 +-
 doc/guides/prog_guide/lpm_lib.rst  |8 +-
 doc/guides/prog_guide/malloc_lib.rst   |9 +-
 doc/guides/prog_guide/mbuf_lib.rst |   20 +-
 doc/guides/prog_guide/mempool_lib.rst  |   32 +-
 doc/guides/prog_guide/multi_proc_support.rst   |9 +-
 doc/guides/prog_guide/overview.rst |9 +-
 doc/guides/prog_guide/packet_distrib_lib.rst   |   15 +-
 doc/guides/prog_guide/packet_framework.rst | 1275 
 doc/guides/prog_guide/qos_framework.rst| 1543 ++--
 doc/guides/prog_guide/ring_lib.rst |  159 +-
 doc/guides/sample_app_ug/dist_app.rst  |   20 +-
 doc/guides/sample_app_ug/exception_path.rst|8 +-
 doc/guides/sample_app_ug/index.rst |   64 +-
 doc/guides/sample_app_ug/intel_quickassist.rst |   11 +-
 doc/guides/sample_app_ug/kernel_nic_interface.rst  |9 +-
 doc/guides/sample_app_ug/l2_forward_job_stats.rst  |   23 +-
 .../sample_app_ug/l2_forward_real_virtual.rst  |   22 +-
 .../sample_app_ug/l3_forward_access_ctrl.rst   |   21 +-
 doc/guides/sample_app_ug/load_balancer.rst |9 +-
 doc/guides/sample_app_ug/multi_process.rst |   36 +-
 doc/guides/sample_app_ug/qos_metering.rst  |   46 +-
 doc/guides/sample_app_ug/qos_scheduler.rst |   55 +-
 doc/guides/sample_app_ug/quota_watermark.rst   |   36 +-
 doc/guides/sample_app_ug/test_pipeline.rst |  313 ++--
 doc/guides/sample_app_ug/vhost.rst |   45 +-
 doc/guides/sample_app_ug/vm_power_management.rst   |   18 +-
 doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst   |   11 +-
 doc/guides/xen/pkt_switch.rst  |   30 +-
 39 files changed, 2141 insertions(+), 2157 deletions(-)

--
1.8.1.4


[dpdk-dev] [PATCH v2 3/3] doc: add sphinx numref compatibility workaround

2015-05-18 Thread John McNamara
From: John McNamara 

This change adds some simple handling for the :numref: directive
for Sphinx versions prior to 1.3.1. This allows the Guides
documentation to be built with older versions of Sphinx and still
produce reasonable results.

The patch replaces the :numref: reference with a link to the
target (for all Sphinx doc types). It doesn't try to label
figures/tables.

Full numref support with automatic figure/table numbering and
links can be obtained by upgrading to Sphinx 1.3.1 or later.

Signed-off-by: John McNamara 
---
 doc/guides/conf.py | 80 ++
 1 file changed, 80 insertions(+)

diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index 1bc031f..cab97ac 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -31,6 +31,9 @@
 import subprocess
 from sphinx.highlighting import PygmentsBridge
 from pygments.formatters.latex import LatexFormatter
+from docutils import nodes
+from distutils.version import LooseVersion
+from sphinx import __version__ as sphinx_version

 project = 'DPDK'

@@ -72,6 +75,7 @@ latex_elements = {
 'preamble': latex_preamble
 }

+
 # Override the default Latex formatter in order to modify the
 # code/verbatim blocks.
 class CustomLatexFormatter(LatexFormatter):
@@ -82,3 +86,79 @@ class CustomLatexFormatter(LatexFormatter):

 # Replace the default latex formatter.
 PygmentsBridge.latex_formatter = CustomLatexFormatter
+
+
+# The following hook functions add some simple handling for the :numref:
+# directive for Sphinx versions prior to 1.3.1. The functions replace the
+# :numref: reference with a link to the target (for all Sphinx doc types). It
+# doesn't try to label figures/tables.
+
+def numref_role(reftype, rawtext, text, lineno, inliner):
+    """
+    Add a Sphinx role to handle numref references. Note, we can't convert the
+    link here because the doctree isn't built and the target information isn't
+    available.
+
+    """
+
+    # Add an identifier to distinguish numref from other references.
+    newnode = nodes.reference('',
+                              '',
+                              refuri='_local_numref_#%s' % text,
+                              internal=True)
+
+    return [newnode], []
+
+
+def process_numref(app, doctree, from_docname):
+    """
+    Process the numref nodes once the doctree has been built and prior to
+    writing the files. The processing involves replacing the numref with a
+    link plus text to indicate if it is a Figure or Table link.
+
+    """
+    env = app.builder.env
+
+    # Iterate over the reference nodes in the doctree.
+    for node in doctree.traverse(nodes.reference):
+        target = node.get('refuri', '')
+
+        # Look for numref nodes.
+        if target.startswith('_local_numref_#'):
+            target = target.replace('_local_numref_#', '')
+
+            # Get the target label and link information from the Sphinx env.
+            data = env.domains['std'].data
+            docname, label, _ = data['labels'].get(target, ('', '', ''))
+            relative_url = app.builder.get_relative_uri(from_docname, docname)
+
+            # Add a text label to the link.
+            if target.startswith('figure'):
+                caption = 'Figure'
+            elif target.startswith('table'):
+                caption = 'Table'
+            else:
+                caption = 'Link'
+
+            # Create a new reference node with the updated link information.
+            newnode = nodes.reference('',
+                                      caption,
+                                      refuri='%s#%s' % (relative_url, label),
+                                      internal=True)
+
+            # Replace the node.
+            node.replace_self(newnode)
+
+
+def setup(app):
+
+    if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
+
+        print('[dpdk docs] Upgrade sphinx to version >= 1.3.1 for '
+              'improved Figure/Table number handling.')
+
+        # Add a role to handle :numref: references.
+        app.add_role('numref', numref_role)
+
+        # Process the numref references once the doctree has been created.
+        app.connect('doctree-resolved', process_numref)
-- 
1.8.1.4
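
The version gate in setup() above relies on distutils' LooseVersion, which was removed from the standard library in Python 3.12. As a sketch only, and not part of the patch, the same check could be done with plain tuples; the helper names version_tuple and needs_numref_workaround are invented for this example.

```python
import re


def version_tuple(version):
    """Turn '1.3.1' into (1, 3, 1); stop at the first non-numeric component."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def needs_numref_workaround(sphinx_version, minimum='1.3.1'):
    """True when the running Sphinx is older than the first numref-capable one."""
    return version_tuple(sphinx_version) < version_tuple(minimum)
```

Comparing tuples of integers avoids the string-comparison trap where '1.10.0' would sort before '1.3.1'.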



[dpdk-dev] [PATCH 1/3] doc: refactored figure numbers into references

2015-05-18 Thread John McNamara
This change adds automatic figure references to the docs. The
figure numbers in the generated HTML and PDF docs are now
automatically numbered based on section.

Requires Sphinx >= 1.3.1.

The patch makes the following changes.

* Changes image:: tag to figure:: and moves image caption
  to the figure.

* Adds captions to figures that didn't previously have any.

* Un-templates the |image-name| substitution definitions
  into explicit figure:: tags. They weren't used more
  than once anyway and Sphinx doesn't support them
  for figure.

* Adds a target to each image that didn't previously
  have one so that they can be cross-referenced.

* Renamed existing image target to match the image
  name for consistency.

* Replaces the Figures lists with automatic :numref:
  :ref: entries to generate automatic numbering
  and captions.

* Replaces "Figure" references with automatic :numref:
  references.

Signed-off-by: John McNamara 
---
 doc/guides/conf.py |   2 +
 doc/guides/nics/index.rst  |  18 ++-
 doc/guides/nics/intel_vf.rst   |  37 ++---
 doc/guides/nics/virtio.rst |  18 ++-
 doc/guides/nics/vmxnet3.rst|  18 ++-
 doc/guides/prog_guide/env_abstraction_layer.rst|   8 +-
 doc/guides/prog_guide/index.rst|  92 +++-
 doc/guides/prog_guide/ivshmem_lib.rst  |   8 +-
 doc/guides/prog_guide/kernel_nic_interface.rst |  40 ++---
 .../prog_guide/link_bonding_poll_mode_drv_lib.rst  |  43 --
 doc/guides/prog_guide/lpm6_lib.rst |   8 +-
 doc/guides/prog_guide/lpm_lib.rst  |   8 +-
 doc/guides/prog_guide/malloc_lib.rst   |   9 +-
 doc/guides/prog_guide/mbuf_lib.rst |  20 +--
 doc/guides/prog_guide/mempool_lib.rst  |  32 ++--
 doc/guides/prog_guide/multi_proc_support.rst   |   9 +-
 doc/guides/prog_guide/overview.rst |   9 +-
 doc/guides/prog_guide/packet_distrib_lib.rst   |  15 +-
 doc/guides/prog_guide/packet_framework.rst |  81 +-
 doc/guides/prog_guide/qos_framework.rst| 163 +++--
 doc/guides/prog_guide/ring_lib.rst | 159 +++-
 doc/guides/sample_app_ug/dist_app.rst  |  20 ++-
 doc/guides/sample_app_ug/exception_path.rst|   8 +-
 doc/guides/sample_app_ug/index.rst |  58 
 doc/guides/sample_app_ug/intel_quickassist.rst |  11 +-
 doc/guides/sample_app_ug/kernel_nic_interface.rst  |   9 +-
 doc/guides/sample_app_ug/l2_forward_job_stats.rst  |  23 +--
 .../sample_app_ug/l2_forward_real_virtual.rst  |  22 +--
 .../sample_app_ug/l3_forward_access_ctrl.rst   |  21 ++-
 doc/guides/sample_app_ug/load_balancer.rst |   9 +-
 doc/guides/sample_app_ug/multi_process.rst |  36 ++---
 doc/guides/sample_app_ug/qos_scheduler.rst |   9 +-
 doc/guides/sample_app_ug/quota_watermark.rst   |  36 ++---
 doc/guides/sample_app_ug/test_pipeline.rst |   9 +-
 doc/guides/sample_app_ug/vhost.rst |  45 ++
 doc/guides/sample_app_ug/vm_power_management.rst   |  18 +--
 doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst   |  11 +-
 doc/guides/xen/pkt_switch.rst  |  30 ++--
 38 files changed, 539 insertions(+), 633 deletions(-)

diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index b1ef323..1bc031f 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -41,6 +41,8 @@ release = version

 master_doc = 'index'

+numfig = True
+
 latex_documents = [
 ('index',
  'doc.tex',
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index aadbae3..1ee67fa 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -50,14 +50,20 @@ Network Interface Controller Drivers

 **Figures**

-:ref:`Figure 1. Virtualization for a Single Port NIC in SR-IOV Mode 
`
+:numref:`figure_single_port_nic` :ref:`figure_single_port_nic`

-:ref:`Figure 2. SR-IOV Performance Benchmark Setup `
+:numref:`figure_perf_benchmark` :ref:`figure_perf_benchmark`

-:ref:`Figure 3. Fast Host-based Packet Processing `
+:numref:`figure_fast_pkt_proc` :ref:`figure_fast_pkt_proc`

-:ref:`Figure 4. SR-IOV Inter-VM Communication `
+:numref:`figure_inter_vm_comms` :ref:`figure_inter_vm_comms`

-:ref:`Figure 5. Virtio Host2VM Communication Example Using KNI vhost Back End 
`
+:numref:`figure_host_vm_comms` :ref:`figure_host_vm_comms`

-:ref:`Figure 6. Virtio Host2VM Communication Example Using Qemu vhost Back End 
`
+:numref:`figure_host_vm_comms_qemu` :ref:`figure_host_vm_comms_qemu`
+
+:numref:`figure_vmxnet3_int` :ref:`figure_vmxnet3_int`
+
+:numref:`figure_vswitch_vm` :ref:`figure_vswitch_vm`
+
+:numref:`figure_vm_vm_comms` :ref:`figure_vm_vm_comms`
diff --git a/doc/guides/nics/intel_vf.rst b/doc/guides/nics/intel_vf.rst
index eeca973..db86c64 100644
--- a/doc/guides/nics/intel_vf.rst
+++ b/

[dpdk-dev] [PATCH v3 0/3] doc: refactored fig and table nums into references

2015-05-18 Thread John McNamara
This patchset adds automatic figure and table references to the docs. The
figure and table numbers in the generated HTML and PDF docs can now be
automatically numbered.

It replaces all hardcoded figure/table numbers and references.

The numfig/numref feature requires Sphinx >= 1.3.1. For backward compatibility
with older versions, workaround handling is added to the Sphinx conf.py file in
patch 3/3.

The workaround replaces the :numref: reference with a "Figure" or "Table" link
to the target (for all Sphinx doc types). It doesn't number the figures or
tables. This produces reasonable documentation links for users with older
versions of sphinx while allowing automatic numbering support for newer
versions.

Tested with Sphinx 1.2.3 and 1.3.1.


John McNamara (3):
  doc: refactored figure numbers into references
  doc: refactored table numbers into references
  doc: add sphinx numref compatibility workaround

 doc/guides/conf.py |   82 ++
 doc/guides/nics/index.rst  |   18 +-
 doc/guides/nics/intel_vf.rst   |   37 +-
 doc/guides/nics/virtio.rst |   18 +-
 doc/guides/nics/vmxnet3.rst|   18 +-
 doc/guides/prog_guide/env_abstraction_layer.rst|8 +-
 doc/guides/prog_guide/index.rst|  162 +-
 doc/guides/prog_guide/ivshmem_lib.rst  |8 +-
 doc/guides/prog_guide/kernel_nic_interface.rst |   40 +-
 .../prog_guide/link_bonding_poll_mode_drv_lib.rst  |   43 +-
 doc/guides/prog_guide/lpm6_lib.rst |8 +-
 doc/guides/prog_guide/lpm_lib.rst  |8 +-
 doc/guides/prog_guide/malloc_lib.rst   |9 +-
 doc/guides/prog_guide/mbuf_lib.rst |   20 +-
 doc/guides/prog_guide/mempool_lib.rst  |   32 +-
 doc/guides/prog_guide/multi_proc_support.rst   |9 +-
 doc/guides/prog_guide/overview.rst |9 +-
 doc/guides/prog_guide/packet_distrib_lib.rst   |   15 +-
 doc/guides/prog_guide/packet_framework.rst | 1275 
 doc/guides/prog_guide/qos_framework.rst| 1543 ++--
 doc/guides/prog_guide/ring_lib.rst |  159 +-
 doc/guides/sample_app_ug/dist_app.rst  |   20 +-
 doc/guides/sample_app_ug/exception_path.rst|8 +-
 doc/guides/sample_app_ug/index.rst |   64 +-
 doc/guides/sample_app_ug/intel_quickassist.rst |   11 +-
 doc/guides/sample_app_ug/kernel_nic_interface.rst  |9 +-
 doc/guides/sample_app_ug/l2_forward_job_stats.rst  |   23 +-
 .../sample_app_ug/l2_forward_real_virtual.rst  |   22 +-
 .../sample_app_ug/l3_forward_access_ctrl.rst   |   21 +-
 doc/guides/sample_app_ug/load_balancer.rst |9 +-
 doc/guides/sample_app_ug/multi_process.rst |   36 +-
 doc/guides/sample_app_ug/qos_metering.rst  |   46 +-
 doc/guides/sample_app_ug/qos_scheduler.rst |   55 +-
 doc/guides/sample_app_ug/quota_watermark.rst   |   36 +-
 doc/guides/sample_app_ug/test_pipeline.rst |  313 ++--
 doc/guides/sample_app_ug/vhost.rst |   45 +-
 doc/guides/sample_app_ug/vm_power_management.rst   |   18 +-
 doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst   |   11 +-
 doc/guides/xen/pkt_switch.rst  |   30 +-
 39 files changed, 2141 insertions(+), 2157 deletions(-)

--
1.8.1.4


[dpdk-dev] [PATCH v3 1/3] doc: refactored figure numbers into references

2015-05-18 Thread John McNamara
This change adds automatic figure references to the docs. The
figure numbers in the generated HTML and PDF docs are now
automatically numbered based on section.

Requires Sphinx >= 1.3.1.

The patch makes the following changes.

* Changes image:: tag to figure:: and moves image caption
  to the figure.

* Adds captions to figures that didn't previously have any.

* Un-templates the |image-name| substitution definitions
  into explicit figure:: tags. They weren't used more
  than once anyway and Sphinx doesn't support them
  for figure.

* Adds a target to each image that didn't previously
  have one so that they can be cross-referenced.

* Renamed existing image target to match the image
  name for consistency.

* Replaces the Figures lists with automatic :numref:
  :ref: entries to generate automatic numbering
  and captions.

* Replaces "Figure" references with automatic :numref:
  references.

Signed-off-by: John McNamara 
---
 doc/guides/conf.py |   2 +
 doc/guides/nics/index.rst  |  18 ++-
 doc/guides/nics/intel_vf.rst   |  37 ++---
 doc/guides/nics/virtio.rst |  18 ++-
 doc/guides/nics/vmxnet3.rst|  18 ++-
 doc/guides/prog_guide/env_abstraction_layer.rst|   8 +-
 doc/guides/prog_guide/index.rst|  92 +++-
 doc/guides/prog_guide/ivshmem_lib.rst  |   8 +-
 doc/guides/prog_guide/kernel_nic_interface.rst |  40 ++---
 .../prog_guide/link_bonding_poll_mode_drv_lib.rst  |  43 --
 doc/guides/prog_guide/lpm6_lib.rst |   8 +-
 doc/guides/prog_guide/lpm_lib.rst  |   8 +-
 doc/guides/prog_guide/malloc_lib.rst   |   9 +-
 doc/guides/prog_guide/mbuf_lib.rst |  20 +--
 doc/guides/prog_guide/mempool_lib.rst  |  32 ++--
 doc/guides/prog_guide/multi_proc_support.rst   |   9 +-
 doc/guides/prog_guide/overview.rst |   9 +-
 doc/guides/prog_guide/packet_distrib_lib.rst   |  15 +-
 doc/guides/prog_guide/packet_framework.rst |  81 +-
 doc/guides/prog_guide/qos_framework.rst| 163 +++--
 doc/guides/prog_guide/ring_lib.rst | 159 +++-
 doc/guides/sample_app_ug/dist_app.rst  |  20 ++-
 doc/guides/sample_app_ug/exception_path.rst|   8 +-
 doc/guides/sample_app_ug/index.rst |  58 
 doc/guides/sample_app_ug/intel_quickassist.rst |  11 +-
 doc/guides/sample_app_ug/kernel_nic_interface.rst  |   9 +-
 doc/guides/sample_app_ug/l2_forward_job_stats.rst  |  23 +--
 .../sample_app_ug/l2_forward_real_virtual.rst  |  22 +--
 .../sample_app_ug/l3_forward_access_ctrl.rst   |  21 ++-
 doc/guides/sample_app_ug/load_balancer.rst |   9 +-
 doc/guides/sample_app_ug/multi_process.rst |  36 ++---
 doc/guides/sample_app_ug/qos_scheduler.rst |   9 +-
 doc/guides/sample_app_ug/quota_watermark.rst   |  36 ++---
 doc/guides/sample_app_ug/test_pipeline.rst |   9 +-
 doc/guides/sample_app_ug/vhost.rst |  45 ++
 doc/guides/sample_app_ug/vm_power_management.rst   |  18 +--
 doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst   |  11 +-
 doc/guides/xen/pkt_switch.rst  |  30 ++--
 38 files changed, 539 insertions(+), 633 deletions(-)

diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index b1ef323..1bc031f 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -41,6 +41,8 @@ release = version

 master_doc = 'index'

+numfig = True
+
 latex_documents = [
 ('index',
  'doc.tex',
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index aadbae3..1ee67fa 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -50,14 +50,20 @@ Network Interface Controller Drivers

 **Figures**

-:ref:`Figure 1. Virtualization for a Single Port NIC in SR-IOV Mode 
`
+:numref:`figure_single_port_nic` :ref:`figure_single_port_nic`

-:ref:`Figure 2. SR-IOV Performance Benchmark Setup `
+:numref:`figure_perf_benchmark` :ref:`figure_perf_benchmark`

-:ref:`Figure 3. Fast Host-based Packet Processing `
+:numref:`figure_fast_pkt_proc` :ref:`figure_fast_pkt_proc`

-:ref:`Figure 4. SR-IOV Inter-VM Communication `
+:numref:`figure_inter_vm_comms` :ref:`figure_inter_vm_comms`

-:ref:`Figure 5. Virtio Host2VM Communication Example Using KNI vhost Back End 
`
+:numref:`figure_host_vm_comms` :ref:`figure_host_vm_comms`

-:ref:`Figure 6. Virtio Host2VM Communication Example Using Qemu vhost Back End 
`
+:numref:`figure_host_vm_comms_qemu` :ref:`figure_host_vm_comms_qemu`
+
+:numref:`figure_vmxnet3_int` :ref:`figure_vmxnet3_int`
+
+:numref:`figure_vswitch_vm` :ref:`figure_vswitch_vm`
+
+:numref:`figure_vm_vm_comms` :ref:`figure_vm_vm_comms`
diff --git a/doc/guides/nics/intel_vf.rst b/doc/guides/nics/intel_vf.rst
index eeca973..db86c64 100644
--- a/doc/guides/nics/intel_vf.rst
+++ b/

[dpdk-dev] [PATCH v3 3/3] doc: add sphinx numref compatibility workaround

2015-05-18 Thread John McNamara
This change adds some simple handling for the :numref: directive
for Sphinx versions prior to 1.3.1. This allows the Guides
documentation to be built with older versions of Sphinx and still
produce reasonable results.

The patch replaces the :numref: reference with a link to the
target (for all Sphinx doc types). It doesn't try to label
figures/tables.

Full numref support with automatic figure/table numbering and
links can be obtained by upgrading to Sphinx 1.3.1 or later.

Signed-off-by: John McNamara 
---
 doc/guides/conf.py | 80 ++
 1 file changed, 80 insertions(+)

diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index 1bc031f..cab97ac 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -31,6 +31,9 @@
 import subprocess
 from sphinx.highlighting import PygmentsBridge
 from pygments.formatters.latex import LatexFormatter
+from docutils import nodes
+from distutils.version import LooseVersion
+from sphinx import __version__ as sphinx_version

 project = 'DPDK'

@@ -72,6 +75,7 @@ latex_elements = {
 'preamble': latex_preamble
 }

+
 # Override the default Latex formatter in order to modify the
 # code/verbatim blocks.
 class CustomLatexFormatter(LatexFormatter):
@@ -82,3 +86,79 @@ class CustomLatexFormatter(LatexFormatter):

 # Replace the default latex formatter.
 PygmentsBridge.latex_formatter = CustomLatexFormatter
+
+
+# The following hook functions add some simple handling for the :numref:
+# directive for Sphinx versions prior to 1.3.1. The functions replace the
+# :numref: reference with a link to the target (for all Sphinx doc types). It
+# doesn't try to label figures/tables.
+
+def numref_role(reftype, rawtext, text, lineno, inliner):
+"""
+Add a Sphinx role to handle numref references. Note, we can't convert the
+link here because the doctree isn't built and the target information isn't
+available.
+
+"""
+
+# Add an identifier to distinguish numref from other references.
+newnode = nodes.reference('',
+  '',
+  refuri='_local_numref_#%s' % text,
+  internal=True)
+
+return [newnode], []
+
+
+def process_numref(app, doctree, from_docname):
+"""
+Process the numref nodes once the doctree has been built and prior to
+writing the files. The processing involves replacing the numref with a
+link plus text to indicate if it is a Figure or Table link.
+
+"""
+env = app.builder.env
+
+# Iterate over the reference nodes in the doctree.
+for node in doctree.traverse(nodes.reference):
+target = node.get('refuri', '')
+
+# Look for numref nodes.
+if target.startswith('_local_numref_#'):
+target = target.replace('_local_numref_#', '')
+
+# Get the target label and link information from the Sphinx env.
+data = env.domains['std'].data
+docname, label, _ = data['labels'].get(target, ('', '', ''))
+relative_url = app.builder.get_relative_uri(from_docname, docname)
+
+# Add a text label to the link.
+if target.startswith('figure'):
+caption = 'Figure'
+elif target.startswith('table'):
+caption = 'Table'
+else:
+caption = 'Link'
+
+# Create a new reference node with the updated link information.
+newnode = nodes.reference('',
+  caption,
+  refuri='%s#%s' % (relative_url, label),
+  internal=True)
+
+# Replace the node.
+node.replace_self(newnode)
+
+
+def setup(app):
+
+if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
+
+print('[dpdk docs] Upgrade sphinx to version >= 1.3.1 for '
+  'improved Figure/Table number handling.')
+
+# Add a role to handle :numref: references.
+app.add_role('numref', numref_role)
+
+# Process the numref references once the doctree has been created.
+app.connect('doctree-resolved', process_numref)
-- 
1.8.1.4



[dpdk-dev] [PATCH v2] l3fwd: make destination mac address configurable

2015-05-18 Thread Andrey Chilikin
Add a command-line parameter to l3fwd that lets the user specify the
destination MAC address for each Ethernet port used.

v2 changes:
- apply command-line parameter to fast path as well (val_eth)

Signed-off-by: Andrey Chilikin 
Signed-off-by: Bruce Richardson 
---
 examples/l3fwd/main.c |   99 -
 1 files changed, 65 insertions(+), 34 deletions(-)

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index e32512e..be1ef95 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -72,6 +72,9 @@
 #include 
 #include 

+#include 
+#include 
+
 #define APP_LOOKUP_EXACT_MATCH  0
 #define APP_LOOKUP_LPM  1
 #define DO_RFC_1812_CHECKS
@@ -159,6 +162,7 @@ static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
 static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;

 /* ethernet addresses of ports */
+static uint64_t dest_eth_addr[RTE_MAX_ETHPORTS];
 static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];

 static __m128i val_eth[RTE_MAX_ETHPORTS];
@@ -738,7 +742,6 @@ simple_ipv4_fwd_4pkts(struct rte_mbuf* m[4], uint8_t 
portid, struct lcore_conf *
 {
struct ether_hdr *eth_hdr[4];
struct ipv4_hdr *ipv4_hdr[4];
-   void *d_addr_bytes[4];
uint8_t dst_port[4];
int32_t ret[4];
union ipv4_5tuple_host key[4];
@@ -823,16 +826,6 @@ simple_ipv4_fwd_4pkts(struct rte_mbuf* m[4], uint8_t 
portid, struct lcore_conf *
if (dst_port[3] >= RTE_MAX_ETHPORTS || (enabled_port_mask & 1 << 
dst_port[3]) == 0)
dst_port[3] = portid;

-   /* 02:00:00:00:00:xx */
-   d_addr_bytes[0] = &eth_hdr[0]->d_addr.addr_bytes[0];
-   d_addr_bytes[1] = &eth_hdr[1]->d_addr.addr_bytes[0];
-   d_addr_bytes[2] = &eth_hdr[2]->d_addr.addr_bytes[0];
-   d_addr_bytes[3] = &eth_hdr[3]->d_addr.addr_bytes[0];
-   *((uint64_t *)d_addr_bytes[0]) = 0x0002 + 
((uint64_t)dst_port[0] << 40);
-   *((uint64_t *)d_addr_bytes[1]) = 0x0002 + 
((uint64_t)dst_port[1] << 40);
-   *((uint64_t *)d_addr_bytes[2]) = 0x0002 + 
((uint64_t)dst_port[2] << 40);
-   *((uint64_t *)d_addr_bytes[3]) = 0x0002 + 
((uint64_t)dst_port[3] << 40);
-
 #ifdef DO_RFC_1812_CHECKS
/* Update time to live and header checksum */
--(ipv4_hdr[0]->time_to_live);
@@ -845,6 +838,12 @@ simple_ipv4_fwd_4pkts(struct rte_mbuf* m[4], uint8_t 
portid, struct lcore_conf *
++(ipv4_hdr[3]->hdr_checksum);
 #endif

+   /* dst addr */
+   *(uint64_t *)&eth_hdr[0]->d_addr = dest_eth_addr[dst_port[0]];
+   *(uint64_t *)&eth_hdr[1]->d_addr = dest_eth_addr[dst_port[1]];
+   *(uint64_t *)&eth_hdr[2]->d_addr = dest_eth_addr[dst_port[2]];
+   *(uint64_t *)&eth_hdr[3]->d_addr = dest_eth_addr[dst_port[3]];
+
/* src addr */
ether_addr_copy(&ports_eth_addr[dst_port[0]], &eth_hdr[0]->s_addr);
ether_addr_copy(&ports_eth_addr[dst_port[1]], &eth_hdr[1]->s_addr);
@@ -880,7 +879,6 @@ simple_ipv6_fwd_4pkts(struct rte_mbuf* m[4], uint8_t 
portid, struct lcore_conf *
 {
struct ether_hdr *eth_hdr[4];
__attribute__((unused)) struct ipv6_hdr *ipv6_hdr[4];
-   void *d_addr_bytes[4];
uint8_t dst_port[4];
int32_t ret[4];
union ipv6_5tuple_host key[4];
@@ -921,15 +919,11 @@ simple_ipv6_fwd_4pkts(struct rte_mbuf* m[4], uint8_t 
portid, struct lcore_conf *
if (dst_port[3] >= RTE_MAX_ETHPORTS || (enabled_port_mask & 1 << 
dst_port[3]) == 0)
dst_port[3] = portid;

-   /* 02:00:00:00:00:xx */
-   d_addr_bytes[0] = &eth_hdr[0]->d_addr.addr_bytes[0];
-   d_addr_bytes[1] = &eth_hdr[1]->d_addr.addr_bytes[0];
-   d_addr_bytes[2] = &eth_hdr[2]->d_addr.addr_bytes[0];
-   d_addr_bytes[3] = &eth_hdr[3]->d_addr.addr_bytes[0];
-   *((uint64_t *)d_addr_bytes[0]) = 0x0002 + 
((uint64_t)dst_port[0] << 40);
-   *((uint64_t *)d_addr_bytes[1]) = 0x0002 + 
((uint64_t)dst_port[1] << 40);
-   *((uint64_t *)d_addr_bytes[2]) = 0x0002 + 
((uint64_t)dst_port[2] << 40);
-   *((uint64_t *)d_addr_bytes[3]) = 0x0002 + 
((uint64_t)dst_port[3] << 40);
+   /* dst addr */
+   *(uint64_t *)&eth_hdr[0]->d_addr = dest_eth_addr[dst_port[0]];
+   *(uint64_t *)&eth_hdr[1]->d_addr = dest_eth_addr[dst_port[1]];
+   *(uint64_t *)&eth_hdr[2]->d_addr = dest_eth_addr[dst_port[2]];
+   *(uint64_t *)&eth_hdr[3]->d_addr = dest_eth_addr[dst_port[3]];

/* src addr */
ether_addr_copy(&ports_eth_addr[dst_port[0]], &eth_hdr[0]->s_addr);
@@ -950,7 +944,6 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, 
struct lcore_conf *qcon
 {
struct ether_hdr *eth_hdr;
struct ipv4_hdr *ipv4_hdr;
-   void *d_addr_bytes;
uint8_t dst_port;

eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
@@ -974,16 +967,13 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, 
struct lcore_conf *qcon
(enabled_port_mask & 1 << dst_port) == 0)
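
For readers following the destination-address change in the patch above, here is a minimal sketch of the idea: a per-port destination MAC is packed once into a uint64_t and later copied over the destination field of the Ethernet header. The helper names (sketch_pack_mac, sketch_default_mac, sketch_write_dst) are hypothetical and are not part of l3fwd. Unlike the patch, which stores a full 64-bit word and then rewrites the source address, this sketch copies only six bytes so that it stays independent of byte order and does not touch the neighbouring field.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAC_LEN 6

/* Pack a 6-byte MAC into the low-address bytes of a 64-bit word. */
static uint64_t
sketch_pack_mac(const uint8_t mac[SKETCH_MAC_LEN])
{
	uint64_t packed = 0;

	memcpy(&packed, mac, SKETCH_MAC_LEN);
	return packed;
}

/* Default address in the style of the removed code: 02:00:00:00:00:xx,
 * where xx is the destination port id. */
static uint64_t
sketch_default_mac(uint8_t port)
{
	const uint8_t mac[SKETCH_MAC_LEN] = { 0x02, 0x00, 0x00, 0x00, 0x00, port };

	return sketch_pack_mac(mac);
}

/* Write the packed address at the start of an Ethernet header, where the
 * destination address lives. Only six bytes are written. */
static void
sketch_write_dst(uint8_t *eth_hdr, uint64_t packed)
{
	memcpy(eth_hdr, &packed, SKETCH_MAC_LEN);
}
```

Packing the address once at start-up keeps the per-packet work to a single fixed-size copy, which appears to be the motivation for the uint64_t array in the patch.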
 

[dpdk-dev] [PATCH] mempool: limit cache_size

2015-05-18 Thread Zoltan Kiss
Hi,

Any opinion on this patch?

Regards,

Zoltan

On 13/05/15 19:59, Zoltan Kiss wrote:
> Otherwise cache_flushthresh can be bigger than n, and
> a consumer can starve others by keeping every element
> either in use or in the cache.
>
> Signed-off-by: Zoltan Kiss 
> ---
>   lib/librte_mempool/rte_mempool.c | 3 ++-
>   lib/librte_mempool/rte_mempool.h | 2 +-
>   2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_mempool/rte_mempool.c 
> b/lib/librte_mempool/rte_mempool.c
> index cf7ed76..ca6cd9c 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -440,7 +440,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
> unsigned elt_size,
>   mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
>
>   /* asked cache too big */
> - if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> + if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
> + (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n) {
>   rte_errno = EINVAL;
>   return NULL;
>   }
> diff --git a/lib/librte_mempool/rte_mempool.h 
> b/lib/librte_mempool/rte_mempool.h
> index 9001312..a4a9610 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -468,7 +468,7 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, 
> void *);
>*   If cache_size is non-zero, the rte_mempool library will try to
>*   limit the accesses to the common lockless pool, by maintaining a
>*   per-lcore object cache. This argument must be lower or equal to
> - *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
> + *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to choose
>*   cache_size to have "n modulo cache_size == 0": if this is
>*   not the case, some elements will always stay in the pool and will
>*   never be used. The access to the per-lcore table is of course
>


[dpdk-dev] [PATCH v2] pipeline: add statistics for librte_pipeline ports and tables

2015-05-18 Thread Thomas Monjalon
2015-05-05 15:11, Dumitrescu, Cristian:
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
> > From: Pawel Wodkowski 
> > 
> > This patch adds statistics collection for librte_pipeline.
> > Those statistics are disabled by default at build time.
> > 
> > Signed-off-by: Pawel Wodkowski 
> > ---
> >  config/common_bsdapp   |1 +
> >  config/common_linuxapp |1 +
[...]
> >  # Compile librte_pipeline
> >  #
> >  CONFIG_RTE_LIBRTE_PIPELINE=y
> > +CONFIG_RTE_PIPELINE_STATS_COLLECT=n
[...]
> 
> Acked by: Cristian Dumitrescu 

Nack because of the new config option.
The same problem appears in every series related to the packet framework.


[dpdk-dev] [PATCH v2 02/10] table: added acl table stats

2015-05-18 Thread Thomas Monjalon
2015-04-30 08:55, Stephen Hemminger:
> > From: Maciej Gajdzica 
> > 
> > Added statistics for ACL table.
> > 
> > Signed-off-by: Maciej Gajdzica 
> > ---
> >  config/common_bsdapp |1 +
> >  config/common_linuxapp   |1 +
[...]
> >  # Compile librte_table
> >  #
> >  CONFIG_RTE_LIBRTE_TABLE=y
> > +CONFIG_RTE_TABLE_ACL_STATS_COLLECT=n
> 
> Sigh. More config options does not make DPDK better.
> It makes it more unsupportable for distros

+1
It is the same comment as for port statistics.
Please stop trying to add new config options.



[dpdk-dev] [PATCH v2 02/13] port: added port_ethdev_reader stats

2015-05-18 Thread Thomas Monjalon
2015-04-30 14:07, Michal Jastrzebski:
> From: Maciej Gajdzica 
> 
> Added statistics for ethdev reader port.
> 
> Signed-off-by: Maciej Gajdzica 
> ---
>  config/common_bsdapp  |1 +
>  config/common_linuxapp|1 +
[...]
>  # Compile librte_port
>  #
>  CONFIG_RTE_LIBRTE_PORT=y
> +CONFIG_RTE_PORT_ETHDEV_READER_STATS_COLLECT=n

No, consider that adding anything to these files is forbidden.
We must remove compile-time options, not add new ones.

CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT and CONFIG_RTE_SCHED_COLLECT_STATS should
be removed.


[dpdk-dev] [PATCH v2 01/13] port: added structures for port stats

2015-05-18 Thread Thomas Monjalon
2015-04-30 14:07, Michal Jastrzebski:
> From: Maciej Gajdzica 
[...]
>  struct rte_port_out_ops {
> - rte_port_out_op_create f_create;   /**< Create */
> - rte_port_out_op_free f_free;   /**< Free */
> - rte_port_out_op_tx f_tx;   /**< Packet TX (single packet) */
> - rte_port_out_op_tx_bulk f_tx_bulk; /**< Packet TX (packet burst) */
> - rte_port_out_op_flush f_flush; /**< Flush */
> + rte_port_out_op_create f_create;/**< Create */
> + rte_port_out_op_free f_free;/**< Free */
> + rte_port_out_op_tx f_tx;/**< Packet TX 
> (single packet) */
> + rte_port_out_op_tx_bulk f_tx_bulk;  /**< Packet TX (packet 
> burst) */
> + rte_port_out_op_flush f_flush;  /**< Flush */
> + rte_port_out_op_stats_read f_stats; /**< Stats */
>  };

Please avoid changing alignment if not really necessary.
Here it seems you want to have some space between "f_stats;" and its comment.
So you should use spaces for alignment of the comments and not hard-tabs,
even less hard-tabs used as 4-char like here.



[dpdk-dev] [PATCH] mempool: limit cache_size

2015-05-18 Thread Bruce Richardson
On Mon, May 18, 2015 at 01:27:45PM +0100, Zoltan Kiss wrote:
> Hi,
> 
> Any opinion on this patch?
> 
> Regards,
> 
> Zoltan
> 
> On 13/05/15 19:59, Zoltan Kiss wrote:
> >Otherwise cache_flushthresh can be bigger than n, and
> >a consumer can starve others by keeping every element
> >either in use or in the cache.
> >
> >Signed-off-by: Zoltan Kiss 

Seems reasonable enough to me.

Acked-by: Bruce Richardson 

> >---
> >  lib/librte_mempool/rte_mempool.c | 3 ++-
> >  lib/librte_mempool/rte_mempool.h | 2 +-
> >  2 files changed, 3 insertions(+), 2 deletions(-)
> >
> >diff --git a/lib/librte_mempool/rte_mempool.c 
> >b/lib/librte_mempool/rte_mempool.c
> >index cf7ed76..ca6cd9c 100644
> >--- a/lib/librte_mempool/rte_mempool.c
> >+++ b/lib/librte_mempool/rte_mempool.c
> >@@ -440,7 +440,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
> >unsigned elt_size,
> > mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
> >
> > /* asked cache too big */
> >-if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> >+if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
> >+(uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n) {
> > rte_errno = EINVAL;
> > return NULL;
> > }
> >diff --git a/lib/librte_mempool/rte_mempool.h 
> >b/lib/librte_mempool/rte_mempool.h
> >index 9001312..a4a9610 100644
> >--- a/lib/librte_mempool/rte_mempool.h
> >+++ b/lib/librte_mempool/rte_mempool.h
> >@@ -468,7 +468,7 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, 
> >void *);
> >   *   If cache_size is non-zero, the rte_mempool library will try to
> >   *   limit the accesses to the common lockless pool, by maintaining a
> >   *   per-lcore object cache. This argument must be lower or equal to
> >- *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
> >+ *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to choose
> >   *   cache_size to have "n modulo cache_size == 0": if this is
> >   *   not the case, some elements will always stay in the pool and will
> >   *   never be used. The access to the per-lcore table is of course
> >


[dpdk-dev] [PATCH v3 08/19] i40e: move i40e PMD to drivers/net directory

2015-05-18 Thread Bruce Richardson
Move i40e PMD to drivers/net directory.
As part of the move, rename the "i40e" directory, containing the "base
driver" code, from "i40e" to "base".

Signed-off-by: Bruce Richardson 
---

V3: Rebased post base-code update

---
 drivers/net/Makefile |2 +-
 drivers/net/i40e/Makefile|  106 +
 drivers/net/i40e/base/i40e_adminq.c  | 1074 +
 drivers/net/i40e/base/i40e_adminq.h  |  164 +
 drivers/net/i40e/base/i40e_adminq_cmd.h  | 2328 +++
 drivers/net/i40e/base/i40e_alloc.h   |   65 +
 drivers/net/i40e/base/i40e_common.c  | 5110 +++
 drivers/net/i40e/base/i40e_dcb.c |  734 
 drivers/net/i40e/base/i40e_dcb.h |  181 +
 drivers/net/i40e/base/i40e_diag.c|  178 +
 drivers/net/i40e/base/i40e_diag.h|   61 +
 drivers/net/i40e/base/i40e_hmc.c |  373 ++
 drivers/net/i40e/base/i40e_hmc.h |  243 ++
 drivers/net/i40e/base/i40e_lan_hmc.c | 1412 +++
 drivers/net/i40e/base/i40e_lan_hmc.h |  200 +
 drivers/net/i40e/base/i40e_nvm.c | 1231 ++
 drivers/net/i40e/base/i40e_osdep.h   |  197 +
 drivers/net/i40e/base/i40e_prototype.h   |  451 ++
 drivers/net/i40e/base/i40e_register.h| 3377 +++
 drivers/net/i40e/base/i40e_status.h  |  107 +
 drivers/net/i40e/base/i40e_type.h| 1470 +++
 drivers/net/i40e/base/i40e_virtchnl.h|  372 ++
 drivers/net/i40e/i40e_ethdev.c   | 5699 ++
 drivers/net/i40e/i40e_ethdev.h   |  567 +++
 drivers/net/i40e/i40e_ethdev_vf.c| 1893 +
 drivers/net/i40e/i40e_fdir.c | 1361 ++
 drivers/net/i40e/i40e_logs.h |   78 +
 drivers/net/i40e/i40e_pf.c   | 1063 +
 drivers/net/i40e/i40e_pf.h   |  127 +
 drivers/net/i40e/i40e_rxtx.c | 2709 
 drivers/net/i40e/i40e_rxtx.h |  211 +
 drivers/net/i40e/rte_pmd_i40e_version.map|4 +
 lib/Makefile |1 -
 lib/librte_pmd_i40e/Makefile |  106 -
 lib/librte_pmd_i40e/i40e/i40e_adminq.c   | 1074 -
 lib/librte_pmd_i40e/i40e/i40e_adminq.h   |  164 -
 lib/librte_pmd_i40e/i40e/i40e_adminq_cmd.h   | 2328 ---
 lib/librte_pmd_i40e/i40e/i40e_alloc.h|   65 -
 lib/librte_pmd_i40e/i40e/i40e_common.c   | 5110 ---
 lib/librte_pmd_i40e/i40e/i40e_dcb.c  |  734 
 lib/librte_pmd_i40e/i40e/i40e_dcb.h  |  181 -
 lib/librte_pmd_i40e/i40e/i40e_diag.c |  178 -
 lib/librte_pmd_i40e/i40e/i40e_diag.h |   61 -
 lib/librte_pmd_i40e/i40e/i40e_hmc.c  |  373 --
 lib/librte_pmd_i40e/i40e/i40e_hmc.h  |  243 --
 lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c  | 1412 ---
 lib/librte_pmd_i40e/i40e/i40e_lan_hmc.h  |  200 -
 lib/librte_pmd_i40e/i40e/i40e_nvm.c  | 1231 --
 lib/librte_pmd_i40e/i40e/i40e_osdep.h|  197 -
 lib/librte_pmd_i40e/i40e/i40e_prototype.h|  451 --
 lib/librte_pmd_i40e/i40e/i40e_register.h | 3377 ---
 lib/librte_pmd_i40e/i40e/i40e_status.h   |  107 -
 lib/librte_pmd_i40e/i40e/i40e_type.h | 1470 ---
 lib/librte_pmd_i40e/i40e/i40e_virtchnl.h |  372 --
 lib/librte_pmd_i40e/i40e_ethdev.c| 5699 --
 lib/librte_pmd_i40e/i40e_ethdev.h|  567 ---
 lib/librte_pmd_i40e/i40e_ethdev_vf.c | 1893 -
 lib/librte_pmd_i40e/i40e_fdir.c  | 1361 --
 lib/librte_pmd_i40e/i40e_logs.h  |   78 -
 lib/librte_pmd_i40e/i40e_pf.c| 1063 -
 lib/librte_pmd_i40e/i40e_pf.h|  127 -
 lib/librte_pmd_i40e/i40e_rxtx.c  | 2709 
 lib/librte_pmd_i40e/i40e_rxtx.h  |  211 -
 lib/librte_pmd_i40e/rte_pmd_i40e_version.map |4 -
 64 files changed, 33147 insertions(+), 33148 deletions(-)
 create mode 100644 drivers/net/i40e/Makefile
 create mode 100644 drivers/net/i40e/base/i40e_adminq.c
 create mode 100644 drivers/net/i40e/base/i40e_adminq.h
 create mode 100644 drivers/net/i40e/base/i40e_adminq_cmd.h
 create mode 100644 drivers/net/i40e/base/i40e_alloc.h
 create mode 100644 drivers/net/i40e/base/i40e_common.c
 create mode 100644 drivers/net/i40e/base/i40e_dcb.c
 create mode 100644 drivers/net/i40e/base/i40e_dcb.h
 create mode 100644 drivers/net/i40e/base/i40e_diag.c
 create mode 100644 drivers/net/i40e/base/i40e_diag.h
 create mode 100644 drivers/net/i40e/base/i40e_hmc.c
 create mode 100644 drivers/net/i40e/base/i40e_hmc.h
 create mode 100644 drivers/net/i40e/base/i40e_lan_hmc.c
 create mode 100644 drivers/net/i40e/base/i40e_lan_hmc.h
 create mode 100644 drivers/net/i40e/base/i40e_nvm.c
 create mode 100644 drivers/net/i40e/base/i40e_osdep.h
 create mode 100644 drivers/net/i40e/base/i40e_prototype.h
 crea

[dpdk-dev] [PATCH v3 2/3] doc: refactored table numbers into references

2015-05-18 Thread John McNamara
This change adds automatic table references to the docs. The
table numbers in the generated HTML and PDF docs are now
automatically numbered based on section.

Requires Sphinx >= 1.3.1.

This change:

* Adds a RST table:: directive to each table caption.

* Indents the tables to the required directive level.

Signed-off-by: John McNamara 
---
 doc/guides/prog_guide/index.rst|   70 +-
 doc/guides/prog_guide/packet_framework.rst | 1164 +++
 doc/guides/prog_guide/qos_framework.rst| 1372 ++--
 doc/guides/sample_app_ug/index.rst |6 +-
 doc/guides/sample_app_ug/qos_metering.rst  |   46 +-
 doc/guides/sample_app_ug/qos_scheduler.rst |   46 +-
 doc/guides/sample_app_ug/test_pipeline.rst |  304 +++---
 7 files changed, 1503 insertions(+), 1505 deletions(-)

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 9a1e337..3295661 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -174,72 +174,70 @@ Programmer's Guide

 **Tables**

-:ref:`Table 1. Packet Processing Pipeline Implementing QoS `
+:numref:`table_qos_1` :ref:`table_qos_1`

-:ref:`Table 2. Infrastructure Blocks Used by the Packet Processing Pipeline 
`
+:numref:`table_qos_2` :ref:`table_qos_2`

-:ref:`Table 3. Port Scheduling Hierarchy `
+:numref:`table_qos_3` :ref:`table_qos_3`

-:ref:`Table 4. Scheduler Internal Data Structures per Port `
+:numref:`table_qos_4` :ref:`table_qos_4`

-:ref:`Table 5. Ethernet Frame Overhead Fields `
+:numref:`table_qos_5` :ref:`table_qos_5`

-:ref:`Table 6. Token Bucket Generic Operations `
+:numref:`table_qos_6` :ref:`table_qos_6`

-:ref:`Table 7. Token Bucket Generic Parameters `
+:numref:`table_qos_7` :ref:`table_qos_7`

-:ref:`Table 8. Token Bucket Persistent Data Structure `
+:numref:`table_qos_8` :ref:`table_qos_8`

-:ref:`Table 9. Token Bucket Operations `
+:numref:`table_qos_9` :ref:`table_qos_9`

-:ref:`Table 10. Subport/Pipe Traffic Class Upper Limit Enforcement Persistent 
Data Structure `
+:numref:`table_qos_10` :ref:`table_qos_10`

-:ref:`Table 11. Subport/Pipe Traffic Class Upper Limit Enforcement Operations 
`
+:numref:`table_qos_11` :ref:`table_qos_11`

-:ref:`Table 12. Weighted Round Robin (WRR) `
+:numref:`table_qos_12` :ref:`table_qos_12`

-:ref:`Table 13. Subport Traffic Class Oversubscription `
+:numref:`table_qos_13` :ref:`table_qos_13`

-:ref:`Table 14. Watermark Propagation from Subport Level to Member Pipes at 
the Beginning of Each Traffic Class Upper Limit Enforcement Period 
`
+:numref:`table_qos_14` :ref:`table_qos_14`

-:ref:`Table 15. Watermark Calculation `
+:numref:`table_qos_15` :ref:`table_qos_15`

-:ref:`Table 16. RED Configuration Parameters `
+:numref:`table_qos_16` :ref:`table_qos_16`

-:ref:`Table 17. Relative Performance of Alternative Approaches `
+:numref:`table_qos_17` :ref:`table_qos_17`

-:ref:`Table 18. RED Configuration Corresponding to RED Configuration File 
`
+:numref:`table_qos_18` :ref:`table_qos_18`

-:ref:`Table 19. Port types `
+:numref:`table_qos_19` :ref:`table_qos_19`

-:ref:`Table 20. Port abstract interface `
+:numref:`table_qos_20` :ref:`table_qos_20`

-:ref:`Table 21. Table types `
+:numref:`table_qos_21` :ref:`table_qos_21`

-:ref:`Table 29. Table Abstract Interface `
+:numref:`table_qos_22` :ref:`table_qos_22`

-:ref:`Table 22. Configuration parameters common for all hash table types 
`
+:numref:`table_qos_23` :ref:`table_qos_23`

-:ref:`Table 23. Configuration parameters specific to extendable bucket hash 
table `
+:numref:`table_qos_24` :ref:`table_qos_24`

-:ref:`Table 24. Configuration parameters specific to pre-computed key 
signature hash table `
+:numref:`table_qos_25` :ref:`table_qos_25`

-:ref:`Table 25. The main large data structures (arrays) used for configurable 
key size hash tables `
+:numref:`table_qos_26` :ref:`table_qos_26`

-:ref:`Table 26. Field description for bucket array entry (configurable key 
size hash tables) `
+:numref:`table_qos_27` :ref:`table_qos_27`

-:ref:`Table 27. Description of the bucket search pipeline stages (configurable 
key size hash tables) `
+:numref:`table_qos_28` :ref:`table_qos_28`

-:ref:`Table 28. Lookup tables for match, match_many, match_pos `
+:numref:`table_qos_29` :ref:`table_qos_29`

-:ref:`Table 29. Collapsed lookup tables for match, match_many and match_pos 
`
+:numref:`table_qos_30` :ref:`table_qos_30`

-:ref:`Table 30. The main large data structures (arrays) used for 8-byte and 
16-byte key size hash tables `
+:numref:`table_qos_31` :ref:`table_qos_31`

-:ref:`Table 31. Field description for bucket array entry (8-byte and 16-byte 
key hash tables) `
+:numref:`table_qos_32` :ref:`table_qos_32`

-:ref:`Table 32. Description of the bucket search pipeline stages (8-byte and 
16-byte key hash tables) `
+:numref:`table_qos_33` :ref:`table_qos_33`

-:ref:`Table 33. Next hop actions (reserved) `
-
-:ref:`Table 34. User action examples `
+:numref:`table_qos_34` :ref:`table_qos_34

[dpdk-dev] [PATCH] mempool: limit cache_size

2015-05-18 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> Sent: Monday, May 18, 2015 1:28 PM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
> 
> Hi,
> 
> Any opinion on this patch?
> 
> Regards,
> 
> Zoltan
> 
> On 13/05/15 19:59, Zoltan Kiss wrote:
> > Otherwise cache_flushthresh can be bigger than n, and
> > a consumer can starve others by keeping every element
> > either in use or in the cache.
> >
> > Signed-off-by: Zoltan Kiss 
> > ---
> >   lib/librte_mempool/rte_mempool.c | 3 ++-
> >   lib/librte_mempool/rte_mempool.h | 2 +-
> >   2 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_mempool/rte_mempool.c 
> > b/lib/librte_mempool/rte_mempool.c
> > index cf7ed76..ca6cd9c 100644
> > --- a/lib/librte_mempool/rte_mempool.c
> > +++ b/lib/librte_mempool/rte_mempool.c
> > @@ -440,7 +440,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
> > unsigned elt_size,
> > mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
> >
> > /* asked cache too big */
> > -   if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> > +   if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
> > +   (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n) {
> > rte_errno = EINVAL;
> > return NULL;
> > }

Why not just 'cache_size > n' then?
Konstantin

> > diff --git a/lib/librte_mempool/rte_mempool.h 
> > b/lib/librte_mempool/rte_mempool.h
> > index 9001312..a4a9610 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -468,7 +468,7 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, 
> > void *);
> >*   If cache_size is non-zero, the rte_mempool library will try to
> >*   limit the accesses to the common lockless pool, by maintaining a
> >*   per-lcore object cache. This argument must be lower or equal to
> > - *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
> > + *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to choose
> >*   cache_size to have "n modulo cache_size == 0": if this is
> >*   not the case, some elements will always stay in the pool and will
> >*   never be used. The access to the per-lcore table is of course
> >


[dpdk-dev] [PATCH] mempool: limit cache_size

2015-05-18 Thread Zoltan Kiss


On 18/05/15 13:41, Ananyev, Konstantin wrote:
>
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
>> Sent: Monday, May 18, 2015 1:28 PM
>> To: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
>>
>> Hi,
>>
>> Any opinion on this patch?
>>
>> Regards,
>>
>> Zoltan
>>
>> On 13/05/15 19:59, Zoltan Kiss wrote:
>>> Otherwise cache_flushthresh can be bigger than n, and
>>> a consumer can starve others by keeping every element
>>> either in use or in the cache.
>>>
>>> Signed-off-by: Zoltan Kiss 
>>> ---
>>>lib/librte_mempool/rte_mempool.c | 3 ++-
>>>lib/librte_mempool/rte_mempool.h | 2 +-
>>>2 files changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/lib/librte_mempool/rte_mempool.c 
>>> b/lib/librte_mempool/rte_mempool.c
>>> index cf7ed76..ca6cd9c 100644
>>> --- a/lib/librte_mempool/rte_mempool.c
>>> +++ b/lib/librte_mempool/rte_mempool.c
>>> @@ -440,7 +440,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
>>> unsigned elt_size,
>>> mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
>>>
>>> /* asked cache too big */
>>> -   if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
>>> +   if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
>>> +   (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n) {
>>> rte_errno = EINVAL;
>>> return NULL;
>>> }
>
> Why just no 'cache_size > n' then?

The commit message says: "Otherwise cache_flushthresh can be bigger than 
n, and a consumer can starve others by keeping every element either in 
use or in the cache."
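To illustrate the point of the commit message, here is a minimal sketch of the two checks side by side. It is not the library code: the helper names are hypothetical, and the multiplier value of 1.5 is an assumption inferred from the "n / 1.5" wording in the header comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value, inferred from the "n / 1.5" wording in the patch. */
#define CACHE_FLUSHTHRESH_MULTIPLIER 1.5

/* Hypothetical helper: the per-lcore flush threshold for a given cache size. */
static uint32_t demo_flushthresh(unsigned cache_size)
{
	return (uint32_t)(cache_size * CACHE_FLUSHTHRESH_MULTIPLIER);
}

/* The weaker check from the review question: cache_size alone against n. */
static bool demo_weak_check_ok(unsigned cache_size, unsigned n)
{
	return cache_size <= n;
}

/* The stricter check: the flush threshold itself must not exceed n. */
static bool demo_strict_check_ok(unsigned cache_size, unsigned n)
{
	return demo_flushthresh(cache_size) <= n;
}

static void demo_cache_limit(void)
{
	/* n = 100, cache_size = 80: the threshold is 120, above the pool size. */
	assert(demo_flushthresh(80) == 120);
	assert(demo_weak_check_ok(80, 100));
	assert(!demo_strict_check_ok(80, 100));
	/* cache_size = 64 keeps the threshold (96) within the pool. */
	assert(demo_strict_check_ok(64, 100));
}
```

With n = 100 and cache_size = 80 the weaker check passes, yet one lcore could hold all 100 elements in its cache without ever reaching the threshold of 120.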

> Konstantin
>
>>> diff --git a/lib/librte_mempool/rte_mempool.h 
>>> b/lib/librte_mempool/rte_mempool.h
>>> index 9001312..a4a9610 100644
>>> --- a/lib/librte_mempool/rte_mempool.h
>>> +++ b/lib/librte_mempool/rte_mempool.h
>>> @@ -468,7 +468,7 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, 
>>> void *);
>>> *   If cache_size is non-zero, the rte_mempool library will try to
>>> *   limit the accesses to the common lockless pool, by maintaining a
>>> *   per-lcore object cache. This argument must be lower or equal to
>>> - *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
>>> + *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to choose
>>> *   cache_size to have "n modulo cache_size == 0": if this is
>>> *   not the case, some elements will always stay in the pool and will
>>> *   never be used. The access to the per-lcore table is of course
>>>


[dpdk-dev] [PATCH v2 0/3] port: added frag_ipv6 and ras_ipv6 ports

2015-05-18 Thread Thomas Monjalon
2015-05-05 15:08, Dumitrescu, Cristian:
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
> > From: Maciej Gajdzica 
> > 
> > Added ipv6 versions of ip fragmentation and ip reassembly ports.
> > 
> > Maciej Gajdzica (3):
> >   port: removed IPV4_MTU_DEFAULT define
> >   port: added ipv6 fragmentation port
> >   port: added ipv6 reassembly port
> 
> Acked by: Cristian Dumitrescu  

Applied, thanks


[dpdk-dev] [PATCH] mempool: limit cache_size

2015-05-18 Thread Ananyev, Konstantin


> -Original Message-
> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> Sent: Monday, May 18, 2015 1:50 PM
> To: Ananyev, Konstantin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
> 
> 
> 
> On 18/05/15 13:41, Ananyev, Konstantin wrote:
> >
> >
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> >> Sent: Monday, May 18, 2015 1:28 PM
> >> To: dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
> >>
> >> Hi,
> >>
> >> Any opinion on this patch?
> >>
> >> Regards,
> >>
> >> Zoltan
> >>
> >> On 13/05/15 19:59, Zoltan Kiss wrote:
> >>> Otherwise cache_flushthresh can be bigger than n, and
> >>> a consumer can starve others by keeping every element
> >>> either in use or in the cache.
> >>>
> >>> Signed-off-by: Zoltan Kiss 
> >>> ---
> >>>lib/librte_mempool/rte_mempool.c | 3 ++-
> >>>lib/librte_mempool/rte_mempool.h | 2 +-
> >>>2 files changed, 3 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/lib/librte_mempool/rte_mempool.c 
> >>> b/lib/librte_mempool/rte_mempool.c
> >>> index cf7ed76..ca6cd9c 100644
> >>> --- a/lib/librte_mempool/rte_mempool.c
> >>> +++ b/lib/librte_mempool/rte_mempool.c
> >>> @@ -440,7 +440,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
> >>> unsigned elt_size,
> >>>   mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, 
> >>> rte_mempool_list);
> >>>
> >>>   /* asked cache too big */
> >>> - if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> >>> + if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
> >>> + (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n) {
> >>>   rte_errno = EINVAL;
> >>>   return NULL;
> >>>   }
> >
> > Why just no 'cache_size > n' then?
> 
> The commit message says: "Otherwise cache_flushthresh can be bigger than
> n, and a consumer can starve others by keeping every element either in
> use or in the cache."

Ah yes, you're right - your condition is more restrictive, which is better.
Though here you implicitly convert cache_size and n to floating point and
compare two floating-point values:
(uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n)
Shouldn't it be:
(uint32_t)(cache_size * CACHE_FLUSHTHRESH_MULTIPLIER) > n)
so that we convert back to uint32_t and compare two unsigned integers instead?
Same as below:
mp->cache_flushthresh = (uint32_t)
(cache_size * CACHE_FLUSHTHRESH_MULTIPLIER);
?

In fact, as we use it more than once, it probably makes sense to create a macro
for it, something like:
#define CALC_CACHE_FLUSHTHRESH(c)   ((uint32_t)((c) * CACHE_FLUSHTHRESH_MULTIPLIER))

Or even

#define CALC_CACHE_FLUSHTHRESH(c)   ((typeof (c))((c) * CACHE_FLUSHTHRESH_MULTIPLIER))
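A small sketch of the difference between the two forms of the comparison discussed above. The function names are made up for illustration, and the multiplier is assumed to be the floating-point constant 1.5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_FLUSHTHRESH_MULTIPLIER 1.5  /* assumed value */

/*
 * As posted: the cast applies to cache_size only, so the product is a
 * double and n is converted to double for the comparison.
 */
static bool too_big_as_posted(unsigned cache_size, unsigned n)
{
	return (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n;
}

/* As suggested: truncate the product first, then compare two integers. */
static bool too_big_truncated(unsigned cache_size, unsigned n)
{
	return (uint32_t)(cache_size * CACHE_FLUSHTHRESH_MULTIPLIER) > n;
}

static void demo_cast_precedence(void)
{
	/* 3 * 1.5 = 4.5: greater than 4 as a double, equal to 4 once truncated. */
	assert(too_big_as_posted(3, 4));
	assert(!too_big_truncated(3, 4));
	/* Both forms agree when the product is a whole number. */
	assert(too_big_as_posted(4, 4) && too_big_truncated(4, 4));
}
```

The two forms only disagree when the product has a fractional part, which is also where the truncated form matches the value stored in cache_flushthresh.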


Konstantin

> 
> > Konstantin
> >
> >>> diff --git a/lib/librte_mempool/rte_mempool.h 
> >>> b/lib/librte_mempool/rte_mempool.h
> >>> index 9001312..a4a9610 100644
> >>> --- a/lib/librte_mempool/rte_mempool.h
> >>> +++ b/lib/librte_mempool/rte_mempool.h
> >>> @@ -468,7 +468,7 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool 
> >>> *, void *);
> >>> *   If cache_size is non-zero, the rte_mempool library will try to
> >>> *   limit the accesses to the common lockless pool, by maintaining a
> >>> *   per-lcore object cache. This argument must be lower or equal to
> >>> - *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
> >>> + *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to 
> >>> choose
> >>> *   cache_size to have "n modulo cache_size == 0": if this is
> >>> *   not the case, some elements will always stay in the pool and will
> >>> *   never be used. The access to the per-lcore table is of course
> >>>


[dpdk-dev] [PATCH v2 0/2] cmdline: add polling mode for command line

2015-05-18 Thread Thomas Monjalon
2015-05-13 15:20, Olivier MATZ:
> On 05/13/2015 01:59 PM, Pawel Wodkowski wrote:
> > This patchset adds the ability to process console input in the same thread
> > as packet processing by using poll() function and fixes some minor issues.
> >
> > v2 changes:
> >   - add doxygen documentation for cmdline_poll()
> >   - map file issue fixed
> >   - use proper email address.
> >   - add additional missing include in cmdline_parse_ipaddr.h
> >
> > Pawel Wodkowski (2):
> >cmdline: fix missing include files
> >cmdline: add polling mode for command line
> 
> Acked-by: Olivier Matz 

Applied, thanks


[dpdk-dev] [PATCH] virtio: Fix enqueue/dequeue can't handle chained vring descriptors.

2015-05-18 Thread Ouyang, Changchun
Hi Huawei,

> -Original Message-
> From: Xie, Huawei
> Sent: Monday, May 18, 2015 5:39 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] virtio: Fix enqueue/dequeue can't handle
> chained vring descriptors.
> 
> On 5/4/2015 2:27 PM, Ouyang Changchun wrote:
> > Vring enqueue needs to consider 2 cases:
> >  1. Vring descriptors chained together: the first one is for the virtio
> >     header, the rest are for real data;
> >  2. Only one descriptor: the virtio header and real data share one
> >     single descriptor;
> >
> > So does vring dequeue.
> >
> > Signed-off-by: Changchun Ouyang 
> > ---
> >  lib/librte_vhost/vhost_rxtx.c | 60
> > +++
> >  1 file changed, 44 insertions(+), 16 deletions(-)
> >
> > diff --git a/lib/librte_vhost/vhost_rxtx.c
> > b/lib/librte_vhost/vhost_rxtx.c index 510ffe8..3135883 100644
> > --- a/lib/librte_vhost/vhost_rxtx.c
> > +++ b/lib/librte_vhost/vhost_rxtx.c
> > @@ -59,7 +59,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
> > struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
> > uint64_t buff_addr = 0;
> > uint64_t buff_hdr_addr = 0;
> > -   uint32_t head[MAX_PKT_BURST], packet_len = 0;
> > +   uint32_t head[MAX_PKT_BURST];
> > uint32_t head_idx, packet_success = 0;
> > uint16_t avail_idx, res_cur_idx;
> > uint16_t res_base_idx, res_end_idx;
> > @@ -113,6 +113,10 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
> > rte_prefetch0(&vq->desc[head[packet_success]]);
> >
> > while (res_cur_idx != res_end_idx) {
> > +   uint32_t offset = 0;
> > +   uint32_t data_len, len_to_cpy;
> > +   uint8_t plus_hdr = 0;
> > +
> > /* Get descriptor from available ring */
> > desc = &vq->desc[head[packet_success]];
> >
> > @@ -125,7 +129,6 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
> >
> > /* Copy virtio_hdr to packet and increment buffer address */
> > buff_hdr_addr = buff_addr;
> > -   packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;
> >
> > /*
> >  * If the descriptors are chained the header and data are
> > @@ -136,24 +139,44 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
> > desc = &vq->desc[desc->next];
> > /* Buffer address translation. */
> > buff_addr = gpa_to_vva(dev, desc->addr);
> > -   desc->len = rte_pktmbuf_data_len(buff);
> > } else {
> > buff_addr += vq->vhost_hlen;
> > -   desc->len = packet_len;
> > +   plus_hdr = 1;
> > }
> >
> > +   data_len = rte_pktmbuf_data_len(buff);
> > +   len_to_cpy = RTE_MIN(data_len, desc->len);
> > +   do {
> > +   if (len_to_cpy > 0) {
> > +   /* Copy mbuf data to buffer */
> > +   rte_memcpy((void *)(uintptr_t)buff_addr,
> > +   (const void *)(rte_pktmbuf_mtod(buff, const char *) + offset),
> > +   len_to_cpy);
> > +   PRINT_PACKET(dev, (uintptr_t)buff_addr,
> > +   len_to_cpy, 0);
> > +
> > +   desc->len = len_to_cpy + (plus_hdr ? vq->vhost_hlen : 0);
> 
> Do we really need to rewrite the desc->len again and again?  At least we only
> have the possibility to change the value of desc->len of the last descriptor.

Well, I think we need to change each descriptor's len in the chain here.
If we aggregate all the lengths into the last descriptor's len, that length may
exceed the descriptor's original len. E.g. if 8 chained descriptors (each with
a len of 1024) are used to receive an 8K packet, the last descriptor's len
would be 8K and every other descriptor's len would be 0. I don't think that
situation makes sense.
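To make the 8 x 1024 example concrete, here is a simplified sketch of per-descriptor length accounting. The struct and function names are hypothetical, no data is actually copied, and the real vring descriptor has more fields than shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical descriptor: only the length field matters here. */
struct demo_desc {
	uint32_t len; /* in: buffer capacity, out: bytes placed in this buffer */
};

/*
 * Spread data_len bytes over a chain of descriptors, recording in each
 * descriptor only the number of bytes placed in it. Descriptors that are
 * not needed are left untouched. Returns the number of bytes that fit.
 */
static uint32_t demo_fill_chain(struct demo_desc *chain, size_t nb_desc,
				uint32_t data_len)
{
	uint32_t offset = 0;
	size_t i;

	for (i = 0; i < nb_desc && offset < data_len; i++) {
		uint32_t left = data_len - offset;
		uint32_t len_to_cpy = left < chain[i].len ? left : chain[i].len;

		chain[i].len = len_to_cpy;
		offset += len_to_cpy;
	}
	return offset;
}

static void demo_chain_lengths(void)
{
	struct demo_desc chain[8];
	size_t i;

	for (i = 0; i < 8; i++)
		chain[i].len = 1024;

	/* An 8K packet fills all 8 descriptors; none exceeds its capacity. */
	assert(demo_fill_chain(chain, 8, 8192) == 8192);
	for (i = 0; i < 8; i++)
		assert(chain[i].len == 1024);
}
```

Each descriptor ends up with a length no larger than its own buffer, which is the property argued for above.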

> 
> > +   offset += len_to_cpy;
> > +   if (desc->flags & VRING_DESC_F_NEXT) {
> > +   desc = &vq->desc[desc->next];
> > +   buff_addr = gpa_to_vva(dev, desc->addr);
> > +   len_to_cpy = RTE_MIN(data_len - offset, desc->len);
> > +   } else
> > +   break;
> 
> Still there are two issues here.
> a) If the data couldn't be fully copied to chain of guest buffers, we 
> shouldn't
> do any copy.

Why is not copying any data better than the current implementation?
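For reference, the all-or-nothing behaviour described in point a) could be sketched as a capacity check done before any copy starts. This is only an illustration with a hypothetical helper; it takes the descriptor lengths as a plain array rather than walking a real vring chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical pre-check: return true only if the buffers of the whole
 * chain can hold data_len bytes, so the caller can skip the copy entirely
 * when the packet would not fit.
 */
static bool demo_chain_can_hold(const uint32_t *desc_len, size_t nb_desc,
				uint32_t data_len)
{
	uint64_t capacity = 0;
	size_t i;

	for (i = 0; i < nb_desc; i++)
		capacity += desc_len[i];

	return capacity >= data_len;
}

static void demo_all_or_nothing(void)
{
	const uint32_t chain[2] = { 1024, 1024 };

	assert(demo_chain_can_hold(chain, 2, 1500));  /* fits: copy it */
	assert(!demo_chain_can_hold(chain, 2, 3000)); /* does not fit: copy nothing */
}
```

The cost is one extra walk over the chain per packet, which is the trade-off being questioned here.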

> b) scatter mbuf isn't considered.

If we also consider scattered mbufs here, then this function will have exactly
the same logic as mergeable_rx.
Do you want to remove this function entirely and keep only the mergeable rx
function for all cases?

Changchun



[dpdk-dev] [PATCH] mempool: limit cache_size

2015-05-18 Thread Zoltan Kiss


On 18/05/15 14:14, Ananyev, Konstantin wrote:
>
>
>> -Original Message-
>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>> Sent: Monday, May 18, 2015 1:50 PM
>> To: Ananyev, Konstantin; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
>>
>>
>>
>> On 18/05/15 13:41, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -Original Message-
>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
>>>> Sent: Monday, May 18, 2015 1:28 PM
>>>> To: dev at dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
>>>>
>>>> Hi,
>>>>
>>>> Any opinion on this patch?
>>>>
>>>> Regards,
>>>>
>>>> Zoltan
>>>>
>>>> On 13/05/15 19:59, Zoltan Kiss wrote:
>>>>> Otherwise cache_flushthresh can be bigger than n, and
>>>>> a consumer can starve others by keeping every element
>>>>> either in use or in the cache.
>>>>>
>>>>> Signed-off-by: Zoltan Kiss 
>>>>> ---
>>>>> lib/librte_mempool/rte_mempool.c | 3 ++-
>>>>> lib/librte_mempool/rte_mempool.h | 2 +-
>>>>> 2 files changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/lib/librte_mempool/rte_mempool.c
>>>>> b/lib/librte_mempool/rte_mempool.c
>>>>> index cf7ed76..ca6cd9c 100644
>>>>> --- a/lib/librte_mempool/rte_mempool.c
>>>>> +++ b/lib/librte_mempool/rte_mempool.c
>>>>> @@ -440,7 +440,8 @@ rte_mempool_xmem_create(const char *name, unsigned n,
>>>>> unsigned elt_size,
>>>>>   mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head,
>>>>> rte_mempool_list);
>>>>>
>>>>>   /* asked cache too big */
>>>>> - if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
>>>>> + if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
>>>>> + (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n) {
>>>>>   rte_errno = EINVAL;
>>>>>   return NULL;
>>>>>   }
>>>
>>> Why just no 'cache_size > n' then?
>>
>> The commit message says: "Otherwise cache_flushthresh can be bigger than
>> n, and a consumer can starve others by keeping every element either in
>> use or in the cache."
>
> Ah yes, you right - your condition is more restrictive, which is better.
> Though here you implicitly convert cache_size and n to floats and compare 2 
> floats :
> (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n)
> Shouldn't it be:
> (uint32_t)(cache_size * CACHE_FLUSHTHRESH_MULTIPLIER) > n)
> So we do conversion back to uint32_t compare to unsigned integers instead?
> Same as below:
> mp->cache_flushthresh = (uint32_t)
>  (cache_size * CACHE_FLUSHTHRESH_MULTIPLIER);

To bring it further: how about ditching the whole cache_flushthresh 
member of the mempool structure, and use this:

#define CACHE_FLUSHTHRESH(mp) (uint32_t)((mp)->cache_size * 1.5)

Furthermore, do we want to expose the flush threshold multiplier through 
the config file?

> ?
>
> In fact, as we use it more than once, it probably makes sense to create a 
> macro for it,
> something like:
> #define CALC_CACHE_FLUSHTHRESH(c) ((uint32_t)((c) *  
> CACHE_FLUSHTHRESH_MULTIPLIER)
>
> Or even
>
> #define CALC_CACHE_FLUSHTHRESH(c) ((typeof (c))((c) *  
> CACHE_FLUSHTHRESH_MULTIPLIER)
>
>
> Konstantin
>
>>
>>> Konstantin
>>>
>>>>> diff --git a/lib/librte_mempool/rte_mempool.h
>>>>> b/lib/librte_mempool/rte_mempool.h
>>>>> index 9001312..a4a9610 100644
>>>>> --- a/lib/librte_mempool/rte_mempool.h
>>>>> +++ b/lib/librte_mempool/rte_mempool.h
>>>>> @@ -468,7 +468,7 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool
>>>>> *, void *);
>>>>>  *   If cache_size is non-zero, the rte_mempool library will try to
>>>>>  *   limit the accesses to the common lockless pool, by maintaining a
>>>>>  *   per-lcore object cache. This argument must be lower or equal to
>>>>> - *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
>>>>> + *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to
>>>>> choose
>>>>>  *   cache_size to have "n modulo cache_size == 0": if this is
>>>>>  *   not the case, some elements will always stay in the pool and will
>>>>>  *   never be used. The access to the per-lcore table is of course
>>>>>


[dpdk-dev] [PATCH] i40e: compile fix on ICC 13.0.0

2015-05-18 Thread Helin Zhang
The compile error below occurs with ICC 13.0.0; it is a warning
treated as an error. Forcibly disabling the warning fixes it.

Error log:
lib/librte_pmd_i40e/i40e/i40e_nvm.c(1022): error #188: enumerated
type mixed with another type
hw->aq.asq_last_status = old_asq_status;
   ^

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 22f0716..911e4f5 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -48,7 +48,7 @@ LIBABIVER := 1
 # to disable warnings
 #
 ifeq ($(CC), icc)
-CFLAGS_BASE_DRIVER = -wd593
+CFLAGS_BASE_DRIVER = -wd593 -wd188
 else ifeq ($(CC), clang)
 CFLAGS_BASE_DRIVER += -Wno-sign-compare
 CFLAGS_BASE_DRIVER += -Wno-unused-value
-- 
1.8.1.4
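For readers unfamiliar with this class of diagnostic, the sketch below shows the kind of assignment that mixes an enumerated type with another type, next to an explicit cast, which is the usual source-level way to state the conversion. All type and field names here are hypothetical stand-ins; the real declarations in i40e_nvm.c are not shown in this thread, and the patch itself only changes compiler flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's status enum and hw structure. */
enum demo_aq_err {
	DEMO_AQ_RC_OK = 0,
	DEMO_AQ_RC_EBUSY = 12
};

struct demo_hw {
	enum demo_aq_err asq_last_status;
};

/* Assumption: the saved status is held in a plain integer type. */
static void demo_restore_status(struct demo_hw *hw, uint32_t old_asq_status)
{
	/*
	 * hw->asq_last_status = old_asq_status;
	 * would be an implicit integer-to-enum assignment, the pattern the
	 * error log above describes. The explicit cast states the intent.
	 */
	hw->asq_last_status = (enum demo_aq_err)old_asq_status;
}

static void demo_enum_cast(void)
{
	struct demo_hw hw = { DEMO_AQ_RC_OK };

	demo_restore_status(&hw, 12);
	assert(hw.asq_last_status == DEMO_AQ_RC_EBUSY);
}
```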



[dpdk-dev] [PATCH 1/4] examples/bond: fix compilation with clang

2015-05-18 Thread Bruce Richardson
On Mon, May 18, 2015 at 10:17:58AM +0200, Olivier Matz wrote:
> Fix the following compilation error:
> 
> examples/bond/main.c:717:1: error: control reaches end of
>   non-void function [-Werror,-Wreturn-type]
> 
> The prompt() function does not return anything, so fix its prototype
> to be void.
> 
> Signed-off-by: Olivier Matz 

Out of interest, what version of clang throws up this error?

/Bruce

> ---
>  examples/bond/main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/examples/bond/main.c b/examples/bond/main.c
> index e90dc1d..4622283 100644
> --- a/examples/bond/main.c
> +++ b/examples/bond/main.c
> @@ -705,7 +705,7 @@ cmdline_parse_ctx_t main_ctx[] = {
>  };
>  
>  /* prompt function, called from main on MASTER lcore */
> -static void *prompt(__attribute__((unused)) void *arg1)
> +static void prompt(__attribute__((unused)) void *arg1)
>  {
>   struct cmdline *cl;
>  
> -- 
> 2.1.4
> 
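A minimal, standalone illustration of the warning fixed by this patch. The counter and the demo function are hypothetical additions so that the example has an observable effect; only the prompt() prototype change mirrors the diff above.

```c
#include <assert.h>
#include <stddef.h>

static int prompt_calls; /* hypothetical counter, for demonstration only */

/*
 * Before the fix, the function was declared as returning void * but had no
 * return statement:
 *
 *     static void *prompt(__attribute__((unused)) void *arg1) { ... }
 *
 * which is what -Wreturn-type reports as "control reaches end of non-void
 * function". Declaring it void matches what the body actually does.
 */
static void prompt(__attribute__((unused)) void *arg1)
{
	prompt_calls++;
}

static void demo_prompt(void)
{
	prompt(NULL);
	assert(prompt_calls == 1);
}
```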


[dpdk-dev] [PATCH 1/4] examples/bond: fix compilation with clang

2015-05-18 Thread Olivier MATZ
Hi Bruce,

On 05/18/2015 03:53 PM, Bruce Richardson wrote:
> On Mon, May 18, 2015 at 10:17:58AM +0200, Olivier Matz wrote:
>> Fix the following compilation error:
>>
>> examples/bond/main.c:717:1: error: control reaches end of
>>   non-void function [-Werror,-Wreturn-type]
>>
>> The prompt() function does not return anything, so fix its prototype
>> to be void.
>>
>> Signed-off-by: Olivier Matz 
> 
> Out of interest, what version of clang throws up this error?

$ clang --version
Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
Target: x86_64-pc-linux-gnu
Thread model: posix

And by the way, the gcc version I used for the other patches of the
series:

$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2


Regards,
Olivier

> 
> /Bruce
> 
>> ---
>>  examples/bond/main.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/examples/bond/main.c b/examples/bond/main.c
>> index e90dc1d..4622283 100644
>> --- a/examples/bond/main.c
>> +++ b/examples/bond/main.c
>> @@ -705,7 +705,7 @@ cmdline_parse_ctx_t main_ctx[] = {
>>  };
>>  
>>  /* prompt function, called from main on MASTER lcore */
>> -static void *prompt(__attribute__((unused)) void *arg1)
>> +static void prompt(__attribute__((unused)) void *arg1)
>>  {
>>  struct cmdline *cl;
>>  
>> -- 
>> 2.1.4
>>


[dpdk-dev] [PATCH 1/4] examples/bond: fix compilation with clang

2015-05-18 Thread Bruce Richardson
On Mon, May 18, 2015 at 03:57:00PM +0200, Olivier MATZ wrote:
> Hi Bruce,
> 
> On 05/18/2015 03:53 PM, Bruce Richardson wrote:
> > On Mon, May 18, 2015 at 10:17:58AM +0200, Olivier Matz wrote:
> >> Fix the following compilation error:
> >>
> >> examples/bond/main.c:717:1: error: control reaches end of
> >>   non-void function [-Werror,-Wreturn-type]
> >>
> >> The prompt() function does not return anything, so fix its prototype
> >> to be void.
> >>
> >> Signed-off-by: Olivier Matz 
> > 
> > Out of interest, what version of clang throws up this error?
> 
> $ clang --version
> Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
> Target: x86_64-pc-linux-gnu
> Thread model: posix
> 
> And by the way, the gcc version I used for the other patches of the
> series:
> 
> $ gcc --version
> gcc (Debian 4.9.2-10) 4.9.2
> 
> 
> Regards,
> Olivier
> 

Thanks. I was just curious as I wasn't seeing issues with clang 3.6 on Fedora.

/Bruce

> > 
> > /Bruce
> > 
> >> ---
> >>  examples/bond/main.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/examples/bond/main.c b/examples/bond/main.c
> >> index e90dc1d..4622283 100644
> >> --- a/examples/bond/main.c
> >> +++ b/examples/bond/main.c
> >> @@ -705,7 +705,7 @@ cmdline_parse_ctx_t main_ctx[] = {
> >>  };
> >>  
> >>  /* prompt function, called from main on MASTER lcore */
> >> -static void *prompt(__attribute__((unused)) void *arg1)
> >> +static void prompt(__attribute__((unused)) void *arg1)
> >>  {
> >>struct cmdline *cl;
> >>  
> >> -- 
> >> 2.1.4
> >>


[dpdk-dev] [PATCH] mempool: limit cache_size

2015-05-18 Thread Ananyev, Konstantin


> -Original Message-
> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> Sent: Monday, May 18, 2015 2:31 PM
> To: Ananyev, Konstantin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
> 
> 
> 
> On 18/05/15 14:14, Ananyev, Konstantin wrote:
> >
> >
> >> -Original Message-
> >> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> >> Sent: Monday, May 18, 2015 1:50 PM
> >> To: Ananyev, Konstantin; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
> >>
> >>
> >>
> >> On 18/05/15 13:41, Ananyev, Konstantin wrote:
> >>>
> >>>
> >>>> -Original Message-
> >>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> >>>> Sent: Monday, May 18, 2015 1:28 PM
> >>>> To: dev at dpdk.org
> >>>> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
> >>>>
> >>>> Hi,
> >>>>
> >>>> Any opinion on this patch?
> >>>>
> >>>> Regards,
> >>>>
> >>>> Zoltan
> >>>>
> >>>> On 13/05/15 19:59, Zoltan Kiss wrote:
> >>>>> Otherwise cache_flushthresh can be bigger than n, and
> >>>>> a consumer can starve others by keeping every element
> >>>>> either in use or in the cache.
> >>>>>
> >>>>> Signed-off-by: Zoltan Kiss 
> >>>>> ---
> >>>>> lib/librte_mempool/rte_mempool.c | 3 ++-
> >>>>> lib/librte_mempool/rte_mempool.h | 2 +-
> >>>>> 2 files changed, 3 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/lib/librte_mempool/rte_mempool.c
> >>>>> b/lib/librte_mempool/rte_mempool.c
> >>>>> index cf7ed76..ca6cd9c 100644
> >>>>> --- a/lib/librte_mempool/rte_mempool.c
> >>>>> +++ b/lib/librte_mempool/rte_mempool.c
> >>>>> @@ -440,7 +440,8 @@ rte_mempool_xmem_create(const char *name, unsigned
> >>>>> n, unsigned elt_size,
> >>>>> mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head,
> >>>>> rte_mempool_list);
> >>>>>
> >>>>> /* asked cache too big */
> >>>>> -   if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> >>>>> +   if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
> >>>>> +   (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n) {
> >>>>> rte_errno = EINVAL;
> >>>>> return NULL;
> >>>>> }
> >>>
> >>> Why just no 'cache_size > n' then?
> >>
> >> The commit message says: "Otherwise cache_flushthresh can be bigger than
> >> n, and a consumer can starve others by keeping every element either in
> >> use or in the cache."
> >
> > Ah yes, you right - your condition is more restrictive, which is better.
> > Though here you implicitly convert cache_size and n to floats and compare 2 
> > floats :
> > (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n)
> > Shouldn't it be:
> > (uint32_t)(cache_size * CACHE_FLUSHTHRESH_MULTIPLIER) > n)
> > So we do conversion back to uint32_t compare to unsigned integers instead?
> > Same as below:
> > mp->cache_flushthresh = (uint32_t)
> >  (cache_size * CACHE_FLUSHTHRESH_MULTIPLIER);
> 
> To bring it further: how about ditching the whole cache_flushthresh
> member of the mempool structure, and use this:
> 
> #define CACHE_FLUSHTHRESH(mp) (uint32_t)((mp)->cache_size * 1.5)

That's quite expensive, and I think it would slow down mempool_put() quite a lot.
So I'd suggest we keep cache_flushthresh as it is.
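A rough sketch of the two alternatives, to show where the extra work would land. The struct and function names are hypothetical and much simpler than the real mempool; no measurement is claimed here, the example only shows that both forms give the same answer while the second repeats the floating-point computation on every call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_FLUSHTHRESH_MULTIPLIER 1.5  /* assumed value */

/* Hypothetical, heavily simplified pool descriptor. */
struct demo_pool {
	uint32_t cache_size;
	uint32_t cache_flushthresh; /* computed once, at creation time */
};

static void demo_pool_init(struct demo_pool *mp, uint32_t cache_size)
{
	mp->cache_size = cache_size;
	mp->cache_flushthresh =
		(uint32_t)(cache_size * CACHE_FLUSHTHRESH_MULTIPLIER);
}

/* With the stored member: an integer load and compare per call. */
static bool demo_needs_flush_stored(const struct demo_pool *mp,
				    uint32_t cache_len)
{
	return cache_len >= mp->cache_flushthresh;
}

/*
 * With the proposed macro: integer-to-double conversion, multiplication
 * and truncation on every call, unless the compiler can hoist it.
 */
static bool demo_needs_flush_computed(const struct demo_pool *mp,
				      uint32_t cache_len)
{
	return cache_len >= (uint32_t)(mp->cache_size * 1.5);
}

static void demo_flush_check(void)
{
	struct demo_pool mp;

	demo_pool_init(&mp, 32);
	assert(mp.cache_flushthresh == 48);
	assert(!demo_needs_flush_stored(&mp, 47));
	assert(demo_needs_flush_stored(&mp, 48));
	assert(demo_needs_flush_computed(&mp, 48));
	assert(!demo_needs_flush_computed(&mp, 47));
}
```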

> 
> Furthermore, do we want to expose the flush threshold multiplier through
> the config file?

Hmm, my opinion is no - so far no one has asked for that,
and as a general tendency we are trying to reduce the number of options in the config file.
Do you have a good justification for a case where the current value is not good enough?
Anyway, that could probably be the subject of another patch/discussion.
Konstantin

> 
> > ?
> >
> > In fact, as we use it more than once, it probably makes sense to create a 
> > macro for it,
> > something like:
> > #define CALC_CACHE_FLUSHTHRESH(c)   ((uint32_t)((c) *  
> > CACHE_FLUSHTHRESH_MULTIPLIER)
> >
> > Or even
> >
> > #define CALC_CACHE_FLUSHTHRESH(c)   ((typeof (c))((c) *  
> > CACHE_FLUSHTHRESH_MULTIPLIER)
> >
> >
> > Konstantin
> >
> >>
> >>> Konstantin
> >>>
> >>>>> diff --git a/lib/librte_mempool/rte_mempool.h
> >>>>> b/lib/librte_mempool/rte_mempool.h
> >>>>> index 9001312..a4a9610 100644
> >>>>> --- a/lib/librte_mempool/rte_mempool.h
> >>>>> +++ b/lib/librte_mempool/rte_mempool.h
> >>>>> @@ -468,7 +468,7 @@ typedef void (rte_mempool_ctor_t)(struct
> >>>>> rte_mempool *, void *);
> >>>>>  *   If cache_size is non-zero, the rte_mempool library will try to
> >>>>>  *   limit the accesses to the common lockless pool, by maintaining
> >>>>> a
> >>>>>  *   per-lcore object cache. This argument must be lower or equal to
> >>>>> - *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
> >>>>> + *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to
> >>>>> choose
> >>>>>  *   cache_size to have "n modulo cache_size == 0": if this is
> >>>>>  *   not the case, some elements will always stay in the pool and
> >>>>> will
> >>>>>  *   never be u

[dpdk-dev] [PATCH v2 2/2] i40e/base: compile fix on clang 3.3

2015-05-18 Thread Helin Zhang
The compile error below occurs with clang 3.3; it is a warning
treated as an error. Forcibly disabling the warning fixes it.

Error log:
lib/librte_pmd_i40e/i40e/i40e_nvm.c:708:20: error: unused variable
'i40e_nvm_update_state_str' [-Werror,-Wunused-variable]
STATIC const char *i40e_nvm_update_state_str[] = {
   ^

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 911e4f5..4a5635b 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -58,6 +58,7 @@ CFLAGS_BASE_DRIVER += -Wno-format
 CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers
 CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
 CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
+CFLAGS_BASE_DRIVER += -Wno-unused-variable
 else
 CFLAGS_BASE_DRIVER  = -Wno-sign-compare
 CFLAGS_BASE_DRIVER += -Wno-unused-value
-- 
1.8.1.4



[dpdk-dev] [PATCH v2 0/2] compile fixes on ICC and clang

2015-05-18 Thread Helin Zhang
ICC and clang raise compile warnings that are treated as errors.
Forcibly disabling those warnings fixes the build.

v2 changes:
Added the fix for the compile error on clang.

Helin Zhang (2):
  i40e: compile fix on ICC 13.0.0
  i40e: compile fix on clang 3.3

 lib/librte_pmd_i40e/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
1.8.1.4



[dpdk-dev] [PATCH v2 1/2] i40e/base: compile fix on ICC 13.0.0

2015-05-18 Thread Helin Zhang
The compile error below occurs with ICC 13.0.0; it is a warning
treated as an error. Forcibly disabling the warning fixes it.

Error log:
lib/librte_pmd_i40e/i40e/i40e_nvm.c(1022): error #188: enumerated
type mixed with another type
hw->aq.asq_last_status = old_asq_status;
   ^

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 22f0716..911e4f5 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -48,7 +48,7 @@ LIBABIVER := 1
 # to disable warnings
 #
 ifeq ($(CC), icc)
-CFLAGS_BASE_DRIVER = -wd593
+CFLAGS_BASE_DRIVER = -wd593 -wd188
 else ifeq ($(CC), clang)
 CFLAGS_BASE_DRIVER += -Wno-sign-compare
 CFLAGS_BASE_DRIVER += -Wno-unused-value
-- 
1.8.1.4



[dpdk-dev] [PATCH v2 1/2] i40e/base: compile fix on ICC 13.0.0

2015-05-18 Thread Bruce Richardson
On Mon, May 18, 2015 at 11:03:28PM +0800, Helin Zhang wrote:
> Below compile error can be found on ICC 13.0.0, which is a warning
> treated as error. Forcedly disabling the warning can fix it.
> 
> Error log:
> lib/librte_pmd_i40e/i40e/i40e_nvm.c(1022): error #188: enumerated
> type mixed with another type
> hw->aq.asq_last_status = old_asq_status;
>^
> 
> Signed-off-by: Helin Zhang 

Acked-by: Bruce Richardson 

> ---
>  lib/librte_pmd_i40e/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
> index 22f0716..911e4f5 100644
> --- a/lib/librte_pmd_i40e/Makefile
> +++ b/lib/librte_pmd_i40e/Makefile
> @@ -48,7 +48,7 @@ LIBABIVER := 1
>  # to disable warnings
>  #
>  ifeq ($(CC), icc)
> -CFLAGS_BASE_DRIVER = -wd593
> +CFLAGS_BASE_DRIVER = -wd593 -wd188
>  else ifeq ($(CC), clang)
>  CFLAGS_BASE_DRIVER += -Wno-sign-compare
>  CFLAGS_BASE_DRIVER += -Wno-unused-value
> -- 
> 1.8.1.4
> 


[dpdk-dev] [PATCH v2 2/2] i40e/base: compile fix on clang 3.3

2015-05-18 Thread Bruce Richardson
On Mon, May 18, 2015 at 11:03:29PM +0800, Helin Zhang wrote:
> Below compile error can be found on clang 3.3, which is a warning
> treated as error. Forcedly disabling the warning can fix it.
> 
> Error log:
> lib/librte_pmd_i40e/i40e/i40e_nvm.c:708:20: error: unused variable
> 'i40e_nvm_update_state_str' [-Werror,-Wunused-variable]
> STATIC const char *i40e_nvm_update_state_str[] = {
>^
> 
> Signed-off-by: Helin Zhang 

Acked-by: Bruce Richardson 

> ---
>  lib/librte_pmd_i40e/Makefile | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
> index 911e4f5..4a5635b 100644
> --- a/lib/librte_pmd_i40e/Makefile
> +++ b/lib/librte_pmd_i40e/Makefile
> @@ -58,6 +58,7 @@ CFLAGS_BASE_DRIVER += -Wno-format
>  CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers
>  CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
>  CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
> +CFLAGS_BASE_DRIVER += -Wno-unused-variable
>  else
>  CFLAGS_BASE_DRIVER  = -Wno-sign-compare
>  CFLAGS_BASE_DRIVER += -Wno-unused-value
> -- 
> 1.8.1.4
> 


[dpdk-dev] [PATCH v2 1/2] i40e/base: compile fix on ICC 13.0.0

2015-05-18 Thread Bruce Richardson
On Mon, May 18, 2015 at 04:11:06PM +0100, Bruce Richardson wrote:
> On Mon, May 18, 2015 at 11:03:28PM +0800, Helin Zhang wrote:
> > Below compile error can be found on ICC 13.0.0, which is a warning
> > treated as error. Forcedly disabling the warning can fix it.
> > 
> > Error log:
> > lib/librte_pmd_i40e/i40e/i40e_nvm.c(1022): error #188: enumerated
> > type mixed with another type
> > hw->aq.asq_last_status = old_asq_status;
> >^
> > 
> > Signed-off-by: Helin Zhang 
> 
> Acked-by: Bruce Richardson 
> 

Fix works, but you probably should reword the title to start with "fix".

For future reference, it would also be nice to reference the commit that broke 
things.


> > ---
> >  lib/librte_pmd_i40e/Makefile | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
> > index 22f0716..911e4f5 100644
> > --- a/lib/librte_pmd_i40e/Makefile
> > +++ b/lib/librte_pmd_i40e/Makefile
> > @@ -48,7 +48,7 @@ LIBABIVER := 1
> >  # to disable warnings
> >  #
> >  ifeq ($(CC), icc)
> > -CFLAGS_BASE_DRIVER = -wd593
> > +CFLAGS_BASE_DRIVER = -wd593 -wd188
> >  else ifeq ($(CC), clang)
> >  CFLAGS_BASE_DRIVER += -Wno-sign-compare
> >  CFLAGS_BASE_DRIVER += -Wno-unused-value
> > -- 
> > 1.8.1.4
> > 


[dpdk-dev] [PATCH v2 05/19] e1000: move e1000 pmd to drivers/net directory

2015-05-18 Thread Thomas F Herbert


On 5/18/15 6:54 AM, Bruce Richardson wrote:
> On Sat, May 16, 2015 at 02:11:14PM -0400, Thomas F Herbert wrote:
>> On 5/15/15 11:56 AM, Bruce Richardson wrote:
>>> Move e1000 pmd to drivers/net directory
>>> As part of move, rename "e1000" subdirectory, which contains the code
>>> from the "base driver", to "base".
>>>
>>> Signed-off-by: Bruce Richardson 
>> Bruce,
...
>>
>> --TFH
>
> The e1000 patch seems to apply ok to latest head in my testing.
Here is more information on the failure by running git apply in verbose 
mode.
Checking patch lib/librte_pmd_e1000/e1000/e1000_phy.c...
error: while searching for:
/***

Copyright (c) 2001-2014, Intel Corporation
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice,
 this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.

  3. Neither the name of the Intel Corporation nor the names of its
 contributors may be used to endorse or promote products derived from
 this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

***/

#include "e1000_api.h"

 /* Initialize function pointers */
 phy->ops.init_params = e1000_null_ops_generic;
 phy->ops.acquire = e1000_null_ops_generic;
 phy->ops.check_polarity = e1000_null_ops_generic;
 phy->ops.check_reset_block = e1000_null_ops_generic;
 phy->ops.commit = e1000_null_ops_generic;
 phy->ops.force_speed_duplex = e1000_null_ops_generic;
 phy->ops.get_cfg_done = e1000_null_ops_generic;
 phy->ops.get_cable_length = e1000_null_ops_generic;
 phy->ops.get_info = e1000_null_ops_generic;
 phy->ops.set_page = e1000_null_set_page;
 phy->ops.read_reg = e1000_null_read_reg;
 phy->ops.read_reg_locked = e1000_null_read_reg;
 phy->ops.read_reg_page = e1000_null_read_reg;
 phy->ops.release = e1000_null_phy_generic;
 phy->ops.reset = e1000_null_ops_generic;
 phy->ops.set_d0_lplu_state = e1000_null_lplu_state;
 phy->op
error: patch failed: lib/librte_pmd_e1000/e1000/e1000_phy.c:1
error: lib/librte_pmd_e1000/e1000/e1000_phy.c: patch does not apply
Checking patch lib/librte_pmd_e1000/e1000/e1000_phy.h...
Checking patch lib/librte_pmd_e1000/e1000/e1000_regs.h...
Checking patch lib/librte_pmd_e1000/e1000/e1000_vf.c...
Checking patch lib/librte_pmd_e1000/e1000/e1000_vf.h...
Checking patch lib/librte_pmd_e1000/e1000_ethdev.h...
Checking patch lib/librte_pmd_e1000/e1000_logs.h...
Checking patch lib/librte_pmd_e1000/em_ethdev.c...
Checking patch lib/librte_pmd_e1000/em_rxtx.c...
Checking patch lib/librte_pmd_e1000/igb_ethdev.c...
Checking patch lib/librte_pmd_e1000/igb_pf.c...
Checking patch lib/librte_pmd_e1000/igb_rxtx.c...
Checking patch lib/librte_pmd_e1000/rte_pmd_e1000_version.map...

--TFH


> However, the
> base driver code for i40e has been applied which prevents patch 8 from 
> applying.
>
> /Bruce
>
>>
>> git apply 
>> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_04_19__bond__Move_bonded_ethdev_pmd_to_drivers_net-20150515-1235382.txt
>> [therbert at Fedora21 dpdk]$ git apply 
>> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt
>> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:322:
>> trailing whitespace.
>>
>> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:325:
>> trailing whitespace.
>>
>> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_19__e1000__move_e1000_pmd_to_drivers_net_directory-20150515-5498503.txt:329:
>> trailing whitespace.
>>
>> ../dpdkPatch_20150516-1241/messages/_dpdk-dev___PATCH_v2_05_

[dpdk-dev] [PATCH v2] mempool: limit cache_size

2015-05-18 Thread Zoltan Kiss
Otherwise cache_flushthresh can be bigger than n, and
a consumer can starve others by keeping every element
either in use or in the cache.

Signed-off-by: Zoltan Kiss 
---
v2: use macro for calculation, with proper casting

 lib/librte_mempool/rte_mempool.c | 8 +---
 lib/librte_mempool/rte_mempool.h | 2 +-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index cf7ed76..5cfb96b 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -68,6 +68,8 @@ static struct rte_tailq_elem rte_mempool_tailq = {
 EAL_REGISTER_TAILQ(rte_mempool_tailq)

 #define CACHE_FLUSHTHRESH_MULTIPLIER 1.5
+#define CALC_CACHE_FLUSHTHRESH(c)  \
+   ((typeof (c))((c) *  CACHE_FLUSHTHRESH_MULTIPLIER))

 /*
  * return the greatest common divisor between a and b (fast algorithm)
@@ -440,7 +442,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
unsigned elt_size,
mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);

/* asked cache too big */
-   if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
+   if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
+   CALC_CACHE_FLUSHTHRESH(cache_size) > n) {
rte_errno = EINVAL;
return NULL;
}
@@ -565,8 +568,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
unsigned elt_size,
mp->header_size = objsz.header_size;
mp->trailer_size = objsz.trailer_size;
mp->cache_size = cache_size;
-   mp->cache_flushthresh = (uint32_t)
-   (cache_size * CACHE_FLUSHTHRESH_MULTIPLIER);
+   mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
mp->private_data_size = private_data_size;

/* calculate address of the first element for continuous mempool. */
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 9001312..a4a9610 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -468,7 +468,7 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, 
void *);
  *   If cache_size is non-zero, the rte_mempool library will try to
  *   limit the accesses to the common lockless pool, by maintaining a
  *   per-lcore object cache. This argument must be lower or equal to
- *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
+ *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to choose
  *   cache_size to have "n modulo cache_size == 0": if this is
  *   not the case, some elements will always stay in the pool and will
  *   never be used. The access to the per-lcore table is of course
-- 
1.9.1
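
For readers following the thread, here is a minimal standalone sketch of the check this patch adds. It is not the DPDK code itself: cache_size_is_valid() and its cache_max parameter are hypothetical names used only for illustration, and __typeof__ (a GCC/clang extension) stands in for the typeof used in the patch.

```c
#define CACHE_FLUSHTHRESH_MULTIPLIER 1.5

/* Multiply in floating point, then truncate back to the argument's own
 * integer type, as the v2 macro does. */
#define CALC_CACHE_FLUSHTHRESH(c) \
	((__typeof__(c))((c) * CACHE_FLUSHTHRESH_MULTIPLIER))

/* Hypothetical helper, not part of the patch: returns 1 when cache_size
 * passes both limits checked at pool creation, 0 otherwise. */
static int
cache_size_is_valid(unsigned n, unsigned cache_size, unsigned cache_max)
{
	if (cache_size > cache_max ||
	    CALC_CACHE_FLUSHTHRESH(cache_size) > n)
		return 0;
	return 1;
}
```

With n = 100, a cache_size of 66 gives a flush threshold of 99 and is accepted, while 68 gives 102 and is rejected, which matches the "n / 1.5" wording added to the header comment.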



[dpdk-dev] [PATCH v3 1/2] i40e/base: fix compile with ICC 13.0.0

2015-05-18 Thread Helin Zhang
The compile error below occurs with ICC 13.0.0; it is a warning
treated as an error. Explicitly disabling the warning fixes it.

Error log:
lib/librte_pmd_i40e/i40e/i40e_nvm.c(1022): error #188: enumerated
type mixed with another type
hw->aq.asq_last_status = old_asq_status;
   ^

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

v3 changes:
Reworded the commit title.

diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 22f0716..911e4f5 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -48,7 +48,7 @@ LIBABIVER := 1
 # to disable warnings
 #
 ifeq ($(CC), icc)
-CFLAGS_BASE_DRIVER = -wd593
+CFLAGS_BASE_DRIVER = -wd593 -wd188
 else ifeq ($(CC), clang)
 CFLAGS_BASE_DRIVER += -Wno-sign-compare
 CFLAGS_BASE_DRIVER += -Wno-unused-value
-- 
1.8.1.4



[dpdk-dev] [PATCH v3 0/2] fix compile with ICC and clang

2015-05-18 Thread Helin Zhang
ICC and clang emit compile warnings that are treated as errors.
Explicitly disabling those warnings fixes the build.

v2 changes:
Added the fix for the compile error on clang.

v3 changes:
Reworded the commit titles.

Helin Zhang (2):
  i40e: compile fix on ICC 13.0.0
  i40e: compile fix on clang 3.3

 lib/librte_pmd_i40e/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
1.8.1.4



[dpdk-dev] [PATCH v3 2/2] i40e/base: fix compile with clang 3.3

2015-05-18 Thread Helin Zhang
The compile error below occurs with clang 3.3; it is a warning
treated as an error. Explicitly disabling the warning fixes it.

Error log:
lib/librte_pmd_i40e/i40e/i40e_nvm.c:708:20: error: unused variable
'i40e_nvm_update_state_str' [-Werror,-Wunused-variable]
STATIC const char *i40e_nvm_update_state_str[] = {
   ^

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/Makefile | 1 +
 1 file changed, 1 insertion(+)

v3 changes:
Reworded the commit title.

diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 911e4f5..4a5635b 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -58,6 +58,7 @@ CFLAGS_BASE_DRIVER += -Wno-format
 CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers
 CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
 CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
+CFLAGS_BASE_DRIVER += -Wno-unused-variable
 else
 CFLAGS_BASE_DRIVER  = -Wno-sign-compare
 CFLAGS_BASE_DRIVER += -Wno-unused-value
-- 
1.8.1.4



[dpdk-dev] [PATCH v3 0/2] fix compile with ICC and clang

2015-05-18 Thread Bruce Richardson
On Mon, May 18, 2015 at 11:40:54PM +0800, Helin Zhang wrote:
> ICC and clang emit compile warnings that are treated as errors.
> Explicitly disabling those warnings fixes the build.
> 
> v2 changes:
> Added the fix for the compile error on clang.
> 
> v3 changes:
> Reworded the commit titles.
> 
> Helin Zhang (2):
>   i40e: compile fix on ICC 13.0.0
>   i40e: compile fix on clang 3.3
> 
>  lib/librte_pmd_i40e/Makefile | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> -- 
> 1.8.1.4
> 
Series Acked-by: Bruce Richardson 



[dpdk-dev] [PATCH] mempool: limit cache_size

2015-05-18 Thread Zoltan Kiss


On 18/05/15 15:13, Ananyev, Konstantin wrote:
>
>
>> -Original Message-
>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>> Sent: Monday, May 18, 2015 2:31 PM
>> To: Ananyev, Konstantin; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
>>
>>
>>
>> On 18/05/15 14:14, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -Original Message-
>>>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>>>> Sent: Monday, May 18, 2015 1:50 PM
>>>> To: Ananyev, Konstantin; dev at dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
>>>>
>>>>
>>>>
>>>> On 18/05/15 13:41, Ananyev, Konstantin wrote:
>>>>>
>>>>>
>>>>>> -Original Message-
>>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
>>>>>> Sent: Monday, May 18, 2015 1:28 PM
>>>>>> To: dev at dpdk.org
>>>>>> Subject: Re: [dpdk-dev] [PATCH] mempool: limit cache_size
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Any opinion on this patch?
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Zoltan
>>>>>>
>>>>>> On 13/05/15 19:59, Zoltan Kiss wrote:
>>>>>>> Otherwise cache_flushthresh can be bigger than n, and
>>>>>>> a consumer can starve others by keeping every element
>>>>>>> either in use or in the cache.
>>>>>>>
>>>>>>> Signed-off-by: Zoltan Kiss 
>>>>>>> ---
>>>>>>>  lib/librte_mempool/rte_mempool.c | 3 ++-
>>>>>>>  lib/librte_mempool/rte_mempool.h | 2 +-
>>>>>>>  2 files changed, 3 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/lib/librte_mempool/rte_mempool.c 
>>>>>>> b/lib/librte_mempool/rte_mempool.c
>>>>>>> index cf7ed76..ca6cd9c 100644
>>>>>>> --- a/lib/librte_mempool/rte_mempool.c
>>>>>>> +++ b/lib/librte_mempool/rte_mempool.c
>>>>>>> @@ -440,7 +440,8 @@ rte_mempool_xmem_create(const char *name, unsigned 
>>>>>>> n, unsigned elt_size,
>>>>>>> mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, 
>>>>>>> rte_mempool_list);
>>>>>>>
>>>>>>> /* asked cache too big */
>>>>>>> -   if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
>>>>>>> +   if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
>>>>>>> +   (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n) {
>>>>>>> rte_errno = EINVAL;
>>>>>>> return NULL;
>>>>>>> }
>>>>>
>>>>> Why just no 'cache_size > n' then?
>>>>
>>>> The commit message says: "Otherwise cache_flushthresh can be bigger than
>>>> n, and a consumer can starve others by keeping every element either in
>>>> use or in the cache."
>>>
>>> Ah yes, you right - your condition is more restrictive, which is better.
>>> Though here you implicitly convert cache_size and n to floats and compare 2 
>>> floats :
>>> (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n)
>>> Shouldn't it be:
>>> (uint32_t)(cache_size * CACHE_FLUSHTHRESH_MULTIPLIER) > n)
>>> So we do conversion back to uint32_t compare to unsigned integers instead?
>>> Same as below:
>>> mp->cache_flushthresh = (uint32_t)
>>>   (cache_size * CACHE_FLUSHTHRESH_MULTIPLIER);
>>
>> To bring it further: how about ditching the whole cache_flushthresh
>> member of the mempool structure, and use this:
>>
>> #define CACHE_FLUSHTHRESH(mp) (uint32_t)((mp)->cache_size * 1.5)
>
> That's quite expensive and I think would slow down mempool_put() quite a lot .
> So I'd suggest we keep cache_flushthresh as it is.
Ok, I have posted a v2 based on your suggestion.
>
>>
>> Furthermore, do we want to expose the flush threshold multiplier through
>> the config file?
>
> Hmm, my opinion is no - so far no one ask for that,
> and as general tendency - we trying to reduce number of options in config 
> file.
> Do you have any good justification when current value is not good enough?

Nothing special, just the arbitrary value choice seemed a bit odd.
> Anyway, that probably could be a subject of another patch/discussion.
> Konstantin
>
>>
>>> ?
>>>
>>> In fact, as we use it more than once, it probably makes sense to create a 
>>> macro for it,
>>> something like:
>>> #define CALC_CACHE_FLUSHTHRESH(c)   ((uint32_t)((c) *  
>>> CACHE_FLUSHTHRESH_MULTIPLIER)
>>>
>>> Or even
>>>
>>> #define CALC_CACHE_FLUSHTHRESH(c)   ((typeof (c))((c) *  
>>> CACHE_FLUSHTHRESH_MULTIPLIER)
>>>
>>>
>>> Konstantin
>>>

>>>>> Konstantin
>>>>>
>>>>>>> diff --git a/lib/librte_mempool/rte_mempool.h 
>>>>>>> b/lib/librte_mempool/rte_mempool.h
>>>>>>> index 9001312..a4a9610 100644
>>>>>>> --- a/lib/librte_mempool/rte_mempool.h
>>>>>>> +++ b/lib/librte_mempool/rte_mempool.h
>>>>>>> @@ -468,7 +468,7 @@ typedef void (rte_mempool_ctor_t)(struct 
>>>>>>> rte_mempool *, void *);
>>>>>>>   *   If cache_size is non-zero, the rte_mempool library will try to
>>>>>>>   *   limit the accesses to the common lockless pool, by 
>>>>>>> maintaining a
>>>>>>>   *   per-lcore object cache. This argument must be lower or equal 
>>>>>>> to
>>>>>>> - *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
>>>>>>> + *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advise

[dpdk-dev] [PATCH v2] mempool: limit cache_size

2015-05-18 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> Sent: Monday, May 18, 2015 4:35 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] mempool: limit cache_size
> 
> Otherwise cache_flushthresh can be bigger than n, and
> a consumer can starve others by keeping every element
> either in use or in the cache.
> 
> Signed-off-by: Zoltan Kiss 

Acked-by: Konstantin Ananyev 

> ---
> v2: use macro for calculation, with proper casting
> 
>  lib/librte_mempool/rte_mempool.c | 8 +---
>  lib/librte_mempool/rte_mempool.h | 2 +-
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.c 
> b/lib/librte_mempool/rte_mempool.c
> index cf7ed76..5cfb96b 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -68,6 +68,8 @@ static struct rte_tailq_elem rte_mempool_tailq = {
>  EAL_REGISTER_TAILQ(rte_mempool_tailq)
> 
>  #define CACHE_FLUSHTHRESH_MULTIPLIER 1.5
> +#define CALC_CACHE_FLUSHTHRESH(c)\
> + ((typeof (c))((c) *  CACHE_FLUSHTHRESH_MULTIPLIER))
> 
>  /*
>   * return the greatest common divisor between a and b (fast algorithm)
> @@ -440,7 +442,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
> unsigned elt_size,
>   mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
> 
>   /* asked cache too big */
> - if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> + if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
> + CALC_CACHE_FLUSHTHRESH(cache_size) > n) {
>   rte_errno = EINVAL;
>   return NULL;
>   }
> @@ -565,8 +568,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, 
> unsigned elt_size,
>   mp->header_size = objsz.header_size;
>   mp->trailer_size = objsz.trailer_size;
>   mp->cache_size = cache_size;
> - mp->cache_flushthresh = (uint32_t)
> - (cache_size * CACHE_FLUSHTHRESH_MULTIPLIER);
> + mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
>   mp->private_data_size = private_data_size;
> 
>   /* calculate address of the first element for continuous mempool. */
> diff --git a/lib/librte_mempool/rte_mempool.h 
> b/lib/librte_mempool/rte_mempool.h
> index 9001312..a4a9610 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -468,7 +468,7 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, 
> void *);
>   *   If cache_size is non-zero, the rte_mempool library will try to
>   *   limit the accesses to the common lockless pool, by maintaining a
>   *   per-lcore object cache. This argument must be lower or equal to
> - *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
> + *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5. It is advised to choose
>   *   cache_size to have "n modulo cache_size == 0": if this is
>   *   not the case, some elements will always stay in the pool and will
>   *   never be used. The access to the per-lcore table is of course
> --
> 1.9.1
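
The "proper casting" noted in the v2 changelog comes down to where the cast is applied. The sketch below contrasts the two forms; too_big_v1() and too_big_v2() are hypothetical names chosen for this illustration and do not appear in the patch.

```c
#include <stdint.h>

#define CACHE_FLUSHTHRESH_MULTIPLIER 1.5

/* v1 form: the cast binds to cache_size only, so the product and the
 * comparison with n are both carried out in double precision. */
static int
too_big_v1(unsigned cache_size, unsigned n)
{
	return (uint32_t) cache_size * CACHE_FLUSHTHRESH_MULTIPLIER > n;
}

/* v2 form: the product is truncated to uint32_t first and then compared
 * as unsigned integers, so the check sees the same value that is stored
 * in cache_flushthresh. */
static int
too_big_v2(unsigned cache_size, unsigned n)
{
	return (uint32_t)(cache_size * CACHE_FLUSHTHRESH_MULTIPLIER) > n;
}
```

The two only disagree when the product has a fractional part: for cache_size = 67 and n = 100, the v1 form compares 100.5 with 100 and rejects, while the v2 form compares 100 with 100 and accepts.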



[dpdk-dev] [PATCH v4 0/6] update jhash function

2015-05-18 Thread Bruce Richardson
On Tue, May 12, 2015 at 12:02:32PM +0100, Pablo de Lara wrote:
> Jenkins hash function was developed originally in 1996,
> and was integrated in first versions of DPDK.
> The function has been improved in 2006,
> achieving up to 60% better performance, compared to the original one.
> 
> This patchset updates the current jhash in DPDK,
> including two new functions that generate two hashes from a single key.
> 
> It also separates the existing hash function performance tests to
> another file, to make it quicker to run.
> 
> changes in v4:
> - Simplify key alignment checks
> - Include missing x86 arch check
> 
> changes in v3:
> 
> - Update rte_jhash_1word, rte_jhash_2words and rte_jhash_3words
>   functions
> 
> changes in v2:
> 
> - Split single commit in three commits, one that updates the existing 
> functions
>   and another that adds two new functions and use one of those functions
>   as a base to be called by the other ones.
> - Remove some unnecessary ifdefs in the code.
> - Add new macros to help on the reutilization of constants
> - Separate hash function performance tests to another file
>   and improve cycle measurements.
> - Rename existing function rte_jhash2 to rte_jhash_32b
>   (something more meaninful) and mark rte_jhash2 as
>   deprecated
> 

Hi Pablo,

Patchset looks good to me, and unit tests all pass across the set. Some general
comments or suggestions though - particularly about testing.

1. The set of lengths used when testing the functions looks strange and rather
arbitrary. Perhaps we could have a set of key lengths which are documented. E.g.

static const unsigned lengths[] = {
	4, 8, 16, 48, 64, /* standard key sizes */
	9,                /* IPv4 SRC + DST + protocol, unpadded */
	13,               /* IPv4 5-tuple, unpadded */
	37,               /* IPv6 5-tuple, unpadded */
	40,               /* IPv6 5-tuple, padded to 8-byte boundary */
};

2. When testing multiple algorithms, it might be nice to change the order of the
loops so that we test all algorithms with the same key lengths first, and then
change length, rather than running the same algorithm with multiple lengths and
then changing algorithm. The output would be clearer and easier to see which
algorithm performs best for a given key-length.

3. For sanity checking across the patches making changes to the jhash functions,
I think it would be nice to have an initial sanity test with a set of known
keys and hash results in it. That way we can verify that the actual calculation
result never changes as the functions are modified. This would also be a big
help for future work changing the code. [As far as I can see, the algorithm
tests never check that we are actually getting the right answer :-)]
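
A rough sketch of what such a known-answer test could look like follows. It is only an illustration of the idea in point 3: the hash used here is 32-bit FNV-1a, chosen as a stand-in because its reference values are easy to verify by hand, and the names fnv1a_32, known_answer, vectors and count_mismatches are all hypothetical. A real test would call the jhash functions and record their outputs before the implementation is changed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in hash (32-bit FNV-1a); not the DPDK jhash. */
static uint32_t
fnv1a_32(const void *key, size_t len)
{
	const uint8_t *p = key;
	uint32_t h = 2166136261u;	/* FNV offset basis */
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;		/* FNV prime */
	}
	return h;
}

/* One fixed key and the hash value recorded for it. */
struct known_answer {
	const char *key;
	uint32_t expected;
};

static const struct known_answer vectors[] = {
	{ "",  0x811c9dc5u },
	{ "a", 0xe40c292cu },
	{ "b", 0xe70c2de5u },
};

/* Returns the number of vectors whose hash differs from the recorded value;
 * zero means the calculation result has not changed. */
static unsigned
count_mismatches(void)
{
	unsigned i, bad = 0;

	for (i = 0; i < sizeof(vectors) / sizeof(vectors[0]); i++) {
		const struct known_answer *v = &vectors[i];

		if (fnv1a_32(v->key, strlen(v->key)) != v->expected)
			bad++;
	}
	return bad;
}
```

Running count_mismatches() before and after each patch in the set would flag any change in the computed values.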

All the above suggestions could perhaps go in a patch (or 2/3 patches) after the
first two, which splits out the algorithm tests, and before the actual changes
to the jhash implementation.
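
To make points 1 and 2 concrete, here is a small sketch of the suggested loop order: key length in the outer loop, algorithm in the inner loop, so results for the same length sit next to each other. Everything in it is an assumption made for illustration: algo_sum and algo_xor are trivial placeholders for the real hash functions, and run_all() only records the visit order instead of measuring cycles.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder "algorithms"; hypothetical stand-ins for the hashes under test. */
static uint32_t
algo_sum(const uint8_t *k, size_t n)
{
	uint32_t h = 0;
	size_t i;

	for (i = 0; i < n; i++)
		h += k[i];
	return h;
}

static uint32_t
algo_xor(const uint8_t *k, size_t n)
{
	uint32_t h = 0;
	size_t i;

	for (i = 0; i < n; i++)
		h ^= (uint32_t)k[i] << (i % 24);
	return h;
}

typedef uint32_t (*hash_fn)(const uint8_t *, size_t);

static const hash_fn algos[] = { algo_sum, algo_xor };

/* Documented key lengths, as proposed in point 1. */
static const unsigned lengths[] = { 4, 8, 16, 48, 64, 9, 13, 37, 40 };

#define NB_ALGOS   (sizeof(algos) / sizeof(algos[0]))
#define NB_LENGTHS (sizeof(lengths) / sizeof(lengths[0]))

struct run {
	unsigned len;	/* key length used */
	unsigned algo;	/* index into algos[] */
	uint32_t hash;	/* value produced */
};

/* Visit every (length, algorithm) pair, lengths outermost.
 * Returns the number of entries written to out (at most max). */
static size_t
run_all(struct run *out, size_t max)
{
	uint8_t key[64] = { 0 };	/* large enough for the longest length */
	size_t count = 0;
	unsigned l, a;

	for (l = 0; l < NB_LENGTHS; l++) {
		for (a = 0; a < NB_ALGOS; a++) {
			if (count == max)
				return count;
			out[count].len = lengths[l];
			out[count].algo = a;
			out[count].hash = algos[a](key, lengths[l]);
			count++;
		}
	}
	return count;
}
```

In a real performance test, the body of the inner loop would time the hash call and print one row per key length with one column per algorithm.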

Regards,
/Bruce



[dpdk-dev] [PATCH 5/5] ixgbe: silence noisy log messages

2015-05-18 Thread Stephen Hemminger
On Mon, 18 May 2015 10:32:01 +0100
Bruce Richardson  wrote:

> For the most part, this looks fine. However, I'm unsure about changing the log
> level of the messages stating what the RX and TX burst functions in use are. I
> would view this as important information that should generally be displayed as
> the performance impacts of using a sub-optimal RX/TX code path are large.
> 
> /Bruce


At INFO level it shows up in log files that customers read.
This is an issue where DPDK has to grow up and be ready for real world
use, rather than being developer friendly.

Our customers ask about every log message (believe me). So if there
is no problem the drivers must be absolutely silent (STFU)


[dpdk-dev] [PATCH 0/5] receive IRQ related patches

2015-05-18 Thread Stephen Hemminger
These are some of the patches to enhance the as-yet-unmerged
receive interrupt functionality.

The big piece is support of UIO-MSI interrupts which is required
to make the virtio and vmxnet3 receive IRQ functionality work.
After this piece is reviewed, I will send those bits.

Stephen Hemminger (5):
  ethdev: check for rxq interrupt support
  ethdev: remove unnecessary checks
  ethdev: fix errors if RTE_ETHDEV_DEBUG enabled
  uio: new driver with MSI-X support
  uio: integrate MSI-X support

 config/common_linuxapp |   1 +
 lib/librte_eal/common/include/rte_pci.h|   1 +
 lib/librte_eal/linuxapp/Makefile   |   3 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c   |  94 +-
 lib/librte_eal/linuxapp/eal/eal_pci.c  |   4 +
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c  |  59 +++-
 lib/librte_eal/linuxapp/eal/eal_uio_msi.h  |  26 ++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |   1 +
 lib/librte_eal/linuxapp/uio_msi/Makefile   |  13 +
 lib/librte_eal/linuxapp/uio_msi/uio_msi.c  | 365 +
 lib/librte_eal/linuxapp/uio_msi/uio_msi.h  |  22 ++
 lib/librte_ether/rte_ethdev.c  |  29 +-
 tools/dpdk_nic_bind.py |   2 +-
 13 files changed, 580 insertions(+), 40 deletions(-)
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_uio_msi.h
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/Makefile
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/uio_msi.c
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/uio_msi.h

-- 
2.1.4
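
Patch 4/5 in this series describes delivering MSI-X interrupts through eventfd. For readers unfamiliar with that mechanism, the Linux-only sketch below shows the userspace side in isolation: each signal adds to a 64-bit counter, and one read returns the accumulated count and resets it. The function name eventfd_roundtrip() is hypothetical, and the writes here stand in for the signals a kernel driver would raise.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Signal an eventfd `events` times, then read the accumulated count back.
 * Returns the count, or -1 on error (including a read with no events
 * pending, since the descriptor is non-blocking). */
static int64_t
eventfd_roundtrip(unsigned events)
{
	uint64_t one = 1, count = 0;
	unsigned i;
	int fd = eventfd(0, EFD_NONBLOCK);

	if (fd < 0)
		return -1;

	/* Each 8-byte write adds its value to the eventfd counter. */
	for (i = 0; i < events; i++) {
		if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
			close(fd);
			return -1;
		}
	}

	/* A single 8-byte read returns the total and resets the counter. */
	if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
		close(fd);
		return -1;
	}

	close(fd);
	return (int64_t)count;
}
```

This is why a reader of such a descriptor needs a 64-bit buffer: several interrupts that arrive between two reads are reported as a single count.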



[dpdk-dev] [PATCH 1/5] ethdev: check for rxq interrupt support

2015-05-18 Thread Stephen Hemminger
Not all devices support rxq interrupt yet.
It is better to check for interrupt support in driver at configuration
time than waiting for later failures.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_ether/rte_ethdev.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index cb586ff..ad15837 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1183,6 +1183,14 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, 
uint16_t nb_tx_q,
}

/*
+* If Receive Queue interrupt is enabled, check that
+* the device supports interrupt control.
+*/
+   if (dev_conf->intr_conf.rxq == 1)
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable,
+   -EINVAL);
+
+   /*
 * If jumbo frames are enabled, check that the maximum RX packet
 * length is supported by the configured device.
 */
-- 
2.1.4
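
The idea of the check above can be sketched outside DPDK as follows. This is a simplified model under stated assumptions, not the ethdev code: struct dev_ops, struct dev_conf, dummy_enable() and configure_check() are hypothetical, and the NULL test plays the role that FUNC_PTR_OR_ERR_RET has in the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down driver ops table. */
struct dev_ops {
	int (*rx_queue_intr_enable)(void *dev, unsigned queue_id);
};

/* Hypothetical, cut-down configuration: 1 requests rxq interrupts. */
struct dev_conf {
	unsigned rxq_intr;
};

/* Example callback for a driver that does support rxq interrupts. */
static int
dummy_enable(void *dev, unsigned queue_id)
{
	(void)dev;
	(void)queue_id;
	return 0;
}

/* Fail at configuration time if rxq interrupts are requested but the
 * driver provides no enable callback; otherwise accept the config. */
static int
configure_check(const struct dev_ops *ops, const struct dev_conf *conf)
{
	if (conf->rxq_intr == 1 && ops->rx_queue_intr_enable == NULL)
		return -EINVAL;
	return 0;
}
```

Rejecting the configuration early means the application learns about the missing support from the configure call instead of from a later failure when it first tries to enable an interrupt.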



[dpdk-dev] [PATCH 2/5] ethdev: remove unnecessary checks

2015-05-18 Thread Stephen Hemminger
Since the code has just called rte_eth_dev_is_valid_port
the following checks are unnecessary.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_ether/rte_ethdev.c | 16 
 1 file changed, 16 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ad15837..7789338 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3305,10 +3305,6 @@ rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int 
op, void *data)
}

dev = &rte_eth_devices[port_id];
-   if (dev == NULL) {
-   PMD_DEBUG_TRACE("Invalid port device\n");
-   return -ENODEV;
-   }

intr_handle = &dev->pci_dev->intr_handle;
if (!intr_handle->intr_vec) {
@@ -3350,10 +3346,6 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t 
queue_id,
}

dev = &rte_eth_devices[port_id];
-   if (dev == NULL) {
-   PMD_DEBUG_TRACE("Invalid port device\n");
-   return -ENODEV;
-   }

if (queue_id >= dev->data->nb_rx_queues) {
PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", rx_queue_id);
@@ -3391,10 +3383,6 @@ rte_eth_dev_rx_intr_enable(uint8_t port_id,
}

dev = &rte_eth_devices[port_id];
-   if (dev == NULL) {
-   PMD_DEBUG_TRACE("Invalid port device\n");
-   return -ENODEV;
-   }

FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
@@ -3412,10 +3400,6 @@ rte_eth_dev_rx_intr_disable(uint8_t port_id,
}

dev = &rte_eth_devices[port_id];
-   if (dev == NULL) {
-   PMD_DEBUG_TRACE("Invalid port device\n");
-   return -ENODEV;
-   }

FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
-- 
2.1.4



[dpdk-dev] [PATCH 3/5] ethdev: fix errors if RTE_ETHDEV_DEBUG enabled

2015-05-18 Thread Stephen Hemminger
The interrupt mode patches introduced some obvious errors
if RTE_ETHDEV_DEBUG is defined.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_ether/rte_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7789338..cf9a79a 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3348,13 +3348,13 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t 
queue_id,
dev = &rte_eth_devices[port_id];

if (queue_id >= dev->data->nb_rx_queues) {
-   PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", rx_queue_id);
+   PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
return -EINVAL;
}

intr_handle = &dev->pci_dev->intr_handle;
if (!intr_handle->intr_vec || intr_handle->intr_vec[queue_id] < 0) {
-   PMD_DEBUG_TRACE("RX Intr vector unset on %d\n", rx_queue_id);
+   PMD_DEBUG_TRACE("RX Intr vector unset on %d\n", queue_id);
return -EPERM;
}

-- 
2.1.4



[dpdk-dev] [PATCH 4/5] uio: new driver with MSI-X support

2015-05-18 Thread Stephen Hemminger
This is a merge of igb_uio with the MSI-X support through
eventfd (similar to VFIO). The driver requires a small change to
upstream UIO driver to allow UIO drivers to support ioctl's.

See:
http://marc.info/?l=linux-kernel&m=143197030217434&w=2
http://www.spinics.net/lists/kernel/msg1993359.html

Signed-off-by: Stephen Hemminger 
---
 config/common_linuxapp|   1 +
 lib/librte_eal/linuxapp/Makefile  |   3 +
 lib/librte_eal/linuxapp/uio_msi/Makefile  |  13 ++
 lib/librte_eal/linuxapp/uio_msi/uio_msi.c | 365 ++
 lib/librte_eal/linuxapp/uio_msi/uio_msi.h |  22 ++
 5 files changed, 404 insertions(+)
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/Makefile
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/uio_msi.c
 create mode 100644 lib/librte_eal/linuxapp/uio_msi/uio_msi.h

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 0078dc9..8299efe 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -100,6 +100,7 @@ CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
 CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
 CONFIG_RTE_EAL_IGB_UIO=y
 CONFIG_RTE_EAL_VFIO=y
+CONFIG_RTE_EAL_UIO_MSI=y

 #
 # Special configurations in PCI Config Space for high performance
diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
index 8fcfdf6..d283952 100644
--- a/lib/librte_eal/linuxapp/Makefile
+++ b/lib/librte_eal/linuxapp/Makefile
@@ -34,6 +34,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
 ifeq ($(CONFIG_RTE_EAL_IGB_UIO),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += igb_uio
 endif
+ifeq ($(CONFIG_RTE_EAL_UIO_MSI),y)
+DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += uio_msi
+endif
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal
 ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kni
diff --git a/lib/librte_eal/linuxapp/uio_msi/Makefile 
b/lib/librte_eal/linuxapp/uio_msi/Makefile
new file mode 100644
index 000..275174c
--- /dev/null
+++ b/lib/librte_eal/linuxapp/uio_msi/Makefile
@@ -0,0 +1,13 @@
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+MODULE = uio_msi
+MODULE_PATH = drivers/uio/uio_msi
+
+MODULE_CFLAGS += -I$(SRCDIR)
+MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
+MODULE_CFLAGS += -Winline -Wall -Werror
+
+SRCS-y := uio_msi.c
+
+include $(RTE_SDK)/mk/rte.module.mk
diff --git a/lib/librte_eal/linuxapp/uio_msi/uio_msi.c 
b/lib/librte_eal/linuxapp/uio_msi/uio_msi.c
new file mode 100644
index 000..7b1dcea
--- /dev/null
+++ b/lib/librte_eal/linuxapp/uio_msi/uio_msi.c
@@ -0,0 +1,365 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright (c) 2015 by Brocade Communications Systems, Inc.
+ * All rights reserved.
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "uio_msi.h"
+
+#define DRIVER_VERSION "0.1.0"
+#define NON_Q_VECTORS  1
+
+/* MSI-X vector information */
+struct uio_msi_pci_dev {
+   struct uio_info info;   /* UIO driver info */
+   struct pci_dev *pdev;   /* PCI device */
+   struct mutexmutex;  /* open/release/ioctl mutex */
+   int ref_cnt;/* references to device */
+   u16 num_vectors;/* How many MSI-X slots are used */
+   struct msix_entry *msix;/* MSI-x vector table */
+   struct uio_msi_irq_ctx {
+   struct eventfd_ctx *trigger; /* MSI-x vector to eventfd */
+   char *name; /* name in /proc/interrupts */
+   } *ctx;
+};
+
+static unsigned int max_vectors = 33;
+module_param(max_vectors, uint, 0);
+MODULE_PARM_DESC(max_vectors, "Upper limit on # of MSI-X vectors used");
+
+static irqreturn_t uio_msi_irqhandler(int irq, void *arg)
+{
+   struct eventfd_ctx *trigger = arg;
+
+   pr_devel("irq %u trigger %p\n", irq, trigger);
+
+   eventfd_signal(trigger, 1);
+   return IRQ_HANDLED;
+}
+
+/* set the mapping between vector # and existing eventfd. */
+static int set_irq_eventfd(struct uio_msi_pci_dev *udev, u32 vec, int fd)
+{
+   struct uio_msi_irq_ctx *ctx;
+   struct eventfd_ctx *trigger;
+   int irq, err;
+
+   if (vec >= udev->num_vectors) {
+   dev_notice(&udev->pdev->dev, "vec %u >= num_vec %u\n",
+  vec, udev->num_vectors);
+   return -ERANGE;
+   }
+
+   irq = udev->msix[vec].vector;
+
+   /* Clearup existing irq mapping */
+   ctx = &udev->ctx[vec];
+   if (ctx->trigger) {
+   free_irq(irq, ctx->trigger);
+   eventfd_ctx_put(ctx->trigger);
+   ctx->trigger = NULL;
+   }
+
+   /* Passing -1 is used to disable interrupt */
+   if (fd < 0)
+   return 0;
+
+
+   trigger = eventfd_ctx_fdget(fd);
+   if (IS_ERR(

[dpdk-dev] [PATCH 5/5] uio: integrate MSI-X support

2015-05-18 Thread Stephen Hemminger
Add the new uio_msi as a supported driver model.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_eal/common/include/rte_pci.h|  1 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 94 +++---
 lib/librte_eal/linuxapp/eal/eal_pci.c  |  4 +
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c  | 59 --
 lib/librte_eal/linuxapp/eal/eal_uio_msi.h  | 26 ++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  1 +
 lib/librte_ether/rte_ethdev.c  |  1 +
 tools/dpdk_nic_bind.py |  2 +-
 8 files changed, 166 insertions(+), 22 deletions(-)
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_uio_msi.h

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 223d3cd..106f4f7 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -147,6 +147,7 @@ enum rte_kernel_driver {
RTE_KDRV_IGB_UIO,
RTE_KDRV_VFIO,
RTE_KDRV_UIO_GENERIC,
+   RTE_KDRV_UIO_MSIX,
 };

 /**
diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c 
b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index fd97fc4..8cdab58 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -66,6 +66,7 @@

 #include "eal_private.h"
 #include "eal_vfio.h"
+#include "eal_uio_msi.h"

 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)

@@ -89,9 +90,7 @@ union intr_pipefds{
  */
 union rte_intr_read_buffer {
int uio_intr_count;  /* for uio device */
-#ifdef VFIO_PRESENT
-   uint64_t vfio_intr_count;/* for vfio device */
-#endif
+   uint64_t eventfd_count;  /* for vfio and uio-msi */
uint64_t timerfd_num;/* for timerfd */
char charbuf[16];/* for others */
 };
@@ -356,6 +355,67 @@ vfio_disable_msix(struct rte_intr_handle *intr_handle) {
 }
 #endif

+/* enable MSI-X interrupts */
+static int
+uio_msix_enable(struct rte_intr_handle *intr_handle)
+{
+   int i, max_intr;
+
+   if (!intr_handle->max_intr ||
+   intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID)
+   max_intr = RTE_MAX_RXTX_INTR_VEC_ID + 1;
+   else
+   max_intr = intr_handle->max_intr;
+
+   /* Actual number of MSI-X interrupts might be less than requested */
+   for (i = 0; i < max_intr; i++) {
+   struct uio_msi_irq_set irqs = {
+   .vec = i,
+   .fd = intr_handle->efds[i],
+   };
+
+   if (i == max_intr - 1)
+   irqs.fd = intr_handle->fd;
+
+   if (ioctl(intr_handle->vfio_dev_fd, UIO_MSI_IRQ_SET, &irqs) < 0) {
+   RTE_LOG(ERR, EAL,
+   "Error enabling MSI-X event %u fd %d (%s)\n",
+   irqs.vec, irqs.fd, strerror(errno));
+   return -1;
+   }
+   }
+
+   return 0;
+}
+
+/* disable MSI-X interrupts */
+static int
+uio_msix_disable(struct rte_intr_handle *intr_handle)
+{
+   int i, max_intr;
+
+   if (!intr_handle->max_intr ||
+   intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID)
+   max_intr = RTE_MAX_RXTX_INTR_VEC_ID + 1;
+   else
+   max_intr = intr_handle->max_intr;
+
+   for (i = 0; i < max_intr; i++) {
+   struct uio_msi_irq_set irqs = {
+   .vec = i,
+   .fd = -1,
+   };
+
+   if (ioctl(intr_handle->vfio_dev_fd, UIO_MSI_IRQ_SET, &irqs) < 0) {
+   RTE_LOG(ERR, EAL,
+   "Error disabling MSI-X event %u (%s)\n",
+   i, strerror(errno));
+   return -1;
+   }
+   }
+   return 0;
+}
+
 static int
 uio_intx_intr_disable(struct rte_intr_handle *intr_handle)
 {
@@ -584,6 +644,10 @@ rte_intr_enable(struct rte_intr_handle *intr_handle)
if (uio_intx_intr_enable(intr_handle))
return -1;
break;
+   case RTE_INTR_HANDLE_UIO_MSIX:
+   if (uio_msix_enable(intr_handle))
+   return -1;
+   break;
/* not used at this moment */
case RTE_INTR_HANDLE_ALARM:
return -1;
@@ -628,6 +692,10 @@ rte_intr_disable(struct rte_intr_handle *intr_handle)
if (uio_intx_intr_disable(intr_handle))
return -1;
break;
+   case RTE_INTR_HANDLE_UIO_MSIX:
+   if (uio_msix_disable(intr_handle))
+   return -1;
+   break;
/* not used at this moment */
case RTE_INTR_HANDLE_ALARM:
return -1;
@@ -696,16 +764,19 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
case RTE_INTR_HANDLE_UIO:
 

[dpdk-dev] [PATCH v3] Implement memcmp using SIMD intrinsics

2015-05-18 Thread Ravi Kerur
Background:
After preliminary discussion with John (Zhihong) and Tim from Intel, it was
decided that it would be beneficial to use AVX/SSE intrinsics for memcmp,
similar to the memcpy that had already been implemented. In addition, we
decided to use librte_hash as a test candidate to test both functionality
and performance.

Further discussions led to a complete implementation of memory comparison
functionality, and the v3 code reflects that.
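
For readers unfamiliar with the approach, the idea of comparing memory with
SIMD intrinsics can be sketched as follows. This is a minimal illustration
only, not the code from the patch: the names sketch_cmp16 and sketch_memcmp
are hypothetical, it uses only 128-bit SSE2 loads (no AVX), and it assumes a
GCC or Clang compiler for __builtin_ctz. On non-SSE2 targets it simply falls
back to memcmp.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: compare exactly 16 bytes, memcmp-style result. */
static inline int
sketch_cmp16(const uint8_t *a, const uint8_t *b)
{
#if defined(__SSE2__)
	__m128i va = _mm_loadu_si128((const __m128i *)a);
	__m128i vb = _mm_loadu_si128((const __m128i *)b);
	/* Equal bytes become 0xFF; movemask packs their top bits into 16 bits. */
	unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(va, vb));
	unsigned idx;

	if (mask == 0xFFFFu)
		return 0;
	/* Lowest clear bit is the index of the first differing byte. */
	idx = (unsigned)__builtin_ctz(~mask & 0xFFFFu);
	return (int)a[idx] - (int)b[idx];
#else
	return memcmp(a, b, 16);
#endif
}

/* Hypothetical wrapper: 16-byte blocks first, then a scalar tail. */
static int
sketch_memcmp(const void *s1, const void *s2, size_t n)
{
	const uint8_t *a = s1;
	const uint8_t *b = s2;

	while (n >= 16) {
		int r = sketch_cmp16(a, b);

		if (r != 0)
			return r;
		a += 16;
		b += 16;
		n -= 16;
	}
	return n ? memcmp(a, b, n) : 0;
}
```

The sketch only shows where the gain comes from: one compare and one
movemask handle 16 bytes at a time, and the byte-level work happens only at
the first mismatch.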

Tests were conducted on an Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz, Ubuntu
14.04, x86_64, 16GB DDR3 system.

Ravi Kerur (1):
  Implement memcmp using Intel SIMD instrinsics.

 app/test/Makefile  |   5 +-
 app/test/autotest_data.py  |  19 +
 app/test/test_hash_perf.c  |  36 +-
 app/test/test_memcmp.c | 229 ++
 app/test/test_memcmp_perf.c| 339 
 .../common/include/arch/ppc_64/rte_memcmp.h|  62 ++
 .../common/include/arch/x86/rte_memcmp.h   | 900 +
 lib/librte_eal/common/include/generic/rte_memcmp.h | 175 
 lib/librte_hash/rte_hash.c |  59 +-
 9 files changed, 1789 insertions(+), 35 deletions(-)
 create mode 100644 app/test/test_memcmp.c
 create mode 100644 app/test/test_memcmp_perf.c
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_memcmp.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcmp.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_memcmp.h

-- 
1.9.1



[dpdk-dev] [PATCH v3] Implement memcmp using Intel SIMD instrinsics.

2015-05-18 Thread Ravi Kerur
This patch implements rte_memcmp using AVX/SSE intrinsics and uses
librte_hash as the first candidate to use it.

Tested with GCC (4.8.2) and Clang (3.4-1) compilers; both show better
performance than memcmp on an Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz,
Ubuntu 14.04, x86_64.

Changes in v3:
Implement complete memcmp functionality.
Implement functional and performance tests and add them to
"make test" infrastructure code.

Changes in v2:
Modified code to support only up to 64 bytes, as that is the maximum
number of bytes used by hash for comparison.

Changes in v1:
Initial changes to support memcmp for up to 128 bytes.

Signed-off-by: Ravi Kerur 
---
 app/test/Makefile  |   5 +-
 app/test/autotest_data.py  |  19 +
 app/test/test_hash_perf.c  |  36 +-
 app/test/test_memcmp.c | 229 ++
 app/test/test_memcmp_perf.c| 339 
 .../common/include/arch/ppc_64/rte_memcmp.h|  62 ++
 .../common/include/arch/x86/rte_memcmp.h   | 900 +
 lib/librte_eal/common/include/generic/rte_memcmp.h | 175 
 lib/librte_hash/rte_hash.c |  59 +-
 9 files changed, 1789 insertions(+), 35 deletions(-)
 create mode 100644 app/test/test_memcmp.c
 create mode 100644 app/test/test_memcmp_perf.c
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_memcmp.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcmp.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_memcmp.h

diff --git a/app/test/Makefile b/app/test/Makefile
index 4aca77c..957e4f1 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -81,6 +81,9 @@ SRCS-y += test_logs.c
 SRCS-y += test_memcpy.c
 SRCS-y += test_memcpy_perf.c

+SRCS-y += test_memcmp.c
+SRCS-y += test_memcmp_perf.c
+
 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash.c
 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c

@@ -150,7 +153,7 @@ CFLAGS_test_kni.o += -Wno-deprecated-declarations
 endif
 CFLAGS += -D_GNU_SOURCE

-# Disable VTA for memcpy test
+# Disable VTA for memcpy tests
 ifeq ($(CC), gcc)
 ifeq ($(shell test $(GCC_VERSION) -ge 44 && echo 1), 1)
 CFLAGS_test_memcpy.o += -fno-var-tracking-assignments
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 618a946..e07f087 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -187,6 +187,12 @@ parallel_test_group_list = [
 "Report" : None,
},
{
+"Name" :   "Memcmp autotest",
+"Command" :"memcmp_autotest",
+"Func" :   default_autotest,
+"Report" : None,
+   },
+   {
 "Name" :   "Memzone autotest",
 "Command" :"memzone_autotest",
 "Func" :   default_autotest,
@@ -399,6 +405,19 @@ non_parallel_test_group_list = [
]
 },
 {
+   "Prefix":   "memcmp_perf",
+   "Memory" :  all_sockets(512),
+   "Tests" :
+   [
+   {
+"Name" :   "Memcmp performance autotest",
+"Command" :"memcmp_perf_autotest",
+"Func" :   default_autotest,
+"Report" : None,
+   },
+   ]
+},
+{
"Prefix":   "hash_perf",
"Memory" :  all_sockets(512),
"Tests" :   
diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index 6eabb21..6887629 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -440,7 +440,7 @@ run_single_tbl_perf_test(const struct rte_hash *h, hash_operation func,
uint32_t *invalid_pos_count)
 {
uint64_t begin, end, ticks = 0;
-   uint8_t *key = NULL;
+   uint8_t * volatile key = NULL;
uint32_t *bucket_occupancies = NULL;
uint32_t num_buckets, i, j;
int32_t pos;
@@ -547,30 +547,30 @@ run_tbl_perf_test(struct tbl_perf_test_params *params)
case ADD_UPDATE:
num_iterations = params->num_iterations;
params->num_iterations = params->entries;
-   run_single_tbl_perf_test(handle, rte_hash_add_key, params,
-   &avg_occupancy, &invalid_pos);
-   params->num_iterations = num_iterations;
ticks = run_single_tbl_perf_test(handle, rte_hash_add_key,
params, &avg_occupancy, &invalid_pos);
+   params->num_iterations = num_iterations;
+   ticks += run_single_tbl_perf_test(handle, rte_hash_add_key,
+   params, &avg_occupancy, &invalid_pos);
break;
case DELETE:
num_iterations = params->num_iterations;
params->num_iterations = params->entries;
-   run_single_tbl_perf_tes

[dpdk-dev] [PATCH] kni: Add link status update

2015-05-18 Thread Vijayakumar Muthuvel Manickam
Add an ioctl command in rte_kni module to enable
DPDK applications to propagate link state changes to
kni virtual interfaces.

Signed-off-by: Vijayakumar Muthuvel Manickam 
---
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |  2 ++
 lib/librte_eal/linuxapp/kni/kni_misc.c | 39 ++
 lib/librte_kni/rte_kni.c   | 18 ++
 lib/librte_kni/rte_kni.h   | 17 ++
 4 files changed, 76 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 1e55c2d..b68001d 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -163,6 +163,7 @@ struct rte_kni_device_info {

/* mbuf size */
unsigned mbuf_size;
+   uint8_t link_state;
 };

 #define KNI_DEVICE "kni"
@@ -170,5 +171,6 @@ struct rte_kni_device_info {
 #define RTE_KNI_IOCTL_TEST    _IOWR(0, 1, int)
 #define RTE_KNI_IOCTL_CREATE  _IOWR(0, 2, struct rte_kni_device_info)
 #define RTE_KNI_IOCTL_RELEASE _IOWR(0, 3, struct rte_kni_device_info)
+#define RTE_KNI_IOCTL_LINK_UPDATE _IOWR(0, 4, struct rte_kni_device_info)

 #endif /* _RTE_KNI_COMMON_H_ */
diff --git a/lib/librte_eal/linuxapp/kni/kni_misc.c b/lib/librte_eal/linuxapp/kni/kni_misc.c
index 1935d32..b1015cd 100644
--- a/lib/librte_eal/linuxapp/kni/kni_misc.c
+++ b/lib/librte_eal/linuxapp/kni/kni_misc.c
@@ -548,6 +548,42 @@ kni_ioctl_release(unsigned int ioctl_num, unsigned long ioctl_param)
 }

 static int
+kni_ioctl_update_link_state(unsigned int ioctl_num, unsigned long ioctl_param)
+{
+   int ret = -EINVAL;
+   struct kni_dev *dev, *n;
+   struct rte_kni_device_info dev_info;
+
+   if (_IOC_SIZE(ioctl_num) > sizeof(dev_info))
+   return -EINVAL;
+
+   ret = copy_from_user(&dev_info, (void *)ioctl_param, sizeof(dev_info));
+   if (ret) {
+   KNI_ERR("copy_from_user in kni_ioctl_update_link_status");
+   return -EIO;
+   }
+
+   if (strlen(dev_info.name) == 0)
+   return ret;
+
+   down_write(&kni_list_lock);
+   list_for_each_entry_safe(dev, n, &kni_list_head, list) {
+   if (strncmp(dev->name, dev_info.name, RTE_KNI_NAMESIZE) != 0)
+   continue;
+
+   if (dev_info.link_state == 0)
+   netif_carrier_off(dev->net_dev);
+   else
+   netif_carrier_on(dev->net_dev);
+   ret = 0;
+   break;
+   }
+   up_write(&kni_list_lock);
+
+   return ret;
+}
+
+static int
 kni_ioctl(struct inode *inode,
unsigned int ioctl_num,
unsigned long ioctl_param)
@@ -569,6 +605,9 @@ kni_ioctl(struct inode *inode,
case _IOC_NR(RTE_KNI_IOCTL_RELEASE):
ret = kni_ioctl_release(ioctl_num, ioctl_param);
break;
+   case _IOC_NR(RTE_KNI_IOCTL_LINK_UPDATE):
+   ret = kni_ioctl_update_link_state(ioctl_num, ioctl_param);
+   break;
default:
KNI_DBG("IOCTL default \n");
break;
diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 4e70fa0..b6eda8a 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -512,6 +512,24 @@ rte_kni_release(struct rte_kni *kni)
 }

 int
+rte_kni_update_link_state(struct rte_kni *kni, uint8_t if_up)
+{
+   struct rte_kni_device_info dev_info;
+
+   if (!kni || !kni->in_use)
+   return -1;
+
+   snprintf(dev_info.name, sizeof(dev_info.name), "%s", kni->name);
+   dev_info.link_state = if_up;
+   if (ioctl(kni_fd, RTE_KNI_IOCTL_LINK_UPDATE, &dev_info) < 0) {
+   RTE_LOG(ERR, KNI, "Fail to update link state\n");
+   return -1;
+   }
+
+   return 0;
+}
+
+int
 rte_kni_handle_request(struct rte_kni *kni)
 {
unsigned ret;
diff --git a/lib/librte_kni/rte_kni.h b/lib/librte_kni/rte_kni.h
index 98edd72..a1bafd9 100644
--- a/lib/librte_kni/rte_kni.h
+++ b/lib/librte_kni/rte_kni.h
@@ -167,6 +167,23 @@ extern struct rte_kni *rte_kni_create(uint8_t port_id,
 extern int rte_kni_release(struct rte_kni *kni);

 /**
+ * Send link state changes to KNI interface in kernel space
+ *
+ * rte_kni_update_link_state is thread safe.
+ *
+ * @param kni
+ *  The pointer to the context of an existent KNI interface.
+ * @param if_up
+ *  interface link status
+ *
+ * @return
+ *  - 0 indicates success.
+ *  - negative value indicates failure.
+ */
+
+extern int rte_kni_update_link_state(struct rte_kni *kni, uint8_t if_up);
+
+/**
  * It is used to handle the request mbufs sent from kernel space.
  * Then analyzes it and calls the specific actions for the specific requests.
  * Finally constructs the response mbuf and puts it back to the resp_q.
-- 
1.8.1.4



[dpdk-dev] [PATCH] kni: Add link status update

2015-05-18 Thread Stephen Hemminger
I agree that this looks like a good facility to have but this is not
the right way to do it.

There are already several facilities to accomplish the same thing.

1. You can use the operstate functionality in kernel to manipulate carrier
   from another application (read Documentation/operstate.txt)

2. It is possible with sysfs to change carrier state. The KNI driver just
   has to provide an .ndo_change_carrier callback.
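
To illustrate option 2 from the application side: once a driver provides
.ndo_change_carrier, user space can write "0" or "1" to the interface's
carrier attribute in sysfs. The following is a minimal sketch under that
assumption; the helper name sketch_set_carrier and its sysfs_root parameter
are hypothetical (the root would normally be "/sys/class/net"), and the
write is expected to fail if the driver does not implement the callback.

```c
#include <stdio.h>

/*
 * Hypothetical helper: set carrier state by writing to
 * <sysfs_root>/<ifname>/carrier. Returns 0 on success, -1 on failure.
 */
static int
sketch_set_carrier(const char *sysfs_root, const char *ifname, int up)
{
	char path[256];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s/carrier", sysfs_root, ifname);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;	/* path would have been truncated */

	f = fopen(path, "w");
	if (f == NULL)
		return -1;

	ok = (fputs(up ? "1" : "0", f) >= 0);
	/* The write reaches the kernel on flush, so check fclose() too. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A call such as sketch_set_carrier("/sys/class/net", "vEth0", 0) would then
replace the proposed ioctl; the interface name here is only an example.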

Introducing more ioctls is not a good thing. KNI needs to get rid of
the quick and dirty ioctl approach and follow current best practices
for Linux network drivers. It should really be using netlink instead of ioctl().
Ioctls have several compatibility issues that make them unacceptable upstream.
There is a standard API for create/destroy with netlink, and the custom
ioctls could be banished into the attic of dead APIs.

PS: Has anybody even attempted to get KNI upstream? I can see lots
 of style (and security) issues that would need to be fixed.