[dpdk-dev] [PATCH 0/6] fix build warnings and errors in SUSE11 SP3

2015-03-23 Thread Thomas Monjalon
> From: Marvin Liu 
> 
> SUSE11 SP3 default gcc version is 4.3.4. Some options not supported in this
> version. This patch set add gcc version check for those options and fix other
> build warning in Suse11 SP3.
> 
> Marvin Liu (6):
>   fix sse3 functions not found with gcc 4.3.4
>   fix fm10k driver build error when gcc version elder than 4.4
>   fix build error when initialized structure in enic driver
>   fix build error in app/test with gcc4.3
>   fix build error in app/test with gcc4.3
>   fix build error for implicit declaration of function pread

Applied, thanks


[dpdk-dev] [PATCH] ixgbe: fix buffer overrun bug in non-bulk alloc mode setup

2015-03-23 Thread Jiajia, SunX
Hi Thomas,

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Saturday, March 21, 2015 5:46 AM
> To: Jiajia, SunX
> Cc: dev at dpdk.org; Wodkowski, PawelX
> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix buffer overrun bug in non-
> bulk alloc mode setup
> 
> 2015-03-20 06:50, Jiajia, SunX:
> > Tested-by: Jiajia, SunX 
> > - Tested Commit: fe4810a01e57645ad92577d628f562791408ce21
> 
> This commit id is related to a PDF doc change.

I verify this issue based on this commit id which is not the commit id I found, 
Sorry for making you confused. BTW, I found this issue in the 
dpdk-2.0.0-rc2, which commit id is 1a5994ac2ce11c112d9eed69c08311b31ac7f3d2.

> 
> > - OS: Fedora20 3.11.10-301.fc20.x86_64
> > - GCC: gcc version 4.8.3
> > - CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
> > - NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network
> Connection [8086:10fb]
> > - Target x86_64-native-linuxapp-gcc
> > - Total 22 cases, 22 passed, 0 failed
> 
> It seems this test is related to bonding, not ixgbe buffer overrun in
> non-bulk alloc mode.

Because I found this issue when doing the test of bonding, so I think doing
Bonding test should work on verifying this issue. 

> 
> > TOPO:
> > * Connections ports between tester/ixia and DUT
> >   - TESTER(Or IXIA)---DUT
> >   - portA--port0
> >   - portB--port1
> >   - portC--port2
> >   - portD--port3
> >
> > Test Setup#1 for Functional test
> > 
> >
> > Tester has 4 ports(portA--portD), and DUT has 4 ports(port0--port3),
> then connect portA to port0, portB to port1, portC to port2, portD to
> port3.
> >
> >
> >
> > - Case: Basic bonding--Create bonded devices and slaves
> >   Description:
> > Use Setup#1.
> > Create bonded device and add some ports as salve of bonded
> device,
> > Then removed slaves or added slaves or change the bonding
> primary slave
> > Or change bonding mode and so on.
> >   Expected test result:
> > Verify the basic functions are normal.
> >
> > - Case: Basic bonding--MAC Address Test
> >   Description:
> > Use Setup#1.
> > Create bonded device and add some ports as slaves of bonded
> device,
> > Check that the changes of  the bonded device and slave
> MAC
> >   Expected test result:
> > Verify the behavior of bonded device and slave according
> to the mode.
> >
> > - Case: Basic bonding--Device Promiscuous Mode Test
> >   Description:
> > Use Setup#1.
> > Create bonded device and add some ports as slaves of bonded
> device,
> > Set promiscuous mode on or off, then send packets to the
> bonded device
> > Or slaves.
> >   Expected test result:
> > Verify the RX/TX status of bonded device and slaves
> according to the mode.
> >
> > - Case: Mode 0(Round Robin) TX/RX test
> >   Description:
> > Use Setup#1.
> > Create bonded device with mode 0 and add 3 ports as slaves
> of bonded device,
> > Forward packets between bonded device and unbounded
> device, start to forward,
> > And send packets to unbound device or slaves.
> >   Expected test result:
> > Verify the RX/TX status of bonded device and slaves in
> mode 0.
> >
> > - Case: Mode 0(Round Robin) Bring one slave link down
> >   Description:
> > Use Setup#1.
> > Create bonded device with mode 0 and add 3 ports as slaves
> of bonded device,
> > Forward packets between bonded device and unbounded
> device, start to forward,
> > Bring the link on either port 0, 1 or 2 down. And send
> packets to unbound
> > device or slaves.
> >   Expected test result:
> > Verify the RX/TX status of bonded device and slaves in
> mode 0.
> >
> > - Case: Mode 0(Round Robin) Bring all slave links down
> >   Description:
> > Use Setup#1.
> > Create bonded device with mode 0 and add 3 ports as slaves
> of bonded device,
> > Forward packets between bonded device and unbounded
> device, start to forward,
> > Bring the links down on all bonded ports. And send
> packets to unbound
> > device or slaves.
> >   Expected test result:
> > Verify the RX/TX status of bonded device and slaves in
> mode 0.
> >
> > - Case: Mode 1(Active Backup) TX/RX Test
> >   Description:
> > Use Setup#1.
> > Create bonded device with mode 1 and add 3 ports as slaves
> of bonded device,
> > Forward packets between bonded device and unbounded
> device, start to forward,
> > And send packets to unbound device or slaves.
> >   Expected test result:
> > Verify the RX/TX status of bonded device and slaves in
> mode 1.
> >
> > - Case: Mode 1(Active Backup) Change active slave, RX/TX te

[dpdk-dev] [PATCH] cast used->idx to volatile

2015-03-23 Thread Linhaifeng
cc changchun.ouyang at intel.com
cc huawei.xie at intel.com

On 2015/3/21 16:07, linhaifeng wrote:
> From: Linhaifeng 
> 
> Same as rte_vhost_enqueue_burst we should cast used->idx
> to volatile before notify guest.
> 
> Signed-off-by: Linhaifeng 
> ---
>  lib/librte_vhost/vhost_rxtx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 535c7a1..8d674d1 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -722,7 +722,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t 
> queue_id,
>   }
>  
>   rte_compiler_barrier();
> - vq->used->idx += entry_success;
> + *(volatile uint16_t *)&vq->used->idx += entry_success;
>   /* Kick guest if required. */
>   if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
>   eventfd_write((int)vq->callfd, 1);
> 

-- 
Regards,
Haifeng



[dpdk-dev] [PATCH] virtio: Fix stats issue

2015-03-23 Thread Ouyang Changchun
It need clear/reset the stats information before count in all queues data.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c 
b/lib/librte_pmd_virtio/virtio_ethdev.c
index 603be2d..e4cb55e 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -572,6 +572,10 @@ virtio_dev_stats_get(struct rte_eth_dev *dev, struct 
rte_eth_stats *stats)
 {
unsigned i;

+   stats->opackets = 0;
+   stats->obytes = 0;
+   stats->oerrors = 0;
+
for (i = 0; i < dev->data->nb_tx_queues; i++) {
const struct virtqueue *txvq = dev->data->tx_queues[i];
if (txvq == NULL)
@@ -587,6 +591,10 @@ virtio_dev_stats_get(struct rte_eth_dev *dev, struct 
rte_eth_stats *stats)
}
}

+   stats->ipackets = 0;
+   stats->ibytes = 0;
+   stats->ierrors = 0;
+
for (i = 0; i < dev->data->nb_rx_queues; i++) {
const struct virtqueue *rxvq = dev->data->rx_queues[i];
if (rxvq == NULL)
-- 
1.8.4.2



[dpdk-dev] [PATCH] virtio: Fix stats issue

2015-03-23 Thread David Marchand
Hello,

Hello,

On Mon, Mar 23, 2015 at 7:56 AM, Ouyang Changchun <
changchun.ouyang at intel.com> wrote:

> It need clear/reset the stats information before count in all queues data.
>
> Signed-off-by: Changchun Ouyang 
> ---
>  lib/librte_pmd_virtio/virtio_ethdev.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c
> b/lib/librte_pmd_virtio/virtio_ethdev.c
> index 603be2d..e4cb55e 100644
> --- a/lib/librte_pmd_virtio/virtio_ethdev.c
> +++ b/lib/librte_pmd_virtio/virtio_ethdev.c
> @@ -572,6 +572,10 @@ virtio_dev_stats_get(struct rte_eth_dev *dev, struct
> rte_eth_stats *stats)
>  {
> unsigned i;
>
> +   stats->opackets = 0;
> +   stats->obytes = 0;
> +   stats->oerrors = 0;
>

stats are supposed to be zero'd in generic rte_ethdev.c before this pmd
function is called, so this patch seems useless to me.
Can you give some context ?

Same comment for the i* part.

-- 
David Marchand


[dpdk-dev] [PATCH] vhost library doc update

2015-03-23 Thread Xie, Huawei
On 3/22/2015 10:00 PM, Butler, Siobhan A wrote:

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Huawei Xie
Sent: Wednesday, March 11, 2015 4:22 PM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH] vhost library doc update

add vhost user documentation

Signed-off-by: Huawei Xie 
---
 doc/guides/prog_guide/vhost_lib.rst | 52
++---
 1 file changed, 42 insertions(+), 10 deletions(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst
b/doc/guides/prog_guide/vhost_lib.rst
index 0b6eda7..ab35b74 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -31,25 +31,28 @@
 Vhost Library
 =

-The vhost cuse (cuse: user space character device driver) library implements
a -vhost cuse driver. It also creates, manages and destroys vhost devices for -
corresponding virtio devices in the guest. Vhost supported vSwitch could
register -callbacks to this library, which will be called when a vhost device is
activated -or deactivated by guest virtual machine.
+The vhost library implements a user space vhost driver. It supports
+both vhost-cuse
+(cuse: user space character device) and vhost-user(user space socket
server).
+It also creates, manages and destroys vhost devices for corresponding
+virtio devices in the guest. Vhost supported vSwitch could register
+callbacks to this library, which will be called when a vhost device is
+activated or deactivated by guest virtual machine.

 Vhost API Overview
 --

 *   Vhost driver registration

-  rte_vhost_driver_register registers the vhost cuse driver into the
system.
-  Character device file will be created in the /dev directory.
+  rte_vhost_driver_register registers the vhost driver into the system.
+  For vhost-cuse, character device file will be created under the /dev
directory.
   Character device name is specified as the parameter.
+  For vhost-user, a unix domain socket server will be created with the
parameter as
+  the local socket path.

 *   Vhost session start

   rte_vhost_driver_session_start starts the vhost session loop.
-  Vhost cuse session is an infinite blocking loop.
+  Vhost session is an infinite blocking loop.
   Put the session in a dedicate DPDK thread.

 *   Callback register
@@ -73,6 +76,8 @@ Vhost API Overview
 Vhost Implementation
 

+Vhost cuse implementation
+~
 When vSwitch registers the vhost driver, it will register a cuse device driver
into the system and creates a character device file. This cuse driver will
receive vhost open/release/IOCTL message from QEMU simulator.
@@ -89,13 +94,40 @@ which means vhost could access the shared virtio ring
and the guest physical  address specified in the entry of the ring.

 The guest virtual machine tells the vhost whether the virtio device is ready -
for processing or is de-activated through VHOST_SET_BACKEND message.
+for processing or is de-activated through VHOST_NET_SET_BACKEND
message.
 The registered callback from vSwitch will be called.

 When the release call is released, vhost will destroy the device.

+Vhost user implementation
+~
+When vSwitch registers a vhost driver, it will create a unix domain
+socket server into the system. This server will listen for a connection
+and process the vhost message from QEMU simulator.
+
+When there is a new socket connection, it means a new virtio device has
+been created in the guest virtual machine, and the vhost driver will create a
vhost device for this virtio device.
+
+For messages with a file descriptor, the file descriptor could be
+directly used in the vhost process as it is already installed by unix domain
socket.
+ * VHOST_SET_MEM_TABLE
+ * VHOST_SET_VRING_KICK
+ * VHOST_SET_VRING_CALL
+ * VHOST_SET_LOG_FD
+ * VHOST_SET_VRING_ERR
+
+For VHOST_SET_MEM_TABLE message, QEMU will send us information for
each
+memory region and its file descriptor in the ancillary data of the message.
The fd is used to map that region.
+
+There is no VHOST_NET_SET_BACKEND message as in vhost cuse to signal
us
+whether virtio device is ready or should be stopped.
+VHOST_SET_VRING_KICK is used as the signal to put the vhost device onto
data plane.
+VHOST_GET_VRING_BASE is used as the signal to remove vhost device
from data plane.
+
+When the socket connection is closed, vhost will destroy the device.
+
 Vhost supported vSwitch reference
 -

-For how to support vhost in vSwitch, please refer to vhost example in the
+For more vhost details and how to support vhost in vSwitch, please
+refer to vhost example in the
 DPDK Sample Applications Guide.
--
1.8.1.4



Hi Huawei,

Having some issues with trailing white space in this patch can you please check 
it?

Thanks
Siobhan



Hi Siobhan:

I check the patch, and don't know if this is your question. This is a


[dpdk-dev] [PATCH] eal_common_options.c: set create_uio_dev option to no argument

2015-03-23 Thread gaohaifeng
From: Haifeng Gao 

eal options OPT_CREATE_UIO_DEV does not need argument so set it to zero.
It needs to reset create_uio_dev explicitly.

Signed-off-by: Haifeng Gao 
---
 lib/librte_eal/common/eal_common_options.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_options.c 
b/lib/librte_eal/common/eal_common_options.c
index 4319549..8fcb1ab 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -70,7 +70,7 @@ eal_short_options[] =
 const struct option
 eal_long_options[] = {
{OPT_BASE_VIRTADDR, 1, NULL, OPT_BASE_VIRTADDR_NUM},
-   {OPT_CREATE_UIO_DEV,1, NULL, OPT_CREATE_UIO_DEV_NUM   },
+   {OPT_CREATE_UIO_DEV,0, NULL, OPT_CREATE_UIO_DEV_NUM   },
{OPT_FILE_PREFIX,   1, NULL, OPT_FILE_PREFIX_NUM  },
{OPT_HELP,  0, NULL, OPT_HELP_NUM },
{OPT_HUGE_DIR,  1, NULL, OPT_HUGE_DIR_NUM },
@@ -131,6 +131,7 @@ eal_reset_internal_config(struct internal_config 
*internal_cfg)
internal_cfg->no_hpet = 1;
 #endif
internal_cfg->vmware_tsc_map = 0;
+   internal_cfg->create_uio_dev = 0;
 }

 /*
-- 
1.7.12.4




[dpdk-dev] [PATCH] virtio: Fix stats issue

2015-03-23 Thread Ouyang, Changchun


On 3/23/2015 3:20 PM, David Marchand wrote:
> Hello,
>
> Hello,
>
> On Mon, Mar 23, 2015 at 7:56 AM, Ouyang Changchun 
> mailto:changchun.ouyang at intel.com>> wrote:
>
> It need clear/reset the stats information before count in all
> queues data.
>
> Signed-off-by: Changchun Ouyang  >
> ---
>  lib/librte_pmd_virtio/virtio_ethdev.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c
> b/lib/librte_pmd_virtio/virtio_ethdev.c
> index 603be2d..e4cb55e 100644
> --- a/lib/librte_pmd_virtio/virtio_ethdev.c
> +++ b/lib/librte_pmd_virtio/virtio_ethdev.c
> @@ -572,6 +572,10 @@ virtio_dev_stats_get(struct rte_eth_dev *dev,
> struct rte_eth_stats *stats)
>  {
> unsigned i;
>
> +   stats->opackets = 0;
> +   stats->obytes = 0;
> +   stats->oerrors = 0;
>
>
> stats are supposed to be zero'd in generic rte_ethdev.c before this 
> pmd function is called, so this patch seems useless to me.
> Can you give some context ?
>
> Same comment for the i* part.
>
2 reasons:
1. this change could keep the stats_get has consistent behavior with the 
one in other drivers;
2. we don't rely on the assumption of caller always zero'd the stats, 
and still can return correct value;
thanks
Changchun



[dpdk-dev] tools brainstorming

2015-03-23 Thread Cao, Waterman
On 2015/3/20 22:52, Thomas Monjalon wrote:
> Hi,
>
> As you probably know, a MAINTAINERS file is being filled, which is a great
> help to request patch reviews and discuss design with the knowledgeable people
> of this young DPDK community:
>   http://dpdk.org/browse/dpdk/tree/MAINTAINERS
>
> The next step is to clearly define what are the guidelines to review a patch
> and accept it. So let's write a new document CONTRIBUTING (or another
> capitalized file ;). It will help contributors to do the right checks
> before submitting, and will help reviewers.
>
> As we are lazy developers, writing guidelines is not enough. It must be
> coupled with the integration of some tools. Let's work on these ones:
>   - make autotests easier and faster to run for smoke testing
>   - automated basic testpmd check
>   - build check with various options combinations
>   - abi check (started with validate-abi.sh)
>   - static analyze (clang, free online coverity)
>   - comment check (doxygen, codespell, kerspell)
>   - format check (customized checkpatch)
>
> I'm sure this last item will trigger a lot of debate.
> Actually, format checking can be of two kinds:
>   - commit message formatting (how to write the title, how and when adding
>   Fixes tag, Signed-off-by tag, etc);
>   - coding style might deserve its own document.
>
> At the end, we should be able to pass a "make check" on the whole code and
> a "make checkpatch" before submitting.
> Then the result of these tools could be automatically checked and displayed
> in patchwork or in an adapted version of qemu's patchew. But this is
> obviously a later step.
> When all automatic lights are green and human design review is properly done,
> the patch can be acknowledged by one or many reviewers. Speaking about that,
> it would be helpful to have a column in our patchwork to summarize the counts
> of tests, reviews and acknowledgements.
>
> Comments and contributions are more than welcome!
>
Hi Thomas,

 That's good idea to check patch before merging it into branch.
 We can perform basic test per each patch and improve the quality of
patch.

As you knew, currently Intel DPDK test team maintained automation
test tool to perform build check and smoke test on a lot of mainstream
platforms.
It will a good chance to share these knowledge with whole DPDK
community.

- Daily Build Test
 So far, Intel test team run daily build test on CentOS6.5, Fedora
18/20/21, RedHat 6.5/7.0, SUSE 11 SP2/SP3, Ubuntu 12/14, Oracle Linux
6.4 and FreeBSD 10.
 In addition, we also verified with different compilers, kernels and
DPDK build options.
 Since Our daily build test is focus on master branch and only
monitor latest code changes.
 Maybe we don't need to check so much OS per each patch, just make
quick build check with short list.
 We can share our build script with contributors/maintainer. they
can use it to verify their patch set.

- Automated Smoke Test
   Based on DTS (DPDK test suite), we already built up automated smoke
test on FC16/18/20/21/ , Ubuntu and Redhat. it's composed of unit test
and function test for dpdk sample application.
I think that it's easy to build up automated smoke test based on
patch, we just need to define which test cases should include in the
list, and make sure if it can achieve at shortest time.

- Bug Tracking
During our test cycle, we found some defects in release candidates. 
But it's difficult to track/report them without public bug tool.
It's really helpful to get one formal tool to manage these
information and speed up bug fixing.

In addition, I think that patchwork is a good tool, which provides a
place to show test result for each patch.
But patchwork is focus on patch level, we need to think how to test
latest code branch in package level.
Finally, we are eager to share our experience of validation with DPDK
community.
We would like to contribute tool and script,  and help to improve
quality of DPDK release.

regards

Waterman




[dpdk-dev] [PATCH] librte_hash: Fix crc32 error when complie i686 in x86_64

2015-03-23 Thread Michael Qiu
When compile target i686 in platform x86_64, the stud fuction will
be called, and return zero.

This patch fix this issue.

Signed-off-by: Michael Qiu 
---
 lib/librte_hash/rte_hash_crc.h | 35 ++-
 1 file changed, 6 insertions(+), 29 deletions(-)

diff --git a/lib/librte_hash/rte_hash_crc.h b/lib/librte_hash/rte_hash_crc.h
index 3dcd362..1cd626c 100644
--- a/lib/librte_hash/rte_hash_crc.h
+++ b/lib/librte_hash/rte_hash_crc.h
@@ -366,7 +366,6 @@ crc32c_2words(uint64_t data, uint32_t init_val)
 }

 #if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
-
 static inline uint32_t
 crc32c_sse42_u32(uint32_t data, uint32_t init_val)
 {
@@ -390,27 +389,9 @@ crc32c_sse42_u64_mimic(uint64_t data, uint64_t init_val)
init_val = crc32c_sse42_u32(d.u32[1], init_val);
return init_val;
 }
-
-#else
-
-static inline uint32_t
-crc32c_sse42_u32(__rte_unused uint32_t data,
- __rte_unused uint32_t init_val)
-{
-   return 0;
-}
-
-static inline uint32_t
-crc32c_sse42_u64_mimic(__rte_unused uint32_t data,
-   __rte_unused uint32_t init_val)
-{
-   return 0;
-}
-
 #endif

 #ifdef RTE_ARCH_X86_64
-
 static inline uint32_t
 crc32c_sse42_u64(uint64_t data, uint64_t init_val)
 {
@@ -420,16 +401,6 @@ crc32c_sse42_u64(uint64_t data, uint64_t init_val)
: [data] "rm" (data));
return init_val;
 }
-
-#else
-
-static inline uint32_t
-crc32c_sse42_u64(__rte_unused uint64_t data,
- __rte_unused uint64_t init_val)
-{
-   return 0;
-}
-
 #endif

 #define CRC32_SW(1U << 0)
@@ -489,8 +460,10 @@ rte_hash_crc_init_alg(void)
 static inline uint32_t
 rte_hash_crc_4byte(uint32_t data, uint32_t init_val)
 {
+#if defined RTE_ARCH_I686 || defined RTE_ARCH_X86_64
if (likely(crc32_alg & CRC32_SSE42))
return crc32c_sse42_u32(data, init_val);
+#endif

return crc32c_1word(data, init_val);
 }
@@ -510,11 +483,15 @@ rte_hash_crc_4byte(uint32_t data, uint32_t init_val)
 static inline uint32_t
 rte_hash_crc_8byte(uint64_t data, uint32_t init_val)
 {
+#ifdef RTE_ARCH_X86_64
if (likely(crc32_alg == CRC32_SSE42_x64))
return crc32c_sse42_u64(data, init_val);
+#endif

+#if defined RTE_ARCH_I686 || defined RTE_ARCH_X86_64
if (likely(crc32_alg & CRC32_SSE42))
return crc32c_sse42_u64_mimic(data, init_val);
+#endif

return crc32c_2words(data, init_val);
 }
-- 
1.9.3



[dpdk-dev] Need some info on DPDK

2015-03-23 Thread Shankari Vaidyalingam
Hi

Need some info on:
(1) How to capture packets received on the DPDK interface?
(2) Determine the MAC address corresponding to the PCI interface


Regards
Shankari.V


[dpdk-dev] Need some info on DPDK

2015-03-23 Thread Olivier Deme
It's all in the docs.
http://dpdk.org/doc/guides/prog_guide/libpcap_ring_based_poll_mode_drv.html
http://dpdk.org/doc/api/rte__ethdev_8h.html#a5686df2817980236f2c4f1cc72dd2c30



On 23/03/15 09:00, Shankari Vaidyalingam wrote:
> Hi
>
> Need some info on:
> (1) How to capture packets received on the DPDK interface?
> (2) Determine the MAC address corresponding to the PCI interface
>
>
> Regards
> Shankari.V

-- 
*Olivier Dem?*
*Druid Software Ltd.*
*Tel: +353 1 202 1831*
*Email: odeme at druidsoftware.com *
*URL: http://www.druidsoftware.com*
Druid Software: Monetising enterprise small cells solutions.



[dpdk-dev] [PATCH] virtio: Fix stats issue

2015-03-23 Thread Thomas Monjalon
2015-03-23 16:38, Ouyang, Changchun:
> On 3/23/2015 3:20 PM, David Marchand wrote:
> > On Mon, Mar 23, 2015 at 7:56 AM, Ouyang Changchun 
> > mailto:changchun.ouyang at intel.com>> 
> > wrote:
> >
> > It need clear/reset the stats information before count in all
> > queues data.
> >
> > Signed-off-by: Changchun Ouyang  > >
> > ---
> >  lib/librte_pmd_virtio/virtio_ethdev.c | 8 
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c
> > b/lib/librte_pmd_virtio/virtio_ethdev.c
> > index 603be2d..e4cb55e 100644
> > --- a/lib/librte_pmd_virtio/virtio_ethdev.c
> > +++ b/lib/librte_pmd_virtio/virtio_ethdev.c
> > @@ -572,6 +572,10 @@ virtio_dev_stats_get(struct rte_eth_dev *dev,
> > struct rte_eth_stats *stats)
> >  {
> > unsigned i;
> >
> > +   stats->opackets = 0;
> > +   stats->obytes = 0;
> > +   stats->oerrors = 0;
> >
> >
> > stats are supposed to be zero'd in generic rte_ethdev.c before this 
> > pmd function is called, so this patch seems useless to me.
> > Can you give some context ?
> >
> > Same comment for the i* part.
> >
> 2 reasons:
> 1. this change could keep the stats_get has consistent behavior with the 
> one in other drivers;

If there are some useless reset in other drivers, they should be removed.

> 2. we don't rely on the assumption of caller always zero'd the stats, 
> and still can return correct value;

The caller is ethdev and it is zero'ing the structure before the call:
http://dpdk.org/browse/dpdk/commit/?id=02331c16ec0




[dpdk-dev] Testpmd application failing with Cause: No probed ethernet device with DPDK-1.8.0

2015-03-23 Thread Rapelly, Varun
Hi Bruce,

Thanks for your reply.

I used dpdk_nic_bind.py script to bind igb_uio driver for the specific NIC 
ports. 

[root at SSBC swe]# ./ dpdk_nic_bind.py  --status

Network devices using IGB_UIO driver

:03:00.0 'VMXNET3 Ethernet Controller' drv=igb_uio unused=vmxnet3
:13:00.0 'VMXNET3 Ethernet Controller' drv=igb_uio unused=vmxnet3
:1b:00.0 'VMXNET3 Ethernet Controller' drv=igb_uio unused=vmxnet3

Network devices using kernel driver
===
:0b:00.0 'VMXNET3 Ethernet Controller' if=ha0 drv=vmxnet3 unused=igb_uio

When we ported to DPDK1.7.0, we didn't face this kind of issues.

When I googled, found similar problem. But didn't get any solution from that.
http://patchwork.dpdk.org/ml/archives/dev/2014-February/001437.html

Please give some directions to resolve this issue.

Regards,
Varun
-Original Message-
From: Bruce Richardson [mailto:bruce.richard...@intel.com] 
Sent: Friday, March 20, 2015 7:33 PM
To: Rapelly, Varun
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Testpmd application failing with Cause: No probed 
ethernet device with DPDK-1.8.0

On Fri, Mar 20, 2015 at 07:09:00AM +, Rapelly, Varun wrote:
> 
> Hi All,
> 
> I'm facing the following issue when testing testpmd application with 
> DPDK-1.8.0.
> 
> EAL: PCI device :03:00.0 on NUMA socket -1
> EAL:   probe driver: 15ad:7b0 rte_vmxnet3_pmd
> EAL:   :03:00.0 not managed by UIO driver, skipping
> EAL: PCI device :0b:00.0 on NUMA socket -1
> EAL:   probe driver: 15ad:7b0 rte_vmxnet3_pmd
> EAL:   :0b:00.0 not managed by UIO driver, skipping
> EAL: PCI device :13:00.0 on NUMA socket -1
> EAL:   probe driver: 15ad:7b0 rte_vmxnet3_pmd
> EAL:   :13:00.0 not managed by UIO driver, skipping
> EAL: PCI device :1b:00.0 on NUMA socket -1
> EAL:   probe driver: 15ad:7b0 rte_vmxnet3_pmd
> EAL:   :1b:00.0 not managed by UIO driver, skipping
> EAL: Error - exiting with code: 1
>   Cause: No probed ethernet device
> 
> Please let me know, what could be the issue.
> 
> FYI:
> Igb_uio and rte_kni ko modules are inserted successfully.

After you load the uio driver, you still need to bind some NIC devices to use 
it, otherwise DPDK will ignore those devices as being used by the kernel.
The script "dpdk_nic_bind.py" in the tools directory can help with the binding 
and unbinding to the different drivers.

/Bruce

> 
> Regards,
> Varun
> 


[dpdk-dev] [PATCH] librte_hash: Fix crc32 error when complie i686 in x86_64

2015-03-23 Thread Thomas Monjalon
2015-03-23 16:43, Michael Qiu:
> When compile target i686 in platform x86_64, the stud fuction will
> be called, and return zero.
> 
> This patch fix this issue.
> 
> Signed-off-by: Michael Qiu 

Acked-by: Thomas Monjalon 

Applied, thanks


[dpdk-dev] [PATCH] tools: Fix some strings and functions regarding VFIO support

2015-03-23 Thread Thomas Monjalon
Hi,

Your patch looks good.
Please could you resend it with a Signed-off-by line?
Procedure is described here:
http://dpdk.org/dev#send

Thanks

2015-03-18 16:05, Andre Richter:
> This patch fixes several minor issues in setup.sh:
> 
>  - show_nics() would not display the current Ethernet settings if
>the user only loads the vfio-pci module, b/c it only checks for
>presence of igb_uio. Fix this by adding a check for vfio-pci.
> 
>  - unbind_nics(): Fix option naming and string inside function.
> 
>  - Exchange a forgotten "igb_uio" with "vfio-pci" in a comment.




[dpdk-dev] [PATCH v2] Fix `eventfd_link' module leakages and races

2015-03-23 Thread Thomas Monjalon
Hi Pavel,

You forgot the Signed-off-by line.

Huawei, Changchun, any comment on this patch?

2015-03-18 15:16, Pavel Boldin:
> The `eventfd_link' module provides an API to "steal" fd from another
> process had been written with a bug that leaks `struct file' because
> of the extra reference counter increment and missing `fput' call.
> 
> The other bug is using another process' `task_struct' without incrementing
> a reference counter.

As there are 2 bugs, you should provide 2 patches.

> Fix these bugs and refactor the module.

Why refactoring along with the bug fixes?
You'd have more chances to have your fixes integrated in 2.0 without
refactoring.

Thanks





[dpdk-dev] [PATCH] eal: remove unnecessary #ifdef CONFIG_BQL

2015-03-23 Thread Thomas Monjalon
Helin, any comment about this trivial patch?

2015-03-13 12:21, Alex Gartrell:
> I couldn't figure out why this #ifdef existed so I looked around upstream
> and it's not there.  It seems to build just fine without it.
> 
> Signed-off-by: Alex Gartrell 
> ---
>  lib/librte_eal/linuxapp/kni/ethtool/igb/igb.h | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/kni/ethtool/igb/igb.h 
> b/lib/librte_eal/linuxapp/kni/ethtool/igb/igb.h
> index a582f52..d58e1f3 100644
> --- a/lib/librte_eal/linuxapp/kni/ethtool/igb/igb.h
> +++ b/lib/librte_eal/linuxapp/kni/ethtool/igb/igb.h
> @@ -472,12 +472,10 @@ static inline u16 igb_desc_unused(const struct igb_ring 
> *ring)
>   return ((ntc > ntu) ? 0 : ring->count) + ntc - ntu - 1;
>  }
>  
> -#ifdef CONFIG_BQL
>  static inline struct netdev_queue *txring_txq(const struct igb_ring *tx_ring)
>  {
>   return netdev_get_tx_queue(tx_ring->netdev, tx_ring->queue_index);
>  }
> -#endif /* CONFIG_BQL */
>  
>  // #ifdef EXT_THERMAL_SENSOR_SUPPORT
>  // #ifdef IGB_PROCFS
> 




[dpdk-dev] [PATCH] librte_pmd_bond: remove memory alloc for rte_pci_driver

2015-03-23 Thread Thomas Monjalon
Declan,
Any comment?

2015-03-12 14:38, Jia Yu:
> eth_driver already contains rte_pci_driver data structure.
> Allocating rte_pci_driver without referencing it after bond
> creation causes memory leakage.

Jia, Signed-off is missing.


[dpdk-dev] [PATCH v2] Fix `eventfd_link' module leakages and races

2015-03-23 Thread Pavel Boldin
Hi Thomas,

There is by far more than 2 issues, but these are the only ones that
appeared to us.

The list of the issues that motivated the refactoring:

   1. Only one error code for every fault (-EFAULT).
   2. Copy-paste code from the `fget' function.
   3. Ambiguous variable names. The `files' variable mean different things
   under the same block. This ruins readability of the code.
   4. Overall placing of the all the code inside one `switch/case'. Modern
   kernel code style suggests placing these in the inline functions.
   5. Usage of the copy-pasted `close_fd' function code. (I really should
   use `sys_close' here though).


I can rewrite the refactoring in a set of small patches if this will aid
reviewers.

The dangerous bug that definitely must be fixed before the next release is
the `struct file*' leakage. I can provide a simple patch for this as a
separate submission and continue working on the refactoring patches. Is
this plan OK for you?


Pavel

On Mon, Mar 23, 2015 at 1:15 PM, Thomas Monjalon 
wrote:

> Hi Pavel,
>
> You forgot the Signed-off-by line.
>
> Huawei, Changchun, any comment on this patch?
>
> 2015-03-18 15:16, Pavel Boldin:
> > The `eventfd_link' module provides an API to "steal" fd from another
> > process had been written with a bug that leaks `struct file' because
> > of the extra reference counter increment and missing `fput' call.
> >
> > The other bug is using another process' `task_struct' without
> incrementing
> > a reference counter.
>
> As there are 2 bugs, you should provide 2 patches.
>
> > Fix these bugs and refactor the module.
>
> Why refactoring along with the bug fixes?
> You'd have more chances to have your fixes integrated in 2.0 without
> refactoring.
>
> Thanks
>
>
>
>


[dpdk-dev] [PATCH] app/testpmd: fix potential out of bounds read

2015-03-23 Thread Thomas Monjalon
> After the last enabled port has been seen, and the last time we
> evaluate the loop condition, there is an out of bounds read in
> ports[p].enabled because p is equal to size, which is the length of
> ports.
> 
> Signed-off-by: Julien Cretin 

Acked-by: Thomas Monjalon 

Applied, thanks


[dpdk-dev] [PATCH] librte_pmd_e1000: power down the serdes link

2015-03-23 Thread Thomas Monjalon
There is no maintainer declared for e1000.
Maybe that an Intel expert could check this patch, please?

2015-03-07 11:57, Shelton Chia:
> Signed-off-by: Shelton Chia 
> ---
>  lib/librte_pmd_e1000/igb_ethdev.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_pmd_e1000/igb_ethdev.c 
> b/lib/librte_pmd_e1000/igb_ethdev.c
> index 504ae74..314ef2a 100644
> --- a/lib/librte_pmd_e1000/igb_ethdev.c
> +++ b/lib/librte_pmd_e1000/igb_ethdev.c
> @@ -948,7 +948,10 @@ eth_igb_stop(struct rte_eth_dev *dev)
>   }
>  
>   /* Power down the phy. Needed to make the link go Down */
> - e1000_power_down_phy(hw);
> + if (hw->phy.media_type == e1000_media_type_copper)
> + e1000_power_down_phy(hw);
> + else
> + e1000_shutdown_fiber_serdes_link(hw);
>  
>   igb_dev_clear_queues(dev);
>  
> 




[dpdk-dev] [PATCH v2] Fix `eventfd_link' module leakages and races

2015-03-23 Thread Thomas Monjalon
2015-03-23 13:36, Pavel Boldin:
> Hi Thomas,
> 
> There is by far more than 2 issues, but these are the only ones that
> appeared to us.
> 
> The list of the issues that motivated the refactoring:
> 
>1. Only one error code for every fault (-EFAULT).
>2. Copy-paste code from the `fget' function.
>3. Ambiguous variable names. The `files' variable mean different things
>under the same block. This ruins readability of the code.
>4. Overall placing of the all the code inside one `switch/case'. Modern
>kernel code style suggests placing these in the inline functions.
>5. Usage of the copy-pasted `close_fd' function code. (I really should
>use `sys_close' here though).
> 
> 
> I can rewrite the refactoring in a set of small patches if this will aid
> reviewers.
> 
> The dangerous bug that definitely must be fixed before the next release is
> the `struct file*' leakage. I can provide a simple patch for this as a
> separate submission and continue working on the refactoring patches. Is
> this plan OK for you?

Yes, please separate important fixes and refactoring.
Then we'll be able to merge the fixes in 2.0.

PS: please answer below the question to ease thread reading.

Thanks


> On Mon, Mar 23, 2015 at 1:15 PM, Thomas Monjalon  6wind.com>
> wrote:
> 
> > Hi Pavel,
> >
> > You forgot the Signed-off-by line.
> >
> > Huawei, Changchun, any comment on this patch?
> >
> > 2015-03-18 15:16, Pavel Boldin:
> > > The `eventfd_link' module provides an API to "steal" fd from another
> > > process had been written with a bug that leaks `struct file' because
> > > of the extra reference counter increment and missing `fput' call.
> > >
> > > The other bug is using another process' `task_struct' without
> > incrementing
> > > a reference counter.
> >
> > As there are 2 bugs, you should provide 2 patches.
> >
> > > Fix these bugs and refactor the module.
> >
> > Why refactoring along with the bug fixes?
> > You'd have more chances to have your fixes integrated in 2.0 without
> > refactoring.
> >
> > Thanks




[dpdk-dev] [PATCH] cmdline: fix type format from unsigned to size_t for buffer size

2015-03-23 Thread Thomas Monjalon
Daniel,
Without answer to the question from Olivier, this patch is going to be ignored.

2015-02-24 12:03, Olivier MATZ:
> Hi Daniel,
> 
> On 02/20/2015 05:18 PM, Daniel Mrzyglod wrote:
> > Function match_inst is used to take buffor using sizeof() which is size_t 
> > type.
> > This modification also involved changing '%u' to '%zu' in printf function.
> >
> > Signed-off-by: Daniel Mrzyglod 
> > ---
> >   lib/librte_cmdline/cmdline_parse.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_cmdline/cmdline_parse.c 
> > b/lib/librte_cmdline/cmdline_parse.c
> > index dfc885c..0821791 100644
> > --- a/lib/librte_cmdline/cmdline_parse.c
> > +++ b/lib/librte_cmdline/cmdline_parse.c
> > @@ -138,7 +138,7 @@ nb_common_chars(const char * s1, const char * s2)
> >*/
> >   static int
> >   match_inst(cmdline_parse_inst_t *inst, const char *buf,
> > -  unsigned int nb_match_token, void *resbuf, unsigned resbuf_size)
> > +  unsigned int nb_match_token, void *resbuf, size_t resbuf_size)
> >   {
> > unsigned int token_num=0;
> > cmdline_parse_token_hdr_t * token_p;
> > @@ -169,7 +169,7 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
> >
> > if (token_hdr.offset > resbuf_size) {
> > printf("Parse error(%s:%d): Token offset(%u) "
> > -   "exceeds maximum size(%u)\n",
> > +   "exceeds maximum size(%zu)\n",
> > __FILE__, __LINE__,
> > token_hdr.offset, resbuf_size);
> > return -ENOBUFS;
> >
> 
> 
> Did you see a specific issue with the current code? (maybe a compilation
> issue or a klocwork issue?)
> 
> I think this patch is ok, but there are many places where this kind
> of fixes should be applied in cmdline (cmdline_parse_*(),
> cmdline_get_help_*(), etc.). Is there a motivation for changing it
> only there?
> 
> Regards,
> Olivier
> 




[dpdk-dev] [PATCH] tools: Fix some strings and functions regarding VFIO support

2015-03-23 Thread Andre Richter
Ah sorry, I wonder where that went missing.
I'll resend.

Thomas Monjalon  schrieb am Mo., 23. M?rz 2015
um 12:10 Uhr:

> Hi,
>
> Your patch looks good.
> Please could you resend it with a Signed-off-by line?
> Procedure is described here:
> http://dpdk.org/dev#send
>
> Thanks
>
> 2015-03-18 16:05, Andre Richter:
> > This patch fixes several minor issues in setup.sh:
> >
> >  - show_nics() would not display the current Ethernet settings if
> >the user only loads the vfio-pci module, b/c it only checks for
> >presence of igb_uio. Fix this by adding a check for vfio-pci.
> >
> >  - unbind_nics(): Fix option naming and string inside function.
> >
> >  - Exchange a forgotten "igb_uio" with "vfio-pci" in a comment.
>
>
>


[dpdk-dev] [PATCH] scripts: enable extended tag of PCIe

2015-03-23 Thread Thomas Monjalon
Hi,

This patch needs review and documentation.
It's going to be dropped if nobody cares.

There were some previous discussions about it:
http://dpdk.org/ml/archives/dev/2015-February/012708.html


2015-01-30 12:57, zhida zang:
> As 'extended tag' of PCIe needs to be enabled for i40e high performance,
> Linux command of 'setpci' can be used to check and set the corresponding 
> bit of 'extended tag' of PCIe configuration space. The script is to check
> and set the right bit in PCIe configuration space to enable 'extended tag'.
> 
> Signed-off-by: Zhida Zang 
> ---
>  tools/set_pci.py | 124 
> +++
>  1 file changed, 124 insertions(+)
>  create mode 100755 tools/set_pci.py
> 
> diff --git a/tools/set_pci.py b/tools/set_pci.py
> new file mode 100755
> index 000..e242efb
> --- /dev/null
> +++ b/tools/set_pci.py
> @@ -0,0 +1,124 @@
> +#! /usr/bin/python
> +import sys
> +import os
> +import subprocess
> +import getopt
> +from os.path import basename
> +
> +# The register to check if extended tag is supported or not.
> +PCI_DEV_CAP_REG = 0xA4
> +# The control register which contains the bit to enable/disable 'extended 
> tag'.
> +PCI_DEV_CTRL_REG = 0xA8
> +# The mask of 'extended tag' in capability register.
> +PCI_DEV_CAP_EXT_TAG_MASK = 0x20
> +# The mask of 'extended tag' in control register.
> +PCI_DEV_CTRL_EXT_TAG_MASK = 0x100
> +
> +dev_ids = {}
> +flag = "Set"
> +
> +
> +def usage():
> +'''Print usage information for the program'''
> +argv0 = basename(sys.argv[0])
> +print """
> +Usage:
> +--
> +
> +%(argv0)s [options] DEVICE1 DEVICE2 
> +
> +where DEVICE1, DEVICE2 etc, are specified via PCI "domain:bus:slot.func" 
> syntax
> +or "bus:slot.func" syntax. For devices bound to Linux kernel drivers, they 
> may
> +also be referred to by Linux interface name e.g. eth0, eth1, em0, em1, etc.
> +
> +Options:
> +--help, --usage:
> +Display usage information and quit
> +
> +-s --set:
> +Set the following pci device
> +
> +-u --Unset:
> +Unset the following pci device
> +
> +Examples:
> +-
> +To set pci 0a:00.0
> +%(argv0)s -s 0a:00.0
> +%(argv0)s --set 0a:00.0
> +
> +To unset :01:00.0
> +%(argv0)s -u :01:00.0
> +%(argv0)s --unset :01:00.0
> +
> +To set :02:00.0 and :02:00.1
> +%(argv0)s -s 02:00.0 02:00.1
> +
> +""" % locals()  # replace items from local variables
> +
> +
> +def parse_args():
> +global flag
> +global dev_ids
> +if len(sys.argv) <= 1:
> +usage()
> +sys.exit(0)
> +try:
> +opts, dev_ids = getopt.getopt(
> +sys.argv[1:],
> +"su",
> +["help", "usage", "set", "unset"]
> +)
> +except getopt.GetoptError, error:
> +print str(error)
> +print "Run '%s --usage' for further information" % sys.argv[0]
> +sys.exit(1)
> +
> +for opt, arg in opts:
> +if opt == "--help" or opt == "--usage":
> +usage()
> +sys.exit(0)
> +if opt == "-s" or opt == "--set":
> +flag = "Set"
> +if opt == "-u" or opt == "--unset":
> +flag = "Unset"
> +
> +
> +def check_output(args, stderr=None):
> +'''Run a command and capture its output'''
> +return subprocess.Popen(
> +args,
> +stdout=subprocess.PIPE,
> +stderr=stderr
> +).communicate()[0]
> +
> +
> +def set_pci():
> +if len(dev_ids) == 0:
> +print "Error: No devices specified."
> +print "Run '%s --usage' for further information" % sys.argv[0]
> +sys.exit(1)
> +param_cap = "%x.W" % PCI_DEV_CAP_REG
> +for k in range(len(dev_ids)):
> +val = check_output(["setpci", "-s", dev_ids[k], param_cap])
> +if (not (int(val, 16) & PCI_DEV_CAP_EXT_TAG_MASK)):
> +print dev_ids[k], "Not supported"
> +continue
> +if (int(val, 16) & PCI_DEV_CTRL_EXT_TAG_MASK):
> +continue
> +param_ctrl = "%x.W" % PCI_DEV_CTRL_REG
> +val = check_output(["setpci", "-s", dev_ids[k], param_ctrl])
> +if flag == "Set":
> +val = int(val, 16) | PCI_DEV_CTRL_EXT_TAG_MASK
> +else:
> +val = int(val, 16) & ~PCI_DEV_CTRL_EXT_TAG_MASK
> +param_ctrl = "%x.W=%x" % (PCI_DEV_CTRL_REG, val)
> +check_output(["setpci", "-s", dev_ids[k], param_ctrl])
> +
> +
> +def main():
> +parse_args()
> +set_pci()
> +
> +if __name__ == "__main__":
> +main()
> 




[dpdk-dev] dpdk kni ping answer

2015-03-23 Thread Yaron Illouz
Hi 



I want to give an ip to a dpdk port, because I need to receive traffic
by gre replication.

I though I could do it with dpdk kni
(http://dpdk.org/doc/guides/prog_guide/kernel_nic_interface.html)



I am able to give an ip to interface vEth0, but I can't ping or ssh to
the ip address.

I set the ip with the following command: ifconfig vEth0 192.168.69.4
netmask 255.255.255.0 up



Is this something possible to do with dpdk kni?

If not, what could be another solution for this.





Regards





[dpdk-dev] [PATCH 1/1] tools: Fix some strings and functions regarding VFIO support

2015-03-23 Thread Andre Richter
This patch fixes several minor issues in setup.sh:

 - show_nics() would not display the current Ethernet settings if
   the user only loads the vfio-pci module, b/c it only checks for
   presence of igb_uio. Fix this by adding a check for vfio-pci.

 - unbind_nics(): Fix option naming and string inside function.

 - Exchange a forgotten "igb_uio" with "vfio-pci" in a comment.

Signed-off-by: Andre Richter 
---
 tools/setup.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/setup.sh b/tools/setup.sh
index ac438c2..5a8b2f3 100755
--- a/tools/setup.sh
+++ b/tools/setup.sh
@@ -427,16 +427,16 @@ grep_meminfo()
 #
 show_nics()
 {
-   if  /sbin/lsmod  | grep -q igb_uio ; then
+   if  /sbin/lsmod | grep -q -e igb_uio -e vfio_pci; then
${RTE_SDK}/tools/dpdk_nic_bind.py --status
else
-   echo "# Please load the 'igb_uio' kernel module before querying 
or "
-   echo "# adjusting NIC device bindings"
+   echo "# Please load the 'igb_uio' or 'vfio-pci' kernel module 
before "
+   echo "# querying or adjusting NIC device bindings"
fi
 }

 #
-# Uses dpdk_nic_bind.py to move devices to work with igb_uio
+# Uses dpdk_nic_bind.py to move devices to work with vfio-pci
 #
 bind_nics_to_vfio()
 {
@@ -477,7 +477,7 @@ unbind_nics()
 {
${RTE_SDK}/tools/dpdk_nic_bind.py --status
echo ""
-   echo -n "Enter PCI address of device to bind to IGB UIO driver: "
+   echo -n "Enter PCI address of device to unbind: "
read PCI_PATH
echo ""
echo -n "Enter name of kernel driver to bind the device to: "
@@ -574,7 +574,7 @@ step5_func()
TEXT[1]="Uninstall all targets"
FUNC[1]="uninstall_targets"

-   TEXT[2]="Unbind NICs from IGB UIO driver"
+   TEXT[2]="Unbind NICs from IGB UIO or VFIO driver"
FUNC[2]="unbind_nics"

TEXT[3]="Remove IGB UIO module"
-- 
1.9.1



[dpdk-dev] [PATCH] cast used->idx to volatile

2015-03-23 Thread Linhaifeng


On 2015/3/21 16:07, linhaifeng wrote:
> From: Linhaifeng 
> 
> Same as rte_vhost_enqueue_burst we should cast used->idx
> to volatile before notify guest.
> 
> Signed-off-by: Linhaifeng 
> ---
>  lib/librte_vhost/vhost_rxtx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 535c7a1..8d674d1 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -722,7 +722,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t 
> queue_id,
>   }
>  
>   rte_compiler_barrier();
> - vq->used->idx += entry_success;
> + *(volatile uint16_t *)&vq->used->idx += entry_success;
>   /* Kick guest if required. */
>   if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
>   eventfd_write((int)vq->callfd, 1);
> 

-- 
Regards,
Haifeng



[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'

2015-03-23 Thread Pavel Boldin
Due to increased `struct file's reference counter subsequent call
to `filp_close' does not free the `struct file'. Prepend `fput' call
to decrease the reference counter.

Signed-off-by: Pavel Boldin 
---
 lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c 
b/lib/librte_vhost/eventfd_link/eventfd_link.c
index 7755dd6..62c45c8 100644
--- a/lib/librte_vhost/eventfd_link/eventfd_link.c
+++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
@@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int ioctl, 
unsigned long arg)
 * Release the existing eventfd in the source process
 */
spin_lock(&files->file_lock);
+   fput(file);
filp_close(file, files);
fdt = files_fdtable(files);
fdt->fd[eventfd_copy.source_fd] = NULL;
-- 
1.9.1



[dpdk-dev] [PATCH] cast used->idx to volatile

2015-03-23 Thread Xie, Huawei


> -Original Message-
> From: Linhaifeng [mailto:haifeng.lin at huawei.com]
> Sent: Monday, March 23, 2015 8:24 PM
> To: dev at dpdk.org
> Cc: Ouyang, Changchun; Xie, Huawei
> Subject: Re: [dpdk-dev] [PATCH] cast used->idx to volatile
> 
> 
> 
> On 2015/3/21 16:07, linhaifeng wrote:
> > From: Linhaifeng 
> >
> > Same as rte_vhost_enqueue_burst we should cast used->idx
> > to volatile before notify guest.
> >
> > Signed-off-by: Linhaifeng 
> > ---
> >  lib/librte_vhost/vhost_rxtx.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> > index 535c7a1..8d674d1 100644
> > --- a/lib/librte_vhost/vhost_rxtx.c
> > +++ b/lib/librte_vhost/vhost_rxtx.c
> > @@ -722,7 +722,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev,
> uint16_t queue_id,
> > }
> >
> > rte_compiler_barrier();
> > -   vq->used->idx += entry_success;
> > +   *(volatile uint16_t *)&vq->used->idx += entry_success;


Haifeng:
We have compiler barrier before and an external function call behind, so we 
don't need volatile  here.
Do you meet issue?

> > /* Kick guest if required. */
> > if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
> > eventfd_write((int)vq->callfd, 1);
> >
> 
> --
> Regards,
> Haifeng



[dpdk-dev] [PATCH] dpdk: fix a crash during rte_table_hash_key16_ext overload

2015-03-23 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of
> miroslaw.walukiewicz at intel.com
> Sent: Tuesday, March 3, 2015 2:16 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] dpdk: fix a crash during
> rte_table_hash_key16_ext overload
> 
> From: Miroslaw Walukiewicz 
> 
> The hash_key16_ext table allocates a cache entries to support
> table overload cases.
> 
> The crash can occur when cache entry is free after use. The problem
> is with computing the index of the free cache entry.
> 
> The patch fixes a problem.
> 
> Signed-off-by: Mirek Walukiewicz 
> ---
>  lib/librte_table/rte_table_hash_key16.c |5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_table/rte_table_hash_key16.c
> b/lib/librte_table/rte_table_hash_key16.c
> index ee5f639..e0c99bd 100644
> --- a/lib/librte_table/rte_table_hash_key16.c
> +++ b/lib/librte_table/rte_table_hash_key16.c
> @@ -535,9 +535,8 @@ rte_table_hash_entry_delete_key16_ext(
> 
>   memset(bucket, 0,
>   sizeof(struct
> rte_bucket_4_16));
> - bucket_index = (bucket -
> - ((struct rte_bucket_4_16 *)
> - f->memory)) - f->n_buckets;
> + bucket_index = (((uint8_t *)bucket -
> + (uint8_t *)f->memory)/f-
> >bucket_size) - f->n_buckets;
>   f->stack[f->stack_pos++] =
> bucket_index;
>   }
> 

Acked by: Cristian Dumitrescu 

Mirek, identical issue is found at identical place in rte_table_hash_key8.c and 
rte_table_hash_key32.c, would you please submit the same fix for those as well?

Thanks,
Cristian



[dpdk-dev] [PATCH 1/1] tools: Fix some strings and functions regarding VFIO support

2015-03-23 Thread Thomas Monjalon
> This patch fixes several minor issues in setup.sh:
> 
>  - show_nics() would not display the current Ethernet settings if
>the user only loads the vfio-pci module, b/c it only checks for
>presence of igb_uio. Fix this by adding a check for vfio-pci.
> 
>  - unbind_nics(): Fix option naming and string inside function.
> 
>  - Exchange a forgotten "igb_uio" with "vfio-pci" in a comment.
> 
> Signed-off-by: Andre Richter 

Applied, thanks


[dpdk-dev] [PATCH] testpmd: Fix wrong message when no port started

2015-03-23 Thread Thomas Monjalon
Pablo, what is your opinion on this patch?

2015-02-03 16:37, Michael Qiu:
> The log message is wrong when no port started.
> 
> Signed-off-by: Michael Qiu 
> ---
>  app/test-pmd/testpmd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 773b8af..ebf9448 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -1423,7 +1423,7 @@ start_port(portid_t pid)
>   if (need_check_link_status && !no_link_check)
>   check_all_ports_link_status(nb_ports, RTE_PORT_ALL);
>   else
> - printf("Please stop the ports first\n");
> + printf("Please start at least one port first\n");

Why the word "first"?
What could lead to this situation? Wrong pid?
Shouldn't be an error returned?

>  
>   printf("Done\n");
>   return 0;




[dpdk-dev] [PATCH] kni/ethtool/ixgbe: enforce access between ixgbe PCI and CPU

2015-03-23 Thread Thomas Monjalon
Helin, any opinion?
If nothing wrong is seen, it will be merged in few days.

2015-02-20 11:55, Thomas Monjalon:
> Anyone to review this patch?
> 
> 2015-02-11 14:49, xuelin.shi at freescale.com:
> > From: Xuelin Shi 
> > 
> > make sure:
> > CPU read from ixgbe with IXGBE_LE32_TO_CPUS
> > CPU write to ixgbe with IXGBE_CPU_TO_LE32
> > 
> > otherwise, there is endian issue for ixgbe on BIG_ENDIAN CPU.
> > 
> > Signed-off-by: Xuelin Shi 
> > ---
> >  .../linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h   | 24 
> > --
> >  1 file changed, 18 insertions(+), 6 deletions(-)
> > 
> > diff --git a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h 
> > b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h
> > index d161600..0612632 100644
> > --- a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h
> > +++ b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h
> > @@ -53,6 +53,16 @@
> >  
> >  #undef ASSERT
> >  
> > +static inline uint32_t ixgbe_read_addr(volatile void* addr)
> > +{
> > +   return IXGBE_LE32_TO_CPUS(*((volatile uint32_t *)addr));
> > +}
> > +
> > +static inline uint32_t ixgbe_write_addr(u32 value, volatile void* addr)
> > +{
> > +   return writel(IXGBE_CPU_TO_LE32(value), addr);
> > +}
> > +
> >  #ifdef DBG
> >  #define hw_dbg(hw, S, A...)printk(KERN_DEBUG S, ## A)
> >  #else
> > @@ -91,19 +101,20 @@
> > default: \
> > break; \
> > } \
> > -   writel((value), ((a)->hw_addr + (reg))); \
> > +   ixgbe_write_addr((value), ((a)->hw_addr + (reg))); \
> >  } while (0)
> >  #else
> > -#define IXGBE_WRITE_REG(a, reg, value) writel((value), ((a)->hw_addr + 
> > (reg)))
> > +#define IXGBE_WRITE_REG(a, reg, value) \
> > +   ixgbe_write_addr((value), ((a)->hw_addr + (reg)))
> >  #endif
> >  
> > -#define IXGBE_READ_REG(a, reg) readl((a)->hw_addr + (reg))
> > +#define IXGBE_READ_REG(a, reg) ixgbe_read_addr((a)->hw_addr + (reg))
> >  
> >  #define IXGBE_WRITE_REG_ARRAY(a, reg, offset, value) ( \
> > -   writel((value), ((a)->hw_addr + (reg) + ((offset) << 2
> > +   ixgbe_write_addr((value), ((a)->hw_addr + (reg) + ((offset) << 2
> >  
> >  #define IXGBE_READ_REG_ARRAY(a, reg, offset) ( \
> > -   readl((a)->hw_addr + (reg) + ((offset) << 2)))
> > +   ixgbe_read_addr((a)->hw_addr + (reg) + ((offset) << 2)))
> >  
> >  #ifndef writeq
> >  #define writeq(val, addr)  do { writel((u32) (val), addr); \
> > @@ -111,7 +122,8 @@
> > } while (0);
> >  #endif
> >  
> > -#define IXGBE_WRITE_REG64(a, reg, value) writeq((value), ((a)->hw_addr + 
> > (reg)))
> > +#define IXGBE_WRITE_REG64(a, reg, value) \
> > +   writeq((cpu_to_le64(value)), ((a)->hw_addr + (reg)))
> >  
> >  #define IXGBE_WRITE_FLUSH(a) IXGBE_READ_REG(a, IXGBE_STATUS)
> >  struct ixgbe_hw;
> > 
> 
> 




[dpdk-dev] [PATCH] ixgbe: fix data access on big endian cpu

2015-03-23 Thread Thomas Monjalon
2015-03-03 16:27, xuelin.shi at freescale.com:
> From: Xuelin Shi 
> 
> enforce rules for cpu and ixgbe exchanging data.
> 1. cpu use data owned by ixgbe must use rte_le_to_cpu_xx(...)
> 2. cpu fill data to ixgbe must use rte_cpu_to_le_xx(...)
> 
> Signed-off-by: Xuelin Shi 

Please Xuelin, could you rebase on HEAD and fix these checkpatch errors?

ERROR:SPACING: space prohibited after that '!' (ctx:BxW)

ERROR:CODE_INDENT: code indent should use tabs where possible
+^I^I ^I   ^I  IXGBE_RXDADV_STAT_DD)) {$

Thanks


[dpdk-dev] [PATCH] librte_lpm: define tbl entry reversely for big endian

2015-03-23 Thread Thomas Monjalon
2015-03-09 14:02, Bruce Richardson:
> On Wed, Mar 04, 2015 at 02:34:12PM +0800, xuelin.shi at freescale.com wrote:
> > From: Xuelin Shi 
> > 
> > This module uses type conversion between struct and int.
> > Also truncation and comparison is used with this int.
> > It is not safe for different endian arch.
> > 
> > Add ifdef for big endian struct to fix this issue.
> > 
> > Signed-off-by: Xuelin Shi 
> 
> Get an error compiling this up (using clang on FreeBSD).
> 
>   CC rte_lpm.o
>   In file included from /usr/home/bruce/dpdk.org/lib/librte_lpm/rte_lpm.c:57:
>   /usr/home/bruce/dpdk.org/lib/librte_lpm/rte_lpm.h:99:5: fatal error: 
> 'RTE_BYTE_ORDER' is not defined, evaluates to 0 [-Wundef]
> #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
>   ^
>   1 error generated.
> 
> Adding "#include " should fix the issue.

Please Xuelin, could you submit a v2?
Thanks

> Existing unit tests on IA (little endian) pass fine there-after, but I think 
> for
> this patch it would be good to have an ack from someone who can validate on
> a big endian system, since this is what this patch is meant to enable.
> 
> /Bruce
> 




[dpdk-dev] [PATCH] ixgbe: do not include CRC in Tx byte count

2015-03-23 Thread Thomas Monjalon
Konstantin, Helin,
your review would be appreciate here.
Thanks

> From: Stephen Hemminger 
> 
> The ixgbe driver was including CRC in the transmit packet byte
> count, but not for packets received. This was notice when forwarding and
> the number of bytes received was greater than the number of bytes transmitted
> for the same number of packets. Make the driver behave like other
> virtual devices and not include CRC in byte count. Use the same queue
> counters already computed and used for Rx.
> 
> Signed-off-by: Stephen Hemminger 



[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'

2015-03-23 Thread Xie, Huawei
On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> Due to increased `struct file's reference counter subsequent call
> to `filp_close' does not free the `struct file'. Prepend `fput' call
> to decrease the reference counter.
>
> Signed-off-by: Pavel Boldin 
> ---
>  lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c 
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> index 7755dd6..62c45c8 100644
> --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int ioctl, 
> unsigned long arg)
>* Release the existing eventfd in the source process
>*/
>   spin_lock(&files->file_lock);
> + fput(file);
Could we just call atomic_long_dec here?
>   filp_close(file, files);
>   fdt = files_fdtable(files);
>   fdt->fd[eventfd_copy.source_fd] = NULL;



[dpdk-dev] Packet data out of bounds after rte_eth_rx_burst

2015-03-23 Thread Dor Green
I'm running a small app which captures packets on a single lcore and
then passes it to other workers for processing.

Before even sending it to processing, when checking some minor
information in the packet mbuf's data I get a segfault.

This code, for example gets a segfault:

struct rte_mbuf *pkts[PKTS_BURST_SIZE];

for (p = 0; p < portnb; ++p) {
nbrx = rte_eth_rx_burst(p, 0, pkts, PKTS_BURST_SIZE);

if (unlikely(nbrx == 0)) {
continue;
}

for (i = 0; likely(i < nbrx); i++) {
printf("Pkt %c\n", pkts[i]->pkt->data[0]);
rte_mempool_put(pktmbuf_pool, (void *const)pkts[i]);
}
}

This doesn't happen on most packets, but when I used packets from a
certain cap it happened often (SSL traffic). In gdb the packet objects
looked like this:
{next = 0x0, data = 0x62132136406a6f6, data_len = 263, nb_segs = 1
'\001', in_port = 0 '\000', pkt_len = 263, vlan_macip = {data = 55111,
f = {l3_len = 327, l2_len = 107, vlan_tci = 0}}, hash = {
rss = 311317915, fdir = {hash = 21915, id = 4750}, sched =
311317915}} (Invalid)
 {next = 0x0, data = 0x7ffe43d8f640, data_len = 73, nb_segs = 1
'\001', in_port = 0 '\000', pkt_len = 73, vlan_macip = {data = 0, f =
{l3_len = 0, l2_len = 0, vlan_tci = 0}}, hash = {rss = 311317915,
fdir = {hash = 21915, id = 4750}, sched = 311317915}} (Valid)
{next = 0x0, data = 0x7ffe43d7fa40, data_len = 74, nb_segs = 1 '\001',
in_port = 0 '\000', pkt_len = 74, vlan_macip = {data = 0, f = {l3_len
= 0, l2_len = 0, vlan_tci = 0}}, hash = {rss = 311317915,
fdir = {hash = 21915, id = 4750}, sched = 311317915}} (Valid)
{next = 0x0, data = 0x7ffe43d7ff80, data_len = 66, nb_segs = 1 '\001',
in_port = 0 '\000', pkt_len = 66, vlan_macip = {data = 0, f = {l3_len
= 0, l2_len = 0, vlan_tci = 0}}, hash = {rss = 311317915,
fdir = {hash = 21915, id = 4750}, sched = 311317915}} (Valid)
{next = 0x0, data = 0x28153a8e63b3afc4, data_len = 263, nb_segs = 1
'\001', in_port = 0 '\000', pkt_len = 263, vlan_macip = {data = 59535,
f = {l3_len = 143, l2_len = 116, vlan_tci = 0}}, hash = {
rss = 311317915, fdir = {hash = 21915, id = 4750}, sched =
311317915}} (Invalid)

Note that in the first packet, the length does not match the actual
packet length (it does in the last though). The rest of the packets
are placed in the hugemem range as they should be.

I'm running on Linux 3.2.0-77, the NIC is "10G 2P X520", I have 4 1GB
huge pages.

Any ideas will be appreciated.


[dpdk-dev] Rx/Tx callbacks in debug mode

2015-03-23 Thread Thomas Monjalon
Hi,

As you may know, rte_eth_rx_burst() and rte_eth_tx_burst() have a
debug-specific implementation enabled with RTE_LIBRTE_ETHDEV_DEBUG.
I'm afraid these implementations have been forgotten when adding
optional Rx/Tx callbacks. Or is it intended?

What do you think of removing these debug functions and insert extra
checks with ifdef in an unique function?


[dpdk-dev] [PATCH] virtio: Fix crash issue for secondary process

2015-03-23 Thread Xie, Huawei
On 3/21/2015 5:59 AM, Thomas Monjalon wrote:

2015-03-19 09:45, Ouyang Changchun:


It definitely needs Rx function even in the case of secondary process, so put
the assignment a bit earlier to make sure of it.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c 
b/lib/librte_pmd_virtio/virtio_ethdev.c
index 603be2d..ad24cf2 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1113,6 +1113,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr));

eth_dev->dev_ops = &virtio_eth_dev_ops;
+   eth_dev->rx_pkt_burst = &virtio_recv_pkts;
eth_dev->tx_pkt_burst = &virtio_xmit_pkts;

if (rte_eal_process_type() == RTE_PROC_SECONDARY)
@@ -1148,10 +1149,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
eth_dev->rx_pkt_burst = &virtio_recv_mergeable_pkts;



Why the mergeable buffers case is not handled for secondary processes?

Forgot to CC my comment to dpdk.org.
There are many parts of eth_dev for the secondary process still

uninitialized, even like eth_dev->data->mac_addrs isn't allocated., also

like mergeable, feature, etc.

Secondary process will not work unless they never touch those fields.



Prefer we have a clean fix. Customer could apply that one line of fix if

they need this urgently.

I am wondering whether other PMDS have the same issue for the second process.





hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);
-   } else {
-   eth_dev->rx_pkt_burst = &virtio_recv_pkts;
+   } else
hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr);
-   }

/* Copy the permanent MAC address to: virtio_hw */
virtio_get_hwaddr(hw);









[dpdk-dev] [PATCH] librte_eal: Allow combining --no-huge with -m XXX

2015-03-23 Thread Simon Kagstrom
Useful to run applications in usermode via a test driver.

Signed-off-by: Simon Kagstrom 
---
Not sure if there are other implications of this, so please check!

 lib/librte_eal/common/eal_common_options.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_options.c 
b/lib/librte_eal/common/eal_common_options.c
index 4319549..af865f5 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -850,7 +850,7 @@ eal_check_common_options(struct internal_config 
*internal_cfg)
return -1;
}
if (internal_cfg->no_hugetlbfs &&
-   (mem_parsed || internal_cfg->force_sockets == 1)) {
+   (internal_cfg->force_sockets == 1)) {
RTE_LOG(ERR, EAL, "Options -m or --"OPT_SOCKET_MEM" cannot "
"be specified together with --"OPT_NO_HUGE"\n");
return -1;
-- 
1.9.1



[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'

2015-03-23 Thread Pavel Boldin
On Mon, Mar 23, 2015 at 4:21 PM, Xie, Huawei  wrote:

> On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> > Due to increased `struct file's reference counter subsequent call
> > to `filp_close' does not free the `struct file'. Prepend `fput' call
> > to decrease the reference counter.
> >
> > Signed-off-by: Pavel Boldin 
> > ---
> >  lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > index 7755dd6..62c45c8 100644
> > --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> > +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int
> ioctl, unsigned long arg)
> >* Release the existing eventfd in the source process
> >*/
> >   spin_lock(&files->file_lock);
> > + fput(file);
> Could we just call atomic_long_dec here?
>

We can but I don't like breaking encapsulation (which is broken anyway by
the code). So, there is a special method and we should use it in my opinion.

Pavel

>   filp_close(file, files);
> >   fdt = files_fdtable(files);
> >   fdt->fd[eventfd_copy.source_fd] = NULL;
>
>


[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'

2015-03-23 Thread Xie, Huawei
On 3/23/2015 10:37 PM, Pavel Boldin wrote:


On Mon, Mar 23, 2015 at 4:21 PM, Xie, Huawei mailto:huawei.xie at intel.com>> wrote:
On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> Due to increased `struct file's reference counter subsequent call
> to `filp_close' does not free the `struct file'. Prepend `fput' call
> to decrease the reference counter.
>
> Signed-off-by: Pavel Boldin mailto:pboldin at 
> mirantis.com>>
> ---
>  lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c 
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> index 7755dd6..62c45c8 100644
> --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int ioctl, 
> unsigned long arg)
>* Release the existing eventfd in the source process
>*/
>   spin_lock(&files->file_lock);
> + fput(file);
Could we just call atomic_long_dec here?

We can but I don't like breaking encapsulation (which is broken anyway by the 
code). So, there is a special method and we should use it in my opinion.
it is increased by atomic_long_inc_not_zero so why don't we use the symmetric 
function?


Pavel

>   filp_close(file, files);
>   fdt = files_fdtable(files);
>   fdt->fd[eventfd_copy.source_fd] = NULL;





[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'

2015-03-23 Thread Pavel Boldin
On Mon, Mar 23, 2015 at 4:41 PM, Xie, Huawei  wrote:

> On 3/23/2015 10:37 PM, Pavel Boldin wrote:
>
>
> On Mon, Mar 23, 2015 at 4:21 PM, Xie, Huawei  huawei.xie at intel.com>> wrote:
> On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> > Due to increased `struct file's reference counter subsequent call
> > to `filp_close' does not free the `struct file'. Prepend `fput' call
> > to decrease the reference counter.
> >
> > Signed-off-by: Pavel Boldin  pboldin at mirantis.com>>
> > ---
> >  lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > index 7755dd6..62c45c8 100644
> > --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> > +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int
> ioctl, unsigned long arg)
> >* Release the existing eventfd in the source process
> >*/
> >   spin_lock(&files->file_lock);
> > + fput(file);
> Could we just call atomic_long_dec here?
>
> We can but I don't like breaking encapsulation (which is broken anyway by
> the code). So, there is a special method and we should use it in my opinion.
> it is increased by atomic_long_inc_not_zero so why don't we use the
> symmetric function?
>
The code with `atomic_long_inc_not_zero' call is a copy-paste of the `fget'
function. If we want to make it clear we should make a separate function
and name it so: `fget_from_files'.

Second thing is: another thread of the same processor can call the
`sys_close' on the `fd' and this will dereference counter so `fput' will
correctly free the `struct file'. Using `atomic_long_dec' will leak a
`struct file' and print a KERN_ERR message by `filp_close'.

So, the common thing is to use appropriate functions and don't reinvent the
wheel.

Pavel


>
>
> Pavel
>
> >   filp_close(file, files);
> >   fdt = files_fdtable(files);
> >   fdt->fd[eventfd_copy.source_fd] = NULL;
>
>
>
>


[dpdk-dev] Packet data out of bounds after rte_eth_rx_burst

2015-03-23 Thread Bruce Richardson
On Mon, Mar 23, 2015 at 04:24:18PM +0200, Dor Green wrote:
> I'm running a small app which captures packets on a single lcore and
> then passes it to other workers for processing.
> 
> Before even sending it to processing, when checking some minor
> information in the packet mbuf's data I get a segfault.
> 
> This code, for example gets a segfault:
> 
> struct rte_mbuf *pkts[PKTS_BURST_SIZE];
> 
> for (p = 0; p < portnb; ++p) {
> nbrx = rte_eth_rx_burst(p, 0, pkts, PKTS_BURST_SIZE);
> 
> if (unlikely(nbrx == 0)) {
> continue;
> }
> 
> for (i = 0; likely(i < nbrx); i++) {
> printf("Pkt %c\n", pkts[i]->pkt->data[0]);
> rte_mempool_put(pktmbuf_pool, (void *const)pkts[i]);
> }
> }
> 
> This doesn't happen on most packets, but when I used packets from a
> certain cap it happened often (SSL traffic). In gdb the packet objects
> looked like this:
> {next = 0x0, data = 0x62132136406a6f6, data_len = 263, nb_segs = 1
> '\001', in_port = 0 '\000', pkt_len = 263, vlan_macip = {data = 55111,
> f = {l3_len = 327, l2_len = 107, vlan_tci = 0}}, hash = {
> rss = 311317915, fdir = {hash = 21915, id = 4750}, sched =
> 311317915}} (Invalid)
>  {next = 0x0, data = 0x7ffe43d8f640, data_len = 73, nb_segs = 1
> '\001', in_port = 0 '\000', pkt_len = 73, vlan_macip = {data = 0, f =
> {l3_len = 0, l2_len = 0, vlan_tci = 0}}, hash = {rss = 311317915,
> fdir = {hash = 21915, id = 4750}, sched = 311317915}} (Valid)
> {next = 0x0, data = 0x7ffe43d7fa40, data_len = 74, nb_segs = 1 '\001',
> in_port = 0 '\000', pkt_len = 74, vlan_macip = {data = 0, f = {l3_len
> = 0, l2_len = 0, vlan_tci = 0}}, hash = {rss = 311317915,
> fdir = {hash = 21915, id = 4750}, sched = 311317915}} (Valid)
> {next = 0x0, data = 0x7ffe43d7ff80, data_len = 66, nb_segs = 1 '\001',
> in_port = 0 '\000', pkt_len = 66, vlan_macip = {data = 0, f = {l3_len
> = 0, l2_len = 0, vlan_tci = 0}}, hash = {rss = 311317915,
> fdir = {hash = 21915, id = 4750}, sched = 311317915}} (Valid)
> {next = 0x0, data = 0x28153a8e63b3afc4, data_len = 263, nb_segs = 1
> '\001', in_port = 0 '\000', pkt_len = 263, vlan_macip = {data = 59535,
> f = {l3_len = 143, l2_len = 116, vlan_tci = 0}}, hash = {
> rss = 311317915, fdir = {hash = 21915, id = 4750}, sched =
> 311317915}} (Invalid)
> 
> Note that in the first packet, the length does not match the actual
> packet length (it does in the last though). The rest of the packets
> are placed in the hugemem range as they should be.
> 
> I'm running on Linux 3.2.0-77, the NIC is "10G 2P X520", I have 4 1GB
> huge pages.
> 
> Any ideas will be appreciated.

What version of DPDK are you using? If you update the code to work with the
latest code (or 2.0.0-rc2), does the problem still occur? Also, does it make
any difference calling rte_pktmbuf_free rather thatn calling mempool_put 
directly?

/Bruce


[dpdk-dev] Rx/Tx callbacks in debug mode

2015-03-23 Thread Bruce Richardson
On Mon, Mar 23, 2015 at 03:31:48PM +0100, Thomas Monjalon wrote:
> Hi,
> 
> As you may know, rte_eth_rx_burst() and rte_eth_tx_burst() have a
> debug-specific implementation enabled with RTE_LIBRTE_ETHDEV_DEBUG.
> I'm afraid these implementations have been forgotten when adding
> optional Rx/Tx callbacks. Or is it intended?
> 
> What do you think of removing these debug functions and insert extra
> checks with ifdef in an unique function?

The omission was unintentional. I think removing duplicate functions and using a
common version, if possible, is always a good idea, or else things like this
keep happening! :-)

/Bruce


[dpdk-dev] Symmetric RSS Hashing, Part 2

2015-03-23 Thread Matt Laswell
Hey Folks,

I have essentially the same question as Matthew.  Has there been progress
in this area?

--
Matt Laswell
infinite io, inc.
laswell at infiniteio.com


On Sat, Mar 14, 2015 at 3:47 PM, Matthew Hall  wrote:

> A few months ago we had this thread about symmetric hashing of TCP in RSS:
>
> http://dpdk.org/ml/archives/dev/2014-December/010148.html
>
> I was wondering if we ever did figure out how to get the 0x6d5a hash key
> mentioned in there to work, or another alternative one.
>
> Thanks,
> Matthew.


[dpdk-dev] [PATCH] table: fix a crash during key8 and key32 overload

2015-03-23 Thread Maciej Gajdzica
hash_key8_ext and hash_key32_ext tables allocate cache entries to
support table overload cases. The crash can occur when cache entry is
free after use. The problem is with computing the index of the free
cache entry. The same case for key16 was fixed with earlier patch.

Signed-off-by: Maciej Gajdzica 
---
 lib/librte_table/rte_table_hash_key32.c |5 ++---
 lib/librte_table/rte_table_hash_key8.c  |5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/lib/librte_table/rte_table_hash_key32.c 
b/lib/librte_table/rte_table_hash_key32.c
index da0ce6a..6790594 100644
--- a/lib/librte_table/rte_table_hash_key32.c
+++ b/lib/librte_table/rte_table_hash_key32.c
@@ -540,9 +540,8 @@ rte_table_hash_entry_delete_key32_ext(

memset(bucket, 0,
sizeof(struct rte_bucket_4_32));
-   bucket_index = (bucket -
-   ((struct rte_bucket_4_32 *)
-   f->memory)) - f->n_buckets;
+   bucket_index = (((uint8_t *)bucket -
+   (uint8_t 
*)f->memory)/f->bucket_size) - f->n_buckets;
f->stack[f->stack_pos++] = bucket_index;
}

diff --git a/lib/librte_table/rte_table_hash_key8.c 
b/lib/librte_table/rte_table_hash_key8.c
index 443ca7d..6803eb2 100644
--- a/lib/librte_table/rte_table_hash_key8.c
+++ b/lib/librte_table/rte_table_hash_key8.c
@@ -528,9 +528,8 @@ rte_table_hash_entry_delete_key8_ext(

memset(bucket, 0,
sizeof(struct rte_bucket_4_8));
-   bucket_index = (bucket -
-   ((struct rte_bucket_4_8 *)
-   f->memory)) - f->n_buckets;
+   bucket_index = (((uint8_t *)bucket -
+   (uint8_t 
*)f->memory)/f->bucket_size) - f->n_buckets;
f->stack[f->stack_pos++] = bucket_index;
}

-- 
1.7.9.5



[dpdk-dev] [PATCH] table: fix a crash during key8 and key32 overload

2015-03-23 Thread Maciej Gajdzica
hash_key8_ext and hash_key32_ext tables allocate cache entries to
support table overload cases. The crash can occur when cache entry is
free after use. The problem is with computing the index of the free
cache entry. The same case for key16 was fixed with earlier patch.

Signed-off-by: Maciej Gajdzica 
---
 lib/librte_table/rte_table_hash_key32.c |5 ++---
 lib/librte_table/rte_table_hash_key8.c  |5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/lib/librte_table/rte_table_hash_key32.c 
b/lib/librte_table/rte_table_hash_key32.c
index da0ce6a..6790594 100644
--- a/lib/librte_table/rte_table_hash_key32.c
+++ b/lib/librte_table/rte_table_hash_key32.c
@@ -540,9 +540,8 @@ rte_table_hash_entry_delete_key32_ext(

memset(bucket, 0,
sizeof(struct rte_bucket_4_32));
-   bucket_index = (bucket -
-   ((struct rte_bucket_4_32 *)
-   f->memory)) - f->n_buckets;
+   bucket_index = (((uint8_t *)bucket -
+   (uint8_t 
*)f->memory)/f->bucket_size) - f->n_buckets;
f->stack[f->stack_pos++] = bucket_index;
}

diff --git a/lib/librte_table/rte_table_hash_key8.c 
b/lib/librte_table/rte_table_hash_key8.c
index 443ca7d..6803eb2 100644
--- a/lib/librte_table/rte_table_hash_key8.c
+++ b/lib/librte_table/rte_table_hash_key8.c
@@ -528,9 +528,8 @@ rte_table_hash_entry_delete_key8_ext(

memset(bucket, 0,
sizeof(struct rte_bucket_4_8));
-   bucket_index = (bucket -
-   ((struct rte_bucket_4_8 *)
-   f->memory)) - f->n_buckets;
+   bucket_index = (((uint8_t *)bucket -
+   (uint8_t 
*)f->memory)/f->bucket_size) - f->n_buckets;
f->stack[f->stack_pos++] = bucket_index;
}

-- 
1.7.9.5



[dpdk-dev] [PATCH v3] vhost: Refactor module `eventfd_link'

2015-03-23 Thread Pavel Boldin
Changes:
 * Remove unnecessary #include's.
 * Deindent by moving the code to an inline function.
 * Fix return codes. Use appropriate return code for each fault cause.
 * Remove copy-pasted `close_fd', call `sys_close' instead.
 * Use `get_pid_task' to correctly reference the `task_target'.

Signed-off-by: Pavel Boldin 
---
Changes since last submission:
 * Using `sys_close' to close fd.
 * Removing unnecessary #include's.

 lib/librte_vhost/eventfd_link/eventfd_link.c | 193 ++-
 1 file changed, 98 insertions(+), 95 deletions(-)

diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c 
b/lib/librte_vhost/eventfd_link/eventfd_link.c
index 7755dd6..d10e25f 100644
--- a/lib/librte_vhost/eventfd_link/eventfd_link.c
+++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
@@ -22,18 +22,11 @@
  *   Intel Corporation
  */

-#include 
 #include 
 #include 
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
 #include 
+#include 

 #include "eventfd_link.h"

@@ -65,100 +58,110 @@ put_files_struct(struct files_struct *files)
BUG();
 }

+static struct file *
+fget_from_files(struct files_struct *files, unsigned fd)
+{
+   struct file *file;

-static long
-eventfd_link_ioctl(struct file *f, unsigned int ioctl, unsigned long arg)
+   rcu_read_lock();
+   file = fcheck_files(files, fd);
+   if (file)
+   {
+   if (file->f_mode & FMODE_PATH
+   || !atomic_long_inc_not_zero(&file->f_count))
+   file = NULL;
+   }
+   rcu_read_unlock();
+
+   return file;
+}
+
+static inline long
+eventfd_link_ioctl_copy(unsigned long arg)
 {
-   void __user *argp = (void __user *) arg;
+   long ret = -EFAULT;
struct task_struct *task_target = NULL;
-   struct file *file;
-   struct files_struct *files;
-   struct fdtable *fdt;
+   struct file *target_file = NULL;
+   struct files_struct *target_files = NULL;
struct eventfd_copy eventfd_copy;
+   struct pid *pid;
+
+   if (copy_from_user(&eventfd_copy, (void __user*)arg,
+   sizeof(struct eventfd_copy)))
+   goto out;
+
+   /*
+* Find the task struct for the target pid
+*/
+   ret = -ESRCH;
+
+   pid = find_vpid(eventfd_copy.target_pid);
+   if (pid == NULL) {
+   pr_info("Unable to find pid %d\n", eventfd_copy.target_pid);
+   goto out;
+   }
+
+   task_target = get_pid_task(pid, PIDTYPE_PID);
+   if (task_target == NULL) {
+   pr_info("Failed to get task for pid %d\n",
+   eventfd_copy.target_pid);
+   goto out;
+   }
+
+   ret = sys_close(eventfd_copy.source_fd);
+   if (ret)
+   goto out_task;
+   ret = -ESTALE;

-   switch (ioctl) {
+   /*
+* Find the file struct associated with the target fd.
+*/
+
+   target_files = get_files_struct(task_target);
+   if (target_files == NULL) {
+   pr_info("Failed to get target files struct\n");
+   goto out_task;
+   }
+
+   ret = -EBADF;
+   target_file = fget_from_files(target_files, eventfd_copy.target_fd);
+
+   if (target_file == NULL) {
+   pr_info("Failed to get file from target pid\n");
+   goto out_target_files;
+   }
+
+
+   /*
+* Install the file struct from the target process into the
+* file desciptor of the source process,
+*/
+
+   fd_install(eventfd_copy.source_fd, target_file);
+
+   ret = 0;
+
+out_target_files:
+   put_files_struct(target_files);
+out_task:
+   put_task_struct(task_target);
+out:
+   return ret;
+}
+
+static long
+eventfd_link_ioctl(struct file *f, unsigned int ioctl, unsigned long arg)
+{
+   long ret = -ENOIOCTLCMD;
+
+   switch (ioctl)
+   {
case EVENTFD_COPY:
-   if (copy_from_user(&eventfd_copy, argp,
-   sizeof(struct eventfd_copy)))
-   return -EFAULT;
-
-   /*
-* Find the task struct for the target pid
-*/
-   task_target =
-   pid_task(find_vpid(eventfd_copy.target_pid), 
PIDTYPE_PID);
-   if (task_target == NULL) {
-   pr_debug("Failed to get mem ctx for target pid\n");
-   return -EFAULT;
-   }
-
-   files = get_files_struct(current);
-   if (files == NULL) {
-   pr_debug("Failed to get files struct\n");
-   return -EFAULT;
-   }
-
-   rcu_read_lock();
-   file = fcheck_files(files, eventfd_copy.source_fd);
-   if (file) {
-   if (file->f_mode & FMODE_PATH ||
-   !atomic_long_inc_not_zero(&file->f_count))
- 

[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'

2015-03-23 Thread Xie, Huawei
On 3/23/2015 10:52 PM, Pavel Boldin wrote:


On Mon, Mar 23, 2015 at 4:41 PM, Xie, Huawei mailto:huawei.xie at intel.com>> wrote:
On 3/23/2015 10:37 PM, Pavel Boldin wrote:


On Mon, Mar 23, 2015 at 4:21 PM, Xie, Huawei mailto:huawei.xie at intel.com>>> wrote:
On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> Due to increased `struct file's reference counter subsequent call
> to `filp_close' does not free the `struct file'. Prepend `fput' call
> to decrease the reference counter.
>
> Signed-off-by: Pavel Boldin mailto:pboldin at 
> mirantis.com>>>
> ---
>  lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c 
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> index 7755dd6..62c45c8 100644
> --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int ioctl, 
> unsigned long arg)
>* Release the existing eventfd in the source process
>*/
>   spin_lock(&files->file_lock);
> + fput(file);
Could we just call atomic_long_dec here?

We can but I don't like breaking encapsulation (which is broken anyway by the 
code). So, there is a special method and we should use it in my opinion.
it is increased by atomic_long_inc_not_zero so why don't we use the symmetric 
function?
The code with `atomic_long_inc_not_zero' call is a copy-paste of the `fget' 
function. If we want to make it clear we should make a separate function and 
name it so: `fget_from_files'.

I don't understand why there is a (exact?) copy&paste of fget there. :).
Maybe you could post a patchset, first replace the copy/paste with fget and 
then this patch. It will looks much clearer.
Second thing is: another thread of the same processor can call the `sys_close' 
on the `fd' and this will dereference counter so `fput' will correctly free the 
`struct file'. Using `atomic_long_dec' will leak a `struct file' and print a 
KERN_ERR message by `filp_close'.

So, the common thing is to use appropriate functions and don't reinvent the 
wheel.

Pavel



Pavel

>   filp_close(file, files);
>   fdt = files_fdtable(files);
>   fdt->fd[eventfd_copy.source_fd] = NULL;







[dpdk-dev] [PATCH] ethdev: additional parameter in RX callback

2015-03-23 Thread Thomas Monjalon
2015-03-13 19:15, Neil Horman:
> On Fri, Mar 13, 2015 at 06:28:31PM +, Mcnamara, John wrote:
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > 
> > > > Is encoding the information in the array really a better solution here?
> > > The cb->param already exists for passing in user defined information to
> > > the callback. The proposed patch merely transmits the parent function
> > > arguments to the enclosed callback.
> > > >
> > > The cb->param can't be used here, because its opaque to the internals of
> > > the DPDK.  rte_eth_rx_burst doesn't (and can't) know where in the cb-
> > > >params pointer to store that information.  Thats why you added an
> > > additional parameter in the first place, isn't it?
> > 
> > Yes. That is correct.
> > 
> Then why did you suggest doing so?
> 
> > > My point is that using
> > > an array terminator keeps us out of this habbit of just adding parameters
> > > to communicate more information (as thats an ABI breaking method, and not
> > > particularly scalable if there is more information to be transmitted in
> > > the future).  Using a context sensitive API set goes beyond even that, and
> > > allows to retrieve arbitrary information form callbacks as needed in an
> > > ABI safe manner
> > 
> > Again I can agree with this in the general case, but it isn't necessary,
> > in this case, to encode the information in the array since it is already
> > local to and available in the function. It seems artificial, at this point,
> > to implement an array terminator solution to protect an API that,
> > effectively, hasn't been published yet.
> > 
> You indicate that you agree an alternate solution is preferable in the general
> case, so as to provide an API that is extensible in a way that isn't subject 
> to
> ABI breakage, correct?  If so, why do assert that its not necessecary in this
> specific case?  If you feel you need to add information so that callbacks can 
> be
> more flexible (in this case specifying the size of a passed in array), why
> immediately shoehorn another parmeter in place, and break the consistency
> between rx and tx callbacks, when you don't have to?  I don't care if you 
> break
> ABI today (although to call it unpublished I think is disingenuous, as lots of
> testing and development has already taken place with the ABI as it currently
> stands).  I care, as I noted above about not getting into the habbit of just
> assuming a change like this requires that you invaliate ABI somehow.  You 
> don't
> have to, you can create an API that is fairly invariant to it here if you 
> like.
> The question in my mind is, why don't you?

I think John is saying that the API of rte_eth_rx_burst() already includes
the nb_pkts parameter. So it's natural to push it to the callback.
I also think Neil is saying that this parameter is useless in the callback
and in rte_eth_rx_burst() if the array was null terminated.
In any case, having a mix (null termination + parameter in rte_eth_rx_burst)
is not acceptable.
Moreover, I wonder how efficient are the compiler optimizations in each loop
case (index and null termination).

As the API was using an integer count, my opinion is to keep it and push it to
the callback for 2.0.
If null termination is validated to be better, it could be a later rework.

Is there something I'm missing?
Thoughts?


[dpdk-dev] Packet data out of bounds after rte_eth_rx_burst

2015-03-23 Thread Dor Green
I changed it to free and it still happens. Note that the segmentation fault
happens before that anyway.

I am using 1.7.1 at the moment. I can try using a newer version.
On 23 Mar 2015 17:00, "Bruce Richardson"  wrote:

> On Mon, Mar 23, 2015 at 04:24:18PM +0200, Dor Green wrote:
> > I'm running a small app which captures packets on a single lcore and
> > then passes it to other workers for processing.
> >
> > Before even sending it to processing, when checking some minor
> > information in the packet mbuf's data I get a segfault.
> >
> > This code, for example gets a segfault:
> >
> > struct rte_mbuf *pkts[PKTS_BURST_SIZE];
> >
> > for (p = 0; p < portnb; ++p) {
> > nbrx = rte_eth_rx_burst(p, 0, pkts, PKTS_BURST_SIZE);
> >
> > if (unlikely(nbrx == 0)) {
> > continue;
> > }
> >
> > for (i = 0; likely(i < nbrx); i++) {
> > printf("Pkt %c\n", pkts[i]->pkt->data[0]);
> > rte_mempool_put(pktmbuf_pool, (void *const)pkts[i]);
> > }
> > }
> >
> > This doesn't happen on most packets, but when I used packets from a
> > certain cap it happened often (SSL traffic). In gdb the packet objects
> > looked like this:
> > {next = 0x0, data = 0x62132136406a6f6, data_len = 263, nb_segs = 1
> > '\001', in_port = 0 '\000', pkt_len = 263, vlan_macip = {data = 55111,
> > f = {l3_len = 327, l2_len = 107, vlan_tci = 0}}, hash = {
> > rss = 311317915, fdir = {hash = 21915, id = 4750}, sched =
> > 311317915}} (Invalid)
> >  {next = 0x0, data = 0x7ffe43d8f640, data_len = 73, nb_segs = 1
> > '\001', in_port = 0 '\000', pkt_len = 73, vlan_macip = {data = 0, f =
> > {l3_len = 0, l2_len = 0, vlan_tci = 0}}, hash = {rss = 311317915,
> > fdir = {hash = 21915, id = 4750}, sched = 311317915}} (Valid)
> > {next = 0x0, data = 0x7ffe43d7fa40, data_len = 74, nb_segs = 1 '\001',
> > in_port = 0 '\000', pkt_len = 74, vlan_macip = {data = 0, f = {l3_len
> > = 0, l2_len = 0, vlan_tci = 0}}, hash = {rss = 311317915,
> > fdir = {hash = 21915, id = 4750}, sched = 311317915}} (Valid)
> > {next = 0x0, data = 0x7ffe43d7ff80, data_len = 66, nb_segs = 1 '\001',
> > in_port = 0 '\000', pkt_len = 66, vlan_macip = {data = 0, f = {l3_len
> > = 0, l2_len = 0, vlan_tci = 0}}, hash = {rss = 311317915,
> > fdir = {hash = 21915, id = 4750}, sched = 311317915}} (Valid)
> > {next = 0x0, data = 0x28153a8e63b3afc4, data_len = 263, nb_segs = 1
> > '\001', in_port = 0 '\000', pkt_len = 263, vlan_macip = {data = 59535,
> > f = {l3_len = 143, l2_len = 116, vlan_tci = 0}}, hash = {
> > rss = 311317915, fdir = {hash = 21915, id = 4750}, sched =
> > 311317915}} (Invalid)
> >
> > Note that in the first packet, the length does not match the actual
> > packet length (it does in the last though). The rest of the packets
> > are placed in the hugemem range as they should be.
> >
> > I'm running on Linux 3.2.0-77, the NIC is "10G 2P X520", I have 4 1GB
> > huge pages.
> >
> > Any ideas will be appreciated.
>
> What version of DPDK are you using? If you update the code to work with the
> latest code (or 2.0.0-rc2), does the problem still occur? Also, does it
> make
> any difference calling rte_pktmbuf_free rather thatn calling mempool_put
> directly?
>
> /Bruce
>


[dpdk-dev] [PATCH] table: fix a crash during key8 and key32 overload

2015-03-23 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Monday, March 23, 2015 3:09 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] table: fix a crash during key8 and key32
> overload
> 
> hash_key8_ext and hash_key32_ext tables allocate cache entries to
> support table overload cases. The crash can occur when cache entry is
> free after use. The problem is with computing the index of the free
> cache entry. The same case for key16 was fixed with earlier patch.
> 
> Signed-off-by: Maciej Gajdzica 
> ---
>  lib/librte_table/rte_table_hash_key32.c |5 ++---
>  lib/librte_table/rte_table_hash_key8.c  |5 ++---
>  2 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/librte_table/rte_table_hash_key32.c
> b/lib/librte_table/rte_table_hash_key32.c
> index da0ce6a..6790594 100644
> --- a/lib/librte_table/rte_table_hash_key32.c
> +++ b/lib/librte_table/rte_table_hash_key32.c
> @@ -540,9 +540,8 @@ rte_table_hash_entry_delete_key32_ext(
> 
>   memset(bucket, 0,
>   sizeof(struct
> rte_bucket_4_32));
> - bucket_index = (bucket -
> - ((struct rte_bucket_4_32 *)
> - f->memory)) - f->n_buckets;
> + bucket_index = (((uint8_t *)bucket -
> + (uint8_t *)f->memory)/f-
> >bucket_size) - f->n_buckets;
>   f->stack[f->stack_pos++] =
> bucket_index;
>   }
> 
> diff --git a/lib/librte_table/rte_table_hash_key8.c
> b/lib/librte_table/rte_table_hash_key8.c
> index 443ca7d..6803eb2 100644
> --- a/lib/librte_table/rte_table_hash_key8.c
> +++ b/lib/librte_table/rte_table_hash_key8.c
> @@ -528,9 +528,8 @@ rte_table_hash_entry_delete_key8_ext(
> 
>   memset(bucket, 0,
>   sizeof(struct
> rte_bucket_4_8));
> - bucket_index = (bucket -
> - ((struct rte_bucket_4_8 *)
> - f->memory)) - f->n_buckets;
> + bucket_index = (((uint8_t *)bucket -
> + (uint8_t *)f->memory)/f-
> >bucket_size) - f->n_buckets;
>   f->stack[f->stack_pos++] =
> bucket_index;
>   }
> 
> --
> 1.7.9.5

Acked by: Cristian Dumitrescu 

Thanks, Maciej!



[dpdk-dev] Symmetric RSS Hashing, Part 2

2015-03-23 Thread Thomas Monjalon
2015-03-23 10:05, Matt Laswell:
> Hey Folks,
> 
> I have essentially the same question as Matthew.  Has there been progress
> in this area?

No, AFAIK.
Submitting a patch would be a good start I think.

> --
> Matt Laswell
> infinite io, inc.
> laswell at infiniteio.com
> 
> 
> On Sat, Mar 14, 2015 at 3:47 PM, Matthew Hall  
> wrote:
> 
> > A few months ago we had this thread about symmetric hashing of TCP in RSS:
> >
> > http://dpdk.org/ml/archives/dev/2014-December/010148.html
> >
> > I was wondering if we ever did figure out how to get the 0x6d5a hash key
> > mentioned in there to work, or another alternative one.
> >
> > Thanks,
> > Matthew.




[dpdk-dev] [PATCH] table: fix a crash during key8 and key32 overload

2015-03-23 Thread Walukiewicz, Miroslaw
Reviewed-by: Mirek Walukiewicz 

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Dumitrescu, Cristian
> Sent: Monday, March 23, 2015 4:20 PM
> To: Gajdzica, MaciejX T; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] table: fix a crash during key8 and key32
> overload
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> > Sent: Monday, March 23, 2015 3:09 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH] table: fix a crash during key8 and key32
> > overload
> >
> > hash_key8_ext and hash_key32_ext tables allocate cache entries to
> > support table overload cases. The crash can occur when cache entry is
> > free after use. The problem is with computing the index of the free
> > cache entry. The same case for key16 was fixed with earlier patch.
> >
> > Signed-off-by: Maciej Gajdzica 
> > ---
> >  lib/librte_table/rte_table_hash_key32.c |5 ++---
> >  lib/librte_table/rte_table_hash_key8.c  |5 ++---
> >  2 files changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/librte_table/rte_table_hash_key32.c
> > b/lib/librte_table/rte_table_hash_key32.c
> > index da0ce6a..6790594 100644
> > --- a/lib/librte_table/rte_table_hash_key32.c
> > +++ b/lib/librte_table/rte_table_hash_key32.c
> > @@ -540,9 +540,8 @@ rte_table_hash_entry_delete_key32_ext(
> >
> > memset(bucket, 0,
> > sizeof(struct
> > rte_bucket_4_32));
> > -   bucket_index = (bucket -
> > -   ((struct rte_bucket_4_32 *)
> > -   f->memory)) - f->n_buckets;
> > +   bucket_index = (((uint8_t *)bucket -
> > +   (uint8_t *)f->memory)/f-
> > >bucket_size) - f->n_buckets;
> > f->stack[f->stack_pos++] =
> > bucket_index;
> > }
> >
> > diff --git a/lib/librte_table/rte_table_hash_key8.c
> > b/lib/librte_table/rte_table_hash_key8.c
> > index 443ca7d..6803eb2 100644
> > --- a/lib/librte_table/rte_table_hash_key8.c
> > +++ b/lib/librte_table/rte_table_hash_key8.c
> > @@ -528,9 +528,8 @@ rte_table_hash_entry_delete_key8_ext(
> >
> > memset(bucket, 0,
> > sizeof(struct
> > rte_bucket_4_8));
> > -   bucket_index = (bucket -
> > -   ((struct rte_bucket_4_8 *)
> > -   f->memory)) - f->n_buckets;
> > +   bucket_index = (((uint8_t *)bucket -
> > +   (uint8_t *)f->memory)/f-
> > >bucket_size) - f->n_buckets;
> > f->stack[f->stack_pos++] =
> > bucket_index;
> > }
> >
> > --
> > 1.7.9.5
> 
> Acked by: Cristian Dumitrescu 
> 
> Thanks, Maciej!



[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'

2015-03-23 Thread Thomas Monjalon
Huawei,
This thread is unreadable because your mailer is not quoting.
Please check. Thanks

2015-03-23 15:16, Xie, Huawei:
> On 3/23/2015 10:52 PM, Pavel Boldin wrote:
> 
> 
> On Mon, Mar 23, 2015 at 4:41 PM, Xie, Huawei  intel.com> wrote:
> On 3/23/2015 10:37 PM, Pavel Boldin wrote:
> 
> 
> On Mon, Mar 23, 2015 at 4:21 PM, Xie, Huawei  intel.com intel.com>> wrote:
> On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> > Due to increased `struct file's reference counter subsequent call
> > to `filp_close' does not free the `struct file'. Prepend `fput' call
> > to decrease the reference counter.
> >
> > Signed-off-by: Pavel Boldin mailto:pboldin at 
> > mirantis.com> > mirantis.com>>>
> > ---
> >  lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c 
> > b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > index 7755dd6..62c45c8 100644
> > --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> > +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int ioctl, 
> > unsigned long arg)
> >* Release the existing eventfd in the source process
> >*/
> >   spin_lock(&files->file_lock);
> > + fput(file);
> Could we just call atomic_long_dec here?
> 
> We can but I don't like breaking encapsulation (which is broken anyway by the 
> code). So, there is a special method and we should use it in my opinion.
> it is increased by atomic_long_inc_not_zero so why don't we use the symmetric 
> function?
> The code with `atomic_long_inc_not_zero' call is a copy-paste of the `fget' 
> function. If we want to make it clear we should make a separate function and 
> name it so: `fget_from_files'.
> 
> I don't understand why there is a (exact?) copy&paste of fget there. :).
> Maybe you could post a patchset, first replace the copy/paste with fget and 
> then this patch. It will looks much clearer.
> Second thing is: another thread of the same processor can call the 
> `sys_close' on the `fd' and this will dereference counter so `fput' will 
> correctly free the `struct file'. Using `atomic_long_dec' will leak a `struct 
> file' and print a KERN_ERR message by `filp_close'.
> 
> So, the common thing is to use appropriate functions and don't reinvent the 
> wheel.
> 
> Pavel
> 
> 
> 
> Pavel
> 
> >   filp_close(file, files);
> >   fdt = files_fdtable(files);
> >   fdt->fd[eventfd_copy.source_fd] = NULL;
> 




[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'

2015-03-23 Thread Pavel Boldin
On Mon, Mar 23, 2015 at 5:16 PM, Xie, Huawei  wrote:

> On 3/23/2015 10:52 PM, Pavel Boldin wrote:
>
>
> On Mon, Mar 23, 2015 at 4:41 PM, Xie, Huawei  huawei.xie at intel.com>> wrote:
> On 3/23/2015 10:37 PM, Pavel Boldin wrote:
>
>
> On Mon, Mar 23, 2015 at 4:21 PM, Xie, Huawei  huawei.xie at intel.com>>> wrote:
> On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> > Due to increased `struct file's reference counter subsequent call
> > to `filp_close' does not free the `struct file'. Prepend `fput' call
> > to decrease the reference counter.
> >
> > Signed-off-by: Pavel Boldin  pboldin at mirantis.com>>>
> > ---
> >  lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > index 7755dd6..62c45c8 100644
> > --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> > +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int
> ioctl, unsigned long arg)
> >* Release the existing eventfd in the source process
> >*/
> >   spin_lock(&files->file_lock);
> > + fput(file);
> Could we just call atomic_long_dec here?
>
> We can but I don't like breaking encapsulation (which is broken anyway by
> the code). So, there is a special method and we should use it in my opinion.
> it is increased by atomic_long_inc_not_zero so why don't we use the
> symmetric function?
> The code with `atomic_long_inc_not_zero' call is a copy-paste of the
> `fget' function. If we want to make it clear we should make a separate
> function and name it so: `fget_from_files'.
>
> I don't understand why there is a (exact?) copy&paste of fget there. :).
> Maybe you could post a patchset, first replace the copy/paste with fget
> and then this patch. It will looks much clearer.
>
The code of this module received little to none review and requires some
love at the moment.

I wanted to refactor the module completely but Thomas said it is not going
to go into the 2.0. So I decided to make a simple one-line fix.

If you are interested this [0] is the latest version of the refactoring
patch.

I can provide you with an application that checks that there is indeed no
leakage and ensures that the `eventfd' moving works. It is being used in
our builds as a test [1]. The code is "heredoc"ed in [2]

[0] http://dpdk.org/dev/patchwork/patch/4113/
[1] https://review.fuel-infra.org/#/c/4639/
[2] https://review.fuel-infra.org/#/c/4639/3/tests/runtests.sh

Pavel


> Second thing is: another thread of the same processor can call the
> `sys_close' on the `fd' and this will dereference counter so `fput' will
> correctly free the `struct file'. Using `atomic_long_dec' will leak a
> `struct file' and print a KERN_ERR message by `filp_close'.
>
> So, the common thing is to use appropriate functions and don't reinvent
> the wheel.
>
> Pavel
>
>
>
> Pavel
>
> >   filp_close(file, files);
> >   fdt = files_fdtable(files);
> >   fdt->fd[eventfd_copy.source_fd] = NULL;
>
>
>
>
>
>


[dpdk-dev] [PATCH] ethdev: additional parameter in RX callback

2015-03-23 Thread Bruce Richardson
On Mon, Mar 23, 2015 at 04:16:36PM +0100, Thomas Monjalon wrote:
> 2015-03-13 19:15, Neil Horman:
> > On Fri, Mar 13, 2015 at 06:28:31PM +, Mcnamara, John wrote:
> > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > 
> > > > > Is encoding the information in the array really a better solution 
> > > > > here?
> > > > The cb->param already exists for passing in user defined information to
> > > > the callback. The proposed patch merely transmits the parent function
> > > > arguments to the enclosed callback.
> > > > >
> > > > The cb->param can't be used here, because its opaque to the internals of
> > > > the DPDK.  rte_eth_rx_burst doesn't (and can't) know where in the cb-
> > > > >params pointer to store that information.  Thats why you added an
> > > > additional parameter in the first place, isn't it?
> > > 
> > > Yes. That is correct.
> > > 
> > Then why did you suggest doing so?
> > 
> > > > My point is that using
> > > > an array terminator keeps us out of this habbit of just adding 
> > > > parameters
> > > > to communicate more information (as thats an ABI breaking method, and 
> > > > not
> > > > particularly scalable if there is more information to be transmitted in
> > > > the future).  Using a context sensitive API set goes beyond even that, 
> > > > and
> > > > allows to retrieve arbitrary information form callbacks as needed in an
> > > > ABI safe manner
> > > 
> > > Again I can agree with this in the general case, but it isn't necessary,
> > > in this case, to encode the information in the array since it is already
> > > local to and available in the function. It seems artificial, at this 
> > > point,
> > > to implement an array terminator solution to protect an API that,
> > > effectively, hasn't been published yet.
> > > 
> > You indicate that you agree an alternate solution is preferable in the 
> > general
> > case, so as to provide an API that is extensible in a way that isn't 
> > subject to
> > ABI breakage, correct?  If so, why do assert that its not necessecary in 
> > this
> > specific case?  If you feel you need to add information so that callbacks 
> > can be
> > more flexible (in this case specifying the size of a passed in array), why
> > immediately shoehorn another parmeter in place, and break the consistency
> > between rx and tx callbacks, when you don't have to?  I don't care if you 
> > break
> > ABI today (although to call it unpublished I think is disingenuous, as lots 
> > of
> > testing and development has already taken place with the ABI as it currently
> > stands).  I care, as I noted above about not getting into the habbit of just
> > assuming a change like this requires that you invaliate ABI somehow.  You 
> > don't
> > have to, you can create an API that is fairly invariant to it here if you 
> > like.
> > The question in my mind is, why don't you?
> 
> I think John is saying that the API of rte_eth_rx_burst() already includes
> the nb_pkts parameter. So it's natural to push it to the callback.
> I also think Neil is saying that this parameter is useless in the callback
> and in rte_eth_rx_burst() if the array was null terminated.
> In any case, having a mix (null termination + parameter in rte_eth_rx_burst)
> is not acceptable.
> Moreover, I wonder how efficient are the compiler optimizations in each loop
> case (index and null termination).

Compiler can't optimize/unroll the loop in the null termination case. For the
passing-the-size through option, in any app where the RX buffer size is 
constant,
i.e. probably a lot of them - like in our examples, the compiler can do loop
unrolling, and possibly other optimizations on the known value. [Whether it 
choses
too or not, is not something we have tested :-)]

/Bruce

> 
> As the API was using an integer count, my opinion is to keep it and push it to
> the callback for 2.0.
> If null termination is validated to be better, it could be a later rework.
> 
> Is there something I'm missing?
> Thoughts?


[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'

2015-03-23 Thread Xie, Huawei
On 3/23/2015 11:27 PM, Pavel Boldin wrote:


On Mon, Mar 23, 2015 at 5:16 PM, Xie, Huawei mailto:huawei.xie at intel.com>> wrote:
On 3/23/2015 10:52 PM, Pavel Boldin wrote:


On Mon, Mar 23, 2015 at 4:41 PM, Xie, Huawei mailto:huawei.xie at intel.com>>> wrote:
On 3/23/2015 10:37 PM, Pavel Boldin wrote:


On Mon, Mar 23, 2015 at 4:21 PM, Xie, Huawei mailto:huawei.xie at intel.com>> Due to increased `struct file's reference counter subsequent call
> to `filp_close' does not free the `struct file'. Prepend `fput' call
> to decrease the reference counter.
>
> Signed-off-by: Pavel Boldin mailto:pboldin at 
> mirantis.com> mirantis.com>> mirantis.com> mirantis.com
> ---
>  lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c 
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> index 7755dd6..62c45c8 100644
> --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int ioctl, 
> unsigned long arg)
>* Release the existing eventfd in the source process
>*/
>   spin_lock(&files->file_lock);
> + fput(file);
Could we just call atomic_long_dec here?

We can but I don't like breaking encapsulation (which is broken anyway by the 
code). So, there is a special method and we should use it in my opinion.
it is increased by atomic_long_inc_not_zero so why don't we use the symmetric 
function?
The code with `atomic_long_inc_not_zero' call is a copy-paste of the `fget' 
function. If we want to make it clear we should make a separate function and 
name it so: `fget_from_files'.

I don't understand why there is a (exact?) copy&paste of fget there. :).
Maybe you could post a patchset, first replace the copy/paste with fget and 
then this patch. It will looks much clearer.
The code of this module received little to none review and requires some love 
at the moment.

I wanted to refactor the module completely but Thomas said it is not going to 
go into the 2.0. So I decided to make a simple one-line fix.
Another isse is do we really need a src fd here? Could we just allocate a unsed 
fd in the kernel and installed it with the target eventfd.

If you are interested this [0] is the latest version of the refactoring patch.

I can provide you with an application that checks that there is indeed no 
leakage and ensures that the `eventfd' moving works. It is being used in our 
builds as a test [1]. The code is "heredoc"ed in [2]

[0] http://dpdk.org/dev/patchwork/patch/4113/
[1] https://review.fuel-infra.org/#/c/4639/
[2] https://review.fuel-infra.org/#/c/4639/3/tests/runtests.sh

Pavel

Second thing is: another thread of the same processor can call the `sys_close' 
on the `fd' and this will dereference counter so `fput' will correctly free the 
`struct file'. Using `atomic_long_dec' will leak a `struct file' and print a 
KERN_ERR message by `filp_close'.

So, the common thing is to use appropriate functions and don't reinvent the 
wheel.

Pavel



Pavel

>   filp_close(file, files);
>   fdt = files_fdtable(files);
>   fdt->fd[eventfd_copy.source_fd] = NULL;









[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'

2015-03-23 Thread Pavel Boldin
On Mon, Mar 23, 2015 at 5:36 PM, Xie, Huawei  wrote:

> On 3/23/2015 11:27 PM, Pavel Boldin wrote:
>
>
> On Mon, Mar 23, 2015 at 5:16 PM, Xie, Huawei  huawei.xie at intel.com>> wrote:
> On 3/23/2015 10:52 PM, Pavel Boldin wrote:
>
>
> On Mon, Mar 23, 2015 at 4:41 PM, Xie, Huawei  huawei.xie at intel.com>>> wrote:
> On 3/23/2015 10:37 PM, Pavel Boldin wrote:
>
>
> On Mon, Mar 23, 2015 at 4:21 PM, Xie, Huawei  huawei.xie at intel.com>> On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> > Due to increased `struct file's reference counter subsequent call
> > to `filp_close' does not free the `struct file'. Prepend `fput' call
> > to decrease the reference counter.
> >
> > Signed-off-by: Pavel Boldin  pboldin at mirantis.com>> > ---
> >  lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > index 7755dd6..62c45c8 100644
> > --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> > +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int
> ioctl, unsigned long arg)
> >* Release the existing eventfd in the source process
> >*/
> >   spin_lock(&files->file_lock);
> > + fput(file);
> Could we just call atomic_long_dec here?
>
> We can but I don't like breaking encapsulation (which is broken anyway by
> the code). So, there is a special method and we should use it in my opinion.
> it is increased by atomic_long_inc_not_zero so why don't we use the
> symmetric function?
> The code with `atomic_long_inc_not_zero' call is a copy-paste of the
> `fget' function. If we want to make it clear we should make a separate
> function and name it so: `fget_from_files'.
>
> I don't understand why there is a (exact?) copy&paste of fget there. :).
> Maybe you could post a patchset, first replace the copy/paste with fget
> and then this patch. It will looks much clearer.
> The code of this module received little to none review and requires some
> love at the moment.
>
> I wanted to refactor the module completely but Thomas said it is not going
> to go into the 2.0. So I decided to make a simple one-line fix.
> Another isse is do we really need a src fd here? Could we just allocate a
> unsed fd in the kernel and installed it with the target eventfd.
>
This is for DPDK team to decide and should be discussed separately.

Pavel


>
> If you are interested this [0] is the latest version of the refactoring
> patch.
>
> I can provide you with an application that checks that there is indeed no
> leakage and ensures that the `eventfd' moving works. It is being used in
> our builds as a test [1]. The code is "heredoc"ed in [2]
>
> [0] http://dpdk.org/dev/patchwork/patch/4113/
> [1] https://review.fuel-infra.org/#/c/4639/
> [2] https://review.fuel-infra.org/#/c/4639/3/tests/runtests.sh
>
> Pavel
>
> Second thing is: another thread of the same processor can call the
> `sys_close' on the `fd' and this will dereference counter so `fput' will
> correctly free the `struct file'. Using `atomic_long_dec' will leak a
> `struct file' and print a KERN_ERR message by `filp_close'.
>
> So, the common thing is to use appropriate functions and don't reinvent
> the wheel.
>
> Pavel
>
>
>
> Pavel
>
> >   filp_close(file, files);
> >   fdt = files_fdtable(files);
> >   fdt->fd[eventfd_copy.source_fd] = NULL;
>
>
>
>
>
>
>
>


[dpdk-dev] pktgen rx errors with intel 82599

2015-03-23 Thread Matt Smith

> On Mar 14, 2015, at 1:33 PM, Wiles, Keith  wrote:
> 
> Hi Matt,
> 
> On 3/14/15, 8:47 AM, "Wiles, Keith"  > wrote:
> 
>> Hi Matt
>> 
>> On 3/13/15, 3:49 PM, "Matt Smith"  wrote:
>> 
>>> 
>>> Hi,
>>> 
>>> I?ve been using DPDK pktgen 2.8.0 (built against DPDK 1.8.0 libraries) to
>>> send traffic on a server using an Intel 82599 (X520-2). Traffic gets sent
>>> out port 1 through another server which also an Intel 82599 installed and
>>> is forwarded back into port 0. When I send using a single source and
>>> destination IP address, this works fine and packets arrive on port 0 at
>>> close to the maximum line rate.
>>> 
>>> If I change port 1 to range mode and send traffic from a range of source
>>> IP addresses to a single destination IP address, for a second or two the
>>> display indicates that some packets were received on port 0 but then the
>>> rate of received packets on the display goes to 0 and all incoming
>>> packets on port 0 are registered as rx errors.
>>> 
>>> The server that traffic is being forwarded through is running the
>>> ip_pipeline example app. I ruled this out as the source of the problem by
>>> sending directly from port 1 to port 0 of the pktgen box. The issue still
>>> occurs when the traffic is not being forwarded through the other box.
>>> Since ip_pipeline is able to receive the packets and forward them without
>>> getting rx errors and it?s running with the same model of NIC as pktgen
>>> is using, I checked to see if there were any differences in
>>> initialization of the rx port between ip_pipeline and pktgen. I noticed
>>> that pktgen has a setting that ip_pipeline doesn't:
>>> 
>>> const struct rte_eth_conf port_conf = {
>>>   .rxmode = {
>>>   .mq_mode = ETH_MQ_RX_RSS,
>>> 
>>> If I comment out the .mq_mode setting and rebuild pktgen, the problem no
>>> longer occurs and I now receive packets on port 0 at near line rate when
>>> testing from a range of source addresses.
>>> 
>>> I recall reading in the past that if a receive queue fills up on an 82599
>>> , that receiving stalls for all of the other queues and no more packets
>>> can be received. Could that be happening with pktgen? Is there any
>>> debugging I can do to help track it down?
>> 
>> I have seen this problem on some platforms a few times and it looks like
>> you may have found a possible solution to the problem. I will have to look
>> into the change and see if this is the problem, but it does seem to
>> suggest this may be the issue. When the port gets into this state the port
>> receives the number mbufs matching the number of descriptors and the rest
>> are ?missed? frames at the wire. The RX counter is the number of missed
>> frames.
>> 
>> Thanks for the input
>> ++Keith
> 
> I added code to hopefully setup the correct RX/TX conf values. The HEAD of
> the Pktgen-DPDK v2.8.4 should build and work with DPDK 1.8.0 or 2.0.0-rc1.
> I did still see some RX errors and reduced bit rate, but the traffic does
> not stop on my machine. Please give version 2.8.4 a try and let me know if
> you still see problems.
> 
> Regards,
> ++Keith

Hi Keith,

Sorry for the delay in responding, I have been out of town.

Thanks for your attention to the problem. I pulled the latest code from git and 
moved to the pktgen-2.8.4 tag. I had one issue building:

  CC pktgen-port-cfg.o
/root/dpdk/pktgen-dpdk/app/pktgen-port-cfg.c: In function ?pktgen_config_ports?:
/root/dpdk/pktgen-dpdk/app/pktgen-port-cfg.c:300:11: error: variable ?k? set 
but not used [-Werror=unused-but-set-variable]
  uint64_t k;
   ^
cc1: all warnings being treated as errors
make[2]: *** [pktgen-port-cfg.o] Error 1
make[1]: *** [all] Error 2
make: *** [app] Error 2


I prepended '__attribute__((unused))? to the declaration of k and then I was 
able to build successfully. I did not see any receive errors running the 
updated binary. So once I got past the initial build problem, the issue seems 
to be resolved.

Thanks,
-Matt




[dpdk-dev] Need info on --vdev option

2015-03-23 Thread Shankari Vaidyalingam
Hi

I'm trying to do a packet capture on the DPDK interface while running l2fwd
sample application and injecting packets from a traffic generator.
I'm getting the below error when I give this command: sudo ./build/l2fw-c
0x03 -n 2 -- -p 0x03 --vdev
'eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap'

Pls let me know if I'm missing something.
I also tried checking the documentation and confirmed that the ordering of
the options is correct.
But still I'm getting this error.
I also tried various combinations like: But none of them worked.

sudo ./build/l2fwd -c 0x03 -n 2 --vdev
'eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap' --p 0x03
sudo ./build/l2fwd -c 0x03 -n 2 --vdev
'eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap' -- -p 0x03
sudo ./build/l2fwd -c 0x03 -n 2 --vdev
'eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap'
sudo ./build/l2fwd -c 0x03 -n 2 --p 0x03 --vdev
'eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap'
sudo ./build/l2fwd -c 0x03 -n 2 -- -p 0x03 --vdev
'eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap'
sudo ./build/l2fwd -c 0x03 -n 2 -- -p 0x03
sudo ./build/l2fwd -c 0x03 -n 2 -- -p 0x03 --vdev
'eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap'

EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Support maximum 64 logical core(s) by configuration.
EAL: Detected 2 lcore(s)
EAL: Setting up memory...
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff21a20 (size = 0x20)
EAL: Ask a virtual area of 0x3ec0 bytes
EAL: Virtual area found at 0x7ff1db40 (size = 0x3ec0)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff1db00 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff1dac0 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff1da80 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff1da40 (size = 0x20)
EAL: Ask a virtual area of 0x40 bytes
EAL: Virtual area found at 0x7ff1d9e0 (size = 0x40)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff1d9a0 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff1d960 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff1d920 (size = 0x20)
EAL: Requesting 512 pages of size 2MB from socket 0
EAL: TSC frequency is ~2683644 KHz
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable
clock cycles !
EAL: Master core 0 is ready (tid=1b8d6800)
EAL: Core 1 is ready (tid=d87f8700)
EAL: PCI device :00:08.0 on NUMA socket -1
EAL:   probe driver: 8086:100f rte_em_pmd
EAL:   PCI memory mapped at 0x7ff21b88
EAL: PCI device :00:09.0 on NUMA socket -1
EAL:   probe driver: 8086:100f rte_em_pmd
EAL:   PCI memory mapped at 0x7ff21b86
EAL: PCI device :00:0a.0 on NUMA socket -1
EAL:   probe driver: 8086:100f rte_em_pmd
EAL:   :00:0a.0 not managed by UIO driver, skipping
EAL: PCI device :00:11.0 on NUMA socket -1
EAL:   probe driver: 8086:100f rte_em_pmd
EAL:   :00:11.0 not managed by UIO driver, skipping
./build/l2fwd: unrecognized option '--vdev'
./build/l2fwd [EAL options] -- -p PORTMASK [-q NQ]
  -p PORTMASK: hexadecimal bitmask of ports to configure
  -q NQ: number of queue (=ports) per lcore (default is 1)
  -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to
disable, 10 default, 86400 maximum)
EAL: Error - exiting with code: 1
  Cause: Invalid L2FWD arguments


Regards
Shankari.V


[dpdk-dev] [PATCH] ethdev: additional parameter in RX callback

2015-03-23 Thread Neil Horman
On Mon, Mar 23, 2015 at 04:16:36PM +0100, Thomas Monjalon wrote:
> 2015-03-13 19:15, Neil Horman:
> > On Fri, Mar 13, 2015 at 06:28:31PM +, Mcnamara, John wrote:
> > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > 
> > > > > Is encoding the information in the array really a better solution 
> > > > > here?
> > > > The cb->param already exists for passing in user defined information to
> > > > the callback. The proposed patch merely transmits the parent function
> > > > arguments to the enclosed callback.
> > > > >
> > > > The cb->param can't be used here, because its opaque to the internals of
> > > > the DPDK.  rte_eth_rx_burst doesn't (and can't) know where in the cb-
> > > > >params pointer to store that information.  Thats why you added an
> > > > additional parameter in the first place, isn't it?
> > > 
> > > Yes. That is correct.
> > > 
> > Then why did you suggest doing so?
> > 
> > > > My point is that using
> > > > an array terminator keeps us out of this habbit of just adding 
> > > > parameters
> > > > to communicate more information (as thats an ABI breaking method, and 
> > > > not
> > > > particularly scalable if there is more information to be transmitted in
> > > > the future).  Using a context sensitive API set goes beyond even that, 
> > > > and
> > > > allows to retrieve arbitrary information form callbacks as needed in an
> > > > ABI safe manner
> > > 
> > > Again I can agree with this in the general case, but it isn't necessary,
> > > in this case, to encode the information in the array since it is already
> > > local to and available in the function. It seems artificial, at this 
> > > point,
> > > to implement an array terminator solution to protect an API that,
> > > effectively, hasn't been published yet.
> > > 
> > You indicate that you agree an alternate solution is preferable in the 
> > general
> > case, so as to provide an API that is extensible in a way that isn't 
> > subject to
> > ABI breakage, correct?  If so, why do assert that its not necessecary in 
> > this
> > specific case?  If you feel you need to add information so that callbacks 
> > can be
> > more flexible (in this case specifying the size of a passed in array), why
> > immediately shoehorn another parmeter in place, and break the consistency
> > between rx and tx callbacks, when you don't have to?  I don't care if you 
> > break
> > ABI today (although to call it unpublished I think is disingenuous, as lots 
> > of
> > testing and development has already taken place with the ABI as it currently
> > stands).  I care, as I noted above about not getting into the habbit of just
> > assuming a change like this requires that you invaliate ABI somehow.  You 
> > don't
> > have to, you can create an API that is fairly invariant to it here if you 
> > like.
> > The question in my mind is, why don't you?
> 
> I think John is saying that the API of rte_eth_rx_burst() already includes
> the nb_pkts parameter. So it's natural to push it to the callback.
> I also think Neil is saying that this parameter is useless in the callback
> and in rte_eth_rx_burst() if the array was null terminated.
> In any case, having a mix (null termination + parameter in rte_eth_rx_burst)
> is not acceptable.
> Moreover, I wonder how efficient are the compiler optimizations in each loop
> case (index and null termination).
> 
> As the API was using an integer count, my opinion is to keep it and push it to
> the callback for 2.0.
> If null termination is validated to be better, it could be a later rework.
> 

I'm fine with this if thats the consensus, I'm more interested in making sure we
think about these problems in such a way that we're not just running from ABI
issues, because we're eventually going to have to deal with them
Neil

> Is there something I'm missing?
> Thoughts?
> 


[dpdk-dev] Need info on --vdev option

2015-03-23 Thread Olivier MATZ
Hi Shankari,

On 03/23/2015 04:54 PM, Shankari Vaidyalingam wrote:
> Hi
> 
> I'm trying to do a packet capture on the DPDK interface while running l2fwd
> sample application and injecting packets from a traffic generator.
> I'm getting the below error when I give this command: sudo ./build/l2fw-c
> 0x03 -n 2 -- -p 0x03 --vdev
> 'eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap'
> 
> [...]
> EAL:   :00:11.0 not managed by UIO driver, skipping
> ./build/l2fwd: unrecognized option '--vdev'
> ./build/l2fwd [EAL options] -- -p PORTMASK [-q NQ]
>   -p PORTMASK: hexadecimal bitmask of ports to configure
>   -q NQ: number of queue (=ports) per lcore (default is 1)
>   -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to
> disable, 10 default, 86400 maximum)
> EAL: Error - exiting with code: 1
>   Cause: Invalid L2FWD arguments

Please note the position of "--" in your command line. This is
used to separate the arguments of eal and application. As --vdev
is an eal argument, it should go before the "--".

Regards,
Olivier


[dpdk-dev] [PATCH] eal_common_options.c: set create_uio_dev option to no argument

2015-03-23 Thread Olivier MATZ
Hi,

On 03/23/2015 09:11 AM, gaohaifeng wrote:
> From: Haifeng Gao 
> 
> eal options OPT_CREATE_UIO_DEV does not need argument so set it to zero.
> It needs to reset create_uio_dev explicitly.
> 
> Signed-off-by: Haifeng Gao 

Acked-by: Olivier Matz 

Thank you for fixing this.

Note for Thomas: the bug is there since my initial commit adding
the feature, you may want to add this to the commit log:
Fixes: f7f97c16048e (" pci: add option --create-uio-dev to run without
hotplug")


Regards,
Olivier


[dpdk-dev] tools brainstorming

2015-03-23 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Friday, March 20, 2015 2:51 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] tools brainstorming
> 
> As we are lazy developers, writing guidelines is not enough. It must be
> coupled with the integration of some tools. Let's work on these ones:
>   - make autotests easier and faster to run for smoke testing
>   - automated basic testpmd check
>   - build check with various options combinations
>   - abi check (started with validate-abi.sh)
>   - static analyze (clang, free online coverity)


Hi,

There was a discussion about using Coverity several months ago and Vincent set 
up an initial project.

http://dpdk.org/ml/archives/dev/2014-August/004962.html

I created a similar Coverity project and have been uploading snapshots since 
January and comparing the results to a similar tool we use internally (it 
compares well). 

If anyone would like to view the analysis results you can sign up to Coverity 
and apply for access to the DPDK project. 

https://scan.coverity.com/users/sign_up
https://scan.coverity.com/projects/4005 (DPDK)

Alternatively, drop me an email and I'll send you an invite. Apply as 
Contributor/Member if you plan to review/close issues or as Defect Viewer if 
you just wish to see the issues.

Like any static analysis tool there may be false positives. These can be 
flagged as such.

John
-- 



[dpdk-dev] tools brainstorming

2015-03-23 Thread Thomas Monjalon
2015-03-20 15:07, Butler, Siobhan A:
> I propose we also add a bug tracking tool (e.g. Bugzilla or other).

Don't you think adding a bug tracker would artificially split discussions
between mailing list threads and bug tracker entries?
I think patchwork is great because it summarizes patches pending on the
mailing list without breaking the discussion flow.
Maybe that a tool which collect bug discussions - as patchwork do for
the patches - would be ideal?

> And also a standalone page/document/archive of FAQ's.

Which kind of questions do you want to answer?
You're adding some technical questions in the release notes, which is great.
Are you thinking to anything else?

Thanks


[dpdk-dev] tools brainstorming

2015-03-23 Thread Jim Thompson


> On Mar 20, 2015, at 10:16 AM, Neil Horman  wrote:
> 
> The kernel does this with some special make targets (make allyesconfig, make
> randconfig, etc)

Not all the world is Linux. 


[dpdk-dev] [PATCH] doc: add note on needing igb_uio module for VF devs

2015-03-23 Thread Bruce Richardson
Since the uio_pci_generic module requires that the device to which it is
being bound supports legacy interrupts, there can be problems using it
with VF devices. Add a note to the GSG doc to document this fact, and
provide information on loading igb_uio as a replacement.

Signed-off-by: Bruce Richardson 
---
 doc/guides/linux_gsg/build_dpdk.rst | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/doc/guides/linux_gsg/build_dpdk.rst 
b/doc/guides/linux_gsg/build_dpdk.rst
index 255d6dc..5f0f3ae 100644
--- a/doc/guides/linux_gsg/build_dpdk.rst
+++ b/doc/guides/linux_gsg/build_dpdk.rst
@@ -166,7 +166,7 @@ Loading Modules to Enable Userspace IO for DPDK
 ---

 To run any DPDK application, a suitable uio module can be loaded into the 
running kernel.
-In most cases, the standard uio_pci_generic module included in the linux kernel
+In many cases, the standard uio_pci_generic module included in the linux kernel
 can provide the uio capability. This module can be loaded using the command

 .. code-block:: console
@@ -174,7 +174,18 @@ can provide the uio capability. This module can be loaded 
using the command
 sudo modprobe uio_pci_generic

 As an alternative to the uio_pci_generic, the DPDK also includes the igb_uio
-module which can be found in the kmod subdirectory referred to above.
+module which can be found in the kmod subdirectory referred to above. It can
+be loaded as shown below:
+
+.. code-block:: console
+
+sudo modprobe uio
+sudo insmod kmod/igb_uio.ko
+
+.. note::
+
+For some devices which lack support for legacy interrupts, e.g. virtual 
function
+(VF) devices, the igb_uio module may be needed in place of uio_pci_generic.

 Since DPDK release 1.7 onward provides VFIO support, use of UIO is optional
 for platforms that support using VFIO.
-- 
2.1.0



[dpdk-dev] tools brainstorming

2015-03-23 Thread Thomas Monjalon
2015-03-20 11:16, Neil Horman:
> On Fri, Mar 20, 2015 at 03:51:11PM +0100, Thomas Monjalon wrote:
> > - build check with various options combinations
> 
> The kernel does this with some special make targets (make allyesconfig, make
> randconfig, etc).  They basically act as build time fuzzers and are very 
> useful.
> I'm not sure that the DPDK build system is really condusive to that yet 
> though,
> since its made up of static configuration files.  This may require some
> build environment changes

Exact.
Maybe that it should be thought while trying to solve the dependencies
of the build configuration, ? la ./configure.
Adding some allyesconfig and randconfig options would be great.



[dpdk-dev] tools brainstorming

2015-03-23 Thread Thomas Monjalon
2015-03-20 16:18, Simon K?gstr?m:
> > - make autotests easier and faster to run for smoke testing
> > - automated basic testpmd check
> 
> Code coverage for automated tests can be useful as well.
> 
> In a way I'm speaking in my own interests here since I've written a tool
> to do just this (and produce nice HTML etc output), kcov, that can be
> found at github (https://github.com/SimonKagstrom/kcov).

Feel free to do some DPDK integration for your kcov tool.
It could definitely help to write good tests.

Thanks


[dpdk-dev] [PATCH] eal_common_options.c: set create_uio_dev option to no argument

2015-03-23 Thread Thomas Monjalon
> > eal options OPT_CREATE_UIO_DEV does not need argument so set it to zero.
> > It needs to reset create_uio_dev explicitly.
> > 
> > Signed-off-by: Haifeng Gao 
> 
> Acked-by: Olivier Matz 
> 
> Thank you for fixing this.
> 
> Note for Thomas: the bug is there since my initial commit adding
> the feature, you may want to add this to the commit log:
> Fixes: f7f97c16048e (" pci: add option --create-uio-dev to run without
> hotplug")

Applied, thanks


[dpdk-dev] [PATCH v2] rte_mbuf: mbuf bulk alloc/free functions added + unittest

2015-03-23 Thread Olivier MATZ
Hi Neil,

On 03/19/2015 02:16 PM, Neil Horman wrote:
>> On 03/18/2015 09:58 PM, Neil Horman wrote:
 +/**
 + * Free a bulk of mbufs into its original mempool.
 + * This function assumes:
 + * - refcnt equals 1
 + * - mbufs are direct
 + * - all mbufs must belong to the same mempool
 + *
 + * @param mbufs
 + *Array of pointers to mbuf
 + * @param count
 + *Array size
 + */
 +static inline void rte_pktmbuf_bulk_free(struct rte_mbuf **mbufs,
 +   unsigned count)
 +{
 +  unsigned idx;
 +
 +  RTE_MBUF_ASSERT(count > 0);
 +
 +  for (idx = 0; idx < count; idx++) {
 +  RTE_MBUF_ASSERT(mbufs[idx]->pool == mbufs[0]->pool);
 +  RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 1);
 +  rte_mbuf_refcnt_set(mbufs[idx], 0);
>>> This is really a misuse of the API.  The entire point of reference counting 
>>> is
>>> to know when an mbuf has no more references and can be freed.  By forcing 
>>> all
>>> the reference counts to zero here, you allow the refcnt infrastructure to be
>>> circumvented, causing memory leaks.
>>>
>>> I think what you need to do here is enhance the underlying pktmbuf interface
>>> such that an rte_mbuf structure has a destructor method association with it
>>> which is called when its refcnt reaches zero.  That way the
>>> rte_pktmbuf_bulk_free function can just decrement the refcnt on each
>>> mbuf_structure, and the pool as a whole can be returned when the destructor
>>> function discovers that all mbufs in that bulk pool are freed.
>>
>> I don't really understand what's the problem here. The API explicitly
>> describes the conditions for calling this functions: the segments are
>> directs, they belong to the same mempool, and their refcnt is 1.
>>
>> This function could be useful in a driver which knows that the mbuf
>> it allocated matches this conditions. I think an application that
>> only uses direct mbufs and one mempool could also use this function.
> 
> 
> That last condition is my issue with this patch, that the user has to know 
> what
> refcnts are.  It makes this api useful for little more than the test case that
> is provided with it.  Its irritating enough that for singly allocated mbufs 
> the
> user has to know what the refcount of a buffer is before freeing, but at least
> they can macrotize a {rte_pktmbuf_refcnt_update; if(rte_pktmbuf_refct_read) 
> then
> free} operation.
> 
> With this, you've placed the user in charge of not only doing that, but also 
> of
> managing the relationship between pktmbufs and the pool they came from.  while
> that makes sense for the test case, it really doesn't in any general use case 
> in
> which packet processing is ever deferred or queued, because it means that the
> application is now responsible for holding a pointer to every packet it
> allocates and checking its refcount periodically until it completes.
> 
> There is never any reason that an application won't need to do this 
> management,
> so making it the purview of the application to handle rather than properly
> integrating that functionality in the library is really a false savings.

There are some places where you know that the prerequisites are met,
so you can save cycles by using this function.

>From what I imagine, if in a driver you allocate mbufs, chain them and
for some reason you realize you have to free them, you can use this
function instead of freeing them one by one.

Also, as it's up to the application to decide how many mbuf pools are
created, and whether indirect mbufs are used or not, the application
can take the short path of using this function in some conditions.

Vadim, maybe you have another reason or use case for adding this
function? Could you detail why you need it and how it improves your
use case?

Regards,
Olivier


[dpdk-dev] [PATCH] ixgbe: do not include CRC in Tx byte count

2015-03-23 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of stephen at 
> networkplumber.org
> Sent: Friday, January 23, 2015 6:24 AM
> To: dev at dpdk.org
> Cc: Stephen Hemminger
> Subject: [dpdk-dev] [PATCH] ixgbe: do not include CRC in Tx byte count
> 
> From: Stephen Hemminger 
> 
> The ixgbe driver was including CRC in the transmit packet byte
> count, but not for packets received.
> This was notice when forwarding and
> the number of bytes received was greater than the number of bytes transmitted
> for the same number of packets. Make the driver behave like other
> virtual devices and not include CRC in byte count. Use the same queue
> counters already computed and used for Rx.

About RX side stats - as I remember it depends to what value hw_stip_crc is set 
at configure().
If hw_stip_crc==1, then, yes CRC bytes are not included into  QBRC value.
I If hw_stip_crc==0, then CRC bytes are included into QBRC. 

I.E:
./dpdk.org/x86_64-native-linuxapp-gcc/app/testpmd -c ff -n 4 -w 86:00.1 -- -i 
--tx-queue-stats-mapping='(0,0,0)' --rx-queue-stats-mapping='(0,0,0)' 
--crc-strip
testpmd> start
testpmd> show port stats all

 NIC statistics for port 0  
  RX-packets:   1RX-errors:  0RX-bytes: 
60   // sum of QBRC[]
  RX-badcrc:0RX-badlen:  0  RX-errors:  
 0
  RX-nombuf:0
  TX-packets:   1TX-errors:  0TX-bytes: 
64  //GOTC

  Stats reg  0 RX-packets:  1RX-errors:  0RX-bytes: 
60 //QBRC[0]
  ...

  Stats reg  0 TX-packets:  1 TX-bytes: 
60   // QBTC[0]
   


./dpdk.org/x86_64-native-linuxapp-gcc/app/testpmd -c ff -n 4 -w 86:00.1 -- -i 
--tx-queue-stats-mapping='(0,0,0)' --rx-queue-stats-mapping='(0,0,0)'
testpmd> start
testpmd> show port stats all

   NIC statistics for port 0  
  RX-packets:   1RX-errors:  0RX-bytes: 
64  // sum of QBRC[]
  RX-badcrc:0RX-badlen:  0  RX-errors:  
 0
  RX-nombuf:0
  TX-packets:   1TX-errors:  0TX-bytes: 
64  //GOTC

  Stats reg  0 RX-packets:  1RX-errors:  0RX-bytes: 
64 //QBRC[0]
  

  Stats reg  0 TX-packets:  1 TX-bytes: 
60  // QBTC[0]


So now, number of RX/TX stat bytes for simple forwarding are equal when 
hw_stip_crc==0 and
differ when hw_stip_crc==1,
With your change, as I understand, it would be visa-versa.
I don' t mind changing it yours way, but wonder why that is any better?
Probably the proper patch, should also do something like that:
stats->ibytes = total_qbrc - stats->ipackets  * dev->data->dev_conf. 
hw_strip_crc * CRC_LEN;
too? 

About using sum of QBRC[] for stats.ibytes instead of GORC, I think the initial 
reason was:

http://www.intel.com/content/www/us/en/ethernet-controllers/82599-10-gbe-controller-spec-update.html:
7 GPRC and GORCL/H Also Count Missed Packets
Problem:
GPRC (Good Packets Received Count) and GORCL/H (Good Octets Received Count) 
count missed
packets and missed packets bytes; this is not consistent with previous products.
Implication:
None.
331521-002 23
Errata-Intel(r) 82599 10 GbE Controller
Workaround:
Statistics are available indirectly for these registers. This workaround is 
included in Intel drivers.
* For GPRC - Subtract MPC (Missed Packet Count) from GPRC. Alternatively, use 
QPRC.
* For GORCL/H - use QBRCL/H (Quad Bytes Received).

Konstantin

> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
> b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> index b58ec45..27355eb 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> @@ -1724,12 +1724,15 @@ ixgbe_dev_stats_get(struct rte_eth_dev *dev, struct 
> rte_eth_stats *stats)
>   struct ixgbe_hw_stats *hw_stats =
>   IXGBE_DEV_PRIVATE_TO_STATS(dev->data->dev_private);
>   uint32_t bprc, lxon, lxoff, total;
> - uint64_t total_missed_rx, total_qbrc, total_qprc;
> + uint64_t total_missed_rx;
> + uint64_t total_qbrc, total_qprc, total_qbtc, total_qptc;
>   unsigned i;
> 
>   total_missed_rx = 0;
>   total_qbrc = 0;
>   total_qprc = 0;
> + total_qbtc = 0;
> + total_qptc = 0;
> 
>   hw_stats->crcerrs += IXGBE_READ_REG(hw, IXGBE_CRCERRS);
>   hw_stats->illerrc += IXGBE_READ_REG(hw, IXGBE_ILLERRC);
>

[dpdk-dev] tools brainstorming

2015-03-23 Thread Butler, Siobhan A


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, March 23, 2015 4:19 PM
> To: Butler, Siobhan A
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] tools brainstorming
> 
> 2015-03-20 15:07, Butler, Siobhan A:
> > I propose we also add a bug tracking tool (e.g. Bugzilla or other).
> 
> Don't you think adding a bug tracker would artificially split discussions
> between mailing list threads and bug tracker entries?
> I think patchwork is great because it summarizes patches pending on the
> mailing list without breaking the discussion flow.

I see your point, I just think it might be a useful mechanism for people 
getting started as contributors - to pick up a bug and fix it, as well
a better way of tracking the known and resolved issues which are a nightmare to 
maintain in the documentation.

> Maybe that a tool which collect bug discussions - as patchwork do for the
> patches - would be ideal?

Sounds great if one exists.

> 
> > And also a standalone page/document/archive of FAQ's.
> 
> Which kind of questions do you want to answer?
> You're adding some technical questions in the release notes, which is great.
> Are you thinking to anything else?
> 

To have these questions and answers more visible really was my main objective.

> Thanks


[dpdk-dev] [PATCH v2] rte_mbuf: mbuf bulk alloc/free functions added + unittest

2015-03-23 Thread Vadim Suraev
Hi, Olivier,
No, I personally need to free a chain using mempool bulk. If I'm not
mistaken, it has been proposed by one of reviewers to have lower level
function, so it was done (I'm sorry if misunderstood)
Regards,
Vadim.
On Mar 23, 2015 8:44 PM, "Olivier MATZ"  wrote:

> Hi Neil,
>
> On 03/19/2015 02:16 PM, Neil Horman wrote:
> >> On 03/18/2015 09:58 PM, Neil Horman wrote:
>  +/**
>  + * Free a bulk of mbufs into its original mempool.
>  + * This function assumes:
>  + * - refcnt equals 1
>  + * - mbufs are direct
>  + * - all mbufs must belong to the same mempool
>  + *
>  + * @param mbufs
>  + *Array of pointers to mbuf
>  + * @param count
>  + *Array size
>  + */
>  +static inline void rte_pktmbuf_bulk_free(struct rte_mbuf **mbufs,
>  +   unsigned count)
>  +{
>  +  unsigned idx;
>  +
>  +  RTE_MBUF_ASSERT(count > 0);
>  +
>  +  for (idx = 0; idx < count; idx++) {
>  +  RTE_MBUF_ASSERT(mbufs[idx]->pool == mbufs[0]->pool);
>  +  RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 1);
>  +  rte_mbuf_refcnt_set(mbufs[idx], 0);
> >>> This is really a misuse of the API.  The entire point of reference
> counting is
> >>> to know when an mbuf has no more references and can be freed.  By
> forcing all
> >>> the reference counts to zero here, you allow the refcnt infrastructure
> to be
> >>> circumvented, causing memory leaks.
> >>>
> >>> I think what you need to do here is enhance the underlying pktmbuf
> interface
> >>> such that an rte_mbuf structure has a destructor method association
> with it
> >>> which is called when its refcnt reaches zero.  That way the
> >>> rte_pktmbuf_bulk_free function can just decrement the refcnt on each
> >>> mbuf_structure, and the pool as a whole can be returned when the
> destructor
> >>> function discovers that all mbufs in that bulk pool are freed.
> >>
> >> I don't really understand what's the problem here. The API explicitly
> >> describes the conditions for calling this functions: the segments are
> >> directs, they belong to the same mempool, and their refcnt is 1.
> >>
> >> This function could be useful in a driver which knows that the mbuf
> >> it allocated matches this conditions. I think an application that
> >> only uses direct mbufs and one mempool could also use this function.
> >
> >
> > That last condition is my issue with this patch, that the user has to
> know what
> > refcnts are.  It makes this api useful for little more than the test
> case that
> > is provided with it.  Its irritating enough that for singly allocated
> mbufs the
> > user has to know what the refcount of a buffer is before freeing, but at
> least
> > they can macrotize a {rte_pktmbuf_refcnt_update;
> if(rte_pktmbuf_refct_read) then
> > free} operation.
> >
> > With this, you've placed the user in charge of not only doing that, but
> also of
> > managing the relationship between pktmbufs and the pool they came from.
> while
> > that makes sense for the test case, it really doesn't in any general use
> case in
> > which packet processing is ever deferred or queued, because it means
> that the
> > application is now responsible for holding a pointer to every packet it
> > allocates and checking its refcount periodically until it completes.
> >
> > There is never any reason that an application won't need to do this
> management,
> > so making it the purview of the application to handle rather than
> properly
> > integrating that functionality in the library is really a false savings.
>
> There are some places where you know that the prerequisites are met,
> so you can save cycles by using this function.
>
> From what I imagine, if in a driver you allocate mbufs, chain them and
> for some reason you realize you have to free them, you can use this
> function instead of freeing them one by one.
>
> Also, as it's up to the application to decide how many mbuf pools are
> created, and whether indirect mbufs are used or not, the application
> can take the short path of using this function in some conditions.
>
> Vadim, maybe you have another reason or use case for adding this
> function? Could you detail why you need it and how it improves your
> use case?
>
> Regards,
> Olivier
>


[dpdk-dev] tools brainstorming

2015-03-23 Thread Neil Horman
On Mon, Mar 23, 2015 at 05:18:49PM +0100, Thomas Monjalon wrote:
> 2015-03-20 15:07, Butler, Siobhan A:
> > I propose we also add a bug tracking tool (e.g. Bugzilla or other).
> 
> Don't you think adding a bug tracker would artificially split discussions
> between mailing list threads and bug tracker entries?
> I think patchwork is great because it summarizes patches pending on the
> mailing list without breaking the discussion flow.
> Maybe that a tool which collect bug discussions - as patchwork do for
> the patches - would be ideal?
> 
> > And also a standalone page/document/archive of FAQ's.
> 
> Which kind of questions do you want to answer?
> You're adding some technical questions in the release notes, which is great.
> Are you thinking to anything else?
> 
> Thanks
> 

This hasn't proven to be the case anywhere else.  Bugs tend to be something a
small number of interested parties get involved with ("interested" being defined
by the MAINTANIERS list).  Once a bug is root caused and a solution developed,
the list is used to debate its merits.

The kernel has done this for years with kernel.bugzilla.org, and it works quite
well.

Neil



[dpdk-dev] tools brainstorming

2015-03-23 Thread Neil Horman
On Mon, Mar 23, 2015 at 11:22:43AM -0500, Jim Thompson wrote:
> 
> 
> > On Mar 20, 2015, at 10:16 AM, Neil Horman  wrote:
> > 
> > The kernel does this with some special make targets (make allyesconfig, make
> > randconfig, etc)
> 
> Not all the world is Linux. 
Your point being?

Are you suggesting another method for providing configuration fuzzing?  If so,
please do so instead of making vague snarky comments that have nothing to do
with the conversation at hand.
Neil



[dpdk-dev] Need info on --vdev option

2015-03-23 Thread Shankari Vaidyalingam
Hi Olivier,

Thanks a lot.
I changed the ordering and it worked. But the pcap file was empty even when
I specified the rx interface.
I have a basic doubt.  When the ethernet interfaces are bound to teh igb
driver they become user interfaces and no longer shown in "ifconfig"
command output. By what name can we identify those interfaces after they
get bound to igb driver and become dpdk interfaces?

Regards
Shankari.V


On Mon, Mar 23, 2015 at 9:31 PM, Olivier MATZ 
wrote:

> Hi Shankari,
>
> On 03/23/2015 04:54 PM, Shankari Vaidyalingam wrote:
> > Hi
> >
> > I'm trying to do a packet capture on the DPDK interface while running
> l2fwd
> > sample application and injecting packets from a traffic generator.
> > I'm getting the below error when I give this command: sudo ./build/l2fw-c
> > 0x03 -n 2 -- -p 0x03 --vdev
> > 'eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap'
> >
> > [...]
> > EAL:   :00:11.0 not managed by UIO driver, skipping
> > ./build/l2fwd: unrecognized option '--vdev'
> > ./build/l2fwd [EAL options] -- -p PORTMASK [-q NQ]
> >   -p PORTMASK: hexadecimal bitmask of ports to configure
> >   -q NQ: number of queue (=ports) per lcore (default is 1)
> >   -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to
> > disable, 10 default, 86400 maximum)
> > EAL: Error - exiting with code: 1
> >   Cause: Invalid L2FWD arguments
>
> Please note the position of "--" in your command line. This is
> used to separate the arguments of eal and application. As --vdev
> is an eal argument, it should go before the "--".
>
> Regards,
> Olivier
>


[dpdk-dev] [PATCH v2] rte_mbuf: mbuf bulk alloc/free functions added + unittest

2015-03-23 Thread Neil Horman
On Mon, Mar 23, 2015 at 05:44:02PM +0100, Olivier MATZ wrote:
> Hi Neil,
> 
> On 03/19/2015 02:16 PM, Neil Horman wrote:
> >> On 03/18/2015 09:58 PM, Neil Horman wrote:
>  +/**
>  + * Free a bulk of mbufs into its original mempool.
>  + * This function assumes:
>  + * - refcnt equals 1
>  + * - mbufs are direct
>  + * - all mbufs must belong to the same mempool
>  + *
>  + * @param mbufs
>  + *Array of pointers to mbuf
>  + * @param count
>  + *Array size
>  + */
>  +static inline void rte_pktmbuf_bulk_free(struct rte_mbuf **mbufs,
>  + unsigned count)
>  +{
>  +unsigned idx;
>  +
>  +RTE_MBUF_ASSERT(count > 0);
>  +
>  +for (idx = 0; idx < count; idx++) {
>  +RTE_MBUF_ASSERT(mbufs[idx]->pool == mbufs[0]->pool);
>  +RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 1);
>  +rte_mbuf_refcnt_set(mbufs[idx], 0);
> >>> This is really a misuse of the API.  The entire point of reference 
> >>> counting is
> >>> to know when an mbuf has no more references and can be freed.  By forcing 
> >>> all
> >>> the reference counts to zero here, you allow the refcnt infrastructure to 
> >>> be
> >>> circumvented, causing memory leaks.
> >>>
> >>> I think what you need to do here is enhance the underlying pktmbuf 
> >>> interface
> >>> such that an rte_mbuf structure has a destructor method association with 
> >>> it
> >>> which is called when its refcnt reaches zero.  That way the
> >>> rte_pktmbuf_bulk_free function can just decrement the refcnt on each
> >>> mbuf_structure, and the pool as a whole can be returned when the 
> >>> destructor
> >>> function discovers that all mbufs in that bulk pool are freed.
> >>
> >> I don't really understand what's the problem here. The API explicitly
> >> describes the conditions for calling this functions: the segments are
> >> directs, they belong to the same mempool, and their refcnt is 1.
> >>
> >> This function could be useful in a driver which knows that the mbuf
> >> it allocated matches this conditions. I think an application that
> >> only uses direct mbufs and one mempool could also use this function.
> > 
> > 
> > That last condition is my issue with this patch, that the user has to know 
> > what
> > refcnts are.  It makes this api useful for little more than the test case 
> > that
> > is provided with it.  Its irritating enough that for singly allocated mbufs 
> > the
> > user has to know what the refcount of a buffer is before freeing, but at 
> > least
> > they can macrotize a {rte_pktmbuf_refcnt_update; if(rte_pktmbuf_refct_read) 
> > then
> > free} operation.
> > 
> > With this, you've placed the user in charge of not only doing that, but 
> > also of
> > managing the relationship between pktmbufs and the pool they came from.  
> > while
> > that makes sense for the test case, it really doesn't in any general use 
> > case in
> > which packet processing is ever deferred or queued, because it means that 
> > the
> > application is now responsible for holding a pointer to every packet it
> > allocates and checking its refcount periodically until it completes.
> > 
> > There is never any reason that an application won't need to do this 
> > management,
> > so making it the purview of the application to handle rather than properly
> > integrating that functionality in the library is really a false savings.
> 
> There are some places where you know that the prerequisites are met,
> so you can save cycles by using this function.
> 
> From what I imagine, if in a driver you allocate mbufs, chain them and
> for some reason you realize you have to free them, you can use this
> function instead of freeing them one by one.
> 
> Also, as it's up to the application to decide how many mbuf pools are
> created, and whether indirect mbufs are used or not, the application
> can take the short path of using this function in some conditions.
> 
> Vadim, maybe you have another reason or use case for adding this
> function? Could you detail why you need it and how it improves your
> use case?
> 
> Regards,
> Olivier
> 

So, I think we're making different points here.

You seem to be justifying the API as it exists by finding use cases that fit
into its documented restrictions (direct buffers, refcounts at 1, etc), which
severely limit that use case set.

My assertion is that those restrictions were created because it was
inconvienient to code using the reference count as intended.  I'm saying lets
augment the reference counting mechanism so that we can use these specially
allocated mbufs in a wider variety of use cases beyond the limited set they are
currently good for

Neil



[dpdk-dev] Packet data out of bounds after rte_eth_rx_burst

2015-03-23 Thread Matthew Hall
On Mon, Mar 23, 2015 at 05:19:00PM +0200, Dor Green wrote:
> I changed it to free and it still happens. Note that the segmentation fault
> happens before that anyway.
> 
> I am using 1.7.1 at the moment. I can try using a newer version.

I'm using 1.7.X in my open-source DPDK-based app and it works, but I have an 
IGB 1-gigabit NIC though, and how RX / TX work are quite driver specific of 
course.

I suspect there's some issue with how things are working in your IXGBE NIC 
driver / setup. Do the same failures occur inside of the DPDK's own sample 
apps?

Matthew.


[dpdk-dev] Symmetric RSS Hashing, Part 2

2015-03-23 Thread Matthew Hall
On Mon, Mar 23, 2015 at 04:20:33PM +0100, Thomas Monjalon wrote:
> 2015-03-23 10:05, Matt Laswell:
> > Hey Folks,
> > 
> > I have essentially the same question as Matthew.  Has there been progress
> > in this area?
> 
> No, AFAIK.
> Submitting a patch would be a good start I think.

Hi Thomas,

We did have somebody in the past thread who tried the key recommended in a 
scientific paper to get a symmetric hash, 0x6d5a I believe. However it didn't 
seem to get the expected results when tested in real life.

We're happy to make patches but we're not sure of the content to put in there 
if the only such RSS key we heard about didn't work when it was tested. Is 
there a subject matter expert in RSS who would be able to assist in more 
detail to figure out what went wrong?

Matthew.


[dpdk-dev] tools brainstorming

2015-03-23 Thread Jim Thompson

> On Mar 23, 2015, at 12:44 PM, Neil Horman  wrote:
> 
> On Mon, Mar 23, 2015 at 11:22:43AM -0500, Jim Thompson wrote:
>> 
>> 
>>> On Mar 20, 2015, at 10:16 AM, Neil Horman  wrote:
>>> 
>>> The kernel does this with some special make targets (make allyesconfig, make
>>> randconfig, etc)
>> 
>> Not all the world is Linux. 
> Your point being?
> 
> Are you suggesting another method for providing configuration fuzzing?  If so,
> please do so instead of making vague snarky comments that have nothing to do
> with the conversation at hand.
> Neil

My point is that your usage of ?the kernel? and various kernel makefile targets 
here is specific to linux, and DPDK is supported on both linux and FreeBSD.

Jim





[dpdk-dev] [PATCH] virtio: Fix stats issue

2015-03-23 Thread Stephen Hemminger
I agree with Thomas. If some other drivers have memset they should also be
fixed


[dpdk-dev] tools brainstorming

2015-03-23 Thread Neil Horman
On Mon, Mar 23, 2015 at 04:56:32PM -0500, Jim Thompson wrote:
> 
> > On Mar 23, 2015, at 12:44 PM, Neil Horman  wrote:
> > 
> > On Mon, Mar 23, 2015 at 11:22:43AM -0500, Jim Thompson wrote:
> >> 
> >> 
> >>> On Mar 20, 2015, at 10:16 AM, Neil Horman  
> >>> wrote:
> >>> 
> >>> The kernel does this with some special make targets (make allyesconfig, 
> >>> make
> >>> randconfig, etc)
> >> 
> >> Not all the world is Linux. 
> > Your point being?
> > 
> > Are you suggesting another method for providing configuration fuzzing?  If 
> > so,
> > please do so instead of making vague snarky comments that have nothing to do
> > with the conversation at hand.
> > Neil
> 
> My point is that your usage of ?the kernel? and various kernel makefile 
> targets here is specific to linux, and DPDK is supported on both linux and 
> FreeBSD.
> 
Again, your point?  The thread topic was brainstorming ways to automate and
improve testing of the DPDK.  I pointed to the linux kernel for sources of ideas
for how we might accomplish that same goal in the DPDK (specifically by using
makefile targets to randomize and fuzz the build time configuration).

You instead seem to be focusing on the minutae of my language.  Thats not in any
way the point of my posts.  Please go back and re-read the thread before you
misguidedly decide that my thoughts are simply here to bias the build
environment in any particular direction.

Better still, if you can point to some idea that can be used to improve
automated testing, which was the purpose of this conversation at the start, that
would be great.

Neil

> Jim
> 
> 
> 
> 


[dpdk-dev] tools brainstorming

2015-03-23 Thread Matthew Hall
On Mon, Mar 23, 2015 at 05:18:49PM +0100, Thomas Monjalon wrote:
> Don't you think adding a bug tracker would artificially split discussions
> between mailing list threads and bug tracker entries?

It is difficult to track the workflow around bugs without some kind of 
bug-friendly workflow tool.

I.e.

Suspected
Confirmed
Triaged
Patched
Tested
Committed (in master / development branch)
Released (in version X.Y.Z)

etc.

Importantly, Suspected, Confirmed, and Triaged all happen before anything 
appears in a patch manager tool if I'm not mistaken, and Committed and 
Released are also not likely present there. So a patch tool only covers 2/7 
steps by itself.

There are a number of bug tools which have email integration, Bugzilla or 
Debian BTS. The Debian BTS is especially email friendly, since Debian is a big 
email community. Since DPDK is for hackers I think it makes more sense to be 
email friendly than web-friendly like Github or JIRA.

Matthew.


[dpdk-dev] [PATCH v2] rte_mbuf: mbuf bulk alloc/free functions added + unittest

2015-03-23 Thread Ananyev, Konstantin
Hi Vadim,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vadim Suraev
> Sent: Monday, March 23, 2015 5:31 PM
> To: Olivier MATZ
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] rte_mbuf: mbuf bulk alloc/free functions 
> added + unittest
> 
> Hi, Olivier,
> No, I personally need to free a chain using mempool bulk. If I'm not
> mistaken, it has been proposed by one of reviewers to have lower level
> function, so it was done (I'm sorry if misunderstood)

Was it me?
As I remember, I said it would be good to create rte_pktmbuf_bulk_free() or 
rte_pktmbuf_seg_bulk_free() -
that would free a bulk of mbufs segments in the same manner as 
rte_pktmbuf_free_chain() does:
count number of consecutive mbufs from the same mempool to be freed and then 
put them back into the pool at one go.
Such function would be useful inside PMD code.
In fact we already using analogy of such function inside vPMD TX code.
Though from my point, such function should be generic as 
rte_pktmbuf_free_chain() -
no special limitations like all mbufs from one pool, refcnt==1, etc.
So if it was me who confused you - I am sorry.
Konstantin

> Regards,
> Vadim.
> On Mar 23, 2015 8:44 PM, "Olivier MATZ"  wrote:
> 
> > Hi Neil,
> >
> > On 03/19/2015 02:16 PM, Neil Horman wrote:
> > >> On 03/18/2015 09:58 PM, Neil Horman wrote:
> >  +/**
> >  + * Free a bulk of mbufs into its original mempool.
> >  + * This function assumes:
> >  + * - refcnt equals 1
> >  + * - mbufs are direct
> >  + * - all mbufs must belong to the same mempool
> >  + *
> >  + * @param mbufs
> >  + *Array of pointers to mbuf
> >  + * @param count
> >  + *Array size
> >  + */
> >  +static inline void rte_pktmbuf_bulk_free(struct rte_mbuf **mbufs,
> >  +   unsigned count)
> >  +{
> >  +  unsigned idx;
> >  +
> >  +  RTE_MBUF_ASSERT(count > 0);
> >  +
> >  +  for (idx = 0; idx < count; idx++) {
> >  +  RTE_MBUF_ASSERT(mbufs[idx]->pool == mbufs[0]->pool);
> >  +  RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 1);
> >  +  rte_mbuf_refcnt_set(mbufs[idx], 0);
> > >>> This is really a misuse of the API.  The entire point of reference
> > counting is
> > >>> to know when an mbuf has no more references and can be freed.  By
> > forcing all
> > >>> the reference counts to zero here, you allow the refcnt infrastructure
> > to be
> > >>> circumvented, causing memory leaks.
> > >>>
> > >>> I think what you need to do here is enhance the underlying pktmbuf
> > interface
> > >>> such that an rte_mbuf structure has a destructor method association
> > with it
> > >>> which is called when its refcnt reaches zero.  That way the
> > >>> rte_pktmbuf_bulk_free function can just decrement the refcnt on each
> > >>> mbuf_structure, and the pool as a whole can be returned when the
> > destructor
> > >>> function discovers that all mbufs in that bulk pool are freed.
> > >>
> > >> I don't really understand what's the problem here. The API explicitly
> > >> describes the conditions for calling this functions: the segments are
> > >> directs, they belong to the same mempool, and their refcnt is 1.
> > >>
> > >> This function could be useful in a driver which knows that the mbuf
> > >> it allocated matches this conditions. I think an application that
> > >> only uses direct mbufs and one mempool could also use this function.
> > >
> > >
> > > That last condition is my issue with this patch, that the user has to
> > know what
> > > refcnts are.  It makes this api useful for little more than the test
> > case that
> > > is provided with it.  Its irritating enough that for singly allocated
> > mbufs the
> > > user has to know what the refcount of a buffer is before freeing, but at
> > least
> > > they can macrotize a {rte_pktmbuf_refcnt_update;
> > if(rte_pktmbuf_refct_read) then
> > > free} operation.
> > >
> > > With this, you've placed the user in charge of not only doing that, but
> > also of
> > > managing the relationship between pktmbufs and the pool they came from.
> > while
> > > that makes sense for the test case, it really doesn't in any general use
> > case in
> > > which packet processing is ever deferred or queued, because it means
> > that the
> > > application is now responsible for holding a pointer to every packet it
> > > allocates and checking its refcount periodically until it completes.
> > >
> > > There is never any reason that an application won't need to do this
> > management,
> > > so making it the purview of the application to handle rather than
> > properly
> > > integrating that functionality in the library is really a false savings.
> >
> > There are some places where you know that the prerequisites are met,
> > so you can save cycles by using this function.
> >
> > From what I imagine, if in a driver you allocate mbufs, chain them and
> > for some reason you realize you have

[dpdk-dev] [PATCH v2] librte_pmd_bond: remove memory alloc for rte_pci_driver

2015-03-23 Thread Jia Yu
eth_driver already contains rte_pci_driver data structure.
Allocating rte_pci_driver without referencing it after bond
creation causes memory leakage.

Added signed off information.

Signed-off-by: Jia Yu 
---
 lib/librte_pmd_bond/rte_eth_bond_api.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pmd_bond/rte_eth_bond_api.c 
b/lib/librte_pmd_bond/rte_eth_bond_api.c
index 903b7c3..13f3941 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_api.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_api.c
@@ -230,11 +230,8 @@ rte_eth_bond_create(const char *name, uint8_t mode, 
uint8_t socket_id)
goto err;
}

-   pci_drv = rte_zmalloc_socket(name, sizeof(*pci_drv), 0, socket_id);
-   if (pci_drv == NULL) {
-   RTE_BOND_LOG(ERR, "Unable to malloc pci_drv on socket");
-   goto err;
-   }
+   pci_drv = ð_drv->pci_drv;
+
pci_id_table = rte_zmalloc_socket(name, sizeof(*pci_id_table), 0, 
socket_id);
if (pci_id_table == NULL) {
RTE_BOND_LOG(ERR, "Unable to malloc pci_id_table on socket");
@@ -266,9 +263,7 @@ rte_eth_bond_create(const char *name, uint8_t mode, uint8_t 
socket_id)
pci_dev->numa_node = socket_id;
pci_drv->name = driver_name;

-   eth_drv->pci_drv = (struct rte_pci_driver)(*pci_drv);
eth_dev->driver = eth_drv;
-
eth_dev->data->dev_private = internals;
eth_dev->data->nb_rx_queues = (uint16_t)1;
eth_dev->data->nb_tx_queues = (uint16_t)1;
@@ -325,8 +320,6 @@ rte_eth_bond_create(const char *name, uint8_t mode, uint8_t 
socket_id)
 err:
if (pci_dev)
rte_free(pci_dev);
-   if (pci_drv)
-   rte_free(pci_drv);
if (pci_id_table)
rte_free(pci_id_table);
if (eth_drv)
-- 
1.9.1