[dpdk-dev] EAL: memzone_reserve_aligned_thread_unsafe(): No more room in config

2016-05-19 Thread 张伟
Hi all, 


When using the DPDK multi-process client-server example, I create many clients. 
Once the number of clients reached 1239, I hit this error:

EAL: memzone_reserve_aligned_thread_unsafe(): No more room in config

RING: Cannot reserve memory

EAL: Error - exiting with code: 1

  Cause: Cannot create tx ring queue for client 1239

I have 32G of hugepage memory. Can anyone give some guidance on how to increase the 
memzone memory? Which parameter should I adjust? 


[dpdk-dev] [PATCH] mbuf: make rearm_data address naturally aligned

2016-05-19 Thread Jerin Jacob
On Wed, May 18, 2016 at 05:43:00PM +0100, Bruce Richardson wrote:
> On Wed, May 18, 2016 at 07:27:43PM +0530, Jerin Jacob wrote:
> > To avoid multiple stores on fast path, Ethernet drivers
> > aggregate the writes to data_off, refcnt, nb_segs and port
> > to an uint64_t data and write the data in one shot
> > with uint64_t* at &mbuf->rearm_data address.
> > 
> > Some of the non-IA platforms have store operation overhead
> > if the store address is not naturally aligned. This patch
> > fixes the performance issue on those targets.
> > 
> > Signed-off-by: Jerin Jacob 
> > ---
> > 
> > Tested this patch on IA and non-IA(ThunderX) platforms.
> > This patch shows 400Kpps/core improvement on ThunderX + ixgbe + vector 
> > environment.
> > and this patch does not have any overhead on IA platform.
> > 
> > Have tried another similar approach by replacing "buf_len" with "pad"
> > (in this patch context).
> > Since it has additional overhead on read and then mask to keep "buf_len" intact,
> > not much improvement is shown.
> > ref: http://dpdk.org/ml/archives/dev/2016-May/038914.html
> > 
> > ---
> While this will work and from your tests doesn't seem to have a performance
> impact, I'm not sure I particularly like it. It's extending out the end of
> cacheline0 of the mbuf by 16 bytes, though I suppose it's not technically
> using up any more space of it.

Extending by 2 bytes, right? Yes, I guess. Now we are using only 56 out of 64
bytes in the first 64-byte cache line.

> 
> What I'm wondering about though, is do we have any usecases where we need a
> variable buf_len for packets for RX. These mbufs come directly from a mempool,
> which is generally understood to be a set of fixed-sized buffers. I realise that
> this change was made in the past after some discussion, but one of the key points
> there [at least to my reading] was that - even though nobody actually made a
> concrete case where they had variable-sized buffers - having support for them
> made no performance difference.
> 
> The latter part of that has now changed, and supporting variable-sized mbufs
> from an mbuf pool has a perf impact. Do we definitely need that functionality,
> because the easiest fix here is just to move the rxrearm marker back above
> mbuf_len as it was originally in releases like 1.8?

And initialize the buf_len with mp->elt_size - sizeof(struct rte_mbuf).
Right?

I don't have a strong opinion on this, I can do this if there is no
objection on this. Let me know.

However, I do see that in future "buf_len" may belong at the end of the first
64-byte cache line, as "port" is currently defined as uint8_t, which IMO is too small.
We may need to increase it to uint16_t. The reason I think so is that
ThunderX HW currently has 128 VFs per socket for the
built-in NIC, so a two-node configuration plus one external PCIe NW card
can easily go beyond 256 ports.

> 
> Regards,
> /Bruce
> 
> Ref: http://dpdk.org/ml/archives/dev/2014-December/009432.html
> 
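
The alignment concern discussed above can be illustrated with a small sketch. The struct below is a hypothetical layout made up for this note (it is NOT the real struct rte_mbuf), and the demo_* names are assumptions. It shows why an 8-byte store starting at buf_len (offset 16) is naturally aligned while one starting at data_off (offset 18) is not, and how the rearm fields can be written in one shot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout for illustration only; NOT the real struct rte_mbuf. */
struct demo_mbuf {
	uint64_t buf_addr;     /* offset 0 */
	uint64_t buf_physaddr; /* offset 8 */
	uint16_t buf_len;      /* offset 16: 8-byte aligned */
	uint16_t data_off;     /* offset 18: not 8-byte aligned */
	uint16_t refcnt;       /* offset 20 */
	uint8_t  nb_segs;      /* offset 22 */
	uint8_t  port;         /* offset 23 */
	uint64_t ol_flags;     /* offset 24 */
};

/* The 8 bytes that get rewritten together when a buffer is re-armed. */
struct demo_rearm {
	uint16_t buf_len;
	uint16_t data_off;
	uint16_t refcnt;
	uint8_t  nb_segs;
	uint8_t  port;
};

/* A store of `size` bytes is naturally aligned if its offset is a multiple of `size`. */
static int demo_is_naturally_aligned(size_t offset, size_t size)
{
	return offset % size == 0;
}

/* Write all rearm fields as one 64-bit word starting at buf_len.
 * Fixed-size memcpy calls like these are usually compiled to plain loads/stores. */
static void demo_rearm_one_shot(struct demo_mbuf *m, const struct demo_rearm *init)
{
	uint64_t word;

	assert(sizeof(*init) == sizeof(word));
	memcpy(&word, init, sizeof(word));
	memcpy((char *)m + offsetof(struct demo_mbuf, buf_len), &word, sizeof(word));
}
```

In this sketch buf_len sits inside the rewritten word, which corresponds to the variant where buf_len is initialized to a fixed value for every buffer in the pool.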


[dpdk-dev] Query on RSS Rule

2016-05-19 Thread Lu, Wenzhuo
Hi Nishant,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Nishant Verma
> Sent: Thursday, May 19, 2016 7:06 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Query on RSS Rule
> 
> Hi All,
> 
> It's a very basic question, but somehow I am blocked by this issue.
> Please help me out.
> 
> I have configured an NTUPLE filter in my application with just the Destination IP;
> everything else (SRC IP, S_PORT, D_PORT & proto) is disabled.
I suppose you're using the 5-tuple filter, right? And an igb or ixgbe NIC, as
5-tuple is only supported by igb/ixgbe?
Could you let us know what you've done? I mean how you disabled the
other fields.
I think you might set the mask to do that. Please be aware that if the mask is FF,
the field is used. To ignore a field, its mask should be 0.

> But whenever I send a packet from a different machine, i.e. with a different Source IP,
> the hash value on the DPDK app side changes, and as a result the packet arrives at
> a different queue.
> 
> Any hint would be appreciated.
> 
> Thanks
> 
> --
> Rgds,
> NV
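
The mask rule described in the reply (all-ones mask: the field is compared; zero mask: the field is ignored) can be modelled in software as below. This is only a sketch of the matching semantics: the demo_* types and functions are hypothetical and are not the DPDK ntuple filter API or the ixgbe register layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5-tuple, for illustration only. */
struct demo_5tuple {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
	uint8_t  proto;
};

/* Hypothetical filter: a value, a per-field mask, and a target queue. */
struct demo_5tuple_filter {
	struct demo_5tuple value;
	struct demo_5tuple mask; /* all ones = compare field, 0 = ignore field */
	uint16_t queue;
};

/* Build a filter that compares only the destination IP. */
static struct demo_5tuple_filter demo_dst_ip_only(uint32_t dst_ip, uint16_t queue)
{
	struct demo_5tuple_filter f = { { 0, 0, 0, 0, 0 }, { 0, 0, 0, 0, 0 }, 0 };

	f.value.dst_ip = dst_ip;
	f.mask.dst_ip = UINT32_MAX; /* every other mask stays 0 */
	f.queue = queue;
	return f;
}

/* Return 1 if every masked-in field of the packet equals the filter value. */
static int demo_filter_match(const struct demo_5tuple_filter *f,
			     const struct demo_5tuple *p)
{
	assert(f != NULL && p != NULL);
	return ((p->src_ip ^ f->value.src_ip) & f->mask.src_ip) == 0 &&
	       ((p->dst_ip ^ f->value.dst_ip) & f->mask.dst_ip) == 0 &&
	       ((p->src_port ^ f->value.src_port) & f->mask.src_port) == 0 &&
	       ((p->dst_port ^ f->value.dst_port) & f->mask.dst_port) == 0 &&
	       ((p->proto ^ f->value.proto) & f->mask.proto) == 0;
}
```

With a filter built by demo_dst_ip_only(), packets from different source addresses still match as long as the destination IP is the same, which is the behaviour the original poster expected.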


[dpdk-dev] [PATCH v4] eal: make hugetlb initialization more robust

2016-05-19 Thread Tan, Jianfeng
Hi David,


On 5/18/2016 12:39 AM, David Marchand wrote:
> Hello Jianfeng,
>
> On Thu, May 12, 2016 at 2:44 AM, Jianfeng Tan wrote:
>> This patch adds an option, --huge-trybest, to use a recover mechanism to
>> the case that there are not so many hugepages (declared in sysfs), which
>> can be used. It relys on a mem access to fault-in hugepages, and if fails
>> with SIGBUS, recover to previously saved stack environment with
>> siglongjmp().
>>
>> Besides, this solution fixes an issue when hugetlbfs is specified with an
>> option of size. Currently DPDK does not respect the quota of a hugetblfs
>> mount. It fails to init the EAL because it tries to map the number of free
>> hugepages in the system rather than using the number specified in the quota
>> for that mount.
>>
>> It's still an open issue with CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS. Under
>> this case (such as IVSHMEM target), having hugetlbfs mounts with quota will
>> fail to remap hugepages as it relies on having mapped all free hugepages
>> in the system.
>
>
>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
>> b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> index 5b9132c..8c77010 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> @@ -417,12 +434,33 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
>>  hugepg_tbl[i].final_va = virtaddr;
>>  }
>>
>> +   if (orig && internal_config.huge_trybest) {
>> +   /* In linux, hugetlb limitations, like cgroup, are
>> +* enforced at fault time instead of mmap(), even
>> +* with the option of MAP_POPULATE. Kernel will send
>> +* a SIGBUS signal. To avoid to be killed, save stack
>> +* environment here, if SIGBUS happens, we can jump
>> +* back here.
>> +*/
>> +   if (wrap_sigsetjmp()) {
>> +   RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
>> +   "hugepages of size %u MB\n",
>> +   (unsigned)(hugepage_sz / 0x10));
> For such a case, maybe having some warning log message when it
> fails would help the user.
> + a known issue in the release notes ?

Do you mean that when SIGBUS is triggered, as here, we warn the user that "it 
fails to hold all free hugepages as sysfs shows", and
#ifdef RTE_EAL_SINGLE_FILE_SEGMENTS
/* we need to return an error from rte_eal_init_memory */
#endif

Thanks,
Jianfeng


[dpdk-dev] [PATCH v4] eal: make hugetlb initialization more robust

2016-05-19 Thread Tan, Jianfeng
Hi Thomas & Sergio,


On 5/18/2016 4:06 PM, Sergio Gonzalez Monroy wrote:
> On 17/05/2016 17:40, Thomas Monjalon wrote:
>> 2016-05-12 00:44, Jianfeng Tan:
>>> This patch adds an option, --huge-trybest, to use a recover mechanism to
>>> the case that there are not so many hugepages (declared in sysfs), which
>>> can be used. It relys on a mem access to fault-in hugepages, and if fails
>> relys -> relies
>>
>>> with SIGBUS, recover to previously saved stack environment with
>>> siglongjmp().
>>>
>>> Besides, this solution fixes an issue when hugetlbfs is specified with an
>>> option of size. Currently DPDK does not respect the quota of a hugetblfs
>>> mount. It fails to init the EAL because it tries to map the number of free
>>> hugepages in the system rather than using the number specified in the quota
>>> for that mount.
>> It looks to be a bug. Why adding an option?
>> What is the benefit of the old behaviour, not using --try-best?
>
> I do not see any benefit to the old behavior.
> Given that we need the signal handling for the cgroup use case, I would be
> inclined to use this method as the default instead of trying to figure out
> how many hugepages we have free, etc.
>
> Thoughts?

I am also inclined to make this method the default, with some warning logs as 
suggested by David, and to return an error from rte_eal_memory() when SIGBUS 
is triggered in the RTE_EAL_SINGLE_FILE_SEGMENTS case.

Thomas, all other trivial issues will be fixed in next version. Thank you!

Thanks,
Jianfeng

>
> Sergio
>
>>> +static sigjmp_buf jmpenv;
>>> +
>>> +static void sigbus_handler(int signo __rte_unused)
>>> +{
>>> +siglongjmp(jmpenv, 1);
>>> +}
>>> +
>>> +/* Put setjmp into a wrap method to avoid compiling error. Any non-volatile,
>>> + * non-static local variable in the stack frame calling sigsetjmp might be
>>> + * clobbered by a call to longjmp.
>>> + */
>>> +static int wrap_sigsetjmp(void)
>>> +{
>>> +return sigsetjmp(jmpenv, 1);
>>> +}
>> Please add the word "huge" to these variables and functions.
>>
>>> +static struct sigaction action_old;
>>> +static int need_recover;
>>> +
>>> +static void
>>> +register_sigbus(void)
>>> +{
>>> +sigset_t mask;
>>> +struct sigaction action;
>>> +
>>> +sigemptyset(&mask);
>>> +sigaddset(&mask, SIGBUS);
>>> +action.sa_flags = 0;
>>> +action.sa_mask = mask;
>>> +action.sa_handler = sigbus_handler;
>>> +
>>> +need_recover = !sigaction(SIGBUS, &action, &action_old);
>>> +}
>>> +
>>> +static void
>>> +recover_sigbus(void)
>>> +{
>>> +if (need_recover) {
>>> +sigaction(SIGBUS, &action_old, NULL);
>>> +need_recover = 0;
>>> +}
>>> +}
>> Idem, Please add the word "huge".
>>
>



[dpdk-dev] [PATCH] eal: add option --avail-cores to detect lcores

2016-05-19 Thread Tan, Jianfeng
Hi David,


On 5/18/2016 8:46 PM, David Marchand wrote:
> Hello Jianfeng,
>
> On Wed, Mar 9, 2016 at 2:05 PM, Panu Matilainen wrote:
>> On 03/08/2016 07:38 PM, Tan, Jianfeng wrote:
>>> Hi Panu,
>>>
>>> On 3/8/2016 4:54 PM, Panu Matilainen wrote:
>>>> On 03/04/2016 12:05 PM, Jianfeng Tan wrote:
>>>>> This patch adds option, --avail-cores, to use lcores which are available
>>>>> by calling pthread_getaffinity_np() to narrow down detected cores before
>>>>> parsing coremask (-c), corelist (-l), and coremap (--lcores).
>>>>>
>>>>> Test example:
>>>>> $ taskset 0xc ./examples/helloworld/build/helloworld \
>>>>>  --avail-cores -m 1024
>>>>>
>>>>> Signed-off-by: Jianfeng Tan
>>>>> Acked-by: Neil Horman
>>>>
>>>> Hmm, to me this sounds like something that should be done always so
>>>> there's no need for an option. Or if there's a chance it might do the
>>>> wrong thing in some rare circumstance then perhaps there should be a
>>>> disabler option instead?
>>>
>>> Thanks for comments.
>>>
>>> Yes, there's a use case that we cannot handle.
>>>
>>> If we make it as default, DPDK applications may fail to start, when user
>>> specifies a core in isolcpus and its parent process (say bash) has a
>>> cpuset affinity that excludes isolcpus. Originally, DPDK applications
>>> just blindly do pthread_setaffinity_np() and it always succeeds because
>>> it always has root privilege to change any cpu affinity.
>>>
>>> Now, if we do the checking in rte_eal_cpu_init(), those lcores will be
>>> flagged as undetected (in my older implementation) and leads to failure.
>>> To make it correct, we would always add "taskset mask" (or other ways)
>>> before DPDK application cmd lines.
>>>
>>> How do you think?
>>
>> I still think it sounds like something that should be done by default and
>> maybe be overridable with some flag, rather than the other way around.
>> Another alternative might be detecting the cores always but if running as
>> root, override but with a warning.
>>
>> But I dont know, just wondering. To look at it from another angle: why would
>> somebody use this new --avail-cores option and in what situation, if things
>> "just work" otherwise anyway?
> +1 and I don't even see why we should have an option to disable this,
> since taskset would do the job.
>
> Looking at your special case, if the user did set an isolcpus option
> for another use, with no -c/-l, I understand the dpdk application
> won't care too much about it.
> So, this seems like somehow rude to the rest of the system and unwanted.

The case you mention above is not the one I meant, but you make a fair 
point about it.
The case I originally meant: the user sets an isolcpus option for DPDK 
applications. Previously, DPDK apps would start without any 
problem. With this change, they fail to start because the required cores are 
excluded before -c/-l is parsed. Following your comments below, we can add a 
warning message (or should we quit in this situation?). But it does 
affect existing users (they would have to change to "taskset 
./dpdk_app ..."). Do you think that's a problem?

Thanks,
Jianfeng


>
> We can still help the user starting its application as root (without
> taskset) by adding a warning message if a requested cpu (-c / -l ..)
> is not part of the available cpus.
>
>
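
The warning suggested at the end of this message can be sketched as follows. This is a minimal Linux-only illustration with hypothetical demo_* names, not EAL code. It uses sched_getaffinity() on the calling thread rather than the pthread_getaffinity_np() call named in the patch, purely so the example needs no pthread linkage.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for sched_getaffinity() and the CPU_* macros on glibc */
#endif
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Number of CPUs the calling thread may run on, or -1 on error. */
static int demo_count_avail_cpus(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	return CPU_COUNT(&set);
}

/* 1 if `cpu` is in the calling thread's affinity mask, 0 if not, -1 on error. */
static int demo_cpu_is_available(int cpu)
{
	cpu_set_t set;

	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return 0;
	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	return CPU_ISSET(cpu, &set) ? 1 : 0;
}

/* Print a warning for each requested core that is not known to be
 * available. Returns the number of warnings printed. */
static int demo_warn_unavailable(const int *cores, int n)
{
	int i, warnings = 0;

	assert(cores != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (demo_cpu_is_available(cores[i]) != 1) {
			fprintf(stderr, "warning: requested core %d is not in the "
				"affinity mask of this process\n", cores[i]);
			warnings++;
		}
	}
	return warnings;
}
```

Whether an application should only warn or refuse to start in that situation is exactly the open question in the thread; the sketch just reports the mismatch.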



[dpdk-dev] Query on RSS Rule

2016-05-19 Thread Lu, Wenzhuo
Hi Nishant,

From: Nishant Verma [mailto:vnis...@gmail.com]
Sent: Thursday, May 19, 2016 10:29 AM
To: Lu, Wenzhuo
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Query on RSS Rule


Hi Wenzhuo,
Thanks for the reply. Yes, i am using ixgbe.

On software front, this is what i am doing.

I am using DPDK 16.04 and pktgen 3.0.00

On my DPDK machine, i have configured RSS rule just for Destination IP 
(172.10.10.2).

[rss]


[dpdk-dev] Query on RSS Rule

2016-05-19 Thread Lu, Wenzhuo
Hi Nishant,

From: Nishant Verma [mailto:vnis...@gmail.com]
Sent: Thursday, May 19, 2016 11:21 AM
To: Lu, Wenzhuo
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Query on RSS Rule

Hi Wenzhuo,
Tried UDP as well as TCP.
I also used the dump function to check whether the packet is correct or not. I found the packet 
perfectly fine,
but the problem still remains the same.
Wenzhuo: Glad to know the problem is not related to the protocol. I don't find 
anything wrong in your code, except I'm not sure if the mask for dst_ip is 
right. Supposing it's right, it seems we have to check whether the registers are right. I 
mean the registers in the function ixgbe_add_5tuple_filter. Not sure if it's 
easy to check them in your app. If not, maybe you can try testpmd first.

On Wed, May 18, 2016 at 11:09 PM, Lu, Wenzhuo <wenzhuo.lu at intel.com> wrote:
Hi Nishant,

From: Nishant Verma [mailto:vnish11 at gmail.com]
Sent: Thursday, May 19, 2016 10:29 AM
To: Lu, Wenzhuo
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Query on RSS Rule


Hi Wenzhuo,
Thanks for the reply. Yes, i am using ixgbe.

On software front, this is what i am doing.

I am using DPDK 16.04 and pktgen 3.0.00

On my DPDK machine, i have configured RSS rule just for Destination IP 
(172.10.10.2).

[rss]


[dpdk-dev] [PATCH 4/4] pmd_hw_support.py: Add tool to query binaries for hw support information

2016-05-19 Thread Panu Matilainen
On 05/18/2016 04:48 PM, Neil Horman wrote:
> On Wed, May 18, 2016 at 03:48:12PM +0300, Panu Matilainen wrote:
>> On 05/18/2016 03:03 PM, Neil Horman wrote:
>>> On Wed, May 18, 2016 at 02:48:30PM +0300, Panu Matilainen wrote:
>>>> On 05/16/2016 11:41 PM, Neil Horman wrote:
>>>>> This tool searches for the primer string PMD_DRIVER_INFO= in any ELF binary,
>>>>> and, if found parses the remainder of the string as a json encoded string,
>>>>> outputting the results in either a human readable or raw, script parseable
>>>>> format
>>>>>
>>>>> Signed-off-by: Neil Horman
>>>>> CC: Bruce Richardson
>>>>> CC: Thomas Monjalon
>>>>> CC: Stephen Hemminger
>>>>> CC: Panu Matilainen
>>>>> ---
>>>>>  tools/pmd_hw_support.py | 174
>>>>>  1 file changed, 174 insertions(+)
>>>>>  create mode 100755 tools/pmd_hw_support.py
>>>>>
>>>>> diff --git a/tools/pmd_hw_support.py b/tools/pmd_hw_support.py
>>>>> new file mode 100755
>>>>> index 000..0669aca
>>>>> --- /dev/null
>>>>> +++ b/tools/pmd_hw_support.py
>>>>> @@ -0,0 +1,174 @@
>>>>> +#!/usr/bin/python3
>>>>
>>>> I think this should use /usr/bin/python to be consistent with the other
>>>> python scripts, and like the others work with python 2 and 3. I only tested
>>>> it with python2 after changing this and it seemed to work fine so the
>>>> compatibility side should be fine as-is.

>>> Sure, I can change the python executable, that makes sense.
>>>
>>>> On the whole, AFAICT the patch series does what it promises, and works for
>>>> both static and shared linkage. Using JSON formatted strings in an ELF
>>>> section is a sound working technical solution for the storage of the data.
>>>> But the difference between the two cases makes me wonder about this all...
>>> You mean the difference between checking static binaries and dynamic binaries?
>>> yes, there is some functional difference there
>>>

>>>> For static library build, you'd query the application executable, eg
>>> Correct.
>>>
>>>> testpmd, to get the data out. For a shared library build, that method gives
>>>> absolutely nothing because the data is scattered around in individual
>>>> libraries which might be just about wherever, and you need to somehow
>>> Correct, I figured that users would be smart enough to realize that with
>>> dynamically linked executables, they would need to look at DSO's, but I agree,
>>> it's a glaring difference.
>>
>> Being able to look at DSOs is good, but expecting the user to figure out
>> which DSOs might be loaded and not and where to look is going to be well
>> above many users. At very least it's not what I would call user-friendly.
>>
> I disagree, there is no linkage between an application and the dso's it opens
> via dlopen that is exportable.  The only way to handle that is to have a
> standard search path for the pmd_hw_info python script.  That's just like modinfo
> works (i.e. "modinfo bnx2" finds the bnx2 module for the running kernel).  We
> can of course do something similar, but we have no existing implicit path
> information to draw from to do that (because you can have multiple dpdk installs
> side by side).  The only way around that is to explicitly call out the path on
> the command line.

There's no telling what libraries user might load at runtime with -D, 
that is true for both static and shared libraries.

When CONFIG_RTE_EAL_PMD_PATH is set, as it is likely to be on distro 
builds, you *know* that everything in that path will be loaded on 
runtime regardless of what commandline options there might be so the 
situation is actually on par with static builds. Of course you still 
dont know about ones added with -D but that's a limitation of any 
solution that works without actually running the app.

>
>>>> discover the location + correct library files to be able to query that. For
>>>> the shared case, perhaps the script could be taught to walk files in
>>>> CONFIG_RTE_EAL_PMD_PATH to give in-the-ballpark correct/identical results
>>> My initial thought would be to run ldd on the executable, and use a heuristic to
>>> determine relevant pmd DSO's, and then feed each of those through the python
>>> script.  I didn't want to go to that trouble unless there was consensus on it
>>> though.
>>
>> Problem is, ldd doesn't know about them either because the pmds are not
>> linked to the executables at all anymore. They could be force-linked of
>> course, but that means giving up the flexibility of plugins, which IMO is a
>> no-go. Except maybe as an option, but then that would be a third case to
>> support.
>>
> That's not true at all, or at least it's a perfectly valid way to link the DSO's
> in at link time via -lrte_pmd_.  It's really just the dlopen case we need
> to worry about.  I would argue that, if they're not explicitly linked in like
> that, then its correct to indicate that an application supports no hardware,
> because it actually doesn't, it only supports the p
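
The scanning step described in the patch summary (find the primer string PMD_DRIVER_INFO= in a binary image, then treat the rest of that string as the payload) can be sketched as below. The proposed tool itself is a Python script; this C version with hypothetical demo_* names only illustrates the search over a raw byte buffer and does not parse ELF sections or JSON.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PRIMER "PMD_DRIVER_INFO="

/* Search buf[*pos .. len) for the next primer string.
 * On success, return a pointer to the NUL-terminated payload that follows
 * the primer and advance *pos past that payload. Return NULL when no
 * further complete entry exists. */
static const char *demo_find_pmd_info(const char *buf, size_t len, size_t *pos)
{
	const size_t plen = strlen(DEMO_PRIMER);
	size_t i;

	assert(buf != NULL && pos != NULL);
	for (i = *pos; i + plen <= len; i++) {
		const char *payload;
		const char *end;

		if (memcmp(buf + i, DEMO_PRIMER, plen) != 0)
			continue;
		payload = buf + i + plen;
		end = memchr(payload, '\0', len - (i + plen));
		if (end == NULL)
			return NULL; /* unterminated payload: give up */
		*pos = (size_t)(end - buf) + 1;
		return payload;
	}
	*pos = len;
	return NULL;
}
```

Calling the function repeatedly with the same pos variable yields every embedded entry in turn, which is how a static binary containing several drivers could be enumerated.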

[dpdk-dev] [PATCH 19/20] thunderx/nicvf: updated driver documentation and release notes

2016-05-19 Thread Jerin Jacob
On Tue, May 17, 2016 at 04:31:58PM +, Mcnamara, John wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > Sent: Saturday, May 7, 2016 4:17 PM
> > To: dev at dpdk.org
> > Cc: thomas.monjalon at 6wind.com; Richardson, Bruce; Jerin Jacob;
> > Slawomir Rosek
> > Subject: [dpdk-dev] [PATCH 19/20] thunderx/nicvf: updated driver
> > documentation and release notes
> 
> Hi,
> 
> Very good documentation. The content is quite clear and almost no RST issues.
> The only comment is on some of the long lines. In general console blocks
> have to be wrapped at 80 chars or else they go off the page in the PDF docs.
> I see that you did that in some places but not in others.
> 
> It is worth building the pdf docs to check for that:
> 
> make doc-guides-pdf
> mupdf build/doc/pdf/guides/nics.pdf &
> 
> Some minor comments below:

Thanks John for the review. Will fix it in v2.

> 
> 
> > +
> > +#. Start ``testpmd`` with basic parameters:
> > +
> > +   .. code-block:: console
> > +
> > +  ./arm64-thunderx-linuxapp-gcc/app/testpmd -c 0xf -n 4 -w
> > + 0002:01:00.2 -- -i --disable-hw-vlan-filter --crc-strip --no-flush-rx
> > + --port-topology=loop
> 
> Would be better wrapped as something like this:
> 
>.. code-block:: console
> 
>   ./arm64-thunderx-linuxapp-gcc/app/testpmd -c 0xf -n 4 -w 0002:01:00.2 \
>   -- -i --disable-hw-vlan-filter --crc-strip --no-flush-rx \
>   --port-topology=loop
> 
> 
> > +
> > +   Example output:
> > +
> > +   .. code-block:: console
> > +
> > +  ...
> > +
> > +  PMD: rte_nicvf_pmd_init(): librte_pmd_thunderx nicvf version 1.0
> > +
> > +  ...
> > +  EAL:   probe driver: 177d:11 rte_nicvf_pmd
> > +  EAL:   using IOMMU type 1 (Type 1)
> > +  EAL:   PCI memory mapped at 0x3ffade5
> > +  EAL: Trying to map BAR 4 that contains the MSI-X table. Trying offsets: 0x400:0x, 0x1:0x1f
> > +  EAL:   PCI memory mapped at 0x3ffadc6
> > +  PMD: nicvf_eth_dev_init(): nicvf: device (177d:11) 2:1:0:2
> > +  PMD: nicvf_eth_dev_init(): node=0 vf=1 mode=tns-bypass sqs=false loopback_supported=true
> > +  PMD: nicvf_eth_dev_init(): Port 0 (177d:11) mac=a6:c6:d9:17:78:01
> > +  Interactive-mode selected
> > +  Configuring Port 0 (socket 0)
> 
> 
> Also, this should be wrapped (even though it is the actual output):
> 
>   ...
>   EAL:   probe driver: 177d:11 rte_nicvf_pmd
>   EAL:   using IOMMU type 1 (Type 1)
>   EAL:   PCI memory mapped at 0x3ffade5
>   EAL: Trying to map BAR 4 that contains the MSI-X table.
>Trying offsets: 0x400:0x, 0x1:0x1f
>   EAL:   PCI memory mapped at 0x3ffadc6
>   PMD: nicvf_eth_dev_init(): nicvf: device (177d:11) 2:1:0:2
>   PMD: nicvf_eth_dev_init(): node=0 vf=1 mode=tns-bypass sqs=false
>loopback_supported=true
>   PMD: nicvf_eth_dev_init(): Port 0 (177d:11) mac=a6:c6:d9:17:78:01
>   Interactive-mode selected
>   Configuring Port 0 (socket 0)
>   ...
> 
> 
> > +SR-IOV: Prerequisites and sample Application Notes
> > +~~
> > +
> > +Current ThunderX NIC PF/VF kernel modules maps each physical Ethernet
> > +port automatically to virtual function (VF) and presented as PCIe-like SR-IOV device.
> 
> 
> Slightly better as:
> 
> Current ThunderX NIC PF/VF kernel modules maps each physical Ethernet port
> automatically to virtual functions (VF) and presents them as PCIe-like
> SR-IOV device.
> 
> 
> > +   Example qemu guest launch command:
> > +
> > +   .. code-block:: console
> > +
> > +  sudo qemu-system-aarch64 -name vm1 -machine virt,gic_version=3,accel=kvm,usb=off \
> > +  -cpu host -m 4096 \
> > +  -smp 4,sockets=1,cores=8,threads=1 \
> > +  -nographic -nodefaults \
> > +  -kernel  \
> 
> Also wrap the first line:
> 
>.. code-block:: console
> 
>   sudo qemu-system-aarch64 -name vm1 \
>   -machine virt,gic_version=3,accel=kvm,usb=off \
>   -cpu host -m 4096 \
>   ...
> 
> 
> Apart from those small changes:
> 
> Acked-by: John McNamara 
> 
> 
> 
> 


[dpdk-dev] [PATCH v2] mbuf: add helpers to prefetch mbuf

2016-05-19 Thread Jerin Jacob
On Wed, May 18, 2016 at 06:02:08PM +0200, Olivier Matz wrote:
> Some architectures (ex: Power8) have a cache line size of 128 bytes,
> so the drivers should not expect that prefetching the second part of
> the mbuf with rte_prefetch0(&m->cacheline1) is valid.
> 
> This commit adds helpers that can be used by drivers to prefetch the
> rx or tx part of the mbuf, whatever the cache line size.
> 
> Signed-off-by: Olivier Matz 

Reviewed-by: Jerin Jacob 

> ---
> 
> v1 -> v2:
> - rename part0 as part1 and part1 as part2, as suggested by Thomas
> 
> 
>  drivers/net/fm10k/fm10k_rxtx_vec.c |  8 
>  drivers/net/i40e/i40e_rxtx_vec.c   |  8 
>  drivers/net/ixgbe/ixgbe_rxtx_vec.c |  8 
>  drivers/net/mlx4/mlx4.c|  4 ++--
>  drivers/net/mlx5/mlx5_rxtx.c   |  4 ++--
>  examples/ipsec-secgw/ipsec-secgw.c |  2 +-
>  lib/librte_mbuf/rte_mbuf.h | 38 
> ++
>  7 files changed, 55 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c b/drivers/net/fm10k/fm10k_rxtx_vec.c
> index 03e4a5c..ef256a5 100644
> --- a/drivers/net/fm10k/fm10k_rxtx_vec.c
> +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
> @@ -487,10 +487,10 @@ fm10k_recv_raw_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>   rte_compiler_barrier();
>  
>   if (split_packet) {
> - rte_prefetch0(&rx_pkts[pos]->cacheline1);
> - rte_prefetch0(&rx_pkts[pos + 1]->cacheline1);
> - rte_prefetch0(&rx_pkts[pos + 2]->cacheline1);
> - rte_prefetch0(&rx_pkts[pos + 3]->cacheline1);
> + rte_mbuf_prefetch_part2(rx_pkts[pos]);
> + rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
> + rte_mbuf_prefetch_part2(rx_pkts[pos + 2]);
> + rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
>   }
>  
>   /* D.1 pkt 3,4 convert format from desc to pktmbuf */
> diff --git a/drivers/net/i40e/i40e_rxtx_vec.c b/drivers/net/i40e/i40e_rxtx_vec.c
> index f7a62a8..eef80d9 100644
> --- a/drivers/net/i40e/i40e_rxtx_vec.c
> +++ b/drivers/net/i40e/i40e_rxtx_vec.c
> @@ -297,10 +297,10 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>   _mm_storeu_si128((__m128i *)&rx_pkts[pos+2], mbp2);
>  
>   if (split_packet) {
> - rte_prefetch0(&rx_pkts[pos]->cacheline1);
> - rte_prefetch0(&rx_pkts[pos + 1]->cacheline1);
> - rte_prefetch0(&rx_pkts[pos + 2]->cacheline1);
> - rte_prefetch0(&rx_pkts[pos + 3]->cacheline1);
> + rte_mbuf_prefetch_part2(rx_pkts[pos]);
> + rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
> + rte_mbuf_prefetch_part2(rx_pkts[pos + 2]);
> + rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
>   }
>  
>   /* avoid compiler reorder optimization */
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> index c4d709b..e97ea82 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> @@ -307,10 +307,10 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>   _mm_storeu_si128((__m128i *)&rx_pkts[pos+2], mbp2);
>  
>   if (split_packet) {
> - rte_prefetch0(&rx_pkts[pos]->cacheline1);
> - rte_prefetch0(&rx_pkts[pos + 1]->cacheline1);
> - rte_prefetch0(&rx_pkts[pos + 2]->cacheline1);
> - rte_prefetch0(&rx_pkts[pos + 3]->cacheline1);
> + rte_mbuf_prefetch_part2(rx_pkts[pos]);
> + rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
> + rte_mbuf_prefetch_part2(rx_pkts[pos + 2]);
> + rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
>   }
>  
>   /* avoid compiler reorder optimization */
> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> index c5d8535..733d192 100644
> --- a/drivers/net/mlx4/mlx4.c
> +++ b/drivers/net/mlx4/mlx4.c
> @@ -3235,8 +3235,8 @@ mlx4_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
>* Fetch initial bytes of packet descriptor into a
>* cacheline while allocating rep.
>*/
> - rte_prefetch0(seg);
> - rte_prefetch0(&seg->cacheline1);
> + rte_mbuf_prefetch_part1(seg);
> + rte_mbuf_prefetch_part2(seg);
>   ret = rxq->if_cq->poll_length_flags(rxq->cq, NULL, NULL,
>   &flags);
>   if (unlikely(ret < 0)) {
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 1832a21..5be8c62 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
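
One plausible shape for the helpers this patch introduces is sketched below. It is an illustration under stated assumptions, not the code from the patch: struct demo_mbuf, DEMO_CACHE_LINE_SIZE and the demo_* functions are hypothetical, __builtin_prefetch is a GCC/Clang builtin standing in for rte_prefetch0(), and the part2 helper returns an int only so its behaviour can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef DEMO_CACHE_LINE_SIZE
#define DEMO_CACHE_LINE_SIZE 64 /* assumption; the thread cites 128 for Power8 */
#endif

/* Hypothetical two-part descriptor; NOT the real struct rte_mbuf. */
struct demo_mbuf {
	uint8_t part1[64]; /* first 64 bytes of the descriptor */
	uint8_t part2[64]; /* second 64 bytes of the descriptor */
};

/* Prefetch the first part of the descriptor. */
static inline void demo_mbuf_prefetch_part1(const struct demo_mbuf *m)
{
	assert(m != NULL);
	__builtin_prefetch(m->part1, 0, 3);
}

/* Prefetch the second part only if it lives in a different cache line.
 * With 128-byte lines, fetching part1 already brings in part2.
 * Returns 1 if a separate prefetch was issued, 0 otherwise. */
static inline int demo_mbuf_prefetch_part2(const struct demo_mbuf *m)
{
	assert(m != NULL);
#if DEMO_CACHE_LINE_SIZE == 64
	__builtin_prefetch(m->part2, 0, 3);
	return 1;
#else
	return 0;
#endif
}
```

The point of hiding the decision inside the helper is that drivers no longer need to know the cache line size; they call the part1/part2 helpers and the build configuration decides whether the second prefetch is needed.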

[dpdk-dev] [PATCH 01/29] ixgbe/base: add new VF requests for mailbox API

2016-05-19 Thread Gu, YongjieX
Tested-by: Yongjie Gu 

- Check patch: success
- Apply patch: success
- compilation: success
OS: fedora20
GCC: gcc_x86-64, 4.8.3
ICC: 16.0.2
Commit: 84c9b5a9fe926f1aa033dc5352be8d4a5e0b789d
i686-native-linuxapp-icc: compile pass
x86_64-native-linuxapp-gcc-combined: compile pass
i686-native-linuxapp-gcc: compile pass
x86_64-native-linuxapp-gcc: compile pass
x86_64-native-linuxapp-icc: compile pass
x86_64-native-linuxapp-gcc-debug: compile pass
x86_64-native-linuxapp-gcc-shared: compile pass
x86_64-native-linuxapp-clang: compile pass

- dts validation: 
 -- Test Commit: db340cf2ef71af231af67be8e42fd603e4bab0ac
 -- OS/Kernel: Ubuntu 15.04/3.19.0-15-generic
 -- GCC: gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13)
 -- CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
 -- NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb]
 -- total 96, failed 13 (see the attachment for the detailed case list)

Thanks
Yongjie
-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Beilei Xing
Sent: Friday, May 06, 2016 2:07 PM
To: Zhang, Helin
Cc: dev at dpdk.org; Xing, Beilei
Subject: [dpdk-dev] [PATCH 01/29] ixgbe/base: add new VF requests for mailbox API

It adds two new VF requests, IXGBE_VF_GET_RETA and IXGBE_VF_GET_RSS_KEY, to the 
mailbox API.

Signed-off-by: Beilei Xing 
---
 drivers/net/ixgbe/base/ixgbe_mbx.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ixgbe/base/ixgbe_mbx.h b/drivers/net/ixgbe/base/ixgbe_mbx.h
index 4a120a3..d775142 100644
--- a/drivers/net/ixgbe/base/ixgbe_mbx.h
+++ b/drivers/net/ixgbe/base/ixgbe_mbx.h
@@ -109,7 +109,9 @@ enum ixgbe_pfvf_api_rev {
 #define IXGBE_VF_GET_QUEUES0x09 /* get queue configuration */

 /* mailbox API, version 1.2 VF requests */
-#define IXGBE_VF_UPDATE_XCAST_MODE 0x0C
+#define IXGBE_VF_GET_RETA  0x0a /* VF request for RETA */
+#define IXGBE_VF_GET_RSS_KEY   0x0b /* get RSS key */
+#define IXGBE_VF_UPDATE_XCAST_MODE 0x0C

 /* GET_QUEUES return data indices within the mailbox */
 #define IXGBE_VF_TX_QUEUES 1   /* number of Tx queues supported */
--
2.5.0



[dpdk-dev] [PATCHv2 1/4] pmdinfogen: Add buildtools and pmdinfogen utility

2016-05-19 Thread Panu Matilainen
On 05/19/2016 12:08 AM, Neil Horman wrote:
[...]
> + if (strcmp(secname, ".modinfo") == 0) {
> + if (nobits)
> + fprintf(stderr, "%s has NOBITS .modinfo\n", filename);
> + info->modinfo = (void *)hdr + sechdrs[i].sh_offset;
> + info->modinfo_len = sechdrs[i].sh_size;
> + } else if (strcmp(secname, "__ksymtab") == 0)
> + info->export_sec = i;
> + else if (strcmp(secname, "__ksymtab_unused") == 0)
> + info->export_unused_sec = i;
> + else if (strcmp(secname, "__ksymtab_gpl") == 0)
> + info->export_gpl_sec = i;
> + else if (strcmp(secname, "__ksymtab_unused_gpl") == 0)
> + info->export_unused_gpl_sec = i;
> + else if (strcmp(secname, "__ksymtab_gpl_future") == 0)
> + info->export_gpl_future_sec = i;
> +

Looks like a leftover from kernel modpost.c, not needed in DPDK.

- Panu -


[dpdk-dev] [PATCHv2 2/4] drivers: Update driver registration macro usage

2016-05-19 Thread Panu Matilainen
On 05/19/2016 12:08 AM, Neil Horman wrote:
> Modify the PMD_REGISTER_DRIVER macro, bifurcating it into two
> (PMD_REGISTER_DRIVER_PDEV and PMD_REGISTER_DRIVER_VDEV).  Both of these do the
> same thing the original macro did, but both add the definition of a string
> variable that informs interested parties of the name of the pmd, and the former
> also defines a second string that holds the symbol name of the pci table that
> is registered by this pmd.
>
> pmdinfo uses this information to extract hardware support from an object file
> and create a json string to make hardware support info discoverable later.
>
> Signed-off-by: Neil Horman 
> CC: Bruce Richardson 
> CC: Thomas Monjalon 
> CC: Stephen Hemminger 
> CC: Panu Matilainen 
> ---
>  drivers/Makefile   |  2 ++
>  drivers/crypto/aesni_gcm/aesni_gcm_pmd.c   |  4 +++-
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |  4 +++-
>  drivers/crypto/null/null_crypto_pmd.c  |  4 +++-
>  drivers/crypto/qat/rte_qat_cryptodev.c |  4 +++-
>  drivers/crypto/snow3g/rte_snow3g_pmd.c |  4 +++-
>  drivers/net/af_packet/rte_eth_af_packet.c  |  4 +++-
>  drivers/net/bnx2x/bnx2x_ethdev.c   |  6 --
>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
>  drivers/net/cxgbe/cxgbe_ethdev.c   |  4 +++-
>  drivers/net/e1000/em_ethdev.c  |  3 ++-
>  drivers/net/e1000/igb_ethdev.c |  6 --
>  drivers/net/ena/ena_ethdev.c   |  3 ++-
>  drivers/net/enic/enic_ethdev.c |  3 ++-
>  drivers/net/fm10k/fm10k_ethdev.c   |  3 ++-
>  drivers/net/i40e/i40e_ethdev.c |  3 ++-
>  drivers/net/i40e/i40e_ethdev_vf.c  |  3 ++-
>  drivers/net/ixgbe/ixgbe_ethdev.c   |  6 --
>  drivers/net/mlx4/mlx4.c|  3 ++-
>  drivers/net/mlx5/mlx5.c|  3 ++-
>  drivers/net/mpipe/mpipe_tilegx.c   |  4 ++--
>  drivers/net/nfp/nfp_net.c  |  3 ++-
>  drivers/net/null/rte_eth_null.c|  3 ++-
>  drivers/net/pcap/rte_eth_pcap.c|  4 +++-
>  drivers/net/ring/rte_eth_ring.c|  3 ++-
>  drivers/net/szedata2/rte_eth_szedata2.c|  3 ++-
>  drivers/net/vhost/rte_eth_vhost.c  |  3 ++-
>  drivers/net/virtio/virtio_ethdev.c |  3 ++-
>  drivers/net/vmxnet3/vmxnet3_ethdev.c   |  3 ++-
>  drivers/net/xenvirt/rte_eth_xenvirt.c  |  2 +-
>  lib/librte_eal/common/include/rte_dev.h| 20 
>  31 files changed, 93 insertions(+), 37 deletions(-)
>

drivers/net/qede is missing and causes a build failure with a fresh config.

It seems to be missing in v1 but I managed to test it, guess it must've 
been an old .config generated before QEDE got merged.

- Panu -


[dpdk-dev] [PATCH v2] vhost: add support for dynamic vhost PMD creation

2016-05-19 Thread Thomas Monjalon
2016-05-18 18:10, Ferruh Yigit:
> Add rte_eth_from_vhost() API to create vhost PMD dynamically from
> applications.

How is it different from rte_eth_dev_attach() calling rte_eal_vdev_init()?


[dpdk-dev] [PATCH] mbuf: make rearm_data address naturally aligned

2016-05-19 Thread Bruce Richardson
On Thu, May 19, 2016 at 12:20:16AM +0530, Jerin Jacob wrote:
> On Wed, May 18, 2016 at 05:43:00PM +0100, Bruce Richardson wrote:
> > On Wed, May 18, 2016 at 07:27:43PM +0530, Jerin Jacob wrote:
> > > To avoid multiple stores on fast path, Ethernet drivers
> > > aggregate the writes to data_off, refcnt, nb_segs and port
> > > to an uint64_t data and write the data in one shot
> > > with uint64_t* at &mbuf->rearm_data address.
> > > 
> > > Some of the non-IA platforms have store operation overhead
> > > if the store address is not naturally aligned.This patch
> > > fixes the performance issue on those targets.
> > > 
> > > Signed-off-by: Jerin Jacob 
> > > ---
> > > 
> > > Tested this patch on IA and non-IA(ThunderX) platforms.
> > > > This patch shows 400Kpps/core improvement on ThunderX + ixgbe + vector
> > > > environment, and this patch does not have any overhead on IA platform.
> > > > 
> > > > I have tried another similar approach, replacing "buf_len" with "pad"
> > > > (in this patch context). Since it has the additional overhead of a read
> > > > and then a mask to keep "buf_len" intact, not much improvement is shown.
> > > ref: http://dpdk.org/ml/archives/dev/2016-May/038914.html
> > > 
> > > ---
> > > While this will work and from your tests doesn't seem to have a performance
> > > impact, I'm not sure I particularly like it. It's extending out the end of
> > > cacheline0 of the mbuf by 16 bytes, though I suppose it's not technically
> > > using up any more space of it.
> > 
> > Extending by 2 bytes, right? Yes, I guess now we are using only 56 out of
> > 64 bytes in the first 64-byte cache line.
> 
> > 
> > > What I'm wondering about though, is do we have any usecases where we need a
> > > variable buf_len for packets for RX. These mbufs come directly from a
> > > mempool, which is generally understood to be a set of fixed-sized buffers.
> > > I realise that this change was made in the past after some discussion, but
> > > one of the key points there [at least to my reading] was that - even though
> > > nobody actually made a concrete case where they had variable-sized buffers -
> > > having support for them made no performance difference.
> > > 
> > > The latter part of that has now changed, and supporting variable-sized mbufs
> > > from an mbuf pool has a perf impact. Do we definitely need that
> > > functionality, because the easiest fix here is just to move the rxrearm
> > > marker back above mbuf_len as it was originally in releases like 1.8?
> 
> And initialize the buf_len with mp->elt_size - sizeof(struct rte_mbuf).
> Right?
> 
> I don't have a strong opinion on this, I can do this if there is no
> objection on this. Let me know.
> 
> However, I do see that in future "buf_len" may belong at the end of the first
> 64-byte cache line, as "port" is currently defined as uint8_t which, IMO, is
> too small. We may need to increase that to uint16_t. The reason I think so is
> that ThunderX HW currently has 128 VFs per socket for the built-in NIC, so a
> two-node configuration plus one external PCIe NW card can easily go beyond
> 256 ports.
> 
Ok, good point. If you think it's needed, and if we are changing the mbuf
structure, it might be a good time to extend that field while you are at it,
to save a second ABI break later on.

/Bruce

> > 
> > Regards,
> > /Bruce
> > 
> > Ref: http://dpdk.org/ml/archives/dev/2014-December/009432.html
> > 
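
To make the alignment point in this thread concrete, here is a minimal sketch. It assumes a simplified, hypothetical layout: struct demo_rearm, struct demo_mbuf and demo_mbuf_rearm() are illustrative names, not the real struct rte_mbuf or any DPDK API. The fields that a driver rewrites on refill are grouped into exactly 8 bytes starting at an 8-byte aligned offset, and the compile-time checks fail if a later field change breaks that property.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical group of fields a driver rewrites on every RX refill. */
struct demo_rearm {
	uint16_t data_off;
	uint16_t refcnt;
	uint8_t nb_segs;
	uint8_t port;
	uint16_t pad;	/* filler so the group is exactly 8 bytes */
};

/* Hypothetical, simplified layout; NOT the real struct rte_mbuf. */
struct demo_mbuf {
	uint64_t buf_addr;
	uint64_t buf_physaddr;
	struct demo_rearm rearm;	/* starts at offset 16 */
	uint16_t buf_len;		/* kept outside the rearm group */
};

/* The group must be one naturally aligned 64-bit unit. */
static_assert(sizeof(struct demo_rearm) == sizeof(uint64_t),
	      "rearm group must be exactly 8 bytes");
static_assert(offsetof(struct demo_mbuf, rearm) % sizeof(uint64_t) == 0,
	      "rearm group must start on an 8-byte boundary");

/* Refresh all rearm fields from a template with one 8-byte wide copy. */
static inline void
demo_mbuf_rearm(struct demo_mbuf *m, const struct demo_rearm *tmpl)
{
	m->rearm = *tmpl;
}
```

Because buf_len sits outside the group in this sketch, the copy leaves it untouched, which is the property the "pad" variant discussed above tries to preserve.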


[dpdk-dev] [PATCH v3 01/11] app/test: introduce resources for tests

2016-05-19 Thread Jan Viktorin
I forgot to fix this:

Check patch error:
12817: 
ERROR: space required after that ',' (ctx:VxV)
#244: FILE: app/test/resource.h:93:
+static void __attribute__((constructor,used)) resinitfn_ ##n(void) 
   ^

   total: 1 errors, 0 warnings, 259 lines checked

will do for v4.

Jan

On Tue, 17 May 2016 20:34:51 +0200
Jan Viktorin  wrote:

> Certain internal mechanisms of DPDK access different file system structures
> (e.g. /sys/bus/pci/devices). It is difficult to test those cases automatically
> by a unit test when such a path is not hard-coded and there is no simple way
> to distribute fake ones with the current testing environment.
> 
> This patch adds a possibility to declare a resource embedded in the test
> binary itself. The structure resource covers the generic situation - it
> provides a name for lookup and pointers to the embedded data blob. A resource
> is registered in a constructor by the macro REGISTER_RESOURCE.
> 
> Some initial tests of simple resources are included and added into the group_1.
> 
> Signed-off-by: Jan Viktorin 
> ---
> v3:
> * fixed doc comments
> ---
>  app/test/Makefile |  2 +
>  app/test/autotest_data.py |  6 +++
>  app/test/resource.c   | 66 +++
>  app/test/resource.h   | 98 +++
>  app/test/test_resource.c  | 75 
>  5 files changed, 247 insertions(+)
>  create mode 100644 app/test/resource.c
>  create mode 100644 app/test/resource.h
>  create mode 100644 app/test/test_resource.c

[...]
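
The registration pattern described in the commit message above (a named blob registered from a constructor) can be sketched roughly as follows. All demo_* names and the DEMO_REGISTER_RESOURCE macro are hypothetical stand-ins, not the contents of app/test/resource.h, and the constructor attribute assumes GCC or Clang. Note the space after the comma in the attribute list, which is what checkpatch asked for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the resource structure described above. */
struct demo_resource {
	const char *name;	/* key used for lookup */
	const char *begin;	/* first byte of the embedded blob */
	const char *end;	/* one past the last byte of the blob */
	struct demo_resource *next;
};

static struct demo_resource *demo_resource_list;

/* Push a resource onto the global singly linked list. */
static void
demo_resource_register(struct demo_resource *r)
{
	assert(r != NULL && r->name != NULL);
	r->next = demo_resource_list;
	demo_resource_list = r;
}

/* Return the resource registered under name, or NULL if there is none. */
static const struct demo_resource *
demo_resource_find(const char *name)
{
	const struct demo_resource *r;

	for (r = demo_resource_list; r != NULL; r = r->next)
		if (strcmp(r->name, name) == 0)
			return r;
	return NULL;
}

/* Define a blob plus a constructor that registers it before main(). */
#define DEMO_REGISTER_RESOURCE(n, data) \
static const char demo_blob_##n[] = data; \
static struct demo_resource demo_res_##n = { \
	#n, demo_blob_##n, demo_blob_##n + sizeof(demo_blob_##n) - 1, NULL \
}; \
static void __attribute__((constructor, used)) demo_resinitfn_##n(void) \
{ \
	demo_resource_register(&demo_res_##n); \
}

DEMO_REGISTER_RESOURCE(hello, "Hello, resource")
```

A test can then call demo_resource_find("hello") and read the bytes between begin and end without touching the file system.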


[dpdk-dev] [PATCH v3 09/11] eal/pci: allow to override sysfs

2016-05-19 Thread Jan Viktorin
I forgot to fix those:

   12826: 
   ERROR: code indent should use tabs where possible
   #175: FILE: lib/librte_eal/linuxapp/eal/eal_pci.c:69:
   +^I "%s/" PCI_PRI_FMT "/driver/unbind", pci_get_sysfs_path(),$

   WARNING: line over 80 characters
   #193: FILE: lib/librte_eal/linuxapp/eal/eal_pci.c:471:
   +snprintf(dirname, sizeof(dirname), "%s/%s", pci_get_sysfs_path(),

   total: 1 errors, 1 warnings, 160 lines checked

   NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
 scripts/cleanfile

will do for v4.

Jan

On Tue, 17 May 2016 20:34:59 +0200
Jan Viktorin  wrote:

> The SYSFS_PCI_DEVICES is a constant that makes the PCI testing difficult as
> it points to an absolute path. We remove the use of this constant and introduce
> a function pci_get_sysfs_path that gives the same value. However, the user can
> pass a SYSFS_PCI_DEVICES env variable to override the path. It is now possible
> to create a fake sysfs hierarchy for testing.
> 
> Signed-off-by: Jan Viktorin 
> ---
> v3:
> * changed subject
> * test_pci_sysfs has been slightly modified to be more understandable
> * fixed whitespace in *version.map files
> ---
>  app/test/test_pci.c | 28 +
>  drivers/net/szedata2/rte_eth_szedata2.c |  2 +-
>  drivers/net/virtio/virtio_pci.c |  2 +-
>  lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  7 +++
>  lib/librte_eal/common/eal_common_pci.c  | 13 
>  lib/librte_eal/common/include/rte_pci.h |  2 +-
>  lib/librte_eal/linuxapp/eal/eal_pci.c   |  6 +++---
>  lib/librte_eal/linuxapp/eal/eal_pci_uio.c   |  7 ---
>  lib/librte_eal/linuxapp/eal/eal_pci_vfio.c  |  2 +-
>  lib/librte_eal/linuxapp/eal/rte_eal_version.map |  7 +++
>  10 files changed, 66 insertions(+), 10 deletions(-)
> 

[...]
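
As a rough illustration of the override described in the commit message above, the lookup could look like the sketch below. The demo_* names are hypothetical, the default path is assumed to be /sys/bus/pci/devices, and treating an empty variable as "not set" is an assumption of this sketch rather than something the patch states.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed default; the real constant lives in the EAL headers. */
#define DEMO_SYSFS_PCI_DEVICES "/sys/bus/pci/devices"

/* Return the sysfs PCI root, honouring the SYSFS_PCI_DEVICES env variable. */
static const char *
demo_pci_get_sysfs_path(void)
{
	const char *path = getenv("SYSFS_PCI_DEVICES");

	/* Assumption: an empty value falls back to the default. */
	if (path == NULL || path[0] == '\0')
		return DEMO_SYSFS_PCI_DEVICES;
	return path;
}

/*
 * Build "<sysfs root>/<pci address>" into buf.
 * Returns the snprintf() result, i.e. the untruncated length.
 */
static int
demo_pci_device_path(char *buf, size_t len, const char *addr)
{
	return snprintf(buf, len, "%s/%s", demo_pci_get_sysfs_path(), addr);
}
```

A test can point SYSFS_PCI_DEVICES at a directory holding a fake hierarchy and then exercise the same path-building code that production uses.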


[dpdk-dev] [PATCHv2 4/4] pmdinfo.py: Add tool to query binaries for hw and other support information

2016-05-19 Thread Panu Matilainen
On 05/19/2016 12:08 AM, Neil Horman wrote:
> This tool searches for the primer string PMD_DRIVER_INFO= in any ELF binary,
> and, if found, parses the remainder of the string as a json encoded string,
> outputting the results in either a human readable or raw, script parseable
> format.
>
> Note that, in the case of dynamically linked applications, pmdinfo.py will
> scan for implicitly linked PMDs by searching the specified binary's .dynamic
> section for DT_NEEDED entries that contain the substring librte_pmd.  The
> DT_RUNPATH, LD_LIBRARY_PATH, /usr/lib and /lib are searched for these
> libraries, in that order.

Scanning /usr/lib and /lib does little good on systems where /usr/lib64 
and /lib64 are the standard path, such as x86_64 Fedora / RHEL and 
derivatives.

With the path changed (or LD_LIBRARY_PATH set manually), I can confirm 
it works for a shared binary which is --whole-archive linked to all of 
DPDK such as ovs-vswitchd currently is (because it needs to for static 
DPDK linkage and is not aware of plugin autoloading).

It doesn't help testpmd though because it's not linked with 
--whole-archive in the shared case, so it's not working for the main DPDK 
executable...

In any case, using --whole-archive is a sledgehammer solution at best, 
and against the spirit of shared libs and plugins in particular.

I think the shared linkage case can be solved by exporting the PMD path 
from librte_eal (either through an elf section or c-level symbol) and 
teaching the script to detect the case of an executable dynamically linked 
to librte_eal, fish the path from there and then process everything in 
that path.

>
> If a file is specified with no path, it is assumed to be a PMD DSO, and the
> LD_LIBRARY_PATH, /usr/lib/ and /lib are searched for it

Same as above, /usr/lib/ and /lib is incorrect for a large number of 
systems.

>
> Currently the tool can output data in 3 formats:
>
> a) raw, suitable for scripting, where the raw JSON strings are dumped out
> b) table format (default) where hex pci ids are dumped in a table format
> c) pretty, where a user supplied pci.ids file is used to print out vendor and
> device strings

c) is a nice addition. Would be even nicer if it knew the most common 
pci.ids locations so it doesn't need extra arguments in those cases, ie 
see if /usr/share/hwdata/pci.ids or /usr/share/misc/pci.ids exists and 
use that unless overridden on the cli.

- Panu -
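
The lookup order suggested above (command-line override first, then the common pci.ids locations) is sketched below. The tool under review is a Python script, so this C fragment only illustrates the order of checks; demo_first_readable() and demo_pick_pci_ids() are hypothetical helper names.

```c
#include <stddef.h>
#include <unistd.h>

/* Locations named in the review above, checked in this order. */
static const char *const demo_pci_ids_defaults[] = {
	"/usr/share/hwdata/pci.ids",
	"/usr/share/misc/pci.ids",
};

/* Return the first path that is readable, or NULL if none is. */
static const char *
demo_first_readable(const char *const *paths, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (access(paths[i], R_OK) == 0)
			return paths[i];
	return NULL;
}

/* A path given on the command line always wins over the defaults. */
static const char *
demo_pick_pci_ids(const char *cli_override)
{
	if (cli_override != NULL)
		return cli_override;
	return demo_first_readable(demo_pci_ids_defaults,
		sizeof(demo_pci_ids_defaults) / sizeof(demo_pci_ids_defaults[0]));
}
```

The same first-match loop would also fit the library search, for example with /usr/lib64 and /lib64 added to the candidate list.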




[dpdk-dev] [PATCH v2 6/7] virtio: fix pci accesses for ppc64 in legacy mode

2016-05-19 Thread Chao Zhu
Olivier,

Thanks for the patches! 
Just one comment:
POWER8 machine only supports little endian OS on bare metal. In VM guest, it
can support both little endian and big endian OS. Did you try to run it on
both host (little endian) and guest (big endian and little endian)?

-Original Message-
From: Olivier Matz [mailto:olivier.m...@6wind.com] 
Sent: 2016-05-17 18:00
To: dev at dpdk.org
Cc: david.marchand at 6wind.com; chaozhu at linux.vnet.ibm.com;
 yuanhan.liu at linux.intel.com; huawei.xie at intel.com
Subject: [PATCH v2 6/7] virtio: fix pci accesses for ppc64 in legacy mode

From: David Marchand 

Although ppc supports both endiannesses, qemu supposes that the cpu is big
endian and enforces this for the virtio-net stuff.

Fix PCI accesses in legacy mode. Only ppc64le is supported at the moment.

Signed-off-by: David Marchand 
Signed-off-by: Olivier Matz 
---
 drivers/net/virtio/virtio_pci.c | 68 +
 1 file changed, 68 insertions(+)

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 9cdca06..ebf4cf7 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -55,20 +55,88 @@
  */
 #define VIRTIO_PCI_CONFIG(hw) (((hw)->use_msix) ? 24 : 20)

+/*
+ * Since we are in legacy mode:
+ * http://ozlabs.org/~rusty/virtio-spec/virtio-0.9.5.pdf
+ *
+ * "Note that this is possible because while the virtio header is PCI (i.e.
+ * little) endian, the device-specific region is encoded in the native 
+endian of
+ * the guest (where such distinction is applicable)."
+ *
+ * For powerpc which supports both, qemu supposes that cpu is big 
+endian and
+ * enforces this for the virtio-net stuff.
+ */
+
 static void
 legacy_read_dev_config(struct virtio_hw *hw, size_t offset,
   void *dst, int length)
 {
+#ifdef RTE_ARCH_PPC_64
+   int size;
+
+   while (length > 0) {
+   if (length >= 4) {
+   size = 4;
+   rte_eal_pci_ioport_read(&hw->io, dst, size,
+   VIRTIO_PCI_CONFIG(hw) + offset);
+   *(uint32_t *)dst = rte_be_to_cpu_32(*(uint32_t *)dst);
+   } else if (length >= 2) {
+   size = 2;
+   rte_eal_pci_ioport_read(&hw->io, dst, size,
+   VIRTIO_PCI_CONFIG(hw) + offset);
+   *(uint16_t *)dst = rte_be_to_cpu_16(*(uint16_t *)dst);
+   } else {
+   size = 1;
+   rte_eal_pci_ioport_read(&hw->io, dst, size,
+   VIRTIO_PCI_CONFIG(hw) + offset);
+   }
+
+   dst = (char *)dst + size;
+   offset += size;
+   length -= size;
+   }
+#else
rte_eal_pci_ioport_read(&hw->io, dst, length,
VIRTIO_PCI_CONFIG(hw) + offset);
+#endif
 }

 static void
 legacy_write_dev_config(struct virtio_hw *hw, size_t offset,
const void *src, int length)
 {
+#ifdef RTE_ARCH_PPC_64
+   union {
+   uint32_t u32;
+   uint16_t u16;
+   } tmp;
+   int size;
+
+   while (length > 0) {
+   if (length >= 4) {
+   size = 4;
+   tmp.u32 = rte_cpu_to_be_32(*(const uint32_t *)src);
+   rte_eal_pci_ioport_write(&hw->io, &tmp.u32, size,
+   VIRTIO_PCI_CONFIG(hw) + offset);
+   } else if (length >= 2) {
+   size = 2;
+   tmp.u16 = rte_cpu_to_be_16(*(const uint16_t *)src);
+   rte_eal_pci_ioport_write(&hw->io, &tmp.u16, size,
+   VIRTIO_PCI_CONFIG(hw) + offset);
+   } else {
+   size = 1;
+   rte_eal_pci_ioport_write(&hw->io, src, size,
+   VIRTIO_PCI_CONFIG(hw) + offset);
+   }
+
+   src = (const char *)src + size;
+   offset += size;
+   length -= size;
+   }
+#else
rte_eal_pci_ioport_write(&hw->io, src, length,
 VIRTIO_PCI_CONFIG(hw) + offset);
+#endif
 }

 static uint64_t
--
2.8.0.rc3
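
The chunking logic in the patch above (4-byte, then 2-byte, then 1-byte accesses, each converted from big endian) can be tried out in isolation with the sketch below. It is not the driver code: the I/O port read is replaced by a plain byte array standing in for the device-specific config region, the demo_* names are hypothetical, and memcpy is used instead of pointer casts so that the destination needs no particular alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode big-endian bytes into host order, independent of host endianness. */
static uint32_t
demo_be32_to_cpu(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint16_t
demo_be16_to_cpu(const uint8_t *p)
{
	return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/*
 * Copy length bytes starting at cfg[offset] into dst, splitting the
 * access into 4-, 2- and 1-byte chunks and converting each multi-byte
 * chunk from big endian to host order, like the loop in the patch.
 */
static void
demo_read_dev_config(const uint8_t *cfg, size_t offset, void *dst, int length)
{
	int size;

	while (length > 0) {
		if (length >= 4) {
			uint32_t v = demo_be32_to_cpu(cfg + offset);

			size = 4;
			memcpy(dst, &v, sizeof(v));
		} else if (length >= 2) {
			uint16_t v = demo_be16_to_cpu(cfg + offset);

			size = 2;
			memcpy(dst, &v, sizeof(v));
		} else {
			size = 1;
			*(uint8_t *)dst = cfg[offset];
		}

		dst = (char *)dst + size;
		offset += size;
		length -= size;
	}
}
```

With a 7-byte request the loop performs one 4-byte, one 2-byte and one 1-byte access, so each field comes out in host order provided the fields line up with those chunk boundaries.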




[dpdk-dev] [PATCH 07/20] thunderx/nicvf: add rx_queue_setup/release support

2016-05-19 Thread Pattan, Reshma


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> Sent: Saturday, May 7, 2016 4:16 PM
> To: dev at dpdk.org
> Cc: thomas.monjalon at 6wind.com; Richardson, Bruce
> ; Jerin Jacob
> ; Maciej Czekaj
> ; Kamil Rytarowski
> ; Zyta Szpak
> ; Slawomir Rosek ;
> Radoslaw Biernacki 
> Subject: [dpdk-dev] [PATCH 07/20] thunderx/nicvf: add rx_queue_setup/release
> support
> 
> diff --git a/drivers/net/thunderx/nicvf_ethdev.c
> b/drivers/net/thunderx/nicvf_ethdev.c
> index 1269672..3b94168 100644
> --- a/drivers/net/thunderx/nicvf_ethdev.c
> +++ b/drivers/net/thunderx/nicvf_ethdev.c
> 
> +static int
> +nicvf_qset_cq_alloc(struct nicvf *nic, struct nicvf_rxq *rxq, uint16_t qidx,
> + uint32_t desc_cnt)
> +{
> + const struct rte_memzone *rz;
> + uint32_t ring_size = desc_cnt * sizeof(union cq_entry_t);
> +
> + rz = rte_eth_dma_zone_reserve(nic->eth_dev, "cq_ring", qidx, ring_size,
> +NICVF_CQ_BASE_ALIGN_BYTES, nic->node);
> + if (rz == NULL) {
> + PMD_INIT_LOG(ERR, "Failed allocate mem for cq hw ring");

Typo "Failed to"?

> +static int
> +nicvf_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t qidx,
> +  uint16_t nb_desc, unsigned int socket_id,
> +  const struct rte_eth_rxconf *rx_conf,
> +  struct rte_mempool *mp)
> +{
> + uint16_t rx_free_thresh;
> + struct nicvf_rxq *rxq;
> + struct nicvf *nic = nicvf_pmd_priv(dev);
> +
> + PMD_INIT_FUNC_TRACE();
> +
> + /* Socked id check */

Typo  "Socket"?

Thanks,
Reshma


[dpdk-dev] [PATCH] tools: allow binding to other network class devices

2016-05-19 Thread Thomas Monjalon
2016-05-06 15:27, Thadeu Lima de Souza Cascardo:
> dpdk_nic_bind will only handle Ethernet devices, but Mellanox ConnectX-3 Pro,
> for example, is a Network class device, but not an Ethernet one. Even though
> this allows other devices in the list, like Wireless devices, this should not
> be a problem.
> 
> Signed-off-by: Thadeu Lima de Souza Cascardo 

Applied, thanks


[dpdk-dev] [PATCH] pci: Add the class_id support in pci probe

2016-05-19 Thread Thomas Monjalon
2016-05-11 14:08, Ziye Yang:
> This patch is used to add the class_id (class_code,
> subclass_code, programming_interface) support for
> pci_device probe. With this patch, it will be
> flexible for users to probe a class of devices
> by class_id.
> 
> Signed-off-by: Ziye Yang 
> ---
>  lib/librte_eal/bsdapp/eal/eal_pci.c | 4 
>  lib/librte_eal/common/eal_common_pci.c  | 3 +++
>  lib/librte_eal/common/include/rte_pci.h | 8 ++--
>  lib/librte_eal/linuxapp/eal/eal_pci.c   | 9 +
>  4 files changed, 22 insertions(+), 2 deletions(-)

Please remove the deprecation notice.

> --- a/lib/librte_eal/common/include/rte_pci.h
> +++ b/lib/librte_eal/common/include/rte_pci.h
> @@ -129,6 +129,7 @@ struct rte_pci_id {
>   uint16_t device_id;   /**< Device ID or PCI_ANY_ID. */
>   uint16_t subsystem_vendor_id; /**< Subsystem vendor ID or PCI_ANY_ID. */
>   uint16_t subsystem_device_id; /**< Subsystem device ID or PCI_ANY_ID. */
> + uint32_t class_id;   /**< Class ID (class, subclass, pi) or CLASS_ANY_ID. */
>  };

A space is missing.
It would be more logical to put the class_id at the beginning of the struct.

>  /** Any PCI device identifier (vendor, device, ...) */
>  #define PCI_ANY_ID (0x)
> +#define CLASS_ANY_ID (0xff)

These constants should be prefixed with RTE_.

> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
> + /* get class_id */
> + snprintf(filename, sizeof(filename), "%s/class",
> +  dirname);
> + if (eal_parse_sysfs_value(filename, &tmp) < 0) {
> + free(dev);
> + return -1;
> + }
> + dev->id.class_id = (uint32_t)tmp && CLASS_ANY_ID;

Should be a bitwise &. Why is masking needed?
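
The difference between the two operators is easy to see in a small sketch. The 24-bit mask value below is an assumption made for illustration (class, subclass and programming interface are one byte each), and the demo_* names are hypothetical.

```c
#include <stdint.h>

/* Assumed 24-bit mask: one byte each for class, subclass and prog-if. */
static const uint32_t demo_class_any_id = 0xffffff;

/* Buggy form: logical AND collapses any non-zero value to 1. */
static uint32_t
demo_class_id_logical(unsigned long raw)
{
	return (uint32_t)raw && demo_class_any_id;
}

/* Intended form: bitwise AND keeps the low 24 bits of the value. */
static uint32_t
demo_class_id_bitwise(unsigned long raw)
{
	return (uint32_t)raw & demo_class_any_id;
}
```

For a raw sysfs value of 0x020000 the logical form yields 1, while the bitwise form yields 0x020000.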


[dpdk-dev] [PATCHv2 2/4] drivers: Update driver registration macro usage

2016-05-19 Thread Neil Horman
On Thu, May 19, 2016 at 10:58:23AM +0300, Panu Matilainen wrote:
> On 05/19/2016 12:08 AM, Neil Horman wrote:
> > Modify the PMD_REGISTER_DRIVER macro, bifurcating it into two
> > (PMD_REGISTER_DRIVER_PDEV and PMD_REGISTER_DRIVER_VDEV).  Both of these do
> > the same thing the original macro did, but both add the definition of a
> > string variable that informs interested parties of the name of the pmd, and
> > the former also defines a second string that holds the symbol name of the
> > pci table that is registered by this pmd.
> > 
> > pmdinfo uses this information to extract hardware support from an object
> > file and create a json string to make hardware support info discoverable
> > later.
> > 
> > Signed-off-by: Neil Horman 
> > CC: Bruce Richardson 
> > CC: Thomas Monjalon 
> > CC: Stephen Hemminger 
> > CC: Panu Matilainen 
> > ---
> >  drivers/Makefile   |  2 ++
> >  drivers/crypto/aesni_gcm/aesni_gcm_pmd.c   |  4 +++-
> >  drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |  4 +++-
> >  drivers/crypto/null/null_crypto_pmd.c  |  4 +++-
> >  drivers/crypto/qat/rte_qat_cryptodev.c |  4 +++-
> >  drivers/crypto/snow3g/rte_snow3g_pmd.c |  4 +++-
> >  drivers/net/af_packet/rte_eth_af_packet.c  |  4 +++-
> >  drivers/net/bnx2x/bnx2x_ethdev.c   |  6 --
> >  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
> >  drivers/net/cxgbe/cxgbe_ethdev.c   |  4 +++-
> >  drivers/net/e1000/em_ethdev.c  |  3 ++-
> >  drivers/net/e1000/igb_ethdev.c |  6 --
> >  drivers/net/ena/ena_ethdev.c   |  3 ++-
> >  drivers/net/enic/enic_ethdev.c |  3 ++-
> >  drivers/net/fm10k/fm10k_ethdev.c   |  3 ++-
> >  drivers/net/i40e/i40e_ethdev.c |  3 ++-
> >  drivers/net/i40e/i40e_ethdev_vf.c  |  3 ++-
> >  drivers/net/ixgbe/ixgbe_ethdev.c   |  6 --
> >  drivers/net/mlx4/mlx4.c|  3 ++-
> >  drivers/net/mlx5/mlx5.c|  3 ++-
> >  drivers/net/mpipe/mpipe_tilegx.c   |  4 ++--
> >  drivers/net/nfp/nfp_net.c  |  3 ++-
> >  drivers/net/null/rte_eth_null.c|  3 ++-
> >  drivers/net/pcap/rte_eth_pcap.c|  4 +++-
> >  drivers/net/ring/rte_eth_ring.c|  3 ++-
> >  drivers/net/szedata2/rte_eth_szedata2.c|  3 ++-
> >  drivers/net/vhost/rte_eth_vhost.c  |  3 ++-
> >  drivers/net/virtio/virtio_ethdev.c |  3 ++-
> >  drivers/net/vmxnet3/vmxnet3_ethdev.c   |  3 ++-
> >  drivers/net/xenvirt/rte_eth_xenvirt.c  |  2 +-
> >  lib/librte_eal/common/include/rte_dev.h| 20 
> >  31 files changed, 93 insertions(+), 37 deletions(-)
> > 
> 
> drivers/net/qede is missing and causes a build failure with a fresh config.
> 
> It seems to be missing in v1 but I managed to test it, guess it must've been
> an old .config generated before QEDE got merged.
> 
>   - Panu -
> 
No, it only got added recently.  I pulled when I started writing this (about two
weeks ago), and it got added during its development.  I'll rebase.
Neil



[dpdk-dev] [dpdk-dev, PATCHv2, 2/4] drivers: Update driver registration macro usage

2016-05-19 Thread Jan Viktorin
Hello Neil,

just a few notes...

(sorry if you've received this twice, importing mbox files from patchwork
always changes my default From: field)

On Wed, 18 May 2016 17:08:05 -0400
Neil Horman  wrote:

> Modify the PMD_REGISTER_DRIVER macro, bifurcating it into two
> (PMD_REGISTER_DRIVER_PDEV and PMD_REGISTER_DRIVER_VDEV).  Both of these do the
> same thing the original macro did, but both add the definition of a string

I could not find either of those (PMD_REGISTER_DRIVER_PDEV,
PMD_REGISTER_DRIVER_VDEV) in this patch. I think the message is misleading...

I am interested as this may lead to merge conflicts when generalizing
rte_pci_device/driver and stuff around it.

> variable that informs interested parties of the name of the pmd, and the
> former also defines a second string that holds the symbol name of the pci
> table that is registered by this pmd.

> 
> pmdinfo uses this information to extract hardware support from an object file
> and create a json string to make hardware support info discoverable later.
> 
> Signed-off-by: Neil Horman 
> CC: Bruce Richardson 
> CC: Thomas Monjalon 
> CC: Stephen Hemminger 
> CC: Panu Matilainen 
> 
> ---
> drivers/Makefile   |  2 ++
>  drivers/crypto/aesni_gcm/aesni_gcm_pmd.c   |  4 +++-
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |  4 +++-
>  drivers/crypto/null/null_crypto_pmd.c  |  4 +++-
>  drivers/crypto/qat/rte_qat_cryptodev.c |  4 +++-
>  drivers/crypto/snow3g/rte_snow3g_pmd.c |  4 +++-
>  drivers/net/af_packet/rte_eth_af_packet.c  |  4 +++-
>  drivers/net/bnx2x/bnx2x_ethdev.c   |  6 --
>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
>  drivers/net/cxgbe/cxgbe_ethdev.c   |  4 +++-
>  drivers/net/e1000/em_ethdev.c  |  3 ++-
>  drivers/net/e1000/igb_ethdev.c |  6 --
>  drivers/net/ena/ena_ethdev.c   |  3 ++-
>  drivers/net/enic/enic_ethdev.c |  3 ++-
>  drivers/net/fm10k/fm10k_ethdev.c   |  3 ++-
>  drivers/net/i40e/i40e_ethdev.c |  3 ++-
>  drivers/net/i40e/i40e_ethdev_vf.c  |  3 ++-
>  drivers/net/ixgbe/ixgbe_ethdev.c   |  6 --
>  drivers/net/mlx4/mlx4.c|  3 ++-
>  drivers/net/mlx5/mlx5.c|  3 ++-
>  drivers/net/mpipe/mpipe_tilegx.c   |  4 ++--
>  drivers/net/nfp/nfp_net.c  |  3 ++-
>  drivers/net/null/rte_eth_null.c|  3 ++-
>  drivers/net/pcap/rte_eth_pcap.c|  4 +++-
>  drivers/net/ring/rte_eth_ring.c|  3 ++-
>  drivers/net/szedata2/rte_eth_szedata2.c|  3 ++-
>  drivers/net/vhost/rte_eth_vhost.c  |  3 ++-
>  drivers/net/virtio/virtio_ethdev.c |  3 ++-
>  drivers/net/vmxnet3/vmxnet3_ethdev.c   |  3 ++-
>  drivers/net/xenvirt/rte_eth_xenvirt.c  |  2 +-
>  lib/librte_eal/common/include/rte_dev.h| 20 
>  31 files changed, 93 insertions(+), 37 deletions(-)
> 

[...]

>  
> -PMD_REGISTER_DRIVER(pmd_xenvirt_drv);
> +PMD_REGISTER_DRIVER(pmd_xenvirt_drv, xenvirt);
> diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
> index f1b5507..871089a 100644
> --- a/lib/librte_eal/common/include/rte_dev.h
> +++ b/lib/librte_eal/common/include/rte_dev.h
> @@ -48,7 +48,7 @@ extern "C" {
>  
>  #include 
>  #include 
> -
> +#include 

This should be done the opposite way. The rte_pci.h should include rte_dev.h
(the more specific header includes the generic one). I've introduced this in the
following patch:

 [PATCH v1 01/28] eal: make enum rte_kernel_driver non-PCI specific
 http://dpdk.org/dev/patchwork/patch/12488/

Regards
Jan

>  #include 
>  
>  __attribute__((format(printf, 2, 0)))
> @@ -178,12 +178,24 @@ int rte_eal_vdev_init(const char *name, const char *args);
>   */
>  int rte_eal_vdev_uninit(const char *name);
>  
> -#define PMD_REGISTER_DRIVER(d)\
> +#define DRIVER_EXPORT_NAME_ARRAY(n, idx) n##idx[] __attribute__((used))
> +
> +#define DRIVER_EXPORT_NAME(d, idx) \
> +static const char DRIVER_EXPORT_NAME_ARRAY(this_pmd_name, idx) = RTE_STR(d);\
> + 
> +#define PMD_REGISTER_DRIVER(d, n)\
>  void devinitfn_ ##d(void);\
>  void __attribute__((constructor, used)) devinitfn_ ##d(void)\
>  {\
> - rte_eal_driver_register(&d);\
> -}
> +rte_eal_driver_register(&d);\
> +}\
> +DRIVER_EXPORT_NAME(n, __COUNTER__)
> +
> +#define DRIVER_REGISTER_PCI_TABLE(n, t) \
> +static const char n##_pci_tbl_export[] __attribute__((used)) = RTE_STR(t)
> +
> +#define DRIVER_REGISTER_PARAM_STRING(n, s) \
> +static const char n##_param_string_export[] __attribute__((used)) = s
>  
>  #ifdef __cplusplus
>  }

-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic
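
To see what the name-export macros in the quoted patch expand to, the following sketch reproduces the token-pasting trick with hypothetical DEMO_* names and a local stringify helper in place of RTE_STR. It uses fixed indices instead of __COUNTER__ so that the resulting identifiers are predictable, and it assumes GCC or Clang for the "used" attribute.

```c
#include <assert.h>

/* Two-level stringify, standing in for RTE_STR in this sketch. */
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)

/* Paste the index onto the base name and keep the symbol in the object. */
#define DEMO_EXPORT_NAME_ARRAY(n, idx) n##idx[] __attribute__((used))

/* Define a static string "demo_pmd_name<idx>" holding the driver name. */
#define DEMO_EXPORT_NAME(d, idx) \
static const char DEMO_EXPORT_NAME_ARRAY(demo_pmd_name, idx) = DEMO_STR(d)

/* Expands to: static const char demo_pmd_name0[] ... = "xenvirt"; */
DEMO_EXPORT_NAME(xenvirt, 0);
DEMO_EXPORT_NAME(ring, 1);

/* Compile-time check that the expansion produced the expected string size. */
static_assert(sizeof(demo_pmd_name0) == sizeof("xenvirt"),
	      "unexpected expansion of DEMO_EXPORT_NAME");
```

The extra macro level matters when the index is __COUNTER__: the argument is expanded to a number while being passed through the outer macro, before the inner macro pastes it onto the name.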


[dpdk-dev] [PATCH] rte mempool: division or modulo by zero

2016-05-19 Thread Mrozowicz, SlawomirX
Hi Olivier,

I am trying to merge my change (CID 13234) with your patch 12057.
Can you tell me which base commit the patch applies to?
I think that I should apply your patches starting from 12834.

Regards,
Slawomir


>-Original Message-
>From: Olivier Matz [mailto:olivier.matz at 6wind.com]
>Sent: Monday, May 16, 2016 11:23 AM
>To: Mrozowicz, SlawomirX 
>Cc: dev at dpdk.org
>Subject: Re: [PATCH] rte mempool: division or modulo by zero
>
>Hi Slawomir,
>
>On 05/12/2016 02:46 PM, Slawomir Mrozowicz wrote:
>> Fix issue reported by Coverity.
>>
>> Coverity ID 13243: Division or modulo by zero
>> In function call rte_mempool_xmem_size, division by expression total_size
>> which may be zero has undefined behavior.
>>
>> Fixes: 148f963fb532 ("xen: core library changes")
>>
>> Signed-off-by: Slawomir Mrozowicz 
>> ---
>>  lib/librte_mempool/rte_mempool.c | 18 +++---
>>  1 file changed, 11 insertions(+), 7 deletions(-)
>>
>> diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
>> index f8781e1..01668c1 100644
>> --- a/lib/librte_mempool/rte_mempool.c
>> +++ b/lib/librte_mempool/rte_mempool.c
>> @@ -327,15 +327,19 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
>>  size_t
>>  rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
>>  {
>> -size_t n, pg_num, pg_sz, sz;
>> +size_t n, pg_num, pg_sz;
>> +size_t sz = 0;
>>
>> -pg_sz = (size_t)1 << pg_shift;
>> +if (elt_sz > 0) {
>> +pg_sz = (size_t)1 << pg_shift;
>> +n = pg_sz / elt_sz;
>>
>> -if ((n = pg_sz / elt_sz) > 0) {
>> -pg_num = (elt_num + n - 1) / n;
>> -sz = pg_num << pg_shift;
>> -} else {
>> -sz = RTE_ALIGN_CEIL(elt_sz, pg_sz) * elt_num;
>> +if (n > 0) {
>> +pg_num = (elt_num + n - 1) / n;
>> +sz = pg_num << pg_shift;
>> +} else {
>> +sz = RTE_ALIGN_CEIL(elt_sz, pg_sz) * elt_num;
>> +}
>>  }
>>
>>  return sz;
>>
>
>I think it would be clearer (both for the patch and the code) to avoid an
>additional indent, and do something like this:
>
>   size_t
>   rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
>   uint32_t pg_shift)
>   {
>   if (elt_sz == 0)
>   return 0;
>
>   /* same code as before */
>
>It will also facilitate the merge with
>http://patchwork.dpdk.org/dev/patchwork/patch/12057/
>
>Could you please submit a v2 with this logic?
>
>Thanks,
>Olivier
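
Putting the suggested early return together with the logic visible in the quoted diff gives roughly the following. This is a standalone sketch, not the final v2 patch: the function is renamed demo_mempool_xmem_size(), and DEMO_ALIGN_CEIL is a simple division-based stand-in for RTE_ALIGN_CEIL.

```c
#include <stddef.h>
#include <stdint.h>

/* Round v up to a multiple of a; stand-in for RTE_ALIGN_CEIL. */
#define DEMO_ALIGN_CEIL(v, a) ((((v) + (a) - 1) / (a)) * (a))

/* Memory needed to store elt_num elements of elt_sz bytes in pages. */
static size_t
demo_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
{
	size_t n, pg_num, pg_sz, sz;

	/* Early return avoids the division by zero reported by Coverity. */
	if (elt_sz == 0)
		return 0;

	pg_sz = (size_t)1 << pg_shift;
	n = pg_sz / elt_sz;	/* whole elements that fit in one page */

	if (n > 0) {
		pg_num = (elt_num + n - 1) / n;
		sz = pg_num << pg_shift;
	} else {
		/* Element larger than a page: round each one up to pages. */
		sz = DEMO_ALIGN_CEIL(elt_sz, pg_sz) * elt_num;
	}

	return sz;
}
```

With 4 KiB pages (pg_shift of 12), ten 1024-byte elements need three pages, and two 5000-byte elements need two pages each.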


[dpdk-dev] [PATCH] rte mempool: division or modulo by zero

2016-05-19 Thread Olivier Matz
Hi Slawomir,

On 05/19/2016 12:57 PM, Mrozowicz, SlawomirX wrote:
> Hi Olivier,
> 
> I am trying to merge my change (CID 13234) with your patch 12057.
> Can you tell me which base commit the patch applies to?
> I think that I should apply your patches starting from 12834.
> 

Yes that's correct, the v3 patchset can be found here:
http://dpdk.org/ml/archives/dev/2016-May/039229.html

Regards,
Olivier


[dpdk-dev] DPDK Summit USA 2016, Announcement and Call for Speakers

2016-05-19 Thread O'Driscoll, Tim
We're delighted to announce that the DPDK Summit USA 2016 will be held on 
August 10th & 11th at The Tech Museum of Innovation (http://www.thetech.org), 
San Jose.

The DPDK Summit events provide an opportunity for the DPDK community to meet 
face-to-face and discuss the future direction of the project in a variety of 
industry segments including telco, cloud, enterprise, security, and financial 
services. The agenda will cover the latest developments to the DPDK framework, 
plans for future releases, as well as an opportunity to hear from DPDK users 
who have used the framework in their applications. The Summits also provide a 
great opportunity to meet and learn from other DPDK developers.

This year, we've extended our USA Summit to 2 days, so that we can accommodate 
additional technical presentations and discussions. The full agenda is 
currently being determined and will be published in advance of the Summit.

Please join us to hear the latest developments in DPDK, and to help shape the 
future direction of packet processing.

You can register for this event at: http://dpdksummit.com.

We're also initiating a Call for Speakers for this event. To submit a proposal 
for a full presentation or a lightning talk on any DPDK-related topic, please 
email dpdk.summit at intel.com by Friday June 24th, providing the following 
details:
- Presentation or lightning talk
- Title
- Abstract
- Presenter name, company, title, contact details


Registration Policies: Cancellations, Substitutions & Changes
- If you need to cancel your DPDK Summit USA 2016 registration, you may do so 
until August 3rd, 2016. If you are unable to attend the event, we recommend 
that you send a substitute in your place. Please send your cancellation or 
substitution name changes to dpdk.summit at intel.com.
- DPDK Summit USA 2016 reserves the right to rescind any registration. All 
dates and times of DPDK Summit USA 2016 are subject to change.
- If you have a disability and require special assistance, please contact 
dpdk.summit at intel.com.
- Attendee consents to any recording of the event by Intel or its designees.


[dpdk-dev] [dpdk-dev, PATCHv2, 2/4] drivers: Update driver registration macro usage

2016-05-19 Thread Neil Horman
On Thu, May 19, 2016 at 12:46:50PM +0200, Jan Viktorin wrote:
> Hello Neil,
> 
> just a few notes...
> 
> On Wed, 18 May 2016 17:08:05 -0400
> Neil Horman  wrote:
> 
> > Modify the PMD_REGISTER_DRIVER macro, bifurcating it into two
> > (PMD_REGISTER_DRIVER_PDEV and PMD_REGISTER_DRIVER_VDEV).  Both of these do the
> > same thing the original macro did, but both add the definition of a string
> 

> I could not find any of those: PMD_REGISTER_DRIVER_PDEV, 
> PMD_REGISTER_DRIVER_VDEV
> in this patch. I think the message is misleading...
> 
Forgot to fix up the changelog entry, when I merged these two macros back to the
single PMD_REGISTER_DRIVER

> I am interested as this may lead to merge conflicts when generalizing
> rte_pci_device/driver and stuff around it.
> 
Not sure what you mean by that

> > variable that informs interested parties of the name of the pmd, and the former
> > also defines a second string that holds the symbol name of the pci table that
> > is registered by this pmd.
> 
> > 
> > pmdinfo uses this information to extract hardware support from an object file
> > and create a json string to make hardware support info discoverable later.
> > 
> > Signed-off-by: Neil Horman 
> > CC: Bruce Richardson 
> > CC: Thomas Monjalon 
> > CC: Stephen Hemminger 
> > CC: Panu Matilainen 
> > 
> > ---
> > drivers/Makefile   |  2 ++
> >  drivers/crypto/aesni_gcm/aesni_gcm_pmd.c   |  4 +++-
> >  drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |  4 +++-
> >  drivers/crypto/null/null_crypto_pmd.c  |  4 +++-
> >  drivers/crypto/qat/rte_qat_cryptodev.c |  4 +++-
> >  drivers/crypto/snow3g/rte_snow3g_pmd.c |  4 +++-
> >  drivers/net/af_packet/rte_eth_af_packet.c  |  4 +++-
> >  drivers/net/bnx2x/bnx2x_ethdev.c   |  6 --
> >  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
> >  drivers/net/cxgbe/cxgbe_ethdev.c   |  4 +++-
> >  drivers/net/e1000/em_ethdev.c  |  3 ++-
> >  drivers/net/e1000/igb_ethdev.c |  6 --
> >  drivers/net/ena/ena_ethdev.c   |  3 ++-
> >  drivers/net/enic/enic_ethdev.c |  3 ++-
> >  drivers/net/fm10k/fm10k_ethdev.c   |  3 ++-
> >  drivers/net/i40e/i40e_ethdev.c |  3 ++-
> >  drivers/net/i40e/i40e_ethdev_vf.c  |  3 ++-
> >  drivers/net/ixgbe/ixgbe_ethdev.c   |  6 --
> >  drivers/net/mlx4/mlx4.c|  3 ++-
> >  drivers/net/mlx5/mlx5.c|  3 ++-
> >  drivers/net/mpipe/mpipe_tilegx.c   |  4 ++--
> >  drivers/net/nfp/nfp_net.c  |  3 ++-
> >  drivers/net/null/rte_eth_null.c|  3 ++-
> >  drivers/net/pcap/rte_eth_pcap.c|  4 +++-
> >  drivers/net/ring/rte_eth_ring.c|  3 ++-
> >  drivers/net/szedata2/rte_eth_szedata2.c|  3 ++-
> >  drivers/net/vhost/rte_eth_vhost.c  |  3 ++-
> >  drivers/net/virtio/virtio_ethdev.c |  3 ++-
> >  drivers/net/vmxnet3/vmxnet3_ethdev.c   |  3 ++-
> >  drivers/net/xenvirt/rte_eth_xenvirt.c  |  2 +-
> >  lib/librte_eal/common/include/rte_dev.h| 20 
> >  31 files changed, 93 insertions(+), 37 deletions(-)
> > 
> 
> [...]
> 
> >  
> > -PMD_REGISTER_DRIVER(pmd_xenvirt_drv);
> > +PMD_REGISTER_DRIVER(pmd_xenvirt_drv, xenvirt);
> > diff --git a/lib/librte_eal/common/include/rte_dev.h 
> > b/lib/librte_eal/common/include/rte_dev.h
> > index f1b5507..871089a 100644
> > --- a/lib/librte_eal/common/include/rte_dev.h
> > +++ b/lib/librte_eal/common/include/rte_dev.h
> > @@ -48,7 +48,7 @@ extern "C" {
> >  
> >  #include 
> >  #include 
> > -
> > +#include 
> 
> This should be done in an opposite way. The rte_pci.h should include rte_dev.h
> (more specific includes a generic one). I've introduced this in the following 
> patch:
> 
>  [PATCH v1 01/28] eal: make enum rte_kernel_driver non-PCI specific
>  http://dpdk.org/dev/patchwork/patch/12488/
> 
> Regards
> Jan
> 
That seems to be still in flight.  If your change lands by the time this gets
merged I'll change it accordingly.



[dpdk-dev] [PATCH] mbuf: make rearm_data address naturally aligned

2016-05-19 Thread Jan Viktorin
On Thu, 19 May 2016 09:50:48 +0100
Bruce Richardson  wrote:

> On Thu, May 19, 2016 at 12:20:16AM +0530, Jerin Jacob wrote:
> > On Wed, May 18, 2016 at 05:43:00PM +0100, Bruce Richardson wrote:  
> > > On Wed, May 18, 2016 at 07:27:43PM +0530, Jerin Jacob wrote:  
> > > > To avoid multiple stores on fast path, Ethernet drivers
> > > > aggregate the writes to data_off, refcnt, nb_segs and port
> > > > to an uint64_t data and write the data in one shot
> > > > with uint64_t* at &mbuf->rearm_data address.
> > > > 
> > > > Some of the non-IA platforms have store operation overhead
> > > > if the store address is not naturally aligned.This patch
> > > > fixes the performance issue on those targets.
> > > > 
> > > > Signed-off-by: Jerin Jacob 
> > > > ---
> > > > 
> > > > Tested this patch on IA and non-IA(ThunderX) platforms.
> > > > This patch shows 400Kpps/core improvement on ThunderX + ixgbe + vector 
> > > > environment.
> > > > and this patch does not have any overhead on IA platform.

Hello,

I can confirm a very small improvement in our synthetic tests based on the PMD
null (ARM Cortex-A9). For a single-core (1C) test, there is now a lower overhead
and it is more stable with different packet lengths. However, when running
dual-core (2C), the result is slightly slower but again, it seems to be more
stable.

Without this patch (cycles per packet):

 length:   64   128   256   512  1024  1280  1518
  1C      488   544   487   454   543   488   515
  2C      433   433   431   433   433   461   443

Applied this patch (cycles per packet):

 length:   64   128   256   512  1024  1280  1518
  1C      472   472   472   472   473   472   473
  2C      435   435   435   435   436   436   436

Regards
Jan
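
The "more stable" observation can be quantified from the numbers reported
above. The following is only a sketch: the figures are copied from the two
tables, while the summarize() helper and the choice of median and max-minus-min
spread as stability measures are assumptions made for this illustration, not
part of the original test setup.

```python
from statistics import mean, median

# Cycles per packet, copied from the tables above
# (packet lengths 64, 128, 256, 512, 1024, 1280, 1518).
results = {
    ("without patch", "1C"): [488, 544, 487, 454, 543, 488, 515],
    ("without patch", "2C"): [433, 433, 431, 433, 433, 461, 443],
    ("with patch", "1C"): [472, 472, 472, 472, 473, 472, 473],
    ("with patch", "2C"): [435, 435, 435, 435, 436, 436, 436],
}


def summarize(samples):
    """Return (mean, median, max - min) for one row of measurements."""
    return mean(samples), median(samples), max(samples) - min(samples)


for (variant, cores), samples in results.items():
    avg, mid, spread = summarize(samples)
    print(f"{variant:14s} {cores}: mean={avg:6.1f} median={mid} spread={spread}")
```

With these inputs the spread falls from 90 to 1 cycle for 1C and from 30 to 1
cycle for 2C, while the 2C median moves from 433 to 435.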

> > > > 
> > > > Have tried another similar approach by replacing "buf_len" with "pad"
> > > > (in this patch context).
> > > > Since it has additional overhead on read and then mask to keep "buf_len"
> > > > intact, not much improvement is shown.
> > > > ref: http://dpdk.org/ml/archives/dev/2016-May/038914.html
> > > > 
> > > > ---  
> > > While this will work and from your tests doesn't seem to have a performance
> > > impact, I'm not sure I particularly like it. It's extending out the end of
> > > cacheline0 of the mbuf by 16 bytes, though I suppose it's not technically
> > > using up any more space of it.
> > 
> > Extending by 2 bytes, right? Yes, I guess. Now we are using only 56 out of
> > 64 bytes in the first 64-byte cache line.
> >   
> > > 
> > > What I'm wondering about though, is do we have any usecases where we need a
> > > variable buf_len for packets for RX. These mbufs come directly from a mempool,
> > > which is generally understood to be a set of fixed-sized buffers. I realise that
> > > this change was made in the past after some discussion, but one of the key points
> > > there [at least to my reading] was that - even though nobody actually made a
> > > concrete case where they had variable-sized buffers - having support for them
> > > made no performance difference.
> > > 
> > > The latter part of that has now changed, and supporting variable-sized mbufs
> > > from an mbuf pool has a perf impact. Do we definitely need that functionality,
> > > because the easiest fix here is just to move the rxrearm marker back above
> > > mbuf_len as it was originally in releases like 1.8?
> > 
> > And initialize the buf_len with mp->elt_size - sizeof(struct rte_mbuf).
> > Right?
> > 
> > I don't have a strong opinion on this, I can do this if there is no
> > objection on this. Let me know.
> > 
> > However, I do see that in future "buf_len" may belong at the end of the first
> > 64-byte cache line, as "port" is currently defined as uint8_t, which IMO is too
> > small. We may need to increase it to uint16_t. The reason I think so is that
> > ThunderX HW currently has 128 VFs per socket for the built-in NIC, so a
> > two-node configuration plus one external PCIe NW card can easily go beyond
> > 256 ports.
> >   
> Ok, good point. If you think it's needed, and if we are changing the mbuf
> structure, it might be a good time to extend that field while you are at it,
> save a second ABI break later on.

> 
> /Bruce
> 
> > > 
> > > Regards,
> > > /Bruce
> > > 
> > > Ref: http://dpdk.org/ml/archives/dev/2014-December/009432.html
> > >   



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic


[dpdk-dev] [PATCHv2 4/4] pmdinfo.py: Add tool to query binaries for hw and other support information

2016-05-19 Thread Neil Horman
On Thu, May 19, 2016 at 12:02:27PM +0300, Panu Matilainen wrote:
> On 05/19/2016 12:08 AM, Neil Horman wrote:
> > This tool searches for the primer string PMD_DRIVER_INFO= in any ELF binary,
> > and, if found, parses the remainder of the string as a json encoded string,
> > outputting the results in either a human readable or raw, script parseable
> > format
> > 
> > Note that, in the case of dynamically linked applications, pmdinfo.py will scan
> > for implicitly linked PMDs by searching the specified binary's .dynamic section
> > for DT_NEEDED entries that contain the substring librte_pmd.  The DT_RUNPATH,
> > LD_LIBRARY_PATH, /usr/lib and /lib are searched for these libraries, in that
> > order
> 
> Scanning /usr/lib and /lib does little good on systems where /usr/lib64 and
> /lib64 are the standard path, such as x86_64 Fedora / RHEL and derivatives.
> 
Ah, sorry, forgot the 64 bit variants, I can add those in.

> With the path changed (or LD_LIBRARY_PATH set manually), I can confirm it
> works for a shared binary which is --whole-archive linked to all of DPDK
> such as ovs-vswitchd currently is (because it needs to for static DPDK
> linkage and is not aware of plugin autoloading).
> 
Right, that's why it works, because DPDK always requires --whole-archive for
static linking, and likely always will (see commit
20afd76a504155e947c770783ef5023e87136ad8)

> It doesn't help testpmd though because it's not linked with --whole-archive
> in the shared case, so it's not working for the main DPDK executable...
> 
This sentence doesn't make sense: --whole-archive is only applicable in the
static binary case, and only when linking archive files.

> In any case, using --whole-archive is a sledgehammer solution at best, and
> against the spirit of shared libs and plugins in particular.
> 
It may be a sledgehammer solution, but it's the one dpdk uses, and will likely
use in perpetuity.

> I think the shared linkage case can be solved by exporting the PMD path from
> librte_eal (either through an elf section or c-level symbol) and teaching the
> script to detect the case of an executable dynamically linked to librte_eal,
> fish the path from there and then process everything in that path.
> 
I really disagree with this, because it's a half-measure at best.  Yes, if it's
set, you will definitely get all the shared objects in that directory loaded,
but that is in no way a guarantee that those are the only libraries that get
loaded (the application may load its own independently).  So you're left in this
situation in which you get maybe some of the hardware support an application
offers.  It's also transient.  That is to say, if you configure a plugin
directory and search it when you scan an application, its contents may change
post scan, leading to erroneous results.

The way I see it, we have 3 cases that we need to handle:

1) Statically linked application - in this case, all pmds that are statically
linked in to the application will be reported, so we're good here

2) Dynamically loaded via DT_NEEDED entries - This is effectively the same as a
static linking case, in that we have a list of libraries that must be resolved
at run time, so we are safe to search for and scan the DSO's that the
application enumerates

3) Dynamically loaded via dlopen - In this case, we don't actually know until
runtime what DSO's are going to get loaded, even if RTE_EAL_PMD_PATH is set,
because the contents of that path can change at arbitrary times.  In this case,
it's correct to indicate that the application itself _doesn't_ actually support
the hardware of the PMD's in that path, because until the application is
executed, it has none of the support embodied in any DSO that it loads via
dlopen.  The hardware support travels with the DSO itself, and so it's correct to
only display hardware support when the PMD shared library itself is scanned.

Handling case 3 the way I'm proposing is exactly the way the OS does it (that is
to say, it only details hardware support for the module being queried, and you
have to specify the module name to get that).  I don't see there being any
problem with that.
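
The per-file scan that all three cases rely on (find the primer string
PMD_DRIVER_INFO=, then decode the JSON that follows) can be sketched as below.
This is a simplified illustration and not the code from the patch: it does a
plain byte search instead of walking ELF sections, it assumes the JSON is
stored as a NUL-terminated C string, and the function name extract_pmd_info
as well as the field names in the usage example are hypothetical.

```python
import json

PRIMER = b"PMD_DRIVER_INFO="


def extract_pmd_info(blob):
    """Return the decoded JSON object for every primer string found in blob."""
    found = []
    start = blob.find(PRIMER)
    while start != -1:
        begin = start + len(PRIMER)
        end = blob.find(b"\0", begin)  # assumed: C string, NUL-terminated
        if end == -1:
            end = len(blob)
        try:
            found.append(json.loads(blob[begin:end].decode("utf-8")))
        except ValueError:
            pass  # not valid UTF-8 or not valid JSON: skip this occurrence
        start = blob.find(PRIMER, end)
    return found


# Usage with a made-up blob; a real caller would read the bytes of a binary
# or a PMD shared library from disk.
sample = b"\x7fELF..." + PRIMER + b'{"name": "demo_pmd", "pci_ids": []}' + b"\0tail"
print(extract_pmd_info(sample))
```

Scanning only the file that was named, and nothing it might dlopen() later,
matches the behaviour argued for in case 3.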

> > 
> > If a file is specified with no path, it is assumed to be a PMD DSO, and the
> > LD_LIBRARY_PATH, /usr/lib/ and /lib are searched for it
> 
> Same as above, /usr/lib/ and /lib is incorrect for a large number of
> systems.
> 
Yeah, I'll add 64 bit detection and correct that path accordingly
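
One possible shape for the corrected search order is sketched below. It is an
assumption-laden illustration, not the follow-up patch: the helper names
pmd_search_dirs and find_library are invented here, and placing /usr/lib64 and
/lib64 ahead of /usr/lib and /lib is a guess at a reasonable order rather than
anything decided in this thread.

```python
import os


def pmd_search_dirs(runpath=None, environ=None):
    """Directories to search for librte_pmd DSOs, in priority order."""
    environ = os.environ if environ is None else environ
    dirs = []
    # DT_RUNPATH first, then LD_LIBRARY_PATH, as described in the patch.
    for source in (runpath, environ.get("LD_LIBRARY_PATH")):
        if source:
            dirs.extend(d for d in source.split(":") if d)
    # Assumption: try the 64-bit directories before the generic ones.
    dirs.extend(["/usr/lib64", "/lib64", "/usr/lib", "/lib"])
    # Drop duplicates while keeping the first occurrence.
    seen = set()
    return [d for d in dirs if not (d in seen or seen.add(d))]


def find_library(name, dirs, exists=os.path.exists):
    """Return the first existing path dirs[i]/name, or None if none exists."""
    for d in dirs:
        candidate = os.path.join(d, name)
        if exists(candidate):
            return candidate
    return None
```

The exists parameter is only there so the lookup can be exercised without
touching the real filesystem.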

> > 
> > Currently the tool can output data in 3 formats:
> > 
> > a) raw, suitable for scripting, where the raw JSON strings are dumped out
> > b) table format (default) where hex pci ids are dumped in a table format
> > c) pretty, where a user supplied pci.ids file is used to print out vendor and
> > device strings
> 
> c) is a nice addition. Would be even nicer if it knew the most common
> pci.ids locations so it doesn't need extra arguments in those cases, ie see
> if /usr/share/hwdata/pci.ids or /usr/share/misc/pci.ids exists and use that
> unle

[dpdk-dev] [PATCHv2 1/4] pmdinfogen: Add buildtools and pmdinfogen utility

2016-05-19 Thread Neil Horman
On Thu, May 19, 2016 at 10:51:19AM +0300, Panu Matilainen wrote:
> On 05/19/2016 12:08 AM, Neil Horman wrote:
> [...]
> > +   if (strcmp(secname, ".modinfo") == 0) {
> > +   if (nobits)
> > +   fprintf(stderr, "%s has NOBITS .modinfo\n", filename);
> > +   info->modinfo = (void *)hdr + sechdrs[i].sh_offset;
> > +   info->modinfo_len = sechdrs[i].sh_size;
> > +   } else if (strcmp(secname, "__ksymtab") == 0)
> > +   info->export_sec = i;
> > +   else if (strcmp(secname, "__ksymtab_unused") == 0)
> > +   info->export_unused_sec = i;
> > +   else if (strcmp(secname, "__ksymtab_gpl") == 0)
> > +   info->export_gpl_sec = i;
> > +   else if (strcmp(secname, "__ksymtab_unused_gpl") == 0)
> > +   info->export_unused_gpl_sec = i;
> > +   else if (strcmp(secname, "__ksymtab_gpl_future") == 0)
> > +   info->export_gpl_future_sec = i;
> > +
> 
> Looks like a leftover from kernel modpost.c, not needed in DPDK.
> 
>   - Panu -
> 
Yup, I'll remove it
Neil



[dpdk-dev] EAL: memzone_reserve_aligned_thread_unsafe(): No more room in config

2016-05-19 Thread Sergio Gonzalez Monroy
On 18/05/2016 17:15, 张伟 wrote:
> Hi all,
>
>
> When using the dpdk multi process client server example, I create many clients.
> Once the number of clients reaches 1239, I hit this error:
>
> EAL: memzone_reserve_aligned_thread_unsafe(): No more room in config
>
> RING: Cannot reserve memory
>
> EAL: Error - exiting with code: 1
>
>Cause: Cannot create tx ring queue for client 1239
>
> I have 32G huge page memory. Can anyone give some guidance on how to increase
> the memzone memory? Which parameter should I adjust?

Usually that error means that you need to increase the following config 
value:
CONFIG_RTE_MAX_MEMZONE

Sergio



[dpdk-dev] [PATCH] pci: Add the class_id support in pci probe

2016-05-19 Thread Yang, Ziye


-Original Message-
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com] 
Sent: Thursday, May 19, 2016 6:34 PM
To: Yang, Ziye 
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] pci: Add the class_id support in pci probe

2016-05-11 14:08, Ziye Yang:
> This patch is used to add the class_id (class_code, subclass_code, 
> programming_interface) support for pci_device probe. With this patch, 
> it will be flexible for users to probe a class of devices by class_id.
> 
> Signed-off-by: Ziye Yang 
> ---
>  lib/librte_eal/bsdapp/eal/eal_pci.c | 4 
>  lib/librte_eal/common/eal_common_pci.c  | 3 +++
>  lib/librte_eal/common/include/rte_pci.h | 8 ++--
>  lib/librte_eal/linuxapp/eal/eal_pci.c   | 9 +
>  4 files changed, 22 insertions(+), 2 deletions(-)

Please remove the deprecation notice.

> --- a/lib/librte_eal/common/include/rte_pci.h
> +++ b/lib/librte_eal/common/include/rte_pci.h
> @@ -129,6 +129,7 @@ struct rte_pci_id {
>   uint16_t device_id;   /**< Device ID or PCI_ANY_ID. */
>   uint16_t subsystem_vendor_id; /**< Subsystem vendor ID or PCI_ANY_ID. */
>   uint16_t subsystem_device_id; /**< Subsystem device ID or PCI_ANY_ID. */
> + uint32_t class_id;   /**< Class ID (class, subclass, pi) or CLASS_ANY_ID. */
>  };

A space is missing.
It would be more logical to put the class_id at the beginning of the struct.

>  /** Any PCI device identifier (vendor, device, ...) */
>  #define PCI_ANY_ID (0xffff)
> +#define CLASS_ANY_ID (0xffffff)

These constants should be prefixed with RTE_.
[Ziye] I suggest doing another patch to change PCI_ANY_ID to RTE_PCI_ANY_ID,
since it will affect other files.

> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
> + /* get class_id */
> + snprintf(filename, sizeof(filename), "%s/class",
> +  dirname);
> + if (eal_parse_sysfs_value(filename, &tmp) < 0) {
> + free(dev);
> + return -1;
> + }
> + dev->id.class_id = (uint32_t)tmp && CLASS_ANY_ID;

Should be a bitwise &. Why is masking needed?
[Ziye]  Only 24 bits of info are needed.
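
The difference between the logical and the bitwise operator, and the 24-bit
packing of class, subclass and programming interface, can be illustrated with
a short sketch. It is written in Python rather than C, the helper names are
invented for this example, and the mask value 0xffffff is an assumption that
follows from the 24-bit remark above.

```python
CLASS_ANY_ID = 0xFFFFFF  # assumed: 24 bits = class, subclass, prog. interface


def pack_class_id(base_class, subclass, prog_if):
    """Combine the three 8-bit fields the way the BSD hunk of the patch does."""
    return ((base_class & 0xFF) << 16) | ((subclass & 0xFF) << 8) | (prog_if & 0xFF)


def unpack_class_id(class_id):
    """Split a 24-bit class_id back into (class, subclass, prog_if)."""
    return (class_id >> 16) & 0xFF, (class_id >> 8) & 0xFF, class_id & 0xFF


def parse_sysfs_class(text):
    """Parse the contents of a sysfs 'class' file such as '0x020000'."""
    return int(text.strip(), 16) & CLASS_ANY_ID


tmp = 0x020000  # value an Ethernet controller reports (class 0x02, subclass 0x00)

# C's `tmp && CLASS_ANY_ID` is a truth test and collapses to 0 or 1 ...
logical = int(bool(tmp) and bool(CLASS_ANY_ID))
# ... while `tmp & CLASS_ANY_ID` keeps the low 24 bits.
bitwise = tmp & CLASS_ANY_ID

print(logical, hex(bitwise), unpack_class_id(bitwise))
```

With the logical operator every non-zero class would be stored as 1, so no
device could ever match a driver table entry by class.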


[dpdk-dev] [PATCH] mbuf: make rearm_data address naturally aligned

2016-05-19 Thread Ananyev, Konstantin

Hi everyone,

> On Thu, May 19, 2016 at 12:20:16AM +0530, Jerin Jacob wrote:
> > On Wed, May 18, 2016 at 05:43:00PM +0100, Bruce Richardson wrote:
> > > On Wed, May 18, 2016 at 07:27:43PM +0530, Jerin Jacob wrote:
> > > > To avoid multiple stores on fast path, Ethernet drivers
> > > > aggregate the writes to data_off, refcnt, nb_segs and port
> > > > to an uint64_t data and write the data in one shot
> > > > with uint64_t* at &mbuf->rearm_data address.
> > > >
> > > > Some of the non-IA platforms have store operation overhead
> > > > if the store address is not naturally aligned.This patch
> > > > fixes the performance issue on those targets.
> > > >
> > > > Signed-off-by: Jerin Jacob 
> > > > ---
> > > >
> > > > Tested this patch on IA and non-IA(ThunderX) platforms.
> > > > This patch shows 400Kpps/core improvement on ThunderX + ixgbe + vector 
> > > > environment.
> > > > and this patch does not have any overhead on IA platform.
> > > >
> > > > Have tried another similar approach by replacing "buf_len" with "pad"
> > > > (in this patch context).
> > > > Since it has additional overhead on read and then mask to keep "buf_len"
> > > > intact, not much improvement is shown.
> > > > ref: http://dpdk.org/ml/archives/dev/2016-May/038914.html
> > > >
> > > > ---
> > > While this will work and from your tests doesn't seem to have a performance
> > > impact, I'm not sure I particularly like it. It's extending out the end of
> > > cacheline0 of the mbuf by 16 bytes, though I suppose it's not technically
> > > using up any more space of it.
> >
> > Extending by 2 bytes, right? Yes, I guess. Now we are using only 56 out of
> > 64 bytes in the first 64-byte cache line.
> >
> > >
> > > What I'm wondering about though, is do we have any usecases where we need a
> > > variable buf_len for packets for RX. These mbufs come directly from a mempool,
> > > which is generally understood to be a set of fixed-sized buffers. I realise that
> > > this change was made in the past after some discussion, but one of the key points
> > > there [at least to my reading] was that - even though nobody actually made a
> > > concrete case where they had variable-sized buffers - having support for them
> > > made no performance difference.

I was going to point to vhost zcp support, but as Thomas pointed out,
that functionality was removed from dpdk.org recently.
So I am not aware whether such a case exists right now in the 'real world' or not.
Though I still think the RX function should leave the buf_len field intact.

> > >
> > > The latter part of that has now changed, and supporting variable-sized mbufs
> > > from an mbuf pool has a perf impact. Do we definitely need that functionality,
> > > because the easiest fix here is just to move the rxrearm marker back above
> > > mbuf_len as it was originally in releases like 1.8?
> >
> > And initialize the buf_len with mp->elt_size - sizeof(struct rte_mbuf).
> > Right?
> >
> > I don't have a strong opinion on this, I can do this if there is no
> > objection on this. Let me know.
> >
> > However, I do see that in future "buf_len" may belong at the end of the first
> > 64-byte cache line, as "port" is currently defined as uint8_t, which IMO is too
> > small. We may need to increase it to uint16_t. The reason I think so is that
> > ThunderX HW currently has 128 VFs per socket for the built-in NIC, so a
> > two-node configuration plus one external PCIe NW card can easily go beyond
> > 256 ports.

I wonder, does anyone really use the mbuf port field?
My thought was: could we drop it completely?
Actually, after discussing it with Bruce offline, an interesting idea came out:
if we drop port and make mbuf_prefree() reset nb_segs=1, then
we can reduce RX rearm_data to 4B. So with that layout:

struct rte_mbuf {

    MARKER cacheline0;

    void *buf_addr;
    phys_addr_t buf_physaddr;
    uint16_t buf_len;
    uint8_t nb_segs;
    uint8_t reserved_1byte;   /* former port */

    MARKER32 rearm_data;
    uint16_t data_off;
    uint16_t refcnt;

    uint64_t ol_flags;
    ...

We can keep buf_len in its place and avoid the 2B gap, while making rearm_data
4B long and 4B aligned.

Another similar alternative is to make mbuf_prefree() set refcnt=1
(as it updates it anyway). Then we can remove refcnt from the RX rearm_data,
and again make rearm_data 4B long and 4B aligned:

struct rte_mbuf {

    MARKER cacheline0;

    void *buf_addr;
    phys_addr_t buf_physaddr;
    uint16_t buf_len;
    uint16_t refcnt;

    MARKER32 rearm_data;
    uint16_t data_off;
    uint8_t nb_segs;
    uint8_t port;

    uint64_t ol_flags;
    ...
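
The offsets in both proposed layouts can be checked mechanically. The sketch
below models only the fields listed above with Python's ctypes; it assumes a
64-bit target where both the buffer pointer and phys_addr_t occupy 8 bytes,
and the class names MbufNoPort and MbufRefcntMoved and the helper rearm_span
are invented for this illustration.

```python
import ctypes


class MbufNoPort(ctypes.Structure):
    """First proposal: port dropped, nb_segs reset by mbuf_prefree()."""
    _fields_ = [
        ("buf_addr", ctypes.c_uint64),       # void *, assumed 8 bytes
        ("buf_physaddr", ctypes.c_uint64),   # phys_addr_t, assumed 8 bytes
        ("buf_len", ctypes.c_uint16),
        ("nb_segs", ctypes.c_uint8),
        ("reserved_1byte", ctypes.c_uint8),  # former port
        ("data_off", ctypes.c_uint16),       # rearm_data starts here
        ("refcnt", ctypes.c_uint16),
        ("ol_flags", ctypes.c_uint64),
    ]


class MbufRefcntMoved(ctypes.Structure):
    """Second proposal: refcnt set by mbuf_prefree(), port kept."""
    _fields_ = [
        ("buf_addr", ctypes.c_uint64),
        ("buf_physaddr", ctypes.c_uint64),
        ("buf_len", ctypes.c_uint16),
        ("refcnt", ctypes.c_uint16),
        ("data_off", ctypes.c_uint16),       # rearm_data starts here
        ("nb_segs", ctypes.c_uint8),
        ("port", ctypes.c_uint8),
        ("ol_flags", ctypes.c_uint64),
    ]


def rearm_span(struct, first, last):
    """Return (offset, length) of the bytes from field first through last."""
    start = getattr(struct, first).offset
    end = getattr(struct, last).offset + getattr(struct, last).size
    return start, end - start


for struct, first, last in ((MbufNoPort, "data_off", "refcnt"),
                            (MbufRefcntMoved, "data_off", "port")):
    offset, length = rearm_span(struct, first, last)
    print(struct.__name__, "rearm_data offset", offset, "length", length,
          "4B aligned:", offset % 4 == 0)
```

Under these assumptions both variants place rearm_data at offset 20 with a
length of 4 bytes, and ol_flags stays at offset 24 with no padding before it.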

As an additional plus, __rte_mbuf_raw_alloc() wouldn't need to modify mbuf
contents at all, which probably is a good thing.
As a drawback - we'll have a free mbufs in pool wi

[dpdk-dev] [PATCH 08/20] thunderx/nicvf: add tx_queue_setup/release support

2016-05-19 Thread Pattan, Reshma


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> Sent: Saturday, May 7, 2016 4:16 PM
> To: dev at dpdk.org
> Cc: thomas.monjalon at 6wind.com; Richardson, Bruce
> ; Jerin Jacob
> ; Maciej Czekaj
> ; Kamil Rytarowski
> ; Zyta Szpak
> ; Slawomir Rosek ;
> Radoslaw Biernacki 
> Subject: [dpdk-dev] [PATCH 08/20] thunderx/nicvf: add tx_queue_setup/release
> support
> + txq->txq_flags & ETH_TXQ_FLAGS_NOMULTMEMP);
> +
> + /* Choose optimum free threshold value for multipool case */
> + if (!txq->is_single_pool)
> + txq->tx_free_thresh =
> + (uint16_t)(tx_conf->tx_free_thresh == DEFAULT_TX_FREE_THRESH ?
> + DEFAULT_TX_FREE_MPOOL_THRESH :
> + tx_conf->tx_free_thresh);
> + txq->tail = 0;
> + txq->head = 0;
> +

txq->tail and txq->head are set to 0 in nicvf_tx_queue_reset(), so would it be
ok to remove them here?

Thanks,
Reshma


[dpdk-dev] [PATCH] tools:new tool for system info CPU, memory and huge pages

2016-05-19 Thread Hunt, David
Hi Keith.
Works nicely on the few different machines I tried it on.
Regards,
Dave.

On 5/13/2016 4:43 PM, Keith Wiles wrote:
> The new tool uses /sys/devices instead of /proc directory, which
> does not exist on all systems. If the procfs is not available
> then memory and huge page information is skipped.
>
> The tool also can emit a json format in short or long form to
> allow for machine readable information.
>
> Here is the usage information:
>
> Usage: sys_info.py [options]
>
> Show the lcores layout with core id and socket(s).
>
> Options:
>  --help or -h:
>  Display the usage information and quit
>
>  --long or -l:
>  Display the information in a machine readable long format.
>
>  --short or -s:
>  Display the information in a machine readable short format.
>
>  --pci or -p:
>  Display all of the Ethernet devices in the system using 'lspci'.
>
>  --version or -v:
>  Display the current version number of this script.
>
>  --debug or -d:
>  Output some debug information.
>
>  default:
>  Display the information in a human readable format.
>
> Signed-off-by: Keith Wiles 
> ---
>   tools/sys_info.py | 551 
> ++
>   1 file changed, 551 insertions(+)
>   create mode 100755 tools/sys_info.py
>
> diff --git a/tools/sys_info.py b/tools/sys_info.py
> new file mode 100755
> index 0000000..5e09d12
> --- /dev/null
> +++ b/tools/sys_info.py
> @@ -0,0 +1,551 @@
> +#! /usr/bin/python
> +#
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2016 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +#
> +
> +import os, sys
> +import getopt
> +import json
> +from os.path import exists, abspath, dirname, basename
> +
> +version = "0.1.3"
> +
> +# Global lists and flags
> +machine_readable = 0
> +show_pci = False
> +debug_flag = False
> +coremaps_flag = False
> +
> +sys_info = {}
> +coremaps = {}
> +
> +def proc_cpuinfo_path():
> +'''Return the cpu information from /proc'''
> +return "/proc/cpuinfo"
> +
> +def proc_sysinfo_path():
> +'''Return the system path string from /proc'''
> +return "/proc/sysinfo"
> +
> +def proc_meminfo_path():
> +'''Return the memory information path from /proc'''
> +return "/proc/meminfo"
> +
> +def sys_system_path():
> +'''Return the system path string from /sys'''
> +return "/sys/devices/system"
> +
> +def read_file(path, whole_file=False):
> +'''Read the first line of a file'''
> +
> +if os.access(path, os.F_OK) == False:
> +print "Path (%s) Not found" % path
> +return ""
> +
> +fd = open(path)
> +if whole_file == True:
> +lines = fd.readlines()
> +else:
> +line = fd.readline()
> +fd.close()
> +
> +if whole_file == True:
> +return lines
> +
> +return line
> +
> +def get_range(line):
> +'''Split a line and convert to low/high values'''
> +
> +line = line.strip()
> +
> +if '-' in line:
> +low, high = line.split("-")
> +elif ',' in line:
> +low, high = line.split(",")
> +else:
> +return [int(line)]
> +
> +return [int(low), int(high)]
> +
> +def get_ranges(line):
> +'''Split a set of ranges into first low/high, second low/high value'''
> +
> +line = line.strip()
> +
> +first, second = line.split
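
For comparison, the list format that the kernel uses in files such as
/sys/devices/system/cpu/online mixes single ids and ranges, for example
"0-3,8-11". The following sketch shows one way to expand such a string. It is
written for Python 3 and is not taken from the patch; the function name
parse_cpulist is made up for this example.

```python
def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8-11' into a list of ints."""
    cpus = []
    for chunk in text.strip().split(","):
        if not chunk:
            continue  # empty input or stray comma
        low, sep, high = chunk.partition("-")
        if sep:
            cpus.extend(range(int(low), int(high) + 1))  # inclusive range
        else:
            cpus.append(int(low))
    return cpus


print(parse_cpulist("0-3,8-11\n"))
```

Handling every comma-separated chunk independently avoids having to decide up
front whether the line holds a range, a pair, or a single value.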

[dpdk-dev] [PATCH v2] pci: Add the class_id support in pci probe

2016-05-19 Thread Ziye Yang
This patch is used to add the class_id (class_code,
subclass_code, programming_interface) support for
pci_device probe. With this patch, it will be
flexible for users to probe a class of devices
by class_id.

Signed-off-by: Ziye Yang 
---
 doc/guides/rel_notes/deprecation.rst| 6 --
 lib/librte_eal/bsdapp/eal/eal_pci.c | 5 +
 lib/librte_eal/common/eal_common_pci.c  | 3 +++
 lib/librte_eal/common/include/rte_pci.h | 8 ++--
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 9 +
 5 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 7d94ba5..28f9c61 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -20,12 +20,6 @@ Deprecation Notices
   do not need to care about the kind of devices that are being used, making it
   easier to add new buses later.

-* ABI changes are planned for struct rte_pci_id, i.e., add new field ``class``.
-  This new added ``class`` field can be used to probe pci device by class
-  related info. This change should impact size of struct rte_pci_id and struct
-  rte_pci_device. The release 16.04 does not contain these ABI changes, but
-  release 16.07 will.
-
 * The xstats API and rte_eth_xstats struct will be changed to allow retrieval
   of values without any string copies or parsing.
   No backwards compatibility is planned, as it would require code duplication
diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 2d16d78..7fdd6f1 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -278,6 +278,11 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
/* get subsystem_device id */
dev->id.subsystem_device_id = conf->pc_subdevice;

+   /* get class id */
+   dev->id.class_id = (conf->pc_class << 16) |
+  (conf->pc_subclass << 8) |
+  (conf->pc_progif);
+
/* TODO: get max_vfs */
dev->max_vfs = 0;

diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index 3cae4cb..6c3117d 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -162,6 +162,9 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *d
if (id_table->subsystem_device_id != dev->id.subsystem_device_id &&
id_table->subsystem_device_id != PCI_ANY_ID)
continue;
+   if (id_table->class_id != dev->id.class_id &&
+   id_table->class_id != RTE_CLASS_ANY_ID)
+   continue;

struct rte_pci_addr *loc = &dev->addr;

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 8fa2712..c30adaf 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -125,6 +125,7 @@ struct rte_pci_resource {
  * table of these IDs for each device that it supports.
  */
 struct rte_pci_id {
+   uint32_t class_id;/**< Class ID (class, subclass, pi) or RTE_CLASS_ANY_ID. */
uint16_t vendor_id;   /**< Vendor ID or PCI_ANY_ID. */
uint16_t device_id;   /**< Device ID or PCI_ANY_ID. */
uint16_t subsystem_vendor_id; /**< Subsystem vendor ID or PCI_ANY_ID. */
@@ -170,6 +171,7 @@ struct rte_pci_device {

 /** Any PCI device identifier (vendor, device, ...) */
 #define PCI_ANY_ID (0xffff)
+#define RTE_CLASS_ANY_ID (0xffffff)

 #ifdef __cplusplus
 /** C++ macro used to help building up tables of device IDs */
@@ -177,14 +179,16 @@ struct rte_pci_device {
(vend),   \
(dev),\
PCI_ANY_ID,   \
-   PCI_ANY_ID
+   PCI_ANY_ID,   \
+   RTE_CLASS_ANY_ID
 #else
 /** Macro used to help building up tables of device IDs */
 #define RTE_PCI_DEVICE(vend, dev)  \
.vendor_id = (vend),   \
.device_id = (dev),\
.subsystem_vendor_id = PCI_ANY_ID, \
-   .subsystem_device_id = PCI_ANY_ID
+   .subsystem_device_id = PCI_ANY_ID, \
+   .class_id = RTE_CLASS_ANY_ID
 #endif

 struct rte_pci_driver;
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index bdc08a0..ff255b4 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -306,6 +306,15 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t 
bus,
}
dev->id.subsystem_device_id = (uint16_t)tmp;

+   /* get class_id */
+   snprintf(filename, sizeof(filename), "%s/class",
+dirname);
+   if (eal_parse_sysfs_value(filename, &tmp) < 0) {
+   free(dev);
+   return -1;
+   }
+   dev->id.class_id = (uint32_t)tm

[dpdk-dev] [PATCH v2] cmdline: check return value at initialization

2016-05-19 Thread Thomas Monjalon
2016-05-17 10:36, Olivier Matz:
> From: Marcin Kerlin 
> 
> The value returned by rdline_init() was not checked in cmdline_new().
> On error, free the allocated memory and return NULL.
> 
> This condition should not happen today, but it's safer to do the check
> in case rdline_init() is updated.
> 
> Fixes: af75078fece3 ("first public release")
> Coverity ID 13204
> 
> Signed-off-by: Marcin Kerlin 
> Signed-off-by: Olivier Matz 
> ---
> 
> Hi Marcin,
> 
> I updated the commit log and title to be clearer.

Applied, thanks


[dpdk-dev] [PATCH 1/2] mbuf: new NSH packet type

2016-05-19 Thread Olivier Matz
Hi Jingjing,

On 05/03/2016 07:51 AM, Jingjing Wu wrote:
> Signed-off-by: Jingjing Wu 
> ---
>  lib/librte_mbuf/rte_mbuf.h | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 529debb..79edae3 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -274,6 +274,13 @@ extern "C" {
>   */
>  #define RTE_PTYPE_L2_ETHER_LLDP 0x00000004
>  /**
> + * NSH (Network Service Header) packet type.
> + *
> + * Packet format:
> + * <'ether type'=0x894F>
> + */
> +#define RTE_PTYPE_L2_ETHER_NSH  0x00000005
> +/**
>   * Mask of layer 2 packet types.
>   * It is used for outer packet for tunneling cases.
>   */
> 

Acked-by: Olivier Matz 


I have no objection to this patch, but it makes me think about
2 things:

- we have room for 16 types for each layer; maybe we should
  start to be careful about which types should be supported to
  avoid running out of types in the future.

- The types supported in outer and inner have diverged. It would
  have been better to have something like:

  #define RTE_PTYPE_INNER_$type (RTE_PTYPE_$type << 16)

  Because it would make the software using the packet types
  simpler.

It's maybe a bit late now because it would break the ABI, but
this is something we could keep in mind in case we change the
ABI for another reason.
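
To illustrate the shift-based scheme suggested above, here is a minimal sketch. It is not DPDK code: the EX_ names and values are hypothetical, and it assumes 4 bits per layer (room for 16 types) with the outer layers in the low 16 bits. The point is that one set of checks can serve both outer and inner headers once the inner half is shifted down.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values: 4 bits per layer, outer layers in bits 0-15. */
#define EX_PTYPE_L2_ETHER    0x00000001u
#define EX_PTYPE_L2_MASK     0x0000000fu
#define EX_PTYPE_L3_IPV4     0x00000010u
#define EX_PTYPE_L3_MASK     0x000000f0u
#define EX_PTYPE_INNER_SHIFT 16

/* Derive the inner encoding of an outer type by shifting it up. */
static inline uint32_t
ex_ptype_inner(uint32_t outer)
{
	assert((outer >> EX_PTYPE_INNER_SHIFT) == 0); /* outer bits only */
	return outer << EX_PTYPE_INNER_SHIFT;
}

/* Bring the inner half down so outer-layer checks can be reused. */
static inline uint32_t
ex_ptype_inner_as_outer(uint32_t ptype)
{
	return ptype >> EX_PTYPE_INNER_SHIFT;
}

/* One L3 check that serves both outer and (shifted) inner types. */
static inline int
ex_ptype_is_ipv4(uint32_t ptype)
{
	return (ptype & EX_PTYPE_L3_MASK) == EX_PTYPE_L3_IPV4;
}
```

With such a layout, software would call ex_ptype_is_ipv4(ptype) for the outer header and ex_ptype_is_ipv4(ex_ptype_inner_as_outer(ptype)) for the inner one, instead of keeping two diverging sets of definitions.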

Regards,
Olivier


[dpdk-dev] [PATCH v2] rte mempool: division or modulo by zero

2016-05-19 Thread Slawomir Mrozowicz
Fix issue reported by Coverity.

Coverity ID 13243: Division or modulo by zero
In function rte_mempool_xmem_size, division by total_elt_sz, which may be
zero, has undefined behavior.

Fixes: 148f963fb532 ("xen: core library changes")

Signed-off-by: Slawomir Mrozowicz 
---
 lib/librte_mempool/rte_mempool.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 1ab6701..b54de43 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -239,6 +239,9 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t 
total_elt_sz, uint32_t pg_shift)
 {
size_t obj_per_page, pg_num, pg_sz;

+   if (total_elt_sz == 0)
+   return 0;
+
if (pg_shift == 0)
return total_elt_sz * elt_num;

-- 
1.9.1
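
For readers who want to see where the zero divisor bites, the following is a simplified, standalone model of the size computation. It is a sketch written for illustration, not the DPDK source: the ex_ name is hypothetical and the rounding for oversized elements is an assumption. The early return mirrors the guard added by the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the page-size computation; not the DPDK source. */
static size_t
ex_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
{
	size_t obj_per_page, pg_num, pg_sz;

	if (total_elt_sz == 0)
		return 0;	/* guard: avoids the division by zero below */

	if (pg_shift == 0)
		return total_elt_sz * elt_num;

	assert(pg_shift < sizeof(size_t) * 8);
	pg_sz = (size_t)1 << pg_shift;
	obj_per_page = pg_sz / total_elt_sz;	/* divisor is non-zero here */
	if (obj_per_page == 0) {
		/* element larger than a page: round it up to whole pages */
		size_t pages_per_obj = (total_elt_sz + pg_sz - 1) / pg_sz;

		return pages_per_obj * pg_sz * elt_num;
	}

	/* number of pages needed when each page holds obj_per_page elements */
	pg_num = (elt_num + obj_per_page - 1) / obj_per_page;
	return pg_num << pg_shift;
}
```

Without the first check, a call with total_elt_sz == 0 and a non-zero pg_shift would reach the pg_sz / total_elt_sz division.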



[dpdk-dev] [PATCH v4] mbuf: decrease refcnt when detaching

2016-05-19 Thread Thomas Monjalon
> > The rte_pktmbuf_detach() function should decrease refcnt on a direct
> > buffer.
> > 
> > Signed-off-by: Hiroyuki Mikita 
> 
> Acked-by: Olivier Matz 

Applied with the doc reference in the commit message, thanks.


[dpdk-dev] [PATCH v3 00/35] mempool: rework memory allocation

2016-05-19 Thread Thomas Monjalon
2016-05-18 13:04, Olivier Matz:
> This series is a rework of mempool. For those who don't want to read
> the whole cover letter, here is a summary:
> 
> - it is not possible to allocate large mempools if there is not enough
>   contiguous memory; this series solves this issue
> - introduce new APIs with fewer arguments: "create, populate, obj_init"
> - allow freeing a mempool
> - split code into smaller functions, which will ease the introduction of
>   ext_handler
> - remove test-pmd anonymous mempool creation
> - remove most of dom0-specific mempool code
> - open the door for an eal_memory rework: we probably don't need large
>   contiguous memory areas anymore, working with pages would work.
> 
> This breaks the ABI as it was indicated in the deprecation for 16.04.
> The API stays almost the same, no modification is needed in examples app
> or in test-pmd. Only kni and mellanox drivers are slightly modified.

Applied with a small change you sent me to fix the mlx build in the middle of
the patchset, and with an update of the removed Xen files in the MAINTAINERS
file.

Thanks for the big rework!


[dpdk-dev] [PATCH] pci: Add the class_id support in pci probe

2016-05-19 Thread Thomas Monjalon
2016-05-19 12:18, Yang, Ziye:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] 
> 2016-05-11 14:08, Ziye Yang:
> > +   dev->id.class_id = (uint32_t)tmp && CLASS_ANY_ID;
> 
> Should be a bitwise &. Why is masking needed?
> [Ziye]  Only 24bit info is needed.

What are the other bits?
Please put a comment in the code.


[dpdk-dev] [PATCH] pci: Add the class_id support in pci probe

2016-05-19 Thread Yang, Ziye


-Original Message-
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com] 
Sent: Thursday, May 19, 2016 8:57 PM
To: Yang, Ziye 
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] pci: Add the class_id support in pci probe

2016-05-19 12:18, Yang, Ziye:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] 
> 2016-05-11 14:08, Ziye Yang:
> > +   dev->id.class_id = (uint32_t)tmp && CLASS_ANY_ID;
> 
> Should be a bitwise &. Why is masking needed?
> [Ziye]  Only 24bit info is needed.

What are the other bits?
Please put a comment in the code.
[Ziye] Revision ID is defined in the PCI spec; the class ID has 24 bits. When
we read it from the system, we only get the class, subclass and programming
interface. I will add the comment.
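
The point about bitwise versus logical AND, and the 24-bit layout, can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: the ex_ helpers and EX_CLASS_ANY_ID are hypothetical names, and the wildcard is assumed to be all 24 bits set. The example class codes (0x010802 for an NVMe controller, 0x020000 for an Ethernet controller) follow the standard PCI class code assignments.

```c
#include <assert.h>
#include <stdint.h>

#define EX_CLASS_ANY_ID 0xffffffu	/* hypothetical wildcard: all 24 bits set */

/* Pack class, subclass and programming interface into 24 bits. */
static inline uint32_t
ex_class_id(uint8_t base_class, uint8_t subclass, uint8_t prog_if)
{
	return ((uint32_t)base_class << 16) | ((uint32_t)subclass << 8) | prog_if;
}

/*
 * Keep only the low 24 bits of a parsed value. This must be a bitwise AND:
 * a logical AND (&&) would collapse the result to 0 or 1.
 */
static inline uint32_t
ex_class_from_raw(unsigned long raw)
{
	return (uint32_t)raw & EX_CLASS_ANY_ID;
}

/* A table entry matches if it is the wildcard or equals the device class. */
static inline int
ex_class_match(uint32_t table_class, uint32_t dev_class)
{
	assert(dev_class <= EX_CLASS_ANY_ID);
	return table_class == EX_CLASS_ANY_ID || table_class == dev_class;
}
```

A driver table entry carrying the wildcard would then match every device, while an entry carrying 0x010802 would match only devices reporting that exact class, subclass and programming interface.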


[dpdk-dev] [PATCH v3] ci: Add the class_id support in pci probe

2016-05-19 Thread Ziye Yang
This patch is used to add the class_id (class_code,
subclass_code, programming_interface) support for
pci_device probe. With this patch, it will be
flexible for users to probe a class of devices
by class_id.

Signed-off-by: Ziye Yang 
---
 doc/guides/rel_notes/deprecation.rst|  6 --
 lib/librte_eal/bsdapp/eal/eal_pci.c |  5 +
 lib/librte_eal/common/eal_common_pci.c  |  3 +++
 lib/librte_eal/common/include/rte_pci.h |  8 ++--
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 10 ++
 5 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 7d94ba5..28f9c61 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -20,12 +20,6 @@ Deprecation Notices
   do not need to care about the kind of devices that are being used, making it
   easier to add new buses later.

-* ABI changes are planned for struct rte_pci_id, i.e., add new field ``class``.
-  This new added ``class`` field can be used to probe pci device by class
-  related info. This change should impact size of struct rte_pci_id and struct
-  rte_pci_device. The release 16.04 does not contain these ABI changes, but
-  release 16.07 will.
-
 * The xstats API and rte_eth_xstats struct will be changed to allow retrieval
   of values without any string copies or parsing.
   No backwards compatibility is planned, as it would require code duplication
diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 2d16d78..7fdd6f1 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -278,6 +278,11 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
/* get subsystem_device id */
dev->id.subsystem_device_id = conf->pc_subdevice;

+   /* get class id */
+   dev->id.class_id = (conf->pc_class << 16) |
+  (conf->pc_subclass << 8) |
+  (conf->pc_progif);
+
/* TODO: get max_vfs */
dev->max_vfs = 0;

diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index 3cae4cb..6c3117d 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -162,6 +162,9 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, 
struct rte_pci_device *d
if (id_table->subsystem_device_id != 
dev->id.subsystem_device_id &&
id_table->subsystem_device_id != PCI_ANY_ID)
continue;
+   if (id_table->class_id != dev->id.class_id &&
+   id_table->class_id != RTE_CLASS_ANY_ID)
+   continue;

struct rte_pci_addr *loc = &dev->addr;

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 8fa2712..c30adaf 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -125,6 +125,7 @@ struct rte_pci_resource {
  * table of these IDs for each device that it supports.
  */
 struct rte_pci_id {
+   uint32_t class_id;/**< Class ID (class, subclass, pi) or 
RTE_CLASS_ANY_ID. */
uint16_t vendor_id;   /**< Vendor ID or PCI_ANY_ID. */
uint16_t device_id;   /**< Device ID or PCI_ANY_ID. */
uint16_t subsystem_vendor_id; /**< Subsystem vendor ID or PCI_ANY_ID. */
@@ -170,6 +171,7 @@ struct rte_pci_device {

 /** Any PCI device identifier (vendor, device, ...) */
 #define PCI_ANY_ID (0xffff)
+#define RTE_CLASS_ANY_ID (0xffffff)

 #ifdef __cplusplus
 /** C++ macro used to help building up tables of device IDs */
@@ -177,14 +179,16 @@ struct rte_pci_device {
(vend),   \
(dev),\
PCI_ANY_ID,   \
-   PCI_ANY_ID
+   PCI_ANY_ID,   \
+   RTE_CLASS_ANY_ID
 #else
 /** Macro used to help building up tables of device IDs */
 #define RTE_PCI_DEVICE(vend, dev)  \
.vendor_id = (vend),   \
.device_id = (dev),\
.subsystem_vendor_id = PCI_ANY_ID, \
-   .subsystem_device_id = PCI_ANY_ID
+   .subsystem_device_id = PCI_ANY_ID, \
+   .class_id = RTE_CLASS_ANY_ID
 #endif

 struct rte_pci_driver;
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index bdc08a0..e6f0f13 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -306,6 +306,16 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t 
bus,
}
dev->id.subsystem_device_id = (uint16_t)tmp;

+   /* get class_id */
+   snprintf(filename, sizeof(filename), "%s/class",
+dirname);
+   if (eal_parse_sysfs_value(filename, &tmp) < 0) {
+   free(dev);
+   return -1;
+   }
+   /* the least 24 bits are 

[dpdk-dev] [PATCH 4/4] pmd_hw_support.py: Add tool to query binaries for hw support information

2016-05-19 Thread Neil Horman
On Thu, May 19, 2016 at 09:08:52AM +0300, Panu Matilainen wrote:
> On 05/18/2016 04:48 PM, Neil Horman wrote:
> > On Wed, May 18, 2016 at 03:48:12PM +0300, Panu Matilainen wrote:
> > > On 05/18/2016 03:03 PM, Neil Horman wrote:
> > > > On Wed, May 18, 2016 at 02:48:30PM +0300, Panu Matilainen wrote:
> > > > > On 05/16/2016 11:41 PM, Neil Horman wrote:
> > > > > > This tool searches for the primer string PMD_DRIVER_INFO= in any ELF
> > > > > > binary and, if found, parses the remainder of the string as a
> > > > > > JSON-encoded string, outputting the results in either a human-readable
> > > > > > or a raw, script-parseable format.
> > > > > > 
> > > > > > Signed-off-by: Neil Horman 
> > > > > > CC: Bruce Richardson 
> > > > > > CC: Thomas Monjalon 
> > > > > > CC: Stephen Hemminger 
> > > > > > CC: Panu Matilainen 
> > > > > > ---
> > > > > >  tools/pmd_hw_support.py | 174 
> > > > > > 
> > > > > >  1 file changed, 174 insertions(+)
> > > > > >  create mode 100755 tools/pmd_hw_support.py
> > > > > > 
> > > > > > diff --git a/tools/pmd_hw_support.py b/tools/pmd_hw_support.py
> > > > > > new file mode 100755
> > > > > > index 0000000..0669aca
> > > > > > --- /dev/null
> > > > > > +++ b/tools/pmd_hw_support.py
> > > > > > @@ -0,0 +1,174 @@
> > > > > > +#!/usr/bin/python3
> > > > > 
> > > > > I think this should use /usr/bin/python to be consistent with the 
> > > > > other
> > > > > python scripts, and like the others work with python 2 and 3. I only 
> > > > > tested
> > > > > it with python2 after changing this and it seemed to work fine so the
> > > > > compatibility side should be fine as-is.
> > > > > 
> > > > Sure, I can change the python executable, that makes sense.
> > > > 
> > > > > On the whole, AFAICT the patch series does what it promises, and 
> > > > > works for
> > > > > both static and shared linkage. Using JSON formatted strings in an ELF
> > > > > section is a sound working technical solution for the storage of the 
> > > > > data.
> > > > > But the difference between the two cases makes me wonder about this 
> > > > > all...
> > > > You mean the difference between checking static binaries and dynamic 
> > > > binaries?
> > > > yes, there is some functional difference there
> > > > 
> > > > > 
> > > > > For static library build, you'd query the application executable, eg
> > > > Correct.
> > > > 
> > > > > testpmd, to get the data out. For a shared library build, that method 
> > > > > gives
> > > > > absolutely nothing because the data is scattered around in individual
> > > > > libraries which might be just about wherever, and you need to somehow
> > > > Correct, I figured that users would be smart enough to realize that with
> > > > dynamically linked executables, they would need to look at DSOs, but I
> > > > agree, it's a glaring difference.
> > > 
> > > Being able to look at DSOs is good, but expecting the user to figure out
> > > which DSOs might be loaded and not and where to look is going to be well
> > > above many users. At very least it's not what I would call user-friendly.
> > > 
> > I disagree, there is no linkage between an application and the DSOs it
> > opens via dlopen that is exportable.  The only way to handle that is to
> > have a standard search path for the pmd_hw_info python script.  That's
> > just like how modinfo works (i.e. "modinfo bnx2" finds the bnx2 module for
> > the running kernel).  We can of course do something similar, but we have
> > no existing implicit path information to draw from to do that (because
> > you can have multiple dpdk installs side by side).  The only way around
> > that is to explicitly call out the path on the command line.
> 
> There's no telling what libraries user might load at runtime with -D, that
> is true for both static and shared libraries.
> 
I agree.

> When CONFIG_RTE_EAL_PMD_PATH is set, as it is likely to be on distro builds,
> you *know* that everything in that path will be loaded at runtime regardless
> of what commandline options there might be so the situation is actually on
> par with static builds. Of course you still don't know about ones added with
> -D but that's a limitation of any solution that works without actually
> running the app.
> 
It's not on ours, as the pmd libraries get placed in the same directory as every
other dpdk library, and no one wants to try (and fail) to load
rte_sched/rte_acl/etc twice, or deal with the fallout of trying to do so, or
adjust the packaging so that pmds are placed in their own subdirectory, or
handle the need for multiuser support.

Using CONFIG_RTE_EAL_PMD_PATH also doesn't account for directory changes.  This
use case:
1) run pmdinfo 
2) remove DSOs from RTE_EAL_PMD_PATH
3) execute 

leads to erroneous results, as hardware support that was reported in (1) is no
longer available at (3)

It also completely misses any libraries that we load via the -d option on the
co
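
As a rough illustration of what the tool discussed in this thread does, the sketch below scans a memory image for the PMD_DRIVER_INFO= primer string and returns the text that follows it. It is an assumption-laden C rendering for illustration only: the real tool is a Python script, the function name is hypothetical, and parsing of the JSON payload is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EX_PMD_INFO_TAG "PMD_DRIVER_INFO="

/*
 * Return a pointer to the text following the first tag in buf[0..len),
 * or NULL if no tag with a NUL-terminated payload is found.
 */
static const char *
ex_find_pmd_info(const char *buf, size_t len)
{
	const size_t tag_len = strlen(EX_PMD_INFO_TAG);
	size_t i;

	assert(buf != NULL);
	for (i = 0; i + tag_len <= len; i++) {
		if (memcmp(buf + i, EX_PMD_INFO_TAG, tag_len) != 0)
			continue;
		/* require a terminating NUL inside the buffer */
		if (memchr(buf + i + tag_len, '\0', len - i - tag_len) == NULL)
			return NULL;
		return buf + i + tag_len;
	}
	return NULL;
}
```

The same scan works on a statically linked executable or on an individual DSO; which files to scan is exactly the open question in the discussion above.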

[dpdk-dev] [PATCH] mbuf: make rearm_data address naturally aligned

2016-05-19 Thread Jerin Jacob
On Thu, May 19, 2016 at 12:18:57PM +0000, Ananyev, Konstantin wrote:
> 
> Hi everyone,
>  
> > On Thu, May 19, 2016 at 12:20:16AM +0530, Jerin Jacob wrote:
> > > On Wed, May 18, 2016 at 05:43:00PM +0100, Bruce Richardson wrote:
> > > > On Wed, May 18, 2016 at 07:27:43PM +0530, Jerin Jacob wrote:
> > > > > To avoid multiple stores on fast path, Ethernet drivers
> > > > > aggregate the writes to data_off, refcnt, nb_segs and port
> > > > > to an uint64_t data and write the data in one shot
> > > > > with uint64_t* at &mbuf->rearm_data address.
> > > > >
> > > > > Some of the non-IA platforms have store operation overhead
> > > > > > if the store address is not naturally aligned. This patch
> > > > > fixes the performance issue on those targets.
> > > > >
> > > > > Signed-off-by: Jerin Jacob 
> > > > > ---
> > > > >
> > > > > > Tested this patch on IA and non-IA (ThunderX) platforms.
> > > > > > This patch shows a 400Kpps/core improvement in a ThunderX + ixgbe +
> > > > > > vector environment, and it does not have any overhead on the IA platform.
> > > > > >
> > > > > > I have tried another similar approach, replacing "buf_len" with "pad"
> > > > > > (in this patch context). Since it has the additional overhead of a read
> > > > > > and then a mask to keep "buf_len" intact, not much improvement is shown.
> > > > > ref: http://dpdk.org/ml/archives/dev/2016-May/038914.html
> > > > >
> > > > > ---
> > > > > While this will work and from your tests doesn't seem to have a
> > > > > performance impact, I'm not sure I particularly like it. It's extending
> > > > > out the end of cacheline0 of the mbuf by 16 bytes, though I suppose it's
> > > > > not technically using up any more space of it.
> > > >
> > > > Extending by 2 bytes, right? Yes, I guess; now we are using only 56 out
> > > > of 64 bytes in the first 64-byte cache line.
> > >
> > > >
> > > > What I'm wondering about though, is do we have any usecases where we 
> > > > need a
> > > > variable buf_len for packets for RX. These mbufs come directly from a 
> > > > mempool,
> > > > which is generally understood to be a set of fixed-sized buffers. I 
> > > > realise that
> > > > this change was made in the past after some discussion, but one of the 
> > > > key points
> > > > there [at least to my reading] was that - even though nobody actually 
> > > > made a
> > > > concrete case where they had variable-sized buffers - having support 
> > > > for them
> > > > made no performance difference.
> 
> I was going to point to vhost zcp support, but as Thomas pointed out
> that functionality was removed from dpdk.org recently.
> So I am not aware whether such a case exists right now in the 'real world' or not.
> Though I still think the RX function should leave the buf_len field intact.
> 
> > > >
> > > > The latter part of that has now changed, and supporting variable-sized 
> > > > mbufs
> > > > from an mbuf pool has a perf impact. Do we definitely need that 
> > > > functionality,
> > > > because the easiest fix here is just to move the rxrearm marker back 
> > > > above
> > > > mbuf_len as it was originally in releases like 1.8?
> > >
> > > And initialize the buf_len with mp->elt_size - sizeof(struct rte_mbuf).
> > > Right?
> > >
> > > I don't have a strong opinion on this, I can do this if there is no
> > > objection on this. Let me know.
> > >
> > > However, I do see that in future "buf_len" may belong at the end of the
> > > first 64-byte cache line, as currently "port" is defined as uint8_t, which
> > > IMO is too small. We may need to increase that to uint16_t. The reason I
> > > think so is that currently, in ThunderX HW, we have 128 VFs per socket for
> > > the built-in NIC, so a two-node configuration plus one external PCIe NW
> > > card can easily go beyond 256 ports.
> 
> I wonder, does anyone really use the mbuf port field?
> My thought was: could we drop it completely?
> Actually, after discussing it with Bruce offline, an interesting idea came
> out: if we drop port and make mbuf_prefree() reset nb_segs=1, then
> we can reduce RX rearm_data to 4B. So with that layout:
> 
> struct rte_mbuf {
> 
>     MARKER cacheline0;
> 
>     void *buf_addr;
>     phys_addr_t buf_physaddr;
>     uint16_t buf_len;
>     uint8_t nb_segs;
>     uint8_t reserved_1byte;   /* former port */
> 
>     MARKER32 rearm_data;
>     uint16_t data_off;
>     uint16_t refcnt;
> 
>     uint64_t ol_flags;
>     ...
> 
> We can keep buf_len at its place and avoid 2B gap, while making rearm_data
> 4B long and 4B aligned.
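
The alignment claim for this proposed layout can be checked with a small standalone sketch. This is only a model of the proposal quoted above, not DPDK code: the ex_ names are hypothetical, the MARKER fields are omitted, and buf_addr and buf_physaddr are assumed to be 64 bits wide (as on a 64-bit target).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout from the message above; fixed-width stand-ins for pointer types. */
struct ex_mbuf_head {
	uint64_t buf_addr;        /* void * on a 64-bit target (assumption) */
	uint64_t buf_physaddr;    /* phys_addr_t assumed to be 64 bits */
	uint16_t buf_len;
	uint8_t  nb_segs;
	uint8_t  reserved_1byte;  /* former port */
	/* rearm_data would start here */
	uint16_t data_off;
	uint16_t refcnt;
	uint64_t ol_flags;
};

#define EX_REARM_OFFSET offsetof(struct ex_mbuf_head, data_off)

/* Rewrite data_off and refcnt with a single 4-byte copy. */
static inline void
ex_rearm(struct ex_mbuf_head *m, uint16_t data_off, uint16_t refcnt)
{
	uint16_t pair[2] = { data_off, refcnt };

	assert(EX_REARM_OFFSET % sizeof(uint32_t) == 0); /* 4B aligned */
	memcpy((char *)m + EX_REARM_OFFSET, pair, sizeof(pair));
}
```

Under these assumptions the rearm area starts at byte 20, which is 4-byte aligned but not 8-byte aligned, and buf_len stays in place without a 2-byte gap; ol_flags follows at byte 24.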

A couple of comments:
- IMO, it is good if nb_segs can move under rearm_data, as some
drivers (maybe not ixgbe) can also write nb_segs in one shot
in the segmented rx handler case
- I think it makes sense to keep port in mbuf so that applications
can make use of it (not sure what real application developers think of
this)
- if Writing 4B and 8B consume

[dpdk-dev] mempool: external mempool manager

2016-05-19 Thread David Hunt

Here's the latest version of the External Mempool Manager patchset.
It's re-based on top of the latest head as of 19/5/2016, including 
Olivier's 35-part patch series on mempool re-org [1]

[1] http://dpdk.org/ml/archives/dev/2016-May/039229.html

v5 changes:
 * rebasing, as it is dependent on another patch series [1]

v4 changes (Olivier Matz):
 * remove the rte_mempool_create_ext() function. To change the handler, the
   user has to do the following:
   - mp = rte_mempool_create_empty()
   - rte_mempool_set_handler(mp, "my_handler")
   - rte_mempool_populate_default(mp)
   This avoids adding another function with more than 10 arguments and
   duplicating the doxygen comments
 * change the api of rte_mempool_alloc_t: only the mempool pointer is required
   as all information is available in it
 * change the api of rte_mempool_free_t: remove return value
 * move inline wrapper functions from the .c to the .h (else they won't be
   inlined). This implies having a single header file (rte_mempool.h);
   otherwise it would have generated cross-dependency issues.
 * remove now unused MEMPOOL_F_INT_HANDLER (note: it was misused anyway due
   to the use of && instead of &)
 * fix build in debug mode (__MEMPOOL_STAT_ADD(mp, put_pool, n) remaining)
 * fix build with shared libraries (global handler has to be declared in
   the .map file)
 * rationalize #include order
 * remove unused function rte_mempool_get_handler_name()
 * rename some structures, fields, functions
 * remove the static in front of rte_tailq_elem rte_mempool_tailq (comment
   from Yuanhan)
 * test the ext mempool handler in the same file as the standard mempool tests,
   to avoid duplicating the code
 * rework the custom handler in mempool_test
 * rework a bit the patch selecting default mbuf pool handler
 * fix some doxygen comments

v3 changes:
 * simplified the file layout, renamed to rte_mempool_handler.[hc]
 * moved the default handlers into rte_mempool_default.c
 * moved the example handler out into app/test/test_ext_mempool.c
 * removed is_mc/is_mp change, slight perf degradation on sp cached operation
 * removed stack handler, may re-introduce at a later date
 * Changes out of code reviews

v2 changes:
 * There was a lot of duplicate code between rte_mempool_xmem_create and
   rte_mempool_create_ext. This has now been refactored and is now
   hopefully cleaner.
 * The RTE_NEXT_ABI define is now used to allow building of the library
   in a format that is compatible with binaries built against previous
   versions of DPDK.
 * Changes out of code reviews. Hopefully I've got most of them included.

The External Mempool Manager is an extension to the mempool API that allows
users to add and use an external mempool manager, which allows external memory
subsystems such as external hardware memory management systems and software
based memory allocators to be used with DPDK.

The existing API to the internal DPDK mempool manager will remain unchanged
and will be backward compatible. However, there will be an ABI breakage, as
the mempool struct is changing. These changes are all contained within
RTE_NEXT_ABI defs, and the current or next code can be selected with
the CONFIG_RTE_NEXT_ABI config setting.

There are two aspects to the external mempool manager.
  1. Adding the code for your new mempool handler. This is achieved by adding a
     new mempool handler source file into the librte_mempool library, and
     using the MEMPOOL_REGISTER_HANDLER macro.
  2. Using the new API to call rte_mempool_create_empty and
     rte_mempool_set_handler to create a new mempool,
     using the name parameter to identify which handler to use.

New API calls added
 1. A new rte_mempool_create_empty() function
 2. rte_mempool_set_handler(), which sets the mempool's handler
 3. The rte_mempool_populate_default() and rte_mempool_populate_anon() functions,
    which populate the mempool using the relevant handler

Several external mempool managers may be used in the same application. A new
mempool can then be created by using the new 'create' function, providing the
mempool handler name to point the mempool to the relevant mempool manager
callback structure.

The old 'create' function can still be called by legacy programs, and will
internally work out the mempool handler based on the flags provided (single
producer, single consumer, etc). By default, handlers are created internally to
implement the built-in DPDK mempool manager and mempool types.

The external mempool manager needs to provide the following functions.
 1. alloc - allocates the mempool memory, and adds each object onto a ring
 2. put   - puts an object back into the mempool once an application has
finished with it
 3. get   - gets an object from the mempool for use by the application
 4. get_count - gets the number of available objects in the mempool
 5. free  - frees the mempool memory

Every time a get/put/get_count is called from the application/PMD, the
callback for that mempool is called. 
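
The name-based selection described above can be sketched in a few lines of standalone C. This is only a model of the idea, not the code in this patchset: the ex_ types and functions are hypothetical, the callback signatures are simplified, and the fixed-size table is an assumption made for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EX_MAX_HANDLERS 8

/* Callback table mirroring the five operations listed above. */
struct ex_mempool_handler {
	const char *name;
	void *(*alloc)(unsigned size);
	int (*put)(void *pool, void * const *objs, unsigned n);
	int (*get)(void *pool, void **objs, unsigned n);
	unsigned (*get_count)(void *pool);
	void (*free)(void *pool);
};

static const struct ex_mempool_handler *ex_handlers[EX_MAX_HANDLERS];
static unsigned ex_nb_handlers;

/* Register a handler; returns its index, or -1 if full or name is taken. */
static int
ex_register_handler(const struct ex_mempool_handler *h)
{
	unsigned i;

	assert(h != NULL && h->name != NULL);
	if (ex_nb_handlers == EX_MAX_HANDLERS)
		return -1;
	for (i = 0; i < ex_nb_handlers; i++)
		if (strcmp(ex_handlers[i]->name, h->name) == 0)
			return -1;
	ex_handlers[ex_nb_handlers] = h;
	return (int)ex_nb_handlers++;
}

/* Look a handler up by name, as a set_handler-style call would. */
static const struct ex_mempool_handler *
ex_find_handler(const char *name)
{
	unsigned i;

	for (i = 0; i < ex_nb_handlers; i++)
		if (strcmp(ex_handlers[i]->name, name) == 0)
			return ex_handlers[i];
	return NULL;
}
```

In this model a pool would store the pointer returned by ex_find_handler() and route every get/put/get_count through it, which is how several managers can coexist in one application.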

[dpdk-dev] [PATCH v5 1/3] mempool: support external handler

2016-05-19 Thread David Hunt
Until now, the objects stored in a mempool were internally stored in a
ring. This patch introduces the possibility to register external handlers
replacing the ring.

The default behavior remains unchanged, but calling the new function
rte_mempool_set_handler() right after rte_mempool_create_empty() allows
changing the handler that will be used when populating the mempool.

v5 changes: rebasing on top of 35 patch set mempool work.

Signed-off-by: David Hunt 
Signed-off-by: Olivier Matz 
---
 app/test/test_mempool_perf.c   |   1 -
 lib/librte_mempool/Makefile|   2 +
 lib/librte_mempool/rte_mempool.c   |  73 --
 lib/librte_mempool/rte_mempool.h   | 212 +
 lib/librte_mempool/rte_mempool_default.c   | 147 
 lib/librte_mempool/rte_mempool_handler.c   | 139 +++
 lib/librte_mempool/rte_mempool_version.map |   4 +
 7 files changed, 506 insertions(+), 72 deletions(-)
 create mode 100644 lib/librte_mempool/rte_mempool_default.c
 create mode 100644 lib/librte_mempool/rte_mempool_handler.c

diff --git a/app/test/test_mempool_perf.c b/app/test/test_mempool_perf.c
index cdc02a0..091c1df 100644
--- a/app/test/test_mempool_perf.c
+++ b/app/test/test_mempool_perf.c
@@ -161,7 +161,6 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg)
   n_get_bulk);
if (unlikely(ret < 0)) {
rte_mempool_dump(stdout, mp);
-   rte_ring_dump(stdout, mp->ring);
/* in this case, objects are lost... */
return -1;
}
diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 43423e0..f19366e 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 2

 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
+SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_handler.c
+SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_default.c
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_MEMPOOL)-include := rte_mempool.h

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 1ab6701..6ec2b3f 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -148,7 +148,7 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, 
phys_addr_t physaddr)
 #endif

/* enqueue in ring */
-   rte_ring_sp_enqueue(mp->ring, obj);
+   rte_mempool_ext_put_bulk(mp, &obj, 1);
 }

 /* call obj_cb() for each mempool element */
@@ -300,40 +300,6 @@ rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t 
elt_num,
return (size_t)paddr_idx << pg_shift;
 }

-/* create the internal ring */
-static int
-rte_mempool_ring_create(struct rte_mempool *mp)
-{
-   int rg_flags = 0, ret;
-   char rg_name[RTE_RING_NAMESIZE];
-   struct rte_ring *r;
-
-   ret = snprintf(rg_name, sizeof(rg_name),
-   RTE_MEMPOOL_MZ_FORMAT, mp->name);
-   if (ret < 0 || ret >= (int)sizeof(rg_name))
-   return -ENAMETOOLONG;
-
-   /* ring flags */
-   if (mp->flags & MEMPOOL_F_SP_PUT)
-   rg_flags |= RING_F_SP_ENQ;
-   if (mp->flags & MEMPOOL_F_SC_GET)
-   rg_flags |= RING_F_SC_DEQ;
-
-   /* Allocate the ring that will be used to store objects.
-* Ring functions will return appropriate errors if we are
-* running as a secondary process etc., so no checks made
-* in this function for that condition.
-*/
-   r = rte_ring_create(rg_name, rte_align32pow2(mp->size + 1),
-   mp->socket_id, rg_flags);
-   if (r == NULL)
-   return -rte_errno;
-
-   mp->ring = r;
-   mp->flags |= MEMPOOL_F_RING_CREATED;
-   return 0;
-}
-
 /* free a memchunk allocated with rte_memzone_reserve() */
 static void
 rte_mempool_memchunk_mz_free(__rte_unused struct rte_mempool_memhdr *memhdr,
@@ -351,7 +317,7 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
void *elt;

while (!STAILQ_EMPTY(&mp->elt_list)) {
-   rte_ring_sc_dequeue(mp->ring, &elt);
+   rte_mempool_ext_get_bulk(mp, &elt, 1);
(void)elt;
STAILQ_REMOVE_HEAD(&mp->elt_list, next);
mp->populated_size--;
@@ -380,15 +346,18 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char 
*vaddr,
unsigned i = 0;
size_t off;
struct rte_mempool_memhdr *memhdr;
-   int ret;

/* create the internal ring if not already done */
if ((mp->flags & MEMPOOL_F_RING_CREATED) == 0) {
-   ret = rte_mempool_ring_create(mp);
-   if (ret < 0)
-   return ret;
+   rte_errno = 0;
+   mp->po

[dpdk-dev] [PATCH v5 2/3] app/test: test external mempool handler

2016-05-19 Thread David Hunt
Use a minimal custom mempool external handler and check that it also
passes basic mempool autotests.

Signed-off-by: Olivier Matz 
Signed-off-by: David Hunt 
---
 app/test/test_mempool.c | 113 
 1 file changed, 113 insertions(+)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 9f02758..f55d126 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -85,6 +85,96 @@
 static rte_atomic32_t synchro;

 /*
+ * Simple example of custom mempool structure. Holds pointers to all the
+ * elements which are simply malloc'd in this example.
+ */
+struct custom_mempool {
+   rte_spinlock_t lock;
+   unsigned count;
+   unsigned size;
+   void *elts[];
+};
+
+/*
+ * Loop through all the element pointers and allocate a chunk of memory, then
+ * insert that memory into the ring.
+ */
+static void *
+custom_mempool_alloc(struct rte_mempool *mp)
+{
+   struct custom_mempool *cm;
+
+   cm = rte_zmalloc("custom_mempool",
+   sizeof(struct custom_mempool) + mp->size * sizeof(void *), 0);
+   if (cm == NULL)
+   return NULL;
+
+   rte_spinlock_init(&cm->lock);
+   cm->count = 0;
+   cm->size = mp->size;
+   return cm;
+}
+
+static void
+custom_mempool_free(void *p)
+{
+   rte_free(p);
+}
+
+static int
+custom_mempool_put(void *p, void * const *obj_table, unsigned n)
+{
+   struct custom_mempool *cm = (struct custom_mempool *)p;
+   int ret = 0;
+
+   rte_spinlock_lock(&cm->lock);
+   if (cm->count + n > cm->size) {
+   ret = -ENOBUFS;
+   } else {
+   memcpy(&cm->elts[cm->count], obj_table, sizeof(void *) * n);
+   cm->count += n;
+   }
+   rte_spinlock_unlock(&cm->lock);
+   return ret;
+}
+
+
+static int
+custom_mempool_get(void *p, void **obj_table, unsigned n)
+{
+   struct custom_mempool *cm = (struct custom_mempool *)p;
+   int ret = 0;
+
+   rte_spinlock_lock(&cm->lock);
+   if (n > cm->count) {
+   ret = -ENOENT;
+   } else {
+   cm->count -= n;
+   memcpy(obj_table, &cm->elts[cm->count], sizeof(void *) * n);
+   }
+   rte_spinlock_unlock(&cm->lock);
+   return ret;
+}
+
+static unsigned
+custom_mempool_get_count(void *p)
+{
+   struct custom_mempool *cm = (struct custom_mempool *)p;
+   return cm->count;
+}
+
+static struct rte_mempool_handler mempool_handler_custom = {
+   .name = "custom_handler",
+   .alloc = custom_mempool_alloc,
+   .free = custom_mempool_free,
+   .put = custom_mempool_put,
+   .get = custom_mempool_get,
+   .get_count = custom_mempool_get_count,
+};
+
+MEMPOOL_REGISTER_HANDLER(mempool_handler_custom);
+
+/*
  * save the object number in the first 4 bytes of object data. All
  * other bytes are set to 0.
  */
@@ -479,6 +569,7 @@ test_mempool(void)
 {
struct rte_mempool *mp_cache = NULL;
struct rte_mempool *mp_nocache = NULL;
+   struct rte_mempool *mp_ext = NULL;

rte_atomic32_init(&synchro);

@@ -507,6 +598,27 @@ test_mempool(void)
goto err;
}

+   /* create a mempool with an external handler */
+   mp_ext = rte_mempool_create_empty("test_ext",
+   MEMPOOL_SIZE,
+   MEMPOOL_ELT_SIZE,
+   RTE_MEMPOOL_CACHE_MAX_SIZE, 0,
+   SOCKET_ID_ANY, 0);
+
+   if (mp_ext == NULL) {
+   printf("cannot allocate mp_ext mempool\n");
+   goto err;
+   }
+   if (rte_mempool_set_handler(mp_ext, "custom_handler") < 0) {
+   printf("cannot set custom handler\n");
+   goto err;
+   }
+   if (rte_mempool_populate_default(mp_ext) < 0) {
+   printf("cannot populate mp_ext mempool\n");
+   goto err;
+   }
+   rte_mempool_obj_iter(mp_ext, my_obj_init, NULL);
+
/* retrieve the mempool from its name */
if (rte_mempool_lookup("test_nocache") != mp_nocache) {
printf("Cannot lookup mempool from its name\n");
@@ -547,6 +659,7 @@ test_mempool(void)
 err:
rte_mempool_free(mp_nocache);
rte_mempool_free(mp_cache);
+   rte_mempool_free(mp_ext);
return -1;
 }

-- 
2.5.5



[dpdk-dev] [PATCH v5 3/3] mbuf: get default mempool handler from configuration

2016-05-19 Thread David Hunt
By default, the mempool handler used for mbuf allocations is a multi
producer and multi consumer ring. We could imagine a target (maybe some
network processors?) that provides a hardware-assisted pool
mechanism. In this case, the default configuration for this architecture
would contain a different value for RTE_MBUF_DEFAULT_MEMPOOL_HANDLER.

Signed-off-by: David Hunt 
Signed-off-by: Olivier Matz 
---
 config/common_base |  1 +
 lib/librte_mbuf/rte_mbuf.c | 21 +
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/config/common_base b/config/common_base
index 3535c6e..5cf5e52 100644
--- a/config/common_base
+++ b/config/common_base
@@ -394,6 +394,7 @@ CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=n
 #
 CONFIG_RTE_LIBRTE_MBUF=y
 CONFIG_RTE_LIBRTE_MBUF_DEBUG=n
+CONFIG_RTE_MBUF_DEFAULT_MEMPOOL_HANDLER="ring_mp_mc"
 CONFIG_RTE_MBUF_REFCNT_ATOMIC=y
 CONFIG_RTE_PKTMBUF_HEADROOM=128

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index eec1456..5dcdc05 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -153,6 +153,7 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
unsigned cache_size, uint16_t priv_size, uint16_t data_room_size,
int socket_id)
 {
+   struct rte_mempool *mp;
struct rte_pktmbuf_pool_private mbp_priv;
unsigned elt_size;

@@ -167,10 +168,22 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
mbp_priv.mbuf_data_room_size = data_room_size;
mbp_priv.mbuf_priv_size = priv_size;

-   return rte_mempool_create(name, n, elt_size,
-   cache_size, sizeof(struct rte_pktmbuf_pool_private),
-   rte_pktmbuf_pool_init, &mbp_priv, rte_pktmbuf_init, NULL,
-   socket_id, 0);
+   mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
+   if (mp == NULL)
+   return NULL;
+
+   rte_mempool_set_handler(mp, RTE_MBUF_DEFAULT_MEMPOOL_HANDLER);
+   rte_pktmbuf_pool_init(mp, &mbp_priv);
+
+   if (rte_mempool_populate_default(mp) < 0) {
+   rte_mempool_free(mp);
+   return NULL;
+   }
+
+   rte_mempool_obj_iter(mp, rte_pktmbuf_init, NULL);
+
+   return mp;
 }

 /* do some sanity checks on a mbuf: panic if it fails */
-- 
2.5.5



[dpdk-dev] [PATCH] arm64: change rte_memcpy to inline function

2016-05-19 Thread Jianbo Liu
On 13 May 2016 at 23:49, Thomas Monjalon  wrote:
> 2016-05-10 14:01, Jianbo Liu:
>> Other APP may call rte_memcpy by function pointer,
>> so change it to an inline function.
>
> Any example in mind?
>
It's for ODP-DPDK.
>> --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
>> +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
>> -#define rte_memcpy(d, s, n)  memcpy((d), (s), (n))
>> +static inline void *
>> +rte_memcpy(void *dst, const void *src, size_t n)
>> +{
>> + return memcpy(dst, src, n);
>> +}
>
> It makes no sense if other archs (arm32, ppc64, tile) are not updated.
>
But it is also an inline function on x86.
Sorry for my late reply...


[dpdk-dev] [PATCH v2] doc: announce ABI change of struct rte_port_source_params and rte_port_sink_params

2016-05-19 Thread Fan Zhang
ABI changes are planned for rte_port_source_params and
rte_port_sink_params, and will take effect from release 16.11. This patch
announces those ABI changes in detail.

Signed-off-by: Fan Zhang 
Acked-by: Cristian Dumitrescu 
---
 doc/guides/rel_notes/deprecation.rst | 8 
 1 file changed, 8 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index fffe9c7..4f3fefe 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -74,3 +74,11 @@ Deprecation Notices
   a handle, like the way kernel exposes an fd to user for locating a
   specific file, and to keep all major structures internally, so that
   we are likely to be free from ABI violations in future.
+
+* ABI will change for rte_port_source_params struct. The member file_name
+  data type will be changed from char * to const char *. This change targets
+  release 16.11
+
+* ABI will change for rte_port_sink_params struct. The member file_name
+  data type will be changed from char * to const char *. This change targets
+  release 16.11
-- 
2.5.5



[dpdk-dev] [PATCH] librte_table: remove unnecessary printf messages from acl build

2016-05-19 Thread Jasvinder Singh
Remove the rte_acl_dump() call from rte_table_acl_build(), as it prints a
number of unnecessary messages via printf.

Signed-off-by: Jasvinder Singh 
Acked-by: Cristian Dumitrescu 
---
 lib/librte_table/rte_table_acl.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_table/rte_table_acl.c b/lib/librte_table/rte_table_acl.c
index c1eb848..8f1f8ce 100644
--- a/lib/librte_table/rte_table_acl.c
+++ b/lib/librte_table/rte_table_acl.c
@@ -236,8 +236,6 @@ rte_table_acl_build(struct rte_table_acl *acl, struct 
rte_acl_ctx **acl_ctx)
return -1;
}

-   rte_acl_dump(ctx);
-
*acl_ctx = ctx;
return 0;
 }
-- 
2.5.5



[dpdk-dev] [PATCH] doc: fix typo in freebsd doc

2016-05-19 Thread Pablo de Lara
Signed-off-by: Pablo de Lara 
---
 doc/guides/freebsd_gsg/build_dpdk.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/freebsd_gsg/build_dpdk.rst 
b/doc/guides/freebsd_gsg/build_dpdk.rst
index ceacf7f..edf3725 100644
--- a/doc/guides/freebsd_gsg/build_dpdk.rst
+++ b/doc/guides/freebsd_gsg/build_dpdk.rst
@@ -263,7 +263,7 @@ To avoid this error, reduce the number of buffers or the 
buffer size.
 Loading the DPDK nic_uio Module
 ---

-After loading the contigmem module, the ``nic_uio must`` also be loaded into 
the
+After loading the contigmem module, the ``nic_uio`` must also be loaded into 
the
 running kernel prior to running any DPDK application.  This module must
 be loaded using the kldload command as shown below (assuming that the current
 directory is the DPDK target directory).
-- 
2.5.5



[dpdk-dev] [PATCH] doc: fix typo in freebsd doc

2016-05-19 Thread Bruce Richardson
On Thu, May 19, 2016 at 03:32:22PM +0100, Pablo de Lara wrote:
> Signed-off-by: Pablo de Lara 
> ---
>  doc/guides/freebsd_gsg/build_dpdk.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/doc/guides/freebsd_gsg/build_dpdk.rst 
> b/doc/guides/freebsd_gsg/build_dpdk.rst
> index ceacf7f..edf3725 100644
> --- a/doc/guides/freebsd_gsg/build_dpdk.rst
> +++ b/doc/guides/freebsd_gsg/build_dpdk.rst
> @@ -263,7 +263,7 @@ To avoid this error, reduce the number of buffers or the 
> buffer size.
>  Loading the DPDK nic_uio Module
>  ---
>  
> -After loading the contigmem module, the ``nic_uio must`` also be loaded into 
> the
> +After loading the contigmem module, the ``nic_uio`` must also be loaded into 
> the

Actually, that sentence doesn't read right. It should probably be either 
"the ``nic_uio`` module" or just "``nic_uio``". "The ``nic_uio``" doesn't sound
correct.

/Bruce



[dpdk-dev] [PATCH] doc: fix typo in freebsd doc

2016-05-19 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Thursday, May 19, 2016 3:34 PM
> To: De Lara Guarch, Pablo
> Cc: dev at dpdk.org; Mcnamara, John
> Subject: Re: [dpdk-dev] [PATCH] doc: fix typo in freebsd doc
> 
> On Thu, May 19, 2016 at 03:32:22PM +0100, Pablo de Lara wrote:
> > Signed-off-by: Pablo de Lara 
> > ---
> >  doc/guides/freebsd_gsg/build_dpdk.rst | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/doc/guides/freebsd_gsg/build_dpdk.rst
> b/doc/guides/freebsd_gsg/build_dpdk.rst
> > index ceacf7f..edf3725 100644
> > --- a/doc/guides/freebsd_gsg/build_dpdk.rst
> > +++ b/doc/guides/freebsd_gsg/build_dpdk.rst
> > @@ -263,7 +263,7 @@ To avoid this error, reduce the number of buffers
> or the buffer size.
> >  Loading the DPDK nic_uio Module
> >  ---
> >
> > -After loading the contigmem module, the ``nic_uio must`` also be loaded
> into the
> > +After loading the contigmem module, the ``nic_uio`` must also be loaded
> into the
> 
> Actually, that sentence doesn't read right. It should probably be either
> "the ``nic_uio`` module" or just "``nic_uio``". "The ``nic_uio``" doesn't 
> sound
> correct.

True! Will send a v2.
> 
> /Bruce



[dpdk-dev] [PATCH v2] doc: fix typo in freebsd doc

2016-05-19 Thread Pablo de Lara
Signed-off-by: Pablo de Lara 
---

Changes in v2:
- Added missing word

 doc/guides/freebsd_gsg/build_dpdk.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/freebsd_gsg/build_dpdk.rst 
b/doc/guides/freebsd_gsg/build_dpdk.rst
index ceacf7f..1d92c08 100644
--- a/doc/guides/freebsd_gsg/build_dpdk.rst
+++ b/doc/guides/freebsd_gsg/build_dpdk.rst
@@ -263,7 +263,7 @@ To avoid this error, reduce the number of buffers or the 
buffer size.
 Loading the DPDK nic_uio Module
 ---

-After loading the contigmem module, the ``nic_uio must`` also be loaded into 
the
+After loading the contigmem module, the ``nic_uio`` module must also be loaded 
into the
 running kernel prior to running any DPDK application.  This module must
 be loaded using the kldload command as shown below (assuming that the current
 directory is the DPDK target directory).
-- 
2.5.5



[dpdk-dev] [PATCH v2] doc: fix typo in freebsd doc

2016-05-19 Thread Richardson, Bruce


> -Original Message-
> From: De Lara Guarch, Pablo
> Sent: Thursday, May 19, 2016 3:39 PM
> To: dev at dpdk.org; Richardson, Bruce 
> Cc: Mcnamara, John ; De Lara Guarch, Pablo
> 
> Subject: [PATCH v2] doc: fix typo in freebsd doc
> 
> Signed-off-by: Pablo de Lara 
Acked-by: Bruce Richardson 



[dpdk-dev] v2 mempool: add stack (lifo) mempool handler

2016-05-19 Thread David Hunt
This patch set adds a fifo stack handler to the external mempool
manager.

This patch set depends on the 3 part external mempool handler
patch set (v5 of the series):
http://dpdk.org/ml/archives/dev/2016-May/039364.html

v2 changes:
  * updated based on mailing list feedback (Thanks Stephen)
  * checkpatch fixes.
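
For readers who want the idea in isolation, here is a minimal sketch of a
LIFO (stack) object pool of the kind this series describes. It is not the
code from the patch: the names (demo_stack, demo_stack_put, demo_stack_get,
DEMO_STACK_CAPACITY) are hypothetical, the capacity is fixed, and there is
no locking, whereas the real handler guards the stack with a spinlock.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical fixed-capacity LIFO pool; names are illustrative, not DPDK API. */
#define DEMO_STACK_CAPACITY 8

struct demo_stack {
	size_t len;                       /* number of stored pointers */
	void *objs[DEMO_STACK_CAPACITY];  /* objs[len - 1] is the top */
};

/* Push n pointers; fails without modifying the stack if they do not fit. */
static int
demo_stack_put(struct demo_stack *s, void * const *obj_table, size_t n)
{
	size_t i;

	if (n > DEMO_STACK_CAPACITY - s->len)
		return -ENOBUFS;
	for (i = 0; i < n; i++)
		s->objs[s->len + i] = obj_table[i];
	s->len += n;
	return 0;
}

/* Pop n pointers, most recently pushed first. */
static int
demo_stack_get(struct demo_stack *s, void **obj_table, size_t n)
{
	size_t i;

	if (n > s->len)
		return -ENOENT;
	for (i = 0; i < n; i++)
		obj_table[i] = s->objs[--s->len];
	return 0;
}
```

Because the most recently returned object is handed out first, a buffer
that was just freed is likely to be the next one allocated, which is the
recycling behaviour a stack is meant to provide.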




[dpdk-dev] [PATCH v2 1/3] mempool: add stack (lifo) mempool handler

2016-05-19 Thread David Hunt
This is a mempool handler that is useful for pipelining apps, where
the mempool cache doesn't really work - for example, where we have one
core doing Rx (and alloc), and another core doing Tx (and return). In
such a case, the mempool ring simply cycles through all the mbufs,
resulting in an LLC miss on every mbuf allocated when the number of
mbufs is large. A stack recycles buffers more effectively in this
case.

v2: cleanup based on mailing list comments. Mainly removal of
unnecessary casts and comments.

Signed-off-by: David Hunt 
---
 lib/librte_mempool/Makefile|   1 +
 lib/librte_mempool/rte_mempool_stack.c | 145 +
 2 files changed, 146 insertions(+)
 create mode 100644 lib/librte_mempool/rte_mempool_stack.c

diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index f19366e..5aa9ef8 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -44,6 +44,7 @@ LIBABIVER := 2
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_handler.c
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_default.c
+SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_stack.c
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_MEMPOOL)-include := rte_mempool.h

diff --git a/lib/librte_mempool/rte_mempool_stack.c 
b/lib/librte_mempool/rte_mempool_stack.c
new file mode 100644
index 000..6e25028
--- /dev/null
+++ b/lib/librte_mempool/rte_mempool_stack.c
@@ -0,0 +1,145 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+
+struct rte_mempool_common_stack {
+   rte_spinlock_t sl;
+
+   uint32_t size;
+   uint32_t len;
+   void *objs[];
+};
+
+static void *
+common_stack_alloc(struct rte_mempool *mp)
+{
+   struct rte_mempool_common_stack *s;
+   unsigned n = mp->size;
+   int size = sizeof(*s) + (n+16)*sizeof(void *);
+
+   /* Allocate our local memory structure */
+   s = rte_zmalloc_socket("common-stack",
+   size,
+   RTE_CACHE_LINE_SIZE,
+   mp->socket_id);
+   if (s == NULL) {
+   RTE_LOG(ERR, MEMPOOL, "Cannot allocate stack!\n");
+   return NULL;
+   }
+
+   rte_spinlock_init(&s->sl);
+
+   s->size = n;
+   mp->pool = s;
+   rte_mempool_set_handler(mp, "stack");
+
+   return s;
+}
+
+static int common_stack_put(void *p, void * const *obj_table,
+   unsigned n)
+{
+   struct rte_mempool_common_stack *s = p;
+   void **cache_objs;
+   unsigned index;
+
+   rte_spinlock_lock(&s->sl);
+   cache_objs = &s->objs[s->len];
+
+   /* Is there sufficient space in the stack ? */
+   if ((s->len + n) > s->size) {
+   rte_spinlock_unlock(&s->sl);
+   return -ENOENT;
+   }
+
+   /* Add elements back into the cache */
+   for (index = 0; index < n; ++index, obj_table++)
+   cache_objs[index] = *obj_table;
+
+   s->len += n;
+
+   rte_spinlock_unlock(&s->sl);
+   return 0;
+}
+
+static int common_stack_get(void *p, void **obj_table,
+   unsigned n)
+{
+   struct rte_mempool_common_stack *s = p;
+   void **cache_objs;
+   unsigned index, len;
+
+   rte_spinlock_lock(&s->

[dpdk-dev] [PATCH v2 2/3] mempool: make declaration of handler structs const

2016-05-19 Thread David Hunt
For security, any data structure with function pointers should be const

Signed-off-by: David Hunt 
---
 lib/librte_mempool/rte_mempool.h | 2 +-
 lib/librte_mempool/rte_mempool_default.c | 8 
 lib/librte_mempool/rte_mempool_handler.c | 2 +-
 lib/librte_mempool/rte_mempool_stack.c   | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index ed2c110..baa5f48 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -491,7 +491,7 @@ rte_mempool_set_handler(struct rte_mempool *mp, const char 
*name);
  *   - >=0: Sucess; return the index of the handler in the table.
  *   - <0: Error (errno)
  */
-int rte_mempool_handler_register(struct rte_mempool_handler *h);
+int rte_mempool_handler_register(const struct rte_mempool_handler *h);

 /**
  * Macro to statically register an external pool handler.
diff --git a/lib/librte_mempool/rte_mempool_default.c 
b/lib/librte_mempool/rte_mempool_default.c
index a6ac65a..764f772 100644
--- a/lib/librte_mempool/rte_mempool_default.c
+++ b/lib/librte_mempool/rte_mempool_default.c
@@ -105,7 +105,7 @@ common_ring_free(void *p)
rte_ring_free((struct rte_ring *)p);
 }

-static struct rte_mempool_handler handler_mp_mc = {
+static const struct rte_mempool_handler handler_mp_mc = {
.name = "ring_mp_mc",
.alloc = common_ring_alloc,
.free = common_ring_free,
@@ -114,7 +114,7 @@ static struct rte_mempool_handler handler_mp_mc = {
.get_count = common_ring_get_count,
 };

-static struct rte_mempool_handler handler_sp_sc = {
+static const struct rte_mempool_handler handler_sp_sc = {
.name = "ring_sp_sc",
.alloc = common_ring_alloc,
.free = common_ring_free,
@@ -123,7 +123,7 @@ static struct rte_mempool_handler handler_sp_sc = {
.get_count = common_ring_get_count,
 };

-static struct rte_mempool_handler handler_mp_sc = {
+static const struct rte_mempool_handler handler_mp_sc = {
.name = "ring_mp_sc",
.alloc = common_ring_alloc,
.free = common_ring_free,
@@ -132,7 +132,7 @@ static struct rte_mempool_handler handler_mp_sc = {
.get_count = common_ring_get_count,
 };

-static struct rte_mempool_handler handler_sp_mc = {
+static const struct rte_mempool_handler handler_sp_mc = {
.name = "ring_sp_mc",
.alloc = common_ring_alloc,
.free = common_ring_free,
diff --git a/lib/librte_mempool/rte_mempool_handler.c 
b/lib/librte_mempool/rte_mempool_handler.c
index 78611f8..eab1e69 100644
--- a/lib/librte_mempool/rte_mempool_handler.c
+++ b/lib/librte_mempool/rte_mempool_handler.c
@@ -45,7 +45,7 @@ struct rte_mempool_handler_table rte_mempool_handler_table = {

 /* add a new handler in rte_mempool_handler_table, return its index */
 int
-rte_mempool_handler_register(struct rte_mempool_handler *h)
+rte_mempool_handler_register(const struct rte_mempool_handler *h)
 {
struct rte_mempool_handler *handler;
int16_t handler_idx;
diff --git a/lib/librte_mempool/rte_mempool_stack.c 
b/lib/librte_mempool/rte_mempool_stack.c
index 6e25028..ad49436 100644
--- a/lib/librte_mempool/rte_mempool_stack.c
+++ b/lib/librte_mempool/rte_mempool_stack.c
@@ -133,7 +133,7 @@ common_stack_free(void *p)
rte_free(p);
 }

-static struct rte_mempool_handler handler_stack = {
+static const struct rte_mempool_handler handler_stack = {
.name = "stack",
.alloc = common_stack_alloc,
.free = common_stack_free,
-- 
2.5.5



[dpdk-dev] [PATCH v2 3/3] test: add autotest for external mempool stack handler

2016-05-19 Thread David Hunt
Signed-off-by: David Hunt 
---
 app/test/test_mempool.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index f55d126..b98804a 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -570,6 +570,7 @@ test_mempool(void)
struct rte_mempool *mp_cache = NULL;
struct rte_mempool *mp_nocache = NULL;
struct rte_mempool *mp_ext = NULL;
+   struct rte_mempool *mp_stack = NULL;

rte_atomic32_init(&synchro);

@@ -619,6 +620,27 @@ test_mempool(void)
}
rte_mempool_obj_iter(mp_ext, my_obj_init, NULL);

+   /* create a mempool with an external handler */
+   mp_stack = rte_mempool_create_empty("test_stack",
+   MEMPOOL_SIZE,
+   MEMPOOL_ELT_SIZE,
+   RTE_MEMPOOL_CACHE_MAX_SIZE, 0,
+   SOCKET_ID_ANY, 0);
+
+   if (mp_stack == NULL) {
+   printf("cannot allocate mp_stack mempool\n");
+   goto err;
+   }
+   if (rte_mempool_set_handler(mp_stack, "stack") < 0) {
+   printf("cannot set stack handler\n");
+   goto err;
+   }
+   if (rte_mempool_populate_default(mp_stack) < 0) {
+   printf("cannot populate mp_stack mempool\n");
+   goto err;
+   }
+   rte_mempool_obj_iter(mp_stack, my_obj_init, NULL);
+
/* retrieve the mempool from its name */
if (rte_mempool_lookup("test_nocache") != mp_nocache) {
printf("Cannot lookup mempool from its name\n");
@@ -652,6 +674,10 @@ test_mempool(void)
if (test_mempool_xmem_misc() < 0)
goto err;

+   /* test the stack handler */
+   if (test_mempool_basic(mp_stack) < 0)
+   goto err;
+
rte_mempool_list_dump(stdout);

return 0;
-- 
2.5.5



[dpdk-dev] v2 mempool: add stack (lifo) mempool handler

2016-05-19 Thread Hunt, David

On 5/19/2016 3:48 PM, David Hunt wrote:
> This patch set adds a fifo stack handler to the external mempool
> manager.

Apologies, cut and paste error, should be lifo stack handler. I thought 
I'd caught all of these...


> This patch set depends on the 3 part external mempool handler
> patch set (v5 of the series):
> http://dpdk.org/ml/archives/dev/2016-May/039364.html
>
> v2 changes:
>* updated based on mailing list feedback (Thanks Stephen)
>* checkpatch fixes.
>
>



[dpdk-dev] [PATCH 1/2] mempool: add stack (fifo) mempool handler

2016-05-19 Thread Hunt, David


On 5/5/2016 10:28 PM, Stephen Hemminger wrote:
> Overall, this is ok, but why can't it be the default?

Backward compatibility would probably be the main reason, to have the 
least impact when recompiling.

> Lots of little things should be cleaned up

I've submitted a v2, and addressed all your issues. Thanks for the review.

Regards,
Dave.


--snip---




[dpdk-dev] Underlinked libs and overlinked applications - an issue?

2016-05-19 Thread Christian Ehrhardt
Hi,
I was working on the new 16.04 build system to adapt deb packaging to it.
I remember that back in the DPDK 2.2 and shared+combined library days I
had some issues with over/underlinking - and it seems those still
exist or have come back.

After a build in almost default config (just disabled the kernel modules)
and set RTE_MACHINE to default I find the following.

#1
The libraries are all linked only against external things - even though
they clearly use internal structures:
ldd usr/lib/x86_64-linux-gnu/librte_lpm.so.2
   linux-vdso.so.1 =>  (0x7fff7e7a5000)
   libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7f175d4dd000)
   /lib64/ld-linux-x86-64.so.2 (0x558d3afbf000)

#2
The applications then seem to make up for that by linking against
everything that is missing.
But looking at an app alone, it seems overlinked as a result - it does not
use all of these on its own.
ldd usr/bin/cmdline_test
   linux-vdso.so.1 =>  (0x7ffeec9ea000)
   librte_distributor.so.1 => not found
   librte_reorder.so.1 => not found
[...]
   librte_jobstats.so.1 => not found
[...]
And for example none of the librte_jobstats.so.1 symbols are used
"directly" in there.


I'm still digging into the concept of using a linker script for all of
that and some of its new implications. In the end things "work",
but this linking at least feels wrong to me.


So I wanted to ask - is that intentional - or should that be fixed?
If it should be fixed are there obvious suggestions where/how?
And if it is intentional - could someone please elaborate on it a bit for
me? Thanks in advance.


Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd


[dpdk-dev] [PATCH] arm64: change rte_memcpy to inline function

2016-05-19 Thread Thomas Monjalon
2016-05-19 21:48, Jianbo Liu:
> On 13 May 2016 at 23:49, Thomas Monjalon  wrote:
> > 2016-05-10 14:01, Jianbo Liu:
> >> Other APP may call rte_memcpy by function pointer,
> >> so change it to an inline function.
> >
> > Any example in mind?
> >
> It's for ODP-DPDK.

Given that ODP is open (dataplane), you should also consider ppc64 and tile.

> >> --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> >> +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h
> >> -#define rte_memcpy(d, s, n)  memcpy((d), (s), (n))
> >> +static inline void *
> >> +rte_memcpy(void *dst, const void *src, size_t n)
> >> +{
> >> + return memcpy(dst, src, n);
> >> +}
> >
> > > It makes no sense if other archs (arm32, ppc64, tile) are not updated.
> > >
> > But it is also an inline function on x86.

In x86, it was implemented as a function because there is some code.
If you want to make sure it is always a function, even in the case
of just calling memcpy from libc, you should put a doxygen comment in
the generic part and adapt every arch.
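
The macro-versus-function point discussed in this thread can be sketched
as follows. This is an illustration only: demo_memcpy stands in for an
arch-specific rte_memcpy, and demo_copy_fn and demo_copy_with are
hypothetical names, not DPDK or ODP API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names; demo_memcpy stands in for an arch-specific rte_memcpy. */
static inline void *
demo_memcpy(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* A copy routine selected at run time through a function pointer. */
typedef void *(*demo_copy_fn)(void *dst, const void *src, size_t n);

static void
demo_copy_with(demo_copy_fn fn, void *dst, const void *src, size_t n)
{
	fn(dst, src, n);
}
```

If demo_memcpy were instead defined as a function-like macro, for example
"#define demo_memcpy(d, s, n) memcpy((d), (s), (n))", the bare name
demo_memcpy would not be expanded and would not name any object, so an
assignment such as "demo_copy_fn fn = demo_memcpy;" would fail to compile.
With the inline function, the assignment works on every architecture that
provides it.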


[dpdk-dev] [PATCH] mbuf: make rearm_data address naturally aligned

2016-05-19 Thread Thomas Monjalon
2016-05-19 19:05, Jerin Jacob:
> On Thu, May 19, 2016 at 12:18:57PM +, Ananyev, Konstantin wrote:
> > > On Thu, May 19, 2016 at 12:20:16AM +0530, Jerin Jacob wrote:
> > > > On Wed, May 18, 2016 at 05:43:00PM +0100, Bruce Richardson wrote:
> > > > > On Wed, May 18, 2016 at 07:27:43PM +0530, Jerin Jacob wrote:
> > I wonder does anyone really use mbuf port field?
> > My thought was - could we drop it completely?
> > Actually, after discussing it with Bruce offline, an interesting idea came 
> > out:
> > if we'll drop port and make mbuf_prefree() to reset nb_segs=1, then
> > we can reduce RX rearm_data to 4B. So with that layout:
> > 
> > struct rte_mbuf {
> > 
> >  MARKER cacheline0;
> > 
> > void *buf_addr;   
> > phys_addr_t buf_physaddr; 
> > uint16_t buf_len;
> > uint8_t nb_segs;
> > uint8_t reserved_1byte;   /* former port */
> > 
> > MARKER32 rearm_data;
> > uint16_t data_off;
> >uint16_t refcnt;
> >
> > uint64_t ol_flags;
> > ...
> > 
> > We can keep buf_len at its place and avoid 2B gap, while making rearm_data
> > 4B long and 4B aligned.
> 
> Couple of comments,
> - IMO, It is good if nb_segs can move under rearm_data, as some
> drivers(not in ixgbe may be) can write nb_segs in one shot also
> in segmented rx handler case
> - I think, it makes sense to keep port in mbuf so that application
> can make use of it(Not sure what real application developers think of
> this)

I agree we could try to remove the port id from mbuf.
These mbuf data are a common base to pass infos between drivers and apps.
If you need to store some data which are read and written by the app only,
you can use the private zone (see rte_pktmbuf_priv_size).

> - if Writing 4B and 8B consume same cycles(at least in arm64) then I think it
> makes sense to make it as 8B wide with maximum pre-built constants are 
> possible.
> 
> > 
> > Another similar alternative, is to make mbuf_prefree() to set refcnt=1
> > (as it update it anyway). Then we can remove refcnt from the RX rearm_data,
> > and again make rearm_data 4B long and 4B aligned:
> > 
> > struct rte_mbuf {
> > 
> >  MARKER cacheline0;
> > 
> > void *buf_addr;   
> > phys_addr_t buf_physaddr; 
> > uint16_t buf_len;
> > uint16_t refcnt;
> > 
> > MARKER32 rearm_data;
> > uint16_t data_off;
> > uint8_t nb_segs;
> > uint8_t port;
> 
> The only problem I think with this approach is that, port data type cannot be
> extended to uint16_t in future.

It is a major problem. The port id should be extended to uint16_t.
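
To make the alignment argument concrete, here is a small sketch of the
first layout proposed in the quoted text, with a 4-byte rearm region
holding data_off and refcnt. It is not struct rte_mbuf: the type and
function names (demo_mbuf, demo_rearm) are hypothetical, and fixed-width
uint64_t fields stand in for void * and phys_addr_t so that the offsets do
not depend on the target's pointer size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative layout only; modeled on the first proposal, not struct rte_mbuf. */
struct demo_mbuf {
	uint64_t buf_addr;       /* stands in for void *buf_addr */
	uint64_t buf_physaddr;   /* stands in for phys_addr_t */
	uint16_t buf_len;
	uint8_t nb_segs;
	uint8_t reserved_1byte;  /* former port */
	uint16_t data_off;       /* start of the 4-byte rearm region */
	uint16_t refcnt;
	uint64_t ol_flags;
};

/* The 4-byte region written in one store on the RX path. */
struct demo_rearm {
	uint16_t data_off;
	uint16_t refcnt;
};

#define DEMO_REARM_OFFSET offsetof(struct demo_mbuf, data_off)

/* Write data_off and refcnt together through one 4-byte copy. */
static void
demo_rearm(struct demo_mbuf *m, const struct demo_rearm *tmpl)
{
	memcpy((char *)m + DEMO_REARM_OFFSET, tmpl, sizeof(*tmpl));
}
```

In this sketch the rearm region starts at byte 20, which is a multiple of
4, so a single 4-byte store to it is naturally aligned and buf_len stays in
place without a 2-byte gap.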


[dpdk-dev] virtio: crash when using multiple processes (16.04 regression)

2016-05-19 Thread Yoni Gilad
Hi,

We have encountered a crash in virtio_xmit_pkts (specifically, in the call to 
virtqueue_notify) when running DPDK in a multi-process setup. This is a 
regression in DPDK 16.04.

The culprit seems to be the field vtpci_ops in the virtio_hw structure. This 
field is stored in shared memory, but points to a struct in the primary 
process's address space. If the same struct is loaded at a different address 
in the secondary process, it will lead to a crash or other issues when this 
field is dereferenced there. The referenced virtio_pci_ops struct contains 
function pointers, which can also be different in the secondary process.

Regards,
Yoni Gilad
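
One common way to avoid this class of problem is to keep only an index in
shared memory and resolve it against a table that each process builds for
itself. The sketch below shows that idea under stated assumptions; it is
not the virtio PMD code or a proposed fix, and every name in it (demo_ops,
demo_hw, demo_hw_ops and so on) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the virtio PMD code. */
struct demo_ops {
	int (*notify)(int queue_id);
};

static int demo_legacy_notify(int queue_id) { return queue_id + 100; }
static int demo_modern_notify(int queue_id) { return queue_id + 200; }

enum { DEMO_OPS_LEGACY = 0, DEMO_OPS_MODERN = 1, DEMO_OPS_COUNT = 2 };

/* Built separately in every process, so the addresses are always local. */
static const struct demo_ops demo_ops_table[DEMO_OPS_COUNT] = {
	[DEMO_OPS_LEGACY] = { .notify = demo_legacy_notify },
	[DEMO_OPS_MODERN] = { .notify = demo_modern_notify },
};

/* Would live in shared memory: holds an index, never a pointer. */
struct demo_hw {
	uint8_t ops_id;
};

/* Resolve the shared index to this process's own ops table entry. */
static const struct demo_ops *
demo_hw_ops(const struct demo_hw *hw)
{
	if (hw->ops_id >= DEMO_OPS_COUNT)
		return NULL;
	return &demo_ops_table[hw->ops_id];
}
```

A secondary process calling demo_hw_ops() gets function pointers that are
valid in its own address space, regardless of where the primary process
loaded the same table.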



[dpdk-dev] [PATCH v2] vhost: add support for dynamic vhost PMD creation

2016-05-19 Thread Ferruh Yigit
On 5/19/2016 9:33 AM, Thomas Monjalon wrote:
> 2016-05-18 18:10, Ferruh Yigit:
>> Add rte_eth_from_vhost() API to create vhost PMD dynamically from
>> applications.
> 
> How is it different from rte_eth_dev_attach() calling rte_eal_vdev_init()?
> 

When rte_eth_dev_attach() is used, the application also needs to call:
rte_eth_dev_configure()
rte_eth_rx_queue_setup()
rte_eth_tx_queue_setup()
rte_eth_dev_start()

rte_eth_from_vhost() does these internally, so it is easier for applications to use.


Regards,
ferruh


[dpdk-dev] virtio: crash when using multiple processes (16.04 regression)

2016-05-19 Thread Thomas Monjalon
2016-05-19 16:20, Yoni Gilad:
> We have encountered a crash in virtio_xmit_pkts (specifically, in the call to 
> virtqueue_notify) when running DPDK in a multi-process setup. This is a 
> regression in DPDK 16.04.

Thanks a lot for reporting.

Two tips to improve such bug reports:

- Send it to the maintainers of virtio (and cc this list).
You can find them in the MAINTAINERS file. I've cc'ed them.

- Try to test the release candidates early, so that a bug is fixed before
it is actually released.

> The culprit seems to be the field vtpci_ops in the virtio_hw structure. This 
> field is stored in shared memory, but points to a struct in the primary 
> process's address space. If the same struct was loaded in a different address 
> in the secondary process, it will lead to a crash or other issues when this 
> field is dereferenced there. The referenced virtio_pci_ops struct contains 
> function pointers, which can also be different in the secondary process.




[dpdk-dev] [PATCH v2] vhost: add support for dynamic vhost PMD creation

2016-05-19 Thread Thomas Monjalon
2016-05-19 17:28, Ferruh Yigit:
> On 5/19/2016 9:33 AM, Thomas Monjalon wrote:
> > 2016-05-18 18:10, Ferruh Yigit:
> >> Add rte_eth_from_vhost() API to create vhost PMD dynamically from
> >> applications.
> > 
> > How is it different from rte_eth_dev_attach() calling rte_eal_vdev_init()?
> > 
> 
> When rte_eth_dev_attach() is used, the application also needs to call:
> rte_eth_dev_configure()
> rte_eth_rx_queue_setup()
> rte_eth_tx_queue_setup()
> rte_eth_dev_start()
> 
> rte_eth_from_vhost() does these internally, so it is easier for applications to use.

This argument is not sufficient.
We are not going to add new APIs just for wrapping others.


[dpdk-dev] [PATCH] qede: fix 32-bit build with debug enabled

2016-05-19 Thread Thomas Monjalon
Some 64-bit variables are printed for debug.
%PRIx64 qualifier must be used because %lx is not long enough
on 32-bit systems
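
As a standalone illustration of the format-specifier point (not code from
the qede driver), the sketch below prints a 64-bit value with PRIx64 from
<inttypes.h>. The helper name demo_format_phys is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit physical address portably. */
static int
demo_format_phys(char *buf, size_t len, uint64_t phys)
{
	/* PRIx64 expands to the correct conversion for uint64_t on any ABI. */
	return snprintf(buf, len, "phys=0x%" PRIx64, phys);
}
```

With "%lx" the argument would be read as an unsigned long, which is only
32 bits wide on typical 32-bit targets, so the types no longer match and
the compiler warns about it.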

Signed-off-by: Thomas Monjalon 
---
 drivers/net/qede/base/bcm_osal.c| 4 ++--
 drivers/net/qede/base/ecore_cxt.c   | 6 +++---
 drivers/net/qede/base/ecore_mcp.c   | 4 ++--
 drivers/net/qede/base/ecore_sriov.c | 8 
 drivers/net/qede/base/ecore_vf.c| 6 +++---
 5 files changed, 14 insertions(+), 14 deletions(-)
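
For illustration, a minimal standalone sketch of the pattern applied
throughout this patch. format_phys() is a hypothetical helper and not a
function of the qede driver; it only shows how the PRIx64 macro from
<inttypes.h> is spliced into a format string by literal concatenation.

```c
#include <assert.h>
#include <inttypes.h>	/* PRIx64 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a 64-bit physical address portably.
 * PRIx64 expands to the conversion suffix that matches uint64_t on
 * the target (typically "lx" on LP64 and "llx" on 32-bit ABIs). */
int format_phys(char *buf, size_t len, uint64_t phys)
{
	assert(buf != NULL);
	return snprintf(buf, len, "phys=0x%" PRIx64, phys);
}
```

With a plain %lx, the same call would pass a 64-bit argument where a
32-bit long is expected on 32-bit targets, which is what the compiler
warns about in the debug build.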

diff --git a/drivers/net/qede/base/bcm_osal.c b/drivers/net/qede/base/bcm_osal.c
index 9540c4b..ae5a8bc 100644
--- a/drivers/net/qede/base/bcm_osal.c
+++ b/drivers/net/qede/base/bcm_osal.c
@@ -115,7 +115,7 @@ void *osal_dma_alloc_coherent(struct ecore_dev *p_dev,
}
*phys = mz->phys_addr;
DP_VERBOSE(p_dev, ECORE_MSG_PROBE,
-  "size=%zu phys=0x%lx virt=%p on socket=%u\n",
+  "size=%zu phys=0x%" PRIx64 " virt=%p on socket=%u\n",
   mz->len, mz->phys_addr, mz->addr, socket_id);
return mz->addr;
 }
@@ -144,7 +144,7 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *p_dev,
}
*phys = mz->phys_addr;
DP_VERBOSE(p_dev, ECORE_MSG_PROBE,
-  "aligned memory size=%zu phys=0x%lx virt=%p core=%d\n",
+  "aligned memory size=%zu phys=0x%" PRIx64 " virt=%p core=%d\n",
   mz->len, mz->phys_addr, mz->addr, core_id);
return mz->addr;
 }
diff --git a/drivers/net/qede/base/ecore_cxt.c b/drivers/net/qede/base/ecore_cxt.c
index 8436621..1201c1a 100644
--- a/drivers/net/qede/base/ecore_cxt.c
+++ b/drivers/net/qede/base/ecore_cxt.c
@@ -876,8 +876,8 @@ ecore_ilt_blk_alloc(struct ecore_hwfn *p_hwfn,
ilt_shadow[line].size = size;

DP_VERBOSE(p_hwfn, ECORE_MSG_ILT,
-  "ILT shadow: Line [%d] Physical 0x%lx "
-  "Virtual %p Size %d\n",
+  "ILT shadow: Line [%d] Physical 0x%" PRIx64
+  " Virtual %p Size %d\n",
   line, (u64)p_phys, p_virt, size);

sz_left -= size;
@@ -1474,7 +1474,7 @@ static void ecore_ilt_init_pf(struct ecore_hwfn *p_hwfn)
DP_VERBOSE(p_hwfn, ECORE_MSG_ILT,
"Setting RT[0x%08x] from"
" ILT[0x%08x] [Client is %d] to"
-   " Physical addr: 0x%lx\n",
+   " Physical addr: 0x%" PRIx64 "\n",
rt_offst, line, i,
(u64)(p_shdw[line].p_phys >> 12));
}
diff --git a/drivers/net/qede/base/ecore_mcp.c b/drivers/net/qede/base/ecore_mcp.c
index bdc6a5e..9dd2eed 100644
--- a/drivers/net/qede/base/ecore_mcp.c
+++ b/drivers/net/qede/base/ecore_mcp.c
@@ -1197,8 +1197,8 @@ enum _ecore_status_t ecore_mcp_fill_shmem_func_info(struct ecore_hwfn *p_hwfn,
DP_VERBOSE(p_hwfn, (ECORE_MSG_SP | ECORE_MSG_IFUP),
   "Read configuration from shmem: pause_on_host %02x"
" protocol %02x BW [%02x - %02x]"
-   " MAC %02x:%02x:%02x:%02x:%02x:%02x wwn port %lx"
-   " node %lx ovlan %04x\n",
+   " MAC %02x:%02x:%02x:%02x:%02x:%02x wwn port %" PRIx64
+   " node %" PRIx64 " ovlan %04x\n",
   info->pause_on_host, info->protocol,
   info->bandwidth_min, info->bandwidth_max,
   info->mac[0], info->mac[1], info->mac[2],
diff --git a/drivers/net/qede/base/ecore_sriov.c b/drivers/net/qede/base/ecore_sriov.c
index 7cd48ea..1b3119d 100644
--- a/drivers/net/qede/base/ecore_sriov.c
+++ b/drivers/net/qede/base/ecore_sriov.c
@@ -297,9 +297,9 @@ static enum _ecore_status_t ecore_iov_allocate_vfdb(struct ecore_hwfn *p_hwfn)
return ECORE_NOMEM;

DP_VERBOSE(p_hwfn, ECORE_MSG_IOV,
-  "PF's Requests mailbox [%p virt 0x%lx phys],  Response"
-  " mailbox [%p virt 0x%lx phys] Bulletins"
-  " [%p virt 0x%lx phys]\n",
+  "PF's Requests mailbox [%p virt 0x%" PRIx64 " phys], "
+  "Response mailbox [%p virt 0x%" PRIx64 " phys] Bulletins"
+  " [%p virt 0x%" PRIx64 " phys]\n",
   p_iov_info->mbx_msg_virt_addr,
   (u64)p_iov_info->mbx_msg_phys_addr,
   p_iov_info->mbx_reply_virt_addr,
@@ -1250,7 +1250,7 @@ static void ecore_iov_vf_mbx_acquire(struct ecore_hwfn *p_hwfn,

DP_VERBOSE(p_hwfn, ECORE_MSG_IOV,
   "VF[%d] ACQUIRE_RESPONSE: pfdev_info- chip_num=0x%x,"
-  " db_size=%d, idx_per_sb=%d, pf_cap=0x%lx\n"
+  " db_size=%d, idx_per_sb=%d, pf_cap=0x%" PRIx64 "\n"
   "resources- n_rxq-%d, n_txq-%d, n_sbs-%d, n_macs-%d,"
   " n_vlans

[dpdk-dev] [PATCH] scripts: remove unused map files merger

2016-05-19 Thread Thomas Monjalon
This script was forgotten when dropping the combined library.

Fixes: 948fd64befc3 ("mk: replace the combined library with a linker script")

Signed-off-by: Thomas Monjalon 
---
 MAINTAINERS   |  1 -
 scripts/merge-maps.sh | 29 -
 2 files changed, 30 deletions(-)
 delete mode 100755 scripts/merge-maps.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index 9dd0738..3e8558f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -56,7 +56,6 @@ F: scripts/auto-config-h.sh
 F: scripts/depdirs-rule.sh
 F: scripts/gen-build-mk.sh
 F: scripts/gen-config-h.sh
-F: scripts/merge-maps.sh
 F: scripts/relpath.sh
 F: doc/build-sdk-quick.txt
 F: doc/guides/prog_guide/build_app.rst
diff --git a/scripts/merge-maps.sh b/scripts/merge-maps.sh
deleted file mode 100755
index edc88de..000
--- a/scripts/merge-maps.sh
+++ /dev/null
@@ -1,29 +0,0 @@
-#!/bin/sh
-
-FILES=$(find "$RTE_SDK"/lib "$RTE_SDK"/drivers -name "*_version.map")
-SYMBOLS=$(grep -h "{" $FILES | sort -u | sed 's/{//')
-
-first=0
-prev_sym="none"
-
-for s in $SYMBOLS; do
-   echo "$s {"
-   echo "global:"
-   echo ""
-   for f in $FILES; do
-   sed -n "/$s {/,/}/p" "$f" | sed '/^$/d' | grep -v global | grep -v local | sed -e '1d' -e '$d'
-   done | sort -u
-   echo ""
-   if [ $first -eq 0 ]; then
-   first=1;
-   echo "local: *;";
-   fi
-   if [ "$prev_sym" = "none" ]; then
-   echo "};";
-   prev_sym=$s;
-   else
-   echo "} $prev_sym;";
-   prev_sym=$s;
-   fi
-   echo ""
-done
-- 
2.7.0



[dpdk-dev] Underlinked libs and overlinked applications - an issue?

2016-05-19 Thread Thomas Monjalon
2016-05-19 17:38, Christian Ehrhardt:
> Hi,
> I was working on the new 16.04 build system to adapt deb packaging to it.
> I remember somewhen back in the DPDK 2.2 and shared+combined library days I
> had some issues with over/underlinking - but it seems those are still
> existent or came back.

I would say it has always been like that.
Thanks for raising the issue.

> After a build in almost default config (just disabled the kernel modules)
> and set RTE_MACHINE to default I find the following.
> 
> #1
> The libraries are all only linked against external things - even clearly
> using internal structures:
> ldd usr/lib/x86_64-linux-gnu/librte_lpm.so.2
>linux-vdso.so.1 =>  (0x7fff7e7a5000)
>libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7f175d4dd000)
>/lib64/ld-linux-x86-64.so.2 (0x558d3afbf000)

The DT_NEEDED entries are created only for external dependencies.
We should probably also create them for internal dependencies, based on
the variable DEPDIRS-y.

> #2
> The Application then seem to try to make up for that by realizing all that
> is missing.
> But looking at the app alone it seems overlinked by that - it is not using
> all of these on its own.
> ldd usr/bin/cmdline_test
>linux-vdso.so.1 =>  (0x7ffeec9ea000)
>librte_distributor.so.1 => not found
>librte_reorder.so.1 => not found
> [...]
>librte_jobstats.so.1 => not found
> [...]
> And for example none of the librte_jobstats.so.1 symbols are used
> "directly" in there.

Yes, every library is linked into every app by rte.app.mk.
We could probably make better use of DEPDIRS-y for apps.



[dpdk-dev] [PATCH] qede: fix 32-bit build with debug enabled

2016-05-19 Thread Harish Patil
>
>Some 64-bit variables are printed for debug.
>%PRIx64 qualifier must be used because %lx is not long enough
>on 32-bit systems
>
>Signed-off-by: Thomas Monjalon 
>---
> drivers/net/qede/base/bcm_osal.c| 4 ++--
> drivers/net/qede/base/ecore_cxt.c   | 6 +++---
> drivers/net/qede/base/ecore_mcp.c   | 4 ++--
> drivers/net/qede/base/ecore_sriov.c | 8 
> drivers/net/qede/base/ecore_vf.c| 6 +++---
> 5 files changed, 14 insertions(+), 14 deletions(-)
>
>diff --git a/drivers/net/qede/base/bcm_osal.c
>b/drivers/net/qede/base/bcm_osal.c
>index 9540c4b..ae5a8bc 100644
>--- a/drivers/net/qede/base/bcm_osal.c
>+++ b/drivers/net/qede/base/bcm_osal.c
>@@ -115,7 +115,7 @@ void *osal_dma_alloc_coherent(struct ecore_dev *p_dev,
>   }
>   *phys = mz->phys_addr;
>   DP_VERBOSE(p_dev, ECORE_MSG_PROBE,
>- "size=%zu phys=0x%lx virt=%p on socket=%u\n",
>+ "size=%zu phys=0x%" PRIx64 " virt=%p on socket=%u\n",
>  mz->len, mz->phys_addr, mz->addr, socket_id);
>   return mz->addr;
> }
>@@ -144,7 +144,7 @@ void *osal_dma_alloc_coherent_aligned(struct
>ecore_dev *p_dev,
>   }
>   *phys = mz->phys_addr;
>   DP_VERBOSE(p_dev, ECORE_MSG_PROBE,
>- "aligned memory size=%zu phys=0x%lx virt=%p core=%d\n",
>+ "aligned memory size=%zu phys=0x%" PRIx64 " virt=%p 
>core=%d\n",
>  mz->len, mz->phys_addr, mz->addr, core_id);
>   return mz->addr;
> }
>diff --git a/drivers/net/qede/base/ecore_cxt.c
>b/drivers/net/qede/base/ecore_cxt.c
>index 8436621..1201c1a 100644
>--- a/drivers/net/qede/base/ecore_cxt.c
>+++ b/drivers/net/qede/base/ecore_cxt.c
>@@ -876,8 +876,8 @@ ecore_ilt_blk_alloc(struct ecore_hwfn *p_hwfn,
>   ilt_shadow[line].size = size;
> 
>   DP_VERBOSE(p_hwfn, ECORE_MSG_ILT,
>- "ILT shadow: Line [%d] Physical 0x%lx "
>- "Virtual %p Size %d\n",
>+ "ILT shadow: Line [%d] Physical 0x%" PRIx64
>+ " Virtual %p Size %d\n",
>  line, (u64)p_phys, p_virt, size);
> 
>   sz_left -= size;
>@@ -1474,7 +1474,7 @@ static void ecore_ilt_init_pf(struct ecore_hwfn
>*p_hwfn)
>   DP_VERBOSE(p_hwfn, ECORE_MSG_ILT,
>   "Setting RT[0x%08x] from"
>   " ILT[0x%08x] [Client is %d] to"
>-  " Physical addr: 0x%lx\n",
>+  " Physical addr: 0x%" PRIx64 "\n",
>   rt_offst, line, i,
>   (u64)(p_shdw[line].p_phys >> 12));
>   }
>diff --git a/drivers/net/qede/base/ecore_mcp.c
>b/drivers/net/qede/base/ecore_mcp.c
>index bdc6a5e..9dd2eed 100644
>--- a/drivers/net/qede/base/ecore_mcp.c
>+++ b/drivers/net/qede/base/ecore_mcp.c
>@@ -1197,8 +1197,8 @@ enum _ecore_status_t
>ecore_mcp_fill_shmem_func_info(struct ecore_hwfn *p_hwfn,
>   DP_VERBOSE(p_hwfn, (ECORE_MSG_SP | ECORE_MSG_IFUP),
>  "Read configuration from shmem: pause_on_host %02x"
>   " protocol %02x BW [%02x - %02x]"
>-  " MAC %02x:%02x:%02x:%02x:%02x:%02x wwn port %lx"
>-  " node %lx ovlan %04x\n",
>+  " MAC %02x:%02x:%02x:%02x:%02x:%02x wwn port %" PRIx64
>+  " node %" PRIx64 " ovlan %04x\n",
>  info->pause_on_host, info->protocol,
>  info->bandwidth_min, info->bandwidth_max,
>  info->mac[0], info->mac[1], info->mac[2],
>diff --git a/drivers/net/qede/base/ecore_sriov.c
>b/drivers/net/qede/base/ecore_sriov.c
>index 7cd48ea..1b3119d 100644
>--- a/drivers/net/qede/base/ecore_sriov.c
>+++ b/drivers/net/qede/base/ecore_sriov.c
>@@ -297,9 +297,9 @@ static enum _ecore_status_t
>ecore_iov_allocate_vfdb(struct ecore_hwfn *p_hwfn)
>   return ECORE_NOMEM;
> 
>   DP_VERBOSE(p_hwfn, ECORE_MSG_IOV,
>- "PF's Requests mailbox [%p virt 0x%lx phys],  Response"
>- " mailbox [%p virt 0x%lx phys] Bulletins"
>- " [%p virt 0x%lx phys]\n",
>+ "PF's Requests mailbox [%p virt 0x%" PRIx64 " phys], "
>+ "Response mailbox [%p virt 0x%" PRIx64 " phys] Bulletins"
>+ " [%p virt 0x%" PRIx64 " phys]\n",
>  p_iov_info->mbx_msg_virt_addr,
>  (u64)p_iov_info->mbx_msg_phys_addr,
>  p_iov_info->mbx_reply_virt_addr,
>@@ -1250,7 +1250,7 @@ static void ecore_iov_vf_mbx_acquire(struct
>ecore_hwfn *p_hwfn,
> 
>   DP_VERBOSE(p_hwfn, ECORE_MSG_IOV,
>  "VF[%d] ACQUIRE_RESPONSE: pfdev_info- chip_num=0x%x,"
>- " db_size=%d, idx_per_sb=%d, pf_cap=0x%lx\n"
>+ " db_size=%d, idx_per_sb=%d, pf_cap=0x%" PRIx64 "\n"
>  "resources- n_rxq-%d, n_txq-%d,