[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture

2014-10-16 Thread Ananyev, Konstantin

Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu
> Sent: Friday, September 26, 2014 10:36 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power 
> architecture
> 
> The atomic operations implemented with assembly code in DPDK only
> support x86. This patch add architecture specific atomic operations for
> IBM Power architecture.
> 
> Signed-off-by: Chao Zhu 
> ---
>  .../common/include/powerpc/arch/rte_atomic.h       |  387 
>  .../common/include/powerpc/arch/rte_atomic_arch.h  |  318 
>  2 files changed, 705 insertions(+), 0 deletions(-)
>  create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
>  create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> 
...
> +
> diff --git a/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> b/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> new file mode 100644
> index 000..fe5666e
> --- /dev/null
> +
...
>+#define   rte_arch_rmb() asm volatile("sync" : : : "memory")
>+
> +#define  rte_arch_compiler_barrier() do {\
> + asm volatile ("" : : : "memory");   \
> +} while(0)

I don't know much about the PPC architecture, but as I remember it uses a
weakly ordered memory model.
Is that correct?
If so, then you probably need rte_arch_compiler_barrier() to be a "sync"
instruction (like the mb()s above).
The reason is that IA has a much stronger memory ordering model, and there are
a lot of places in the code that implicitly rely on that ordering.
For example, the ring enqueue/dequeue functions.
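
To illustrate the kind of ordering the ring code depends on, here is a minimal
single-producer/single-consumer sketch written with C11 atomics. It is not the
DPDK ring implementation; ex_ring, ex_enqueue and ex_dequeue are hypothetical
names. The release store on the index is what keeps the slot write visible
before the index update. On IA that typically needs no extra instruction, while
on a weakly ordered CPU the compiler has to emit a hardware barrier, which a
bare compiler barrier would not provide.

```c
#include <stdatomic.h>
#include <stdint.h>

#define EX_RING_SIZE 8u /* hypothetical capacity, power of two */

struct ex_ring {
    uint32_t slots[EX_RING_SIZE];
    _Atomic uint32_t head; /* next slot to read, written by the consumer */
    _Atomic uint32_t tail; /* next slot to fill, written by the producer */
};

/* Producer side: fill the slot first, then publish the new tail. */
static int ex_enqueue(struct ex_ring *r, uint32_t value)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == EX_RING_SIZE)
        return -1; /* ring full */

    r->slots[tail % EX_RING_SIZE] = value;
    /* Release: the slot write above must be visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}

/* Consumer side: observe the tail, then read the slot it covers. */
static int ex_dequeue(struct ex_ring *r, uint32_t *value)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Acquire: pairs with the release store in ex_enqueue(). */
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail)
        return -1; /* ring empty */

    *value = r->slots[head % EX_RING_SIZE];
    /* Release: the slot read is complete before the slot is handed back. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}
```

If the two index stores were plain stores separated only by a compiler
barrier, the sketch would still behave on IA but could expose a stale slot on
a weakly ordered machine.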

Konstantin



[dpdk-dev] i40e: Steps and required configurations of how to achieve the best performance!

2014-10-16 Thread Zhang, Helin
Hi Thomas

Yes, your proposal is the perfect one, and also the most complicated one. I was
thinking of that one as well, but we did not have enough time for it in our
1.8 timeframe.
In the long run, I agree with you on implementing EAL functions to access PCI
config space directly. I will try to put it in our plan as soon as possible, if
there are no objections.

For now, I think the quickest and easiest way might be to write a script that
uses 'setpci', the Linux command. It is harmless for our code base, and we can
remove it when we have a better choice. What do you think?

Thank you very much for the great comments on this topic! I really like it!

Regards,
Helin

From: Thomas Monjalon [mailto:thomas.monja...@6wind.com]
Sent: Wednesday, October 15, 2014 5:42 PM
To: Zhang, Helin
Cc: dev at dpdk.org; David Marchand
Subject: Re: [dpdk-dev] i40e: Steps and required configurations of how to 
achieve the best performance!


Hi Helin,

2014-09-19 03:43, Zhang, Helin:
> My idea on it could be,
> 1. Write a script to use 'setpci' to configure pci configuration.
> End user can decide which PCI device needs to be changed.
> 2. Add code to change that PCI configuration in i40e PMD only, as
> it seems nobody else need it till now.

The second solution seems better because it is more integrated and automatic.
But I would like to have some EAL functions to access the PCI configuration.
These functions would have Linux and BSD implementations.
Then the PMD could change the configuration if it's allowed by a run-time
option and would notify the change with a warning/log.
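
A minimal sketch of what such accessors might look like on Linux, assuming the
device's sysfs config file is used (for example
/sys/bus/pci/devices/0000:01:00.0/config). The names ex_pci_cfg_read and
ex_pci_cfg_write are hypothetical and are not existing EAL functions; a BSD
implementation would need a different backend and is not shown.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read `len` bytes at byte `offset` of a PCI
 * configuration space exposed as a file at `cfg_path`.
 * Returns 0 on success, -1 on any error or short read.
 */
static int ex_pci_cfg_read(const char *cfg_path, void *buf,
                           size_t len, off_t offset)
{
    ssize_t n;
    int fd = open(cfg_path, O_RDONLY);

    if (fd < 0)
        return -1;
    n = pread(fd, buf, len, offset);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}

/*
 * Hypothetical helper: write `len` bytes at byte `offset`.
 * Returns 0 on success, -1 on any error or short write.
 */
static int ex_pci_cfg_write(const char *cfg_path, const void *buf,
                            size_t len, off_t offset)
{
    ssize_t n;
    int fd = open(cfg_path, O_WRONLY);

    if (fd < 0)
        return -1;
    n = pwrite(fd, buf, len, offset);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

A PMD could call such helpers only when the run-time option allows it, and log
a warning whenever ex_pci_cfg_write() succeeds.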



Thanks for keeping us notified of your progress.
--
Thomas


[dpdk-dev] [PATCH] Add Rx error statistics for Fortville

2014-10-16 Thread Zhang, Helin
Hi Thomas

Thank you very much for the detailed guidance! It is really helpful for me.

Regards,
Helin

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, October 15, 2014 8:22 PM
> To: Zhang, Helin
> Cc: dev at dpdk.org; Liu, Jijiang
> Subject: Re: [dpdk-dev] [PATCH] Add Rx error statistics for Fortville
> 
> Helin,
> 
> As you are in charge of i40e, here are 2 tips to acknowledge patches:
> 
> 1) title should take this format:
>   i40e: add Rx error statistics
> 
> > Acked-by: Helin Zhang 
> 
> 2) This line should be added right after the Signed-off-by.
> And the rest of the email (patch body) can be removed.
> This way, your answer would be faster to read.
> 
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> > > Sent: Wednesday, October 15, 2014 11:15 AM
> > > To: dev at dpdk.org
> > > Subject: [dpdk-dev] [PATCH] Add Rx error statistics for Fortville
> 
> This header is not needed either.
> 
> > > This patch adds incoming packet error statistics in the i40e_ethdev.c 
> > > file.
> > >
> > > Signed-off-by: Jijiang Liu 
> 
> [I remove the rest of the original email because I have no comment on it]
> 
> Thanks
> --
> Thomas


[dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API

2014-10-16 Thread Liu, Jijiang


> -----Original Message-----
> From: De Lara Guarch, Pablo
> Sent: Thursday, October 16, 2014 12:01 AM
> To: Liu, Jijiang; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> > Sent: Saturday, October 11, 2014 6:56 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API
> >
> > Introduce a new filter framework in librte_ether. As to the
> > implementation discussion, please refer to
> > http://dpdk.org/ml/archives/dev/2014-September/005179.html, and VxLAN
> > tunnel filter implementation is based on it.
> >
> > Signed-off-by: Jijiang Liu 
> > Acked-by: Helin Zhang 
> > Acked-by: Jingjing Wu 
> >
> > ---
> >  lib/librte_ether/Makefile       |    1 +
> >  lib/librte_ether/rte_eth_ctrl.h |  152 +++
> >  lib/librte_ether/rte_ethdev.c   |   32 
> >  lib/librte_ether/rte_ethdev.h   |   56 +++---
> >  4 files changed, 229 insertions(+), 12 deletions(-)
> >  create mode 100644 lib/librte_ether/rte_eth_ctrl.h
> >
> [...]
> > +++ b/lib/librte_ether/rte_eth_ctrl.h
> 
> [...]
> 
> > +/**
> > + * Tunnel Packet filter configuration.
> > + */
> > +struct rte_eth_tunnel_filter_conf {
> > +   struct ether_addr *outer_mac;  /**< Outer MAC address filter. */
> > +   struct ether_addr *inner_mac;  /**< Inner MAC address filter. */
> > +   uint16_t inner_vlan;   /**< Inner VLAN filter. */
> > +   enum rte_tunnel_iptype ip_type; /**< IP address type. */
> > +   union {
> > +   uint32_t ipv4_addr;/**< IPv4 source address to match. */
> > +   uint32_t ipv6_addr[4]; /**< IPv6 source address to match. */
> > +   } ip_addr; /**< IPv4/IPv6 source address to match (union of above). */
> > +
> > +   uint8_t filter_type;   /**< Filter type. */
> 
> This should be enum rte_tunnel_filter_type filter_type, and not uint8_t
> filter_type.

I will fix this.

> > +   uint8_t to_queue;   /**< Use MAC and VLAN to point to a queue. */
> > +   enum rte_eth_tunnel_type tunnel_type; /**< Tunnel Type. */
> > +   uint32_t tenant_id;/**< Tenant number. */
> > +   uint16_t queue_id; /**< Queue number. */
> > +};
> > +
> [...]


[dpdk-dev] to the intel dpdk engineers and all contributors

2014-10-16 Thread Wiles, Roger Keith

On Oct 15, 2014, at 1:46 PM, daniel chapiesky  wrote:

> I just watched the closing remarks by Tim Driscol at the dpdk summit
> 
> http://youtu.be/r-JA5NBybrs
> 
> At time 4:30, he mentioned the "shock to the system" of developers
> expecting a pat on the back and instead receiving critiques of their
> code.
> 
> I realized that I was one of those who failed to acknowledge the incredible
> work the Intel Engineers and other contributors have produced.
> 
> Please let me acknowledge all of you and your efforts with a few comments:
> 
> 1) Kudos!!:
> 
> 2) The Packet Framework made me run around waving my hands in the air
> yelling: "This is freaking awesome! I don't have to write it myself!!!"
> 
> 3) The layered architecture is elegant.
> 
> 4) Examples The examples are wonderful! Those who wrote the examples
> are my heroes.
> 
> 5) Docs? Clear, to the point, and better than the other projects we depend
> upon (you know who you are)
> 
> 6) 6Wind - Thank you for taking on the management of the repository and
> website - your coordination effort is truly appreciated
> 
> 7) Did I say the Packet Framework saved me so much time I was actually able
> to cut back my coffee intake by 10%
> 
> 8) Windriver - PktGen!  (though I really want to know more about
> mcos...)

If you need more information about MCOS let me know as I did not write a lot of 
docs for it :-(

Thanks
++Keith
> 
> Finally,
> 
> I recently received a pat on the back for the application I have developed.
> In truth, that pat should be passed on, since my application depends so
> heavily on DPDK.
> 
> Thank you.
> 
> I encourage others to let the Intel Engineers and contributors know how
> much we appreciate the time and effort they have given to DPDK.
> 
> Sincerely,
> 
> 
> Daniel Chapiesky
> AllSource

Keith Wiles, Principal Technologist with CTO office, Wind River mobile 
972-213-5533



[dpdk-dev] kernel panic when stop my test demo

2014-10-16 Thread Lilijun
On 2014/10/15 18:08, Richardson, Bruce wrote:
> 
> 
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Lilijun
>> Sent: Wednesday, October 15, 2014 10:43 AM
>> To: dev at dpdk.org; stephen at networkplumber.org
>> Subject: Re: [dpdk-dev] kernel panic when stop my test demo
>>
>> Hi all,
>>
>> After adding unmap uio resources operations in process signal handler 
>> functions,
>> An new error was found as follows:
>> Call Trace:
>>  [] uio_release+0x40/0x60 [uio]
>>  [] __fput+0xe9/0x270
>>  [] fput+0xe/0x10
>>  [] task_work_run+0xa7/0xe0
>>  [] do_notify_resume+0x97/0xb0
>>  [] int_signal+0x12/0x17
>>
>> The code for unmap uio resources is shown:
>> static void pci_dev_uio_unmap(struct rte_pci_device *pci_dev, uint8_t port_id)
>> {
>>     int i;
>>
>>     RTE_LOG(INFO, EAL, "begin unmap port %d uio resource! \n", port_id);
>>     if (NULL == pci_dev)
>>     {
>>         RTE_LOG(ERR, EAL, "begin unmap port %d uio resource! \n", port_id);
>>         return;
>>     }
>>
>>     for (i = 0; i != PCI_MAX_RESOURCE; i++)
>>     {
>>         /* skip empty BAR */
>>         if (0 == pci_dev->mem_resource[i].phys_addr)
>>             continue;
>>         if (munmap(pci_dev->mem_resource[i].addr,
>>                    pci_dev->mem_resource[i].len) == -1){
>>             RTE_LOG(ERR, EAL, "Error with munmap\n");
>>             return;
>>         }
>>     }
>>     if (close(pci_dev->intr_handle.fd) == -1){
>>         RTE_LOG(ERR, EAL, "Error closing interrupt handle\n");
>>         return;
>>     }
>>     pci_dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
>>     RTE_LOG(INFO, EAL, "unmap port %d uio resource successfully!\n",
>>             port_id);
>> }
>>
>> Does anyone has some ideas?
>>
>> Thanks for any help.
>> Jerry
>>
>> On 2014/10/14 19:58, Lilijun wrote:
>>> Hi Stephen and all,
>>>
>>> I have a same problem as this older email describes on Aug 14, 2013.
>>> Any help will be appreciated.
>>>
>>> The details is shown as follows.
>>> The key step implementation of my demo is:
>>> 1. Firstly, call rte_eal_init() to do some initialization.
>>> 2. Switch the driver of my Intel  82599 NIC from ixgbe.ko to igb_uio.ko
>>> like tools/dpdk_nic_bind.py written in C source code.
>>> 3. Configure rte_dev and start it.
>>> 4. Do some rx/tx tests.
>>> 5. call rte_eth_dev_stop(dpdk_port_id) to stop the hardware as your history
>> emails.
>>> 6. Switch the driver of the NIC from igb_uio.ko to ixgbe.ko.
>>> 7. Kill the demo using commands: kill -9.
> 
> Just to clarify one point - you have an application running which was using
> the NICs with DPDK while you remove the uio driver and replace it with ixgbe?
> I would expect doing such a thing to cause problems as stopping the device does
> not cause the NIC BAR memory to be unmapped from the DPDK process.
> Therefore removing the driver providing that memory map and getting another
> driver to start using those same BARs would not be recommended.
> 

Thanks for your reply.
Yes, I want to change the NIC driver by replacing the uio driver with ixgbe, in
order to return the NIC to its original kernel Ethernet device while keeping the
application running.
Then my application can use the NICs with DPDK or with the kernel ixgbe driver
on demand.
I am confused about how to release all uio resources when stopping my application.

Do you have any suggestions for my requirements?

> /Bruce




[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture

2014-10-16 Thread Chao CH Zhu
Konstantin,

In my understanding, a compiler barrier is a kind of software barrier which
prevents the compiler from moving memory accesses across the barrier. This
should be architecture-independent. The "sync" instruction, by contrast, is a
hardware barrier which is specific to the PowerPC architecture. So I think the
compiler barrier should be the same on x86 and PowerPC. Any comments?
Please correct me if I am wrong.

Thanks a lot! 

Best Regards!
--
Chao Zhu 




From:    "Ananyev, Konstantin" 
To:      Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" 
Date:    2014/10/16 08:38
Subject: RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture




Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu
> Sent: Friday, September 26, 2014 10:36 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power
> architecture
> 
> The atomic operations implemented with assembly code in DPDK only
> support x86. This patch add architecture specific atomic operations for
> IBM Power architecture.
> 
> Signed-off-by: Chao Zhu 
> ---
>  .../common/include/powerpc/arch/rte_atomic.h       |  387 
>  .../common/include/powerpc/arch/rte_atomic_arch.h  |  318 
>  2 files changed, 705 insertions(+), 0 deletions(-)
>  create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
>  create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> 
...
> +
> diff --git a/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> b/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> new file mode 100644
> index 000..fe5666e
> --- /dev/null
> +
...
> +#define rte_arch_rmb() asm volatile("sync" : : : "memory")
> +
> +#define rte_arch_compiler_barrier() do {\
> +  asm volatile ("" : : : "memory");   \
> +} while(0)

I don't know much about PPC architecture, but as I remember it uses a 
weakly-ordering memory model.
Is that correct?
If so, then you probably need rte_arch_compiler_barrier() to be "sync" 
instruction (like mb()s above) .
The reason is that IA has much stronger memory ordering model and there 
are a lot of places in the code where it implies that  ordering.
For example - ring enqueue/dequeue functions. 

Konstantin




[dpdk-dev] kernel panic when stop my test demo

2014-10-16 Thread Richardson, Bruce


> -----Original Message-----
> From: Lilijun [mailto:jerry.lilijun at huawei.com]
> Sent: Thursday, October 16, 2014 3:40 AM
> To: Richardson, Bruce; dev at dpdk.org; stephen at networkplumber.org
> Subject: Re: [dpdk-dev] kernel panic when stop my test demo
> 
> On 2014/10/15 18:08, Richardson, Bruce wrote:
> >
> >
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Lilijun
> >> Sent: Wednesday, October 15, 2014 10:43 AM
> >> To: dev at dpdk.org; stephen at networkplumber.org
> >> Subject: Re: [dpdk-dev] kernel panic when stop my test demo
> >>
> >> Hi all,
> >>
> >> After adding unmap uio resources operations in process signal handler
> functions,
> >> An new error was found as follows:
> >> Call Trace:
> >>  [] uio_release+0x40/0x60 [uio]
> >>  [] __fput+0xe9/0x270
> >>  [] fput+0xe/0x10
> >>  [] task_work_run+0xa7/0xe0
> >>  [] do_notify_resume+0x97/0xb0
> >>  [] int_signal+0x12/0x17
> >>
> >> The code for unmap uio resources is shown:
> >> static void pci_dev_uio_unmap(struct rte_pci_device *pci_dev, uint8_t
> port_id)
> >> {
> >> int i;
> >>
> >> RTE_LOG(INFO, EAL, "begin unmap port %d uio resource! \n", 
> >> port_id);
> >> if (NULL == pci_dev)
> >> {
> >> RTE_LOG(ERR, EAL, "begin unmap port %d uio resource! \n",
> port_id);
> >> return;
> >> }
> >>
> >> for (i = 0; i != PCI_MAX_RESOURCE; i++)
> >> {
> >> /* skip empty BAR */
> >> if (0 == pci_dev->mem_resource[i].phys_addr)
> >> continue;
> >> if (munmap(pci_dev->mem_resource[i].addr,
> >> pci_dev->mem_resource[i].len) == -1){
> >> RTE_LOG(ERR, EAL, "Error with munmap\n");
> >> return;
> >> }
> >> }
> >> if (close(pci_dev->intr_handle.fd) == -1){
> >> RTE_LOG(ERR, EAL, "Error closing interrupt handle\n");
> >> return;
> >> }
> >> pci_dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
> >> RTE_LOG(INFO, EAL, "unmap port %d uio resource successfully!\n",
> >> port_id);
> >> }
> >>
> >> Does anyone has some ideas?
> >>
> >> Thanks for any help.
> >> Jerry
> >>
> >> On 2014/10/14 19:58, Lilijun wrote:
> >>> Hi Stephen and all,
> >>>
> >>> I have a same problem as this older email describes on Aug 14, 2013.
> >>> Any help will be appreciated.
> >>>
> >>> The details is shown as follows.
> >>> The key step implementation of my demo is:
> >>> 1. Firstly, call rte_eal_init() to do some initialization.
> >>> 2. Switch the driver of my Intel  82599 NIC from ixgbe.ko to igb_uio.ko
> >>> like tools/dpdk_nic_bind.py written in C source code.
> >>> 3. Configure rte_dev and start it.
> >>> 4. Do some rx/tx tests.
> >>> 5. call rte_eth_dev_stop(dpdk_port_id) to stop the hardware as your 
> >>> history
> >> emails.
> >>> 6. Switch the driver of the NIC from igb_uio.ko to ixgbe.ko.
> >>> 7. Kill the demo using commands: kill -9.
> >
> > Just to clarify one point - you have an application running which was using
> > the NICs with DPDK while you remove the uio driver and replace it with ixgbe?
> > I would expect doing such a thing to cause problems as stopping the device
> > does not cause the NIC BAR memory to be unmapped from the DPDK process.
> > Therefore removing the driver providing that memory map and getting another
> > driver to start using those same BARs would not be recommended.
> >
> 
> Thanks for your reply.
> Yes, I want to change the NIC driver by replacing the uio driver with ixgbe in
> order to recover the NIC to origin kernel ether-net devices while keeping the
> application running.
> Then my application can use the NICs with DPDK or with kernel ixgbe driver on
> demand.
> I am confusing with how to release all uio resources when stop my application.
> 
> Would you like to give me any suggestions for my requirements?
> 

Right now, there is no way to do this without changing the internals of
DPDK itself. The BARs from the NIC are mmapped permanently into the process's
address space on initialization of the application, and are never released.
You'd basically need to write code to un-initialize DPDK and then
reinitialize it at a later point.
An alternative might be to have two separate applications or binaries
that appear as one, or work as one. Then you could shut down the DPDK binary
before removing the uio driver, switch over to the ixgbe driver and use the
other application. However, I realise that getting a seamless transition could
be difficult there.
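
As a rough illustration of the symmetric map/unmap pairing that such an
"un-initialize" path would need, here is a minimal sketch. It uses an ordinary
file as a stand-in for a uio resource file, and ex_mapping, ex_map and ex_unmap
are hypothetical names. It does not touch DPDK internals and is not a fix for
the panic reported in this thread.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical record of one mapped region and the fd backing it. */
struct ex_mapping {
    int fd;
    void *addr;
    size_t len;
};

/* Map `len` bytes of `path`; returns 0 on success, -1 on failure. */
static int ex_map(struct ex_mapping *m, const char *path, size_t len)
{
    m->addr = NULL;
    m->len = 0;
    m->fd = open(path, O_RDWR);
    if (m->fd < 0)
        return -1;

    m->addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, m->fd, 0);
    if (m->addr == MAP_FAILED) {
        close(m->fd);
        m->fd = -1;
        m->addr = NULL;
        return -1;
    }
    m->len = len;
    return 0;
}

/* Undo ex_map(): unmap first, then close the fd, then clear the record. */
static int ex_unmap(struct ex_mapping *m)
{
    int ret = 0;

    if (m->addr != NULL && munmap(m->addr, m->len) == -1)
        ret = -1;
    if (m->fd >= 0 && close(m->fd) == -1)
        ret = -1;
    m->addr = NULL;
    m->len = 0;
    m->fd = -1;
    return ret;
}
```

The point of the pairing is that every region recorded at initialization has a
matching release, so nothing still references the old driver's memory when the
driver is swapped.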

/Bruce


[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture

2014-10-16 Thread Richardson, Bruce
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao CH Zhu
> Sent: Thursday, October 16, 2014 4:14 AM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power
> architecture
> 
> Konstantin,
> 
> In my understanding, compiler barrier is a kind of software barrier which
> prevents the compiler from moving memory accesses across the barrier. This
> should be architecture-independent. And the "sync" instruction is a
> hardware barrier which depends on PowerPC architecture. So I think the
> compiler barrier should be the same on x86 and PowerPC. Any comments?
> Please correct me if I was wrong.
> 
I would agree with that assessment, as far as it goes, in that a compiler
barrier is going to be the same on both architectures. However, we also need to
start thinking about actual use cases - how do we specify the barriers in a
piece of code where we need a full memory barrier on PPC and only a compiler
barrier on IA?
My suggestion would be to do first as you propose and have proper primitives
for the different barrier types defined correctly for each platform - with the
compiler barrier being, presumably, common across each one. Then, as a second
step, we probably need to look at defining "logical" barrier types (for want of
a better term) that can then be used in the code and which would be different
across platforms.
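
One possible shape for such "logical" barriers, as a minimal sketch only. The
ex_* names are hypothetical and are not part of the patch; the PPC branch
conservatively reuses the same "sync" instruction as the patch, and the code
assumes a GCC-compatible compiler.

```c
/* Primitive, identical on every platform: stops compiler reordering only. */
#define ex_compiler_barrier() __asm__ volatile ("" : : : "memory")

#if defined(__x86_64__) || defined(__i386__)
/* IA does not reorder loads with loads or stores with stores, so the
 * logical read/write barriers reduce to a compiler barrier. */
#define ex_smp_rmb() ex_compiler_barrier()
#define ex_smp_wmb() ex_compiler_barrier()
#elif defined(__powerpc__) || defined(__powerpc64__)
/* Weakly ordered: a hardware barrier is required as well. */
#define ex_smp_rmb() __asm__ volatile ("sync" : : : "memory")
#define ex_smp_wmb() __asm__ volatile ("sync" : : : "memory")
#else
/* Fallback for other targets: let the compiler pick the fence. */
#define ex_smp_rmb() __atomic_thread_fence(__ATOMIC_ACQUIRE)
#define ex_smp_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#endif

static int ex_data;
static volatile int ex_ready;

/* Writer: make the payload visible before raising the flag. */
static void ex_producer(int value)
{
    ex_data = value;
    ex_smp_wmb();
    ex_ready = 1;
}

/* Reader: check the flag, then read the payload it guards. */
static int ex_consumer(int *out)
{
    if (!ex_ready)
        return -1;
    ex_smp_rmb();
    *out = ex_data;
    return 0;
}
```

Code such as the ring functions would then use only the logical names, and
each platform header would decide how much barrier that actually costs.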

Does this make sense to do this way? Is it the best solution? Do we want to 
define the basic primitives or are we only ever likely to need the logical 
barrier types?

/Bruce


[dpdk-dev] [PATCH v2 0/6] i40e VMDQ support

2014-10-16 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

v2:
- Fix a few typos.
- Add comments for RX mq mode flags.
- Remove '\n' from some log messages.
- Remove 'Acked-by' in commit log.

v1:
Define extra VMDQ arguments to expand VMDQ configuration. This also
includes changes in the igb and ixgbe PMD drivers. In the meantime, fix 2
defects in the rte_ether library.

Add full VMDQ support in the i40e PMD driver: rename some functions and set up
the VMDQ VSI after it's enabled in the application. Also make some improvements
to macaddr add/delete to support setting multiple macaddrs for single or
multiple pools.

Finally, change the i40e rx/tx_queue_setup and dev_start/stop functions to
configure/switch queues belonging to VMDQ pools.


Chen Jing D(Mark) (6):
  ether: enhancement for VMDQ support
  igb: change for VMDQ arguments expansion
  ixgbe: change for VMDQ arguments expansion
  i40e: add VMDQ support
  i40e: macaddr add/del enhancement
  i40e: Add full VMDQ pools support

 config/common_linuxapp  |1 +
 lib/librte_ether/rte_ethdev.c   |   12 +-
 lib/librte_ether/rte_ethdev.h   |   43 +++-
 lib/librte_pmd_e1000/igb_ethdev.c   |3 +
 lib/librte_pmd_i40e/i40e_ethdev.c   |  499 ++-
 lib/librte_pmd_i40e/i40e_ethdev.h   |   21 ++-
 lib/librte_pmd_i40e/i40e_rxtx.c |  125 +++--
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |1 +
 8 files changed, 536 insertions(+), 169 deletions(-)

-- 
1.7.7.6



[dpdk-dev] [PATCH v2 2/6] igb: change for VMDQ arguments expansion

2014-10-16 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Assign new VMDQ arguments with correct values.

Signed-off-by: Chen Jing D(Mark) 
---
 lib/librte_pmd_e1000/igb_ethdev.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/lib/librte_pmd_e1000/igb_ethdev.c b/lib/librte_pmd_e1000/igb_ethdev.c
index c9acdc5..dc0ea6d 100644
--- a/lib/librte_pmd_e1000/igb_ethdev.c
+++ b/lib/librte_pmd_e1000/igb_ethdev.c
@@ -1286,18 +1286,21 @@ eth_igb_infos_get(struct rte_eth_dev *dev,
dev_info->max_rx_queues = 16;
dev_info->max_tx_queues = 16;
dev_info->max_vmdq_pools = ETH_8_POOLS;
+   dev_info->vmdq_queue_num = 16;
break;

case e1000_82580:
dev_info->max_rx_queues = 8;
dev_info->max_tx_queues = 8;
dev_info->max_vmdq_pools = ETH_8_POOLS;
+   dev_info->vmdq_queue_num = 8;
break;

case e1000_i350:
dev_info->max_rx_queues = 8;
dev_info->max_tx_queues = 8;
dev_info->max_vmdq_pools = ETH_8_POOLS;
+   dev_info->vmdq_queue_num = 8;
break;

case e1000_i354:
-- 
1.7.7.6



[dpdk-dev] [PATCH v2 1/6] ether: enhancement for VMDQ support

2014-10-16 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

The change includes several parts:
1. Clear the pool bitmap when removing a specific MAC address.
2. Define RSS, DCB and VMDQ flags that are combined to form rx_mq_mode.
3. Use 'struct' to replace 'union', in order to expand the rx_adv_conf
   arguments to better support RSS, DCB and VMDQ.
4. Fix a bug in the rte_eth_dev_config_restore function, which restored
   all MAC addresses to the default pool.
5. Define 3 additional arguments for better VMDQ support.
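
As an illustration of item 2, here is a minimal sketch of how a caller might
test an rx_mq_mode value against the new flags. The flag and enum values are
copied from the diff below; ex_mq_mode_has is a hypothetical helper and is not
part of the patch.

```c
#include <stdbool.h>

/* Values as defined in the rte_ethdev.h hunk of this patch. */
#define ETH_MQ_RX_RSS_FLAG  0x1
#define ETH_MQ_RX_DCB_FLAG  0x2
#define ETH_MQ_RX_VMDQ_FLAG 0x4

enum rte_eth_rx_mq_mode {
    ETH_MQ_RX_NONE = 0,
    ETH_MQ_RX_RSS = ETH_MQ_RX_RSS_FLAG,
    ETH_MQ_RX_DCB = ETH_MQ_RX_DCB_FLAG,
    ETH_MQ_RX_DCB_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_DCB_FLAG,
    ETH_MQ_RX_VMDQ_ONLY = ETH_MQ_RX_VMDQ_FLAG,
    ETH_MQ_RX_VMDQ_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_VMDQ_FLAG,
    ETH_MQ_RX_VMDQ_DCB = ETH_MQ_RX_VMDQ_FLAG | ETH_MQ_RX_DCB_FLAG,
    ETH_MQ_RX_VMDQ_DCB_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_DCB_FLAG |
                             ETH_MQ_RX_VMDQ_FLAG,
};

/* Hypothetical helper: true if `mode` includes the given feature flag. */
static bool ex_mq_mode_has(enum rte_eth_rx_mq_mode mode, unsigned int flag)
{
    return ((unsigned int)mode & flag) != 0;
}
```

With the enum encoded this way, a driver can check for VMDQ with a single mask
test instead of enumerating every mode that includes it.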

Signed-off-by: Chen Jing D(Mark) 
---
 lib/librte_ether/rte_ethdev.c |   12 ++
 lib/librte_ether/rte_ethdev.h |   43 ++--
 2 files changed, 39 insertions(+), 16 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index fd1010a..86f4409 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -771,7 +771,8 @@ rte_eth_dev_config_restore(uint8_t port_id)
continue;

/* add address to the hardware */
-   if  (*dev->dev_ops->mac_addr_add)
+   if  (*dev->dev_ops->mac_addr_add &&
+   dev->data->mac_pool_sel[i] & (1ULL << pool))
(*dev->dev_ops->mac_addr_add)(dev, &addr, i, pool);
else {
PMD_DEBUG_TRACE("port %d: MAC address array not supported\n",
@@ -1249,10 +1250,8 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
}
dev = &rte_eth_devices[port_id];

-   /* Default device offload capabilities to zero */
-   dev_info->rx_offload_capa = 0;
-   dev_info->tx_offload_capa = 0;
-   dev_info->if_index = 0;
+   /* Set all fields with zero */
+   memset(dev_info, 0, sizeof(*dev_info));
FUNC_PTR_OR_RET(*dev->dev_ops->dev_infos_get);
(*dev->dev_ops->dev_infos_get)(dev, dev_info);
dev_info->pci_dev = dev->pci_dev;
@@ -2022,6 +2021,9 @@ rte_eth_dev_mac_addr_remove(uint8_t port_id, struct ether_addr *addr)
/* Update address in NIC data structure */
ether_addr_copy(&null_mac_addr, &dev->data->mac_addrs[index]);

+   /* reset pool bitmap */
+   dev->data->mac_pool_sel[index] = 0;
+
return 0;
 }

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 50df654..4c83aa5 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -252,20 +252,37 @@ struct rte_eth_thresh {
 };

 /**
+ *  Simple flags to indicate RX mq mode, which can be used independently or combined
+ *  in enum rte_eth_rx_mq_mode definition.
+ */
+#define ETH_MQ_RX_RSS_FLAG  0x1
+#define ETH_MQ_RX_DCB_FLAG  0x2
+#define ETH_MQ_RX_VMDQ_FLAG 0x4
+
+/**
  *  A set of values to identify what method is to be used to route
  *  packets to multiple queues.
  */
 enum rte_eth_rx_mq_mode {
-   ETH_MQ_RX_NONE = 0,  /**< None of DCB,RSS or VMDQ mode */
-
-   ETH_MQ_RX_RSS,   /**< For RX side, only RSS is on */
-   ETH_MQ_RX_DCB,   /**< For RX side,only DCB is on. */
-   ETH_MQ_RX_DCB_RSS,   /**< Both DCB and RSS enable */
-
-   ETH_MQ_RX_VMDQ_ONLY, /**< Only VMDQ, no RSS nor DCB */
-   ETH_MQ_RX_VMDQ_RSS,  /**< RSS mode with VMDQ */
-   ETH_MQ_RX_VMDQ_DCB,  /**< Use VMDQ+DCB to route traffic to queues */
-   ETH_MQ_RX_VMDQ_DCB_RSS, /**< Enable both VMDQ and DCB in VMDq */
+   /**< None of DCB,RSS or VMDQ mode */
+   ETH_MQ_RX_NONE = 0,
+
+   /**< For RX side, only RSS is on */
+   ETH_MQ_RX_RSS = ETH_MQ_RX_RSS_FLAG,
+   /**< For RX side,only DCB is on. */
+   ETH_MQ_RX_DCB = ETH_MQ_RX_DCB_FLAG,
+   /**< Both DCB and RSS enable */
+   ETH_MQ_RX_DCB_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_DCB_FLAG,
+
+   /**< Only VMDQ, no RSS nor DCB */
+   ETH_MQ_RX_VMDQ_ONLY = ETH_MQ_RX_VMDQ_FLAG,
+   /**< RSS mode with VMDQ */
+   ETH_MQ_RX_VMDQ_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_VMDQ_FLAG,
+   /**< Use VMDQ+DCB to route traffic to queues */
+   ETH_MQ_RX_VMDQ_DCB = ETH_MQ_RX_VMDQ_FLAG | ETH_MQ_RX_DCB_FLAG,
+   /**< Enable both VMDQ and DCB in VMDq */
+   ETH_MQ_RX_VMDQ_DCB_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_DCB_FLAG |
+ETH_MQ_RX_VMDQ_FLAG,
 };

 /**
@@ -840,7 +857,7 @@ struct rte_eth_conf {
 Read the datasheet of given ethernet controller
 for details. The possible values of this field
 are defined in implementation of each driver. */
-   union {
+   struct {
struct rte_eth_rss_conf rss_conf; /**< Port RSS configuration */
struct rte_eth_vmdq_dcb_conf vmdq_dcb_conf;
/**< Port vmdq+dcb configuration. */
@@ -906,6 +923,10 @@ struct rte_eth_dev_info {
uint16_t max_vmdq_pools; /**< Maximum number of VMDq pools. */
uint32_t rx_offload_capa; /**< Device RX offload capabilities. */
uint32_t tx_offload_capa; /**< Device 

[dpdk-dev] [PATCH v2 3/6] ixgbe: change for VMDQ arguments expansion

2014-10-16 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Assign new VMDQ arguments with correct values.

Signed-off-by: Chen Jing D(Mark) 
---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index f4b590b..d0f9bcb 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -1933,6 +1933,7 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
dev_info->max_vmdq_pools = ETH_16_POOLS;
else
dev_info->max_vmdq_pools = ETH_64_POOLS;
+   dev_info->vmdq_queue_num = dev_info->max_rx_queues;
dev_info->rx_offload_capa =
DEV_RX_OFFLOAD_VLAN_STRIP |
DEV_RX_OFFLOAD_IPV4_CKSUM |
-- 
1.7.7.6



[dpdk-dev] [PATCH v2 5/6] i40e: macaddr add/del enhancement

2014-10-16 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Change the i40e_macaddr_add and i40e_macaddr_remove functions to support
adding and deleting multiple MAC addresses. In the meantime, support macaddr
ops on different pools.
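
The pool handling can be pictured with the following minimal sketch of walking
a 64-bit pool bitmap, where bit 0 stands for the main VSI and bit n for VMDQ
pool n. All ex_* names are hypothetical stand-ins, not the driver's own
functions, and the return codes are chosen only for this illustration.

```c
#include <limits.h>
#include <stdint.h>

#define EX_MAIN_VSI     (-1) /* pool 0 maps to the main VSI */
#define EX_INVALID_POOL (-2) /* pool beyond the configured VMDQ VSIs */

/* Map a pool number to an index into a vmdq[] array (pool n -> n - 1). */
static int ex_pool_to_vmdq_index(unsigned int pool, unsigned int nb_cfg_vmdq_vsi)
{
    if (pool == 0)
        return EX_MAIN_VSI;
    if (pool > nb_cfg_vmdq_vsi)
        return EX_INVALID_POOL;
    return (int)pool - 1;
}

typedef void (*ex_pool_fn)(unsigned int pool, void *arg);

/* Call `fn` once for every pool whose bit is set; returns the pool count. */
static unsigned int ex_for_each_pool(uint64_t pool_sel, ex_pool_fn fn, void *arg)
{
    unsigned int i, n = 0;

    for (i = 0; i < sizeof(pool_sel) * CHAR_BIT; i++) {
        if (pool_sel & (1ULL << i)) {
            fn(i, arg);
            n++;
        }
    }
    return n;
}

/* Example callback: accumulate the visited pool numbers. */
static void ex_sum_pool(unsigned int pool, void *arg)
{
    *(unsigned int *)arg += pool;
}
```

A remove operation would use the callback to delete the address from each
selected VSI, after mapping the pool number with ex_pool_to_vmdq_index().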

Signed-off-by: Chen Jing D(Mark) 
---
 lib/librte_pmd_i40e/i40e_ethdev.c |   89 +---
 1 files changed, 42 insertions(+), 47 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index ad65e25..c0e9f48 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -1532,45 +1532,37 @@ i40e_priority_flow_ctrl_set(__rte_unused struct rte_eth_dev *dev,
 static void
 i40e_macaddr_add(struct rte_eth_dev *dev,
 struct ether_addr *mac_addr,
-__attribute__((unused)) uint32_t index,
-__attribute__((unused)) uint32_t pool)
+__rte_unused uint32_t index,
+uint32_t pool)
 {
struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-   struct i40e_vsi *vsi = pf->main_vsi;
-   struct ether_addr old_mac;
+   struct i40e_vsi *vsi;
int ret;

-   if (!is_valid_assigned_ether_addr(mac_addr)) {
-   PMD_DRV_LOG(ERR, "Invalid ethernet address");
-   return;
-   }
-
-   if (is_same_ether_addr(mac_addr, &(pf->dev_addr))) {
-   PMD_DRV_LOG(INFO, "Ignore adding permanent mac address");
+   /* If VMDQ not enabled or configured, return */
+   if (pool != 0 && (!(pf->flags | I40E_FLAG_VMDQ) || 
!pf->nb_cfg_vmdq_vsi)) {
+   PMD_DRV_LOG(ERR, "VMDQ not %s, can't set mac to pool %u",
+   pf->flags | I40E_FLAG_VMDQ ? "configured" : "enabled",
+   pool);
return;
}

-   /* Write mac address */
-   ret = i40e_aq_mac_address_write(hw, I40E_AQC_WRITE_TYPE_LAA_ONLY,
-   mac_addr->addr_bytes, NULL);
-   if (ret != I40E_SUCCESS) {
-   PMD_DRV_LOG(ERR, "Failed to write mac address");
+   if (pool > pf->nb_cfg_vmdq_vsi) {
+   PMD_DRV_LOG(ERR, "Pool number %u invalid. Max pool is %u",
+   pool, pf->nb_cfg_vmdq_vsi);
return;
}

-   (void)rte_memcpy(&old_mac, hw->mac.addr, ETHER_ADDR_LEN);
-   (void)rte_memcpy(hw->mac.addr, mac_addr->addr_bytes,
-   ETHER_ADDR_LEN);
+   if (pool == 0)
+   vsi = pf->main_vsi;
+   else
+   vsi = pf->vmdq[pool - 1].vsi;

ret = i40e_vsi_add_mac(vsi, mac_addr);
if (ret != I40E_SUCCESS) {
PMD_DRV_LOG(ERR, "Failed to add MACVLAN filter");
return;
}
-
-   ether_addr_copy(mac_addr, &pf->dev_addr);
-   i40e_vsi_delete_mac(vsi, &old_mac);
 }

 /* Remove a MAC address, and update filters */
@@ -1578,36 +1570,39 @@ static void
 i40e_macaddr_remove(struct rte_eth_dev *dev, uint32_t index)
 {
struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct i40e_vsi *vsi = pf->main_vsi;
-   struct rte_eth_dev_data *data = I40E_VSI_TO_DEV_DATA(vsi);
+   struct i40e_vsi *vsi;
+   struct rte_eth_dev_data *data = dev->data;
struct ether_addr *macaddr;
int ret;
-   struct i40e_hw *hw =
-   I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-
-   if (index >= vsi->max_macaddrs)
-   return;
+   uint32_t i;
+   uint64_t pool_sel;

macaddr = &(data->mac_addrs[index]);
-   if (!is_valid_assigned_ether_addr(macaddr))
-   return;
-
-   ret = i40e_aq_mac_address_write(hw, I40E_AQC_WRITE_TYPE_LAA_ONLY,
-   hw->mac.perm_addr, NULL);
-   if (ret != I40E_SUCCESS) {
-   PMD_DRV_LOG(ERR, "Failed to write mac address");
-   return;
-   }
-
-   (void)rte_memcpy(hw->mac.addr, hw->mac.perm_addr, ETHER_ADDR_LEN);

-   ret = i40e_vsi_delete_mac(vsi, macaddr);
-   if (ret != I40E_SUCCESS)
-   return;
+   pool_sel = dev->data->mac_pool_sel[index];
+
+   for (i = 0; i < sizeof(pool_sel) * CHAR_BIT; i++) {
+   if (pool_sel & (1ULL << i)) {
+   if (i == 0)
+   vsi = pf->main_vsi;
+   else {
+   /* No VMDQ pool enabled or configured */
+   if (!(pf->flags & I40E_FLAG_VMDQ) ||
+   (i > pf->nb_cfg_vmdq_vsi)) {
+   PMD_DRV_LOG(ERR, "No VMDQ pool enabled"
+   "/configured");
+   return;
+   }
+   vsi = pf->vmdq[i - 1].vsi;
+   }
+
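
The VMDQ checks in this patch test a single bit in pf->flags. A minimal
sketch of why such a test relies on bitwise AND rather than OR follows; the
flag value and function names are hypothetical and chosen only for
illustration, they are not taken from the i40e driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit, used only for this illustration. */
#define EXAMPLE_FLAG_VMDQ (1ULL << 1)

/* OR-ing with a non-zero mask never yields zero, so this is always true. */
static bool flag_test_with_or(uint64_t flags)
{
	return (flags | EXAMPLE_FLAG_VMDQ) != 0;
}

/* AND isolates the bit, so this is true only when the bit is set. */
static bool flag_test_with_and(uint64_t flags)
{
	return (flags & EXAMPLE_FLAG_VMDQ) != 0;
}
```

With flags == 0 the OR form still reports the feature as enabled, while the
AND form reports it as disabled.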

[dpdk-dev] [PATCH v2 4/6] i40e: add VMDQ support

2014-10-16 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

The change includes several parts:
1. Get maximum number of VMDQ pools supported in dev_init.
2. Fill VMDQ info in i40e_dev_info_get.
3. Setup VMDQ pools in i40e_dev_configure.
4. i40e_vsi_setup change to support creation of VMDQ VSI.

Signed-off-by: Chen Jing D(Mark) 
---
 config/common_linuxapp|1 +
 lib/librte_pmd_i40e/i40e_ethdev.c |  237 -
 lib/librte_pmd_i40e/i40e_ethdev.h |   17 +++-
 3 files changed, 225 insertions(+), 30 deletions(-)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 5bee910..d0bb3f7 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -208,6 +208,7 @@ CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC=y
 CONFIG_RTE_LIBRTE_I40E_ALLOW_UNSUPPORTED_SFP=n
 CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n
 CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=4
+CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM=4
 # interval up to 8160 us, aligned to 2 (or default value)
 CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index a00d6ca..ad65e25 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -168,6 +168,7 @@ static int i40e_get_cap(struct i40e_hw *hw);
 static int i40e_pf_parameter_init(struct rte_eth_dev *dev);
 static int i40e_pf_setup(struct i40e_pf *pf);
 static int i40e_vsi_init(struct i40e_vsi *vsi);
+static int i40e_vmdq_setup(struct rte_eth_dev *dev);
 static void i40e_stat_update_32(struct i40e_hw *hw, uint32_t reg,
bool offset_loaded, uint64_t *offset, uint64_t *stat);
 static void i40e_stat_update_48(struct i40e_hw *hw,
@@ -269,21 +270,11 @@ static struct eth_driver rte_i40e_pmd = {
 };

 static inline int
-i40e_prev_power_of_2(int n)
+i40e_align_floor(int n)
 {
-   int p = n;
-
-   --p;
-   p |= p >> 1;
-   p |= p >> 2;
-   p |= p >> 4;
-   p |= p >> 8;
-   p |= p >> 16;
-   if (p == (n - 1))
-   return n;
-   p >>= 1;
-
-   return ++p;
+   if (n == 0)
+   return 0;
+   return (1 << (sizeof(n) * CHAR_BIT - 1 - __builtin_clz(n)));
 }

 static inline int
@@ -500,7 +491,7 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv,
if (!dev->data->mac_addrs) {
PMD_INIT_LOG(ERR, "Failed to allocated memory "
"for storing mac address");
-   goto err_get_mac_addr;
+   goto err_mac_alloc;
}
ether_addr_copy((struct ether_addr *)hw->mac.perm_addr,
&dev->data->mac_addrs[0]);
@@ -521,8 +512,9 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv,

return 0;

+err_mac_alloc:
+   i40e_vsi_release(pf->main_vsi);
 err_setup_pf_switch:
-   rte_free(pf->main_vsi);
 err_get_mac_addr:
 err_configure_lan_hmc:
(void)i40e_shutdown_lan_hmc(hw);
@@ -541,6 +533,27 @@ err_get_capabilities:
 static int
 i40e_dev_configure(struct rte_eth_dev *dev)
 {
+   int ret;
+   enum rte_eth_rx_mq_mode mq_mode = dev->data->dev_conf.rxmode.mq_mode;
+
+   /* VMDQ setup.
+*  Needs to move VMDQ setting out of i40e_pf_config_mq_rx() as VMDQ and
+*  RSS setting have different requirements.
+*  General PMD driver call sequence are NIC init, configure,
+*  rx/tx_queue_setup and dev_start. In rx/tx_queue_setup() function, it
+*  will try to lookup the VSI that specific queue belongs to if VMDQ
+*  applicable. So, VMDQ setting has to be done before
+*  rx/tx_queue_setup(). This function is good  to place vmdq_setup.
+*  For RSS setting, it will try to calculate actual configured RX queue
+*  number, which will be available after rx_queue_setup(). dev_start()
+*  function is good to place RSS setup.
+*/
+   if (mq_mode & ETH_MQ_RX_VMDQ_FLAG) {
+   ret = i40e_vmdq_setup(dev);
+   if (ret)
+   return ret;
+   }
+
return i40e_dev_init_vlan(dev);
 }

@@ -1389,6 +1402,16 @@ i40e_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
DEV_TX_OFFLOAD_UDP_CKSUM |
DEV_TX_OFFLOAD_TCP_CKSUM |
DEV_TX_OFFLOAD_SCTP_CKSUM;
+
+   if (pf->flags & I40E_FLAG_VMDQ) {
+   dev_info->max_vmdq_pools = pf->max_nb_vmdq_vsi;
+   dev_info->vmdq_queue_base = dev_info->max_rx_queues;
+   dev_info->vmdq_queue_num = pf->vmdq_nb_qps *
+   pf->max_nb_vmdq_vsi;
+   dev_info->vmdq_pool_base = I40E_VMDQ_POOL_BASE;
+   dev_info->max_rx_queues += dev_info->vmdq_queue_num;
+   dev_info->max_tx_queues += dev_info->vmdq_queue_num;
+   }
 }

 static int
@@ -1814,7 +1837,7 @@ i40e_pf_parameter_init(struct rte_eth_dev *dev)
 {
struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->d
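
The i40e_align_floor() helper introduced in this patch rounds a value down
to the nearest power of two by locating its highest set bit. A standalone
sketch of the same idea is shown here, assuming a GCC or Clang compiler for
__builtin_clz. The name example_align_floor and the handling of negative
input are assumptions for illustration, not part of the patch.

```c
#include <limits.h>

/*
 * Round n down to the nearest power of two.
 * __builtin_clz() is undefined for 0, so zero is handled separately.
 * Negative input also returns 0 here, which is an assumption of this sketch.
 */
static inline int example_align_floor(int n)
{
	if (n <= 0)
		return 0;
	/* Index of the highest set bit = (bit width - 1) - leading zeros. */
	return 1 << (sizeof(n) * CHAR_BIT - 1 - __builtin_clz((unsigned int)n));
}
```

For example, an input of 5 has its highest set bit at position 2, so the
result is 4. An exact power of two such as 8 is returned unchanged.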

[dpdk-dev] [PATCH v2 6/6] i40e: Add full VMDQ pools support

2014-10-16 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

1. Rename i40e_vsi_* functions to i40e_dev_*, since a PF can contain
   more than one VSI once VMDQ is enabled.
2. Change i40e_dev_rx/tx_queue_setup so that it can set up queues
   that belong to VMDQ pools.
3. Add queue mapping, which converts between the queue index used by
   the application and the real NIC queue index.
4. Change i40e_dev_start/stop so that they can switch VMDQ queues.
5. Change i40e_pf_config_rss to calculate the actual number of main VSI
   queues after VMDQ pools are introduced.

Signed-off-by: Chen Jing D(Mark) 
---
 lib/librte_pmd_i40e/i40e_ethdev.c |  175 ++---
 lib/librte_pmd_i40e/i40e_ethdev.h |4 +-
 lib/librte_pmd_i40e/i40e_rxtx.c   |  125 ++-
 3 files changed, 227 insertions(+), 77 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index c0e9f48..cf303d0 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -167,7 +167,7 @@ static int i40e_dev_rss_reta_query(struct rte_eth_dev *dev,
 static int i40e_get_cap(struct i40e_hw *hw);
 static int i40e_pf_parameter_init(struct rte_eth_dev *dev);
 static int i40e_pf_setup(struct i40e_pf *pf);
-static int i40e_vsi_init(struct i40e_vsi *vsi);
+static int i40e_dev_rxtx_init(struct i40e_pf *pf);
 static int i40e_vmdq_setup(struct rte_eth_dev *dev);
 static void i40e_stat_update_32(struct i40e_hw *hw, uint32_t reg,
bool offset_loaded, uint64_t *offset, uint64_t *stat);
@@ -770,8 +770,8 @@ i40e_dev_start(struct rte_eth_dev *dev)
 {
struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-   struct i40e_vsi *vsi = pf->main_vsi;
-   int ret;
+   struct i40e_vsi *main_vsi = pf->main_vsi;
+   int ret, i;

if ((dev->data->dev_conf.link_duplex != ETH_LINK_AUTONEG_DUPLEX) &&
(dev->data->dev_conf.link_duplex != ETH_LINK_FULL_DUPLEX)) {
@@ -782,26 +782,37 @@ i40e_dev_start(struct rte_eth_dev *dev)
}

/* Initialize VSI */
-   ret = i40e_vsi_init(vsi);
+   ret = i40e_dev_rxtx_init(pf);
if (ret != I40E_SUCCESS) {
-   PMD_DRV_LOG(ERR, "Failed to init VSI");
+   PMD_DRV_LOG(ERR, "Failed to init rx/tx queues");
goto err_up;
}

/* Map queues with MSIX interrupt */
-   i40e_vsi_queues_bind_intr(vsi);
-   i40e_vsi_enable_queues_intr(vsi);
+   i40e_vsi_queues_bind_intr(main_vsi);
+   i40e_vsi_enable_queues_intr(main_vsi);
+
+   /* Map VMDQ VSI queues with MSIX interrupt */
+   for (i = 0; i < pf->nb_cfg_vmdq_vsi; i++) {
+   i40e_vsi_queues_bind_intr(pf->vmdq[i].vsi);
+   i40e_vsi_enable_queues_intr(pf->vmdq[i].vsi);
+   }

/* Enable all queues which have been configured */
-   ret = i40e_vsi_switch_queues(vsi, TRUE);
+   ret = i40e_dev_switch_queues(pf, TRUE);
if (ret != I40E_SUCCESS) {
PMD_DRV_LOG(ERR, "Failed to enable VSI");
goto err_up;
}

/* Enable receiving broadcast packets */
-   if ((vsi->type == I40E_VSI_MAIN) || (vsi->type == I40E_VSI_VMDQ2)) {
-   ret = i40e_aq_set_vsi_broadcast(hw, vsi->seid, true, NULL);
+   ret = i40e_aq_set_vsi_broadcast(hw, main_vsi->seid, true, NULL);
+   if (ret != I40E_SUCCESS)
+   PMD_DRV_LOG(INFO, "fail to set vsi broadcast");
+
+   for (i = 0; i < pf->nb_cfg_vmdq_vsi; i++) {
+   ret = i40e_aq_set_vsi_broadcast(hw, pf->vmdq[i].vsi->seid,
+   true, NULL);
if (ret != I40E_SUCCESS)
PMD_DRV_LOG(INFO, "fail to set vsi broadcast");
}
@@ -816,7 +827,8 @@ i40e_dev_start(struct rte_eth_dev *dev)
return I40E_SUCCESS;

 err_up:
-   i40e_vsi_switch_queues(vsi, FALSE);
+   i40e_dev_switch_queues(pf, FALSE);
+   i40e_dev_clear_queues(dev);

return ret;
 }
@@ -825,17 +837,26 @@ static void
 i40e_dev_stop(struct rte_eth_dev *dev)
 {
struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct i40e_vsi *vsi = pf->main_vsi;
+   struct i40e_vsi *main_vsi = pf->main_vsi;
+   int i;

/* Disable all queues */
-   i40e_vsi_switch_queues(vsi, FALSE);
+   i40e_dev_switch_queues(pf, FALSE);
+
+   /* un-map queues with interrupt registers */
+   i40e_vsi_disable_queues_intr(main_vsi);
+   i40e_vsi_queues_unbind_intr(main_vsi);
+
+   for (i = 0; i < pf->nb_cfg_vmdq_vsi; i++) {
+   i40e_vsi_disable_queues_intr(pf->vmdq[i].vsi);
+   i40e_vsi_queues_unbind_intr(pf->vmdq[i].vsi);
+   }
+
+   /* Clear all queues and release memory */
+   i40e_dev_clear_queues(dev);

/* Set link down */
i40e_dev_set_link_down(dev);
-
-   /* un-map que

[dpdk-dev] [PATCH v5 2/8]i40e:support VxLAN packet identification in librte_pmd_i40e

2014-10-16 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of De Lara Guarch,
> Pablo
> Sent: Monday, October 13, 2014 5:13 PM
> To: Liu, Jijiang; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v5 2/8]i40e:support VxLAN packet
> identification in librte_pmd_i40e
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> > Sent: Saturday, October 11, 2014 6:55 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH v5 2/8]i40e:support VxLAN packet
> identification
> > in librte_pmd_i40e
> >
> > Support tunneling UDP port configuration on i40e in librte_pmd_i40e.
> > Currently, only VxLAN is implemented, which include
> >  -  VxLAN UDP port initialization
> >  -  Implement the APIs to configure VxLAN UDP port in librte_pmd_i40e.
> >
> > Signed-off-by: Jijiang Liu 
> > Acked-by: Helin Zhang 
> > Acked-by: Jingjing Wu 
> > Acked-by: Jing Chen 
> >

[...]

> > index 7c5b6a8..369bc3b 100644
> > --- a/lib/librte_pmd_i40e/i40e_rxtx.c
> > +++ b/lib/librte_pmd_i40e/i40e_rxtx.c
> > @@ -638,6 +638,10 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
> > pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
> > pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
> > mb->ol_flags = pkt_flags;
> > +
> > +   mb->packet_type = (uint16_t)((qword1 &
> > +   I40E_RXD_QW1_PTYPE_MASK) >>
> > +   I40E_RXD_QW1_PTYPE_SHIFT);
> > if (pkt_flags & PKT_RX_RSS_HASH)
> > mb->hash.rss = rte_le_to_cpu_32(\
> > rxdp->wb.qword0.hi_dword.rss);
> > @@ -873,6 +877,8 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf
> > **rx_pkts, uint16_t nb_pkts)
> > pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
> > pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
> > pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
> > +   rxm->packet_type = (uint16_t)((qword1 &
> > I40E_RXD_QW1_PTYPE_MASK) >>
> > +   I40E_RXD_QW1_PTYPE_SHIFT);
> > rxm->ol_flags = pkt_flags;
> > if (pkt_flags & PKT_RX_RSS_HASH)
> > rxm->hash.rss =
> > @@ -1027,6 +1033,9 @@ i40e_recv_scattered_pkts(void *rx_queue,
> > pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
> > pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
> > pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
> > +   first_seg->packet_type = (uint8_t)((qword1 &
> > +   I40E_RXD_QW1_PTYPE_MASK) >>
> > +   I40E_RXD_QW1_PTYPE_SHIFT);

Another comment is that packet_type is uint16_t, so you should change that 
uint8_t to uint16_t.

Thanks!
> > first_seg->ol_flags = pkt_flags;
> > if (pkt_flags & PKT_RX_RSS_HASH)
> > rxm->hash.rss =
> > --
> > 1.7.7.6
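
The review comment above asks for the cast to match the 16-bit packet_type
field. A minimal sketch of what a narrower cast would do if the extracted
field were wider than 8 bits follows. The mask, shift and function names
are hypothetical and are not the real I40E_RXD_QW1_PTYPE_* values; the
thread does not state how wide the real field is.

```c
#include <stdint.h>

/* Hypothetical 9-bit field at bit 30; not the real i40e definitions. */
#define EXAMPLE_PTYPE_SHIFT 30
#define EXAMPLE_PTYPE_MASK  (0x1FFULL << EXAMPLE_PTYPE_SHIFT)

/* Cast matches the destination width, so no bits are lost. */
static uint16_t ptype_cast_u16(uint64_t qword1)
{
	return (uint16_t)((qword1 & EXAMPLE_PTYPE_MASK) >> EXAMPLE_PTYPE_SHIFT);
}

/* Cast to uint8_t drops everything above bit 7 before the assignment. */
static uint16_t ptype_cast_u8(uint64_t qword1)
{
	return (uint8_t)((qword1 & EXAMPLE_PTYPE_MASK) >> EXAMPLE_PTYPE_SHIFT);
}
```

Using the same cast in all three receive paths also keeps them consistent,
whatever the actual field width is.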



[dpdk-dev] [PATCH v2 5/7] Split spinlock operations to architecture specific

2014-10-16 Thread Chao Zhu
This patch splits the spinlock operations out of the common DPDK code and
moves them into architecture-specific directories, so that DPDK can be
ported to other processor architectures more easily.

Signed-off-by: Chao Zhu 
---
 lib/librte_eal/common/Makefile |4 +-
 .../common/include/arch/i686/rte_spinlock.h|  180 ++
 .../common/include/arch/x86_64/rte_spinlock.h  |  180 ++
 .../common/include/generic/rte_spinlock.h  |  169 +
 lib/librte_eal/common/include/rte_spinlock.h   |  258 
 5 files changed, 531 insertions(+), 260 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/rte_spinlock.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 6cf7505..9b9a73d 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -35,7 +35,7 @@ INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
-INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
+INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
 INC += rte_eal_memconfig.h rte_malloc_heap.h
 INC += rte_hexdump.h rte_devargs.h rte_dev.h
@@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif

-GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
+GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_spinlock.h
 ARCH_INC := $(GENERIC_INC) rte_prefetch.h

 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_spinlock.h b/lib/librte_eal/common/include/arch/i686/rte_spinlock.h
new file mode 100644
index 000..f61e31c
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_spinlock.h
@@ -0,0 +1,180 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_I686_H_
+#define _RTE_SPINLOCK_I686_H_
+
+/**
+ * @file
+ *
+ * RTE Spinlocks
+ *
+ * This file defines an API for spinlocks, which are implemented
+ * in an architecture-specific way. This kind of lock simply waits in
+ * a loop repeatedly checking until the lock becomes available.
+ *
+ * All locks must be initialised before use, and only initialised once.
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_spinlock.h"
+
+#ifndef RTE_FORCE_INTRINSICS
+/**
+ * Take the spinlock.
+ *
+ * @param sl
+ *   A pointer to the spinlock.
+ */
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+   int lock_val = 1;
+   asm volatile (
+   "1:\n"
+   "xchg %[locked], %[lv]\n"
+   "test %[lv], %[lv]\n"
+   "jz 3f\n"
+   "2:\n"
+   "pause\n"
+   "cmpl $0, %[locked]\n"
+  
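
The assembly listing above is cut off in the archive. The visible part
(xchg, test, then a pause/cmpl wait loop) follows a test-and-test-and-set
pattern. A portable sketch of that pattern using C11 atomics is shown here.
It is not the DPDK implementation, all example_* names are hypothetical,
and it omits the x86 pause hint.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
	atomic_int locked; /* 0 = free, 1 = held */
} example_spinlock_t;

static inline void example_spinlock_init(example_spinlock_t *sl)
{
	atomic_init(&sl->locked, 0);
}

static inline void example_spinlock_lock(example_spinlock_t *sl)
{
	/* The atomic exchange plays the role of the xchg instruction. */
	while (atomic_exchange_explicit(&sl->locked, 1, memory_order_acquire)) {
		/* Wait on plain loads until the lock looks free, then retry. */
		while (atomic_load_explicit(&sl->locked, memory_order_relaxed))
			;
	}
}

/* Returns true if the lock was taken, false if it was already held. */
static inline bool example_spinlock_trylock(example_spinlock_t *sl)
{
	return atomic_exchange_explicit(&sl->locked, 1, memory_order_acquire) == 0;
}

static inline void example_spinlock_unlock(example_spinlock_t *sl)
{
	atomic_store_explicit(&sl->locked, 0, memory_order_release);
}
```

Spinning on a load rather than on the exchange keeps the cache line shared
while waiting, which is the usual reason for the two-level loop.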

[dpdk-dev] [PATCH v2 1/7] Split atomic operations to architecture specific

2014-10-16 Thread Chao Zhu
This patch first adds architecture-specific directories to the EAL header
directory, then splits the atomic operations into architecture-specific and
generic files. Architecture-specific files go into the corresponding
architecture directory, and common headers go into the generic directory.

Signed-off-by: Chao Zhu 
---
 lib/librte_eal/common/Makefile |   11 +-
 .../common/include/arch/i686/rte_atomic.h  |  669 
 .../common/include/arch/x86_64/rte_atomic.h|  631 +++
 lib/librte_eal/common/include/generic/rte_atomic.h |  795 ++
 .../common/include/i686/arch/rte_atomic.h  |  373 ---
 lib/librte_eal/common/include/rte_atomic.h | 1133 
 .../common/include/x86_64/arch/rte_atomic.h|  335 --
 7 files changed, 2102 insertions(+), 1845 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 7f27966..8ab363b 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -31,7 +31,7 @@

 include $(RTE_SDK)/mk/rte.vars.mk

-INC := rte_atomic.h rte_branch_prediction.h rte_byteorder.h rte_common.h
+INC := rte_branch_prediction.h rte_byteorder.h rte_common.h
 INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
@@ -46,11 +46,14 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif

-ARCH_INC := rte_atomic.h
+GENERIC_INC := rte_atomic.h
+ARCH_INC := $(GENERIC_INC)

 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
-SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include/arch := \
-   $(addprefix include/$(RTE_ARCH)/arch/,$(ARCH_INC))
+SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
+   $(addprefix include/arch/$(RTE_ARCH)/,$(ARCH_INC))
+SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include/generic := \
+   $(addprefix include/generic/,$(GENERIC_INC))

 # add libc if configured
 DEPDIRS-$(CONFIG_RTE_LIBC) += lib/libc
diff --git a/lib/librte_eal/common/include/arch/i686/rte_atomic.h b/lib/librte_eal/common/include/arch/i686/rte_atomic.h
new file mode 100644
index 000..67efb19
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_atomic.h
@@ -0,0 +1,669 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * Inspired from FreeBSD src/sys/i386/include/atomic.h
+ * Copyright (c) 1998 Doug Rabson
+ * All rights reserved.
+ */
+
+#ifndef _RTE_ATOMIC_I686_H_
+#define _RTE_ATOMIC_I686_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include 
+#include "generic/rte_atomic.h"
+
+/**
+ * @file
+ * Atomic Operations on i686
+ */
+
+#if RTE_MAX_LCORE == 1
+#define MPLOCKED                /**< No need to insert MP lock prefix. */
+#else
+#define MPLOCKED"loc

[dpdk-dev] [PATCH v2 4/7] Split prefetch operations to architecture specific

2014-10-16 Thread Chao Zhu
This patch splits the prefetch operations out of the common DPDK code and
moves them into architecture-specific directories, so that other processor
architectures supporting DPDK can implement their own functions.

Signed-off-by: Chao Zhu 
---
 lib/librte_eal/common/Makefile |4 +-
 .../common/include/arch/i686/rte_prefetch.h|   88 
 .../common/include/arch/x86_64/rte_prefetch.h  |   88 
 lib/librte_eal/common/include/rte_prefetch.h   |   88 
 4 files changed, 178 insertions(+), 90 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
 delete mode 100644 lib/librte_eal/common/include/rte_prefetch.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index c6aedf9..6cf7505 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -34,7 +34,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
-INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
+INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
 INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
 INC += rte_eal_memconfig.h rte_malloc_heap.h
@@ -47,7 +47,7 @@ INC += rte_warnings.h
 endif

 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
-ARCH_INC := $(GENERIC_INC)
+ARCH_INC := $(GENERIC_INC) rte_prefetch.h

 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
diff --git a/lib/librte_eal/common/include/arch/i686/rte_prefetch.h b/lib/librte_eal/common/include/arch/i686/rte_prefetch.h
new file mode 100644
index 000..2625512
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_prefetch.h
@@ -0,0 +1,88 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_I686_H_
+#define _RTE_PREFETCH_I686_H_
+
+/**
+ * @file
+ *
+ * Prefetch operations.
+ *
+ * This file defines an API for prefetch macros / inline-functions,
+ * which are architecture-dependent. Prefetching occurs when a
+ * processor requests an instruction or data from memory to cache
+ * before it is actually needed, potentially speeding up the execution of the
+ * program.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Prefetch a cache line into all cache levels.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch0(volatile void *p)
+{
+   asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+/**
+ * Prefetch a cache line into all cache levels except the 0th cache level.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch1(volatile void *p)
+{
+   asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p));
+}
+
+/**
+ * Prefetch a cache line into all cache levels except the 0th and 1st cache
+ * levels.
+ * @param p
+ *   Address to prefetch
+ */
+static inline void rte_prefetch2(volatile void *p)
+{
+   asm volatile ("prefetcht2 %[p]
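
The x86 prefetch helpers above are written with inline assembly. For an
architecture without such a header yet, one possible starting point is the
GCC/Clang __builtin_prefetch builtin, sketched here. The names
example_prefetch0 and example_sum, and the prefetch distance of 16
elements, are assumptions for illustration only and are not part of this
patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Read prefetch with high temporal locality (keep in all cache levels). */
static inline void example_prefetch0(const void *p)
{
	__builtin_prefetch(p, 0, 3);
}

/* Sum an array while hinting the element that will be needed a bit later. */
static uint64_t example_sum(const uint32_t *v, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		if (i + 16 < n)
			example_prefetch0(&v[i + 16]);
		sum += v[i];
	}
	return sum;
}
```

A prefetch is only a hint, so the result of example_sum() is the same with
or without it; only the timing may differ.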

[dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK

2014-10-16 Thread Chao Zhu
This patch set splits the x86-specific operations out of DPDK and puts them
into the arch directories for i686 and x86_64. This makes it much easier to
adopt DPDK on other computer architectures: for a new architecture, just add
an architecture-specific directory and the necessary build configuration
files, and the DPDK EAL library can support it.
This is an upgraded version of the former patch.

Chao Zhu (7):
  Split atomic operations to architecture specific
  Split byte order operations to architecture specific
  Split CPU cycle operation to architecture specific
  Split prefetch operations to architecture specific
  Split spinlock operations to architecture specific
  Split memcpy operation to architecture specific
  Split CPU flags operations to architecture specific

 lib/librte_eal/common/Makefile |   21 +-
 lib/librte_eal/common/eal_common_cpuflags.c|  190 
 .../common/include/arch/i686/rte_atomic.h  |  669 
 .../common/include/arch/i686/rte_byteorder.h   |  194 
 .../common/include/arch/i686/rte_cpuflags.h|  364 +++
 .../common/include/arch/i686/rte_cycles.h  |  158 +++
 .../common/include/arch/i686/rte_memcpy.h  |  376 +++
 .../common/include/arch/i686/rte_prefetch.h|   88 ++
 .../common/include/arch/i686/rte_spinlock.h|  180 
 .../common/include/arch/x86_64/rte_atomic.h|  631 +++
 .../common/include/arch/x86_64/rte_byteorder.h |  195 
 .../common/include/arch/x86_64/rte_cpuflags.h  |  364 +++
 .../common/include/arch/x86_64/rte_cycles.h|  158 +++
 .../common/include/arch/x86_64/rte_memcpy.h|  376 +++
 .../common/include/arch/x86_64/rte_prefetch.h  |   88 ++
 .../common/include/arch/x86_64/rte_spinlock.h  |  180 
 lib/librte_eal/common/include/generic/rte_atomic.h |  795 ++
 .../common/include/generic/rte_byteorder.h |  124 +++
 lib/librte_eal/common/include/generic/rte_cycles.h |  190 
 .../common/include/generic/rte_spinlock.h  |  169 +++
 .../common/include/i686/arch/rte_atomic.h  |  373 ---
 lib/librte_eal/common/include/rte_atomic.h | 1133 
 lib/librte_eal/common/include/rte_byteorder.h  |  270 -
 lib/librte_eal/common/include/rte_cpuflags.h   |  182 
 lib/librte_eal/common/include/rte_cycles.h |  266 -
 lib/librte_eal/common/include/rte_memcpy.h |  376 ---
 lib/librte_eal/common/include/rte_prefetch.h   |   88 --
 lib/librte_eal/common/include/rte_spinlock.h   |  258 -
 .../common/include/x86_64/arch/rte_atomic.h|  335 --
 29 files changed, 5311 insertions(+), 3480 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_atomic.h
 delete mode 100644 lib/librte_eal/common/include/rte_byteorder.h
 delete mode 100644 lib/librte_eal/common/include/rte_cpuflags.h
 delete mode 100644 lib/librte_eal/common/include/rte_cycles.h
 delete mode 100644 lib/librte_eal/common/include/rte_memcpy.h
 delete mode 100644 lib/librte_eal/common/include/rte_prefetch.h
 delete mode 100644 lib/librte_eal/common/include/rte_spinlock.h
 delete mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic.h



[dpdk-dev] [PATCH v2 2/7] Split byte order operations to architecture specific

2014-10-16 Thread Chao Zhu
This patch splits the byte order operations out of the common DPDK code and
moves them into architecture-specific directories, so that DPDK can be
ported to other processor architectures more easily.

Signed-off-by: Chao Zhu 
---
 lib/librte_eal/common/Makefile |4 +-
 .../common/include/arch/i686/rte_byteorder.h   |  194 ++
 .../common/include/arch/x86_64/rte_byteorder.h |  195 ++
 .../common/include/generic/rte_byteorder.h |  124 +
 lib/librte_eal/common/include/rte_byteorder.h  |  270 
 5 files changed, 515 insertions(+), 272 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_byteorder.h
 delete mode 100644 lib/librte_eal/common/include/rte_byteorder.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 8ab363b..62a39cd 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -31,7 +31,7 @@

 include $(RTE_SDK)/mk/rte.vars.mk

-INC := rte_branch_prediction.h rte_byteorder.h rte_common.h
+INC := rte_branch_prediction.h rte_common.h
 INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
@@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif

-GENERIC_INC := rte_atomic.h
+GENERIC_INC := rte_atomic.h rte_byteorder.h
 ARCH_INC := $(GENERIC_INC)

 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_byteorder.h b/lib/librte_eal/common/include/arch/i686/rte_byteorder.h
new file mode 100644
index 000..de5cc83
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_byteorder.h
@@ -0,0 +1,194 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_I686_H_
+#define _RTE_BYTEORDER_I686_H_
+
+/**
+ * @file
+ *
+ * Byte Swap Operations
+ *
+ * This file defines a architecture specific API for byte swap operations. 
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+   register uint16_t x = _x;
+   asm volatile ("xchgb %b[x1],%h[x2]"
+ : [x1] "=Q" (x)
+ : [x2] "0" (x)
+ );
+   return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+   register uint32_t x = _x;
+   asm volatile ("bswap %[x]"
+ : [x] "+r" (x)
+ );
+   return x;
+} 
+ 
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+  * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* Compat./Leg. mode */
+static in
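
For comparison with the x86 assembly versions above, here is a minimal sketch
of what a portable, shift-based fallback could look like. The demo_bswap*
names are hypothetical and this is not the content of
generic/rte_byteorder.h; it only illustrates the operation that the
architecture-specific functions are optimizing.

```c
#include <stdint.h>

/* Hypothetical portable byte swaps; names are illustrative, not DPDK API. */

/* Swap the two bytes of a 16-bit value. */
static inline uint16_t demo_bswap16(uint16_t x)
{
	return (uint16_t)((x >> 8) | (x << 8));
}

/* Reverse the four bytes of a 32-bit value. */
static inline uint32_t demo_bswap32(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) |
	       ((x & 0x0000ff00u) << 8)  |
	       ((x & 0x00ff0000u) >> 8)  |
	       ((x & 0xff000000u) >> 24);
}

/* Reverse the eight bytes of a 64-bit value using two 32-bit swaps. */
static inline uint64_t demo_bswap64(uint64_t x)
{
	return ((uint64_t)demo_bswap32((uint32_t)x) << 32) |
	       (uint64_t)demo_bswap32((uint32_t)(x >> 32));
}
```

An architecture without a dedicated swap instruction could start from
something like this and replace it later with an optimized version.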

[dpdk-dev] [PATCH v2 3/7] Split CPU cycle operation to architecture specific

2014-10-16 Thread Chao Zhu
This patch splits the CPU TSC read operations out of the common DPDK code and
moves them into architecture-specific arch directories, so that processors
that don't have a TSC register can implement their own functions.

Signed-off-by: Chao Zhu 
---
 lib/librte_eal/common/Makefile |4 +-
 .../common/include/arch/i686/rte_cycles.h  |  158 
 .../common/include/arch/x86_64/rte_cycles.h|  158 
 lib/librte_eal/common/include/generic/rte_cycles.h |  190 ++
 lib/librte_eal/common/include/rte_cycles.h |  266 
 5 files changed, 508 insertions(+), 268 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_cycles.h
 delete mode 100644 lib/librte_eal/common/include/rte_cycles.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 62a39cd..c6aedf9 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -32,7 +32,7 @@
 include $(RTE_SDK)/mk/rte.vars.mk

 INC := rte_branch_prediction.h rte_common.h
-INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
+INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h
 INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
@@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
 INC += rte_warnings.h
 endif

-GENERIC_INC := rte_atomic.h rte_byteorder.h
+GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h
 ARCH_INC := $(GENERIC_INC)

 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
diff --git a/lib/librte_eal/common/include/arch/i686/rte_cycles.h b/lib/librte_eal/common/include/arch/i686/rte_cycles.h
new file mode 100644
index 000..a813e9b
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_cycles.h
@@ -0,0 +1,158 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+/*   BSD LICENSE
+ *
+ *   Copyright(c) 2013 6WIND.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of 6WIND S.A. nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CON

[dpdk-dev] [PATCH v2 7/7] Split CPU flags operations to architecture specific

2014-10-16 Thread Chao Zhu
This patch splits the CPU flags related operations out of the common DPDK
code and moves them into architecture-specific arch directories, so that
other processor architectures can implement their own CPU flag functions to
support DPDK.

Signed-off-by: Chao Zhu 
---
 lib/librte_eal/common/Makefile |4 +-
 lib/librte_eal/common/eal_common_cpuflags.c|  190 --
 .../common/include/arch/i686/rte_cpuflags.h|  364 
 .../common/include/arch/x86_64/rte_cpuflags.h  |  364 
 lib/librte_eal/common/include/rte_cpuflags.h   |  182 --
 5 files changed, 730 insertions(+), 374 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h
 delete mode 100644 lib/librte_eal/common/include/rte_cpuflags.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index e09d509..79f378e 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -36,7 +36,7 @@ INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
 INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
 INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
-INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
+INC += rte_string_fns.h rte_version.h rte_tailq_elem.h
 INC += rte_eal_memconfig.h rte_malloc_heap.h
 INC += rte_hexdump.h rte_devargs.h rte_dev.h
 INC += rte_common_vect.h
@@ -47,7 +47,7 @@ INC += rte_warnings.h
 endif

 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_spinlock.h
-ARCH_INC := $(GENERIC_INC) rte_prefetch.h rte_memcpy.h
+ARCH_INC := $(GENERIC_INC) rte_prefetch.h rte_memcpy.h rte_cpuflags.h

 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
index 9e79179..6fd360c 100644
--- a/lib/librte_eal/common/eal_common_cpuflags.c
+++ b/lib/librte_eal/common/eal_common_cpuflags.c
@@ -30,10 +30,6 @@
  *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
-#include 
-#include 
-#include 
-#include 
 #include 

 /*
@@ -50,192 +46,6 @@
 #endif

 /**
- * Enumeration of CPU registers
- */
-enum cpu_register_t {
-   REG_EAX = 0,
-   REG_EBX,
-   REG_ECX,
-   REG_EDX,
-};
-
-typedef uint32_t cpuid_registers_t[4];
-
-#define CPU_FLAG_NAME_MAX_LEN 64
-
-/**
- * Struct to hold a processor feature entry
- */
-struct feature_entry {
-   uint32_t leaf;  /**< cpuid leaf */
-   uint32_t subleaf;   /**< cpuid subleaf */
-   uint32_t reg;   /**< cpuid register */
-   uint32_t bit;   /**< cpuid register bit */
-   char name[CPU_FLAG_NAME_MAX_LEN];   /**< String for printing */
-};
-
-#define FEAT_DEF(name, leaf, subleaf, reg, bit) \
-   [RTE_CPUFLAG_##name] = {leaf, subleaf, reg, bit, #name },
-
-/**
- * An array that holds feature entries
- */
-static const struct feature_entry cpu_feature_table[] = {
-   FEAT_DEF(SSE3, 0x0001, 0, REG_ECX,  0)
-   FEAT_DEF(PCLMULQDQ, 0x0001, 0, REG_ECX,  1)
-   FEAT_DEF(DTES64, 0x0001, 0, REG_ECX,  2)
-   FEAT_DEF(MONITOR, 0x0001, 0, REG_ECX,  3)
-   FEAT_DEF(DS_CPL, 0x0001, 0, REG_ECX,  4)
-   FEAT_DEF(VMX, 0x0001, 0, REG_ECX,  5)
-   FEAT_DEF(SMX, 0x0001, 0, REG_ECX,  6)
-   FEAT_DEF(EIST, 0x0001, 0, REG_ECX,  7)
-   FEAT_DEF(TM2, 0x0001, 0, REG_ECX,  8)
-   FEAT_DEF(SSSE3, 0x0001, 0, REG_ECX,  9)
-   FEAT_DEF(CNXT_ID, 0x0001, 0, REG_ECX, 10)
-   FEAT_DEF(FMA, 0x0001, 0, REG_ECX, 12)
-   FEAT_DEF(CMPXCHG16B, 0x0001, 0, REG_ECX, 13)
-   FEAT_DEF(XTPR, 0x0001, 0, REG_ECX, 14)
-   FEAT_DEF(PDCM, 0x0001, 0, REG_ECX, 15)
-   FEAT_DEF(PCID, 0x0001, 0, REG_ECX, 17)
-   FEAT_DEF(DCA, 0x0001, 0, REG_ECX, 18)
-   FEAT_DEF(SSE4_1, 0x0001, 0, REG_ECX, 19)
-   FEAT_DEF(SSE4_2, 0x0001, 0, REG_ECX, 20)
-   FEAT_DEF(X2APIC, 0x0001, 0, REG_ECX, 21)
-   FEAT_DEF(MOVBE, 0x0001, 0, REG_ECX, 22)
-   FEAT_DEF(POPCNT, 0x0001, 0, REG_ECX, 23)
-   FEAT_DEF(TSC_DEADLINE, 0x0001, 0, REG_ECX, 24)
-   FEAT_DEF(AES, 0x0001, 0, REG_ECX, 25)
-   FEAT_DEF(XSAVE, 0x0001, 0, REG_ECX, 26)
-   FEAT_DEF(OSXSAVE, 0x0001, 0, REG_ECX, 27)
-   FEAT_DEF(AVX, 0x0001, 0, REG_ECX, 28)
-   FEAT_DEF(F16C, 0x0001, 0, REG_ECX, 29)
-   FEAT_DEF(RDRAND, 0x0001, 0, REG_ECX, 30)
-
-   FEAT_DEF(FPU, 0x0001, 0, REG_EDX,  0)
-   FEAT_DEF(VME, 0x0001, 0, REG_EDX,  1)
-   FEAT_DEF(DE, 0x0001, 0, REG_EDX,  2)
-   FEAT

[dpdk-dev] [PATCH v2 6/7] Split memcpy operation to architecture specific

2014-10-16 Thread Chao Zhu
This patch splits the SSE-based memory copy functions out of the common DPDK
code and moves them into architecture-specific arch directories. Other
processor architectures can implement their own vector-based memory copy
functions.

Signed-off-by: Chao Zhu 
---
 lib/librte_eal/common/Makefile |4 +-
 .../common/include/arch/i686/rte_memcpy.h  |  376 
 .../common/include/arch/x86_64/rte_memcpy.h|  376 
 lib/librte_eal/common/include/rte_memcpy.h |  376 
 4 files changed, 754 insertions(+), 378 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/i686/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h
 delete mode 100644 lib/librte_eal/common/include/rte_memcpy.h

diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 9b9a73d..e09d509 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -33,7 +33,7 @@ include $(RTE_SDK)/mk/rte.vars.mk

 INC := rte_branch_prediction.h rte_common.h
 INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h
-INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h
+INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h
 INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h
 INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h
 INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h
@@ -47,7 +47,7 @@ INC += rte_warnings.h
 endif

 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_spinlock.h
-ARCH_INC := $(GENERIC_INC) rte_prefetch.h
+ARCH_INC := $(GENERIC_INC) rte_prefetch.h rte_memcpy.h

 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC))
 SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \
diff --git a/lib/librte_eal/common/include/arch/i686/rte_memcpy.h b/lib/librte_eal/common/include/arch/i686/rte_memcpy.h
new file mode 100644
index 000..ba750b1
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/i686/rte_memcpy.h
@@ -0,0 +1,376 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_I686_H_
+#define _RTE_MEMCPY_I686_H_
+
+/**
+ * @file
+ *
+ * Functions for SSE implementation of memcpy().
+ */
+
+#include 
+#include 
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifdef __INTEL_COMPILER
+#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */
+#endif
+
+/**
+ * Copy 16 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+   __m128i reg_a;
+   asm volatile (
+   "movdqu (%[src]), %[reg_a]\n\t"
+   "movdqu %[reg_a], (%[dst])\n\t"
+   : [reg_a] "=x" (reg_a)
+   : [src] "r" (src),
+ [dst] "r"(dst)
+   : "memory"
+   );
+}
+
+/**
+ * Copy 32 bytes from one location to another using optimised SSE
+ * instructions. The locations should not overlap.
+ *
+ * @param dst
+ *   Pointer to the destination of the data.
+ * @param src
+ *   Pointer to the source data.
+ */
+static inline void
+rte_mov32(uint8_

[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture

2014-10-16 Thread Ananyev, Konstantin


> 
> 
> From: Chao CH Zhu [mailto:bjzhuc at cn.ibm.com]
> Sent: Thursday, October 16, 2014 4:14 AM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power 
> architecture
> 
> Konstantin,
> 
> In my understanding, compiler barrier is a kind of software barrier which 
> prevents the compiler from moving memory accesses across
> the barrier.

Yes, compiler_barrier() right now only guarantees that the compiler wouldn't 
reorder instructions across it while emitting the code.

> This should be architecture-independent. And the "sync" instruction is a 
> hardware barrier which depends on PowerPC
> architecture.

I understand what "sync" does.

>So I think the compiler barrier should be the same on x86 and PowerPC. Any 
>comments? Please correct me if I was
> wrong.

The thing is that the current DPDK code will not work correctly on a system
with weak memory ordering: IA has quite a strict memory ordering model, and
there is code inside DPDK that relies on the CPU following that model.
For such places in the code, a compiler barrier is enough for IA but is not
enough for PPC.
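
To make the ordering concern concrete, here is a minimal sketch of a
single-producer/single-consumer ring written with C11 fences instead of
DPDK's own primitives. All names (demo_ring, demo_enqueue, demo_dequeue) are
hypothetical and this is not the rte_ring code. The release fence between the
payload store and the tail update marks the spot where IA needs nothing more
than a compiler barrier, while a weakly ordered CPU such as PPC needs a real
barrier instruction there.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u /* hypothetical size, must be a power of two */

struct demo_ring {
	uint32_t slots[DEMO_RING_SIZE];
	_Atomic uint32_t head; /* advanced by the consumer */
	_Atomic uint32_t tail; /* advanced by the producer */
};

/* Single-producer enqueue: returns 0 on success, -1 if the ring is full. */
static int demo_enqueue(struct demo_ring *r, uint32_t v)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (tail - head == DEMO_RING_SIZE)
		return -1;
	r->slots[tail & (DEMO_RING_SIZE - 1)] = v;
	/* The payload store must become visible before the new tail does.
	 * A strongly ordered CPU only needs the compiler to keep this order;
	 * a weakly ordered CPU needs a hardware barrier here as well. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->tail, tail + 1, memory_order_relaxed);
	return 0;
}

/* Single-consumer dequeue: returns 0 on success, -1 if the ring is empty. */
static int demo_dequeue(struct demo_ring *r, uint32_t *v)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);

	if (head == tail)
		return -1;
	/* Pairs with the producer's release fence: the payload read below
	 * must not be satisfied before the tail value read above. */
	atomic_thread_fence(memory_order_acquire);
	*v = r->slots[head & (DEMO_RING_SIZE - 1)];
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return 0;
}
```

If the fence in demo_enqueue() were only a compiler barrier, a consumer on a
weakly ordered machine could observe the new tail and still read a stale
slot.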

Are you worried about the names here, i.e. that the compiler barrier would
become a HW one? :)
In that case, what you can probably do:
Create a new architecture-dependent macro: rte_barrier().
That would expand into rte_compiler_barrier() for IA and into rte_mb() for PPC.
Go through all references to rte_compiler_barrier() inside DPDK and replace
them with rte_barrier().

Konstantin

> 
> Thanks a lot!
> 
> Best Regards!
> --
> Chao Zhu
> 
> 
> 
> 
> From: "Ananyev, Konstantin" 
> To: Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" 
> Date: 2014/10/16 08:38
> Subject: RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM 
> Power architecture
> 
> 
> 
> 
> 
> Hi,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu
> > Sent: Friday, September 26, 2014 10:36 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power 
> > architecture
> >
> > The atomic operations implemented with assembly code in DPDK only
> > support x86. This patch add architecture specific atomic operations for
> > IBM Power architecture.
> >
> > Signed-off-by: Chao Zhu 
> > ---
> >  .../common/include/powerpc/arch/rte_atomic.h       |  387 
> > 
> >  .../common/include/powerpc/arch/rte_atomic_arch.h  |  318 
> >  2 files changed, 705 insertions(+), 0 deletions(-)
> >  create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
> >  create mode 100644 
> > lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> >
> ...
> > +
> > diff --git a/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> > b/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> > new file mode 100644
> > index 000..fe5666e
> > --- /dev/null
> > +
> ...
> >+#define   rte_arch_rmb() asm volatile("sync" : : : "memory")
> >+
> > +#define  rte_arch_compiler_barrier() do {\
> > + asm volatile ("" : : : "memory");   \
> > +} while(0)
> 
> I don't know much about PPC architecture, but as I remember it uses a 
> weakly-ordering memory model.
> Is that correct?
> If so, then you probably need rte_arch_compiler_barrier() to be "sync" 
> instruction (like mb()s above) .
> The reason is that IA has much stronger memory ordering model and there are a 
> lot of places in the code where it implies
> that ordering.
> For example - ring enqueue/dequeue functions.
> 
> Konstantin



[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture

2014-10-16 Thread Ananyev, Konstantin


> -Original Message-
> From: Richardson, Bruce
> Sent: Thursday, October 16, 2014 10:43 AM
> To: Chao CH Zhu; Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power 
> architecture
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao CH Zhu
> > Sent: Thursday, October 16, 2014 4:14 AM
> > To: Ananyev, Konstantin
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power
> > architecture
> >
> > Konstantin,
> >
> > In my understanding, compiler barrier is a kind of software barrier which
> > prevents the compiler from moving memory accesses across the barrier. This
> > should be architecture-independent. And the "sync" instruction is a
> > hardware barrier which depends on PowerPC architecture. So I think the
> > compiler barrier should be the same on x86 and PowerPC. Any comments?
> > Please correct me if I was wrong.
> >
> I would agree with that assessment, as far as it goes, in that a compiler 
> barrier is going to be the same on both architectures. However,
> we also need to start thinking about actual use cases - how do we specify the 
> barriers in a piece of code where we need a full memory
> barrier on PPC and only a compiler barrier on IA?
> My suggestion would be to do first as you propose and have proper primitives 
> for the different barrier types defined correctly for
> each platform - with the compiler barrier being, presumably, common across 
> each one. Then, as a second step, we probably need to
> look at defining "logical" barrier types (for want of a better term) that can 
> then be used in the code and which would be different
> across platforms.

Yeah, as I said in the other mail, what we can probably do:

Create a new architecture-dependent macro: rte_barrier().
That would expand into rte_compiler_barrier() for IA and into rte_mb() for PPC.
Go through all references to rte_compiler_barrier() inside DPDK and replace
them with rte_barrier().
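
A rough sketch of how such a macro could be laid out, assuming GCC-style
inline assembly and builtins. The demo_ names are hypothetical stand-ins for
rte_compiler_barrier(), rte_mb() and the proposed rte_barrier(); treating
every non-x86 target as needing a full barrier is also an assumption made
only to keep the example short.

```c
#include <stdint.h>

/* Compiler-only barrier: identical on every architecture. */
#define demo_compiler_barrier() __asm__ volatile ("" : : : "memory")

/* Full hardware memory barrier (stand-in for rte_mb()). */
#if defined(__powerpc__) || defined(__powerpc64__)
#define demo_mb() __asm__ volatile ("sync" : : : "memory")
#else
#define demo_mb() __sync_synchronize()
#endif

/* Proposed "logical" barrier: compiler-only on IA, full barrier elsewhere. */
#if defined(__i386__) || defined(__x86_64__)
#define demo_barrier() demo_compiler_barrier()
#else
#define demo_barrier() demo_mb()
#endif

/* Example call site: write the data first, then raise the ready flag. */
static void demo_publish(uint32_t *data, volatile uint32_t *flag, uint32_t v)
{
	*data = v;
	demo_barrier(); /* keep the data store ahead of the flag store */
	*flag = 1;
}
```

Call sites that today rely on IA ordering plus a compiler barrier would use
the logical barrier, while code that only needs to restrain the compiler
could keep using the compiler barrier on every platform.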

BTW, for my own curiosity:
Is there any good use for compiler_barrier() on systems with a weakly ordered
memory model?

> 
> Does this make sense to do this way? Is it the best solution? Do we want to 
> define the basic primitives or are we only ever likely to
> need the logical barrier types?
> 
> /Bruce


[dpdk-dev] Possibility to unbind interface by DPDK

2014-10-16 Thread Walukiewicz, Miroslaw
I have a question regarding unbinding Linux interface from EAL.

This feature was present up to DPDK 1.4 and was then removed.

It was available under the RTE_EAL_UNBIND_PORTS flag.

Is there a possibility of getting this feature back in the next releases?

Unbinding interfaces from within EAL makes it possible to read network
interface parameters like IP address, MTU and VLAN configuration from DPDK
applications.

When a Linux interface is unbound before the application starts, this
information is lost to the application.

Mirek


[dpdk-dev] Possibility to unbind interface by DPDK

2014-10-16 Thread Lilijun
On 2014/10/16 19:45, Walukiewicz, Miroslaw wrote:
> I have a question regarding unbinding Linux interface from EAL.
> 
> This feature was present up to DPDK 1.4 and was then removed.
> 
> It was available under the RTE_EAL_UNBIND_PORTS flag.
> 
> Is there a possibility of getting this feature back in the next releases?
> 
> Unbinding interfaces from within EAL makes it possible to read network
> interface parameters like IP address, MTU and VLAN configuration from DPDK
> applications.
> 
> When a Linux interface is unbound before the application starts, this
> information is lost to the application.

We ran into the same problem.
An alternative might be to bind the NICs to the DPDK uio driver, as the
dpdk_nic_bind.py script does, only after your application has read those NIC
parameters.
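
As a sketch of the "read first, bind later" idea, the MTU of an interface
that is still owned by the kernel can be queried with the standard Linux
SIOCGIFMTU ioctl. The helper name demo_read_if_mtu() is hypothetical and
nothing here is DPDK API; the interface name is whatever the kernel calls the
port before it is rebound.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* expose struct ifreq under strict -std modes */
#endif
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return the MTU of a kernel network interface, or -1 on any error. */
static int demo_read_if_mtu(const char *ifname)
{
	struct ifreq ifr;
	int fd, ret;

	if (ifname == NULL || strlen(ifname) >= IFNAMSIZ)
		return -1;

	/* Any socket works as a handle for interface ioctls. */
	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	memset(&ifr, 0, sizeof(ifr));
	strcpy(ifr.ifr_name, ifname); /* length was checked above */
	ret = ioctl(fd, SIOCGIFMTU, &ifr);
	close(fd);

	return ret < 0 ? -1 : ifr.ifr_mtu;
}
```

The application would have to call something like this and store the result
before the port is handed over to the uio driver, since the kernel interface
is no longer available afterwards.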

> 
> Mirek
> 
> 




[dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API

2014-10-16 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> Sent: Saturday, October 11, 2014 6:56 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API
> 
> Introduce a new filter framework in librte_ether. As to the implementation
> discussion, please refer to
> http://dpdk.org/ml/archives/dev/2014-September/005179.html, and VxLAN
> tunnel filter implementation is based on
> it.
> 
> Signed-off-by: Jijiang Liu 
> Acked-by: Helin Zhang 
> Acked-by: Jingjing Wu 
> 
[..]

> new file mode 100644
> index 000..574e9ff
> --- /dev/null
> +++ b/lib/librte_ether/rte_eth_ctrl.h

[...]
> +/**
> + * All generic operations to filters
> + */
> +enum rte_filter_op {
> + /**< used to check whether the type filter is supported */

Shouldn't this comment be below?

> + RTE_ETH_FILTER_OP_NONE = 0,
> + RTE_ETH_FILTER_OP_ADD,  /**< add filter entry */
> + RTE_ETH_FILTER_OP_UPDATE,   /**< update filter entry */
> + RTE_ETH_FILTER_OP_DELETE,   /**< delete filter entry */
> + RTE_ETH_FILTER_OP_GET,  /**< get filter entry */
> + RTE_ETH_FILTER_OP_SET,  /**< configurations */
> + /**< get information of filter, such as status or statistics */

Same here

> + RTE_ETH_FILTER_OP_GET_INFO,
> + RTE_ETH_FILTER_OP_MAX,
> +};
> +
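
For reference, Doxygen's `/**<` form documents the member that precedes the
comment, so a `/**<` comment placed on the line above an enumerator ends up
attached to the wrong item (or to nothing). One possible way to address both
remarks, shown only as a sketch and not as the final patch, is to use the
plain `/**` form for comments that sit above the enumerator:

```c
/**
 * All generic operations to filters
 */
enum rte_filter_op {
	/** used to check whether the type filter is supported */
	RTE_ETH_FILTER_OP_NONE = 0,
	RTE_ETH_FILTER_OP_ADD,      /**< add filter entry */
	RTE_ETH_FILTER_OP_UPDATE,   /**< update filter entry */
	RTE_ETH_FILTER_OP_DELETE,   /**< delete filter entry */
	RTE_ETH_FILTER_OP_GET,      /**< get filter entry */
	RTE_ETH_FILTER_OP_SET,      /**< configurations */
	/** get information of filter, such as status or statistics */
	RTE_ETH_FILTER_OP_GET_INFO,
	RTE_ETH_FILTER_OP_MAX,
};
```

The other option is to keep `/**<` and move each long comment to the line
after its enumerator.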



[dpdk-dev] [PATCH 1/4] lib/librte_ether: new filter APIs definition

2014-10-16 Thread Thomas Monjalon
2014-10-10 07:28, De Lara Guarch, Pablo:
> > > > > Define new APIs to support configure multi-kind filters using same
> > > > > APIs
> > > > >  - rte_eth_dev_filter_supported
> > > > >  - rte_eth_dev_filter_ctrl
> > > > >
> > > > > As to the implementation discussion, please refer to
> > > > > http://dpdk.org/ml/archives/dev/2014-September/005179.html, and
> > > > > control packet filter implementation is based on it.
> > > >
> > > > This patch is also present on the patchset Support flow director
> > > > programming on Fortville.
> > > > Should this patchset be rejected then or just this patch? In second
> > > > case, could you send a v2 without this patch?
> > >
> > > I think this patch is present not only in the flow director patchset, but
> > > also in the mac vlan support patchset, the vxlan patchset, and so on. All
> > > of them are using the same new filter APIs. If any patchset is applied,
> > > the others may require some modification (just as you said, to remove
> > > this patch).
> > >
> > In addition, without the patch, this patchset cannot work separately. More
> > than one feature depends on the new filter APIs, but no patchset containing
> > the new filter APIs has been applied so far. That's why each patchset has
> > such a patch.
> 
> I see, then probably the best idea would have been to send this patch
>  separately, and just say that these patchsets depend on this patch,
>  basically because if you try to apply all these patches, you are going to
>  get failures.

Yes, sending a separate patch and explicitly basing your patches on it
would be much easier to understand. And more generally, it's easier when
things are explained. You won't have to pay for the extra words you put in
your cover letter ;)

There is another problem with this patch: there are many versions around with
different logs and even different authors!

-- 
Thomas


[dpdk-dev] [PATCH v4 00/10] VM Power Management

2014-10-16 Thread Carew, Alan
Hi Thomas,

> > However with a DPDK solution it would be possible to re-use the message bus
> > to pass information like device stats, application state, D-state requests
> > etc. to the host and allow for management layer(e.g. OpenStack) to make
> > informed decisions.
> 
> I think that management information should be transmitted in a management
> channel. Such a solution should exist in OpenStack.

Perhaps it does, but this solution is not exclusive to OpenStack and just a 
potential use case.

> 
> > Also, the scope of adding power management to qemu/KVM would be huge;
> > while the easier path is not always the best and the problem of power
> > management in VMs is both a DPDK problem (given that librte_power only
> > worked on the host) and a general virtualization problem that would be
> > better solved by those with direct knowledge of Qemu/KVM architecture
> > and influence on the direction of the Qemu project.
> 
> Being a huge effort is not an argument.

I agree completely; that was implied by what followed the conjunction.

> Please check with Qemu community, they'll welcome it.
> 
> > As it stands, the host backend is simply an example application that can
> > be replaced by a VMM or Orchestration layer, by using Virtio-Serial it has
> > obvious leanings to Qemu, but even this could be easily swapped out for
> > XenBus, IVSHMEM, IP etc.
> >
> > If power management is to be eventually supported by Hypervisors directly
> > then we could also enable the option to switch to that environment. Currently
> > the librte_power implementations (VM or Host) can be selected dynamically
> > (environment auto-detection) or explicitly via rte_power_set_env(); adding
> > an arbitrary number of environments is relatively easy.
> 
> Yes, you are adding a new layer to work around what hypervisors lack. And this
> layer will handle native support when it exists. But if you implement native
> support now, we don't need this extra layer.

Indeed, but we have a solution implemented now. Yes, it is a workaround, but only
until Hypervisors support such functionality. It is possible that whatever
power management solutions present themselves in the future will also require
workarounds; us-vhost is an example of such a workaround introduced to
DPDK.
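
To make the environment selection discussed above more concrete, here is a
minimal sketch of one way such a layer could be structured. It is not the
librte_power code: apart from the idea of an explicit "set env" call with
auto-detection as a fallback, every name here (pm_env, pm_ops, pm_set_env,
pm_freq_up, the stub backends and their return values) is an assumption made
for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical environment identifiers; not the librte_power enum. */
enum pm_env { PM_ENV_UNSET = 0, PM_ENV_HOST, PM_ENV_VM, PM_ENV_MAX };

/* One table of operations per environment. */
struct pm_ops {
	const char *name;
	int (*freq_up)(unsigned int lcore_id);
	int (*freq_down)(unsigned int lcore_id);
};

/* Stub backends: each returns a distinct value so the dispatch is visible.
 * A real guest backend would forward the request to the host instead. */
static int host_freq_up(unsigned int lcore_id)   { (void)lcore_id; return 1; }
static int host_freq_down(unsigned int lcore_id) { (void)lcore_id; return 1; }
static int vm_freq_up(unsigned int lcore_id)     { (void)lcore_id; return 2; }
static int vm_freq_down(unsigned int lcore_id)   { (void)lcore_id; return 2; }

static const struct pm_ops pm_backends[PM_ENV_MAX] = {
	[PM_ENV_HOST] = { "host", host_freq_up, host_freq_down },
	[PM_ENV_VM]   = { "vm",   vm_freq_up,   vm_freq_down },
};

static enum pm_env pm_current = PM_ENV_UNSET;

/* Explicit selection: 0 on success, -1 if env is invalid or already chosen. */
int pm_set_env(enum pm_env env)
{
	if (env <= PM_ENV_UNSET || env >= PM_ENV_MAX)
		return -1;
	if (pm_current != PM_ENV_UNSET)
		return -1;
	pm_current = env;
	return 0;
}

/* Dispatch a request; auto-detect the environment if none was selected.
 * running_in_guest stands in for whatever detection a real layer would do. */
int pm_freq_up(unsigned int lcore_id, int running_in_guest)
{
	if (pm_current == PM_ENV_UNSET)
		pm_current = running_in_guest ? PM_ENV_VM : PM_ENV_HOST;
	assert(pm_backends[pm_current].freq_up != NULL);
	return pm_backends[pm_current].freq_up(lcore_id);
}
```

With this shape, supporting a hypervisor-native environment later would mean
adding one more enum value and one more row in the table; callers of
pm_freq_up() would not change.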

> 
> > I hope this helps to clarify the approach.
> 
> Thanks for your explanation.

Thanks for the feedback.

> 
> --
> Thomas

Alan.


[dpdk-dev] filter_ctl PMD API idea

2014-10-16 Thread Thomas Monjalon
2014-09-08 15:06, Wu, Jingjing:
> Any comments or advises? 
> 
> Thanks!
> 
> Fortville Filter features' development will be started based on this design 
> this week.

Thanks Jingjing for explaining your plan before working on it.
There was no comment for 1 month so we'll assume everybody is OK.
Now your work is done and it's time to integrate it.

This design is used in many pending patchsets.
Now I am waiting for a single patch, separate from any patchset, so that I can
comment on the implementation.
Then it will be applied with the i40e filters using this API.
So we'll have a new API implemented only for i40e.
But once DPDK 1.8 is out, I expect to receive patches replacing the old API
with this new one for igb and ixgbe.

Last request: could you please write a brief email summarizing all filters
of Intel NICs from a user perspective, which ones are implemented in
DPDK, and with which API?

Thanks


> > -----Original Message-----
> > Hi, all
> > 
> > While developing the filter features in the i40e driver for the Intel(R)
> > Ethernet Controller XL710/X710 [Fortville] (for both 10G/40G), we found that
> > there are lots of new filters, and also some changes to the existing filters
> > compared to ixgbe.
> > Continuing to add a new op in rte_eth_dev for each new filter would work.
> > But we suggest using a more generic API for all filters to avoid a superset
> > dev_ops. It needs to be cleaner and easy to use. There is a need for
> > technical discussion.
> > 
> > Here is the early design idea on which we are looking for comments.
> > 
> > 1.   Create two new APIs
> > -
> > rte_eth_filter_supported(uint8_t port, uint16_t filter_type);
> > /* check whether this filter type is supported for the queried port */
> > rte_eth_filter_ctl(uint8_t port, uint16_t filter_type, uint16_t filter_op,
> >                    void *arg);
> > /* configure filters, will call new ops eth_filter_ctl in eth_dev_ops */
> > -
> > 
> > 2.   Define filter types, operations, and structures in new header file
> > lib/librte_eth/rte_eth_filter.h.
> > -
> > #define RTE_ETH_FILTER_RSS  1
> > #define RTE_ETH_FILTER_SYN  2
> > #define RTE_ETH_FILTER_5TUPLE   3
> > #define RTE_ETH_FILTER_FDIR 4
> >  
> > 
> > #define RTE_ETH_FILTER_OP_GET     1
> > #define RTE_ETH_FILTER_OP_ADD     2
> > #define RTE_ETH_FILTER_OP_DELETE  3
> > #define RTE_ETH_FILTER_OP_SET     4
> > < other operations if want to define>...
> > 
> > /* structures defined for corresponding filter type and operation */
> > /* take RTE_ETH_FILTER_FDIR and OP_SET for example */
> > 
> > struct rte_eth_filter_fdir_cfg {
> > #define RTE_ETH_FILTER_FDIR_SET_MASK    0
> > #define RTE_ETH_FILTER_FDIR_SET_OFFSET  1
> >   ...
> >   uint16_t cfg_type;
> >   /* sub-operation defining which specific configuration is applied,
> >      and which of the following fields are meaningful */
> > 
> >   /* fields, can be a union or a combination of the required specific items */
> > 
> > 
> > };
> > 
> > -
> > In this way, it is easy to add more filter types or operations in the future.
> > Differences within the same filter type and operation can be distinguished
> > by a sub-command in the defined structure, e.g. 'cfg_type' in the above
> > rte_eth_filter_fdir_cfg structure.
> > 3.   Define ops in driver (take i40e for example)
> > -
> > static struct eth_dev_ops i40e_eth_dev_ops = {
> >   .filter_ctl = i40e_filter_ctl,
> > };
> > -
> > Then the functions in drivers can be implemented separately.
> > 
> > 4.   Use case In test-pmd/cmdline.c
> > -
> > #include 
> > /* add or change commands e.g. fdir_set (arg1) (arg2) ... */
> > 
> > static void
> > cmd_fdir_parsed()
> > {
> >   ...
> >   /* take setting fdir mask for example */
> >   struct rte_eth_filter_fdir_cfg cfg;
> > 
> >   if (rte_eth_filter_supported(port, RTE_ETH_FILTER_FDIR)) {
> >     cfg.cfg_type = RTE_ETH_FILTER_FDIR_SET_MASK;
> >     /* fill the corresponding fields in cfg */
> >     ...
> >     rte_eth_filter_ctl(port, RTE_ETH_FILTER_FDIR, RTE_ETH_FILTER_OP_SET,
> >                        &cfg);
> >   }
> >   ...
> > }
> > -
> > 
> > 
> > Any comments are welcome!
> > 
> > For the time being, only Intel PMDs are available on dpdk.org. We lack an
> > understanding of the non-Intel PMDs, so the current design does not take
> > them into account. But we are looking for input from those PMD developers,
> > and we strongly look forward to those PMDs being released as open source.
> > 
> > Thanks!
> > Jingjing
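
As a rough illustration of the design quoted above, the following is a
self-contained mock of the generic entry points. The macro names and values
come from the proposal; everything else is an assumption made for the sketch:
the int return types, the "value" field in the cfg structure, the demo_dev
table standing in for rte_eth_dev, the supported-types bitmap, and the mock
driver callback demo_i40e_filter_ctl.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Filter types and operations, as listed in the proposal. */
#define RTE_ETH_FILTER_RSS     1
#define RTE_ETH_FILTER_SYN     2
#define RTE_ETH_FILTER_5TUPLE  3
#define RTE_ETH_FILTER_FDIR    4

#define RTE_ETH_FILTER_OP_GET     1
#define RTE_ETH_FILTER_OP_ADD     2
#define RTE_ETH_FILTER_OP_DELETE  3
#define RTE_ETH_FILTER_OP_SET     4

#define RTE_ETH_FILTER_FDIR_SET_MASK    0
#define RTE_ETH_FILTER_FDIR_SET_OFFSET  1

struct rte_eth_filter_fdir_cfg {
	uint16_t cfg_type;  /* sub-operation selector */
	uint32_t value;     /* assumed field; the proposal leaves fields open */
};

/* Hypothetical stand-in for a device with its single filter op. */
typedef int (*filter_ctl_t)(uint16_t type, uint16_t op, void *arg);
struct demo_dev {
	uint32_t supported_types;  /* bit n set: filter type n is supported */
	filter_ctl_t filter_ctl;
};

static uint32_t demo_fdir_mask;  /* state touched by the mock driver */

/* Mock driver callback: one function handles every type/op pair. */
static int demo_i40e_filter_ctl(uint16_t type, uint16_t op, void *arg)
{
	struct rte_eth_filter_fdir_cfg *cfg = arg;

	if (type != RTE_ETH_FILTER_FDIR || op != RTE_ETH_FILTER_OP_SET)
		return -ENOTSUP;
	if (cfg == NULL)
		return -EINVAL;
	switch (cfg->cfg_type) {
	case RTE_ETH_FILTER_FDIR_SET_MASK:
		demo_fdir_mask = cfg->value;
		return 0;
	default:
		return -EINVAL;
	}
}

#define DEMO_NB_PORTS 1
static struct demo_dev demo_ports[DEMO_NB_PORTS] = {
	{ 1u << RTE_ETH_FILTER_FDIR, demo_i40e_filter_ctl },
};

/* Returns 1 if the port supports the filter type, 0 otherwise. */
int rte_eth_filter_supported(uint8_t port, uint16_t filter_type)
{
	if (port >= DEMO_NB_PORTS || filter_type >= 32)
		return 0;
	return (int)((demo_ports[port].supported_types >> filter_type) & 1u);
}

/* Generic entry point: validate, then hand over to the driver callback. */
int rte_eth_filter_ctl(uint8_t port, uint16_t filter_type,
		       uint16_t filter_op, void *arg)
{
	if (port >= DEMO_NB_PORTS)
		return -ENODEV;
	if (!rte_eth_filter_supported(port, filter_type))
		return -ENOTSUP;
	assert(demo_ports[port].filter_ctl != NULL);
	return demo_ports[port].filter_ctl(filter_type, filter_op, arg);
}
```

The point of the shape is that a new filter type only adds a constant and a
structure; the ops table and the two entry points stay unchanged.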



[dpdk-dev] [v2 20/23] librte_cfgfile: interpret config files

2014-10-16 Thread Thomas Monjalon
Hi Cristian,

2014-06-04 19:08, Cristian Dumitrescu:
> This library provides a tool to interpret config files that have standard
> structure.
> 
> It is used by the Packet Framework examples/ip_pipeline sample application.
> 
> It originates from examples/qos_sched sample application and now it makes
> this code available as a library for other sample applications to use.
> The code duplication with qos_sched sample app to be addressed later.

4 months ago, you said that this duplication would be addressed later.
Neither you nor anyone at Intel has submitted a patch to clean that up.
I just want to be sure that "later" doesn't mean "never" because
I'm accepting another "later" promise for cleaning up the old filtering API.

Maybe you just forgot it, so please prove to me that I'm right to accept
"later" clean-ups in general.

Thanks
-- 
Thomas


[dpdk-dev] [PATCH v5 1/8]i40e:support VxLAN packet identification in librte_ether

2014-10-16 Thread Thomas Monjalon
2014-10-11 13:55, Jijiang Liu:
> Add data structures and APIs in librte_ether to support tunneling UDP
> port configuration on i40e.
> Currently, only VxLAN is implemented, which includes
>  -  VxLAN UDP port initialization
>  -  Add APIs to configure VxLAN UDP port

Please could you explain in the commit log how it is related to filtering?

[...]

> + /**
> + * Add tunneling UDP port configuration of Ethernet device

tunneling UDP or UDP tunneling?

Please explain what the device could do with this information.
Offloading? Filtering?

> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param tunnel_udp
> + *   Where to store the current Tunneling UDP configuration
> + *   of the Ethernet device.

Many words are useless here.
"UDP tunneling configuration" should be sufficient.

> + * @param count
> + *   How many configurations are going to added.

It's a verbose commenting style, but why not.
Typo: "to be added".

> + *
> + * @return
> + *   - (0) if successful.
> + *   - (-ENODEV) if port identifier is invalid.
> + *   - (-ENOTSUP) if hardware doesn't support tunnel type.
> + */
> +int
> +rte_eth_dev_udp_tunnel_add(uint8_t port_id,
> + struct rte_eth_udp_tunnel *tunnel_udp,
> + uint8_t count);

Thanks
-- 
Thomas
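
For readers following the review, here is a minimal mock of how the return
codes documented for rte_eth_dev_udp_tunnel_add() could play out. It does not
use the patch's types: the demo_udp_tunnel fields, the table sizes, the
DEMO_TUNNEL_VXLAN id and the extra -EINVAL case are all assumptions, and the
function is deliberately named demo_udp_tunnel_add to avoid passing for the
real API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NB_PORTS        2  /* hypothetical number of ports */
#define DEMO_MAX_UDP_TUNNELS 4  /* hypothetical per-port table size */
#define DEMO_TUNNEL_VXLAN    1  /* hypothetical tunnel type id */

/* Assumed layout: one UDP port and the tunnel type it carries. */
struct demo_udp_tunnel {
	uint16_t udp_port;
	uint8_t  tunnel_type;
};

static struct demo_udp_tunnel demo_table[DEMO_NB_PORTS][DEMO_MAX_UDP_TUNNELS];
static uint8_t demo_used[DEMO_NB_PORTS];

/* Number of tunnel entries currently stored for a valid port. */
int demo_udp_tunnel_count(uint8_t port_id)
{
	return port_id < DEMO_NB_PORTS ? demo_used[port_id] : -ENODEV;
}

/*
 * Add 'count' UDP tunnel configurations to a port.
 * Returns 0 on success, -ENODEV for a bad port, -ENOTSUP for an unsupported
 * tunnel type, and (an assumption not in the quoted doc) -EINVAL for a NULL
 * array or a full table.
 */
int demo_udp_tunnel_add(uint8_t port_id,
			const struct demo_udp_tunnel *tunnel_udp,
			uint8_t count)
{
	uint8_t i;

	if (port_id >= DEMO_NB_PORTS)
		return -ENODEV;
	assert(demo_used[port_id] <= DEMO_MAX_UDP_TUNNELS);
	if (tunnel_udp == NULL ||
	    count > DEMO_MAX_UDP_TUNNELS - demo_used[port_id])
		return -EINVAL;

	/* Validate everything first so a failure leaves the table untouched. */
	for (i = 0; i < count; i++)
		if (tunnel_udp[i].tunnel_type != DEMO_TUNNEL_VXLAN)
			return -ENOTSUP;

	for (i = 0; i < count; i++)
		demo_table[port_id][demo_used[port_id]++] = tunnel_udp[i];
	return 0;
}
```

Because the UDP port is passed at run time here, this shape would also fit a
design in which the VxLAN port is an API parameter rather than a build-time
constant.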


[dpdk-dev] EAL : Input/output error on DPDK 1.7.1

2014-10-16 Thread Raghav K
Hey,
I observe a continuous burst of I/O errors, as shown below, with the testpmd
application on DPDK 1.7.1. This seems to originate from the
eal_intr_process_interrupts() function. I seem to have set up the DPDK
prerequisites correctly.
Another recent post seemed to suggest moving back to 1.7.0, however I would
like to persist with 1.7.1.
Any help/pointers in resolving this would be greatly appreciated.
Much thanks,
Raghav

root at sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1/x86_64-native-linuxapp-gcc/app# ./testpmd -c 0xf -n3 -- -i --nb-cores=3 --nb-ports=2
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error

root at sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1# ./tools/dpdk_nic_bind.py --status

Network devices using DPDK-compatible driver
============================================
:02:01.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000
:02:02.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000

Network devices using kernel driver
===================================
:02:00.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth0 drv=e1000 unused=igb_uio *Active*
:02:03.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth3 drv=e1000 unused=igb_uio
:02:05.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth4 drv=e1000 unused=igb_uio
:02:06.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth5 drv=e1000 unused=igb_uio

Other network devices
=====================
  


[dpdk-dev] [PATCH v4 04/10] VM Power Management application and Makefile.

2014-10-16 Thread De Lara Guarch, Pablo


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alan Carew
> Sent: Sunday, October 12, 2014 8:36 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 04/10] VM Power Management application
> and Makefile.
> 
> For launching CLI thread and Monitor thread and initialising
> resources.
> Requires a minimum of two lcores to run, additional cores specified by eal
> core
> mask are not used.
> 
> Signed-off-by: Alan Carew 
> ---
>  examples/vm_power_manager/Makefile |  57 ++
>  examples/vm_power_manager/main.c   | 117
> +
>  examples/vm_power_manager/main.h   |  52 +
>  3 files changed, 226 insertions(+)
>  create mode 100644 examples/vm_power_manager/Makefile
>  create mode 100644 examples/vm_power_manager/main.c
>  create mode 100644 examples/vm_power_manager/main.h
[...]
> +# Default target, can be overriden by command line or environment
> +RTE_TARGET ?= x86_64-default-linuxapp-gcc

Tiny comment here. Target should be x86_64-native-linuxapp-gcc

> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# binary name
> +APP = vm_power_mgr
> +
> +# all source are stored in SRCS-y
> +SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
> +SRCS-y += channel_monitor.c
> +
> +CFLAGS += -O3 -lvirt -I$(RTE_SDK)/lib/librte_power/
> +CFLAGS += $(WERROR_FLAGS)
> +



[dpdk-dev] [PATCH] librte_eal: FreeBSD contigmem prevent possible buffer overrun during module unload.

2014-10-16 Thread De Lara Guarch, Pablo


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alan Carew
> Sent: Tuesday, October 14, 2014 1:19 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] librte_eal: FreeBSD contigmem prevent
> possible buffer overrun during module unload.
> 
> The maximum number of contiguous memory regions for FreeBSD is limited by
> RTE_CONTIGMEM_MAX_NUM_BUFS; a pointer to each region is stored in
> static void * contigmem_buffers[RTE_CONTIGMEM_MAX_NUM_BUFS]
> 
> A user can specify a greater amount via hw.contigmem.num_buffers.
> While the allocation logic will prevent this allocation from occurring, the
> logic in contigmem_unload() will attempt to free hw.contigmem.num_buffers
> buffers and an overrun occurs.
> 
> This patch limits the freeing to a maximum of
> RTE_CONTIGMEM_MAX_NUM_BUFS.
> 
> Signed-off-by: Alan Carew 

Acked-by: Pablo de Lara 
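
The overrun described in the commit message can be reproduced in miniature in
user space. The sketch below is not the kernel module code: malloc/free stand
in for the kernel allocator, the value 64 for RTE_CONTIGMEM_MAX_NUM_BUFS is
assumed, and demo_load/demo_unload are hypothetical names. It only shows the
clamp that the patch describes.

```c
#include <assert.h>
#include <stdlib.h>

#define RTE_CONTIGMEM_MAX_NUM_BUFS 64  /* value assumed for this sketch */

static void *contigmem_buffers[RTE_CONTIGMEM_MAX_NUM_BUFS];

/* Stand-in for the user-controlled hw.contigmem.num_buffers tunable. */
static int contigmem_num_buffers;

/* Load: refuse a request that does not fit the array. Note that the
 * tunable keeps its oversized value even though nothing was allocated. */
int demo_load(size_t buf_len)
{
	int i;

	if (contigmem_num_buffers > RTE_CONTIGMEM_MAX_NUM_BUFS)
		return -1;
	for (i = 0; i < contigmem_num_buffers; i++)
		contigmem_buffers[i] = malloc(buf_len);
	return 0;
}

/* Unload: clamp the loop bound to the array size before freeing.
 * Without the clamp, an oversized tunable would index past the array.
 * Returns the number of buffers actually freed. */
int demo_unload(void)
{
	int i, freed = 0;
	int n = contigmem_num_buffers;

	if (n > RTE_CONTIGMEM_MAX_NUM_BUFS)
		n = RTE_CONTIGMEM_MAX_NUM_BUFS;
	assert(n <= RTE_CONTIGMEM_MAX_NUM_BUFS);

	for (i = 0; i < n; i++) {
		if (contigmem_buffers[i] != NULL) {
			free(contigmem_buffers[i]);
			contigmem_buffers[i] = NULL;
			freed++;
		}
	}
	return freed;
}
```

Setting contigmem_num_buffers to 1000 makes demo_load() fail, and
demo_unload() then walks only the 64 valid slots instead of 1000.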



[dpdk-dev] [PATCH v5 2/8]i40e:support VxLAN packet identification in librte_pmd_i40e

2014-10-16 Thread Thomas Monjalon
2014-10-11 13:55, Jijiang Liu:
>  #
> +# Compile tunneling UDP port support
> +#
> +CONFIG_RTE_LIBRTE_TUNNEL_UDP_PORT=4789
> +
> +#

1) this option is not to "Compile tunneling UDP port support"
2) why is it a compile time option? should it be an API parameter or
a runtime option?

> + uint16_t packet_type; /**< Packet type, which indicates packet 
> format */

It's not very clear what the packet type is.
Maybe there is a more precise description, or is it hardware dependent?

>  static struct i40e_veb *i40e_veb_setup(struct i40e_pf *pf,
> - struct i40e_vsi *vsi);
> + struct i40e_vsi *vsi);

It's not related to VXLAN.

-- 
Thomas


[dpdk-dev] [PATCH v5 3/8]app/test-pmd:test VxLAN packet identification

2014-10-16 Thread Thomas Monjalon
2014-10-11 13:55, Jijiang Liu:
> - "tx_checksum set mask (port_id)\n"
> + "tx_checksum set (mask) (port_id)\n"
>   "Enable hardware insertion of checksum offload with"
> - " the 4-bit mask, 0~0xf, in packets sent on a port.\n"
> + " the 8-bit mask, 0~0xff, in packets sent on a port.\n"
>   "bit 0 - insert ip   checksum offload if set\n"
>   "bit 1 - insert udp  checksum offload if set\n"
>   "bit 2 - insert tcp  checksum offload if set\n"
>   "bit 3 - insert sctp checksum offload if set\n"
> + "bit 4 - insert inner ip  checksum offload if 
> set\n"
> + "bit 5 - insert inner udp checksum offload if 
> set\n"
> + "bit 6 - insert inner tcp checksum offload if 
> set\n"
> + "bit 7 - insert inner sctp checksum offload if 
> set\n"
>   "Please check the NIC datasheet for HW limits.\n\n"
[...]
>   .help_str = "enable hardware insertion of L3/L4checksum with a given "
> - "mask in packets sent on a port, the bit mapping is given as, Bit 0 for 
> ip"
> - "Bit 1 for UDP, Bit 2 for TCP, Bit 3 for SCTP",
> + "mask in packets sent on a port, the bit mapping is given as, Bit 0 for 
> ip "
> + "Bit 1 for UDP, Bit 2 for TCP, Bit 3 for SCTP, Bit 4 for inner ip "
> + "Bit 5 for inner UDP, Bit 6 for inner TCP, Bit 7 for inner SCTP",
>   .tokens = {
>   (void *)&cmd_tx_cksum_set_tx_cksum,
>   (void *)&cmd_tx_cksum_set_set,

How is it related to VXLAN?
I may have missed something. But if not, I note the name of the reviewers ;)

-- 
Thomas
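
As a small aid for reading the help text quoted above, here is a sketch that
decodes the 8-bit tx_checksum mask into the names used in that text. The bit
positions are taken from the quoted patch; the function name
demo_describe_cksum_mask and the comma-separated output format are
assumptions, and none of this is testpmd code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit n of the mask enables the offload named at index n. */
static const char *const demo_cksum_names[8] = {
	"ip", "udp", "tcp", "sctp",
	"inner ip", "inner udp", "inner tcp", "inner sctp",
};

/*
 * Write a comma-separated list of the offloads enabled in 'mask' into 'buf'
 * (truncated if 'len' is too small) and return how many bits are set.
 */
int demo_describe_cksum_mask(uint8_t mask, char *buf, size_t len)
{
	size_t used = 0;
	int bit, count = 0;

	assert(buf != NULL || len == 0);
	if (len > 0)
		buf[0] = '\0';

	for (bit = 0; bit < 8; bit++) {
		int n;

		if (!(mask & (1u << bit)))
			continue;
		count++;
		if (used >= len)
			continue;  /* no room left, keep counting only */
		n = snprintf(buf + used, len - used, "%s%s",
			     count > 1 ? "," : "", demo_cksum_names[bit]);
		if (n > 0)
			used += (size_t)n;
	}
	return count;
}
```

Under the old 4-bit mask only the first four names could ever appear; the
four inner offloads are what the extension to 0~0xff adds.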


[dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API

2014-10-16 Thread Thomas Monjalon
I won't review the common API here, as that should be done in a single place
and there are many copies in different patchsets. Let's focus on tunnels.

2014-10-11 13:55, Jijiang Liu:
> +/ TUNNEL FILTER DATA DEFINATION *** */

We cannot miss this comment :)

> +#define ETH_TUNNEL_FILTER_OMAC  0x01
> +#define ETH_TUNNEL_FILTER_OIP   0x02
> +#define ETH_TUNNEL_FILTER_TENID 0x04
> +#define ETH_TUNNEL_FILTER_IMAC  0x08
> +#define ETH_TUNNEL_FILTER_IVLAN 0x10
> +#define ETH_TUNNEL_FILTER_IIP   0x20
> +
> +#define RTE_TUNNEL_FLAGS_TO_QUEUE 1

These values require some comments.

> +/*
> + * Tunneled filter type
> + */
> +enum rte_tunnel_filter_type {
> + RTE_TUNNEL_FILTER_TYPE_NONE = 0,
> + RTE_TUNNEL_FILTER_OIP = ETH_TUNNEL_FILTER_OIP,
> + RTE_TUNNEL_FILTER_IMAC_IVLAN =
> + ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN,
> + RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID =
> + ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN |
> + ETH_TUNNEL_FILTER_TENID,
> + RTE_TUNNEL_FILTER_IMAC_TENID =
> + ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_TENID,
> + RTE_TUNNEL_FILTER_IMAC = ETH_TUNNEL_FILTER_IMAC,
> + RTE_TUNNEL_FILTER_OMAC_TENID_IMAC =
> + ETH_TUNNEL_FILTER_OMAC | ETH_TUNNEL_FILTER_TENID |
> + ETH_TUNNEL_FILTER_IMAC,
> + RTE_TUNNEL_FILTER_IIP = ETH_TUNNEL_FILTER_IIP,
> + RTE_TUNNEL_FILTER_TYPE_MAX,
> +};

It's absolutely impossible to understand. Keep in mind the first goal of an
API: to be used (which implies being understood by users).
And I really don't understand why you define values for combinations of the
previous flags. Please, keep it simple.

-- 
Thomas
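
One possible reading of "keep it simple" is to let callers OR the individual
flags together and have the driver check the result against the combinations
it supports, instead of naming each combination in an enum. The sketch below
assumes exactly that. The flag values and the list of combinations come from
the quoted patch; the meanings in the comments are a guess from the names,
and demo_tunnel_filter_supported is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag values from the patch; the meanings are inferred from the names. */
#define ETH_TUNNEL_FILTER_OMAC  0x01  /* outer MAC address */
#define ETH_TUNNEL_FILTER_OIP   0x02  /* outer IP address */
#define ETH_TUNNEL_FILTER_TENID 0x04  /* tenant ID */
#define ETH_TUNNEL_FILTER_IMAC  0x08  /* inner MAC address */
#define ETH_TUNNEL_FILTER_IVLAN 0x10  /* inner VLAN */
#define ETH_TUNNEL_FILTER_IIP   0x20  /* inner IP address */

/* The combinations that the patch's enum enumerates one by one. */
static const uint16_t demo_supported[] = {
	ETH_TUNNEL_FILTER_OIP,
	ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN,
	ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN |
		ETH_TUNNEL_FILTER_TENID,
	ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_TENID,
	ETH_TUNNEL_FILTER_IMAC,
	ETH_TUNNEL_FILTER_OMAC | ETH_TUNNEL_FILTER_TENID |
		ETH_TUNNEL_FILTER_IMAC,
	ETH_TUNNEL_FILTER_IIP,
};

/* Returns 1 if this exact flag combination is in the table, 0 otherwise. */
int demo_tunnel_filter_supported(uint16_t flags)
{
	size_t i;

	for (i = 0; i < sizeof(demo_supported) / sizeof(demo_supported[0]); i++)
		if (demo_supported[i] == flags)
			return 1;
	return 0;
}
```

A caller would then pass, for example, ETH_TUNNEL_FILTER_IMAC |
ETH_TUNNEL_FILTER_TENID directly, and the user-visible API would need only
the six documented flags.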


[dpdk-dev] [PATCH v5 0/8]Support VxLAN on Fortville

2014-10-16 Thread Thomas Monjalon
This test report brings a lot of details. It's a good thing but we should
find a way to remove the "administrative words".
It should start with the tested-by line to allow copy paste in the commit log.

2014-10-11 07:56, Liu, Yong:
> Patch name:   Support VxLAN on Fortville
> Brief description:Verify vxlan checksum detect/offload and tunnel filter 
> work fine.
> Test Flag:Tested-by 
> Tester name:  yong.liu at intel.com
> Test environment:
>   OS: Fedora20 3.15.8-200.fc20.x86_64
>   GCC: gcc version 4.8.3 20140624
>   CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
>   NIC: Intel Corporation Ethernet Controller XL710 for 
> 40GbE QSFP+ [8086:1583]
> Test Tool Chain information:  
>   N/A
> Commit ID:ee1a5470faa751c1fd07d23b86659fe7a68fd251
> 
> Detailed Testing information  
> DPDK SW Configuration:
>   Default x86_64-native-linuxapp-gcc configuration
> Test Result Summary:  Total 6 cases, 6 passed, 0 failed
> 
> Test Case - name:
>   vxlan_ipv4_detect
> Test Case - Description:
>   Check testpmd can receive and detect vxlan packet 
> Test Case -command / instruction:
>   Start testpmd with vxlan enabled and rss disabled

Why is RSS disabled?

>   testpmd -c  -n 4 -- -i --tunnel-type=1 --disble-rss 
> --rxq=4 --txq=4 --nb-cores=8 --nb-ports=2

--disble-rss: typo in command line. It raises doubts on the test.

-- 
Thomas


[dpdk-dev] [PATCH v5 7/8]i40e:support VxLAN Tx checksum offload

2014-10-16 Thread Thomas Monjalon
2014-10-11 13:55, Jijiang Liu:
> Support VxLAN Tx checksum offload, which include
>   - outer L3(IP) checksum offload
>   - inner L3(IP) checksum offload
>   - inner L4(UDP, TCP and SCTP) checksum offload
[...]
> +
> + /* fields to support tunnelling packet TX offloads */

I know that the previous comment is "fields to support TX offloads",
but I'd prefer "for TX offloading of tunnels".
Maybe "encapsulation" is better than "tunnel".
Just my opinion.

> + union {
> + /**< combined inner l2/l3 lengths as single var */
> + uint16_t inner_l2_l3_len;
> +
> + struct {
> + /**< inner L3 (IP) Header Length. */
> + uint16_t inner_l3_len:9;
> +
> + /**< L2 (MAC) Header Length. */
> + uint16_t inner_l2_len:7;
> + };
> + };

I would like to highlight that you are using 2 bytes in the second cache line
of the mbuf.
It deserves at least a line in the commit log.
Actually I'd prefer a separate patch for mbuf modifications.

Thanks
-- 
Thomas
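
The two bytes mentioned above come from overlaying a 9-bit and a 7-bit field
on one uint16_t. The following stand-alone sketch reproduces just that union
outside of the mbuf. The wrapper struct demo_tunnel_lens and the setter
demo_set_inner_lens are hypothetical, anonymous unions and structs need C11
or a GNU extension, and the exact bit layout of bit-fields is
implementation-defined.

```c
#include <stdint.h>

/* Same shape as the union in the quoted patch, isolated from the mbuf. */
struct demo_tunnel_lens {
	union {
		/* combined inner l2/l3 lengths as a single variable */
		uint16_t inner_l2_l3_len;
		struct {
			uint16_t inner_l3_len:9;  /* inner L3 (IP) header length */
			uint16_t inner_l2_len:7;  /* inner L2 (MAC) header length */
		};
	};
};

/*
 * Store both lengths after checking that they fit their bit-fields:
 * 7 bits hold at most 127 and 9 bits at most 511.
 * Returns 0 on success, -1 if either value is out of range.
 */
int demo_set_inner_lens(struct demo_tunnel_lens *m,
			unsigned int l2_len, unsigned int l3_len)
{
	if (l2_len > 127 || l3_len > 511)
		return -1;
	m->inner_l2_l3_len = 0;  /* clear both fields in one store */
	m->inner_l2_len = (uint16_t)l2_len;
	m->inner_l3_len = (uint16_t)l3_len;
	return 0;
}
```

The combined member exists so that both lengths can be cleared or copied with
a single 16-bit access, which is the usual reason for this kind of overlay.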


[dpdk-dev] filter_ctl PMD API idea

2014-10-16 Thread Wu, Jingjing


> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, October 17, 2014 12:07 AM
> To: Wu, Jingjing
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] filter_ctl PMD API idea
> 
> 2014-09-08 15:06, Wu, Jingjing:
> > Any comments or advises?
> >
> > Thanks!
> >
> > Fortville Filter features' development will be started based on this design 
> > this week.
> 
> Thanks Jingjing for explaining your plan before working on it.
> There were no comment for 1 month so we'll assume everybody is OK.
> Now your work is done and it's time to integrate it.
> 
It's great, thanks.

> This design is used in many pending patchsets.
> Now I wait for an unique patch out of any patchset in order to do some
> comments about implementation.

OK, a single patch with this new API definition will be sent soon. I hope it
can be reviewed with high priority, since we need to rework many pending
patchsets.

> Then it will be applied with i40e filters using this API.
> So we'll have a new API implemented only for i40e.
> But when DPDK 1.8 will be out, I expect to receive patches replacing old API
> with this new one for igb and ixgbe.

Fine, I am now working on integrating ixgbe's flow director into the new API.

> Last request, please could you write a brief email summarizing all filters
> of Intel NICs from an user perspective, and which ones are implemented in
> DPDK, with which API?
> 
OK.

> Thanks
> 
> 
> > > -----Original Message-----
> > > Hi, all
> > >
> > > While developing the filter features in the i40e driver for the Intel(R)
> > > Ethernet Controller XL710/X710 [Fortville] (for both 10G/40G), we found
> > > that there are lots of new filters, and also some changes to the existing
> > > filters compared to ixgbe.
> > > Continuing to add a new op in rte_eth_dev for each new filter would work.
> > > But we suggest using a more generic API for all filters to avoid a
> > > superset dev_ops. It needs to be cleaner and easy to use. There is a need
> > > for technical discussion.
> > >
> > > Here is the early design idea on which we are looking for comments.
> > >
> > > 1.   Create two new APIs
> > > -
> > > rte_eth_filter_supported(uint8_t port, uint16_t filter_type);
> > > /* check whether this filter type is supported for the queried port */
> > > rte_eth_filter_ctl(uint8_t port, uint16_t filter_type, uint16_t filter_op,
> > >                    void *arg);
> > > /* configure filters, will call new ops eth_filter_ctl in eth_dev_ops */
> > > -
> > >
> > > 2.   Define filter types, operations, and structures in new header file
> > > lib/librte_eth/rte_eth_filter.h.
> > > -
> > > #define RTE_ETH_FILTER_RSS     1
> > > #define RTE_ETH_FILTER_SYN     2
> > > #define RTE_ETH_FILTER_5TUPLE  3
> > > #define RTE_ETH_FILTER_FDIR    4
> > >  
> > >
> > > #define RTE_ETH_FILTER_OP_GET     1
> > > #define RTE_ETH_FILTER_OP_ADD     2
> > > #define RTE_ETH_FILTER_OP_DELETE  3
> > > #define RTE_ETH_FILTER_OP_SET     4
> > > < other operations if want to define>...
> > >
> > > /* structures defined for corresponding filter type and operation */
> > > /* take RTE_ETH_FILTER_FDIR and OP_SET for example */
> > >
> > > struct rte_eth_filter_fdir_cfg {
> > > #define RTE_ETH_FILTER_FDIR_SET_MASK    0
> > > #define RTE_ETH_FILTER_FDIR_SET_OFFSET  1
> > >   ...
> > >   uint16_t cfg_type;
> > >   /* sub-operation defining which specific configuration is applied,
> > >      and which of the following fields are meaningful */
> > > 
> > >   /* fields, can be a union or a combination of the required specific items */
> > > 
> > >
> > > };
> > >
> > > -
> > > In this way, it is easy to add more filter types or operations in the future.
> > > Differences within the same filter type and operation can be distinguished
> > > by a sub-command in the defined structure, e.g. 'cfg_type' in the above
> > > rte_eth_filter_fdir_cfg structure.
> > >
> > > 3.   Define ops in driver (take i40e for example)
> > > -
> > > static struct eth_dev_ops i40e_eth_dev_ops = {
> > >   .filter_ctl = i40e_filter_ctl,
> > > };
> > > -
> > > Then the functions in drivers can be implemented separately.
> > >
> > > 4.   Use case In test-pmd/cmdline.c
> > > -
> > > #include 
> > > /* add or change commands e.g. fdir_set (arg1) (arg2) ... */
> > >
> > > static void
> > > cmd_fdir_parsed()
> > > {
> > >   ...
> > >   /* take setting fdir mask for example */
> > >   struct rte_eth_filter_fdir_cfg cfg;
> > >
> > >   if (rte_eth_filter_supported(port, RTE_ETH_FILTER_FDIR)) {
> > > ??cfg.cfg_type = RTE_ETH_FILTER_FDIR_SET