[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
Hi, > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu > Sent: Friday, September 26, 2014 10:36 AM > To: dev at dpdk.org > Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power > architecture > > The atomic operations implemented with assembly code in DPDK only > support x86. This patch add architecture specific atomic operations for > IBM Power architecture. > > Signed-off-by: Chao Zhu > --- > .../common/include/powerpc/arch/rte_atomic.h | 387 > > .../common/include/powerpc/arch/rte_atomic_arch.h | 318 > 2 files changed, 705 insertions(+), 0 deletions(-) > create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic.h > create mode 100644 > lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h > ... > + > diff --git a/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h > b/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h > new file mode 100644 > index 000..fe5666e > --- /dev/null > + ... >+#define rte_arch_rmb() asm volatile("sync" : : : "memory") >+ > +#define rte_arch_compiler_barrier() do {\ > + asm volatile ("" : : : "memory"); \ > +} while(0) I don't know much about PPC architecture, but as I remember it uses a weakly-ordering memory model. Is that correct? If so, then you probably need rte_arch_compiler_barrier() to be "sync" instruction (like mb()s above) . The reason is that IA has much stronger memory ordering model and there are a lot of places in the code where it implies that ordering. For example - ring enqueue/dequeue functions. Konstantin
[dpdk-dev] i40e: Steps and required configurations of how to achieve the best performance!
Hi Thomas, Yes, your proposal is the perfect one, and also the most complicated one. I was thinking of that one as well, but we did not have enough time for it in our 1.8 timeframe. In the long run, I agree with you that we should implement EAL functions to access PCI config space directly. I will try to put it in our plan as soon as possible, if there are no objections. For now, I think the quickest and easiest way might be to write a script that uses 'setpci', the Linux command. It is harmless for our code base, and we can remove it when we have a better choice. What do you think? Thank you very much for the great comments on this topic! I really like it! Regards, Helin From: Thomas Monjalon [mailto:thomas.monja...@6wind.com] Sent: Wednesday, October 15, 2014 5:42 PM To: Zhang, Helin Cc: dev at dpdk.org; David Marchand Subject: Re: [dpdk-dev] i40e: Steps and required configurations of how to achieve the best performance! Hi Helin, 2014-09-19 03:43, Zhang, Helin: > My idea on it could be, > 1. Write a script to use 'setpci' to configure pci configuration. > End user can decide which PCI device needs to be changed. > 2. Add code to change that PCI configuration in i40e PMD only, as > it seems nobody else need it till now. The second solution seems better because it is more integrated and automatic. But I would like to have some EAL functions to access the PCI configuration. These functions would have Linux and BSD implementations. Then the PMD could change the configuration if it's allowed by a run-time option and would notify the change with a warning/log. Thanks for keeping us notified of your progress. -- Thomas
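For illustration only, here is a minimal sketch of what a Linux backend for such an accessor might look like, assuming the configuration space is reached through the device's sysfs "config" file. The names pci_cfg_read() and pci_cfg_write() are hypothetical and are not part of the EAL; a BSD implementation would need a different backend.

```c
#define _XOPEN_SOURCE 700	/* for pread()/pwrite() under strict -std modes */
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a DPDK API): read `len` bytes of PCI
 * configuration space starting at `offset`, through a sysfs-style
 * config file. Returns 0 on success, -1 on any error or short read.
 */
static int
pci_cfg_read(const char *cfg_path, void *buf, size_t len, off_t offset)
{
	int fd = open(cfg_path, O_RDONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = pread(fd, buf, len, offset);
	close(fd);
	return (n == (ssize_t)len) ? 0 : -1;
}

/*
 * Hypothetical helper (not a DPDK API): write `len` bytes to PCI
 * configuration space at `offset`. Returns 0 on success, -1 otherwise.
 */
static int
pci_cfg_write(const char *cfg_path, const void *buf, size_t len, off_t offset)
{
	int fd = open(cfg_path, O_WRONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = pwrite(fd, buf, len, offset);
	close(fd);
	return (n == (ssize_t)len) ? 0 : -1;
}
```

On a real system the path would be something like /sys/bus/pci/devices/0000:01:00.0/config (the address is only an example), and writing normally requires root. Taking the path as a parameter keeps the sketch testable against an ordinary file.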
[dpdk-dev] [PATCH] Add Rx error statistics for Fortville
Hi Thomas Thank you very much for the detailed guidance! It is really helpful for me. Regards, Helin > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Wednesday, October 15, 2014 8:22 PM > To: Zhang, Helin > Cc: dev at dpdk.org; Liu, Jijiang > Subject: Re: [dpdk-dev] [PATCH] Add Rx error statistics for Fortville > > Helin, > > As you are in charge of i40e, here are 2 tips to acknowledge patches: > > 1) title should take this format: > i40e: add Rx error statistics > > > Acked-by: Helin Zhang > > 2) This line should be added right after the Signed-off-by. > And the rest of the email (patch body) can be removed. > This way, your answer would be faster to read. > > > > -Original Message- > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu > > > Sent: Wednesday, October 15, 2014 11:15 AM > > > To: dev at dpdk.org > > > Subject: [dpdk-dev] [PATCH] Add Rx error statistics for Fortville > > This header is not needed also. > > > > This patch adds incoming packet error statistics in the i40e_ethdev.c > > > file. > > > > > > Signed-off-by: Jijiang Liu > > [I remove the rest of the original email because I have no comment on it] > > Thanks > -- > Thomas
[dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API
> -Original Message- > From: De Lara Guarch, Pablo > Sent: Thursday, October 16, 2014 12:01 AM > To: Liu, Jijiang; dev at dpdk.org > Subject: RE: [dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API > > > > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu > > Sent: Saturday, October 11, 2014 6:56 AM > > To: dev at dpdk.org > > Subject: [dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API > > > > Introduce a new filter framewok in librte_ether. As to the > > implemetation discussion, please refer to > > http://dpdk.org/ml/archives/dev/2014-September/005179.html, and VxLAN > > tunnel filter implementation is based on it. > > > > Signed-off-by: Jijiang Liu > > Acked-by: Helin Zhang > > Acked-by: Jingjing Wu > > > > --- > > lib/librte_ether/Makefile |1 + > > lib/librte_ether/rte_eth_ctrl.h | 152 > > +++ > > lib/librte_ether/rte_ethdev.c | 32 > > lib/librte_ether/rte_ethdev.h | 56 +++--- > > 4 files changed, 229 insertions(+), 12 deletions(-) create mode > > 100644 lib/librte_ether/rte_eth_ctrl.h > > > [...] > > +++ b/lib/librte_ether/rte_eth_ctrl.h > > [...] > > > +/** > > + * Tunnel Packet filter configuration. > > + */ > > +struct rte_eth_tunnel_filter_conf { > > + struct ether_addr *outer_mac; /**< Outer MAC address fiter. */ > > + struct ether_addr *inner_mac; /**< Inner MAC address fiter. */ > > + uint16_t inner_vlan; /**< Inner VLAN fiter. */ > > + enum rte_tunnel_iptype ip_type; /**< IP address type. */ > > + union { > > + uint32_t ipv4_addr;/**< IPv4 source address to match. */ > > + uint32_t ipv6_addr[4]; /**< IPv6 source address to match. */ > > + } ip_addr; /**< IPv4/IPv6 source address to match (union of above). > > */ > > + > > + uint8_t filter_type; /**< Filter type. */ > > This should be enum rte_tunnel_filter_type filter_type, and not uint8_t > filter_type. I will fix this. > > + uint8_t to_queue; /**< Use MAC and VLAN to point to a queue. 
> > */ > > + enum rte_eth_tunnel_type tunnel_type; /**< Tunnel Type. */ > > + uint32_t tenant_id;/** < Tenant number. */ > > + uint16_t queue_id; /** < queue number. */ > > +}; > > + > [...]
[dpdk-dev] to the intel dpdk engineers and all contributors
On Oct 15, 2014, at 1:46 PM, daniel chapiesky wrote: > I just watched the closing remarks by Tim Driscol at the dpdk summit > > http://youtu.be/r-JA5NBybrs > > At time 4:30, he mentioned the "shock to the system" of developers > expecting a pat on the back and instead receiving critiques of their > code. > > I realized that I was one of those who failed to acknowledge the incredible > work the Intel Engineers and other contributors have produced. > > Please let me acknowledge all of you and your efforts with a few comments: > > 1) Kudos!!: > > 2) The Packet Framework made me run around waving my hands in the air > yelling: "This is freaking awesome! I don't have to write it myself!!!" > > 3) The layered architecture is elegant. > > 4) Examples The examples are wonderful! Those who wrote the examples > are my heroes. > > 5) Docs? Clear, to the point, and better than the other projects we depend > upon (you know who you are) > > 6) 6Wind - Thank you for taking on the management of the repository and > website - your coordination effort is truly appreciated > > 7) Did I say the Packet Framework saved me so much time I was actually able > to cut back my coffee intake by 10% > > 8) Windriver - PktGen! (though I really want to know more about > mcos?.) If you need more information about MCOS let me know as I did not write a lot of docs for it :-( Thanks ++Keith > > Finally, > > I recently received a pat on the back for the application I have developed. > In truth, that pat should be passed on, since my application depends so > heavily on DPDK. > > Thank you. > > I encourage others to let the Intel Engineers and contributors know how > much we appreciate the time and effort they have given to DPDK. > > Sincerely, > > > Daniel Chapiesky > AllSource Keith Wiles, Principal Technologist with CTO office, Wind River mobile 972-213-5533
[dpdk-dev] kernel panic when stop my test demo
On 2014/10/15 18:08, Richardson, Bruce wrote: > > >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Lilijun >> Sent: Wednesday, October 15, 2014 10:43 AM >> To: dev at dpdk.org; stephen at networkplumber.org >> Subject: Re: [dpdk-dev] kernel panic when stop my test demo >> >> Hi all, >> >> After adding unmap uio resources operations in process signal handler >> functions, >> An new error was found as follows: >> Call Trace: >> [] uio_release+0x40/0x60 [uio] >> [] __fput+0xe9/0x270 >> [] fput+0xe/0x10 >> [] task_work_run+0xa7/0xe0 >> [] do_notify_resume+0x97/0xb0 >> [] int_signal+0x12/0x17 >> >> The code for unmap uio resources is shown: >> static void pci_dev_uio_unmap(struct rte_pci_device *pci_dev, uint8_t >> port_id) >> { >> int i; >> >> RTE_LOG(INFO, EAL, "begin unmap port %d uio resource! \n", port_id); >> if (NULL == pci_dev) >> { >> RTE_LOG(ERR, EAL, "begin unmap port %d uio resource! \n", >> port_id); >> return; >> } >> >> for (i = 0; i != PCI_MAX_RESOURCE; i++) >> { >> /* skip empty BAR */ >> if (0 == pci_dev->mem_resource[i].phys_addr) >> continue; >> if (munmap(pci_dev->mem_resource[i].addr, pci_dev- >>> mem_resource[i].len) >> == >> -1){ >> RTE_LOG(ERR, EAL, "Error with munmap\n"); >> return; >> } >> } >> if (close(pci_dev->intr_handle.fd) == -1){ >> RTE_LOG(ERR, EAL, "Error closing interrupt handle\n"); >> return; >> } >> pci_dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN; >> RTE_LOG(INFO, EAL, "unmap port %d uio resource successfully!\n", >> port_id); >> } >> >> Does anyone has some ideas? >> >> Thanks for any help. >> Jerry >> >> On 2014/10/14 19:58, Lilijun wrote: >>> Hi Stephen and all, >>> >>> I have a same problem as this older email describes on Aug 14, 2013. >>> Any help will be appreciated. >>> >>> The details is shown as follows. >>> The key step implementation of my demo is: >>> 1. Firstly, call rte_eal_init() to do some initialization. >>> 2. 
Switch the driver of my Intel 82599 NIC from ixgbe.ko to igb_uio.ko >>> like tools/dpdk_nic_bind.py written in C source code. >>> 3. Configure rte_dev and start it. >>> 4. Do some rx/tx tests. >>> 5. call rte_eth_dev_stop(dpdk_port_id) to stop the hardware as your history >> emails. >>> 6. Switch the driver of the NIC from igb_uio.ko to ixgbe.ko. >>> 7. Kill the demo using commands: kill -9. > > Just to clarify one point - you have an application running which was using > the NICs with DPDK while you remove the uio driver and replace it with ixgbe? I would expect doing such a thing to cause problems as stopping the device does not cause the NIC BAR memory to be unmapped from the DPDK process. Therefore removing the driver providing that memory map and getting another driver to start using those same BARs would not be recommended. > Thanks for your reply. Yes, I want to change the NIC driver by replacing the uio driver with ixgbe, in order to return the NIC to its original state as a kernel Ethernet device while keeping the application running. Then my application can use the NICs with DPDK or with the kernel ixgbe driver on demand. I am confused about how to release all uio resources when stopping my application. Could you give me any suggestions for my requirements? > /Bruce > > > . >
[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
Konstantin, In my understanding, a compiler barrier is a kind of software barrier which prevents the compiler from moving memory accesses across the barrier. This should be architecture-independent. And the "sync" instruction is a hardware barrier which depends on the PowerPC architecture. So I think the compiler barrier should be the same on x86 and PowerPC. Any comments? Please correct me if I am wrong. Thanks a lot! Best Regards! -- Chao Zhu From: "Ananyev, Konstantin" To: Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org" Date: 2014/10/16 08:38 Subject: RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture Hi, > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu > Sent: Friday, September 26, 2014 10:36 AM > To: dev at dpdk.org > Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture > > The atomic operations implemented with assembly code in DPDK only > support x86. This patch add architecture specific atomic operations for > IBM Power architecture. > > Signed-off-by: Chao Zhu > --- > .../common/include/powerpc/arch/rte_atomic.h | 387 > .../common/include/powerpc/arch/rte_atomic_arch.h | 318 > 2 files changed, 705 insertions(+), 0 deletions(-) > create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic.h > create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h > ... > + > diff --git a/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h > b/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h > new file mode 100644 > index 000..fe5666e > --- /dev/null > + ... >+#define rte_arch_rmb() asm volatile("sync" : : : "memory") >+ > +#define rte_arch_compiler_barrier() do {\ > + asm volatile ("" : : : "memory"); \ > +} while(0) I don't know much about PPC architecture, but as I remember it uses a weakly-ordering memory model. Is that correct?
If so, then you probably need rte_arch_compiler_barrier() to be "sync" instruction (like mb()s above) . The reason is that IA has much stronger memory ordering model and there are a lot of places in the code where it implies that ordering. For example - ring enqueue/dequeue functions. Konstantin
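To make the distinction in this exchange concrete, here is a minimal sketch of the two kinds of barrier. The macro names my_compiler_barrier() and my_hw_mb() and the toy ring are illustrative assumptions, not the names or code in the patch. The PowerPC branch uses "sync" as in the quoted patch; on other architectures the sketch falls back to the GCC/Clang builtin __sync_synchronize() only so that it compiles everywhere.

```c
#include <stdint.h>

/*
 * Compiler barrier: emits no instruction. It only stops the compiler
 * from moving memory accesses across this point, so it is the same
 * on every architecture.
 */
#define my_compiler_barrier() do {		\
	__asm__ volatile ("" : : : "memory");	\
} while (0)

/* Hardware (full) memory barrier: architecture specific. */
#if defined(__powerpc__) || defined(__powerpc64__)
#define my_hw_mb() __asm__ volatile ("sync" : : : "memory")
#else
/* Assumption: fall back to the GCC/Clang full-barrier builtin. */
#define my_hw_mb() __sync_synchronize()
#endif

/* Toy single-producer ring, only to show where the barrier sits. */
struct slot_ring {
	uint32_t data[8];
	volatile uint32_t tail;
};

static inline void
ring_publish(struct slot_ring *r, uint32_t value)
{
	uint32_t t = r->tail;

	r->data[t & 7] = value;
	/*
	 * The store to data[] must be visible before the store to tail.
	 * On a weakly ordered CPU a compiler barrier alone does not
	 * guarantee that, so a hardware barrier is used here.
	 */
	my_hw_mb();
	r->tail = t + 1;
}
```

Code written with only x86 in mind might use my_compiler_barrier() at that point, which is the kind of place Konstantin's concern is about.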
[dpdk-dev] kernel panic when stop my test demo
> -Original Message- > From: Lilijun [mailto:jerry.lilijun at huawei.com] > Sent: Thursday, October 16, 2014 3:40 AM > To: Richardson, Bruce; dev at dpdk.org; stephen at networkplumber.org > Subject: Re: [dpdk-dev] kernel panic when stop my test demo > > On 2014/10/15 18:08, Richardson, Bruce wrote: > > > > > >> -Original Message- > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Lilijun > >> Sent: Wednesday, October 15, 2014 10:43 AM > >> To: dev at dpdk.org; stephen at networkplumber.org > >> Subject: Re: [dpdk-dev] kernel panic when stop my test demo > >> > >> Hi all, > >> > >> After adding unmap uio resources operations in process signal handler > functions, > >> An new error was found as follows: > >> Call Trace: > >> [] uio_release+0x40/0x60 [uio] > >> [] __fput+0xe9/0x270 > >> [] fput+0xe/0x10 > >> [] task_work_run+0xa7/0xe0 > >> [] do_notify_resume+0x97/0xb0 > >> [] int_signal+0x12/0x17 > >> > >> The code for unmap uio resources is shown: > >> static void pci_dev_uio_unmap(struct rte_pci_device *pci_dev, uint8_t > port_id) > >> { > >> int i; > >> > >> RTE_LOG(INFO, EAL, "begin unmap port %d uio resource! \n", > >> port_id); > >> if (NULL == pci_dev) > >> { > >> RTE_LOG(ERR, EAL, "begin unmap port %d uio resource! \n", > port_id); > >> return; > >> } > >> > >> for (i = 0; i != PCI_MAX_RESOURCE; i++) > >> { > >> /* skip empty BAR */ > >> if (0 == pci_dev->mem_resource[i].phys_addr) > >> continue; > >> if (munmap(pci_dev->mem_resource[i].addr, pci_dev- > >>> mem_resource[i].len) > >> == > >> -1){ > >> RTE_LOG(ERR, EAL, "Error with munmap\n"); > >> return; > >> } > >> } > >> if (close(pci_dev->intr_handle.fd) == -1){ > >> RTE_LOG(ERR, EAL, "Error closing interrupt handle\n"); > >> return; > >> } > >> pci_dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN; > >> RTE_LOG(INFO, EAL, "unmap port %d uio resource successfully!\n", > >> port_id); > >> } > >> > >> Does anyone has some ideas? > >> > >> Thanks for any help. 
> >> Jerry > >> > >> On 2014/10/14 19:58, Lilijun wrote: > >>> Hi Stephen and all, > >>> > >>> I have a same problem as this older email describes on Aug 14, 2013. > >>> Any help will be appreciated. > >>> > >>> The details is shown as follows. > >>> The key step implementation of my demo is: > >>> 1. Firstly, call rte_eal_init() to do some initialization. > >>> 2. Switch the driver of my Intel 82599 NIC from ixgbe.ko to igb_uio.ko > >>> like tools/dpdk_nic_bind.py written in C source code. > >>> 3. Configure rte_dev and start it. > >>> 4. Do some rx/tx tests. > >>> 5. call rte_eth_dev_stop(dpdk_port_id) to stop the hardware as your > >>> history > >> emails. > >>> 6. Switch the driver of the NIC from igb_uio.ko to ixgbe.ko. > >>> 7. Kill the demo using commands: kill -9. > > > > Just to clarify one point - you have an application running which was using > > the > NICs with DPDK while you remove the uio driver and replace it with ixgbe? > I would expect doing such a thing to cause problems as stopping the device > does > not cause the NIC BAR memory to be unmapped from the DPDK process. > Therefore removing the driver providing that memory map and getting another > driver to start using those same BARs would not be recommended. > > > > Thanks for your reply. > Yes, I want to change the NIC driver by replacing the uio driver with ixgbe in > order to recover the NIC to origin kernel ether-net devices while keeping the > application running. > Then my application can use the NICs with DPDK or with kernel ixgbe driver on > demand. > I am confusing with how to release all uio resources when stop my application. > > Would you like to give me any suggestions for my requirements? > Right now, there is no way to do this without changing the internals of the DPDK itself. The BARs from the NIC are mmapped permanently into the processes address space on initialization of the application, and are never released. 
You'd basically need to write code to un-initialize the DPDK and then reinitialize it at a later point. Might an alternative be to actually have two separate applications or binaries that appear as one, or work as one? Then you could shut down the dpdk binary before removing the uio driver, and switch over to the ixgbe driver and use the other application. However, I realise that getting a seamless transition could be difficult there. /Bruce
[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao CH Zhu > Sent: Thursday, October 16, 2014 4:14 AM > To: Ananyev, Konstantin > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power > architecture > > Konstantin, > > In my understanding, compiler barrier is a kind of software barrier which > prevents the compiler from moving memory accesses across the barrier. This > should be architecture-independent. And the "sync" instruction is a > hardware barrier which depends on PowerPC architecture. So I think the > compiler barrier should be the same on x86 and PowerPC. Any comments? > Please correct me if I was wrong. > I would agree with that assessment, as far as it goes, in that a compiler barrier is going to be the same on both architectures. However, we also need to start thinking about actual use cases - how to we specify the barriers in a piece of code where we need a full memory barrier on PPC and only a compiler barrier on IA? My suggestion would be to do first as you propose and have proper primitives for the different barrier types defined correctly for each platform - with the compiler barrier being, presumably, common across each one. Then, as a second step, we probably need to look at defining "logical" barrier types (for want of a better term) that can then be used in the code and which would be different across platforms. Does this make sense to do this way? Is it the best solution? Do we want to define the basic primitives or are we only ever likely to need the logical barrier types? /Bruce
[dpdk-dev] [PATCH v2 0/6] i40e VMDQ support
From: "Chen Jing D(Mark)" v2: - Fix a few typos. - Add comments for RX mq mode flags. - Remove '\n' from some log messages. - Remove 'Acked-by' in commit log. v1: Define extra VMDQ arguments to expand VMDQ configuration. This also includes changes in the igb and ixgbe PMD drivers. In the meantime, fix 2 defects in the rte_ether library. Add full VMDQ support in the i40e PMD driver. Rename some functions, and set up the VMDQ VSI after it's enabled in the application. It also makes some improvements to macaddr add/delete to support setting multiple macaddrs for single or multiple pools. Finally, change i40e rx/tx_queue_setup and dev_start/stop functions to configure/switch queues belonging to VMDQ pools. Chen Jing D(Mark) (6): ether: enhancement for VMDQ support igb: change for VMDQ arguments expansion ixgbe: change for VMDQ arguments expansion i40e: add VMDQ support i40e: macaddr add/del enhancement i40e: Add full VMDQ pools support config/common_linuxapp |1 + lib/librte_ether/rte_ethdev.c | 12 +- lib/librte_ether/rte_ethdev.h | 43 +++- lib/librte_pmd_e1000/igb_ethdev.c |3 + lib/librte_pmd_i40e/i40e_ethdev.c | 499 ++- lib/librte_pmd_i40e/i40e_ethdev.h | 21 ++- lib/librte_pmd_i40e/i40e_rxtx.c | 125 +++-- lib/librte_pmd_ixgbe/ixgbe_ethdev.c |1 + 8 files changed, 536 insertions(+), 169 deletions(-) -- 1.7.7.6
[dpdk-dev] [PATCH v2 2/6] igb: change for VMDQ arguments expansion
From: "Chen Jing D(Mark)" Assign new VMDQ arguments with correct values. Signed-off-by: Chen Jing D(Mark) --- lib/librte_pmd_e1000/igb_ethdev.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/lib/librte_pmd_e1000/igb_ethdev.c b/lib/librte_pmd_e1000/igb_ethdev.c index c9acdc5..dc0ea6d 100644 --- a/lib/librte_pmd_e1000/igb_ethdev.c +++ b/lib/librte_pmd_e1000/igb_ethdev.c @@ -1286,18 +1286,21 @@ eth_igb_infos_get(struct rte_eth_dev *dev, dev_info->max_rx_queues = 16; dev_info->max_tx_queues = 16; dev_info->max_vmdq_pools = ETH_8_POOLS; + dev_info->vmdq_queue_num = 16; break; case e1000_82580: dev_info->max_rx_queues = 8; dev_info->max_tx_queues = 8; dev_info->max_vmdq_pools = ETH_8_POOLS; + dev_info->vmdq_queue_num = 8; break; case e1000_i350: dev_info->max_rx_queues = 8; dev_info->max_tx_queues = 8; dev_info->max_vmdq_pools = ETH_8_POOLS; + dev_info->vmdq_queue_num = 8; break; case e1000_i354: -- 1.7.7.6
[dpdk-dev] [PATCH v2 1/6] ether: enhancement for VMDQ support
From: "Chen Jing D(Mark)" The change includes several parts: 1. Clear pool bitmap when trying to remove specific MAC. 2. Define RSS, DCB and VMDQ flags to combine rx_mq_mode. 3. Use 'struct' to replace 'union', which to expand the rx_adv_conf arguments to better support RSS, DCB and VMDQ. 4. Fix bug in rte_eth_dev_config_restore function, which will restore all MAC address to default pool. 5. Define additional 3 arguments for better VMDQ support. Signed-off-by: Chen Jing D(Mark) --- lib/librte_ether/rte_ethdev.c | 12 ++ lib/librte_ether/rte_ethdev.h | 43 ++-- 2 files changed, 39 insertions(+), 16 deletions(-) diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index fd1010a..86f4409 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -771,7 +771,8 @@ rte_eth_dev_config_restore(uint8_t port_id) continue; /* add address to the hardware */ - if (*dev->dev_ops->mac_addr_add) + if (*dev->dev_ops->mac_addr_add && + dev->data->mac_pool_sel[i] & (1ULL << pool)) (*dev->dev_ops->mac_addr_add)(dev, &addr, i, pool); else { PMD_DEBUG_TRACE("port %d: MAC address array not supported\n", @@ -1249,10 +1250,8 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info) } dev = &rte_eth_devices[port_id]; - /* Default device offload capabilities to zero */ - dev_info->rx_offload_capa = 0; - dev_info->tx_offload_capa = 0; - dev_info->if_index = 0; + /* Set all fields with zero */ + memset(dev_info, 0, sizeof(*dev_info)); FUNC_PTR_OR_RET(*dev->dev_ops->dev_infos_get); (*dev->dev_ops->dev_infos_get)(dev, dev_info); dev_info->pci_dev = dev->pci_dev; @@ -2022,6 +2021,9 @@ rte_eth_dev_mac_addr_remove(uint8_t port_id, struct ether_addr *addr) /* Update address in NIC data structure */ ether_addr_copy(&null_mac_addr, &dev->data->mac_addrs[index]); + /* reset pool bitmap */ + dev->data->mac_pool_sel[index] = 0; + return 0; } diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index 50df654..4c83aa5 
100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -252,20 +252,37 @@ struct rte_eth_thresh { }; /** + * Simple flags to indicate RX mq mode, which can be used independently or combined + * in enum rte_eth_rx_mq_mode definition. + */ +#define ETH_MQ_RX_RSS_FLAG 0x1 +#define ETH_MQ_RX_DCB_FLAG 0x2 +#define ETH_MQ_RX_VMDQ_FLAG 0x4 + +/** * A set of values to identify what method is to be used to route * packets to multiple queues. */ enum rte_eth_rx_mq_mode { - ETH_MQ_RX_NONE = 0, /**< None of DCB,RSS or VMDQ mode */ - - ETH_MQ_RX_RSS, /**< For RX side, only RSS is on */ - ETH_MQ_RX_DCB, /**< For RX side,only DCB is on. */ - ETH_MQ_RX_DCB_RSS, /**< Both DCB and RSS enable */ - - ETH_MQ_RX_VMDQ_ONLY, /**< Only VMDQ, no RSS nor DCB */ - ETH_MQ_RX_VMDQ_RSS, /**< RSS mode with VMDQ */ - ETH_MQ_RX_VMDQ_DCB, /**< Use VMDQ+DCB to route traffic to queues */ - ETH_MQ_RX_VMDQ_DCB_RSS, /**< Enable both VMDQ and DCB in VMDq */ + /**< None of DCB,RSS or VMDQ mode */ + ETH_MQ_RX_NONE = 0, + + /**< For RX side, only RSS is on */ + ETH_MQ_RX_RSS = ETH_MQ_RX_RSS_FLAG, + /**< For RX side,only DCB is on. */ + ETH_MQ_RX_DCB = ETH_MQ_RX_DCB_FLAG, + /**< Both DCB and RSS enable */ + ETH_MQ_RX_DCB_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_DCB_FLAG, + + /**< Only VMDQ, no RSS nor DCB */ + ETH_MQ_RX_VMDQ_ONLY = ETH_MQ_RX_VMDQ_FLAG, + /**< RSS mode with VMDQ */ + ETH_MQ_RX_VMDQ_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_VMDQ_FLAG, + /**< Use VMDQ+DCB to route traffic to queues */ + ETH_MQ_RX_VMDQ_DCB = ETH_MQ_RX_VMDQ_FLAG | ETH_MQ_RX_DCB_FLAG, + /**< Enable both VMDQ and DCB in VMDq */ + ETH_MQ_RX_VMDQ_DCB_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_DCB_FLAG | +ETH_MQ_RX_VMDQ_FLAG, }; /** @@ -840,7 +857,7 @@ struct rte_eth_conf { Read the datasheet of given ethernet controller for details. The possible values of this field are defined in implementation of each driver. 
*/ - union { + struct { struct rte_eth_rss_conf rss_conf; /**< Port RSS configuration */ struct rte_eth_vmdq_dcb_conf vmdq_dcb_conf; /**< Port vmdq+dcb configuration. */ @@ -906,6 +923,10 @@ struct rte_eth_dev_info { uint16_t max_vmdq_pools; /**< Maximum number of VMDq pools. */ uint32_t rx_offload_capa; /**< Device RX offload capabilities. */ uint32_t tx_offload_capa; /**< Device
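The point of the flag-based encoding of rx_mq_mode is that a driver can test for one capability with a single mask instead of listing every combined mode. A small standalone sketch using the values from the patch above; the helper mq_mode_has_vmdq() is hypothetical and only there to show the test.

```c
/* Flag values and enum as defined in the patch. */
#define ETH_MQ_RX_RSS_FLAG  0x1
#define ETH_MQ_RX_DCB_FLAG  0x2
#define ETH_MQ_RX_VMDQ_FLAG 0x4

enum rte_eth_rx_mq_mode {
	ETH_MQ_RX_NONE = 0,
	ETH_MQ_RX_RSS = ETH_MQ_RX_RSS_FLAG,
	ETH_MQ_RX_DCB = ETH_MQ_RX_DCB_FLAG,
	ETH_MQ_RX_DCB_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_DCB_FLAG,
	ETH_MQ_RX_VMDQ_ONLY = ETH_MQ_RX_VMDQ_FLAG,
	ETH_MQ_RX_VMDQ_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_VMDQ_FLAG,
	ETH_MQ_RX_VMDQ_DCB = ETH_MQ_RX_VMDQ_FLAG | ETH_MQ_RX_DCB_FLAG,
	ETH_MQ_RX_VMDQ_DCB_RSS = ETH_MQ_RX_RSS_FLAG | ETH_MQ_RX_DCB_FLAG |
				 ETH_MQ_RX_VMDQ_FLAG,
};

/* Hypothetical helper: true for every mode that includes VMDQ. */
static inline int
mq_mode_has_vmdq(enum rte_eth_rx_mq_mode mode)
{
	return (mode & ETH_MQ_RX_VMDQ_FLAG) != 0;
}
```

With the old sequential enum values, the same check would have needed a comparison against each of the four VMDQ modes.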
[dpdk-dev] [PATCH v2 3/6] ixgbe: change for VMDQ arguments expansion
From: "Chen Jing D(Mark)" Assign new VMDQ arguments with correct values. Signed-off-by: Chen Jing D(Mark) --- lib/librte_pmd_ixgbe/ixgbe_ethdev.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c index f4b590b..d0f9bcb 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c @@ -1933,6 +1933,7 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) dev_info->max_vmdq_pools = ETH_16_POOLS; else dev_info->max_vmdq_pools = ETH_64_POOLS; + dev_info->vmdq_queue_num = dev_info->max_rx_queues; dev_info->rx_offload_capa = DEV_RX_OFFLOAD_VLAN_STRIP | DEV_RX_OFFLOAD_IPV4_CKSUM | -- 1.7.7.6
[dpdk-dev] [PATCH v2 5/6] i40e: macaddr add/del enhancement
From: "Chen Jing D(Mark)" Change i40e_macaddr_add and i40e_macaddr_remove functions to support multiple macaddr add/delete. In the meanwhile, support macaddr ops on different pools. Signed-off-by: Chen Jing D(Mark) --- lib/librte_pmd_i40e/i40e_ethdev.c | 89 +--- 1 files changed, 42 insertions(+), 47 deletions(-) diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c index ad65e25..c0e9f48 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.c +++ b/lib/librte_pmd_i40e/i40e_ethdev.c @@ -1532,45 +1532,37 @@ i40e_priority_flow_ctrl_set(__rte_unused struct rte_eth_dev *dev, static void i40e_macaddr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr, -__attribute__((unused)) uint32_t index, -__attribute__((unused)) uint32_t pool) +__rte_unused uint32_t index, +uint32_t pool) { struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); - struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private); - struct i40e_vsi *vsi = pf->main_vsi; - struct ether_addr old_mac; + struct i40e_vsi *vsi; int ret; - if (!is_valid_assigned_ether_addr(mac_addr)) { - PMD_DRV_LOG(ERR, "Invalid ethernet address"); - return; - } - - if (is_same_ether_addr(mac_addr, &(pf->dev_addr))) { - PMD_DRV_LOG(INFO, "Ignore adding permanent mac address"); + /* If VMDQ not enabled or configured, return */ + if (pool != 0 && (!(pf->flags | I40E_FLAG_VMDQ) || !pf->nb_cfg_vmdq_vsi)) { + PMD_DRV_LOG(ERR, "VMDQ not %s, can't set mac to pool %u", + pf->flags | I40E_FLAG_VMDQ ? "configured" : "enabled", + pool); return; } - /* Write mac address */ - ret = i40e_aq_mac_address_write(hw, I40E_AQC_WRITE_TYPE_LAA_ONLY, - mac_addr->addr_bytes, NULL); - if (ret != I40E_SUCCESS) { - PMD_DRV_LOG(ERR, "Failed to write mac address"); + if (pool > pf->nb_cfg_vmdq_vsi) { + PMD_DRV_LOG(ERR, "Pool number %u invalid. 
Max pool is %u", + pool, pf->nb_cfg_vmdq_vsi); return; } - (void)rte_memcpy(&old_mac, hw->mac.addr, ETHER_ADDR_LEN); - (void)rte_memcpy(hw->mac.addr, mac_addr->addr_bytes, - ETHER_ADDR_LEN); + if (pool == 0) + vsi = pf->main_vsi; + else + vsi = pf->vmdq[pool - 1].vsi; ret = i40e_vsi_add_mac(vsi, mac_addr); if (ret != I40E_SUCCESS) { PMD_DRV_LOG(ERR, "Failed to add MACVLAN filter"); return; } - - ether_addr_copy(mac_addr, &pf->dev_addr); - i40e_vsi_delete_mac(vsi, &old_mac); } /* Remove a MAC address, and update filters */ @@ -1578,36 +1570,39 @@ static void i40e_macaddr_remove(struct rte_eth_dev *dev, uint32_t index) { struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); - struct i40e_vsi *vsi = pf->main_vsi; - struct rte_eth_dev_data *data = I40E_VSI_TO_DEV_DATA(vsi); + struct i40e_vsi *vsi; + struct rte_eth_dev_data *data = dev->data; struct ether_addr *macaddr; int ret; - struct i40e_hw *hw = - I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private); - - if (index >= vsi->max_macaddrs) - return; + uint32_t i; + uint64_t pool_sel; macaddr = &(data->mac_addrs[index]); - if (!is_valid_assigned_ether_addr(macaddr)) - return; - - ret = i40e_aq_mac_address_write(hw, I40E_AQC_WRITE_TYPE_LAA_ONLY, - hw->mac.perm_addr, NULL); - if (ret != I40E_SUCCESS) { - PMD_DRV_LOG(ERR, "Failed to write mac address"); - return; - } - - (void)rte_memcpy(hw->mac.addr, hw->mac.perm_addr, ETHER_ADDR_LEN); - ret = i40e_vsi_delete_mac(vsi, macaddr); - if (ret != I40E_SUCCESS) - return; + pool_sel = dev->data->mac_pool_sel[index]; + + for (i = 0; i < sizeof(pool_sel) * CHAR_BIT; i++) { + if (pool_sel & (1ULL << i)) { + if (i == 0) + vsi = pf->main_vsi; + else { + /* No VMDQ pool enabled or configured */ + if (!(pf->flags | I40E_FLAG_VMDQ) || + (i > pf->nb_cfg_vmdq_vsi)) { + PMD_DRV_LOG(ERR, "No VMDQ pool enabled" + "/configured"); + return; + } + vsi = pf->vmdq[i - 1].vsi; + } +
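The walk over the per-address pool bitmap in i40e_macaddr_remove can be shown in isolation. In this sketch each set bit i of a 64-bit pool_sel means the address is assigned to pool i, with bit 0 being the main VSI as in the patch. The helpers pool_sel_to_list() and flag_is_set() are hypothetical. Separately, note that testing whether a flag is set needs a bitwise AND: an expression of the form (flags | FLAG) is non-zero whenever FLAG is non-zero, so it cannot serve as a membership test.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: expand a pool bitmap into a list of pool
 * indices. Writes at most `max` entries and returns how many were
 * written.
 */
static unsigned int
pool_sel_to_list(uint64_t pool_sel, unsigned int *pools, unsigned int max)
{
	unsigned int i, n = 0;

	for (i = 0; i < sizeof(pool_sel) * CHAR_BIT && n < max; i++) {
		if (pool_sel & (1ULL << i))
			pools[n++] = i;
	}
	return n;
}

/* Hypothetical helper: membership test uses '&', not '|'. */
static int
flag_is_set(uint64_t flags, uint64_t flag)
{
	return (flags & flag) != 0;
}
```

For example, a pool_sel of 0x5 expands to pools 0 and 2, i.e. the main VSI and the second VMDQ pool under the patch's "pool - 1" indexing.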
[dpdk-dev] [PATCH v2 4/6] i40e: add VMDQ support
From: "Chen Jing D(Mark)" The change includes several parts: 1. Get maximum number of VMDQ pools supported in dev_init. 2. Fill VMDQ info in i40e_dev_info_get. 3. Setup VMDQ pools in i40e_dev_configure. 4. i40e_vsi_setup change to support creation of VMDQ VSI. Signed-off-by: Chen Jing D(Mark) --- config/common_linuxapp|1 + lib/librte_pmd_i40e/i40e_ethdev.c | 237 - lib/librte_pmd_i40e/i40e_ethdev.h | 17 +++- 3 files changed, 225 insertions(+), 30 deletions(-) diff --git a/config/common_linuxapp b/config/common_linuxapp index 5bee910..d0bb3f7 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -208,6 +208,7 @@ CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC=y CONFIG_RTE_LIBRTE_I40E_ALLOW_UNSUPPORTED_SFP=n CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=4 +CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM=4 # interval up to 8160 us, aligned to 2 (or default value) CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1 diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c index a00d6ca..ad65e25 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.c +++ b/lib/librte_pmd_i40e/i40e_ethdev.c @@ -168,6 +168,7 @@ static int i40e_get_cap(struct i40e_hw *hw); static int i40e_pf_parameter_init(struct rte_eth_dev *dev); static int i40e_pf_setup(struct i40e_pf *pf); static int i40e_vsi_init(struct i40e_vsi *vsi); +static int i40e_vmdq_setup(struct rte_eth_dev *dev); static void i40e_stat_update_32(struct i40e_hw *hw, uint32_t reg, bool offset_loaded, uint64_t *offset, uint64_t *stat); static void i40e_stat_update_48(struct i40e_hw *hw, @@ -269,21 +270,11 @@ static struct eth_driver rte_i40e_pmd = { }; static inline int -i40e_prev_power_of_2(int n) +i40e_align_floor(int n) { - int p = n; - - --p; - p |= p >> 1; - p |= p >> 2; - p |= p >> 4; - p |= p >> 8; - p |= p >> 16; - if (p == (n - 1)) - return n; - p >>= 1; - - return ++p; + if (n == 0) + return 0; + return (1 << (sizeof(n) * CHAR_BIT - 1 - __builtin_clz(n))); } static inline int @@ 
-500,7 +491,7 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv, if (!dev->data->mac_addrs) { PMD_INIT_LOG(ERR, "Failed to allocated memory " "for storing mac address"); - goto err_get_mac_addr; + goto err_mac_alloc; } ether_addr_copy((struct ether_addr *)hw->mac.perm_addr, &dev->data->mac_addrs[0]); @@ -521,8 +512,9 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv, return 0; +err_mac_alloc: + i40e_vsi_release(pf->main_vsi); err_setup_pf_switch: - rte_free(pf->main_vsi); err_get_mac_addr: err_configure_lan_hmc: (void)i40e_shutdown_lan_hmc(hw); @@ -541,6 +533,27 @@ err_get_capabilities: static int i40e_dev_configure(struct rte_eth_dev *dev) { + int ret; + enum rte_eth_rx_mq_mode mq_mode = dev->data->dev_conf.rxmode.mq_mode; + + /* VMDQ setup. +* Needs to move VMDQ setting out of i40e_pf_config_mq_rx() as VMDQ and +* RSS setting have different requirements. +* General PMD driver call sequence are NIC init, configure, +* rx/tx_queue_setup and dev_start. In rx/tx_queue_setup() function, it +* will try to lookup the VSI that specific queue belongs to if VMDQ +* applicable. So, VMDQ setting has to be done before +* rx/tx_queue_setup(). This function is good to place vmdq_setup. +* For RSS setting, it will try to calculate actual configured RX queue +* number, which will be available after rx_queue_setup(). dev_start() +* function is good to place RSS setup. 
+*/ + if (mq_mode & ETH_MQ_RX_VMDQ_FLAG) { + ret = i40e_vmdq_setup(dev); + if (ret) + return ret; + } + return i40e_dev_init_vlan(dev); } @@ -1389,6 +1402,16 @@ i40e_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) DEV_TX_OFFLOAD_UDP_CKSUM | DEV_TX_OFFLOAD_TCP_CKSUM | DEV_TX_OFFLOAD_SCTP_CKSUM; + + if (pf->flags & I40E_FLAG_VMDQ) { + dev_info->max_vmdq_pools = pf->max_nb_vmdq_vsi; + dev_info->vmdq_queue_base = dev_info->max_rx_queues; + dev_info->vmdq_queue_num = pf->vmdq_nb_qps * + pf->max_nb_vmdq_vsi; + dev_info->vmdq_pool_base = I40E_VMDQ_POOL_BASE; + dev_info->max_rx_queues += dev_info->vmdq_queue_num; + dev_info->max_tx_queues += dev_info->vmdq_queue_num; + } } static int @@ -1814,7 +1837,7 @@ i40e_pf_parameter_init(struct rte_eth_dev *dev) { struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->d
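For reference, the round-down-to-power-of-two step that i40e_align_floor() performs in the patch above can be tried on its own. What follows is only a minimal sketch: the name align_floor_pow2 is hypothetical, it assumes a non-negative argument, and it assumes a compiler that provides __builtin_clz (GCC or Clang).

```c
#include <limits.h>

/*
 * Hypothetical standalone version of the helper: round n down to the
 * nearest power of two. Zero and negative inputs yield 0.
 */
static inline int
align_floor_pow2(int n)
{
	if (n <= 0)
		return 0; /* __builtin_clz(0) is undefined, so handle it first */
	/* Index of the highest set bit = (bit width - 1) - leading zeros. */
	return 1 << (sizeof(n) * CHAR_BIT - 1 - __builtin_clz((unsigned int)n));
}
```

An exact power of two is returned unchanged, and any other positive value is rounded down to the power of two below it.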
[dpdk-dev] [PATCH v2 6/6] i40e: Add full VMDQ pools support
From: "Chen Jing D(Mark)" 1. Rename the i40e_vsi_* functions to i40e_dev_*, since a PF can contain more than one VSI once VMDQ is enabled. 2. Change i40e_dev_rx/tx_queue_setup so that it can set up queues that belong to VMDQ pools. 3. Add queue mapping, which converts between the queue index used by the application and the real NIC queue index. 4. Change i40e_dev_start/stop so that they can switch VMDQ queues. 5. Change i40e_pf_config_rss to calculate the actual number of main VSI queues after VMDQ pools are introduced. Signed-off-by: Chen Jing D(Mark) --- lib/librte_pmd_i40e/i40e_ethdev.c | 175 ++--- lib/librte_pmd_i40e/i40e_ethdev.h |4 +- lib/librte_pmd_i40e/i40e_rxtx.c | 125 ++- 3 files changed, 227 insertions(+), 77 deletions(-) diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c index c0e9f48..cf303d0 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.c +++ b/lib/librte_pmd_i40e/i40e_ethdev.c @@ -167,7 +167,7 @@ static int i40e_dev_rss_reta_query(struct rte_eth_dev *dev, static int i40e_get_cap(struct i40e_hw *hw); static int i40e_pf_parameter_init(struct rte_eth_dev *dev); static int i40e_pf_setup(struct i40e_pf *pf); -static int i40e_vsi_init(struct i40e_vsi *vsi); +static int i40e_dev_rxtx_init(struct i40e_pf *pf); static int i40e_vmdq_setup(struct rte_eth_dev *dev); static void i40e_stat_update_32(struct i40e_hw *hw, uint32_t reg, bool offset_loaded, uint64_t *offset, uint64_t *stat); @@ -770,8 +770,8 @@ i40e_dev_start(struct rte_eth_dev *dev) { struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private); - struct i40e_vsi *vsi = pf->main_vsi; - int ret; + struct i40e_vsi *main_vsi = pf->main_vsi; + int ret, i; if ((dev->data->dev_conf.link_duplex != ETH_LINK_AUTONEG_DUPLEX) && (dev->data->dev_conf.link_duplex != ETH_LINK_FULL_DUPLEX)) { @@ -782,26 +782,37 @@ i40e_dev_start(struct rte_eth_dev *dev) } /* Initialize VSI */ - ret = i40e_vsi_init(vsi); + ret = 
i40e_dev_rxtx_init(pf); if (ret != I40E_SUCCESS) { - PMD_DRV_LOG(ERR, "Failed to init VSI"); + PMD_DRV_LOG(ERR, "Failed to init rx/tx queues"); goto err_up; } /* Map queues with MSIX interrupt */ - i40e_vsi_queues_bind_intr(vsi); - i40e_vsi_enable_queues_intr(vsi); + i40e_vsi_queues_bind_intr(main_vsi); + i40e_vsi_enable_queues_intr(main_vsi); + + /* Map VMDQ VSI queues with MSIX interrupt */ + for (i = 0; i < pf->nb_cfg_vmdq_vsi; i++) { + i40e_vsi_queues_bind_intr(pf->vmdq[i].vsi); + i40e_vsi_enable_queues_intr(pf->vmdq[i].vsi); + } /* Enable all queues which have been configured */ - ret = i40e_vsi_switch_queues(vsi, TRUE); + ret = i40e_dev_switch_queues(pf, TRUE); if (ret != I40E_SUCCESS) { PMD_DRV_LOG(ERR, "Failed to enable VSI"); goto err_up; } /* Enable receiving broadcast packets */ - if ((vsi->type == I40E_VSI_MAIN) || (vsi->type == I40E_VSI_VMDQ2)) { - ret = i40e_aq_set_vsi_broadcast(hw, vsi->seid, true, NULL); + ret = i40e_aq_set_vsi_broadcast(hw, main_vsi->seid, true, NULL); + if (ret != I40E_SUCCESS) + PMD_DRV_LOG(INFO, "fail to set vsi broadcast"); + + for (i = 0; i < pf->nb_cfg_vmdq_vsi; i++) { + ret = i40e_aq_set_vsi_broadcast(hw, pf->vmdq[i].vsi->seid, + true, NULL); if (ret != I40E_SUCCESS) PMD_DRV_LOG(INFO, "fail to set vsi broadcast"); } @@ -816,7 +827,8 @@ i40e_dev_start(struct rte_eth_dev *dev) return I40E_SUCCESS; err_up: - i40e_vsi_switch_queues(vsi, FALSE); + i40e_dev_switch_queues(pf, FALSE); + i40e_dev_clear_queues(dev); return ret; } @@ -825,17 +837,26 @@ static void i40e_dev_stop(struct rte_eth_dev *dev) { struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); - struct i40e_vsi *vsi = pf->main_vsi; + struct i40e_vsi *main_vsi = pf->main_vsi; + int i; /* Disable all queues */ - i40e_vsi_switch_queues(vsi, FALSE); + i40e_dev_switch_queues(pf, FALSE); + + /* un-map queues with interrupt registers */ + i40e_vsi_disable_queues_intr(main_vsi); + i40e_vsi_queues_unbind_intr(main_vsi); + + for (i = 0; i < pf->nb_cfg_vmdq_vsi; i++) 
{ + i40e_vsi_disable_queues_intr(pf->vmdq[i].vsi); + i40e_vsi_queues_unbind_intr(pf->vmdq[i].vsi); + } + + /* Clear all queues and release memory */ + i40e_dev_clear_queues(dev); /* Set link down */ i40e_dev_set_link_down(dev); - - /* un-map que
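The queue mapping described in the commit message (application queue index to a queue inside a particular VSI) could look roughly like the sketch below. This is not the driver's code: every struct, field, and function name here is hypothetical, and the layout (main VSI queues first, then equal-sized blocks per VMDQ pool) is an assumption based on how vmdq_queue_base and vmdq_queue_num are reported in patch 4/6.

```c
#include <stdint.h>

/* Hypothetical description of how application queue indexes are laid out. */
struct queue_layout {
	uint16_t vmdq_queue_base; /* first application index owned by a pool */
	uint16_t queues_per_pool; /* number of queues in each VMDQ pool */
	uint16_t nb_pools;        /* number of configured VMDQ pools */
};

/* Hypothetical result: which VSI owns the queue, and where inside it. */
struct queue_location {
	int pool;        /* -1 for the main VSI, otherwise the pool index */
	uint16_t offset; /* queue index relative to that VSI */
};

/* Returns 0 on success, -1 if app_idx does not belong to any VSI. */
static int
map_queue(const struct queue_layout *l, uint16_t app_idx,
	  struct queue_location *out)
{
	uint16_t rel;

	if (app_idx < l->vmdq_queue_base) {
		/* Below the VMDQ base: the queue belongs to the main VSI. */
		out->pool = -1;
		out->offset = app_idx;
		return 0;
	}

	rel = (uint16_t)(app_idx - l->vmdq_queue_base);
	if (l->queues_per_pool == 0 ||
	    rel / l->queues_per_pool >= l->nb_pools)
		return -1;

	out->pool = rel / l->queues_per_pool;
	out->offset = rel % l->queues_per_pool;
	return 0;
}
```

With a base of 4, 4 queues per pool and 2 pools, application queue 2 stays in the main VSI, queue 9 maps to pool 1 at offset 1, and queue 12 is rejected as out of range.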
[dpdk-dev] [PATCH v5 2/8]i40e:support VxLAN packet identification in librte_pmd_i40e
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of De Lara Guarch, > Pablo > Sent: Monday, October 13, 2014 5:13 PM > To: Liu, Jijiang; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v5 2/8]i40e:support VxLAN packet > identification in librte_pmd_i40e > > > > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu > > Sent: Saturday, October 11, 2014 6:55 AM > > To: dev at dpdk.org > > Subject: [dpdk-dev] [PATCH v5 2/8]i40e:support VxLAN packet > identification > > in librte_pmd_i40e > > > > Support tunneling UDP port configuration on i40e in librte_pmd_i40e. > > Currently, only VxLAN is implemented, which include > > - VxLAN UDP port initialization > > - Implement the APIs to configure VxLAN UDP port in librte_pmd_i40e. > > > > Signed-off-by: Jijiang Liu > > Acked-by: Helin Zhang > > Acked-by: Jingjing Wu > > Acked-by: Jing Chen > > [...] > > index 7c5b6a8..369bc3b 100644 > > --- a/lib/librte_pmd_i40e/i40e_rxtx.c > > +++ b/lib/librte_pmd_i40e/i40e_rxtx.c > > @@ -638,6 +638,10 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq) > > pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1); > > pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1); > > mb->ol_flags = pkt_flags; > > + > > + mb->packet_type = (uint16_t)((qword1 & > > + I40E_RXD_QW1_PTYPE_MASK) >> > > + I40E_RXD_QW1_PTYPE_SHIFT); > > if (pkt_flags & PKT_RX_RSS_HASH) > > mb->hash.rss = rte_le_to_cpu_32(\ > > rxdp->wb.qword0.hi_dword.rss); > > @@ -873,6 +877,8 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf > > **rx_pkts, uint16_t nb_pkts) > > pkt_flags = i40e_rxd_status_to_pkt_flags(qword1); > > pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1); > > pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1); > > + rxm->packet_type = (uint16_t)((qword1 & > > I40E_RXD_QW1_PTYPE_MASK) >> > > + I40E_RXD_QW1_PTYPE_SHIFT); > > rxm->ol_flags = pkt_flags; > > if (pkt_flags & PKT_RX_RSS_HASH) > > rxm->hash.rss = > > @@ -1027,6 +1033,9 @@ i40e_recv_scattered_pkts(void 
*rx_queue, > > pkt_flags = i40e_rxd_status_to_pkt_flags(qword1); > > pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1); > > pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1); > > + first_seg->packet_type = (uint8_t)((qword1 & > > + I40E_RXD_QW1_PTYPE_MASK) >> > > + I40E_RXD_QW1_PTYPE_SHIFT); Another comment is that packet_type is uint16_t, so you should change that uint8_t to uint16_t. Thanks! > > first_seg->ol_flags = pkt_flags; > > if (pkt_flags & PKT_RX_RSS_HASH) > > rxm->hash.rss = > > -- > > 1.7.7.6
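To illustrate the review comment about the cast width, here is a small sketch of a mask-and-shift field extraction. The mask and shift values are made up for the demonstration (a 9-bit field at bit 30) and are not the real I40E_RXD_QW1_PTYPE_MASK and I40E_RXD_QW1_PTYPE_SHIFT definitions; whether the narrower cast actually loses bits in the driver depends on the real field width, which is not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define DEMO_PTYPE_SHIFT 30
#define DEMO_PTYPE_MASK  (0x1FFULL << DEMO_PTYPE_SHIFT) /* 9-bit field */

/* Cast matches the 16-bit destination: the whole field is kept. */
static inline uint16_t
ptype_as_u16(uint64_t qword1)
{
	return (uint16_t)((qword1 & DEMO_PTYPE_MASK) >> DEMO_PTYPE_SHIFT);
}

/* Cast is narrower than the destination: bit 8 of the field is dropped. */
static inline uint16_t
ptype_via_u8(uint64_t qword1)
{
	return (uint8_t)((qword1 & DEMO_PTYPE_MASK) >> DEMO_PTYPE_SHIFT);
}
```

Using the same cast in all three receive paths also keeps them consistent with the type of the packet_type field.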
[dpdk-dev] [PATCH v2 5/7] Split spinlock operations to architecture specific
This patch splits the spinlock operations out of the common DPDK code and moves them into architecture-specific arch directories, so that support for other processor architectures can be added easily. Signed-off-by: Chao Zhu --- lib/librte_eal/common/Makefile |4 +- .../common/include/arch/i686/rte_spinlock.h| 180 ++ .../common/include/arch/x86_64/rte_spinlock.h | 180 ++ .../common/include/generic/rte_spinlock.h | 169 + lib/librte_eal/common/include/rte_spinlock.h | 258 5 files changed, 531 insertions(+), 260 deletions(-) create mode 100644 lib/librte_eal/common/include/arch/i686/rte_spinlock.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h create mode 100644 lib/librte_eal/common/include/generic/rte_spinlock.h delete mode 100644 lib/librte_eal/common/include/rte_spinlock.h diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile index 6cf7505..9b9a73d 100644 --- a/lib/librte_eal/common/Makefile +++ b/lib/librte_eal/common/Makefile @@ -35,7 +35,7 @@ INC := rte_branch_prediction.h rte_common.h INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h -INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h +INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h INC += rte_eal_memconfig.h rte_malloc_heap.h INC += rte_hexdump.h rte_devargs.h rte_dev.h @@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y) INC += rte_warnings.h endif -GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h +GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_spinlock.h ARCH_INC := $(GENERIC_INC) rte_prefetch.h SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC)) diff --git a/lib/librte_eal/common/include/arch/i686/rte_spinlock.h b/lib/librte_eal/common/include/arch/i686/rte_spinlock.h new 
file mode 100644 index 000..f61e31c --- /dev/null +++ b/lib/librte_eal/common/include/arch/i686/rte_spinlock.h @@ -0,0 +1,180 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_SPINLOCK_I686_H_ +#define _RTE_SPINLOCK_I686_H_ + +/** + * @file + * + * RTE Spinlocks + * + * This file defines an API for read-write locks, which are implemented + * in an architecture-specific way. 
This kind of lock simply waits in + * a loop repeatedly checking until the lock becomes available. + * + * All locks must be initialised before use, and only initialised once. + * + */ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_spinlock.h" + +#ifndef RTE_FORCE_INTRINSICS +/** + * Take the spinlock. + * + * @param sl + * A pointer to the spinlock. + */ +static inline void +rte_spinlock_lock(rte_spinlock_t *sl) +{ + int lock_val = 1; + asm volatile ( + "1:\n" + "xchg %[locked], %[lv]\n" + "test %[lv], %[lv]\n" + "jz 3f\n" + "2:\n" + "pause\n" + "cmpl $0, %[locked]\n" +
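The x86 assembly above is guarded by #ifndef RTE_FORCE_INTRINSICS, which suggests a compiler-intrinsic path for other builds. A rough idea of what an intrinsic-based lock can look like is sketched below. It is not the code from generic/rte_spinlock.h: the demo_ names are hypothetical, and it assumes the GCC/Clang __sync builtins. Like the assembly, it swaps in a 1 and then spins on plain reads until the lock looks free.

```c
/* Hypothetical lock type; 0 means free, 1 means held. */
typedef struct {
	volatile int locked;
} demo_spinlock_t;

static inline void
demo_spinlock_init(demo_spinlock_t *sl)
{
	sl->locked = 0;
}

static inline void
demo_spinlock_lock(demo_spinlock_t *sl)
{
	/* Atomically store 1; an old value of 0 means we now own the lock. */
	while (__sync_lock_test_and_set(&sl->locked, 1)) {
		/* Lock was held: wait on plain reads before retrying. */
		while (sl->locked)
			;
	}
}

/* Returns 1 if the lock was taken, 0 if it was already held. */
static inline int
demo_spinlock_trylock(demo_spinlock_t *sl)
{
	return __sync_lock_test_and_set(&sl->locked, 1) == 0;
}

static inline void
demo_spinlock_unlock(demo_spinlock_t *sl)
{
	/* Store 0 with release semantics. */
	__sync_lock_release(&sl->locked);
}
```

On a weakly ordered architecture the acquire and release semantics of these builtins matter; that is the same class of concern raised in the earlier review of the Power barrier macros.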
[dpdk-dev] [PATCH v2 1/7] Split atomic operations to architecture specific
This patch first adds architecture-specific directories to the EAL header file directory. It then splits the atomic operations into architecture-specific and generic files. Architecture-specific files are placed in the corresponding architecture directory, and common headers are placed in the generic directory. Signed-off-by: Chao Zhu --- lib/librte_eal/common/Makefile | 11 +- .../common/include/arch/i686/rte_atomic.h | 669 .../common/include/arch/x86_64/rte_atomic.h| 631 +++ lib/librte_eal/common/include/generic/rte_atomic.h | 795 ++ .../common/include/i686/arch/rte_atomic.h | 373 --- lib/librte_eal/common/include/rte_atomic.h | 1133 .../common/include/x86_64/arch/rte_atomic.h| 335 -- 7 files changed, 2102 insertions(+), 1845 deletions(-) create mode 100644 lib/librte_eal/common/include/arch/i686/rte_atomic.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_atomic.h create mode 100644 lib/librte_eal/common/include/generic/rte_atomic.h delete mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic.h delete mode 100644 lib/librte_eal/common/include/rte_atomic.h delete mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic.h diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile index 7f27966..8ab363b 100644 --- a/lib/librte_eal/common/Makefile +++ b/lib/librte_eal/common/Makefile @@ -31,7 +31,7 @@ include $(RTE_SDK)/mk/rte.vars.mk -INC := rte_atomic.h rte_branch_prediction.h rte_byteorder.h rte_common.h +INC := rte_branch_prediction.h rte_byteorder.h rte_common.h INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h @@ -46,11 +46,14 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y) INC += rte_warnings.h endif -ARCH_INC := rte_atomic.h +GENERIC_INC := rte_atomic.h +ARCH_INC := $(GENERIC_INC) SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC)) 
-SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include/arch := \ - $(addprefix include/$(RTE_ARCH)/arch/,$(ARCH_INC)) +SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \ + $(addprefix include/arch/$(RTE_ARCH)/,$(ARCH_INC)) +SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include/generic := \ + $(addprefix include/generic/,$(GENERIC_INC)) # add libc if configured DEPDIRS-$(CONFIG_RTE_LIBC) += lib/libc diff --git a/lib/librte_eal/common/include/arch/i686/rte_atomic.h b/lib/librte_eal/common/include/arch/i686/rte_atomic.h new file mode 100644 index 000..67efb19 --- /dev/null +++ b/lib/librte_eal/common/include/arch/i686/rte_atomic.h @@ -0,0 +1,669 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +/* + * Inspired from FreeBSD src/sys/i386/include/atomic.h + * Copyright (c) 1998 Doug Rabson + * All rights reserved. + */ + +#ifndef _RTE_ATOMIC_I686_H_ +#define _RTE_ATOMIC_I686_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include "generic/rte_atomic.h" + +/** + * @file + * Atomic Operations on i686 + */ + +#if RTE_MAX_LCORE == 1 +#define MPLOCKED/**< No need to insert MP lock prefix. */ +#else +#define MPLOCKED"loc
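For an architecture with no hand-written assembly yet, the same atomic operations can be expressed with compiler builtins. The sketch below shows the idea for compare-and-set and add-and-return. The demo_ names are hypothetical and the code assumes the GCC/Clang __sync builtins; it is not taken from generic/rte_atomic.h.

```c
#include <stdint.h>

/*
 * Hypothetical compare-and-set: if *dst equals exp, store src.
 * Returns non-zero on success and 0 if *dst was left unchanged.
 */
static inline int
demo_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
	return __sync_bool_compare_and_swap(dst, exp, src);
}

/* Hypothetical add-and-return: atomically add inc and return the new value. */
static inline uint32_t
demo_atomic32_add_return(volatile uint32_t *v, uint32_t inc)
{
	return __sync_add_and_fetch(v, inc);
}
```

On x86 the compiler is expected to emit lock-prefixed instructions for these, which is what the MPLOCKED macro arranges by hand in the assembly versions.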
[dpdk-dev] [PATCH v2 4/7] Split prefetch operations to architecture specific
This patch splits the prefetch operations out of the common DPDK code and moves them into architecture-specific arch directories, so that other processor architectures supporting DPDK can implement their own functions. Signed-off-by: Chao Zhu --- lib/librte_eal/common/Makefile |4 +- .../common/include/arch/i686/rte_prefetch.h| 88 .../common/include/arch/x86_64/rte_prefetch.h | 88 lib/librte_eal/common/include/rte_prefetch.h | 88 4 files changed, 178 insertions(+), 90 deletions(-) create mode 100644 lib/librte_eal/common/include/arch/i686/rte_prefetch.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h delete mode 100644 lib/librte_eal/common/include/rte_prefetch.h diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile index c6aedf9..6cf7505 100644 --- a/lib/librte_eal/common/Makefile +++ b/lib/librte_eal/common/Makefile @@ -34,7 +34,7 @@ include $(RTE_SDK)/mk/rte.vars.mk INC := rte_branch_prediction.h rte_common.h INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h -INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h +INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h INC += rte_eal_memconfig.h rte_malloc_heap.h @@ -47,7 +47,7 @@ INC += rte_warnings.h endif GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h -ARCH_INC := $(GENERIC_INC) +ARCH_INC := $(GENERIC_INC) rte_prefetch.h SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC)) SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \ diff --git a/lib/librte_eal/common/include/arch/i686/rte_prefetch.h b/lib/librte_eal/common/include/arch/i686/rte_prefetch.h new file mode 100644 index 000..2625512 --- /dev/null +++ b/lib/librte_eal/common/include/arch/i686/rte_prefetch.h @@ -0,0 +1,88 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 
2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_PREFETCH_I686_H_ +#define _RTE_PREFETCH_I686_H_ + +/** + * @file + * + * Prefetch operations. + * + * This file defines an API for prefetch macros / inline-functions, + * which are architecture-dependent. 
Prefetching occurs when a + * processor requests an instruction or data from memory to cache + * before it is actually needed, potentially speeding up the execution of the + * program. + */ + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * Prefetch a cache line into all cache levels. + * @param p + * Address to prefetch + */ +static inline void rte_prefetch0(volatile void *p) +{ + asm volatile ("prefetcht0 %[p]" : [p] "+m" (*(volatile char *)p)); +} + +/** + * Prefetch a cache line into all cache levels except the 0th cache level. + * @param p + * Address to prefetch + */ +static inline void rte_prefetch1(volatile void *p) +{ + asm volatile ("prefetcht1 %[p]" : [p] "+m" (*(volatile char *)p)); +} + +/** + * Prefetch a cache line into all cache levels except the 0th and 1th cache + * levels. + * @param p + * Address to prefetch + */ +static inline void rte_prefetch2(volatile void *p) +{ + asm volatile ("prefetcht2 %[p]
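A port to an architecture without prefetcht0/t1/t2 could start from the compiler's generic prefetch builtin. The following is only a sketch: the demo_ names are hypothetical, __builtin_prefetch is assumed to be available (GCC or Clang), and the locality arguments are hints that a given target is free to ignore.

```c
#include <stddef.h>

/* Hypothetical portable counterpart of a "prefetch into all levels" call. */
static inline void
demo_prefetch0(const volatile void *p)
{
	/* rw = 0 (read), locality = 3 (keep in all cache levels if possible) */
	__builtin_prefetch((const void *)p, 0, 3);
}

/* Example use: request data a few elements ahead while summing an array. */
static long
sum_with_prefetch(const int *a, size_t n)
{
	long sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i + 16 < n)
			demo_prefetch0(&a[i + 16]);
		sum += a[i];
	}
	return sum;
}
```

Prefetching never changes the result of the computation; it only affects how early the data is requested from memory.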
[dpdk-dev] [PATCH v2 0/7] Patches to split architecture specific operations from DPDK
This set of patches splits the x86 architecture-specific operations out of DPDK and puts them into the arch directories for the i686 and x86_64 architectures. This will make the adoption of DPDK on other computer architectures much easier. For a new architecture, just add an architecture-specific directory and the necessary build configuration files, and the DPDK EAL library can then support it. This is an upgraded version of the former patch. Chao Zhu (7): Split atomic operations to architecture specific Split byte order operations to architecture specific Split CPU cycle operation to architecture specific Split prefetch operations to architecture specific Split spinlock operations to architecture specific Split memcpy operation to architecture specific Split CPU flags operations to architecture specific lib/librte_eal/common/Makefile | 21 +- lib/librte_eal/common/eal_common_cpuflags.c| 190 .../common/include/arch/i686/rte_atomic.h | 669 .../common/include/arch/i686/rte_byteorder.h | 194 .../common/include/arch/i686/rte_cpuflags.h| 364 +++ .../common/include/arch/i686/rte_cycles.h | 158 +++ .../common/include/arch/i686/rte_memcpy.h | 376 +++ .../common/include/arch/i686/rte_prefetch.h| 88 ++ .../common/include/arch/i686/rte_spinlock.h| 180 .../common/include/arch/x86_64/rte_atomic.h| 631 +++ .../common/include/arch/x86_64/rte_byteorder.h | 195 .../common/include/arch/x86_64/rte_cpuflags.h | 364 +++ .../common/include/arch/x86_64/rte_cycles.h| 158 +++ .../common/include/arch/x86_64/rte_memcpy.h| 376 +++ .../common/include/arch/x86_64/rte_prefetch.h | 88 ++ .../common/include/arch/x86_64/rte_spinlock.h | 180 lib/librte_eal/common/include/generic/rte_atomic.h | 795 ++ .../common/include/generic/rte_byteorder.h | 124 +++ lib/librte_eal/common/include/generic/rte_cycles.h | 190 .../common/include/generic/rte_spinlock.h | 169 +++ .../common/include/i686/arch/rte_atomic.h | 373 --- lib/librte_eal/common/include/rte_atomic.h | 1133 lib/librte_eal/common/include/rte_byteorder.h | 270 - 
lib/librte_eal/common/include/rte_cpuflags.h | 182 lib/librte_eal/common/include/rte_cycles.h | 266 - lib/librte_eal/common/include/rte_memcpy.h | 376 --- lib/librte_eal/common/include/rte_prefetch.h | 88 -- lib/librte_eal/common/include/rte_spinlock.h | 258 - .../common/include/x86_64/arch/rte_atomic.h| 335 -- 29 files changed, 5311 insertions(+), 3480 deletions(-) create mode 100644 lib/librte_eal/common/include/arch/i686/rte_atomic.h create mode 100644 lib/librte_eal/common/include/arch/i686/rte_byteorder.h create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cpuflags.h create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cycles.h create mode 100644 lib/librte_eal/common/include/arch/i686/rte_memcpy.h create mode 100644 lib/librte_eal/common/include/arch/i686/rte_prefetch.h create mode 100644 lib/librte_eal/common/include/arch/i686/rte_spinlock.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_atomic.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cycles.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_prefetch.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_spinlock.h create mode 100644 lib/librte_eal/common/include/generic/rte_atomic.h create mode 100644 lib/librte_eal/common/include/generic/rte_byteorder.h create mode 100644 lib/librte_eal/common/include/generic/rte_cycles.h create mode 100644 lib/librte_eal/common/include/generic/rte_spinlock.h delete mode 100644 lib/librte_eal/common/include/i686/arch/rte_atomic.h delete mode 100644 lib/librte_eal/common/include/rte_atomic.h delete mode 100644 lib/librte_eal/common/include/rte_byteorder.h delete mode 100644 lib/librte_eal/common/include/rte_cpuflags.h delete mode 100644 
lib/librte_eal/common/include/rte_cycles.h delete mode 100644 lib/librte_eal/common/include/rte_memcpy.h delete mode 100644 lib/librte_eal/common/include/rte_prefetch.h delete mode 100644 lib/librte_eal/common/include/rte_spinlock.h delete mode 100644 lib/librte_eal/common/include/x86_64/arch/rte_atomic.h
[dpdk-dev] [PATCH v2 2/7] Split byte order operations to architecture specific
This patch splits the byte order operations out of the common DPDK code and moves them into architecture-specific arch directories, so that support for other processor architectures can be added easily. Signed-off-by: Chao Zhu --- lib/librte_eal/common/Makefile |4 +- .../common/include/arch/i686/rte_byteorder.h | 194 ++ .../common/include/arch/x86_64/rte_byteorder.h | 195 ++ .../common/include/generic/rte_byteorder.h | 124 + lib/librte_eal/common/include/rte_byteorder.h | 270 5 files changed, 515 insertions(+), 272 deletions(-) create mode 100644 lib/librte_eal/common/include/arch/i686/rte_byteorder.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_byteorder.h create mode 100644 lib/librte_eal/common/include/generic/rte_byteorder.h delete mode 100644 lib/librte_eal/common/include/rte_byteorder.h diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile index 8ab363b..62a39cd 100644 --- a/lib/librte_eal/common/Makefile +++ b/lib/librte_eal/common/Makefile @@ -31,7 +31,7 @@ include $(RTE_SDK)/mk/rte.vars.mk -INC := rte_branch_prediction.h rte_byteorder.h rte_common.h +INC := rte_branch_prediction.h rte_common.h INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h @@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y) INC += rte_warnings.h endif -GENERIC_INC := rte_atomic.h +GENERIC_INC := rte_atomic.h rte_byteorder.h ARCH_INC := $(GENERIC_INC) SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC)) diff --git a/lib/librte_eal/common/include/arch/i686/rte_byteorder.h b/lib/librte_eal/common/include/arch/i686/rte_byteorder.h new file mode 100644 index 000..de5cc83 --- /dev/null +++ b/lib/librte_eal/common/include/arch/i686/rte_byteorder.h @@ -0,0 +1,194 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. 
+ * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_BYTEORDER_I686_H_ +#define _RTE_BYTEORDER_I686_H_ + +/** + * @file + * + * Byte Swap Operations + * + * This file defines a architecture specific API for byte swap operations. + */ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "generic/rte_byteorder.h" + +/* + * An architecture-optimized byte swap for a 16-bit value. + * + * Do not use this function directly. The preferred function is rte_bswap16(). 
+ */ +static inline uint16_t rte_arch_bswap16(uint16_t _x) +{ + register uint16_t x = _x; + asm volatile ("xchgb %b[x1],%h[x2]" + : [x1] "=Q" (x) + : [x2] "0" (x) + ); + return x; +} + +/* + * An architecture-optimized byte swap for a 32-bit value. + * + * Do not use this function directly. The preferred function is rte_bswap32(). + */ +static inline uint32_t rte_arch_bswap32(uint32_t _x) +{ + register uint32_t x = _x; + asm volatile ("bswap %[x]" + : [x] "+r" (x) + ); + return x; +} + +/* + * An architecture-optimized byte swap for a 64-bit value. + * + * Do not use this function directly. The preferred function is rte_bswap64(). + */ +/* Compat./Leg. mode */ +static in
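An architecture without xchgb/bswap equivalents can fall back to plain shifts and masks for the same byte swaps. The sketch below shows one such portable version. The demo_ names are hypothetical and this is not the content of generic/rte_byteorder.h.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. */
static inline uint16_t
demo_bswap16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
static inline uint32_t
demo_bswap32(uint32_t x)
{
	return ((x & 0x000000ffU) << 24) |
	       ((x & 0x0000ff00U) << 8) |
	       ((x & 0x00ff0000U) >> 8) |
	       ((x & 0xff000000U) >> 24);
}

/* Reverse the eight bytes of a 64-bit value using two 32-bit swaps. */
static inline uint64_t
demo_bswap64(uint64_t x)
{
	return ((uint64_t)demo_bswap32((uint32_t)x) << 32) |
	       demo_bswap32((uint32_t)(x >> 32));
}
```

Building the 64-bit swap from two 32-bit swaps is also a natural fit for 32-bit targets, where a single 64-bit swap instruction is not available.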
[dpdk-dev] [PATCH v2 3/7] Split CPU cycle operation to architecture specific
This patch splits the CPU TSC read operations out of the common DPDK code and pushes them into architecture-specific arch directories, so that processors without a TSC register can implement their own functions. Signed-off-by: Chao Zhu --- lib/librte_eal/common/Makefile |4 +- .../common/include/arch/i686/rte_cycles.h | 158 .../common/include/arch/x86_64/rte_cycles.h| 158 lib/librte_eal/common/include/generic/rte_cycles.h | 190 ++ lib/librte_eal/common/include/rte_cycles.h | 266 5 files changed, 508 insertions(+), 268 deletions(-) create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cycles.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cycles.h create mode 100644 lib/librte_eal/common/include/generic/rte_cycles.h delete mode 100644 lib/librte_eal/common/include/rte_cycles.h diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile index 62a39cd..c6aedf9 100644 --- a/lib/librte_eal/common/Makefile +++ b/lib/librte_eal/common/Makefile @@ -32,7 +32,7 @@ include $(RTE_SDK)/mk/rte.vars.mk INC := rte_branch_prediction.h rte_common.h -INC += rte_cycles.h rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h +INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h INC += rte_pci_dev_ids.h rte_per_lcore.h rte_prefetch.h rte_random.h INC += rte_rwlock.h rte_spinlock.h rte_tailq.h rte_interrupts.h rte_alarm.h @@ -46,7 +46,7 @@ ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y) INC += rte_warnings.h endif -GENERIC_INC := rte_atomic.h rte_byteorder.h +GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h ARCH_INC := $(GENERIC_INC) SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC)) diff --git a/lib/librte_eal/common/include/arch/i686/rte_cycles.h b/lib/librte_eal/common/include/arch/i686/rte_cycles.h new file mode 100644 index 000..a813e9b --- /dev/null +++ b/lib/librte_eal/common/include/arch/i686/rte_cycles.h @@ -0,0 +1,158 @@ +/*-
+ * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ +/* BSD LICENSE + * + * Copyright(c) 2013 6WIND. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of 6WIND S.A. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CON
[dpdk-dev] [PATCH v2 7/7] Split CPU flags operations to architecture specific
This patch splits the CPU flags related operations out of the common DPDK code and pushes them into architecture-specific arch directories, so that other processor architectures can implement their own CPU flag functions to support DPDK. Signed-off-by: Chao Zhu --- lib/librte_eal/common/Makefile |4 +- lib/librte_eal/common/eal_common_cpuflags.c| 190 -- .../common/include/arch/i686/rte_cpuflags.h| 364 .../common/include/arch/x86_64/rte_cpuflags.h | 364 lib/librte_eal/common/include/rte_cpuflags.h | 182 -- 5 files changed, 730 insertions(+), 374 deletions(-) create mode 100644 lib/librte_eal/common/include/arch/i686/rte_cpuflags.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_cpuflags.h delete mode 100644 lib/librte_eal/common/include/rte_cpuflags.h diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile index e09d509..79f378e 100644 --- a/lib/librte_eal/common/Makefile +++ b/lib/librte_eal/common/Makefile @@ -36,7 +36,7 @@ INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h -INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h +INC += rte_string_fns.h rte_version.h rte_tailq_elem.h INC += rte_eal_memconfig.h rte_malloc_heap.h INC += rte_hexdump.h rte_devargs.h rte_dev.h INC += rte_common_vect.h @@ -47,7 +47,7 @@ INC += rte_warnings.h endif GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_spinlock.h -ARCH_INC := $(GENERIC_INC) rte_prefetch.h rte_memcpy.h +ARCH_INC := $(GENERIC_INC) rte_prefetch.h rte_memcpy.h rte_cpuflags.h SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC)) SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \ diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c index 9e79179..6fd360c 100644 --- a/lib/librte_eal/common/eal_common_cpuflags.c +++
b/lib/librte_eal/common/eal_common_cpuflags.c @@ -30,10 +30,6 @@ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ -#include -#include -#include -#include #include /* @@ -50,192 +46,6 @@ #endif /** - * Enumeration of CPU registers - */ -enum cpu_register_t { - REG_EAX = 0, - REG_EBX, - REG_ECX, - REG_EDX, -}; - -typedef uint32_t cpuid_registers_t[4]; - -#define CPU_FLAG_NAME_MAX_LEN 64 - -/** - * Struct to hold a processor feature entry - */ -struct feature_entry { - uint32_t leaf; /**< cpuid leaf */ - uint32_t subleaf; /**< cpuid subleaf */ - uint32_t reg; /**< cpuid register */ - uint32_t bit; /**< cpuid register bit */ - char name[CPU_FLAG_NAME_MAX_LEN]; /**< String for printing */ -}; - -#define FEAT_DEF(name, leaf, subleaf, reg, bit) \ - [RTE_CPUFLAG_##name] = {leaf, subleaf, reg, bit, #name }, - -/** - * An array that holds feature entries - */ -static const struct feature_entry cpu_feature_table[] = { - FEAT_DEF(SSE3, 0x0001, 0, REG_ECX, 0) - FEAT_DEF(PCLMULQDQ, 0x0001, 0, REG_ECX, 1) - FEAT_DEF(DTES64, 0x0001, 0, REG_ECX, 2) - FEAT_DEF(MONITOR, 0x0001, 0, REG_ECX, 3) - FEAT_DEF(DS_CPL, 0x0001, 0, REG_ECX, 4) - FEAT_DEF(VMX, 0x0001, 0, REG_ECX, 5) - FEAT_DEF(SMX, 0x0001, 0, REG_ECX, 6) - FEAT_DEF(EIST, 0x0001, 0, REG_ECX, 7) - FEAT_DEF(TM2, 0x0001, 0, REG_ECX, 8) - FEAT_DEF(SSSE3, 0x0001, 0, REG_ECX, 9) - FEAT_DEF(CNXT_ID, 0x0001, 0, REG_ECX, 10) - FEAT_DEF(FMA, 0x0001, 0, REG_ECX, 12) - FEAT_DEF(CMPXCHG16B, 0x0001, 0, REG_ECX, 13) - FEAT_DEF(XTPR, 0x0001, 0, REG_ECX, 14) - FEAT_DEF(PDCM, 0x0001, 0, REG_ECX, 15) - FEAT_DEF(PCID, 0x0001, 0, REG_ECX, 17) - FEAT_DEF(DCA, 0x0001, 0, REG_ECX, 18) - FEAT_DEF(SSE4_1, 0x0001, 0, REG_ECX, 19) - FEAT_DEF(SSE4_2, 0x0001, 0, REG_ECX, 20) - FEAT_DEF(X2APIC, 0x0001, 0, REG_ECX, 21) - FEAT_DEF(MOVBE, 0x0001, 0, REG_ECX, 22) - FEAT_DEF(POPCNT, 0x0001, 0, REG_ECX, 23) - FEAT_DEF(TSC_DEADLINE, 0x0001, 0, REG_ECX, 24) - 
FEAT_DEF(AES, 0x0001, 0, REG_ECX, 25) - FEAT_DEF(XSAVE, 0x0001, 0, REG_ECX, 26) - FEAT_DEF(OSXSAVE, 0x0001, 0, REG_ECX, 27) - FEAT_DEF(AVX, 0x0001, 0, REG_ECX, 28) - FEAT_DEF(F16C, 0x0001, 0, REG_ECX, 29) - FEAT_DEF(RDRAND, 0x0001, 0, REG_ECX, 30) - - FEAT_DEF(FPU, 0x0001, 0, REG_EDX, 0) - FEAT_DEF(VME, 0x0001, 0, REG_EDX, 1) - FEAT_DEF(DE, 0x0001, 0, REG_EDX, 2) - FEAT
[dpdk-dev] [PATCH v2 6/7] Split memcpy operation to architecture specific
This patch splits the SSE-based memory copy functions out of the common DPDK code and pushes them into architecture-specific arch directories. Other processor architectures can implement their own vector-based memory copy functions. Signed-off-by: Chao Zhu --- lib/librte_eal/common/Makefile |4 +- .../common/include/arch/i686/rte_memcpy.h | 376 .../common/include/arch/x86_64/rte_memcpy.h| 376 lib/librte_eal/common/include/rte_memcpy.h | 376 4 files changed, 754 insertions(+), 378 deletions(-) create mode 100644 lib/librte_eal/common/include/arch/i686/rte_memcpy.h create mode 100644 lib/librte_eal/common/include/arch/x86_64/rte_memcpy.h delete mode 100644 lib/librte_eal/common/include/rte_memcpy.h diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile index 9b9a73d..e09d509 100644 --- a/lib/librte_eal/common/Makefile +++ b/lib/librte_eal/common/Makefile @@ -33,7 +33,7 @@ include $(RTE_SDK)/mk/rte.vars.mk INC := rte_branch_prediction.h rte_common.h INC += rte_debug.h rte_eal.h rte_errno.h rte_launch.h rte_lcore.h -INC += rte_log.h rte_memcpy.h rte_memory.h rte_memzone.h rte_pci.h +INC += rte_log.h rte_memory.h rte_memzone.h rte_pci.h INC += rte_pci_dev_ids.h rte_per_lcore.h rte_random.h INC += rte_rwlock.h rte_tailq.h rte_interrupts.h rte_alarm.h INC += rte_string_fns.h rte_cpuflags.h rte_version.h rte_tailq_elem.h @@ -47,7 +47,7 @@ INC += rte_warnings.h endif GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_spinlock.h -ARCH_INC := $(GENERIC_INC) rte_prefetch.h +ARCH_INC := $(GENERIC_INC) rte_prefetch.h rte_memcpy.h SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include := $(addprefix include/,$(INC)) SYMLINK-$(CONFIG_RTE_LIBRTE_EAL)-include += \ diff --git a/lib/librte_eal/common/include/arch/i686/rte_memcpy.h b/lib/librte_eal/common/include/arch/i686/rte_memcpy.h new file mode 100644 index 000..ba750b1 --- /dev/null +++ b/lib/librte_eal/common/include/arch/i686/rte_memcpy.h @@ -0,0 +1,376 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation.
All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_MEMCPY_I686_H_ +#define _RTE_MEMCPY_I686_H_ + +/** + * @file + * + * Functions for SSE implementation of memcpy(). + */ + +#include +#include +#include + +#ifdef __cplusplus +extern "C" { +#endif + +#ifdef __INTEL_COMPILER +#pragma warning(disable:593) /* Stop unused variable warning (reg_a etc). */ +#endif + +/** + * Copy 16 bytes from one location to another using optimised SSE + * instructions. 
The locations should not overlap. + * + * @param dst + * Pointer to the destination of the data. + * @param src + * Pointer to the source data. + */ +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + __m128i reg_a; + asm volatile ( + "movdqu (%[src]), %[reg_a]\n\t" + "movdqu %[reg_a], (%[dst])\n\t" + : [reg_a] "=x" (reg_a) + : [src] "r" (src), + [dst] "r"(dst) + : "memory" + ); +} + +/** + * Copy 32 bytes from one location to another using optimised SSE + * instructions. The locations should not overlap. + * + * @param dst + * Pointer to the destination of the data. + * @param src + * Pointer to the source data. + */ +static inline void +rte_mov32(uint8_
[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
> From: Chao CH Zhu [mailto:bjzhuc at cn.ibm.com]
> Sent: Thursday, October 16, 2014 4:14 AM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
>
> Konstantin,
>
> In my understanding, compiler barrier is a kind of software barrier which
> prevents the compiler from moving memory accesses across the barrier.

Yes, compiler_barrier() right now only guarantees that the compiler won't reorder instructions across it while emitting the code.

> This should be architecture-independent. And the "sync" instruction is a
> hardware barrier which depends on PowerPC architecture.

I understand what "sync" does.

> So I think the compiler barrier should be the same on x86 and PowerPC. Any
> comments? Please correct me if I was wrong.

The thing is that the current DPDK code will not work correctly on a system with weak memory ordering: IA has quite a strict memory ordering model, and there is code inside DPDK that relies on the CPU following that model.
For such places in the code, a compiler barrier is enough for IA, but is not enough for PPC.

Are you worried about the names here, i.e. that the compiler barrier will become a HW one? :)
In that case, what you can probably do:
Create a new architecture-dependent macro: rte_barrier().
It would expand into rte_compiler_barrier() for IA and into rte_mb() for PPC.
Go through all references to rte_compiler_barrier() inside DPDK and replace them with rte_barrier().

Konstantin

>
> Thanks a lot!
>
> Best Regards!
> --
> Chao Zhu
>
> From: "Ananyev, Konstantin"
> To: Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org"
> Date: 2014/10/16 08:38
> Subject: RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
>
> Hi,
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao Zhu
> > Sent: Friday, September 26, 2014 10:36 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
> >
> > The atomic operations implemented with assembly code in DPDK only
> > support x86. This patch add architecture specific atomic operations for
> > IBM Power architecture.
> >
> > Signed-off-by: Chao Zhu
> > ---
> >  .../common/include/powerpc/arch/rte_atomic.h      | 387
> >  .../common/include/powerpc/arch/rte_atomic_arch.h | 318
> >  2 files changed, 705 insertions(+), 0 deletions(-)
> >  create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic.h
> >  create mode 100644 lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> >
> ...
> > +
> > diff --git a/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h b/lib/librte_eal/common/include/powerpc/arch/rte_atomic_arch.h
> > new file mode 100644
> > index 000..fe5666e
> > --- /dev/null
> > +
> ...
> > +#define rte_arch_rmb() asm volatile("sync" : : : "memory")
> > +
> > +#define rte_arch_compiler_barrier() do { \
> > +        asm volatile ("" : : : "memory"); \
> > +} while(0)
>
> I don't know much about PPC architecture, but as I remember it uses a weakly-ordering memory model.
> Is that correct?
> If so, then you probably need rte_arch_compiler_barrier() to be "sync" instruction (like mb()s above).
> The reason is that IA has much stronger memory ordering model and there are a lot of places in the code where it implies
> that ordering.
> For example - ring enqueue/dequeue functions.
>
> Konstantin
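For readers following the rte_barrier() suggestion, here is a minimal sketch of what such an architecture-dependent macro could look like. It is only an illustration under stated assumptions, not code from the patch: the names demo_barrier(), demo_publish() and demo_consume() are hypothetical, the fallback to the GCC builtin __sync_synchronize() on other architectures is my own choice, and GCC-style inline assembly is assumed.

```c
#include <stdint.h>

/* Compiler-only barrier: stops the compiler from moving memory accesses
 * across this point, but emits no CPU instruction. */
#define demo_compiler_barrier() __asm__ __volatile__ ("" : : : "memory")

/* Hypothetical "logical" barrier in the spirit of the proposed rte_barrier():
 * a compiler barrier where the CPU already keeps stores in order (IA),
 * a hardware barrier where it does not (PPC). */
#if defined(__x86_64__) || defined(__i386__)
#define demo_barrier() demo_compiler_barrier()
#elif defined(__powerpc__) || defined(__powerpc64__) || defined(__PPC__)
#define demo_barrier() __asm__ __volatile__ ("sync" : : : "memory")
#else
/* Assumption: on any other architecture fall back to a full barrier. */
#define demo_barrier() __sync_synchronize()
#endif

static volatile uint32_t demo_payload;
static volatile uint32_t demo_ready;

/* Producer side: the payload store must become visible before the flag. */
void demo_publish(uint32_t value)
{
    demo_payload = value;
    demo_barrier();
    demo_ready = 1;
}

/* Consumer side: returns 0 until the flag is set, then the payload. */
uint32_t demo_consume(void)
{
    if (!demo_ready)
        return 0;
    demo_barrier();
    return demo_payload;
}
```

The publish/consume pair mirrors the kind of ordering assumption the ring enqueue/dequeue code is said to rely on: with only a compiler barrier, a weakly ordered CPU may make the flag visible before the payload.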
[dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
> -Original Message-
> From: Richardson, Bruce
> Sent: Thursday, October 16, 2014 10:43 AM
> To: Chao CH Zhu; Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chao CH Zhu
> > Sent: Thursday, October 16, 2014 4:14 AM
> > To: Ananyev, Konstantin
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 02/12] Add atomic operations for IBM Power architecture
> >
> > Konstantin,
> >
> > In my understanding, compiler barrier is a kind of software barrier which
> > prevents the compiler from moving memory accesses across the barrier. This
> > should be architecture-independent. And the "sync" instruction is a
> > hardware barrier which depends on PowerPC architecture. So I think the
> > compiler barrier should be the same on x86 and PowerPC. Any comments?
> > Please correct me if I was wrong.
> >
> I would agree with that assessment, as far as it goes, in that a compiler barrier is going to be the same on both architectures. However,
> we also need to start thinking about actual use cases - how do we specify the barriers in a piece of code where we need a full memory
> barrier on PPC and only a compiler barrier on IA?
> My suggestion would be to do first as you propose and have proper primitives for the different barrier types defined correctly for
> each platform - with the compiler barrier being, presumably, common across each one. Then, as a second step, we probably need to
> look at defining "logical" barrier types (for want of a better term) that can then be used in the code and which would be different
> across platforms.

Yeah, as I said in the other mail, what we can probably do:
Create a new architecture-dependent macro: rte_barrier().
It would expand into rte_compiler_barrier() for IA and into rte_mb() for PPC.
Go through all references to rte_compiler_barrier() inside DPDK and replace them with rte_barrier().

BTW, for my own curiosity: is there any good use for compiler_barrier() on systems with a weakly ordered memory model?

>
> Does this make sense to do this way? Is it the best solution? Do we want to define the basic primitives or are we only ever likely to
> need the logical barrier types?
>
> /Bruce
[dpdk-dev] Possibility to unbind interface by DPDK
I have a question regarding unbinding a Linux interface from within EAL.

This feature was present up to DPDK 1.4 and was then removed.

It was available under the RTE_EAL_UNBIND_PORTS flag.

Is there a possibility of getting this feature back in upcoming releases?

Unbinding interfaces from within EAL makes it possible for DPDK applications to read network interface parameters such as IP address, MTU and VLAN configuration.

When the Linux interface is unbound before the application starts, this information is lost to the application.

Mirek
[dpdk-dev] Possibility to unbind interface by DPDK
On 2014/10/16 19:45, Walukiewicz, Miroslaw wrote:
> I have a question regarding unbinding Linux interface from EAL.
>
> This feature was present up to dpdk 1.4 and next it was removed.
>
> It was available under RTE_EAL_UNBIND_PORTS flag.
>
> Is there a possibility to get this feature back in the next releases?
>
> Unbinding interfaces from EAL makes possible reading network interface
> parameters like IP address, MTU, VLAN configuration from dpdk applications.
>
> When Linux interface is unbound before application start this information is
> lost for application.

The same problem was found here.
An alternative might be to read the NIC parameters in your application first, and only then bind the NICs to the DPDK uio driver, the way the dpdk_nic_bind.py script does.

>
> Mirek
>
>
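As a rough illustration of the "read the parameters first, bind afterwards" idea, the sketch below queries the MTU of an interface that is still owned by the kernel driver, using the standard Linux SIOCGIFMTU ioctl. The helper name demo_read_if_mtu() is hypothetical and nothing here is a DPDK API; the rebinding step itself is left to whatever tool you already use.

```c
#define _DEFAULT_SOURCE
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns the MTU of a kernel-managed interface,
 * or -1 if the name is invalid or the interface cannot be queried.
 * Must be called while the interface is still bound to its kernel driver. */
int demo_read_if_mtu(const char *ifname)
{
    struct ifreq ifr;
    int fd;
    int mtu = -1;

    if (ifname == NULL || strlen(ifname) >= IFNAMSIZ)
        return -1;

    /* Any socket works as a handle for interface ioctls. */
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    if (ioctl(fd, SIOCGIFMTU, &ifr) == 0)
        mtu = ifr.ifr_mtu;

    close(fd);
    return mtu;
}
```

An application could call demo_read_if_mtu("eth0") (interface name chosen only as an example) and store the result before the port is handed over to the uio driver; IP address and VLAN settings would need their own queries.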
[dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> Sent: Saturday, October 11, 2014 6:56 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API
>
> Introduce a new filter framework in librte_ether. As to the implementation discussion, please refer to
> http://dpdk.org/ml/archives/dev/2014-September/005179.html, and VxLAN tunnel filter implementation is based on
> it.
>
> Signed-off-by: Jijiang Liu
> Acked-by: Helin Zhang
> Acked-by: Jingjing Wu
>
[..]
> new file mode 100644
> index 000..574e9ff
> --- /dev/null
> +++ b/lib/librte_ether/rte_eth_ctrl.h
[...]
> +/**
> + * All generic operations to filters
> + */
> +enum rte_filter_op {
> +        /**< used to check whether the type filter is supported */

Shouldn't this comment be below?

> +        RTE_ETH_FILTER_OP_NONE = 0,
> +        RTE_ETH_FILTER_OP_ADD,     /**< add filter entry */
> +        RTE_ETH_FILTER_OP_UPDATE,  /**< update filter entry */
> +        RTE_ETH_FILTER_OP_DELETE,  /**< delete filter entry */
> +        RTE_ETH_FILTER_OP_GET,     /**< get filter entry */
> +        RTE_ETH_FILTER_OP_SET,     /**< configurations */
> +        /**< get information of filter, such as status or statistics */

Same here

> +        RTE_ETH_FILTER_OP_GET_INFO,
> +        RTE_ETH_FILTER_OP_MAX,
> +};
> +
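To make the review remark concrete: in Doxygen, a `/**<` comment documents the member that precedes it, so a `/**<` block placed on the line above an enumerator attaches to the wrong entry (or to nothing). One possible way to address the comment, shown here only as a sketch and not as the submitted fix, is to switch the two leading comments to plain `/**` blocks and leave the trailing ones as they are.

```c
/**
 * All generic operations to filters
 */
enum rte_filter_op {
    /** used to check whether the type filter is supported */
    RTE_ETH_FILTER_OP_NONE = 0,
    RTE_ETH_FILTER_OP_ADD,      /**< add filter entry */
    RTE_ETH_FILTER_OP_UPDATE,   /**< update filter entry */
    RTE_ETH_FILTER_OP_DELETE,   /**< delete filter entry */
    RTE_ETH_FILTER_OP_GET,      /**< get filter entry */
    RTE_ETH_FILTER_OP_SET,      /**< configurations */
    /** get information of filter, such as status or statistics */
    RTE_ETH_FILTER_OP_GET_INFO,
    RTE_ETH_FILTER_OP_MAX,
};
```

The alternative reading of the remark, moving each `/**<` comment onto the line after its enumerator, would work equally well for Doxygen; the enumerator values are unchanged either way.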
[dpdk-dev] [PATCH 1/4] lib/librte_ether: new filter APIs definition
2014-10-10 07:28, De Lara Guarch, Pablo:
> > > > > Define new APIs to support configure multi-kind filters using same APIs
> > > > >  - rte_eth_dev_filter_supported
> > > > >  - rte_eth_dev_filter_ctrl
> > > > >
> > > > > As to the implementation discussion, please refer to
> > > > > http://dpdk.org/ml/archives/dev/2014-September/005179.html, and
> > > > > control packet filter implementation is based on it.
> > > >
> > > > This patch is also present on the patchset Support flow director
> > > > programming on Fortville.
> > > > Should this patchset be rejected then or just this patch? In second
> > > > case, could you send a v2 without this patch?
> > >
> > > I think this patch is present not only in the flow director patchset, but
> > > also in the mac vlan support patchset, the vxlan patchset, and so on. All of
> > > them are using the same new filter APIs. If any patchset is applied, the
> > > others may require some modification (just as you said, to remove this patch).
> > >
> > Additionally, without the patch, this patchset cannot work separately. More
> > than one feature depends on the new filter APIs, but no patchset containing
> > the new filter APIs has been applied so far. That's why each patchset has
> > such a patch.
>
> I see, then probably the best idea would have been to send this patch separately,
> and just say that these patchsets depend on this patch, basically because
> if you try to apply all these patches, you are going to get failures.

Yes, sending a separate patch and explicitly basing your patchsets on it would be much easier to understand.

And more generally, it's easier when things are explained. You won't have to pay for the extra words you put in your cover letter ;)

There is another problem with this patch: there are many versions around with different logs and even different authors!

--
Thomas
[dpdk-dev] [PATCH v4 00/10] VM Power Management
Hi Thomas,

> > However with a DPDK solution it would be possible to re-use the message bus
> > to pass information like device stats, application state, D-state requests
> > etc. to the host and allow for management layer(e.g. OpenStack) to make
> > informed decisions.
>
> I think that management informations should be transmitted in a management
> channel. Such solution should exist in OpenStack.

Perhaps it does, but this solution is not exclusive to OpenStack; that is just one potential use case.

>
> > Also, the scope of adding power management to qemu/KVM would be huge;
> > while the easier path is not always the best and the problem of power
> > management in VMs is both a DPDK problem (given that librte_power only
> > worked on the host) and a general virtualization problem that would be
> > better solved by those with direct knowledge of Qemu/KVM architecture
> > and influence on the direction of the Qemu project.
>
> Being a huge effort is not an argument.

I agree completely, and that was implied by what followed the conjunction.

> Please check with Qemu community, they'll welcome it.
>
> > As it stands, the host backend is simply an example application that can
> > be replaced by a VMM or Orchestration layer, by using Virtio-Serial it has
> > obvious leanings to Qemu, but even this could be easily swapped out for
> > XenBus, IVSHMEM, IP etc.
> >
> > If power management is to be eventually supported by Hypervisors directly
> > then we could also enable to option to switch to that environment, currently
> > the librte_power implementations (VM or Host) can be selected dynamically
> > (environment auto-detection) or explicitly via rte_power_set_env(), adding
> > an arbitrary number of environments is relatively easy.
>
> Yes, you are adding a new layer to workaround hypervisor lacks. And this layer
> will handle native support when it will exist. But if you implement native
> support now, we don't need this extra layer.

Indeed, but we have a solution implemented now and yes, it is a workaround, until hypervisors support such functionality. It is possible that whatever power management solutions present themselves in the future will also require workarounds; us-vhost is an example of such a workaround introduced to DPDK.

>
> > I hope this helps to clarify the approach.
>
> Thanks for your explanation.

Thanks for the feedback.

>
> --
> Thomas

Alan.
[dpdk-dev] filter_ctl PMD API idea
2014-09-08 15:06, Wu, Jingjing:
> Any comments or advises?
>
> Thanks!
>
> Fortville Filter features' development will be started based on this design this week.

Thanks Jingjing for explaining your plan before working on it.
There were no comments for one month, so we'll assume everybody is OK with it.

Now your work is done and it's time to integrate it.
This design is used in many pending patchsets.
I am now waiting for a single patch, outside of any patchset, so that I can comment on the implementation.
Then it will be applied together with the i40e filters using this API.

So we'll have a new API implemented only for i40e.
But when DPDK 1.8 is out, I expect to receive patches replacing the old API with this new one for igb and ixgbe.

One last request: could you please write a brief email summarizing all filters of Intel NICs from a user perspective, which ones are implemented in DPDK, and with which API?

Thanks

> > -Original Message-
> > Hi, all
> >
> > When we develop filters feature in i40e driver for Intel® Ethernet Controller XL710/X710
> > [Fortville] (For both 10G/40G), we found that there are lots of new filters, there are also
> > some changes on the existing filters, comparing to ixgbe.
> > If we keep the way to add new ops in rte_eth_dev for each new filter, it can work.
> > But we suggest to use a more generic API for all filters to avoid a superset dev_ops. It needs
> > to be cleaner and easy-to-use. There is a need for technical discussion.
> >
> > Here is the early design idea we are looking for comments.
> >
> > 1. Create two new APIs
> > -
> > rte_eth_filter_supported(uint8_t port, uint16_t filter_type);
> > /* check whether this filter type is supported for the queried port */
> > rte_eth_filter_ctl(uint8_t port, uint16_t filter_type, uint16_t filter_op, void *arg);
> > /* configure filters, will call new ops eth_filter_ctl in eth_dev_ops */
> > -
> >
> > 2. Define filter types, operations, and structures in new header file
> > lib/librte_eth/rte_eth_filter.h.
> > -
> > #define RTE_ETH_FILTER_RSS 1
> > #define RTE_ETH_FILTER_SYN 2
> > #define RTE_ETH_FILTER_5TUPLE 3
> > #define RTE_ETH_FILTER_FDIR 4
> >
> >
> > #define RTE_ETH_FILTER_OP_GET 1
> > #define RTE_ETH_FILTER_OP_ADD 2
> > #define RTE_ETH_FILTER_OP_DELETE 3
> > #define RTE_ETH_FILTER_OP_SET 4
> > < other operations if want to define>...
> >
> > /* structures defined for corresponding filter type and operation */
> > /* take RTE_ETH_FILTER_FDIR and OP_SET for example*/
> >
> > struct rte_eth_filter_fdir_cfg {
> > #define RTE_ETH_FILTER_FDIR_SET_MASK 0
> > #define RTE_ETH_FILTER_FDIR_SET_OFFSET 1
> > ...
> >   uint16_t cfg_type;
> >   /* sub operation to defined what specific configuration it will take,
> >      and which following fields are meaningful*/
> >
> >   /* fields, can be a union or combine of required specific items*/
> >
> >
> > };
> >
> > -
> > By this way, It is easy to add more filter types or operation in future.
> > And the difference among the same filter and operation can be distinguish by sub command
> > in defined structure, e.g. "cfg_type" in above rte_eth_filter_fdir_cfg structure.
> >
> > 3. Define ops in driver (take i40e for example)
> > -
> > static struct eth_dev_ops i40e_eth_dev_ops = {
> >   .filter_ctl = i40e_filter_ctl,
> > };
> > -
> > Then the functions in drivers can be implemented separately.
> >
> > 4. Use case In test-pmd/cmdline.c
> > -
> > #include
> > /* add or change commands e.g. fdir_set (arg1) (arg2) ... */
> >
> > static void
> > cmd_fdir_parsed()
> > {
> >   ...
> >   /* take setting fdir mask for example*/
> >   struct rte_eth_filter_fdir_cfg cfg;
> >
> >   if (rte_eth_filter_supported(port, RTE_ETH_FILTER_FDIR)) {
> >     cfg.cfg_type = RTE_ETH_FILTER_FDIR_SET_MASK;
> >     /* fill the corresponding fields in cfg*/
> >     ...
> >     rte_eth_filter_ctl(port, RTE_ETH_FILTER_FDIR, RTE_ETH_FILTER_OP_SET, &cfg);
> >   }
> >   ...
> > }
> > -
> >
> >
> > Any comments are welcome!
> >
> > At the time being, only Intel PMD is only available on dpdk.org. We are lack of understanding
> > on the other non-Intel PMD, the current design did not take them into account. But we are
> > looking for the inputs from those PMD developers, we strongly look forward to those PMD
> > are released as open source.
> >
> > Thanks!
> > Jingjing
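The dispatch idea in the quoted design (one generic entry point, with the filter type, the operation and a sub-command in the argument structure selecting the actual work) can be sketched as a small self-contained mock. Everything below is hypothetical and exists only for illustration: the demo_* names, the `value` field, and the choice of error codes are assumptions of mine, not part of the proposal or of any DPDK release.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FILTER_FDIR      4   /* mirrors RTE_ETH_FILTER_FDIR in the proposal */
#define DEMO_FILTER_OP_SET    4   /* mirrors RTE_ETH_FILTER_OP_SET */
#define DEMO_FDIR_SET_MASK    0   /* sub-commands carried in cfg_type */
#define DEMO_FDIR_SET_OFFSET  1

struct demo_fdir_cfg {
    uint16_t cfg_type;  /* which sub-operation to perform */
    uint32_t value;     /* hypothetical payload for that sub-operation */
};

struct demo_dev;
typedef int (*demo_filter_ctl_t)(struct demo_dev *dev, uint16_t filter_type,
                                 uint16_t filter_op, void *arg);

struct demo_dev {
    demo_filter_ctl_t filter_ctl;  /* NULL if the driver has no filter support */
    uint32_t fdir_mask;
    uint32_t fdir_offset;
};

/* Driver side: one callback handles every filter type it knows about. */
int demo_drv_filter_ctl(struct demo_dev *dev, uint16_t filter_type,
                        uint16_t filter_op, void *arg)
{
    struct demo_fdir_cfg *cfg = arg;

    if (filter_type != DEMO_FILTER_FDIR)
        return -ENOTSUP;
    if (filter_op != DEMO_FILTER_OP_SET || cfg == NULL)
        return -EINVAL;

    switch (cfg->cfg_type) {
    case DEMO_FDIR_SET_MASK:
        dev->fdir_mask = cfg->value;
        return 0;
    case DEMO_FDIR_SET_OFFSET:
        dev->fdir_offset = cfg->value;
        return 0;
    default:
        return -EINVAL;
    }
}

/* Generic layer: validates the device and forwards to the driver callback. */
int demo_filter_ctl(struct demo_dev *dev, uint16_t filter_type,
                    uint16_t filter_op, void *arg)
{
    if (dev == NULL)
        return -ENODEV;
    if (dev->filter_ctl == NULL)
        return -ENOTSUP;
    return dev->filter_ctl(dev, filter_type, filter_op, arg);
}

/* A mock port and a ready-made request, so the calls can be tried directly. */
struct demo_dev demo_port0 = { demo_drv_filter_ctl, 0, 0 };
struct demo_fdir_cfg demo_mask_cfg = { DEMO_FDIR_SET_MASK, 0xff00u };
```

Calling demo_filter_ctl(&demo_port0, DEMO_FILTER_FDIR, DEMO_FILTER_OP_SET, &demo_mask_cfg) stores the mask in the mock port. The trade-off of this style is visible even in the mock: the generic layer stays small, but type safety is lost because each driver has to interpret the void pointer according to the (type, op) pair.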
[dpdk-dev] [v2 20/23] librte_cfgfile: interpret config files
Hi Cristian,

2014-06-04 19:08, Cristian Dumitrescu:
> This library provides a tool to interpret config files that have standard
> structure.
>
> It is used by the Packet Framework examples/ip_pipeline sample application.
>
> It originates from examples/qos_sched sample application and now it makes
> this code available as a library for other sample applications to use.
> The code duplication with qos_sched sample app to be addressed later.

4 months ago, you said that this duplication would be addressed later.
Neither you nor anyone at Intel has submitted a patch to clean that up.

I just want to be sure that "later" doesn't mean "never", because I'm about to accept another "later" for cleaning up the old filtering API.

Maybe you just forgot it, so please prove that I'm right to accept "later" clean-ups in general.

Thanks
--
Thomas
[dpdk-dev] [PATCH v5 1/8]i40e:support VxLAN packet identification in librte_ether
2014-10-11 13:55, Jijiang Liu:
> Add data structures and APIs in librte_ether for supporting tunneling UDP
> port configuration on i40e,
> Currently, only VxLAN is implemented, which include
> - VxLAN UDP port initialization
> - Add APIs to configure VxLAN UDP port

Please could you explain in the commit log how it is related to filtering?

[...]
> + /**
> + * Add tunneling UDP port configuration of Ethernet device

tunneling UDP or UDP tunneling?
Please explain what the device could do with this information.
Offloading? Filtering?

> + *
> + * @param port_id
> + * The port identifier of the Ethernet device.
> + * @param tunnel_udp
> + * Where to store the current Tunneling UDP configuration
> + * of the Ethernet device.

Many words are useless here. "UDP tunneling configuration" should be
sufficient.

> + * @param count
> + * How many configurations are going to added.

It's a verbose commenting style, but why not.
Typo: "to be added".

> + *
> + * @return
> + * - (0) if successful.
> + * - (-ENODEV) if port identifier is invalid.
> + * - (-ENOTSUP) if hardware doesn't support tunnel type.
> + */
> +int
> +rte_eth_dev_udp_tunnel_add(uint8_t port_id,
> + struct rte_eth_udp_tunnel *tunnel_udp,
> + uint8_t count);

Thanks
--
Thomas
[dpdk-dev] EAL : Input/output error on DPDK 1.7.1
Hey,

I observe a continuous burst of I/O errors, as indicated below, with the testpmd application on DPDK 1.7.1. This seems to originate from the eal_intr_process_interrupts() function. I seem to have set up the DPDK prerequisites correctly. Another recent post suggested moving back to 1.7.0, but I would like to persist with 1.7.1. Any help or pointers in resolving this would be greatly appreciated.

Much thanks,
Raghav

root@sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1/x86_64-native-linuxapp-gcc/app# ./testpmd -c 0xf -n3 -- -i --nb-cores=3 --nb-ports=2
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error
EAL: Error reading from file descriptor 21: Input/output error

root@sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1# ./tools/dpdk_nic_bind.py --status

Network devices using DPDK-compatible driver
============================================
:02:01.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000
:02:02.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000

Network devices using kernel driver
===================================
:02:00.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth0 drv=e1000 unused=igb_uio *Active*
:02:03.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth3 drv=e1000 unused=igb_uio
:02:05.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth4 drv=e1000 unused=igb_uio
:02:06.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth5 drv=e1000 unused=igb_uio

Other network devices
=====================
[dpdk-dev] [PATCH v4 04/10] VM Power Management application and Makefile.
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alan Carew
> Sent: Sunday, October 12, 2014 8:36 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 04/10] VM Power Management application
> and Makefile.
>
> For launching CLI thread and Monitor thread and initialising
> resources.
> Requires a minimum of two lcores to run, additional cores specified by eal
> core mask are not used.
>
> Signed-off-by: Alan Carew
> ---
> examples/vm_power_manager/Makefile | 57 ++
> examples/vm_power_manager/main.c | 117 +
> examples/vm_power_manager/main.h | 52 +
> 3 files changed, 226 insertions(+)
> create mode 100644 examples/vm_power_manager/Makefile
> create mode 100644 examples/vm_power_manager/main.c
> create mode 100644 examples/vm_power_manager/main.h

[...]

> +# Default target, can be overriden by command line or environment
> +RTE_TARGET ?= x86_64-default-linuxapp-gcc

Tiny comment here. Target should be x86_64-native-linuxapp-gcc

> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# binary name
> +APP = vm_power_mgr
> +
> +# all source are stored in SRCS-y
> +SRCS-y := main.c vm_power_cli.c power_manager.c channel_manager.c
> +SRCS-y += channel_monitor.c
> +
> +CFLAGS += -O3 -lvirt -I$(RTE_SDK)/lib/librte_power/
> +CFLAGS += $(WERROR_FLAGS)
> +
[dpdk-dev] [PATCH] librte_eal: FreeBSD contigmem prevent possible buffer overrun during module unload.
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alan Carew
> Sent: Tuesday, October 14, 2014 1:19 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] librte_eal: FreeBSD contigmem prevent
> possible buffer overrun during module unload.
>
> The maximum amount of contiguous memory regions for FreeBSD is limited by
> RTE_CONTIGMEM_MAX_NUM_BUFS, a pointer to each region is stored in
> static void * contigmem_buffers[RTE_CONTIGMEM_MAX_NUM_BUFS]
>
> A user can specify a greater amount via hw.contigmem.num_buffers,
> while the allocation logic will prevent this allocation from occurring the
> logic in contigmem_unload() will attempt to free hw.contigmem.num_buffers
> and an overrun occurs.
>
> This patch limits the freeing to a maximum of
> RTE_CONTIGMEM_MAX_NUM_BUFS.
>
> Signed-off-by: Alan Carew

Acked-by: Pablo de Lara
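The overrun described in the commit message can be illustrated with a small userspace sketch. This is not the kernel module code: it uses `malloc`/`free` in place of the kernel allocator, the value 64 for `RTE_CONTIGMEM_MAX_NUM_BUFS` is assumed here, and `demo_contigmem_unload` is a hypothetical name. It only shows the idea of clamping the free loop to the size of the pointer array rather than trusting the user-supplied count.

```c
#include <assert.h>
#include <stdlib.h>

#define RTE_CONTIGMEM_MAX_NUM_BUFS 64  /* assumed value for this sketch */

/* One slot per region; the array size is the hard upper bound. */
static void *contigmem_buffers[RTE_CONTIGMEM_MAX_NUM_BUFS];

/* Stand-in for the hw.contigmem.num_buffers tunable, which a user may set
 * higher than the array can hold. */
static int contigmem_num_buffers = RTE_CONTIGMEM_MAX_NUM_BUFS;

/* Frees the regions and returns how many slots were visited. */
static int
demo_contigmem_unload(void)
{
    int limit = contigmem_num_buffers;
    int i;

    /* Never walk past the end of contigmem_buffers[]. */
    if (limit > RTE_CONTIGMEM_MAX_NUM_BUFS)
        limit = RTE_CONTIGMEM_MAX_NUM_BUFS;

    for (i = 0; i < limit; i++) {
        free(contigmem_buffers[i]);   /* free(NULL) is a no-op */
        contigmem_buffers[i] = NULL;
    }
    return limit < 0 ? 0 : limit;
}
```

Without the clamp, a tunable of, say, 1000 would make the loop index 936 entries past the end of the array.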
[dpdk-dev] [PATCH v5 2/8]i40e:support VxLAN packet identification in librte_pmd_i40e
2014-10-11 13:55, Jijiang Liu:
> #
> +# Compile tunneling UDP port support
> +#
> +CONFIG_RTE_LIBRTE_TUNNEL_UDP_PORT=4789
> +
> +#

1) this option is not to "Compile tunneling UDP port support"
2) why is it a compile time option? should it be an API parameter
   or a runtime option?

> + uint16_t packet_type; /**< Packet type, which indicates packet format */

It's not very clear what packet type is. There is maybe a more precise
description, or is it hardware dependent?

> static struct i40e_veb *i40e_veb_setup(struct i40e_pf *pf,
> - struct i40e_vsi *vsi);
> + struct i40e_vsi *vsi);

It's not related to VXLAN.

--
Thomas
[dpdk-dev] [PATCH v5 3/8]app/test-pmd:test VxLAN packet identification
2014-10-11 13:55, Jijiang Liu:
> - "tx_checksum set mask (port_id)\n"
> + "tx_checksum set (mask) (port_id)\n"
> "Enable hardware insertion of checksum offload with"
> - " the 4-bit mask, 0~0xf, in packets sent on a port.\n"
> + " the 8-bit mask, 0~0xff, in packets sent on a port.\n"
> "bit 0 - insert ip checksum offload if set\n"
> "bit 1 - insert udp checksum offload if set\n"
> "bit 2 - insert tcp checksum offload if set\n"
> "bit 3 - insert sctp checksum offload if set\n"
> + "bit 4 - insert inner ip checksum offload if set\n"
> + "bit 5 - insert inner udp checksum offload if set\n"
> + "bit 6 - insert inner tcp checksum offload if set\n"
> + "bit 7 - insert inner sctp checksum offload if set\n"
> "Please check the NIC datasheet for HW limits.\n\n"
[...]
> .help_str = "enable hardware insertion of L3/L4checksum with a given "
> - "mask in packets sent on a port, the bit mapping is given as, Bit 0 for ip"
> - "Bit 1 for UDP, Bit 2 for TCP, Bit 3 for SCTP",
> + "mask in packets sent on a port, the bit mapping is given as, Bit 0 for ip "
> + "Bit 1 for UDP, Bit 2 for TCP, Bit 3 for SCTP, Bit 4 for inner ip "
> + "Bit 5 for inner UDP, Bit 6 for inner TCP, Bit 7 for inner SCTP",
> .tokens = {
> (void *)&cmd_tx_cksum_set_tx_cksum,
> (void *)&cmd_tx_cksum_set_set,

How is it related to VXLAN?
I may have missed something. But if not, I note the name of the reviewers ;)

--
Thomas
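As a reading aid for the 8-bit mask in the quoted help text, the bit assignment could be written down as below. This is only a sketch: the `DEMO_` names and helper functions are hypothetical and do not appear in testpmd; only the bit positions (0 to 3 for the outer headers, 4 to 7 for the inner headers) are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as listed in the quoted help string. */
#define DEMO_CKSUM_IP          (1u << 0)
#define DEMO_CKSUM_UDP         (1u << 1)
#define DEMO_CKSUM_TCP         (1u << 2)
#define DEMO_CKSUM_SCTP        (1u << 3)
#define DEMO_CKSUM_INNER_IP    (1u << 4)
#define DEMO_CKSUM_INNER_UDP   (1u << 5)
#define DEMO_CKSUM_INNER_TCP   (1u << 6)
#define DEMO_CKSUM_INNER_SCTP  (1u << 7)

#define DEMO_CKSUM_INNER_MASK  0xf0u   /* bits 4..7: inner headers */

/* The command accepts 0..0xff after the patch (0..0xf before it). */
static int
demo_cksum_mask_valid(unsigned int mask)
{
    return mask <= 0xffu;
}

/* True if any inner (encapsulated) checksum offload is requested. */
static int
demo_cksum_wants_inner(unsigned int mask)
{
    return (mask & DEMO_CKSUM_INNER_MASK) != 0;
}
```

For example, requesting outer IP plus inner IP and inner UDP checksums would be `DEMO_CKSUM_IP | DEMO_CKSUM_INNER_IP | DEMO_CKSUM_INNER_UDP`, i.e. mask 0x31.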
[dpdk-dev] [PATCH v5 4/8]librte_ether:add a common filter API
I don't review the common API as it should be done in a single place
and there are many copies in different patchsets.
Let's focus on tunnels.

2014-10-11 13:55, Jijiang Liu:
> +/ TUNNEL FILTER DATA DEFINATION *** */

We cannot miss this comment :)

> +#define ETH_TUNNEL_FILTER_OMAC 0x01
> +#define ETH_TUNNEL_FILTER_OIP 0x02
> +#define ETH_TUNNEL_FILTER_TENID 0x04
> +#define ETH_TUNNEL_FILTER_IMAC 0x08
> +#define ETH_TUNNEL_FILTER_IVLAN 0x10
> +#define ETH_TUNNEL_FILTER_IIP 0x20
> +
> +#define RTE_TUNNEL_FLAGS_TO_QUEUE 1

These values require some comments.

> +/*
> + * Tunneled filter type
> + */
> +enum rte_tunnel_filter_type {
> + RTE_TUNNEL_FILTER_TYPE_NONE = 0,
> + RTE_TUNNEL_FILTER_OIP = ETH_TUNNEL_FILTER_OIP,
> + RTE_TUNNEL_FILTER_IMAC_IVLAN =
> + ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN,
> + RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID =
> + ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN |
> + ETH_TUNNEL_FILTER_TENID,
> + RTE_TUNNEL_FILTER_IMAC_TENID =
> + ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_TENID,
> + RTE_TUNNEL_FILTER_IMAC = ETH_TUNNEL_FILTER_IMAC,
> + RTE_TUNNEL_FILTER_OMAC_TENID_IMAC =
> + ETH_TUNNEL_FILTER_OMAC | ETH_TUNNEL_FILTER_TENID |
> + ETH_TUNNEL_FILTER_IMAC,
> + RTE_TUNNEL_FILTER_IIP = ETH_TUNNEL_FILTER_IIP,
> + RTE_TUNNEL_FILTER_TYPE_MAX,
> +};

It's absolutely impossible to understand.
Keep in mind the first goal of an API: to be used (which implies being
understood by users).
And I really don't understand why you define values for combinations of the
previous flags.
Please, keep it simple.

--
Thomas
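One possible reading of "keep it simple" is sketched below: keep only the per-field flags and let callers OR them together, instead of enumerating every combination in a second enum. This is an illustration, not what the patch or the reviewer proposed in detail. The flag values are copied from the quoted patch; the expanded field names in the comments are an interpretation, and the `DEMO_`/`demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Per-field flags, values as in the quoted patch. */
#define ETH_TUNNEL_FILTER_OMAC   0x01  /* outer MAC address */
#define ETH_TUNNEL_FILTER_OIP    0x02  /* outer IP address */
#define ETH_TUNNEL_FILTER_TENID  0x04  /* tenant ID */
#define ETH_TUNNEL_FILTER_IMAC   0x08  /* inner MAC address */
#define ETH_TUNNEL_FILTER_IVLAN  0x10  /* inner VLAN */
#define ETH_TUNNEL_FILTER_IIP    0x20  /* inner IP address */

#define DEMO_TUNNEL_FILTER_ALL   0x3f  /* every defined flag */

/* A filter is well-formed if it selects at least one field and
 * uses only defined bits. */
static int
demo_tunnel_filter_valid(uint32_t flags)
{
    return flags != 0 && (flags & ~(uint32_t)DEMO_TUNNEL_FILTER_ALL) == 0;
}

/* True if the filter matches on the given field. */
static int
demo_tunnel_filter_has(uint32_t flags, uint32_t field)
{
    return (flags & field) == field;
}
```

A caller wanting the patch's `RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID` behaviour would pass `ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN | ETH_TUNNEL_FILTER_TENID` directly. A driver that supports only certain combinations could still reject the others at run time.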
[dpdk-dev] [PATCH v5 0/8]Support VxLAN on Fortville
This test report brings a lot of details. It's a good thing but we should
find a way to remove the "administrative words".
It should start with the tested-by line to allow copy paste in the commit log.

2014-10-11 07:56, Liu, Yong:
> Patch name: Support VxLAN on Fortville
> Brief description: Verify vxlan checksum detect/offload and tunnel filter
> work fine.
> Test Flag: Tested-by
> Tester name: yong.liu at intel.com
> Test environment:
> OS: Fedora20 3.15.8-200.fc20.x86_64
> GCC: gcc version 4.8.3 20140624
> CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
> NIC: Intel Corporation Ethernet Controller XL710 for
> 40GbE QSFP+ [8086:1583]
> Test Tool Chain information:
> N/A
> Commit ID: ee1a5470faa751c1fd07d23b86659fe7a68fd251
>
> Detailed Testing information
> DPDK SW Configuration:
> Default x86_64-native-linuxapp-gcc configuration
> Test Result Summary: Total 6 cases, 6 passed, 0 failed
>
> Test Case - name:
> vxlan_ipv4_detect
> Test Case - Description:
> Check testpmd can receive and detect vxlan packet
> Test Case -command / instruction:
> Start testpmd with vxlan enabled and rss disabled

Why is RSS disabled?

> testpmd -c -n 4 -- -i --tunnel-type=1 --disble-rss
> --rxq=4 --txq=4 --nb-cores=8 --nb-ports=2

--disble-rss: typo in command line.
It raises doubts on the test.

--
Thomas
[dpdk-dev] [PATCH v5 7/8]i40e:support VxLAN Tx checksum offload
2014-10-11 13:55, Jijiang Liu:
> Support VxLAN Tx checksum offload, which include
> - outer L3(IP) checksum offload
> - inner L3(IP) checksum offload
> - inner L4(UDP, TCP and SCTP) checksum offload

[...]

> +
> + /* fields to support tunnelling packet TX offloads */

I know that previous comment is "fields to support TX offloads",
but I'd prefer "for TX offloading of tunnels".
Maybe that "encapsulation" is better than "tunnel". Just my opinion.

> + union {
> + /**< combined inner l2/l3 lengths as single var */
> + uint16_t inner_l2_l3_len;
> +
> + struct {
> + /**< inner L3 (IP) Header Length. */
> + uint16_t inner_l3_len:9;
> +
> + /**< L2 (MAC) Header Length. */
> + uint16_t inner_l2_len:7;
> + };
> + };

I would like to highlight that you are using 2 bytes in the second cache
line of the mbuf. It deserves at least a line in the commit log.
Actually I'd prefer a separate patch for mbuf modifications.

Thanks
--
Thomas
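The "2 bytes" remark can be checked with a standalone copy of the quoted union. This is a sketch under assumptions: `struct demo_tunnel_lens` is a hypothetical wrapper (the real fields sit inside the mbuf), and it relies on anonymous structs/unions (C11 or a GNU extension) and on the compiler packing `uint16_t` bit-fields into one 16-bit unit, which GCC and Clang do but the C standard does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper around the union quoted from the patch. */
struct demo_tunnel_lens {
    union {
        /* combined inner l2/l3 lengths as a single variable */
        uint16_t inner_l2_l3_len;

        struct {
            uint16_t inner_l3_len:9;  /* inner L3 (IP) header length, 0..511 */
            uint16_t inner_l2_len:7;  /* inner L2 (MAC) header length, 0..127 */
        };
    };
};

/* Clears both lengths with one 16-bit store through the combined view. */
static void
demo_tunnel_lens_reset(struct demo_tunnel_lens *l)
{
    l->inner_l2_l3_len = 0;
}
```

The combined field makes resetting both lengths a single store, which is presumably why the patch overlays them. How the two bit-fields map onto the bits of `inner_l2_l3_len` depends on the ABI, so code should not assume a particular layout.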
[dpdk-dev] filter_ctl PMD API idea
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, October 17, 2014 12:07 AM
> To: Wu, Jingjing
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] filter_ctl PMD API idea
>
> 2014-09-08 15:06, Wu, Jingjing:
> > Any comments or advises?
> >
> > Thanks!
> >
> > Fortville Filter features' development will be started based on this design
> > this week.
>
> Thanks Jingjing for explaining your plan before working on it.
> There were no comment for 1 month so we'll assume everybody is OK.
> Now your work is done and it's time to integrate it.
> It's great, thanks.
> This design is used in many pending patchsets.
> Now I wait for an unique patch out of any patchset in order to do some
> comments about implementation.

OK, a single patch with this new API definition will be sent soon. I hope it
can be reviewed with high priority, because many pending patchsets will need
to be reworked on top of it.

> Then it will be applied with i40e filters using this API.
> So we'll have a new API implemented only for i40e.
> But when DPDK 1.8 will be out, I expect to receive patches replacing old API
> with this new one for igb and ixgbe.

Fine, I am now working on integrating ixgbe's flow director into the new APIs.

> Last request, please could you write a brief email summarizing all filters
> of Intel NICs from an user perspective, and which ones are implemented in
> DPDK, with which API?
>

OK.

> Thanks
>
> > > -----Original Message-----
> > > Hi, all
> > >
> > > When we develop filters feature in i40e driver for Intel(R) Ethernet
> > > Controller XL710/X710
> > > [Fortville] (For both 10G/40G), we found that there are lots of new
> > > filters, there are also
> > > some changes on the existing filters, comparing to ixgbe.
> > > If we keep the way to add new ops in rte_eth_dev for each new filter, it
> > > can work.
> > > But we suggest to use a more generic API for all filters to avoid a
> > > superset dev_ops. It needs
> > > to be cleaner and easy-to-use. There is a need for technical discussion.
> > >
> > > Here is the early design idea we are looking for comments.
> > >
> > > 1. Create two new APIs
> > > -
> > > rte_eth_filter_supported(uint8_t port, uint16_t filter_type);
> > > /* check whether this filter type is supported for the queried port */
> > > rte_eth_filter_ctl(uint8_t port, uint16_t filter_type, uint16_t
> > > filter_op, void *arg);
> > > /* configure filters, will call new ops eth_filter_ctl in eth_dev_ops */
> > > -
> > >
> > > 2. Define filter types, operations, and structures in new header file
> > > lib/librte_eth/rte_eth_filter.h.
> > > -
> > > #define RTE_ETH_FILTER_RSS 1
> > > #define RTE_ETH_FILTER_SYN 2
> > > #define RTE_ETH_FILTER_5TUPLE 3
> > > #define RTE_ETH_FILTER_FDIR 4
> > >
> > >
> > > #define RTE_ETH_FILTER_OP_GET 1
> > > #define RTE_ETH_FILTER_OP_ADD 2
> > > #define RTE_ETH_FILTER_OP_DELETE 3
> > > #define RTE_ETH_FILTER_OP_SET 4
> > > < other operations if want to define>...
> > >
> > > /* structures defined for corresponding filter type and operation */
> > > /* take RTE_ETH_FILTER_FDIR and OP_SET for example*/
> > >
> > > struct rte_eth_filter_fdir_cfg {
> > > #define RTE_ETH_FILTER_FDIR_SET_MASK 0
> > > #define RTE_ETH_FILTER_FDIR_SET_OFFSET 1
> > >     ...
> > >     uint16_t cfg_type;
> > >     /* sub operation to defined what specific configuration it will take,
> > >        and which following fields are meaningful*/
> > >
> > >     /* fields, can be a union or combine of required specific items*/
> > >
> > >
> > > };
> > >
> > > -
> > > By this way, It is easy to add more filter types or operation in future.
> > > And the difference among the same filter and operation can be distinguish
> > > by sub command
> > > in defined structure, e.g. "cfg_type" in above rte_eth_filter_fdir_cfg
> > > structure.
> > >
> > > 3. Define ops in driver (take i40e for example)
> > > -
> > > static struct eth_dev_ops i40e_eth_dev_ops = {
> > >     .filter_ctl = i40e_filter_ctl,
> > > };
> > > -
> > > Then the functions in drivers can be implemented separately.
> > >
> > > 4. Use case In test-pmd/cmdline.c
> > > -
> > > #include
> > > /* add or change commands e.g. fdir_set (arg1) (arg2) ... */
> > >
> > > static void
> > > cmd_fdir_parsed()
> > > {
> > >     ...
> > >     /* take setting fdir mask for example*/
> > >     struct rte_eth_filter_fdir_cfg cfg;
> > >
> > >     if (rte_eth_filter_supported(port, RTE_ETH_FILTER_FDIR)) {
> > >         cfg.cfg_type = RTE_ETH_FILTER_FDIR_SET