[dpdk-dev] [PATCH v2 3/7]rte_ether:add API of VxLAN packet filter in librte_ether
> -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Wednesday, August 27, 2014 11:12 PM > To: Liu, Jijiang > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 3/7]rte_ether:add API of VxLAN packet filter > in librte_ether > > 2014-08-26 15:31, Jijiang Liu: > > +enum rte_tunnel_filter_type { > > + RTE_TUNNEL_FILTER_TYPE_NONE = 0, > > + RTE_TUNNEL_FILTER_IMAC_IVLAN, /**< Filter by inner MAC and VLAN ID. > */ > > + RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID, > > + /**< Filter by inner MAC address and VLAN ID, tenned ID. */ > > + RTE_TUNNEL_FILTER_IMAC_TENID, /**< Filter by inner MAC and tenant > ID. */ > > + RTE_TUNNEL_FILTER_IMAC, /**< Filter by inner MAC address */ > > + RTE_TUNNEL_FILTER_OMAC_TENID_IMAC, > > + /**< Filter by outer MAC address, tenant ID and Inner MAC */ > > + RTE_TUNNEL_FILTER_TYPE_MAX, > > +}; > [...] > > /** > > + * Tunnel Packet filter configuration. > > + */ > > +struct rte_eth_tunnel_filter_conf { > > + struct ether_addr *outer_mac; /**< Outer MAC address fiter. */ > > + struct ether_addr *inner_mac; /**< Inner MAC address fiter. */ > > + uint16_t inner_vlan; /**< Inner VLAN fiter. */ > > + enum rte_tunnel_iptype ip_type; /**< IP address type. */ > > + union { > > + uint32_t ipv4_addr;/**< IPv4 source address to match. */ > > + uint32_t ipv6_addr[4]; /**< IPv6 source address to match. */ > > + } ip_addr; /**< IPv4/IPv6 source address to match (union of above). > > +*/ > > + > > + uint8_t filter_type; /**< Filter type. */ > > + uint8_t to_queue; /**< Use MAC and VLAN to point to a > queue. */ > > + enum rte_eth_tunnel_type tunnel_type; /**< Tunnel Type. */ > > + uint32_t tenant_id;/** < Tenant number. */ > > + uint16_t queue_id; /** < queue number. */ > > +}; > [...] > > +typedef int (*eth_tunnel_filter_set_t)(struct rte_eth_dev *dev, > > + struct rte_eth_tunnel_filter_conf > > *tunnel_filter, > > + uint8_t filter_count, uint8_t add); /**< > > @internal > Set > > +tunnel filter */ > [...] > > + /** > > + * Add tunnel filter configuration of Ethernet device > > + * > > + * @param port_id > > + * The port identifier of the Ethernet device. > > + * @param tunnel_filter > > + * Where to store the current Tunneling UDP configuration > > + * of the Ethernet device. > > + * @param filter_count > > + * How many filters are going to added. > > + * @param add > > + * 0: remove tunnel filter > > + * 1: add tunnel filter > > + * > > + * @return > > + * - (0) if successful. > > + * - (-ENODEV) if port identifier is invalid. > > + * - (-EINVAL) if bad parameter. > > + * - (-ENOTSUP) if hardware doesn't support tunnel type. > > + */ > > +int > > +rte_eth_dev_tunnel_filter_set(uint8_t port_id, > > + struct rte_eth_tunnel_filter_conf *tunnel_filter, > > + uint8_t filter_count, uint8_t add); > > I wonder if we could use a common function to set all kind of filters? > > Thoughts are welcome. > > -- > Thomas The rte_eth_dev_tunnel_filter_set() is a common filter function for tunneling packet, which can set all kind of filters. But now I just implemented and tested VxLAN tunneling packet in this function, another tunneling packets support will be here later. Look at the structure definition, which support another tunneling types. struct rte_eth_tunnel_filter_conf ? ... enum rte_eth_tunnel_type tunnel_type; /**< Tunnel Type. */ ... } -- Jijiang Liu
[dpdk-dev] [PATCH v2 1/7]i40e:support VxLAN packet identification
> -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Wednesday, August 27, 2014 10:59 PM > To: Liu, Jijiang > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 1/7]i40e:support VxLAN packet > identification > > Hi Jijiang, > > 2014-08-26 15:31, Jijiang Liu: > > VxLAN UDP port configuration on i40e, which include > > - VxLAN UDP port initialization > > - add APIs to configure VxLAN UDP port > > Actually you should explain that you introduce tunnels and first one is VxLAN. > > > lib/librte_ether/rte_ethdev.c | 63 > > lib/librte_ether/rte_ethdev.h | 76 ++ > > lib/librte_ether/rte_ether.h | 10 ++ > > lib/librte_pmd_i40e/i40e_ethdev.c | 199 > - > > lib/librte_pmd_i40e/i40e_ethdev.h |5 + > > Changing ethdev API is so sensible that it deserves a separated commit. > > Code seems OK. > You meant that I should split the patch into two patches, one is for changes in librte_ether, other is for Change in librte_pmd_i40e? Or one for VxLAN UDP port initialization, other is for add APIs to configure VxLAN UDP port? -- Jijiang Liu
[dpdk-dev] [PATCH v2 1/7] i40e: flow director resource reserve and initialize on i40e
Hi, Thomas The "flow director resource" means we need to reserve a specific queue and VSI on HW to support. The queue and vsi is used for programming flow director filter, not common tx/rx queues. 2 separated items, one is about setup, another is for release. I will try to separate them. > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Wednesday, August 27, 2014 10:18 PM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 1/7] i40e: flow director resource reserve > and initialize on > i40e > > Hi Jingjing, > > I don't understand the title. What is a "flow director resource"? > > 2014-08-27 10:13, Jingjing Wu: > > flow director resource reserve and initialize on i40e, it includes > > - queue initialization and switch on and vsi creation during setup > > - queue vsi for flow director release during close > > If you have 2 separated items, it probably means it should be 2 patches. > > -- > Thomas
[dpdk-dev] [PATCH v2 3/7] i40e: function implement in i40e for flow director filter programming
Hi, Thomas I will rework on it. > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Wednesday, August 27, 2014 10:25 PM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 3/7] i40e: function implement in i40e for > flow director > filter programming > > 2014-08-27 10:13, Jingjing Wu: > > support the API ops defined in ethdev, the behavior according to each > > command: > > RTE_CMD_FDIR_RULE_ADD: add a new FDIR filter rule. > > RTE_CMD_FDIR_RULE_DEL: delete a FDIR filter rule. > > Here title is really too complicated. Be concise. > I'd change it to: > i40e: flow director filter > > -- > Thomas
[dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API rx_classification_filter_ctl
HI, Thomas Just as zhang, helin's said in his pacth "[dpdk-dev] [PATCH 2/5] ethdev: add new ops of 'check_command_supported' and 'rx_classification_filter_ctl'": 'rx_classification_filter_ctl' is for receive classification filter configuring. e.g. hash function configuration, flow director configuration. It is a common API where a lot of commands can be implemented for different sub features. We want to implement several common API for NIC specific features, to avoid creating quite a lot of ops in 'struct eth_dev_ops'. The idea came from ioctl. Because about flow director feature, there is large gap between i40e and ixgbe. The existed flow director APIs looks specific to IXGBE, so I choose this new API to implement i40e's flow director feature. Here, I briefly describe how the new 'rx_classification_filter_ctl' works: The API is like below: typedef int (*eth_rx_classification_filter_ctl_t)(struct rte_eth_dev *dev, enum rte_eth_command cmd, void *arg); Define a head file called rte_i40e.h in lib/librte_pmd_i40e, which contains the definition about structures specific to i40e, linked to the arg parameters above. Define a head file called rte_eth_features.h in lib/librte_ether, which contains the commands' definition linked to cmd parameters above. And if user want to use i40e specific features, then the head file rte_i40e.h need to be included user's application, for example, test-pmd. And user can encode these structures and call XXX_ctl API to configure their features. Do you think it make sense? And about the rename things, I will move it to a separate patch. > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Wednesday, August 27, 2014 10:23 PM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API > rx_classification_filter_ctl > > Hi Jingjing, > > 2014-08-27 10:13, Jingjing Wu: > > support a new ethdev API rx_classification_filter_ctl for all > > the configuration or queries for receive classification filters. > > this patch supports commands the API used below: > > RTE_CMD_FDIR_RULE_ADD > > RTE_CMD_FDIR_RULE_DEL > > RTE_CMD_FDIR_FLUSH > > RTE_CMD_FDIR_INFO_GET > > Could you explain why existing API (flow director + filters) is not enough? > I'd really like to see a common API for all filtering stuff. > > > -/* for 40G only */ > > -#define ETH_RSS_NONF_IPV4_UDP_SHIFT 31 > > -#define ETH_RSS_NONF_IPV4_TCP_SHIFT 33 > > -#define ETH_RSS_NONF_IPV4_SCTP_SHIFT 34 > > -#define ETH_RSS_NONF_IPV4_OTHER_SHIFT 35 > > -#define ETH_RSS_FRAG_IPV4_SHIFT 36 > > -#define ETH_RSS_NONF_IPV6_UDP_SHIFT 41 > > -#define ETH_RSS_NONF_IPV6_TCP_SHIFT 43 > > -#define ETH_RSS_NONF_IPV6_SCTP_SHIFT 44 > > -#define ETH_RSS_NONF_IPV6_OTHER_SHIFT 45 > > -#define ETH_RSS_FRAG_IPV6_SHIFT 46 > > -#define ETH_RSS_FCOE_OX_SHIFT 48 > > -#define ETH_RSS_FCOE_RX_SHIFT 49 > > -#define ETH_RSS_FCOE_OTHER_SHIFT 50 > > -#define ETH_RSS_L2_PAYLOAD_SHIFT 63 > > +/* Packet Classification Type for 40G only */ > > +#define ETH_PCTYPE_NONF_IPV4_UDP 31 > > +#define ETH_PCTYPE_NONF_IPV4_TCP 33 > > +#define ETH_PCTYPE_NONF_IPV4_SCTP 34 > > +#define ETH_PCTYPE_NONF_IPV4_OTHER35 > > +#define ETH_PCTYPE_FRAG_IPV4 36 > > +#define ETH_PCTYPE_NONF_IPV6_UDP 41 > > +#define ETH_PCTYPE_NONF_IPV6_TCP 43 > > +#define ETH_PCTYPE_NONF_IPV6_SCTP 44 > > +#define ETH_PCTYPE_NONF_IPV6_OTHER45 > > +#define ETH_PCTYPE_FRAG_IPV6 46 > > +#define ETH_PCTYPE_FCOE_OX48 /* not used */ > > +#define ETH_PCTYPE_FCOE_RX49 /* not used */ > > +#define ETH_PCTYPE_FCOE_OTHER 50 /* not used */ > > +#define ETH_PCTYPE_L2_PAYLOAD 63 > > Why is it specific to i40e? Could we have something generic? > Please take care at having only generic things in librte_ether. > > By the way, these renamings should be in a separated patch. > > -- > Thomas
[dpdk-dev] [PATCH v2 5/7] fix the Marco conflict
Hi, Thomas Because these macros such as IPPROTO_TCP, IPPROTO_UDP are already defined in . If user's application include and rte_ip.h at the same time, there will be conflict error, for example cmdline.c in testpmd. I remember there was someone also raised this issue few month ago. So just use the way "#ifndef #endif" to avoid the conflict. And it is exactly workaround as you said. Oh, it should be macro, but not marco, my spelling mistake. Sorry for that. I will rename this. > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Wednesday, August 27, 2014 10:28 PM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 5/7] fix the Marco conflict > > 2014-08-27 10:13, Jingjing Wu: > > fix the Marco conflict between rte_ip.h and netinet/in.h > > Who is Marco? > > > +#ifndef _NETINET_IN_H > > +#ifndef _NETINET_IN_H_ > > /* IPv4 protocols */ > > #define IPPROTO_IP 0 /**< dummy for IP */ > > #define IPPROTO_HOPOPTS0 /**< IP6 hop-by-hop options */ > > @@ -226,7 +228,8 @@ struct ipv4_hdr { > > #define IPPROTO_DIVERT 254 /**< divert pseudo-protocol */ > > #define IPPROTO_RAW 255 /**< raw IP packet */ > > #define IPPROTO_MAX 256 /**< maximum protocol number */ > > - > > +#endif /* _NETINET_IN_H_ */ > > +#endif /* _NETINET_IN_H */ > > Please explain your "fix" (which seems to be a workaround). > > -- > Thomas
[dpdk-dev] [PATCH v2 7/7]app/testpmd: add commands and config functions for i40e flow director support
Hi, Thomas > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Wednesday, August 27, 2014 10:36 PM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 7/7]app/testpmd: add commands and config > functions for > i40e flow director support > > Hi Jingjing, > > 2014-08-27 10:13, Jingjing Wu: > > add structure definition to construct programming packet. > > What is a "programming packet"? For Fortville, we need to set a flow director filter by sending a packet which contains the input set values through the queue belonging to flow director. > > > +#ifdef RTE_LIBRTE_I40E_PMD > > + "i40e_flow_director_filter (port_id) (add|del)" > > + " flow (ip4|ip6) src (src_ip_address) dst > > (dst_ip_address)" > > + " flexwords (flexwords_value) (drop|fwd)" > > + " queue (queue_id) fd_id (fd_id_value)\n" > > + "Add/Del a IP type flow director filter for i40e > > NIC.\n\n" > > + > > + "i40e_flow_director_filter (port_id) (add|del)" > > + " flow (udp4|tcp4|udp6|tcp6)" > > + " src (src_ip_address) (src_port)" > > + " dst (dst_ip_address) (dst_port)" > > + " flexwords (flexwords_value) (drop|fwd)" > > + " queue (queue_id) fd_id (fd_id_value)\n" > > + "Add/Del a UDP/TCP type flow director filter for > > i40e NIC.\n\n" > > + > > + "i40e_flush_flow_diretor (port_id)\n" > > + "Flush all flow director entries of a device on > > i40e NIC.\n\n" > > +#endif /* RTE_LIBRTE_I40E_PMD */ > > I'd really like to stop seeing this kind of thing. > We cannot add some ifdef for each PMD in generic code. > > I stopped reading after that. > > Sorry, I don't want to be rude but my feeling is that adding such feature > with global picture in mind is not easy. I know you want to offer all i40e > capabilities but you should think at future evolutions and how other drivers > will be integrated with yours. > Sorry to make you feel uncomfortable for such code. Just as you say, I want to offer more i40e capabilities. I will rework code in testpmd. > Thanks > -- > Thomas
[dpdk-dev] overcommitting CPUs
PMD is combined of 'PM' - a thread model and 'D' - a user space driver. DPDK provides optimized RX and TX in Driver on fast path. DPDK provides a single thread core affinity model to demonstrate the best IO with minimum noisy penalty. They are not tight coupling as Venky said. In some cases, you may only pick up the RX/TX but give up the thread model DPDK provided. Just take care to well handle the penalty may exist in the specific thread model. For DPDK, we do think on it, and start to deal with the negative factor. In another perspective, the more cycles we gain on 'D' side the more we could spend on 'PM' side to cancel the penalty out. Maybe a sample using RX/TX without dead polling is a good start. But cannot expect more on user space wake up latency so far. Regards, Liang Cunming > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Venkatesan, Venky > Sent: Wednesday, August 27, 2014 10:54 PM > To: dev at dpdk.org > Subject: Re: [dpdk-dev] overcommitting CPUs > > DPDK currently isn't exactly poll mode - it has an API that receives and > transmits packets. How you enter that API could be interrupt or polled > -we've left that up to the application to decide, rather than force a > interrupt/NAPI type architecture. I do agree with Alex in that > implementing a interrupt/load driven entry point as an option will make > it usable more widely. There are multiple challenges here - managing the > latency of an interrupt driven scheme in a user-space context, not to > mention very high jitter rates to mention a few. > > That said, overcommitment of CPUs can be achieved in other ways as well. > You could allocate and enforce CPU sharing via cgroups, and allocate x% > of a core to the DPDK pthread. It does introduce a degree of > indeterminism to when the DPDK pthread gets scheduled back in (depending > on how many other threads are running on that core). But it is another > option ... > > Regards, > -Venky > > On 8/27/2014 1:40 AM, Alex Markuze wrote: > > IMHO adding "Interrupt Mode" to dpdk is important as this can open > > DPDK to a larger public of consumers, I can easily imagine someone > > trying to find user space networking solution (And deciding against > > verbs - RDMA) for the obvious reasons and not needing deterministic > > latency. > > > > A few thoughts: > > > > Deterministic Latency: Its a fiction in a sence that this something > > you will be able to see only in a small controlled environment. As > > network latencies in Data Centres(DC) are dominated by switch queuing > > (One good reference is http://fastpass.mit.edu that Vincent shared a > > few days back). > > > > Virtual environments: In virtual environments this is especially > > interesting as the NIC driver(Hypervisor) is working in IRQ mode which > > unless the Interrupts are pinned to different cpus then the VM will > > have a disruptive effect on the VM's performance. Moving to interrupt > > mode mode in paravirtualised environments makes sense as in any > > environment that is not carefully crafted you should not expect any > > deterministic guaranties and would opt for a simpler programming model > > - like interrupt mode. > > > > NAPI: With 10G NICs Most CPUs poll rate is faster then the NIC message > > rate resulting in 1:1 napi_poll callback to IRQ ratio this is true > > even with small packets. In some cases where the CPU is working slower > > - for example when intel_iommu=on,strict is set , you can actually see > > a performance inversion where the "slower" CPU can reach higher B/W > > because the slowdown makes NAPI work with the kernel effectively > > moving to polling mode. > > > > I think that a smarter DPDK-NAPI is important, but it is a next step > > IFF the interrupt mode is adopted. > > > > On Wed, Aug 27, 2014 at 8:48 AM, Patel, Rashmin N > > wrote: > >> You're right and I've felt the same harder part of determinism with other > hypervisors' soft switch solutions as well. I think it's worth thinking about. > >> > >> Thanks, > >> Rashmin > >> > >> On Aug 26, 2014 9:15 PM, Stephen Hemminger > wrote: > >> The way to handle switch between out of poll mode is to use IRQ coalescing > >> parameters. > >> You want to hold off IRQ until there are a couple packets or a short delay. > >> Going out of poll mode > >> is harder to determine. > >> > >> > >> On Tue, Aug 26, 2014 at 9:59 AM, Zhou, Danny > wrote: > >> > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen > Hemminger > Sent: Wednesday, August 27, 2014 12:39 AM > To: Michael Marchetti > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] overcommitting CPUs > > On Tue, 26 Aug 2014 16:27:14 + > "Michael Marchetti" wrote: > > > Hi, has there been any consideration to introduce a non-spinning > >>> network driver (interrupt based), for the purpose of overcommitting > CPUs in a virtualized environment? This would obviously ha
[dpdk-dev] vxlan performance data under gro/gso in kernel after 3.14
newest ovs support using native linux tunnel tx/rx api, and linux kernel up to 3.14, which has gso/gro merged for udp tunnel. I remember someone submitted this patch has attained 12Gbps for linux vxlan tcp, so are there any performance data for ovs vxlan tcp?
[dpdk-dev] vxlan performance data under gro/gso in kernel after 3.14
sorry, just depreciate it, I post it in the wrong list :-) On Thu, Aug 28, 2014 at 12:56 PM, loy wolfe wrote: > newest ovs support using native linux tunnel tx/rx api, and linux kernel > up to 3.14, which has gso/gro merged for udp tunnel. I remember someone > submitted this patch has attained 12Gbps for linux vxlan tcp, so are there > any performance data for ovs vxlan tcp? >
[dpdk-dev] Clang Scan build results
Hi Matthew, On Aug 27, 2014, at 10:00 PM, Matthew Hall wrote: > On Wed, Aug 27, 2014 at 03:13:43PM +, Wiles, Roger Keith wrote: >> Hi Everyone, > > Hi Keith, > > For me the build failed with clang but I made a series of awful patches to > get > it to compile... not sure if the clang failures could be related to your > scan-build failures. If it will help you I can provide you the patches I made > to get it to work on clang... they are not ready for the DPDK master branch > but it's really good to get safe output from scan-build. I was able to get DPDK to build using target x86_64-native-linuxapp-clang without errors, but with scan-build it gave me errors that were not in the normal clang build. Was this the target system you were building or something else? > >> It would be nice to run once in a while to weed out any basic problems. We >> could run something like PC-Lint or Coverity, but they cost money :-) > > Not 100% true... you can run Coverity for free on open source if you are the > maintainer... given Intel, Wind River, and 6WIND all have some form of > maintainership authority over DPDK there should be a way to qualify via this > avenue. Yes, it was pointer out to me that Coverity is free to open source via scan.coverity.com site, just have to register and get accepted. > > Matthew. Keith Wiles, Principal Technologist with CTO office, Wind River mobile 972-213-5533
[dpdk-dev] [PATCH v2 1/7]i40e:support VxLAN packet identification
2014-08-28 01:33, Liu, Jijiang: > > > lib/librte_ether/rte_ethdev.c | 63 > > > lib/librte_ether/rte_ethdev.h | 76 ++ > > > lib/librte_ether/rte_ether.h | 10 ++ > > > lib/librte_pmd_i40e/i40e_ethdev.c | 199 > > > > Changing ethdev API is so sensible that it deserves a separated commit. > > > You meant that I should split the patch into two patches, one is for > changes in librte_ether, other is for Change in librte_pmd_i40e? Yes, 1 patch for ethdev (including rte_ether.h) and the other one for i40e. -- Thomas
[dpdk-dev] [PATCH v2 3/7]rte_ether:add API of VxLAN packet filter in librte_ether
2014-08-28 00:55, Liu, Jijiang: > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > > I wonder if we could use a common function to set all kind of filters? > > > > Thoughts are welcome. > > The rte_eth_dev_tunnel_filter_set() is a common filter function for > tunneling packet, which can set all kind of filters. I understand that. But my question was: could we have common functions for tunnel filters and (existing) generic filters? -- Thomas
[dpdk-dev] next releases
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cyril Chemparathy > Sent: Wednesday, August 27, 2014 6:05 PM > To: Thomas Monjalon; dev at dpdk.org > Subject: Re: [dpdk-dev] next releases > > Hi Thomas, > > On 8/25/2014 10:15 AM, Thomas Monjalon wrote: > > Hello all, > > > > I am back from holidays; thanks for all the > > patches/reviews/comments done during last weeks. > > > > I'd like to have a version 1.7.1, ideally at the end of this week. > > > > For the coming days, > > - first priority is to integrate bug fixes > > - some changes which do not imply API could be part of 1.7.1 > > - please, do not send more features until 1.8.0-rc1 > > - features that have been been *properly* reviewed or acked before > > end of august will be integrated in 1.8.0-rc1 > > - all pending features which do not have any review will be postponed > > after 1.8.0-rc1 > > - then rc2 will integrate new features if *properly* reviewed at that > > time > > > > I'd like to have some cleanups in version 1.8. Examples: > > - get rid of doxygen warnings > > - check if compile time options can be moved to run time > > - rename some options (CONFIG_RTE_LIBRTE_*_PMD -> > CONFIG_RTE_LIBRTE_PMD_*) > > - merge common code between linux and bsd implementions > > - check secondary process rights > > - remove drivers lists from code for easy integration of new drivers > > - use rte_eth_dev_atomic_read_link_status in drivers > > - use librte_cfgfile instead of examples/qos_sched > > - add sysfs functions as eal services > > - replace printf calls by rte functions > > - use new assert macros for unit tests > > - remove kni traces from bsd > > - remove bare metal traces > > - compress test_lpm*_routes.h > > Any thoughts on consolidating/cleaning up the timer interfaces? > > Usage across rte_rdtsc(), rte_get_tsc_cycles(), and > rte_get_timer_cycles() could use some rationalization, I think. > > It looks like most code should use rte_get_timer_cycles() and generally > honor user specified timer source selection. The relatively few places > that have a good cause to pin down on TSC should probably use > rte_get_tsc_cycles() instead of rde_rdtsc(). On the other hand, if > rte_rdtsc() is meant for direct use, why do we need the > rte_get_tsc_cycles() wrapper? > > Also, I'm not quite clear on the intended usage of rte_rdtsc_precise(). > I can't find uses of this function on master, and it is not quite clear > to me if the intent is to replace rte_rdtsc() in some (or all?) places > in the code. Any insights on this? > > Thanks > -- Cyril. No comments on any planned cleanup, but I can provide some background on the existing situation. (This is all to the best of my memory, since much of it dates back a few years and I'm not digging through old repos for the official logs :-) ) Originally there was an explicit rte_rdtsc() call to do timestamping and rough cycle counts, and an EAL provided timer api to do wall-time measurements using the HPET. Using the HPET proved more awkward than it needed to be (due to the fact that most distro kernels did not enable the MMAP_HPET option), and had less precision than using rdtsc. So we changed over to having the EAL timer use either HPET or TSC depending on what was available at boot. So we had two timer APIs both of which generally used the TSC, and no way to allow an application to explicitly request the HPET if that was the timesource we wanted. What was worse was that for compatibility across versions the EAL timers were accessed via APIs with HPET in the name - even when they returned values based off TSC. So we cleaned things up a bit, so that we had a truly generic set of APIs that had a function to return the frequency and cycles for the system-default time source, as well as specific functions to return that info for both HPET (if available) and TSC. The names were made as consistent as possible - so we have rte_get_tsc_cycles() to match the rte_get_timer_cycles() and rte_get_hpet_cycles() functions. The original rte_rdtsc() function was also kept around for backward compatibility, as it was widely used, so both it and rte_get_tsc_cycles() are indeed identical. However, I believe these are the only two duplicate functions in the API, with the wrapper function being just one line long, so it's not a big overhead. I wouldn't look to remove the wrapper function as it would break the consistency of the API for the sake of removing a single-line function, and I definitely wouldn't remove rte_rdtsc() because it's so widely used. As for usage, my recommendation that anything wanting to work with wall-time uses rte_get_timer_cycles() (as you suggest), and that anything doing unit tests which wants to get approximate cycle counts uses rte_rdtsc() as existing unit test code does. As for rte_rdtsc_precise, I'm not sure about its origins, but I'm su
[dpdk-dev] [PATCH v2 7/7]app/testpmd: add commands and config functions for i40e flow director support
2014-08-28 03:51, Wu, Jingjing: > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > > 2014-08-27 10:13, Jingjing Wu: > > > add structure definition to construct programming packet. > > > > What is a "programming packet"? > > For Fortville, we need to set a flow director filter by sending a > packet which contains the input set values through the queue > belonging to flow director. OK. To be more clear, some detailed explanations are required in the commit log. Please try to be very descriptive. You can think comments like this: - if comments are absolutely needed to understand the code, you should put comments in the code (or write the code differently) - if some feature context can help for the review, you should explain context and design in the commit log > > > +#ifdef RTE_LIBRTE_I40E_PMD > > > + "i40e_flow_director_filter (port_id) (add|del)" > > > + " flow (ip4|ip6) src (src_ip_address) dst > > > (dst_ip_address)" > > > + " flexwords (flexwords_value) (drop|fwd)" > > > + " queue (queue_id) fd_id (fd_id_value)\n" > > > + "Add/Del a IP type flow director filter for i40e > > > NIC.\n\n" > > > + > > > + "i40e_flow_director_filter (port_id) (add|del)" > > > + " flow (udp4|tcp4|udp6|tcp6)" > > > + " src (src_ip_address) (src_port)" > > > + " dst (dst_ip_address) (dst_port)" > > > + " flexwords (flexwords_value) (drop|fwd)" > > > + " queue (queue_id) fd_id (fd_id_value)\n" > > > + "Add/Del a UDP/TCP type flow director filter for > > > i40e NIC.\n\n" > > > + > > > + "i40e_flush_flow_diretor (port_id)\n" > > > + "Flush all flow director entries of a device on > > > i40e NIC.\n\n" > > > +#endif /* RTE_LIBRTE_I40E_PMD */ > > > > I'd really like to stop seeing this kind of thing. > > We cannot add some ifdef for each PMD in generic code. > > > > I stopped reading after that. > > > > Sorry, I don't want to be rude but my feeling is that adding such feature > > with global picture in mind is not easy. I know you want to offer all i40e > > capabilities but you should think at future evolutions and how other drivers > > will be integrated with yours. > > > > Sorry to make you feel uncomfortable for such code. Just as you say, > I want to offer more i40e capabilities. I will rework code in testpmd. Thanks -- Thomas
[dpdk-dev] [PATCH v2 5/7] fix the Marco conflict
2014-08-28 03:39, Wu, Jingjing: > Because these macros such as IPPROTO_TCP, IPPROTO_UDP are already > defined in . If user's application include > and rte_ip.h at the same time, there will be conflict error, for > example cmdline.c in testpmd. Yes > I remember there was someone also raised this issue few month ago. Yes, and the question was: "should we totally remove these definitions"? I think yes. > So just use the way "#ifndef #endif" to avoid the conflict. But you didn't explain difference between _NETINET_IN_H and _NETINET_IN_H_. > And it is exactly workaround as you said. Yes, it's a workaround. If rte_ip.h is included before netinet/in.h, I think there is still a problem. -- Thomas
[dpdk-dev] [PATCH v2 7/7]app/testpmd: add commands and config functions for i40e flow director support
Hi, Thomas Thanks for your tips. I have another question: If we use the way 'rx_classification_filter_ctl' works, the specific structures defined in rte_i40e.h will be visible in user's application, such as testpmd. I know I shouldn't make commands linked with i40e like what I did before. But will the i40e specific structures become visible be acceptable? > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Thursday, August 28, 2014 4:51 PM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 7/7]app/testpmd: add commands and config > functions for > i40e flow director support > > 2014-08-28 03:51, Wu, Jingjing: > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > > > 2014-08-27 10:13, Jingjing Wu: > > > > add structure definition to construct programming packet. > > > > > > What is a "programming packet"? > > > > For Fortville, we need to set a flow director filter by sending a > > packet which contains the input set values through the queue > > belonging to flow director. > > OK. To be more clear, some detailed explanations are required in the > commit log. Please try to be very descriptive. > You can think comments like this: > - if comments are absolutely needed to understand the code, you should > put comments in the code (or write the code differently) > - if some feature context can help for the review, you should explain > context and design in the commit log > > > > > +#ifdef RTE_LIBRTE_I40E_PMD > > > > + "i40e_flow_director_filter (port_id) (add|del)" > > > > + " flow (ip4|ip6) src (src_ip_address) dst > > > > (dst_ip_address)" > > > > + " flexwords (flexwords_value) (drop|fwd)" > > > > + " queue (queue_id) fd_id (fd_id_value)\n" > > > > + "Add/Del a IP type flow director filter for > > > > i40e NIC.\n\n" > > > > + > > > > + "i40e_flow_director_filter (port_id) (add|del)" > > > > + " flow (udp4|tcp4|udp6|tcp6)" > > > > + " src (src_ip_address) (src_port)" > > > > + " dst (dst_ip_address) (dst_port)" > > > > + " flexwords (flexwords_value) (drop|fwd)" > > > > + " queue (queue_id) fd_id (fd_id_value)\n" > > > > + "Add/Del a UDP/TCP type flow director > > > > filter for i40e NIC.\n\n" > > > > + > > > > + "i40e_flush_flow_diretor (port_id)\n" > > > > + "Flush all flow director entries of a > > > > device on i40e NIC.\n\n" > > > > +#endif /* RTE_LIBRTE_I40E_PMD */ > > > > > > I'd really like to stop seeing this kind of thing. > > > We cannot add some ifdef for each PMD in generic code. > > > > > > I stopped reading after that. > > > > > > Sorry, I don't want to be rude but my feeling is that adding such feature > > > with global picture in mind is not easy. I know you want to offer all i40e > > > capabilities but you should think at future evolutions and how other > > > drivers > > > will be integrated with yours. > > > > > > > Sorry to make you feel uncomfortable for such code. Just as you say, > > I want to offer more i40e capabilities. I will rework code in testpmd. > > Thanks > -- > Thomas
[dpdk-dev] [PATCHv3] librte_acl make it build/work for 'default' target
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev, Konstantin > Sent: Wednesday, August 27, 2014 8:19 PM > To: Neil Horman > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCHv3] librte_acl make it build/work for 'default' > target > > > > > -Original Message- > > From: Neil Horman [mailto:nhorman at tuxdriver.com] > > Sent: Wednesday, August 27, 2014 7:57 PM > > To: Ananyev, Konstantin > > Cc: dev at dpdk.org; thomas.monjalon at 6wind.com > > Subject: Re: [PATCHv3] librte_acl make it build/work for 'default' target > > > > On Wed, Aug 27, 2014 at 11:25:04AM +, Ananyev, Konstantin wrote: > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com] > > > > Sent: Tuesday, August 26, 2014 6:45 PM > > > > To: Ananyev, Konstantin > > > > Cc: dev at dpdk.org; thomas.monjalon at 6wind.com > > > > Subject: Re: [PATCHv3] librte_acl make it build/work for 'default' > > > > target > > > > > > > > On Mon, Aug 25, 2014 at 04:30:05PM +, Ananyev, Konstantin wrote: > > > > > Hi Neil, > > > > > > > > > > > -Original Message- > > > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com] > > > > > > Sent: Thursday, August 21, 2014 9:15 PM > > > > > > To: dev at dpdk.org > > > > > > Cc: Ananyev, Konstantin; thomas.monjalon at 6wind.com; Neil Horman > > > > > > Subject: [PATCHv3] librte_acl make it build/work for 'default' > > > > > > target > > > > > > > > > > > > Make ACL library to build/work on 'default' architecture: > > > > > > - make rte_acl_classify_scalar really scalar > > > > > > (make sure it wouldn't use sse4 instrincts through > > > > > > resolve_priority()). > > > > > > - Provide two versions of rte_acl_classify code path: > > > > > > rte_acl_classify_sse() - could be build and used only on systems > > > > > > with > sse4.2 > > > > > > and upper, return -ENOTSUP on lower arch. > > > > > > rte_acl_classify_scalar() - a slower version, but could be build > > > > > > and used > > > > > > on all systems. > > > > > > - keep common code shared between these two codepaths. > > > > > > > > > > > > v2 chages: > > > > > > run-time selection of most appropriate code-path for given ISA. > > > > > > By default the highest supprted one is selected. > > > > > > User can still override that selection by manually assigning new > > > > > > value > to > > > > > > the global function pointer rte_acl_default_classify. > > > > > > rte_acl_classify() becomes a macro calling whatever > rte_acl_default_classify > > > > > > points to. > > > > > > > > > > > > > > > > I see you decided not to wait for me and fix everything by yourself :) > > > > > > > > > Yeah, sorry, I'm getting pinged about enabling these features in Fedora, > and it > > > > had been about 2 weeks, so I figured I'd just take care of it. > > > > > > No worries. I admit that it was a long delay from my side. > > > > > > > > > > > > > V3 Changes > > > > > > Updated classify pointer to be a function so as to better preserve > > > > > > ABI > > > > > > > > > > As I said in my previous mail it generates extra jump... > > > > > Though from numbers I got the performance impact is negligible: < 1%. > > > > > So I suppose, I don't have a good enough reason to object :) > > > > > > > > > Yeah, I just don't see a way around it. I was hoping that the compiler > would > > > > have been smart enough to see that the rte_acl_classify function was > > > > small > and > > > > in-linable, but apparently it won't do that. As you note however the > > > > performance change is minor (I'm guessing within a standard deviation of > your > > > > results). > > > > > > > > > Though I still think we better keep rte_acl_classify_scalar() > > > > > publically > available (same as we do for rte acl_classify_sse()): > > > > > First of all keep rte_acl_classify_scalar() is already part of our > > > > > public API. > > > > > Also, as I remember, one of the customers explicitly asked for scalar > version and they planned to call it directly. > > > > > Plus using rte_acl_select_classify() to always switch between > implementations is not always handy: > > > > > > > > I'm not exactly opposed to this, though it seems odd to me that a user > might > > > > want to call a particular version of the classifier directly. But I > > > > certainly > > > > can't predict everything a consumer wants to do. If we really need to > > > > keep > it > > > > public then, it begs the question, is providing a generic entry point > > > > even > > > > worthwhile? Is it just as easy to expose the scalar/sse and any future > versions > > > > directly so the application can just embody the intellegence to select > > > > the > best > > > > path? That saves us having to maintain another API point. I can go > > > > with > > > > consensus on that. > > > > > > > > > - it is global, which means that we can't simultaneously use > classify_scalar() and classify_sse() for 2 different ACL contexts. > > > > > - to properly support s
[dpdk-dev] next releases
2014-08-28 08:41, Richardson, Bruce: > As for rte_rdtsc_precise, I'm not sure about its origins, but I'm > surprised to see that it does not correspond to the rdtscp instruction. > Can anyone else comment on this one? I would assume it's designed > to be used to get more accurate measurements of smaller blocks of > code that we want to benchmark, since rdtsc works best when timing > larger blocks (in terms of cycle counts, that is, not source lines :-) ). The good thing with git history (and well written commit logs) is that we can easily get such answer: http://dpdk.org/browse/dpdk/commit/?id=3314648f83c3dc06d7d9a Bruce, do you know how rdtscp is supported across Intel processors? -- Thomas
[dpdk-dev] next releases
> -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Thursday, August 28, 2014 10:06 AM > To: Richardson, Bruce > Cc: Cyril Chemparathy; dev at dpdk.org > Subject: Re: [dpdk-dev] next releases > > 2014-08-28 08:41, Richardson, Bruce: > > As for rte_rdtsc_precise, I'm not sure about its origins, but I'm > > surprised to see that it does not correspond to the rdtscp instruction. > > Can anyone else comment on this one? I would assume it's designed > > to be used to get more accurate measurements of smaller blocks of > > code that we want to benchmark, since rdtsc works best when timing > > larger blocks (in terms of cycle counts, that is, not source lines :-) ). > > The good thing with git history (and well written commit logs) is that > we can easily get such answer: > http://dpdk.org/browse/dpdk/commit/?id=3314648f83c3dc06d7d9a Indeed! I had mistakenly assumed that the function was older than the dpdk.org history. My bad on that one, and thanks for the link. :-) > > Bruce, do you know how rdtscp is supported across Intel processors? Sorry, no I don't, and the instruction set manuals only say to check the relevant cpuid bit to see if it's supported. :-(. If the existing code works ok, there is probably no compelling reason to look to change it though. > > -- > Thomas
[dpdk-dev] [ethdev] Multiple devices with single PCI
Hello. I found that DPDK has an abstraction for having multiple devices with single PCI. (RTE_PCI_DRV_MULTIPLE flag) However, their is a naming collision while registering multiple devices. Here is my possible solution. Best regards. Keunhong. = commit b4fb08c42584283a7c5fbb251ab23f0e2b5f099e Author: leeopop 2014-07-24 22:28:12 Committer: leeopop 2014-07-24 22:28:12 Parent: bcc733c4780a007f56564277a79309c427367cc2 (ethdev: fix build of named allocation debug) Branches: master, github/master multiple dev support lib/librte_ether/rte_ethdev.c diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index fd1010a..06dda6b 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -204,8 +204,12 @@ eth_drv = (struct eth_driver *)pci_drv; /* Create unique Ethernet device name using PCI address */ - snprintf(ethdev_name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d", - pci_dev->addr.bus, pci_dev->addr.devid, pci_dev->addr.function); + if(pci_drv->drv_flags & RTE_PCI_DRV_MULTIPLE) + snprintf(ethdev_name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d-%d", + pci_dev->addr.bus, pci_dev->addr.devid, pci_dev->addr.function, nb_ports); + else + snprintf(ethdev_name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d", + pci_dev->addr.bus, pci_dev->addr.devid, pci_dev->addr.function); eth_dev = rte_eth_dev_allocate(ethdev_name); if (eth_dev == NULL)
[dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API rx_classification_filter_ctl
2014-08-28 03:30, Wu, Jingjing: > We want to implement several common API for NIC specific features, > to avoid creating quite a lot of ops in 'struct eth_dev_ops'. > The idea came from ioctl. The approach can be interesting. > Because about flow director feature, there is large gap between i40e > and ixgbe. The existed flow director APIs looks specific to IXGBE, > so I choose this new API to implement i40e's flow director feature. The API is not OK for you so you create another one. I'm OK to change APIs but you should remove the old one, or at least, implement your new API in existing drivers to allow deprecation of the old API. I think it would help if you start by doing ixgbe work and then apply it to i40e. > The API is like below: > typedef int (*eth_rx_classification_filter_ctl_t)(struct rte_eth_dev *dev, > enum rte_eth_command cmd, > void *arg); > Define a head file called rte_i40e.h in lib/librte_pmd_i40e, which contains > the definition about structures specific to i40e, linked to the arg > parameters above. > Define a head file called rte_eth_features.h in lib/librte_ether, which > contains the commands' definition linked to cmd parameters above. Why creating a rte_eth_features.h? Don't you think rte_ethdev.h is a good place? > And if user want to use i40e specific features, then the head file > rte_i40e.h need to be included user's application, for example, test-pmd. > And user can encode these structures and call XXX_ctl API to configure > their features. OK, but the question is to know what is a specific feature? I don't think flow director is a specific feature. We shouldn't have to care if port is i40e or ixgbe to setup flow director. Is it possible to have a common API and maybe an inheritance of the common structure with PMD specific fields? Example: struct fdir_entry { fdir_input input; fdir_action action; enum rte_driver driver; }; fdir_entry_generic fdir_entry = {.driver = RTE_DRIVER_GENERIC}; filter_ctl(port, FDIR_RULE_ADD, fdir_entry); struct i40e_fdir_entry { struct fdir_entry generic; i40e_fdir_input i40e_input; }; So i40e_input will be handled by the PMD if driver == RTE_DRIVER_I40E. It's just an idea, comments are welcome. -- Thomas
[dpdk-dev] [PATCH v2 7/7]app/testpmd: add commands and config functions for i40e flow director support
2014-08-28 09:01, Wu, Jingjing: > I have another question: > If we use the way 'rx_classification_filter_ctl' works, the specific > structures defined in rte_i40e.h will be visible in user's application, > such as testpmd. > I know I shouldn't make commands linked with i40e like what I did before. > But will the i40e specific structures become visible be acceptable? I think testpmd should be limited to generic API. So it wouldn't be acceptable to be dependent of i40e files. But having some specific i40e tests in examples or app/test is OK. -- Thomas
[dpdk-dev] [PATCH v2 3/7]rte_ether:add API of VxLAN packet filter in librte_ether
Hi Thomas, > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Thursday, August 28, 2014 4:37 PM > To: Liu, Jijiang > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 3/7]rte_ether:add API of VxLAN packet filter > in librte_ether > > 2014-08-28 00:55, Liu, Jijiang: > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > > > I wonder if we could use a common function to set all kind of filters? > > > > > > Thoughts are welcome. > > > > The rte_eth_dev_tunnel_filter_set() is a common filter function for > > tunneling packet, which can set all kind of filters. > > I understand that. But my question was: could we have common functions for > tunnel filters and (existing) generic filters? > > -- > Thomas Tunneling packet is encapsulated format, in order to extend another tunneling type support and distinguish between tunneling and non-tunneling packet, we had better provide independent common filter API. The existing generic filter function is related to normal L2 packet filter(non-tunneling packet). If there are two kind of filter APIs, one is for normal L2 packet, other is for tunneling packet, which will make user more clear when and how to use them. Thanks Jijiang Liu
[dpdk-dev] [PATCH v2 7/7]app/testpmd: add commands and config functions for i40e flow director support
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon > Sent: Thursday, August 28, 2014 12:01 PM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 7/7]app/testpmd: add commands and config > functions for i40e flow director support > > 2014-08-28 09:01, Wu, Jingjing: > > I have another question: > > If we use the way 'rx_classification_filter_ctl' works, the specific > > structures defined in rte_i40e.h will be visible in user's application, > > such as testpmd. > > I know I shouldn't make commands linked with i40e like what I did before. > > But will the i40e specific structures become visible be acceptable? > > I think testpmd should be limited to generic API. > So it wouldn't be acceptable to be dependent of i40e files. > But having some specific i40e tests in examples or app/test is OK. > Probably I didn't get you right: Are you suggesting to have a new clone of testpmd for any new device we are going to support? That seems like too much hassle to me. Plus what to do if someone would like to test configuration with two different devices involved: ixgbe and i40e for example? I suggest we keep one testpmd for all devices we support. Of course we'll probably have to make some rework to avoid if (strncmp(drv_name, "xxx") spread all over it. We need to find some better way to discover/setup HW specific features. Thanks Konstantin
[dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API rx_classification_filter_ctl
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon > Sent: Thursday, August 28, 2014 11:56 AM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API > rx_classification_filter_ctl > > 2014-08-28 03:30, Wu, Jingjing: > > We want to implement several common API for NIC specific features, > > to avoid creating quite a lot of ops in 'struct eth_dev_ops'. > > The idea came from ioctl. > > The approach can be interesting. > > > Because about flow director feature, there is large gap between i40e > > and ixgbe. The existed flow director APIs looks specific to IXGBE, > > so I choose this new API to implement i40e's flow director feature. > > The API is not OK for you so you create another one. > I'm OK to change APIs but you should remove the old one, or at least, > implement your new API in existing drivers to allow deprecation of the > old API. > I think it would help if you start by doing ixgbe work and then apply it > to i40e. > > > The API is like below: > > typedef int (*eth_rx_classification_filter_ctl_t)(struct rte_eth_dev *dev, > > enum rte_eth_command cmd, > > void *arg); > > Define a head file called rte_i40e.h in lib/librte_pmd_i40e, which contains > > the definition about structures specific to i40e, linked to the arg > > parameters above. > > Define a head file called rte_eth_features.h in lib/librte_ether, which > > contains the commands' definition linked to cmd parameters above. > > Why creating a rte_eth_features.h? Don't you think rte_ethdev.h is a good > place? As I remember the long term idea was (Jingjing please correct me, if I am wrong here): Keep rte_ethdev.h for generic API. Put features specific for different NICs into rte_eth_features.h To make things clearer and avoid polluting of rte_ethdev.h. Provide API for the upper layer to query list of specific features(commands) supported by the underlying device. Something like: rte_eth_dev_get_rx_filter_supported(port, ...); And ioctl-type API to configure HW specific features: rte_eth_dev_rx_classification_filter_ctl(port, cmd, cmd_spedific_arg); So, instead of application knows in advance "which specific NIC it is using", application would query which features/commannds the NIC provides and then configure them appropriately. > > > And if user want to use i40e specific features, then the head file > > rte_i40e.h need to be included user's application, for example, test-pmd. > > And user can encode these structures and call XXX_ctl API to configure > > their features. > > OK, but the question is to know what is a specific feature? > I don't think flow director is a specific feature. We shouldn't have > to care if port is i40e or ixgbe to setup flow director. > Is it possible to have a common API and maybe an inheritance of the > common structure with PMD specific fields? > > Example: > > struct fdir_entry { > fdir_input input; > fdir_action action; > enum rte_driver driver; > }; > fdir_entry_generic fdir_entry = {.driver = RTE_DRIVER_GENERIC}; > filter_ctl(port, FDIR_RULE_ADD, fdir_entry); > > struct i40e_fdir_entry { > struct fdir_entry generic; > i40e_fdir_input i40e_input; > }; > > So i40e_input will be handled by the PMD if driver == RTE_DRIVER_I40E. > > It's just an idea, comments are welcome. > > -- > Thomas
[dpdk-dev] [PATCH v2 7/7]app/testpmd: add commands and config functions for i40e flow director support
2014-08-28 11:30, Ananyev, Konstantin: > From: Thomas Monjalon > > 2014-08-28 09:01, Wu, Jingjing: > > > I have another question: > > > If we use the way 'rx_classification_filter_ctl' works, the specific > > > structures defined in rte_i40e.h will be visible in user's application, > > > such as testpmd. > > > I know I shouldn't make commands linked with i40e like what I did before. > > > But will the i40e specific structures become visible be acceptable? > > > > I think testpmd should be limited to generic API. > > So it wouldn't be acceptable to be dependent of i40e files. > > But having some specific i40e tests in examples or app/test is OK. > > > > Probably I didn't get you right: Indeed ;) > Are you suggesting to have a new clone of testpmd for any new device > we are going to support? No. I say there shouldn't be any PMD dependency on testpmd. It means we should use only generic API. > That seems like too much hassle to me. > Plus what to do if someone would like to test configuration with two > different devices involved: ixgbe and i40e for example? ixgbe and i40e features should use the same generic API for flow director. > I suggest we keep one testpmd for all devices we support. > Of course we'll probably have to make some rework to avoid > if (strncmp(drv_name, "xxx") spread all over it. > We need to find some better way to discover/setup HW specific features. Agreed -- Thomas
[dpdk-dev] [PATCH v2 3/7]rte_ether:add API of VxLAN packet filter in librte_ether
2014-08-28 11:02, Liu, Jijiang: > From: Thomas Monjalon > > 2014-08-28 00:55, Liu, Jijiang: > > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > > > > I wonder if we could use a common function to set all kind of filters? > > > > > > > > Thoughts are welcome. > > > > > > The rte_eth_dev_tunnel_filter_set() is a common filter function for > > > tunneling packet, which can set all kind of filters. > > > > I understand that. But my question was: could we have common functions for > > tunnel filters and (existing) generic filters? > > Tunneling packet is encapsulated format, in order to extend another > tunneling type support and distinguish between tunneling and non-tunneling > packet, we had better provide independent common filter API. > The existing generic filter function is related to normal L2 packet > filter(non-tunneling packet). If there are two kind of filter APIs, one is > for normal L2 packet, other is for tunneling packet, which will make user > more clear when and how to use them. So I ask wether it is possible to merge 2 functions and you answer they are different functions. I know they are different. I don't see why you can merge all different tunnels filtering in one function and not merge them with other L2 filters type? Should we wait that someone suggest a new API for HTTP filtering (for a new shiny NIC) to think about how all filters could be configured through a common API? We already have flow director, syn filter, ethertype filter, 2-tuple filter, 5-tuple filter, flex filter and you want to add tunnel filter. Last time, filters was called "generic filter" and I asked to think about how it was generic: http://dpdk.org/ml/archives/dev/2014-May/002859.html As I suspected, same story restart. I want these new i40e features as much as you. But it's not responsible to let API becoming a mess like that. All these filters must be reworked (including flow director). -- Thomas
[dpdk-dev] [ethdev] Multiple devices with single PCI
Hello, On Thu, Aug 28, 2014 at 12:54 PM, ??? wrote: > I found that DPDK has an abstraction for having multiple devices with > single PCI. > (RTE_PCI_DRV_MULTIPLE flag) > However, their is a naming collision while registering multiple devices. > Here is my possible solution. > - Actually, I think this flag could just disappear. It had been added at a time when ethdev objects could only be created through eth_drivers. Right now, a pci driver can create ethdev objects using rte_eth_dev_allocate(thenameyouwant). So I would rather propose you convert your driver from a eth_driver to a pci_driver. It is not that hard, you only need to allocate your own private structure and ensure the mandatory fields are filled. You can look at rte_eth_dev_init() to see what is needed. - Thomas, RTE_PCI_DRV_MULTIPLE deprecation for 1.8.x ? -- David Marchand
[dpdk-dev] [ethdev] Multiple devices with single PCI
2014-08-28 14:42, David Marchand: > - Thomas, RTE_PCI_DRV_MULTIPLE deprecation for 1.8.x ? Yes, but I don't know how to deprecate a flag. It could be a simple removal. -- Thomas
[dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API rx_classification_filter_ctl
Hi, Thomas Please see my comments below. Thanks a lot. > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Thursday, August 28, 2014 6:56 PM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API > rx_classification_filter_ctl > > 2014-08-28 03:30, Wu, Jingjing: > > We want to implement several common API for NIC specific features, > > to avoid creating quite a lot of ops in 'struct eth_dev_ops'. > > The idea came from ioctl. > > The approach can be interesting. Yes, we have discussed this approach inside. And some other Fortville Features are also based on it, such as RSS hash function support, mac vlan filters. Maybe it a good chance to discuss in open forum now. > > > Because about flow director feature, there is large gap between i40e > > and ixgbe. The existed flow director APIs looks specific to IXGBE, > > so I choose this new API to implement i40e's flow director feature. > > The API is not OK for you so you create another one. > I'm OK to change APIs but you should remove the old one, or at least, > implement your new API in existing drivers to allow deprecation of the > old API. > I think it would help if you start by doing ixgbe work and then apply it > to i40e. > Yes, it will be perfect if we can use this new API to achieve flow director setting all types of NICs. But the concern is downward compatibility. Users who is planning update DPDK version need to change their code to adapt such changes. That's why we choose a new API instead of modifying current APIs. And Of course, the ideal plan is adding such XXX_ctl function in Ixgbe and Igb to moving smoothly without removing current APIs. I'm not sure whether I understanding correctly about the importance of compatibility. > > The API is like below: > > typedef int (*eth_rx_classification_filter_ctl_t)(struct rte_eth_dev *dev, > > enum rte_eth_command cmd, > > void *arg); > > Define a head file called rte_i40e.h in lib/librte_pmd_i40e, which contains > > the definition about structures specific to i40e, linked to the arg > > parameters above. > > Define a head file called rte_eth_features.h in lib/librte_ether, which > > contains the commands' definition linked to cmd parameters above. > > Why creating a rte_eth_features.h? Don't you think rte_ethdev.h is a good > place? > > > And if user want to use i40e specific features, then the head file > > rte_i40e.h need to be included user's application, for example, test-pmd. > > And user can encode these structures and call XXX_ctl API to configure > > their features. > > OK, but the question is to know what is a specific feature? > I don't think flow director is a specific feature. We shouldn't have > to care if port is i40e or ixgbe to setup flow director. > Is it possible to have a common API and maybe an inheritance of the > common structure with PMD specific fields? > Yes, flow director is not a specific feature. Even ixgbe and i40 use the same name. But the context and key have much difference. That's why I called it specific. > Example: > > struct fdir_entry { > fdir_input input; > fdir_action action; > enum rte_driver driver; > }; > fdir_entry_generic fdir_entry = {.driver = RTE_DRIVER_GENERIC}; > filter_ctl(port, FDIR_RULE_ADD, fdir_entry); > > struct i40e_fdir_entry { > struct fdir_entry generic; > i40e_fdir_input i40e_input; > }; > > So i40e_input will be handled by the PMD if driver == RTE_DRIVER_I40E. > > It's just an idea, comments are welcome. Yes, it's a good idea about an inheritance of the common structure. I think it may support new NIC integration in future. We can do it with the new API architecture. But the concern is still how to be compatible with old version. > -- > Thomas
[dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API rx_classification_filter_ctl
Hi, Konstantin Thanks a lot. > -Original Message- > From: Ananyev, Konstantin > Sent: Thursday, August 28, 2014 7:49 PM > To: Thomas Monjalon; Wu, Jingjing > Cc: dev at dpdk.org > Subject: RE: [dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API > rx_classification_filter_ctl > > > > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon > > Sent: Thursday, August 28, 2014 11:56 AM > > To: Wu, Jingjing > > Cc: dev at dpdk.org > > Subject: Re: [dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API > rx_classification_filter_ctl > > > > 2014-08-28 03:30, Wu, Jingjing: > > > We want to implement several common API for NIC specific features, > > > to avoid creating quite a lot of ops in 'struct eth_dev_ops'. > > > The idea came from ioctl. > > > > The approach can be interesting. > > > > > Because about flow director feature, there is large gap between i40e > > > and ixgbe. The existed flow director APIs looks specific to IXGBE, > > > so I choose this new API to implement i40e's flow director feature. > > > > The API is not OK for you so you create another one. > > I'm OK to change APIs but you should remove the old one, or at least, > > implement your new API in existing drivers to allow deprecation of the > > old API. > > I think it would help if you start by doing ixgbe work and then apply it > > to i40e. > > > > > The API is like below: > > > typedef int (*eth_rx_classification_filter_ctl_t)(struct rte_eth_dev *dev, > > > enum rte_eth_command cmd, > > > void *arg); > > > Define a head file called rte_i40e.h in lib/librte_pmd_i40e, which > > > contains > > > the definition about structures specific to i40e, linked to the arg > > > parameters above. > > > Define a head file called rte_eth_features.h in lib/librte_ether, which > > > contains the commands' definition linked to cmd parameters above. > > > > Why creating a rte_eth_features.h? Don't you think rte_ethdev.h is a good > > place? > > As I remember the long term idea was > (Jingjing please correct me, if I am wrong here): > > Keep rte_ethdev.h for generic API. > Put features specific for different NICs into rte_eth_features.h > To make things clearer and avoid polluting of rte_ethdev.h. > > Provide API for the upper layer to query list of specific features(commands) > supported by the > underlying device. > Something like: > rte_eth_dev_get_rx_filter_supported(port, ...); Yes, in helin's patch "[dpdk-dev] [PATCH v2 2/6] ethdev: add new ops of 'is_command_supported' and 'rx_classification_filter_ctl'", he already defined rte_eth_dev_is_command_supported. It can be used to check whether such command can be supported by the queried port. Actually, some features are based on this architecture, including helin's "Support configuring hash functions" and other non-ready patch. We supposed after any patch of ours is applied, others need to rework then. We are using the same approach. > And ioctl-type API to configure HW specific features: > rte_eth_dev_rx_classification_filter_ctl(port, cmd, cmd_spedific_arg); > > So, instead of application knows in advance "which specific NIC it is using", > application would query which features/commannds the NIC provides and then > configure > them appropriately. > > > > > > And if user want to use i40e specific features, then the head file > > > rte_i40e.h need to be included user's application, for example, test-pmd. > > > And user can encode these structures and call XXX_ctl API to configure > > > their features. > > > > OK, but the question is to know what is a specific feature? > > I don't think flow director is a specific feature. We shouldn't have > > to care if port is i40e or ixgbe to setup flow director. > > Is it possible to have a common API and maybe an inheritance of the > > common structure with PMD specific fields? > > > > Example: > > > > struct fdir_entry { > > fdir_input input; > > fdir_action action; > > enum rte_driver driver; > > }; > > fdir_entry_generic fdir_entry = {.driver = RTE_DRIVER_GENERIC}; > > filter_ctl(port, FDIR_RULE_ADD, fdir_entry); > > > > struct i40e_fdir_entry { > > struct fdir_entry generic; > > i40e_fdir_input i40e_input; > > }; > > > > So i40e_input will be handled by the PMD if driver == RTE_DRIVER_I40E. > > > > It's just an idea, comments are welcome. > > > > -- > > Thomas
[dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API rx_classification_filter_ctl
2014-08-28 13:39, Wu, Jingjing: > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > > I'm OK to change APIs but you should remove the old one, or at least, > > implement your new API in existing drivers to allow deprecation of the > > old API. > > I think it would help if you start by doing ixgbe work and then apply it > > to i40e. > > > > Yes, it will be perfect if we can use this new API to achieve flow director > setting all types of NICs. But the concern is downward compatibility. In this case, cleanup is more important than compatibility. > Users who is planning update DPDK version need to change their code > to adapt such changes. Yes, but we can keep deprecated function during 1 release. > That's why we choose a new API instead of modifying current APIs. And > Of course, the ideal plan is adding such XXX_ctl function in Ixgbe and > Igb to moving smoothly without removing current APIs. Yes > > I don't think flow director is a specific feature. We shouldn't have > > to care if port is i40e or ixgbe to setup flow director. > > Is it possible to have a common API and maybe an inheritance of the > > common structure with PMD specific fields? > > Yes, flow director is not a specific feature. Even ixgbe and i40 use the same > name. But the context and key have much difference. That's why I called it > specific. > > Yes, it's a good idea about an inheritance of the common structure. I think it > may support new NIC integration in future. We can do it with the new API > architecture. But the concern is still how to be compatible with old version. There is no compatibility blocker here. If we can keep deprecated functions a while, we'll do. Otherwise, just go with the new API. I prefer we concentrate on good design rather than on compatibility. Thanks -- Thomas
[dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API rx_classification_filter_ctl
> -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Thursday, August 28, 2014 10:21 PM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 2/7] ethdev: define new ethdev API > rx_classification_filter_ctl > > 2014-08-28 13:39, Wu, Jingjing: > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > > > I'm OK to change APIs but you should remove the old one, or at least, > > > implement your new API in existing drivers to allow deprecation of the > > > old API. > > > I think it would help if you start by doing ixgbe work and then apply it > > > to i40e. > > > > > > > Yes, it will be perfect if we can use this new API to achieve flow director > > setting all types of NICs. But the concern is downward compatibility. > > In this case, cleanup is more important than compatibility. > > > Users who is planning update DPDK version need to change their code > > to adapt such changes. > > Yes, but we can keep deprecated function during 1 release. > > > That's why we choose a new API instead of modifying current APIs. And > > Of course, the ideal plan is adding such XXX_ctl function in Ixgbe and > > Igb to moving smoothly without removing current APIs. > > Yes > > > > I don't think flow director is a specific feature. We shouldn't have > > > to care if port is i40e or ixgbe to setup flow director. > > > Is it possible to have a common API and maybe an inheritance of the > > > common structure with PMD specific fields? > > > > Yes, flow director is not a specific feature. Even ixgbe and i40 use the > > same > > name. But the context and key have much difference. That's why I called it > > specific. > > > > Yes, it's a good idea about an inheritance of the common structure. I think > > it > > may support new NIC integration in future. We can do it with the new API > > architecture. But the concern is still how to be compatible with old > > version. > > There is no compatibility blocker here. > If we can keep deprecated functions a while, we'll do. Otherwise, just go with > the new API. > I prefer we concentrate on good design rather than on compatibility. > OK, now I have a rough understanding about your opinion, I guess there will be lots of rework need to be done. I will try. Thanks for your explanation. > Thanks > -- > Thomas Thanks Jingjing
[dpdk-dev] [PATCH v2 5/7] fix the Marco conflict
> -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Thursday, August 28, 2014 4:56 PM > To: Wu, Jingjing > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 5/7] fix the Marco conflict > > 2014-08-28 03:39, Wu, Jingjing: > > Because these macros such as IPPROTO_TCP, IPPROTO_UDP are already > > defined in . If user's application include > > and rte_ip.h at the same time, there will be conflict error, for > > example cmdline.c in testpmd. > > Yes > > > I remember there was someone also raised this issue few month ago. > > Yes, and the question was: "should we totally remove these definitions"? > I think yes. > Yes, it will be clear. But it also provides a way to user who doesn't use netinet/in.h. > > So just use the way "#ifndef #endif" to avoid the conflict. > > But you didn't explain difference between _NETINET_IN_H and _NETINET_IN_H_. It is due to the different versions of in.h, some use _NETINET_IN_H_ to define the head file, while some use _NETINET_IN_H > > > And it is exactly workaround as you said. > > Yes, it's a workaround. > If rte_ip.h is included before netinet/in.h, I think there is still a problem. Yes, it's just workaround. As my understanding, in DPDK's source code System head files includes first should be as following. So I think it's OK then. > > -- > Thomas Can I send a separate patch for this? Because it has no strict relationship with flow director. Thanks Jingjing
[dpdk-dev] [PATCH v2 5/7] fix the Marco conflict
2014-08-28 14:37, Wu, Jingjing: > Can I send a separate patch for this? > Because it has no strict relationship with flow director. Yes, please. But I think you should look at the removal option (no redefinition of at all). Thanks -- Thomas
[dpdk-dev] [PATCH] ixgbe: Make vector stores unaligned
> When writing to the mbuf array for receiving packets, do not assume > 16-byte alignment by using aligned stores. If the pointers are only > 8-byte aligned, the program will crash due to incorrect alignment. > Changing "store" to "storeu" fixes this. > > Signed-off-by: Bruce Richardson Acked-by: Thomas Monjalon Applied for version 1.7.1. Thanks -- Thomas
[dpdk-dev] [PATCH] fix unix permissions for source files
> > No need for that 'x bit' on source files. > > > > Signed-off-by: David Marchand > > Acked-by: Bruce Richardson Applied for version 1.7.1. Thanks -- Thomas
[dpdk-dev] [PATCH v2 0/6] Mbuf structure Rework, part 1
This patch set does some initial pre-work to prepare the mbuf data structure (and ixgbe vector driver to a lesser extent) for more major changes which will follow on in a subsequent patch set. [See previous RFC patch set for more indications of the future coming changes]. The main changes here are the flattening out of the mbuf data structure, with much of it based off work by Olivier. The ctrlmbuf and pktmbuf structures are now gone, as is the vlan_macip structure. However, in this set, the concept of having a separate ctrl mbuf type is kept around. The plan is in a later set when we expand the flags field to 64-bits, we can use a single bit in the flags to indicate a control packet. For now, though, the ctrlmbuf functions and macros just are aliases for the pktmbuf equivalents as much as possible. Changes in V2: * Fix newly-introduced style issues flagged by checkpatch * Update to apply cleanly to latest head Bruce Richardson (3): ixgbe: put only non-zero initializer in definition mbuf: rename in_port to just port mbuf: flatten struct vlan_macip into mbuf struct Olivier Matz (3): mbuf: rename RTE_MBUF_SCATTER_GATHER into RTE_MBUF_REFCNT mbuf: remove rte_ctrlmbuf mbuf: remove the rte_pktmbuf structure app/test-pmd/cmdline.c | 2 - app/test-pmd/csumonly.c| 6 +- app/test-pmd/flowgen.c | 22 +- app/test-pmd/icmpecho.c| 4 +- app/test-pmd/ieee1588fwd.c | 6 +- app/test-pmd/macfwd-retry.c| 2 +- app/test-pmd/macfwd.c | 8 +- app/test-pmd/macswap.c | 8 +- app/test-pmd/rxonly.c | 13 +- app/test-pmd/testpmd.c | 12 +- app/test-pmd/testpmd.h | 2 +- app/test-pmd/txonly.c | 44 +-- app/test/commands.c| 2 - app/test/packet_burst_generator.c | 46 +-- app/test/test_distributor.c| 18 +- app/test/test_distributor_perf.c | 4 +- app/test/test_mbuf.c | 100 ++- app/test/test_sched.c | 4 +- app/test/test_table_acl.c | 6 +- app/test/test_table_pipeline.c | 4 +- config/common_bsdapp | 2 +- config/common_linuxapp | 2 +- doc/doxy-api.conf | 2 +- examples/Makefile | 4 +- examples/dpdk_qat/crypto.c | 22 +- examples/dpdk_qat/main.c | 2 +- examples/exception_path/main.c | 10 +- examples/ip_fragmentation/Makefile | 4 +- examples/ip_fragmentation/main.c | 6 +- examples/ip_pipeline/Makefile | 2 +- examples/ip_pipeline/cmdline.c | 44 +-- examples/ip_pipeline/init.c| 2 +- examples/ip_pipeline/main.c| 2 +- examples/ip_pipeline/pipeline_firewall.c | 4 +- .../ip_pipeline/pipeline_flow_classification.c | 4 +- examples/ip_pipeline/pipeline_routing.c| 4 +- examples/ip_pipeline/pipeline_rx.c | 8 +- examples/ip_pipeline/pipeline_tx.c | 2 +- examples/ip_reassembly/main.c | 8 +- examples/ipv4_multicast/Makefile | 4 +- examples/ipv4_multicast/main.c | 18 +- examples/l3fwd-acl/main.c | 2 +- examples/l3fwd-power/main.c| 2 +- examples/l3fwd-vf/main.c | 2 +- examples/l3fwd/main.c | 10 +- examples/load_balancer/runtime.c | 2 +- .../client_server_mp/mp_client/client.c| 2 +- examples/quota_watermark/qw/main.c | 4 +- examples/vhost/main.c | 74 ++--- examples/vhost_xen/main.c | 22 +- lib/librte_distributor/rte_distributor.c | 2 +- lib/librte_ip_frag/ip_frag_common.h| 13 +- lib/librte_ip_frag/rte_ipv4_fragmentation.c| 40 +-- lib/librte_ip_frag/rte_ipv4_reassembly.c | 6 +- lib/librte_ip_frag/rte_ipv6_fragmentation.c| 38 +-- lib/librte_ip_frag/rte_ipv6_reassembly.c | 5 +- lib/librte_mbuf/rte_mbuf.c | 84 ++ lib/librte_mbuf/rte_mbuf.h | 311 + lib/librte_pmd_bond/rte_eth_bond_pmd.c | 4 +- lib/librte_pmd_e1000/em_rxtx.c | 98 --- lib/librte_pmd_e1000/igb_rxtx.c| 103 ++
[dpdk-dev] [PATCH v2 3/6] mbuf: remove rte_ctrlmbuf
From: Olivier Matz The initial role of rte_ctrlmbuf is to carry generic messages (data pointer + data length) but it's not used by the DPDK or it applications. Keeping it implies: - loosing 1 byte in the rte_mbuf structure - having some dead code rte_mbuf.[ch] This patch removes this feature. Thanks to it, it is now possible to simplify the rte_mbuf structure by merging the rte_pktmbuf structure in it. This is done in next commit. Signed-off-by: Olivier Matz * Updated patch to HEAD. * Modified patch to retain the old function names for ctrl mbufs as macros. This helps with app compatibility, and allows the concept of a control mbuf to be reintroduced via a single-bit flag in a future change. * Updated the packet framework ip_pipeline example application to work following this change. Changes in v2: * Fixed whitespace errors introduced by this patch flagged by checkpatch Signed-off-by: Bruce Richardson --- app/test-pmd/cmdline.c | 1 - app/test-pmd/flowgen.c | 2 +- app/test-pmd/testpmd.c | 2 - app/test-pmd/txonly.c | 2 +- app/test/commands.c| 1 - app/test/test_mbuf.c | 72 + examples/ip_pipeline/cmdline.c | 44 examples/ip_pipeline/init.c| 2 +- examples/ip_pipeline/pipeline_firewall.c | 4 +- .../ip_pipeline/pipeline_flow_classification.c | 4 +- examples/ip_pipeline/pipeline_routing.c| 4 +- examples/ip_pipeline/pipeline_rx.c | 4 +- examples/ipv4_multicast/main.c | 2 +- lib/librte_mbuf/rte_mbuf.c | 60 +++ lib/librte_mbuf/rte_mbuf.h | 119 +++-- lib/librte_pmd_e1000/em_rxtx.c | 2 +- lib/librte_pmd_e1000/igb_rxtx.c| 2 +- lib/librte_pmd_i40e/i40e_rxtx.c| 4 +- lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 4 +- lib/librte_pmd_virtio/virtio_rxtx.c| 2 +- lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c | 2 +- lib/librte_pmd_xenvirt/rte_eth_xenvirt.c | 2 +- 22 files changed, 97 insertions(+), 244 deletions(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 345be11..6dd576a 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -6377,7 +6377,6 @@ dump_struct_sizes(void) #define DUMP_SIZE(t) printf("sizeof(" #t ") = %u\n", (unsigned)sizeof(t)); DUMP_SIZE(struct rte_mbuf); DUMP_SIZE(struct rte_pktmbuf); - DUMP_SIZE(struct rte_ctrlmbuf); DUMP_SIZE(struct rte_mempool); DUMP_SIZE(struct rte_ring); #undef DUMP_SIZE diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c index 04911ff..a8f2a65 100644 --- a/app/test-pmd/flowgen.c +++ b/app/test-pmd/flowgen.c @@ -96,7 +96,7 @@ tx_mbuf_alloc(struct rte_mempool *mp) struct rte_mbuf *m; m = __rte_mbuf_raw_alloc(mp); - __rte_mbuf_sanity_check_raw(m, RTE_MBUF_PKT, 0); + __rte_mbuf_sanity_check_raw(m, 0); return (m); } diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index e8a4b45..5368d01 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -398,13 +398,11 @@ testpmd_mbuf_ctor(struct rte_mempool *mp, mb_ctor_arg = (struct mbuf_ctor_arg *) opaque_arg; mb = (struct rte_mbuf *) raw_mbuf; - mb->type = RTE_MBUF_PKT; mb->pool = mp; mb->buf_addr = (void *) ((char *)mb + mb_ctor_arg->seg_buf_offset); mb->buf_physaddr = (uint64_t) (rte_mempool_virt2phy(mp, mb) + mb_ctor_arg->seg_buf_offset); mb->buf_len = mb_ctor_arg->seg_buf_size; - mb->type = RTE_MBUF_PKT; mb->ol_flags = 0; mb->pkt.data = (char *) mb->buf_addr + RTE_PKTMBUF_HEADROOM; mb->pkt.nb_segs = 1; diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c index ef93741..d634096 100644 --- a/app/test-pmd/txonly.c +++ b/app/test-pmd/txonly.c @@ -93,7 +93,7 @@ tx_mbuf_alloc(struct rte_mempool *mp) struct rte_mbuf *m; m = __rte_mbuf_raw_alloc(mp); - __rte_mbuf_sanity_check_raw(m, RTE_MBUF_PKT, 0); + __rte_mbuf_sanity_check_raw(m, 0); return (m); } diff --git a/app/test/commands.c b/app/test/commands.c index 5f23420..f38d419 100644 --- a/app/test/commands.c +++ b/app/test/commands.c @@ -276,7 +276,6 @@ dump_struct_sizes(void) #define DUMP_SIZE(t) printf("sizeof(" #t ") = %u\n", (unsigned)sizeof(t)); DUMP_SIZE(struct rte_mbuf); DUMP_SIZE(struct rte_pktmbuf); - DUMP_SIZE(struct rte_ctrlmbuf); DUMP_SIZE(struct rte_mempool); DUMP_SIZE(struct rte_ring); #undef DUMP_SIZE diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c index 678c3
[dpdk-dev] [PATCH v2 6/6] mbuf: flatten struct vlan_macip into mbuf struct
The vlan_macip structure combined a vlan tag id with l2 and l3 headers lengths for tracking offloads. However, this structure was only used as a unit by the e1000 and ixgbe drivers, not generally. This patch removes the structure from the mbuf header and places the fields into the mbuf structure directly at the required point, without any net effect on the structure layout. This allows us to treat the vlan tags and header length fields as separate for future mbuf changes. The drivers which were written to use the combined structure still do so, using a driver-local definition of it. Changes in V2: * None Signed-off-by: Bruce Richardson --- app/test-pmd/csumonly.c | 4 +-- app/test-pmd/flowgen.c | 14 +- app/test-pmd/macfwd.c | 6 ++--- app/test-pmd/macswap.c | 6 ++--- app/test-pmd/rxonly.c | 3 +-- app/test-pmd/testpmd.c | 4 ++- app/test-pmd/txonly.c | 6 ++--- app/test/packet_burst_generator.c | 10 examples/ip_fragmentation/main.c| 2 +- examples/ip_pipeline/pipeline_rx.c | 4 +-- examples/ip_pipeline/pipeline_tx.c | 2 +- examples/ip_reassembly/main.c | 8 +++--- examples/ipv4_multicast/main.c | 4 ++- examples/vhost/main.c | 6 ++--- lib/librte_ip_frag/ip_frag_common.h | 3 +-- lib/librte_ip_frag/rte_ipv4_fragmentation.c | 2 +- lib/librte_ip_frag/rte_ipv4_reassembly.c| 6 ++--- lib/librte_ip_frag/rte_ipv6_reassembly.c| 5 ++-- lib/librte_mbuf/rte_mbuf.h | 33 +++- lib/librte_pmd_e1000/em_rxtx.c | 40 ++--- lib/librte_pmd_e1000/igb_rxtx.c | 39 +--- lib/librte_pmd_i40e/i40e_rxtx.c | 14 +- lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 18 ++--- lib/librte_pmd_ixgbe/ixgbe_rxtx.h | 23 - lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c | 6 ++--- 25 files changed, 159 insertions(+), 109 deletions(-) diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c index 655b6d8..28b66f5 100644 --- a/app/test-pmd/csumonly.c +++ b/app/test-pmd/csumonly.c @@ -432,8 +432,8 @@ pkt_burst_checksum_forward(struct fwd_stream *fs) } /* Combine the packet header write. VLAN is not consider here */ - mb->vlan_macip.f.l2_len = l2_len; - mb->vlan_macip.f.l3_len = l3_len; + mb->l2_len = l2_len; + mb->l3_len = l3_len; mb->ol_flags = ol_flags; } nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_rx); diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c index 17dbf83..b091b6d 100644 --- a/app/test-pmd/flowgen.c +++ b/app/test-pmd/flowgen.c @@ -205,13 +205,13 @@ pkt_burst_flow_gen(struct fwd_stream *fs) udp_hdr->dgram_len = RTE_CPU_TO_BE_16(pkt_size - sizeof(*eth_hdr) - sizeof(*ip_hdr)); - pkt->nb_segs= 1; - pkt->pkt_len= pkt_size; - pkt->ol_flags = ol_flags; - pkt->vlan_macip.f.vlan_tci = vlan_tci; - pkt->vlan_macip.f.l2_len= sizeof(struct ether_hdr); - pkt->vlan_macip.f.l3_len= sizeof(struct ipv4_hdr); - pkts_burst[nb_pkt] = pkt; + pkt->nb_segs= 1; + pkt->pkt_len= pkt_size; + pkt->ol_flags = ol_flags; + pkt->vlan_tci = vlan_tci; + pkt->l2_len = sizeof(struct ether_hdr); + pkt->l3_len = sizeof(struct ipv4_hdr); + pkts_burst[nb_pkt] = pkt; next_flow = (next_flow + 1) % cfg_n_flows; } diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c index 999c8e3..4b905cd 100644 --- a/app/test-pmd/macfwd.c +++ b/app/test-pmd/macfwd.c @@ -116,9 +116,9 @@ pkt_burst_mac_forward(struct fwd_stream *fs) ether_addr_copy(&ports[fs->tx_port].eth_addr, ð_hdr->s_addr); mb->ol_flags = txp->tx_ol_flags; - mb->vlan_macip.f.l2_len = sizeof(struct ether_hdr); - mb->vlan_macip.f.l3_len = sizeof(struct ipv4_hdr); - mb->vlan_macip.f.vlan_tci = txp->tx_vlan_id; + mb->l2_len = sizeof(struct ether_hdr); + mb->l3_len = sizeof(struct ipv4_hdr); + mb->vlan_tci = txp->tx_vlan_id; } nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_rx); fs->tx_packets += nb_tx; diff --git
[dpdk-dev] [PATCH v2 4/6] mbuf: remove the rte_pktmbuf structure
From: Olivier Matz The rte_pktmbuf structure was initially included in the rte_mbuf structure. This was needed when there was 2 types of mbuf (ctrl and packet). As the control mbuf has been removed, we can merge the rte_pktmbuf into the rte_mbuf structure. Advantages of doing this: - the access to mbuf fields is easier (ex: m->data instead of m->pkt.data) - make the structure more consistent: for instance, there was no reason to have the ol_flags field in rte_mbuf - it will allow a deeper reorganization of the rte_mbuf structure in the next commits, allowing to gain several bytes in it Signed-off-by: Olivier Matz Updated to work with latest code, and to include new example apps. Changes in V2: * Further updates to apply to latest HEAD on master Signed-off-by: Bruce Richardson --- app/test-pmd/cmdline.c | 1 - app/test-pmd/csumonly.c| 6 +- app/test-pmd/flowgen.c | 16 +-- app/test-pmd/icmpecho.c| 4 +- app/test-pmd/ieee1588fwd.c | 6 +- app/test-pmd/macfwd-retry.c| 2 +- app/test-pmd/macfwd.c | 8 +- app/test-pmd/macswap.c | 8 +- app/test-pmd/rxonly.c | 12 +- app/test-pmd/testpmd.c | 8 +- app/test-pmd/testpmd.h | 2 +- app/test-pmd/txonly.c | 42 +++--- app/test/commands.c| 1 - app/test/packet_burst_generator.c | 46 +++ app/test/test_distributor.c| 18 +-- app/test/test_distributor_perf.c | 4 +- app/test/test_mbuf.c | 12 +- app/test/test_sched.c | 4 +- app/test/test_table_acl.c | 6 +- app/test/test_table_pipeline.c | 4 +- examples/dpdk_qat/crypto.c | 22 ++-- examples/dpdk_qat/main.c | 2 +- examples/exception_path/main.c | 10 +- examples/ip_fragmentation/main.c | 6 +- examples/ip_pipeline/pipeline_rx.c | 4 +- examples/ip_pipeline/pipeline_tx.c | 2 +- examples/ip_reassembly/main.c | 8 +- examples/ipv4_multicast/main.c | 14 +- examples/l3fwd-acl/main.c | 2 +- examples/l3fwd-power/main.c| 2 +- examples/l3fwd-vf/main.c | 2 +- examples/l3fwd/main.c | 10 +- examples/load_balancer/runtime.c | 2 +- .../client_server_mp/mp_client/client.c| 2 +- examples/quota_watermark/qw/main.c | 4 +- examples/vhost/main.c | 100 +++ examples/vhost_xen/main.c | 22 ++-- lib/librte_distributor/rte_distributor.c | 2 +- lib/librte_ip_frag/ip_frag_common.h| 14 +- lib/librte_ip_frag/rte_ipv4_fragmentation.c| 40 +++--- lib/librte_ip_frag/rte_ipv4_reassembly.c | 6 +- lib/librte_ip_frag/rte_ipv6_fragmentation.c| 38 +++--- lib/librte_ip_frag/rte_ipv6_reassembly.c | 4 +- lib/librte_mbuf/rte_mbuf.c | 26 ++-- lib/librte_mbuf/rte_mbuf.h | 142 ++--- lib/librte_pmd_bond/rte_eth_bond_pmd.c | 4 +- lib/librte_pmd_e1000/em_rxtx.c | 64 +- lib/librte_pmd_e1000/igb_rxtx.c| 68 +- lib/librte_pmd_i40e/i40e_rxtx.c| 98 +++--- lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 100 +++ lib/librte_pmd_ixgbe/ixgbe_rxtx.h | 2 +- lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 20 ++- lib/librte_pmd_pcap/rte_eth_pcap.c | 14 +- lib/librte_pmd_virtio/virtio_rxtx.c| 58 - lib/librte_pmd_virtio/virtqueue.h | 2 +- lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c | 26 ++-- lib/librte_pmd_xenvirt/rte_eth_xenvirt.c | 12 +- lib/librte_pmd_xenvirt/virtqueue.h | 4 +- lib/librte_port/rte_port_frag.c| 2 +- lib/librte_sched/rte_sched.c | 14 +- lib/librte_sched/rte_sched.h | 10 +- 61 files changed, 594 insertions(+), 600 deletions(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 6de38e6..67321f7 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -6475,7 +6475,6 @@ dump_struct_sizes(void) { #define DUMP_SIZE(t) printf("sizeof(" #t ") = %u\n", (unsigned)sizeof(t)); DUMP_SIZE(struct rte
[dpdk-dev] [PATCH v2 2/6] mbuf: rename RTE_MBUF_SCATTER_GATHER into RTE_MBUF_REFCNT
From: Olivier Matz It seems that RTE_MBUF_SCATTER_GATHER is not the proper name for the feature it provides. "Scatter gather" means that data is stored using several buffers. RTE_MBUF_REFCNT seems to be a better name for that feature as it provides a reference counter for mbufs. The macro RTE_MBUF_SCATTER_GATHER is poisoned to ensure this modification is seen by drivers or applications using it. Changes in V2: * None Signed-off-by: Olivier Matz Signed-off-by: Bruce Richardson --- app/test/test_mbuf.c | 16 config/common_bsdapp | 2 +- config/common_linuxapp| 2 +- doc/doxy-api.conf | 2 +- examples/Makefile | 4 ++-- examples/ip_fragmentation/Makefile| 4 ++-- examples/ip_pipeline/Makefile | 2 +- examples/ip_pipeline/main.c | 2 +- examples/ipv4_multicast/Makefile | 4 ++-- examples/vhost/main.c | 4 ++-- lib/librte_mbuf/rte_mbuf.c| 2 +- lib/librte_mbuf/rte_mbuf.h| 35 +++ lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 4 ++-- lib/librte_port/Makefile | 4 ++-- 14 files changed, 45 insertions(+), 42 deletions(-) diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c index 2b87521..678c382 100644 --- a/app/test/test_mbuf.c +++ b/app/test/test_mbuf.c @@ -82,7 +82,7 @@ static struct rte_mempool *pktmbuf_pool = NULL; static struct rte_mempool *ctrlmbuf_pool = NULL; -#if defined RTE_MBUF_SCATTER_GATHER && defined RTE_MBUF_REFCNT_ATOMIC +#if defined RTE_MBUF_REFCNT && defined RTE_MBUF_REFCNT_ATOMIC static struct rte_mempool *refcnt_pool = NULL; static struct rte_ring *refcnt_mbuf_ring = NULL; @@ -365,7 +365,7 @@ fail: static int testclone_testupdate_testdetach(void) { -#ifndef RTE_MBUF_SCATTER_GATHER +#ifndef RTE_MBUF_REFCNT return 0; #else struct rte_mbuf *mc = NULL; @@ -406,7 +406,7 @@ fail: if (mc) rte_pktmbuf_free(mc); return -1; -#endif /* RTE_MBUF_SCATTER_GATHER */ +#endif /* RTE_MBUF_REFCNT */ } #undef GOTO_FAIL @@ -439,7 +439,7 @@ test_pktmbuf_pool(void) printf("Error pool not empty"); ret = -1; } -#ifdef RTE_MBUF_SCATTER_GATHER +#ifdef RTE_MBUF_REFCNT extra = rte_pktmbuf_clone(m[0], pktmbuf_pool); if(extra != NULL) { printf("Error pool not empty"); @@ -548,11 +548,11 @@ test_pktmbuf_free_segment(void) /* * Stress test for rte_mbuf atomic refcnt. * Implies that: - * RTE_MBUF_SCATTER_GATHER and RTE_MBUF_REFCNT_ATOMIC are both defined. + * RTE_MBUF_REFCNT and RTE_MBUF_REFCNT_ATOMIC are both defined. * For more efficency, recomended to run with RTE_LIBRTE_MBUF_DEBUG defined. */ -#if defined RTE_MBUF_SCATTER_GATHER && defined RTE_MBUF_REFCNT_ATOMIC +#if defined RTE_MBUF_REFCNT && defined RTE_MBUF_REFCNT_ATOMIC static int test_refcnt_slave(__attribute__((unused)) void *arg) @@ -657,7 +657,7 @@ test_refcnt_master(void) static int test_refcnt_mbuf(void) { -#if defined RTE_MBUF_SCATTER_GATHER && defined RTE_MBUF_REFCNT_ATOMIC +#if defined RTE_MBUF_REFCNT && defined RTE_MBUF_REFCNT_ATOMIC unsigned lnum, master, slave, tref; @@ -808,7 +808,7 @@ test_failing_mbuf_sanity_check(void) return -1; } -#ifdef RTE_MBUF_SCATTER_GATHER +#ifdef RTE_MBUF_REFCNT badbuf = *buf; badbuf.refcnt = 0; if (verify_mbuf_check_panics(&badbuf)) { diff --git a/config/common_bsdapp b/config/common_bsdapp index bf6d8a0..645949f 100644 --- a/config/common_bsdapp +++ b/config/common_bsdapp @@ -249,7 +249,7 @@ CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=n # CONFIG_RTE_LIBRTE_MBUF=y CONFIG_RTE_LIBRTE_MBUF_DEBUG=n -CONFIG_RTE_MBUF_SCATTER_GATHER=y +CONFIG_RTE_MBUF_REFCNT=y CONFIG_RTE_MBUF_REFCNT_ATOMIC=y CONFIG_RTE_PKTMBUF_HEADROOM=128 diff --git a/config/common_linuxapp b/config/common_linuxapp index 9047975..5bee910 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -277,7 +277,7 @@ CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=n # CONFIG_RTE_LIBRTE_MBUF=y CONFIG_RTE_LIBRTE_MBUF_DEBUG=n -CONFIG_RTE_MBUF_SCATTER_GATHER=y +CONFIG_RTE_MBUF_REFCNT=y CONFIG_RTE_MBUF_REFCNT_ATOMIC=y CONFIG_RTE_PKTMBUF_HEADROOM=128 diff --git a/doc/doxy-api.conf b/doc/doxy-api.conf index 1fd4492..63dc64d 100644 --- a/doc/doxy-api.conf +++ b/doc/doxy-api.conf @@ -56,7 +56,7 @@ FILE_PATTERNS = rte_*.h \ cmdline.h PREDEFINED = __DOXYGEN__ \ __attribute__(x)= \ - RTE_MBUF_SCATTER_GATHER + RTE_MBUF_REFCNT OPTIMIZE_OUTPUT_FOR_C = YES ENABLE_PREPROCESSING= YES diff --git a/examples/Makefile b/examples/Makefile index dc85cf3..6245f83 100644 --- a/examples/Makefile +++ b/examples/Makefile @@ -45,8 +45,8 @@ DIRS-y += exception_path DIRS-y += helloworld DIRS-y += ip_pipeline DIRS-y += ip_reassemb
[dpdk-dev] [PATCHv3] librte_acl make it build/work for 'default' target
On Wed, Aug 27, 2014 at 07:18:44PM +, Ananyev, Konstantin wrote: > > > > -Original Message- > > From: Neil Horman [mailto:nhorman at tuxdriver.com] > > Sent: Wednesday, August 27, 2014 7:57 PM > > To: Ananyev, Konstantin > > Cc: dev at dpdk.org; thomas.monjalon at 6wind.com > > Subject: Re: [PATCHv3] librte_acl make it build/work for 'default' target > > > > On Wed, Aug 27, 2014 at 11:25:04AM +, Ananyev, Konstantin wrote: > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com] > > > > Sent: Tuesday, August 26, 2014 6:45 PM > > > > To: Ananyev, Konstantin > > > > Cc: dev at dpdk.org; thomas.monjalon at 6wind.com > > > > Subject: Re: [PATCHv3] librte_acl make it build/work for 'default' > > > > target > > > > > > > > On Mon, Aug 25, 2014 at 04:30:05PM +, Ananyev, Konstantin wrote: > > > > > Hi Neil, > > > > > > > > > > > -Original Message- > > > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com] > > > > > > Sent: Thursday, August 21, 2014 9:15 PM > > > > > > To: dev at dpdk.org > > > > > > Cc: Ananyev, Konstantin; thomas.monjalon at 6wind.com; Neil Horman > > > > > > Subject: [PATCHv3] librte_acl make it build/work for 'default' > > > > > > target > > > > > > > > > > > > Make ACL library to build/work on 'default' architecture: > > > > > > - make rte_acl_classify_scalar really scalar > > > > > > (make sure it wouldn't use sse4 instrincts through > > > > > > resolve_priority()). > > > > > > - Provide two versions of rte_acl_classify code path: > > > > > > rte_acl_classify_sse() - could be build and used only on systems > > > > > > with sse4.2 > > > > > > and upper, return -ENOTSUP on lower arch. > > > > > > rte_acl_classify_scalar() - a slower version, but could be build > > > > > > and used > > > > > > on all systems. > > > > > > - keep common code shared between these two codepaths. > > > > > > > > > > > > v2 chages: > > > > > > run-time selection of most appropriate code-path for given ISA. > > > > > > By default the highest supprted one is selected. > > > > > > User can still override that selection by manually assigning new > > > > > > value to > > > > > > the global function pointer rte_acl_default_classify. > > > > > > rte_acl_classify() becomes a macro calling whatever > > > > > > rte_acl_default_classify > > > > > > points to. > > > > > > > > > > > > > > > > I see you decided not to wait for me and fix everything by yourself :) > > > > > > > > > Yeah, sorry, I'm getting pinged about enabling these features in > > > > Fedora, and it > > > > had been about 2 weeks, so I figured I'd just take care of it. > > > > > > No worries. I admit that it was a long delay from my side. > > > > > > > > > > > > > V3 Changes > > > > > > Updated classify pointer to be a function so as to better preserve > > > > > > ABI > > > > > > > > > > As I said in my previous mail it generates extra jump... > > > > > Though from numbers I got the performance impact is negligible: < 1%. > > > > > So I suppose, I don't have a good enough reason to object :) > > > > > > > > > Yeah, I just don't see a way around it. I was hoping that the compiler > > > > would > > > > have been smart enough to see that the rte_acl_classify function was > > > > small and > > > > in-linable, but apparently it won't do that. As you note however the > > > > performance change is minor (I'm guessing within a standard deviation > > > > of your > > > > results). > > > > > > > > > Though I still think we better keep rte_acl_classify_scalar() > > > > > publically available (same as we do for rte acl_classify_sse()): > > > > > First of all keep rte_acl_classify_scalar() is already part of our > > > > > public API. > > > > > Also, as I remember, one of the customers explicitly asked for scalar > > > > > version and they planned to call it directly. > > > > > Plus using rte_acl_select_classify() to always switch between > > > > > implementations is not always handy: > > > > > > > > I'm not exactly opposed to this, though it seems odd to me that a user > > > > might > > > > want to call a particular version of the classifier directly. But I > > > > certainly > > > > can't predict everything a consumer wants to do. If we really need to > > > > keep it > > > > public then, it begs the question, is providing a generic entry point > > > > even > > > > worthwhile? Is it just as easy to expose the scalar/sse and any future > > > > versions > > > > directly so the application can just embody the intellegence to select > > > > the best > > > > path? That saves us having to maintain another API point. I can go > > > > with > > > > consensus on that. > > > > > > > > > - it is global, which means that we can't simultaneously use > > > > > classify_scalar() and classify_sse() for 2 different ACL contexts. > > > > > - to properly support such switching we then will need to support > > > > > something like (see app/test/test_acl.c below): > > > > > old_alg = rte_acl_get_classify(); > >
[dpdk-dev] [PATCH v2 1/6] ixgbe: put only non-zero initializer in definition
Since all unspecified fields in an initializer are assumed to be zero we can simplify the empty mbuf definition in the vector driver to only use the fields that are non-zero, i.e. just nb_segs = 1. This makes things shorter and means that the structure doesn't need as many updates for other fields being renamed or moved. The variable itself is never modified and only used by a single function so it can be made const and local to the using function. Changes in v2: * None Signed-off-by: Bruce Richardson --- lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 29 + 1 file changed, 5 insertions(+), 24 deletions(-) diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c index 92e07de..56dce23 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c @@ -44,33 +44,14 @@ #pragma GCC diagnostic ignored "-Wcast-qual" #endif -static struct rte_mbuf mb_def = { - - .ol_flags = 0, - { - .pkt = { - .data_len = 0, - .pkt_len = 0, - - .vlan_macip = { - .data = 0, - }, - .hash = { - .rss = 0, - }, - - .nb_segs = 1, - .in_port = 0, - - .next = NULL, - .data = NULL, - }, - }, -}; - static inline void ixgbe_rxq_rearm(struct igb_rx_queue *rxq) { + static const struct rte_mbuf mb_def = { + .pkt = { + .nb_segs = 1, + }, + }; int i; uint16_t rx_id; volatile union ixgbe_adv_rx_desc *rxdp; -- 1.9.3
[dpdk-dev] [PATCH v2 5/6] mbuf: rename in_port to just port
In some cases we may want to tag a packet for a particular destination or output port, so rename the "in_port" field in the mbuf to just "port" so that it can be re-used for this purpose if an application needs it. Changes in V2: * Further updates to apply to latest HEAD on master Signed-off-by: Bruce Richardson --- examples/dpdk_qat/main.c | 2 +- examples/ipv4_multicast/main.c | 2 +- examples/load_balancer/runtime.c | 2 +- examples/multi_process/client_server_mp/mp_client/client.c | 2 +- lib/librte_mbuf/rte_mbuf.c | 4 ++-- lib/librte_mbuf/rte_mbuf.h | 6 +++--- lib/librte_pmd_e1000/em_rxtx.c | 4 ++-- lib/librte_pmd_e1000/igb_rxtx.c| 4 ++-- lib/librte_pmd_i40e/i40e_rxtx.c| 8 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 8 lib/librte_pmd_virtio/virtio_rxtx.c| 4 ++-- lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c | 4 ++-- lib/librte_pmd_xenvirt/rte_eth_xenvirt.c | 2 +- 13 files changed, 26 insertions(+), 26 deletions(-) diff --git a/examples/dpdk_qat/main.c b/examples/dpdk_qat/main.c index 75c9876..1599a0a 100644 --- a/examples/dpdk_qat/main.c +++ b/examples/dpdk_qat/main.c @@ -384,7 +384,7 @@ main_loop(__attribute__((unused)) void *dummy) } } - port = dst_ports[pkt->in_port]; + port = dst_ports[pkt->port]; /* Transmit the packet */ nic_tx_send_packet(pkt, (uint8_t)port); diff --git a/examples/ipv4_multicast/main.c b/examples/ipv4_multicast/main.c index cc12d9d..2232851 100644 --- a/examples/ipv4_multicast/main.c +++ b/examples/ipv4_multicast/main.c @@ -337,7 +337,7 @@ mcast_out_pkt(struct rte_mbuf *pkt, int use_clone) hdr->nb_segs = (uint8_t)(pkt->nb_segs + 1); /* copy metadata from source packet*/ - hdr->in_port = pkt->in_port; + hdr->port = pkt->port; hdr->vlan_macip = pkt->vlan_macip; hdr->hash = pkt->hash; diff --git a/examples/load_balancer/runtime.c b/examples/load_balancer/runtime.c index b69917b..a53f33f 100644 --- a/examples/load_balancer/runtime.c +++ b/examples/load_balancer/runtime.c @@ -540,7 +540,7 @@ app_lcore_worker( ipv4_dst = rte_be_to_cpu_32(ipv4_hdr->dst_addr); if (unlikely(rte_lpm_lookup(lp->lpm_table, ipv4_dst, &port) != 0)) { - port = pkt->in_port; + port = pkt->port; } pos = lp->mbuf_out[port].n_mbufs; diff --git a/examples/multi_process/client_server_mp/mp_client/client.c b/examples/multi_process/client_server_mp/mp_client/client.c index 71e4a48..ee2338c 100644 --- a/examples/multi_process/client_server_mp/mp_client/client.c +++ b/examples/multi_process/client_server_mp/mp_client/client.c @@ -211,7 +211,7 @@ enqueue_packet(struct rte_mbuf *buf, uint8_t port) static void handle_packet(struct rte_mbuf *buf) { - const uint8_t in_port = buf->in_port; + const uint8_t in_port = buf->port; const uint8_t out_port = output_ports[in_port]; enqueue_packet(buf, out_port); diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c index d12a0bf..c1b2176 100644 --- a/lib/librte_mbuf/rte_mbuf.c +++ b/lib/librte_mbuf/rte_mbuf.c @@ -122,7 +122,7 @@ rte_pktmbuf_init(struct rte_mempool *mp, /* init some constant fields */ m->pool = mp; m->nb_segs = 1; - m->in_port = 0xff; + m->port = 0xff; } /* do some sanity checks on a mbuf: panic if it fails */ @@ -176,7 +176,7 @@ rte_pktmbuf_dump(FILE *f, const struct rte_mbuf *m, unsigned dump_len) m, (uint64_t)m->buf_physaddr, (unsigned)m->buf_len); fprintf(f, " pkt_len=%"PRIu32", ol_flags=%"PRIx16", nb_segs=%u, " "in_port=%u\n", m->pkt_len, m->ol_flags, - (unsigned)m->nb_segs, (unsigned)m->in_port); + (unsigned)m->nb_segs, (unsigned)m->port); nb_segs = m->nb_segs; while (m && nb_segs != 0) { diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h index d66b9bd..047a5a7 100644 --- a/lib/librte_mbuf/rte_mbuf.h +++ b/lib/librte_mbuf/rte_mbuf.h @@ -166,7 +166,7 @@ struct rte_mbuf { /* these fields are valid for first segment only */ uint8_t nb_segs;/**< Number of segments. */ - uint8_t in_port;/**< Input port. */ + uint8_t port; /**< Input port. */ uint32_t pkt_len; /**< Total pkt len: sum of all segment data_len. */ /* offload features, valid for first segment only */ @@ -542,7 +542,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m) m->pkt_len = 0; m-
[dpdk-dev] [PATCHv4] librte_acl make it build/work for 'default' target
Make ACL library to build/work on 'default' architecture: - make rte_acl_classify_scalar really scalar (make sure it wouldn't use sse4 instrincts through resolve_priority()). - Provide two versions of rte_acl_classify code path: rte_acl_classify_sse() - could be build and used only on systems with sse4.2 and upper, return -ENOTSUP on lower arch. rte_acl_classify_scalar() - a slower version, but could be build and used on all systems. - keep common code shared between these two codepaths. v2 chages: run-time selection of most appropriate code-path for given ISA. By default the highest supprted one is selected. User can still override that selection by manually assigning new value to the global function pointer rte_acl_default_classify. rte_acl_classify() becomes a macro calling whatever rte_acl_default_classify points to. V3 Changes Updated classify pointer to be a function so as to better preserve ABI REmoved macro definitions for match check functions to make them static inline V4 Changes Rewrote classification selection mechanim to use a function table, so that we can just store the preferred alg in the rte_acl_ctx struct so that multiprocess access works. I understand that leaves us with an extra load instruction, but I think thats ok, because it also allows... Addition of a new function rte_acl_classify_alg. This function lets you specify an enum value to override the acl contexts default algorith when doing a classification. This allows an application to specify a classification algorithm without needing to pulicize each method. I know there was concern over keeping those methods public, but we don't have a static ABI at the moment, so this seems to me a reasonable thing to do, as it gives us less of an ABI surface to worry about. Fixed misc missed static declarations Removed acl_match_check.h and moved match_check function to acl_run.h typdeffed function pointer to match check. Signed-off-by: Neil Horman CC: konstantin.ananyev at intel.com CC: thomas.monjalon at 6wind.com --- app/test-acl/main.c | 13 +- app/test/test_acl.c | 10 +- lib/librte_acl/Makefile | 5 +- lib/librte_acl/acl.h| 1 + lib/librte_acl/acl_bld.c| 5 +- lib/librte_acl/acl_run.c| 944 lib/librte_acl/acl_run.h| 271 lib/librte_acl/acl_run_scalar.c | 197 + lib/librte_acl/acl_run_sse.c| 630 +++ lib/librte_acl/rte_acl.c| 62 +++ lib/librte_acl/rte_acl.h| 66 ++- 11 files changed, 1208 insertions(+), 996 deletions(-) delete mode 100644 lib/librte_acl/acl_run.c create mode 100644 lib/librte_acl/acl_run.h create mode 100644 lib/librte_acl/acl_run_scalar.c create mode 100644 lib/librte_acl/acl_run_sse.c diff --git a/app/test-acl/main.c b/app/test-acl/main.c index d654409..6551918 100644 --- a/app/test-acl/main.c +++ b/app/test-acl/main.c @@ -787,6 +787,10 @@ acx_init(void) /* perform build. */ ret = rte_acl_build(config.acx, &cfg); + /* setup default rte_acl_classify */ + if (config.scalar) + rte_acl_set_default_classify(RTE_ACL_CLASSIFY_SCALAR); + dump_verbose(DUMP_NONE, stdout, "rte_acl_build(%u) finished with %d\n", config.bld_categories, ret); @@ -815,13 +819,8 @@ search_ip5tuples_once(uint32_t categories, uint32_t step, int scalar) v += config.trace_sz; } - if (scalar != 0) - ret = rte_acl_classify_scalar(config.acx, data, - results, n, categories); - - else - ret = rte_acl_classify(config.acx, data, - results, n, categories); + ret = rte_acl_classify(config.acx, data, results, + n, categories); if (ret != 0) rte_exit(ret, "classify for ipv%c_5tuples returns %d\n", diff --git a/app/test/test_acl.c b/app/test/test_acl.c index 869f6d3..2169f59 100644 --- a/app/test/test_acl.c +++ b/app/test/test_acl.c @@ -148,7 +148,7 @@ test_classify_run(struct rte_acl_ctx *acx) } /* make a quick check for scalar */ - ret = rte_acl_classify_scalar(acx, data, results, + ret = rte_acl_classify_alg(acx, RTE_ACL_CLASSIFY_SCALAR, data, results, RTE_DIM(acl_test_data), RTE_ACL_MAX_CATEGORIES); if (ret != 0) { printf("Line %i: SSE classify failed!\n", __LINE__); @@ -343,7 +343,7 @@ test_invalid_layout(void) } /* classify tuples */ - ret = rte_acl_classify(acx, data, results, + ret = rte_acl_classify_alg(acx, RTE_ACL_CLASSIFY_SCALAR, data, results, RTE_DIM(results), 1); if (ret != 0) { printf("Line %i: SSE classify failed!\n", __LINE__); @@ -362,7 +362,7 @@ te
[dpdk-dev] [PATCH v3] lib/librte_vhost: vhost library support to facilitate integration with DPDK accelerated vswitch.
> config/common_linuxapp |7 + > lib/Makefile |1 + > lib/librte_vhost/Makefile| 48 ++ > lib/librte_vhost/eventfd_link/Makefile | 39 + > lib/librte_vhost/eventfd_link/eventfd_link.c | 194 + > lib/librte_vhost/eventfd_link/eventfd_link.h | 40 + > lib/librte_vhost/rte_virtio_net.h| 192 + > lib/librte_vhost/vhost-net-cdev.c| 363 ++ > lib/librte_vhost/vhost-net-cdev.h| 109 +++ > lib/librte_vhost/vhost_rxtx.c| 292 > lib/librte_vhost/virtio-net.c| 1002 > ++ It would help if you made a first patch to move existing code, another patch to convert it into a lib, and a last one for the new example. So it would show how you transform the old example code and would be easier to review. Thanks -- Thomas
[dpdk-dev] [PATCH] hash: added rte_hash_keys to extract all keys
2014-08-12 23:47, Tomas Vestelind: > I added a function which extracts all the configured keys in a hash map. > This is good to have when debugging and printing data store in hash > maps. Someone to review this patch, please? (and the other one for rte_hash_clear) -- Thomas
[dpdk-dev] [PATCH] hash: added rte_hash_keys to extract all keys
On Tue, 12 Aug 2014 23:47:33 +0200 Tomas Vestelind wrote: > I added a function which extracts all the configured keys in a hash map. > This is good to have when debugging and printing data store in hash > maps. > > Signed-off-by: Tomas Vestelind > --- > lib/librte_hash/rte_hash.c | 26 ++ > lib/librte_hash/rte_hash.h | 15 +++ > 2 files changed, 41 insertions(+) > > diff --git a/lib/librte_hash/rte_hash.c b/lib/librte_hash/rte_hash.c > index d02b6b4..2108c4f 100644 > --- a/lib/librte_hash/rte_hash.c > +++ b/lib/librte_hash/rte_hash.c > @@ -481,3 +481,29 @@ rte_hash_lookup_bulk(const struct rte_hash *h, const > void **keys, > > return 0; > } > + > +unsigned int > +rte_hash_keys(const struct rte_hash *h, void *keys) > +{ > +unsigned int found_keys = 0; > +unsigned int bucket, entry; > + > +/* Go through each bucket and all its entries */ > +for (bucket = 0; bucket < h->num_buckets; bucket++) { > +const hash_sig_t *sig = get_sig_tbl_bucket(h, bucket); > + > +for (entry = 0; entry < h->bucket_entries; entry++) { > +/* If the signature is valid, find and save the corresponding > key */ > +if (sig[entry] != NULL_SIGNATURE) { > + uint8_t *key_bucket = get_key_tbl_bucket(h, bucket); > + void *key = get_key_from_bucket(h, key_bucket, entry); > + rte_memcpy(keys, key, h->key_len); > + > + keys = (uint8_t* )keys + h->key_len; > + found_keys++; > +} > +} > +} > + > +return found_keys; > +} > diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h > index 2ecaf1a..e0fb28f 100644 > --- a/lib/librte_hash/rte_hash.h > +++ b/lib/librte_hash/rte_hash.h > @@ -303,6 +303,21 @@ rte_hash_hash(const struct rte_hash *h, const void *key) > int > rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys, > uint32_t num_keys, int32_t *positions); > + > +/** > + * Copy the hash table keys to the supplied list of keys. > + * This operation is multi-thread safe. > + * > + * @param h > + * Hash table to look in. > + * @param keys > + * A pointer to a list of where keys will be written. > + * Must be large enough to fit a potentially full hash map. > + * @return > + * The number of found keys. > + */ > +unsigned int > +rte_hash_keys(const struct rte_hash *h, void *keys); > #ifdef __cplusplus > } > #endif Please indent with tabs not spaces. Using rte_memcpy() is no better than just using memcpy. Please put blank line after declartions. Writing code with continue makes it easier? for (entry = 0; entry < h->bucket_entries; entry++) { /* If the signature is valid, find and save the corresponding key */ if (sig[entry] == NULL_SIGNATURE continue;
[dpdk-dev] rte_mbuf: documentation, meta-data, and inconsistencies
rte_muf: Inconsistency in the programmer's guide or the code.? - >From the DPDK 1.7.0 programmer's guide we read: "For a newly allocated mbuf, the area at which the data begins in the message buffer is RTE_PKTMBUF_HEADROOM bytes after the beginning of the buffer, which is cache aligned." In the file ./lib/librte_mbuf/rte_mbuf.c and function rte_pktmbuf_init() we find: m->pkt.data = (char*) m->buf_addr + RTE_MIN(RTE_PKTMBUF_HEADROOM, m->buf_len); Where RTE_PKTMBUF_HEADROOM is configured to be 128 bytes from the file ./config/common_linuxapp: CONFIG_RTE_PKTMBUF_HEADROOM=128 Question 1: Does the above invocation of RTE_MIN() cause the programmer's guide to be inaccurate? (Saying "in practice it does not matter..." is not an answer) Question 2: Why is "packet metadata" implicitly sharing the same bytes of the mbuf headroom area, instead of being explicitly kept from being overwritten by a call to rte_pktmbuf_prepend(pkt, 100); presuming my metadata is at least 32 bytes long The above command would place the packet data area starting at the last 4 bytes of my metadata (Saying "you can change CONFIG_RTE_PKTMBUF_HEADROOM at compile time" is not an answer because it effects all mbufs not just the one I with my meta-data...) Question 3: Why do we write: #define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM) rte_mempool_create(s, 1024, MBUF_SIZE, 32,0, rte_pktmbuf_pool_init, NULL, rte_pktmbuf_init, NULL, socketid, flags); instead of: #define MBUF_SIZE (2048 + sizeof(struct rte_mbuf)) rte_mempool_create(s, 1024, MBUF_SIZE, 32,0, rte_pktmbuf_pool_init, NULL, rte_pktmbuf_init, NULL, socketid, flags); and have rte_pktmbuf_init() enforce ( + RTE_PKTMBUF_HEADROOM ) automatically? -- My reason for asking the above questions stems from my work on a Packet Framework Pipeline. I came to understand that there really isn't an *explicit* declaration of "allocate this much meta-data space for each packet". The programmer's guide and mbuf.h describe of headroom and tailroom as a place to edit the packet data. An example would be to wrap the packet for tunnelling by prepending and appending the data area to be large enought to contain a new header and checksum. In fact, while the programmer's guide mentions packet meta-data a few times, there is no section which actually describes how to make and access your very own packet meta-data. This addition would be very nice. The tables in the Packet Framework Pipeline examples all use keys injected into the meta-data of the mbuf at RX time to compare a rule against that key (explicitly stating the "offset into packet meta-data") and not allowing "offset into packet data". I actually like this setup because it allows the meta-data key to be securely analyzed and copied from the packet data - keeping malformed packets out of the decision making process of the tables. But, in the end, sharing the meta-data area with the packet headroom seems to be a very bad idea. Sincerely, Daniel Chapiesky
[dpdk-dev] rte_mbuf: documentation, meta-data, and inconsistencies
On Thu, Aug 28, 2014 at 08:00:59PM -0400, daniel chapiesky wrote: > But, in the end, sharing the meta-data area with the packet headroom seems > to be a very > bad idea. > > Sincerely, > > Daniel Chapiesky You might have picked a good time to inquire about it as some of the Intel guys are making patches to clean up rte_mbuf during the last couple of weeks as we speak. Matthew.