[dpdk-dev] [PATCH v6 12/19] malloc: fix the issue of SOCKET_ID_ANY
Hi,

> -----Original Message-----
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Saturday, February 14, 2015 1:57 AM
> To: Liang, Cunming
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 12/19] malloc: fix the issue of
> SOCKET_ID_ANY
>
> On Fri, Feb 13, 2015 at 09:38:14AM +0800, Cunming Liang wrote:
> > Add check for rte_socket_id(), avoid get unexpected return like (-1).
> >
> > Signed-off-by: Cunming Liang
> > ---
> >  lib/librte_malloc/malloc_heap.h | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_malloc/malloc_heap.h
> > b/lib/librte_malloc/malloc_heap.h
> > index b4aec45..a47136d 100644
> > --- a/lib/librte_malloc/malloc_heap.h
> > +++ b/lib/librte_malloc/malloc_heap.h
> > @@ -44,7 +44,12 @@ extern "C" {
> >  static inline unsigned
> >  malloc_get_numa_socket(void)
> >  {
> > -	return rte_socket_id();
> > +	unsigned socket_id = rte_socket_id();
> > +
> > +	if (socket_id == (unsigned)SOCKET_ID_ANY)
> > +		return 0;
> > +
> > +	return socket_id;
> Why is -1 unexpected?  Isn't it reasonable to assume that some memory is
> equidistant from all cpu numa nodes?

[LCM] A single allocation always comes entirely from one specific NUMA node; it is never split with one part from one node and another part from a different node.
If no NUMA node is specified (SOCKET_ID_ANY / -1), the allocator first asks for the NUMA node that the current core belongs to. That is when 'malloc_get_numa_socket()' is called.
Previously, with 1:1 thread/core mapping assumed and a default value of 0, it always returned a value other than -1.
Now rte_socket_id() may return -1, either when the pthread runs on multiple cores that do not all belong to one NUMA node, or when _socket_id has not been assigned yet and still holds the default value (-1).
So if the current _socket_id is -1, we just pick the first node as the candidate.
I should probably add more comments for this.

>
> Neil
>
> >  }
> >
> >  void *
> > --
> > 1.8.1.4
> >
> >
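The fallback discussed in this message can be sketched in isolation. This is only an illustration, not the librte_malloc code: pick_numa_socket() is a hypothetical stand-in for malloc_get_numa_socket() that takes the socket id as a parameter instead of calling rte_socket_id(), and SOCKET_ID_ANY is redefined locally as -1 so that the snippet builds without DPDK headers.

```c
#include <assert.h>

/* Local copy of the DPDK constant, assumed here to be -1 as in the thread. */
#define SOCKET_ID_ANY -1

/*
 * Hypothetical stand-in for malloc_get_numa_socket().
 * socket_id is the value rte_socket_id() would have returned.
 * "No specific socket" (-1 cast to unsigned) falls back to node 0;
 * any other value is passed through unchanged.
 */
unsigned
pick_numa_socket(unsigned socket_id)
{
	if (socket_id == (unsigned)SOCKET_ID_ANY)
		return 0;

	return socket_id;
}
```

The cast matters because the function works on unsigned values: comparing against a plain -1 would rely on an implicit conversion, while the explicit cast states the intent.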
[dpdk-dev] [PATCH v6 06/19] eal: new TLS definition and API declaration
Hi, > -Original Message- > From: Neil Horman [mailto:nhorman at tuxdriver.com] > Sent: Friday, February 13, 2015 9:58 PM > To: Liang, Cunming > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v6 06/19] eal: new TLS definition and API > declaration > > On Fri, Feb 13, 2015 at 09:38:08AM +0800, Cunming Liang wrote: > > 1. add two TLS *_socket_id* and *_cpuset* > > 2. add two external API rte_thread_set/get_affinity > > 3. add one internal API eal_thread_dump_affinity > > > > Signed-off-by: Cunming Liang > > --- > > v5 changes: > >add comments for RTE_CPU_AFFINITY_STR_LEN > >update comments for eal_thread_dump_affinity() > >return void for rte_thread_get_affinity() > >move rte_socket_id() change to another patch > > > > lib/librte_eal/bsdapp/eal/eal_thread.c| 2 ++ > > lib/librte_eal/common/eal_thread.h| 36 > +++ > > lib/librte_eal/common/include/rte_lcore.h | 26 +- > > lib/librte_eal/linuxapp/eal/eal_thread.c | 2 ++ > > 4 files changed, 65 insertions(+), 1 deletion(-) > > > > diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c > b/lib/librte_eal/bsdapp/eal/eal_thread.c > > index ab05368..10220c7 100644 > > --- a/lib/librte_eal/bsdapp/eal/eal_thread.c > > +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c > > @@ -56,6 +56,8 @@ > > #include "eal_thread.h" > > > > RTE_DEFINE_PER_LCORE(unsigned, _lcore_id); > > +RTE_DEFINE_PER_LCORE(unsigned, _socket_id); > > +RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset); > > > > /* > > * Send a message to a slave lcore identified by slave_id to call a > > diff --git a/lib/librte_eal/common/eal_thread.h > b/lib/librte_eal/common/eal_thread.h > > index f1ce0bd..e4e76b9 100644 > > --- a/lib/librte_eal/common/eal_thread.h > > +++ b/lib/librte_eal/common/eal_thread.h > > @@ -34,6 +34,8 @@ > > #ifndef EAL_THREAD_H > > #define EAL_THREAD_H > > > > +#include > > + > > /** > > * basic loop of thread, called for each thread by eal_init(). 
> > * > > @@ -61,4 +63,38 @@ void eal_thread_init_master(unsigned lcore_id); > > */ > > unsigned eal_cpu_socket_id(unsigned cpu_id); > > > > +/** > > + * Get the NUMA socket id from cpuset. > > + * This function is private to EAL. > > + * > > + * @param cpusetp > > + * The point to a valid cpu set. > > + * @return > > + * socket_id or SOCKET_ID_ANY > > + */ > > +int eal_cpuset_socket_id(rte_cpuset_t *cpusetp); > > + > > +/** > > + * Default buffer size to use with eal_thread_dump_affinity() > > + */ > > +#define RTE_CPU_AFFINITY_STR_LEN256 > > + > > +/** > > + * Dump the current pthread cpuset. > > + * This function is private to EAL. > > + * > > + * Note: > > + * If the dump size is greater than the size of given buffer, > > + * the string will be truncated and with '\0' at the end. > > + * > > + * @param str > > + * The string buffer the cpuset will dump to. > > + * @param size > > + * The string buffer size. > > + * @return > > + * 0 for success, -1 if truncation happens. > > + */ > > +int > > +eal_thread_dump_affinity(char *str, unsigned size); > > + > > #endif /* EAL_THREAD_H */ > > diff --git a/lib/librte_eal/common/include/rte_lcore.h > b/lib/librte_eal/common/include/rte_lcore.h > > index 4c7d6bb..33f558e 100644 > > --- a/lib/librte_eal/common/include/rte_lcore.h > > +++ b/lib/librte_eal/common/include/rte_lcore.h > > @@ -80,7 +80,9 @@ struct lcore_config { > > */ > > extern struct lcore_config lcore_config[RTE_MAX_LCORE]; > > > > -RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */ > > +RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per thread "lcore id". > */ > > +RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". > */ > > +RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". > */ > > > > /** > > * Return the ID of the execution unit we are running on. 
> > @@ -229,6 +231,28 @@ rte_get_next_lcore(unsigned i, int skip_master, int > wrap) > > i > i = rte_get_next_lcore(i, 1, 0)) > > > > +/** > > + * Set core affinity of the current thread. > > + * Support both EAL and none-EAL thread and update TLS. > > + * > > + * @param cpusetp > > + * Point to cpu_set_t for setting current thread affinity. > > + * @return > > + * On success, return 0; otherwise return -1; > > + */ > > +int rte_thread_set_affinity(rte_cpuset_t *cpusetp); > > + > > +/** > > + * Get core affinity of the current thread. > > + * > > + * @param cpusetp > > + * Point to cpu_set_t for getting current thread cpu affinity. > > + * It presumes input is not NULL, otherwise it causes panic. > > + * > > + */ > > +void rte_thread_get_affinity(rte_cpuset_t *cpusetp); > > + > > + > > #ifdef __cplusplus > > } > > #endif > > diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c > b/lib/librte_eal/linuxapp/eal/eal_thread.c > > index 80a985f..748a83a 100644 > > --- a/lib/librte_eal/linuxapp/eal/eal_thread.c > > +++ b/lib/librte_eal/linuxapp/eal/eal_thread.c > > @@ -
[dpdk-dev] [PATCH v2 4/4] app/testpmd:test NVGRE Tx checksum offload
> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Friday, February 13, 2015 5:53 PM
> To: Liu, Jijiang; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 4/4] app/testpmd:test NVGRE Tx checksum
> offload
>
> Hi Jijiang,
>
> On 02/12/2015 01:45 AM, Jijiang Liu wrote:
> > Enhance csum fwd engine based on current TX checksum framework in order
> > to test TX Checksum offload for NVGRE packet.
> >
> > It includes:
> >  - IPv4 and IPv6 packet
> >  - outer L3, inner L3 and L4 checksum offload for Tx side.
> >
> > [...]
> > @@ -231,20 +235,25 @@ parse_gre(struct simple_gre_hdr *gre_hdr, struct testpmd_offload_info *info)
> >  	struct ether_hdr *eth_hdr;
> >  	struct ipv4_hdr *ipv4_hdr;
> >  	struct ipv6_hdr *ipv6_hdr;
> > +	uint8_t gre_len = 0;
> >
> > -	/* if flags != 0; it's not supported */
> > -	if (gre_hdr->flags != 0)
> > +	/* check which fields are supported */
> > +	if (gre_hdr->flags != 0 &&
> > +		(gre_hdr->flags & _htons(GRE_SUPPORTED_FIELDS)) == 0)
> >  		return;
> >
> > +	gre_len += sizeof(struct simple_gre_hdr);
> > +
> > +	if (gre_hdr->flags & _htons(GRE_KEY_PRESENT))
> > +		gre_len += GRE_KEY_LEN;
> > +
>
> I think this test won't work if the flags contain both supported and
> unsupported flags.
>
> What about this instead:
>
>   if ((gre_hdr->flags & _htons(~GRE_SUPPORTED_FIELDS)) != 0)
>           return;
>

That's correct, I will update it in next version.

> Regards,
> Olivier
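The difference between the two checks is easiest to see with a header that carries one supported and one unsupported flag. The sketch below is an illustration only: it works on host-order values to leave byte swapping out, the two function names are hypothetical, and the flag values (key bit 0x2000, checksum bit 0x8000, with the key bit as the only supported field) are assumptions made for this example rather than values quoted in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flag layout, host byte order, for illustration only. */
#define GRE_KEY_PRESENT      0x2000u
#define GRE_CHECKSUM_PRESENT 0x8000u          /* stands for "some unsupported flag" */
#define GRE_SUPPORTED_FIELDS GRE_KEY_PRESENT

/* Check as posted in v2: rejects only when no supported bit is set. */
int
gre_flags_ok_v2(uint16_t flags)
{
	if (flags != 0 && (flags & GRE_SUPPORTED_FIELDS) == 0)
		return 0;
	return 1;
}

/* Check suggested in the review: rejects when any unsupported bit is set. */
int
gre_flags_ok_suggested(uint16_t flags)
{
	if ((flags & (uint16_t)~GRE_SUPPORTED_FIELDS) != 0)
		return 0;
	return 1;
}
```

With flags = GRE_KEY_PRESENT | GRE_CHECKSUM_PRESENT, the v2 check accepts the header even though the parser would not account for the checksum field, while the suggested check rejects it.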
[dpdk-dev] [PATCH v6 05/19] eal: add support parsing socket_id from cpuset
> -Original Message- > From: Neil Horman [mailto:nhorman at tuxdriver.com] > Sent: Friday, February 13, 2015 9:52 PM > To: Liang, Cunming > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v6 05/19] eal: add support parsing socket_id > from cpuset > > On Fri, Feb 13, 2015 at 09:38:07AM +0800, Cunming Liang wrote: > > It returns the socket_id if all cpus in the cpuset belongs > > to the same NUMA node, otherwise it will return SOCKET_ID_ANY. > > > > Signed-off-by: Cunming Liang > > --- > > v5 changes: > >expose cpu_socket_id as eal_cpu_socket_id for linuxapp > >eal_cpuset_socket_id() remove static inline and move to c file > > > > lib/librte_eal/bsdapp/eal/eal_lcore.c | 7 +++ > > lib/librte_eal/common/eal_thread.h | 11 +++ > > lib/librte_eal/linuxapp/eal/eal_lcore.c | 7 --- > > 3 files changed, 22 insertions(+), 3 deletions(-) > > > > diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c > b/lib/librte_eal/bsdapp/eal/eal_lcore.c > > index 72f8ac2..162fb4f 100644 > > --- a/lib/librte_eal/bsdapp/eal/eal_lcore.c > > +++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c > > @@ -41,6 +41,7 @@ > > #include > > > > #include "eal_private.h" > > +#include "eal_thread.h" > > > > /* No topology information available on FreeBSD including NUMA info */ > > #define cpu_core_id(X) 0 > > @@ -112,3 +113,9 @@ rte_eal_cpu_init(void) > > > > return 0; > > } > > + > > +unsigned > > +eal_cpu_socket_id(__rte_unused unsigned cpu_id) > > +{ > > + return cpu_socket_id(cpu_id); > > +} > > diff --git a/lib/librte_eal/common/eal_thread.h > b/lib/librte_eal/common/eal_thread.h > > index b53b84d..f1ce0bd 100644 > > --- a/lib/librte_eal/common/eal_thread.h > > +++ b/lib/librte_eal/common/eal_thread.h > > @@ -50,4 +50,15 @@ __attribute__((noreturn)) void *eal_thread_loop(void > *arg); > > */ > > void eal_thread_init_master(unsigned lcore_id); > > > > +/** > > + * Get the NUMA socket id from cpu id. > > + * This function is private to EAL. > > + * > > + * @param cpu_id > > + * The logical process id. 
> > + * @return > > + * socket_id or SOCKET_ID_ANY > > + */ > > +unsigned eal_cpu_socket_id(unsigned cpu_id); > > + > > #endif /* EAL_THREAD_H */ > > diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c > b/lib/librte_eal/linuxapp/eal/eal_lcore.c > > index 29615f8..ef8c433 100644 > > --- a/lib/librte_eal/linuxapp/eal/eal_lcore.c > > +++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c > > @@ -45,6 +45,7 @@ > > > > #include "eal_private.h" > > #include "eal_filesystem.h" > > +#include "eal_thread.h" > > > > #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u" > > #define CORE_ID_FILE "topology/core_id" > > @@ -71,8 +72,8 @@ cpu_detected(unsigned lcore_id) > > * Note: physical package id != NUMA node, but we use it as a > > * fallback for kernels which don't create a nodeY link > > */ > > -static unsigned > > -cpu_socket_id(unsigned lcore_id) > > +unsigned > > +eal_cpu_socket_id(unsigned lcore_id) > If you want to export this symbol, then you need to add it to the version map. [LCM] They're all EAL internal function, won't plan to expose as EAL API. > > Neil
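The rule stated in the commit message (return the socket id when every cpu in the set sits on the same NUMA node, otherwise SOCKET_ID_ANY) can be sketched without EAL. Everything below is an assumption made for illustration: the cpuset is modelled as a 64-bit mask instead of rte_cpuset_t, the cpu_to_socket table stands in for eal_cpu_socket_id(), and cpuset_socket_id() is a hypothetical name, not the function from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SOCKET_ID_ANY -1   /* local copy of the DPDK constant */

/*
 * Return the common socket of all cpus set in cpu_mask, or SOCKET_ID_ANY
 * if the set is empty or spans more than one socket.
 * cpu_to_socket[i] is the socket of cpu i (hypothetical lookup table).
 */
int
cpuset_socket_id(uint64_t cpu_mask, const int *cpu_to_socket, size_t ncpus)
{
	int socket_id = SOCKET_ID_ANY;
	size_t cpu;

	assert(cpu_to_socket != NULL);

	for (cpu = 0; cpu < ncpus && cpu < 64; cpu++) {
		if ((cpu_mask & (UINT64_C(1) << cpu)) == 0)
			continue;
		if (socket_id == SOCKET_ID_ANY)
			socket_id = cpu_to_socket[cpu];      /* first cpu in the set */
		else if (socket_id != cpu_to_socket[cpu])
			return SOCKET_ID_ANY;                /* set spans two sockets */
	}

	return socket_id;
}
```

For a 4-cpu machine with cpus 0-1 on socket 0 and cpus 2-3 on socket 1, the mask 0x3 yields 0, the mask 0xC yields 1, and the mask 0x6 yields SOCKET_ID_ANY.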
[dpdk-dev] [PATCH v6 04/19] eal: fix wrong strnlen() return value in 32bit icc
> -----Original Message-----
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Saturday, February 14, 2015 1:55 AM
> To: Olivier MATZ
> Cc: Liang, Cunming; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 04/19] eal: fix wrong strnlen() return
> value in 32bit icc
>
> On Fri, Feb 13, 2015 at 03:05:44PM +0100, Olivier MATZ wrote:
> > Hi Neil,
> >
> > On 02/13/2015 02:49 PM, Neil Horman wrote:
> > > On Fri, Feb 13, 2015 at 09:38:06AM +0800, Cunming Liang wrote:
> > >> The problem is that strnlen() here may return invalid value with 32bit icc.
> > >> (actually it returns its second parameter, e.g. sysconf(_SC_ARG_MAX)).
> > >> It starts to manifest when max_len parameter is > 2M and using icc -m32 -O2 (or above).
> > >>
> > >> Suggested-by: Konstantin Ananyev
> > >> Signed-off-by: Cunming Liang
> > >> ---
> > >> v5 changes:
> > >>    using strlen instead of strnlen.
> > >>
> > >>  lib/librte_eal/common/eal_common_options.c | 6 +++---
> > >>  1 file changed, 3 insertions(+), 3 deletions(-)
> > >>
> > >> diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> > >> index 178e303..9cf2faa 100644
> > >> --- a/lib/librte_eal/common/eal_common_options.c
> > >> +++ b/lib/librte_eal/common/eal_common_options.c
> > >> @@ -167,7 +167,7 @@ eal_parse_coremask(const char *coremask)
> > >>  	if (coremask[0] == '0' && ((coremask[1] == 'x')
> > >>  		|| (coremask[1] == 'X')))
> > >>  		coremask += 2;
> > >> -	i = strnlen(coremask, PATH_MAX);
> > >> +	i = strlen(coremask);
> > > This is crash prone.  If coremask is passed in without a trailing NUL byte,
> > > strlen will return a huge value that can overrun the array.
> >
> > We discussed that in a previous thread:
> > http://dpdk.org/ml/archives/dev/2015-February/012552.html
> >
> > coremask is always a valid nul-terminated string as it comes from
> > argv[] table.
> > It is not a memory fragment that is controlled by a user, so I don't
> > think using strnlen() instead of strlen() would solve any issue.
> >
> That's absolutely false, you can't in any way make that assertion.
> eal_parse_common_option is a public API call.  An application can construct its
> own string to pass into the parser.  The test applications all use the command
> line functions so it's not a visible issue from the test apps, but you can't
> assume what the test apps do is what everyone will do.  It would be one thing if
> you could make the parse_common_option function private, but with the current
> layout you can't so you have to be ready for garbage input.
>
> Neil

[LCM] That sounds reasonable to me. I'll roll back the change and use strnlen(coremask, ARG_MAX) instead.

>
> > Regards,
> > Olivier
> >
[dpdk-dev] [PATCH v6 00/19] support multi-pthread per core
Hi, > -Original Message- > From: Olivier MATZ [mailto:olivier.matz at 6wind.com] > Sent: Friday, February 13, 2015 6:06 PM > To: Liang, Cunming; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v6 00/19] support multi-pthread per core > > Hi, > > On 02/13/2015 02:38 AM, Cunming Liang wrote: > > v6 changes: > > rename RTE_RING_PAUSE_REP(_COUNT) and set default to 0 > > rollback to use RTE_MAX_LCORE when checking valid lcore_id for EAL thread > > > > v5 changes: > > reorder some patch and split into addtional two patches > > rte_thread_get_affinity() return type change to avoid > > add RTE_RING_PAUSE_REP into config and by default turn off > > > > v4 changes: > > new patch fixing strnlen() invalid return in 32bit icc [03/17] > > update and add more comments on sched_yield() [16/17] > > > > v3 changes: > > new patch adding sched_yield() in rte_ring to avoid long spin [16/17] > > > > v2 changes: > > add '-' support for EAL option '--lcores' [02/17] > > > > The patch series contain the enhancements of EAL and fixes for libraries > > to run multi-pthreads(either EAL or non-EAL thread) per physical core. > > Two major changes list as below: > > - Extend the core affinity of each EAL thread to 1:n. > > Each lcore stands for a EAL thread rather than a logical core. > > The change adds new EAL option to allow static lcore to cpuset assginment. > > Then a lcore(EAL thread) affinity to a cpuset, original 1:1 mapping is > > the special > case. > > - Fix the libraries to allow running on any non-EAL thread. > > It fix the gaps running libraries in non-EAL thread(dynamic created by > > user). > > Each fix libraries take care the case of rte_lcore_id() >= RTE_MAX_LCORE. > > > > Thanks a million for the comments from Konstantin, Bruce, Mirek and Stephen > in RFC review. 
> > > > Cunming Liang (19): > > eal: add cpuset into per EAL thread lcore_config > > eal: fix PAGE_SIZE redefine complaint on freebsd > > eal: new eal option '--lcores' for cpu assignment > > eal: fix wrong strnlen() return value in 32bit icc > > eal: add support parsing socket_id from cpuset > > eal: new TLS definition and API declaration > > eal: add eal_common_thread.c for common thread API > > eal: standardize init sequence between linux and bsd > > eal: add rte_gettid() to acquire unique system tid > > eal: apply affinity of EAL thread by assigned cpuset > > enic: fix re-define freebsd compile complain > > malloc: fix the issue of SOCKET_ID_ANY > > log: fix the gap to support non-EAL thread > > eal: set _lcore_id and _socket_id to (-1) by default > > eal: fix recursive spinlock in non-EAL thraed > > mempool: add support to non-EAL thread > > ring: add support to non-EAL thread > > ring: add sched_yield to avoid spin forever > > timer: add support to non-EAL thread > > > > config/common_bsdapp | 1 + > > config/common_linuxapp | 1 + > > lib/librte_eal/bsdapp/eal/Makefile | 1 + > > lib/librte_eal/bsdapp/eal/eal.c| 14 +- > > lib/librte_eal/bsdapp/eal/eal_lcore.c | 14 + > > lib/librte_eal/bsdapp/eal/eal_memory.c | 8 +- > > lib/librte_eal/bsdapp/eal/eal_thread.c | 77 ++ > > lib/librte_eal/common/eal_common_log.c | 17 +- > > lib/librte_eal/common/eal_common_options.c | 308 > - > > lib/librte_eal/common/eal_common_thread.c | 150 ++ > > lib/librte_eal/common/eal_options.h| 2 + > > lib/librte_eal/common/eal_thread.h | 47 > > .../common/include/generic/rte_spinlock.h | 4 +- > > lib/librte_eal/common/include/rte_eal.h| 27 ++ > > lib/librte_eal/common/include/rte_lcore.h | 40 ++- > > lib/librte_eal/common/include/rte_log.h| 5 + > > lib/librte_eal/linuxapp/eal/Makefile | 4 + > > lib/librte_eal/linuxapp/eal/eal.c | 8 +- > > lib/librte_eal/linuxapp/eal/eal_lcore.c| 15 +- > > lib/librte_eal/linuxapp/eal/eal_thread.c | 77 ++ > > lib/librte_malloc/malloc_heap.h| 7 +- > > 
lib/librte_mempool/rte_mempool.h | 18 +- > > lib/librte_pmd_enic/enic.h | 4 +- > > lib/librte_pmd_enic/enic_compat.h | 2 +- > > lib/librte_pmd_enic/vnic/vnic_dev.c| 6 +- > > lib/librte_ring/rte_ring.h | 41 ++- > > lib/librte_timer/rte_timer.c | 32 ++- > > lib/librte_timer/rte_timer.h | 4 +- > > 28 files changed, 765 insertions(+), 169 deletions(-) > > create mode 100644 lib/librte_eal/common/eal_common_thread.c > > > > Series: > Acked-by: Olivier Matz [LCM] Thanks for all the comments and suggestion. > > Maybe a doc update will be required, could you have a look at it? [L
[dpdk-dev] Query on portmask config
Hi,

I'm new to DPDK. I would like to know how to determine the portmask for a given configuration. Does it depend on the number of cores configured?

Regards,
Shankari.V
[dpdk-dev] [PATCH v7 00/19] support multi-pthread per core
v7 changes:
  update EAL version map for new public EAL API
  rollback to use strnlen() passing EAL core option

v6 changes:
  rename RTE_RING_PAUSE_REP(_COUNT) and set default to 0
  rollback to use RTE_MAX_LCORE when checking valid lcore_id for EAL thread

v5 changes:
  reorder some patches and split into two additional patches
  rte_thread_get_affinity() return type change to avoid
  add RTE_RING_PAUSE_REP into config and by default turn off

v4 changes:
  new patch fixing strnlen() invalid return in 32bit icc [03/17]
  update and add more comments on sched_yield() [16/17]

v3 changes:
  new patch adding sched_yield() in rte_ring to avoid long spin [16/17]

v2 changes:
  add '-' support for EAL option '--lcores' [02/17]

The patch series contains enhancements to EAL and fixes to libraries so that
multiple pthreads (either EAL or non-EAL threads) can run per physical core.
The two major changes are listed below:
- Extend the core affinity of each EAL thread to 1:n.
  Each lcore stands for an EAL thread rather than a logical core.
  The change adds a new EAL option to allow static lcore-to-cpuset assignment.
  An lcore (EAL thread) then has affinity to a cpuset; the original 1:1 mapping
  is the special case.
- Fix the libraries to allow running on any non-EAL thread.
  This fixes the gaps in running the libraries in a non-EAL thread (dynamically
  created by the user).
  Each fixed library takes care of the case rte_lcore_id() >= RTE_MAX_LCORE.

Thanks a million for the comments from Konstantin, Bruce, Mirek and Stephen in RFC review.

Cunming Liang (19): eal: add cpuset into per EAL thread lcore_config eal: fix PAGE_SIZE redefine complaint on freebsd eal: new eal option '--lcores' for cpu assignment eal: fix wrong strnlen() return value in 32bit icc eal: add support parsing socket_id from cpuset eal: new TLS definition and API declaration eal: add eal_common_thread.c for common thread API eal: standardize init sequence between linux and bsd eal: add rte_gettid() to acquire unique system tid eal: apply affinity of EAL thread by assigned cpuset enic: fix re-define freebsd compile complain malloc: fix the issue of SOCKET_ID_ANY log: fix the gap to support non-EAL thread eal: set _lcore_id and _socket_id to (-1) by default eal: fix recursive spinlock in non-EAL thraed mempool: add support to non-EAL thread ring: add support to non-EAL thread ring: add sched_yield to avoid spin forever timer: add support to non-EAL thread config/common_bsdapp | 1 + config/common_linuxapp | 1 + lib/librte_eal/bsdapp/eal/Makefile | 1 + lib/librte_eal/bsdapp/eal/eal.c| 14 +- lib/librte_eal/bsdapp/eal/eal_lcore.c | 14 + lib/librte_eal/bsdapp/eal/eal_memory.c | 8 +- lib/librte_eal/bsdapp/eal/eal_thread.c | 77 ++--- lib/librte_eal/bsdapp/eal/rte_eal_version.map | 2 + lib/librte_eal/common/eal_common_log.c | 17 +- lib/librte_eal/common/eal_common_options.c | 309 - lib/librte_eal/common/eal_common_thread.c | 150 ++ lib/librte_eal/common/eal_options.h| 2 + lib/librte_eal/common/eal_thread.h | 47 .../common/include/generic/rte_spinlock.h | 4 +- lib/librte_eal/common/include/rte_eal.h| 27 ++ lib/librte_eal/common/include/rte_lcore.h | 40 ++- lib/librte_eal/common/include/rte_log.h| 5 + lib/librte_eal/linuxapp/eal/Makefile | 4 + lib/librte_eal/linuxapp/eal/eal.c | 8 +- lib/librte_eal/linuxapp/eal/eal_lcore.c| 15 +- lib/librte_eal/linuxapp/eal/eal_thread.c | 77 ++--- lib/librte_eal/linuxapp/eal/rte_eal_version.map| 2 + lib/librte_malloc/malloc_heap.h| 7 +- lib/librte_mempool/rte_mempool.h | 18 +- lib/librte_pmd_enic/enic.h | 4 +- 
lib/librte_pmd_enic/enic_compat.h | 2 +- lib/librte_pmd_enic/vnic/vnic_dev.c| 6 +- lib/librte_ring/rte_ring.h | 41 ++- lib/librte_timer/rte_timer.c | 32 ++- lib/librte_timer/rte_timer.h | 4 +- 30 files changed, 770 insertions(+), 169 deletions(-) create mode 100644 lib/librte_eal/common/eal_common_thread.c -- 1.8.1.4
[dpdk-dev] [PATCH v7 01/19] eal: add cpuset into per EAL thread lcore_config
The patch adds 'cpuset' into per-lcore configure 'lcore_config[]', as the lcore no longer always 1:1 pinning with physical cpu. The lcore now stands for a EAL thread rather than a logical cpu. It doesn't change the default behavior of 1:1 mapping, but allows to affinity the EAL thread to multiple cpus. Signed-off-by: Cunming Liang --- v5 changes: separate eal_memory.c to the new patch lib/librte_eal/bsdapp/eal/eal_lcore.c | 7 +++ lib/librte_eal/common/include/rte_lcore.h | 8 lib/librte_eal/linuxapp/eal/Makefile | 1 + lib/librte_eal/linuxapp/eal/eal_lcore.c | 8 4 files changed, 24 insertions(+) diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c index 662f024..72f8ac2 100644 --- a/lib/librte_eal/bsdapp/eal/eal_lcore.c +++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c @@ -76,11 +76,18 @@ rte_eal_cpu_init(void) * ones and enable them by default. */ for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + /* init cpuset for per lcore config */ + CPU_ZERO(&lcore_config[lcore_id].cpuset); + lcore_config[lcore_id].detected = (lcore_id < ncpus); if (lcore_config[lcore_id].detected == 0) { config->lcore_role[lcore_id] = ROLE_OFF; continue; } + + /* By default, lcore 1:1 map to cpu id */ + CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset); + /* By default, each detected core is enabled */ config->lcore_role[lcore_id] = ROLE_RTE; lcore_config[lcore_id].core_id = cpu_core_id(lcore_id); diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h index 49b2c03..4c7d6bb 100644 --- a/lib/librte_eal/common/include/rte_lcore.h +++ b/lib/librte_eal/common/include/rte_lcore.h @@ -50,6 +50,13 @@ extern "C" { #define LCORE_ID_ANY -1/**< Any lcore. 
*/ +#if defined(__linux__) + typedef cpu_set_t rte_cpuset_t; +#elif defined(__FreeBSD__) +#include + typedef cpuset_t rte_cpuset_t; +#endif + /** * Structure storing internal configuration (per-lcore) */ @@ -65,6 +72,7 @@ struct lcore_config { unsigned socket_id;/**< physical socket id for this lcore */ unsigned core_id; /**< core number on socket for this lcore */ int core_index;/**< relative index, starting from 0 */ + rte_cpuset_t cpuset; /**< cpu set which the lcore affinity to */ }; /** diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile index e117cec..1b6c484 100644 --- a/lib/librte_eal/linuxapp/eal/Makefile +++ b/lib/librte_eal/linuxapp/eal/Makefile @@ -91,6 +91,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_dev.c SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_options.c CFLAGS_eal.o := -D_GNU_SOURCE +CFLAGS_eal_lcore.o := -D_GNU_SOURCE CFLAGS_eal_thread.o := -D_GNU_SOURCE CFLAGS_eal_log.o := -D_GNU_SOURCE CFLAGS_eal_common_log.o := -D_GNU_SOURCE diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c index c67e0e6..29615f8 100644 --- a/lib/librte_eal/linuxapp/eal/eal_lcore.c +++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c @@ -158,11 +158,19 @@ rte_eal_cpu_init(void) * ones and enable them by default. */ for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { + /* init cpuset for per lcore config */ + CPU_ZERO(&lcore_config[lcore_id].cpuset); + + /* in 1:1 mapping, record related cpu detected state */ lcore_config[lcore_id].detected = cpu_detected(lcore_id); if (lcore_config[lcore_id].detected == 0) { config->lcore_role[lcore_id] = ROLE_OFF; continue; } + + /* By default, lcore 1:1 map to cpu id */ + CPU_SET(lcore_id, &lcore_config[lcore_id].cpuset); + /* By default, each detected core is enabled */ config->lcore_role[lcore_id] = ROLE_RTE; lcore_config[lcore_id].core_id = cpu_core_id(lcore_id); -- 1.8.1.4
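The default 1:1 initialisation added by this patch can be tried outside EAL on Linux. The following is a sketch under stated assumptions, not the rte_eal_cpu_init() code: struct lcore_cfg_sketch and init_default_cpusets() are hypothetical names, detection is reduced to "lcore_id < ncpus" as in the bsdapp hunk, and the snippet relies on glibc's cpu_set_t macros from <sched.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* cpu_set_t and the CPU_* macros are GNU extensions */
#endif
#include <assert.h>
#include <sched.h>

/* Hypothetical, trimmed-down counterpart of struct lcore_config. */
struct lcore_cfg_sketch {
	int detected;       /* non-zero if a cpu with this id exists */
	cpu_set_t cpuset;   /* cpus this lcore may run on */
};

/*
 * Clear every cpuset, then give each detected lcore a cpuset that
 * contains only the cpu with the same id (the default 1:1 mapping).
 */
void
init_default_cpusets(struct lcore_cfg_sketch *cfg, unsigned max_lcore,
		     unsigned ncpus)
{
	unsigned lcore_id;

	assert(cfg != NULL);

	for (lcore_id = 0; lcore_id < max_lcore; lcore_id++) {
		CPU_ZERO(&cfg[lcore_id].cpuset);

		cfg[lcore_id].detected = (lcore_id < ncpus);
		if (!cfg[lcore_id].detected)
			continue;

		/* By default, lcore N maps to cpu N only. */
		CPU_SET(lcore_id, &cfg[lcore_id].cpuset);
	}
}
```

Zeroing the set before the detection check means that undetected lcores end up with an empty cpuset rather than uninitialised memory.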
[dpdk-dev] [PATCH v7 02/19] eal: fix PAGE_SIZE redefine complaint on freebsd
Signed-off-by: Cunming Liang
---
 lib/librte_eal/bsdapp/eal/eal_memory.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index 65ee87d..33ebd0f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -45,7 +45,7 @@
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 
-#define PAGE_SIZE (sysconf(_SC_PAGESIZE))
+#define EAL_PAGE_SIZE (sysconf(_SC_PAGESIZE))
 
 /*
  * Get physical address of any mapped virtual address in the current process.
@@ -93,7 +93,8 @@ rte_eal_contigmem_init(void)
 		char physaddr_str[64];
 
 		addr = mmap(NULL, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-			MAP_SHARED, hpi->lock_descriptor, j * PAGE_SIZE);
+			MAP_SHARED, hpi->lock_descriptor,
+			j * EAL_PAGE_SIZE);
 		if (addr == MAP_FAILED) {
 			RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
 				j, hpi->hugedir);
@@ -167,7 +168,8 @@ rte_eal_contigmem_attach(void)
 		struct rte_memseg *seg = &mcfg->memseg[i];
 
 		addr = mmap(seg->addr, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-			MAP_SHARED|MAP_FIXED, fd_hugepage, i * PAGE_SIZE);
+			MAP_SHARED|MAP_FIXED, fd_hugepage,
+			i * EAL_PAGE_SIZE);
 		if (addr == MAP_FAILED || addr != seg->addr) {
 			RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
 				i, hpi->hugedir);
-- 
1.8.1.4
[dpdk-dev] [PATCH v7 03/19] eal: new eal option '--lcores' for cpu assignment
It supports one new eal long option '--lcores' for EAL thread cpuset assignment. The format pattern: --lcores='lcores[@cpus]<,lcores[@cpus]>' lcores, cpus could be a single digit/range or a group. '(' and ')' are necessary if it's a group. If not supply '@cpus', the value of cpus uses the same as lcores. e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means starting 9 EAL thread as below lcore 0 runs on cpuset 0x41 (cpu 0,6) lcore 1 runs on cpuset 0x2 (cpu 1) lcore 2 runs on cpuset 0xe0 (cpu 5,6,7) lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2) lcore 6 runs on cpuset 0x41 (cpu 0,6) lcore 7 runs on cpuset 0x80 (cpu 7) lcore 8 runs on cpuset 0x100 (cpu 8) Signed-off-by: Cunming Liang --- v5 changes: add more comments for eal_parse_set fix some typo remove inline prefix from convert_to_cpuset() fix a bug introduced on v2 which broke case '(0,6)' v2 changes: add '-' support for EAL option '--lcores' lib/librte_eal/common/eal_common_options.c | 304 - lib/librte_eal/common/eal_options.h| 2 + lib/librte_eal/linuxapp/eal/Makefile | 1 + 3 files changed, 303 insertions(+), 4 deletions(-) diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c index 67e02dc..178e303 100644 --- a/lib/librte_eal/common/eal_common_options.c +++ b/lib/librte_eal/common/eal_common_options.c @@ -45,6 +45,7 @@ #include #include #include +#include #include "eal_internal_cfg.h" #include "eal_options.h" @@ -85,6 +86,7 @@ eal_long_options[] = { {OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM}, {OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM}, {OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM}, + {OPT_LCORES, 1, 0, OPT_LCORES_NUM}, {0, 0, 0, 0} }; @@ -255,9 +257,11 @@ eal_parse_corelist(const char *corelist) if (min == RTE_MAX_LCORE) min = idx; for (idx = min; idx <= max; idx++) { - cfg->lcore_role[idx] = ROLE_RTE; - lcore_config[idx].core_index = count; - count++; + if (cfg->lcore_role[idx] != ROLE_RTE) { + cfg->lcore_role[idx] = ROLE_RTE; + lcore_config[idx].core_index = count; + 
count++; + } } min = RTE_MAX_LCORE; } else @@ -292,6 +296,283 @@ eal_parse_master_lcore(const char *arg) return 0; } +/* + * Parse elem, the elem could be single number/range or '(' ')' group + * 1) A single number elem, it's just a simple digit. e.g. 9 + * 2) A single range elem, two digits with a '-' between. e.g. 2-6 + * 3) A group elem, combines multiple 1) or 2) with '( )'. e.g (0,2-4,6) + *Within group elem, '-' used for a range separator; + * ',' used for a single number. + */ +static int +eal_parse_set(const char *input, uint16_t set[], unsigned num) +{ + unsigned idx; + const char *str = input; + char *end = NULL; + unsigned min, max; + + memset(set, 0, num * sizeof(uint16_t)); + + while (isblank(*str)) + str++; + + /* only digit or left bracket is qualify for start point */ + if ((!isdigit(*str) && *str != '(') || *str == '\0') + return -1; + + /* process single number or single range of number */ + if (*str != '(') { + errno = 0; + idx = strtoul(str, &end, 10); + if (errno || end == NULL || idx >= num) + return -1; + else { + while (isblank(*end)) + end++; + + min = idx; + max = idx; + if (*end == '-') { + /* process single - */ + end++; + while (isblank(*end)) + end++; + if (!isdigit(*end)) + return -1; + + errno = 0; + idx = strtoul(end, &end, 10); + if (errno || end == NULL || idx >= num) + return -1; + max = idx; + while (isblank(*end)) + end++; + if (*end != ',' && *end != '\0') + return -1; + } + + if (*end != ',' && *end != '\0' && + *end != '@') + return -1; + + for (idx =
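The cpuset values quoted in the '--lcores' example of this patch (0x41 for cpus 0,6; 0xe0 for cpus 5,6,7; 0x5 for cpus 0,2; 0x100 for cpu 8) are plain bitmasks with bit N set for cpu N. The helper below only checks that arithmetic; it is not part of the option parser, and cpus_to_mask() is a hypothetical name introduced for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Build a bitmask with bit N set for every cpu id N in cpus[].
 * Ids of 64 and above do not fit in the 64-bit mask and are ignored.
 */
uint64_t
cpus_to_mask(const unsigned *cpus, size_t n)
{
	uint64_t mask = 0;
	size_t i;

	assert(cpus != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (cpus[i] < 64)
			mask |= UINT64_C(1) << cpus[i];
	}

	return mask;
}
```

Reading the example this way, '(3-5)@(0,2)' starts lcores 3, 4 and 5, each allowed to run on the cpuset 0x5, and '(0,6)' without '@' uses the same list for lcores and cpus, which gives lcores 0 and 6 the cpuset 0x41.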
[dpdk-dev] [PATCH v7 04/19] eal: fix wrong strnlen() return value in 32bit icc
The problem is that strnlen() here may return an invalid value with 32-bit icc
(it actually returns its second parameter, e.g. sysconf(_SC_ARG_MAX)).
It starts to manifest when the max_len parameter is > 2M and icc is used with
-m32 -O2 (or above).

Suggested-by: Konstantin Ananyev
Signed-off-by: Cunming Liang
---
v7 changes:
  rollback to use strnlen

v5 changes:
  using strlen instead of strnlen.

 lib/librte_eal/common/eal_common_options.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 178e303..62d9120 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -52,6 +52,7 @@
 #include "eal_filesystem.h"
 
 #define BITS_PER_HEX 4
+#define EAL_ARG_MAX 4096
 
 const char
 eal_short_options[] =
@@ -167,7 +168,7 @@ eal_parse_coremask(const char *coremask)
 	if (coremask[0] == '0' && ((coremask[1] == 'x')
 		|| (coremask[1] == 'X')))
 		coremask += 2;
-	i = strnlen(coremask, PATH_MAX);
+	i = strnlen(coremask, EAL_ARG_MAX);
 	while ((i > 0) && isblank(coremask[i - 1]))
 		i--;
 	if (i == 0)
@@ -227,7 +228,7 @@ eal_parse_corelist(const char *corelist)
 	/* Remove all blank characters ahead and after */
 	while (isblank(*corelist))
 		corelist++;
-	i = strnlen(corelist, sysconf(_SC_ARG_MAX));
+	i = strnlen(corelist, EAL_ARG_MAX);
 	while ((i > 0) && isblank(corelist[i - 1]))
 		i--;
 
@@ -472,7 +473,7 @@ eal_parse_lcores(const char *lcores)
 	/* Remove all blank characters ahead and after */
 	while (isblank(*lcores))
 		lcores++;
-	i = strnlen(lcores, sysconf(_SC_ARG_MAX));
+	i = strnlen(lcores, EAL_ARG_MAX);
 	while ((i > 0) && isblank(lcores[i - 1]))
 		i--;
 
-- 
1.8.1.4
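The pattern used in all three hunks (bounded length, then strip trailing blanks) can be exercised on its own. The sketch below is an illustration, not the EAL code: trimmed_arg_len() is a hypothetical helper, EAL_ARG_MAX is copied from the patch, and the explicit rejection of inputs with no NUL inside the bound is an addition made here that the patch itself does not contain.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* strnlen() is POSIX.1-2008 */
#endif
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define EAL_ARG_MAX 4096   /* same bound as in the patch */

/*
 * Return the length of arg with trailing blanks removed, or -1 if no
 * NUL byte is found within the first EAL_ARG_MAX bytes (this rejection
 * is an assumption of the sketch, not behaviour taken from the patch).
 */
long
trimmed_arg_len(const char *arg)
{
	size_t i;

	assert(arg != NULL);

	i = strnlen(arg, EAL_ARG_MAX);
	if (i == EAL_ARG_MAX)
		return -1;

	while (i > 0 && isblank((unsigned char)arg[i - 1]))
		i--;

	return (long)i;
}
```

Note that strnlen() only limits how far the scan goes; the caller still has to hand in a buffer that is readable up to the terminator or up to the bound, whichever comes first.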
[dpdk-dev] [PATCH v7 05/19] eal: add public function parsing socket_id from cpuid
It defines eal_cpu_socket_id() which exposing the origin private cpu_socket_id(). The function is only used inside EAL. It returns socket_id of the specified cpu_id. Signed-off-by: Cunming Liang --- v7 changes: reword comments v5 changes: expose cpu_socket_id as eal_cpu_socket_id for linuxapp eal_cpuset_socket_id() remove static inline and move to c file lib/librte_eal/bsdapp/eal/eal_lcore.c | 7 +++ lib/librte_eal/common/eal_thread.h | 11 +++ lib/librte_eal/linuxapp/eal/eal_lcore.c | 7 --- 3 files changed, 22 insertions(+), 3 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/eal_lcore.c b/lib/librte_eal/bsdapp/eal/eal_lcore.c index 72f8ac2..162fb4f 100644 --- a/lib/librte_eal/bsdapp/eal/eal_lcore.c +++ b/lib/librte_eal/bsdapp/eal/eal_lcore.c @@ -41,6 +41,7 @@ #include #include "eal_private.h" +#include "eal_thread.h" /* No topology information available on FreeBSD including NUMA info */ #define cpu_core_id(X) 0 @@ -112,3 +113,9 @@ rte_eal_cpu_init(void) return 0; } + +unsigned +eal_cpu_socket_id(__rte_unused unsigned cpu_id) +{ + return cpu_socket_id(cpu_id); +} diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h index b53b84d..f1ce0bd 100644 --- a/lib/librte_eal/common/eal_thread.h +++ b/lib/librte_eal/common/eal_thread.h @@ -50,4 +50,15 @@ __attribute__((noreturn)) void *eal_thread_loop(void *arg); */ void eal_thread_init_master(unsigned lcore_id); +/** + * Get the NUMA socket id from cpu id. + * This function is private to EAL. + * + * @param cpu_id + * The logical process id. 
+ * @return + * socket_id or SOCKET_ID_ANY + */ +unsigned eal_cpu_socket_id(unsigned cpu_id); + #endif /* EAL_THREAD_H */ diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c b/lib/librte_eal/linuxapp/eal/eal_lcore.c index 29615f8..ef8c433 100644 --- a/lib/librte_eal/linuxapp/eal/eal_lcore.c +++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c @@ -45,6 +45,7 @@ #include "eal_private.h" #include "eal_filesystem.h" +#include "eal_thread.h" #define SYS_CPU_DIR "/sys/devices/system/cpu/cpu%u" #define CORE_ID_FILE "topology/core_id" @@ -71,8 +72,8 @@ cpu_detected(unsigned lcore_id) * Note: physical package id != NUMA node, but we use it as a * fallback for kernels which don't create a nodeY link */ -static unsigned -cpu_socket_id(unsigned lcore_id) +unsigned +eal_cpu_socket_id(unsigned lcore_id) { const char node_prefix[] = "node"; const size_t prefix_len = sizeof(node_prefix) - 1; @@ -174,7 +175,7 @@ rte_eal_cpu_init(void) /* By default, each detected core is enabled */ config->lcore_role[lcore_id] = ROLE_RTE; lcore_config[lcore_id].core_id = cpu_core_id(lcore_id); - lcore_config[lcore_id].socket_id = cpu_socket_id(lcore_id); + lcore_config[lcore_id].socket_id = eal_cpu_socket_id(lcore_id); if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES) #ifdef RTE_EAL_ALLOW_INV_SOCKET_ID lcore_config[lcore_id].socket_id = 0; -- 1.8.1.4
[dpdk-dev] [PATCH v7 06/19] eal: new TLS definition and API declaration
1. add two TLS *_socket_id* and *_cpuset* 2. add two internal API, eal_cpu_socket_id/eal_thread_dump_affinity 3. add two public API, rte_thread_set/get_affinity 4. update EAL version map for EAL public API Signed-off-by: Cunming Liang --- v7 changes: update version map for EAL public API and reword comments v5 changes: add comments for RTE_CPU_AFFINITY_STR_LEN update comments for eal_thread_dump_affinity() return void for rte_thread_get_affinity() move rte_socket_id() change to another patch lib/librte_eal/bsdapp/eal/eal_thread.c | 2 ++ lib/librte_eal/bsdapp/eal/rte_eal_version.map | 2 ++ lib/librte_eal/common/eal_thread.h | 36 + lib/librte_eal/common/include/rte_lcore.h | 26 +- lib/librte_eal/linuxapp/eal/eal_thread.c| 2 ++ lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 ++ 6 files changed, 69 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c index ab05368..10220c7 100644 --- a/lib/librte_eal/bsdapp/eal/eal_thread.c +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c @@ -56,6 +56,8 @@ #include "eal_thread.h" RTE_DEFINE_PER_LCORE(unsigned, _lcore_id); +RTE_DEFINE_PER_LCORE(unsigned, _socket_id); +RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset); /* * Send a message to a slave lcore identified by slave_id to call a diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map index d36286e..00b2ccd 100644 --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map @@ -82,6 +82,8 @@ DPDK_2.0 { rte_snprintf; rte_strerror; rte_strsplit; + rte_thread_get_affinity; + rte_thread_set_affinity; rte_vlog; rte_xen_dom0_memory_attach; rte_xen_dom0_memory_init; diff --git a/lib/librte_eal/common/eal_thread.h b/lib/librte_eal/common/eal_thread.h index f1ce0bd..e4e76b9 100644 --- a/lib/librte_eal/common/eal_thread.h +++ b/lib/librte_eal/common/eal_thread.h @@ -34,6 +34,8 @@ #ifndef EAL_THREAD_H #define EAL_THREAD_H +#include + 
/** * basic loop of thread, called for each thread by eal_init(). * @@ -61,4 +63,38 @@ void eal_thread_init_master(unsigned lcore_id); */ unsigned eal_cpu_socket_id(unsigned cpu_id); +/** + * Get the NUMA socket id from cpuset. + * This function is private to EAL. + * + * @param cpusetp + * The point to a valid cpu set. + * @return + * socket_id or SOCKET_ID_ANY + */ +int eal_cpuset_socket_id(rte_cpuset_t *cpusetp); + +/** + * Default buffer size to use with eal_thread_dump_affinity() + */ +#define RTE_CPU_AFFINITY_STR_LEN256 + +/** + * Dump the current pthread cpuset. + * This function is private to EAL. + * + * Note: + * If the dump size is greater than the size of given buffer, + * the string will be truncated and with '\0' at the end. + * + * @param str + * The string buffer the cpuset will dump to. + * @param size + * The string buffer size. + * @return + * 0 for success, -1 if truncation happens. + */ +int +eal_thread_dump_affinity(char *str, unsigned size); + #endif /* EAL_THREAD_H */ diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h index 4c7d6bb..33f558e 100644 --- a/lib/librte_eal/common/include/rte_lcore.h +++ b/lib/librte_eal/common/include/rte_lcore.h @@ -80,7 +80,9 @@ struct lcore_config { */ extern struct lcore_config lcore_config[RTE_MAX_LCORE]; -RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */ +RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per thread "lcore id". */ +RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". */ +RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */ /** * Return the ID of the execution unit we are running on. @@ -229,6 +231,28 @@ rte_get_next_lcore(unsigned i, int skip_master, int wrap) i
[dpdk-dev] [PATCH v7 07/19] eal: add eal_common_thread.c for common thread API
The API works for both EAL thread and none EAL thread. When calling rte_thread_set_affinity, the *_socket_id* and *_cpuset* of calling thread will be updated if the thread successful set the cpu affinity. Signed-off-by: Cunming Liang --- v5 changes: refine code of rte_thread_set_affinity() change rte_thread_get_affinity() return to void lib/librte_eal/bsdapp/eal/Makefile| 1 + lib/librte_eal/common/eal_common_thread.c | 150 ++ lib/librte_eal/linuxapp/eal/Makefile | 2 + 3 files changed, 153 insertions(+) create mode 100644 lib/librte_eal/common/eal_common_thread.c diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile index ae214a4..2357cfa 100644 --- a/lib/librte_eal/bsdapp/eal/Makefile +++ b/lib/librte_eal/bsdapp/eal/Makefile @@ -77,6 +77,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_hexdump.c SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_devargs.c SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_dev.c SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_options.c +SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c CFLAGS_eal.o := -D_GNU_SOURCE #CFLAGS_eal_thread.o := -D_GNU_SOURCE diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c new file mode 100644 index 000..f4d9892 --- /dev/null +++ b/lib/librte_eal/common/eal_common_thread.c @@ -0,0 +1,150 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "eal_thread.h" + +int eal_cpuset_socket_id(rte_cpuset_t *cpusetp) +{ + unsigned cpu = 0; + int socket_id = SOCKET_ID_ANY; + int sid; + + if (cpusetp == NULL) + return SOCKET_ID_ANY; + + do { + if (!CPU_ISSET(cpu, cpusetp)) + continue; + + if (socket_id == SOCKET_ID_ANY) + socket_id = eal_cpu_socket_id(cpu); + + sid = eal_cpu_socket_id(cpu); + if (socket_id != sid) { + socket_id = SOCKET_ID_ANY; + break; + } + + } while (++cpu < RTE_MAX_LCORE); + + return socket_id; +} + +int +rte_thread_set_affinity(rte_cpuset_t *cpusetp) +{ + int s; + unsigned lcore_id; + pthread_t tid; + + tid = pthread_self(); + + s = pthread_setaffinity_np(tid, sizeof(rte_cpuset_t), cpusetp); + if (s != 0) { + RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n"); + return -1; + } + + /* store socket_id in TLS for quick access */ + RTE_PER_LCORE(_socket_id) = + eal_cpuset_socket_id(cpusetp); + + /* store cpuset in TLS for quick access */ + memmove(&RTE_PER_LCORE(_cpuset), cpusetp, + sizeof(rte_cpuset_t)); + + lcore_id = rte_lcore_id(); + if (lcore_id != (unsigned)LCORE_ID_ANY) { + /* EAL thread will update lcore_config */ + lcore_config[lcore_id].socket_id = RTE_PER_LCORE(_socket_id); + memmove(&lcore_config[lcore_id].cpuset, cpusetp, + sizeof(rte_cpuset_t)); + } + + return 0; +
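The rule implemented by eal_cpuset_socket_id() above (a cpuset maps to a socket only when every CPU in it sits on the same socket) can be sketched without any EAL dependency. This is only an illustration under stated assumptions: the bitmask argument, MAX_CPUS and the cpu_to_socket table are hypothetical stand-ins for rte_cpuset_t, RTE_MAX_LCORE and eal_cpu_socket_id().

```c
#include <stdint.h>

#define SOCKET_ID_ANY -1
#define MAX_CPUS 8 /* hypothetical, stands in for RTE_MAX_LCORE */

/* Hypothetical topology: CPUs 0-3 on socket 0, CPUs 4-7 on socket 1. */
static const int cpu_to_socket[MAX_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Bit n of cpu_mask is set when CPU n belongs to the set. */
int
cpuset_socket_id(uint32_t cpu_mask)
{
	int socket_id = SOCKET_ID_ANY;
	unsigned cpu;

	for (cpu = 0; cpu < MAX_CPUS; cpu++) {
		if (!(cpu_mask & (UINT32_C(1) << cpu)))
			continue;
		if (socket_id == SOCKET_ID_ANY)
			socket_id = cpu_to_socket[cpu]; /* first CPU found */
		else if (socket_id != cpu_to_socket[cpu])
			return SOCKET_ID_ANY; /* CPUs span several sockets */
	}
	return socket_id; /* still SOCKET_ID_ANY for an empty set */
}
```

With this table, a mask covering CPUs 0 and 1 resolves to socket 0, while a mask covering CPUs 3 and 4 resolves to SOCKET_ID_ANY.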
[dpdk-dev] [PATCH v7 08/19] eal: standardize init sequence between linux and bsd
Signed-off-by: Cunming Liang
---
 lib/librte_eal/bsdapp/eal/eal.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 69f3c03..cb11b5c 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -509,6 +509,8 @@ rte_eal_init(int argc, char **argv)

 	rte_eal_mcfg_complete();

+	eal_thread_init_master(rte_config.master_lcore);
+
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");

@@ -532,8 +534,6 @@ rte_eal_init(int argc, char **argv)
 			rte_panic("Cannot create thread\n");
 	}

-	eal_thread_init_master(rte_config.master_lcore);
-
 	/*
 	 * Launch a dummy function on all slave lcores, so that master lcore
 	 * knows they are all ready when this function returns.
-- 
1.8.1.4
[dpdk-dev] [PATCH v7 09/19] eal: add rte_gettid() to acquire unique system tid
The rte_gettid() wraps the linux and freebsd syscall gettid(). It provides a persistent unique thread id for the calling thread. It will save the unique id in TLS on the first time. Signed-off-by: Cunming Liang --- lib/librte_eal/bsdapp/eal/eal_thread.c | 9 + lib/librte_eal/common/include/rte_eal.h | 27 +++ lib/librte_eal/linuxapp/eal/eal_thread.c | 7 +++ 3 files changed, 43 insertions(+) diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c index 10220c7..d0c077b 100644 --- a/lib/librte_eal/bsdapp/eal/eal_thread.c +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c @@ -39,6 +39,7 @@ #include #include #include +#include #include #include @@ -233,3 +234,11 @@ eal_thread_loop(__attribute__((unused)) void *arg) /* pthread_exit(NULL); */ /* return NULL; */ } + +/* require calling thread tid by gettid() */ +int rte_sys_gettid(void) +{ + long lwpid; + thr_self(&lwpid); + return (int)lwpid; +} diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h index f4ecd2e..8ccdd65 100644 --- a/lib/librte_eal/common/include/rte_eal.h +++ b/lib/librte_eal/common/include/rte_eal.h @@ -41,6 +41,9 @@ */ #include +#include + +#include #ifdef __cplusplus extern "C" { @@ -262,6 +265,30 @@ rte_set_application_usage_hook( rte_usage_hook_t usage_func ); */ int rte_eal_has_hugepages(void); +/** + * A wrap API for syscall gettid. + * + * @return + * On success, returns the thread ID of calling process. + * It always successful. + */ +int rte_sys_gettid(void); + +/** + * Get system unique thread id. + * + * @return + * On success, returns the thread ID of calling process. + * It always successful. 
+ */ +static inline int rte_gettid(void) +{ + static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1; + if (RTE_PER_LCORE(_thread_id) == -1) + RTE_PER_LCORE(_thread_id) = rte_sys_gettid(); + return RTE_PER_LCORE(_thread_id); +} + #ifdef __cplusplus } #endif diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c index 748a83a..ed20c93 100644 --- a/lib/librte_eal/linuxapp/eal/eal_thread.c +++ b/lib/librte_eal/linuxapp/eal/eal_thread.c @@ -39,6 +39,7 @@ #include #include #include +#include #include #include @@ -233,3 +234,9 @@ eal_thread_loop(__attribute__((unused)) void *arg) /* pthread_exit(NULL); */ /* return NULL; */ } + +/* require calling thread tid by gettid() */ +int rte_sys_gettid(void) +{ + return (int)syscall(SYS_gettid); +} -- 1.8.1.4
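The caching idea behind rte_gettid() (ask the kernel once, then serve the id from thread-local storage) can be tried outside DPDK with the Linux-only sketch below. The name cached_gettid() is hypothetical, and the GCC/Clang __thread keyword is used here in place of the RTE_DEFINE_PER_LCORE macro, which is an assumption made to keep the example standalone.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() */
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical stand-in for rte_gettid(); Linux only. */
int
cached_gettid(void)
{
	static __thread int tid = -1; /* one copy per thread */

	if (tid == -1)
		tid = (int)syscall(SYS_gettid); /* first call in this thread */
	return tid;
}
```

Only the first call in each thread pays for the system call; later calls are a TLS read, which matters for callers on a hot path.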
[dpdk-dev] [PATCH v7 10/19] eal: apply affinity of EAL thread by assigned cpuset
EAL threads use assigned cpuset to set core affinity during startup. It keeps 1:1 mapping, if no '--lcores' option is used. Signed-off-by: Cunming Liang --- v5 changes: add return check for dump_affinity call rte_thread_set_affinity() directly during EAL thread set lib/librte_eal/bsdapp/eal/eal.c | 10 +++-- lib/librte_eal/bsdapp/eal/eal_thread.c| 64 ++ lib/librte_eal/common/include/rte_lcore.h | 2 +- lib/librte_eal/linuxapp/eal/eal.c | 8 +++- lib/librte_eal/linuxapp/eal/eal_thread.c | 66 ++- 5 files changed, 37 insertions(+), 113 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c index cb11b5c..b66f6c6 100644 --- a/lib/librte_eal/bsdapp/eal/eal.c +++ b/lib/librte_eal/bsdapp/eal/eal.c @@ -432,6 +432,7 @@ rte_eal_init(int argc, char **argv) int i, fctret, ret; pthread_t thread_id; static rte_atomic32_t run_once = RTE_ATOMIC32_INIT(0); + char cpuset[RTE_CPU_AFFINITY_STR_LEN]; if (!rte_atomic32_test_and_set(&run_once)) return -1; @@ -502,15 +503,18 @@ rte_eal_init(int argc, char **argv) if (rte_eal_pci_init() < 0) rte_panic("Cannot init PCI\n"); - RTE_LOG(DEBUG, EAL, "Master core %u is ready (tid=%p)\n", - rte_config.master_lcore, thread_id); - eal_check_mem_on_local_socket(); rte_eal_mcfg_complete(); eal_thread_init_master(rte_config.master_lcore); + ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN); + + RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%p;cpuset=[%s%s])\n", + rte_config.master_lcore, thread_id, cpuset, + ret == 0 ? 
"" : "..."); + if (rte_eal_dev_init() < 0) rte_panic("Cannot init pmd devices\n"); diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c index d0c077b..e16f685 100644 --- a/lib/librte_eal/bsdapp/eal/eal_thread.c +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c @@ -101,58 +101,13 @@ rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned slave_id) static int eal_thread_set_affinity(void) { - int s; - pthread_t thread; + unsigned lcore_id = rte_lcore_id(); -/* - * According to the section VERSIONS of the CPU_ALLOC man page: - * - * The CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET() macros were added - * in glibc 2.3.3. - * - * CPU_COUNT() first appeared in glibc 2.6. - * - * CPU_AND(), CPU_OR(), CPU_XOR(),CPU_EQUAL(),CPU_ALLOC(), - * CPU_ALLOC_SIZE(), CPU_FREE(), CPU_ZERO_S(), CPU_SET_S(), CPU_CLR_S(), - * CPU_ISSET_S(), CPU_AND_S(), CPU_OR_S(), CPU_XOR_S(), and CPU_EQUAL_S() - * first appeared in glibc 2.7. - */ -#if defined(CPU_ALLOC) - size_t size; - cpu_set_t *cpusetp; - - cpusetp = CPU_ALLOC(RTE_MAX_LCORE); - if (cpusetp == NULL) { - RTE_LOG(ERR, EAL, "CPU_ALLOC failed\n"); - return -1; - } - - size = CPU_ALLOC_SIZE(RTE_MAX_LCORE); - CPU_ZERO_S(size, cpusetp); - CPU_SET_S(rte_lcore_id(), size, cpusetp); + /* acquire system unique id */ + rte_gettid(); - thread = pthread_self(); - s = pthread_setaffinity_np(thread, size, cpusetp); - if (s != 0) { - RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n"); - CPU_FREE(cpusetp); - return -1; - } - - CPU_FREE(cpusetp); -#else /* CPU_ALLOC */ - cpuset_t cpuset; - CPU_ZERO( &cpuset ); - CPU_SET( rte_lcore_id(), &cpuset ); - - thread = pthread_self(); - s = pthread_setaffinity_np(thread, sizeof( cpuset ), &cpuset); - if (s != 0) { - RTE_LOG(ERR, EAL, "pthread_setaffinity_np failed\n"); - return -1; - } -#endif - return 0; + /* update EAL thread core affinity */ + return rte_thread_set_affinity(&lcore_config[lcore_id].cpuset); } void eal_thread_init_master(unsigned lcore_id) @@ 
-174,6 +129,7 @@ eal_thread_loop(__attribute__((unused)) void *arg) unsigned lcore_id; pthread_t thread_id; int m2s, s2m; + char cpuset[RTE_CPU_AFFINITY_STR_LEN]; thread_id = pthread_self(); @@ -185,9 +141,6 @@ eal_thread_loop(__attribute__((unused)) void *arg) if (lcore_id == RTE_MAX_LCORE) rte_panic("cannot retrieve lcore id\n"); - RTE_LOG(DEBUG, EAL, "Core %u is ready (tid=%p)\n", - lcore_id, thread_id); - m2s = lcore_config[lcore_id].pipe_master2slave[0]; s2m = lcore_config[lcore_id].pipe_slave2master[1]; @@ -198,6 +151,11 @@ eal_thread_loop(__attribute__((unused)) void *arg) if (eal_thread_set_affinity() < 0) rte_panic("cannot set affinity\n"); + ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN); + + RTE_LOG(DEBUG, EAL, "lcore %u is ready
[dpdk-dev] [PATCH v7 11/19] enic: fix re-define freebsd compile complain
Some macro already been defined by freebsd 'sys/param.h'. Signed-off-by: Cunming Liang --- v5 changes: rename the redefined MACRO instead of undefine them lib/librte_pmd_enic/enic.h | 4 ++-- lib/librte_pmd_enic/enic_compat.h | 2 +- lib/librte_pmd_enic/vnic/vnic_dev.c | 6 +++--- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h index c43417c..57b9c80 100644 --- a/lib/librte_pmd_enic/enic.h +++ b/lib/librte_pmd_enic/enic.h @@ -66,9 +66,9 @@ #define ENIC_CALC_IP_CKSUM 1 #define ENIC_CALC_TCP_UDP_CKSUM 2 #define ENIC_MAX_MTU9000 -#define PAGE_SIZE 4096 +#define ENIC_PAGE_SIZE 4096 #define PAGE_ROUND_UP(x) \ - unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1))) + unsigned long)(x)) + ENIC_PAGE_SIZE-1) & (~(ENIC_PAGE_SIZE-1))) #define ENICPMD_VFIO_PATH "/dev/vfio/vfio" /*#define ENIC_DESC_COUNT_MAKE_ODD (x) do{if ((~(x)) & 1) { (x)--; } }while(0)*/ diff --git a/lib/librte_pmd_enic/enic_compat.h b/lib/librte_pmd_enic/enic_compat.h index b1af838..40c9b44 100644 --- a/lib/librte_pmd_enic/enic_compat.h +++ b/lib/librte_pmd_enic/enic_compat.h @@ -67,7 +67,7 @@ #define pr_warn(y, args...) 
dev_warning(0, y, ##args) #define BUG() pr_err("BUG at %s:%d", __func__, __LINE__) -#define ALIGN(x, a) __ALIGN_MASK(x, (typeof(x))(a)-1) +#define VNIC_ALIGN(x, a) __ALIGN_MASK(x, (typeof(x))(a)-1) #define __ALIGN_MASK(x, mask)(((x)+(mask))&~(mask)) #define udelay usleep #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d)) diff --git a/lib/librte_pmd_enic/vnic/vnic_dev.c b/lib/librte_pmd_enic/vnic/vnic_dev.c index 6407994..38b7f25 100644 --- a/lib/librte_pmd_enic/vnic/vnic_dev.c +++ b/lib/librte_pmd_enic/vnic/vnic_dev.c @@ -242,9 +242,9 @@ unsigned int vnic_dev_desc_ring_size(struct vnic_dev_ring *ring, if (desc_count == 0) desc_count = 4096; - ring->desc_count = ALIGN(desc_count, count_align); + ring->desc_count = VNIC_ALIGN(desc_count, count_align); - ring->desc_size = ALIGN(desc_size, desc_align); + ring->desc_size = VNIC_ALIGN(desc_size, desc_align); ring->size = ring->desc_count * ring->desc_size; ring->size_unaligned = ring->size + ring->base_align; @@ -294,7 +294,7 @@ int vnic_dev_alloc_desc_ring(__attribute__((unused)) struct vnic_dev *vdev, ring->base_addr_unaligned = (dma_addr_t)rz->phys_addr; - ring->base_addr = ALIGN(ring->base_addr_unaligned, + ring->base_addr = VNIC_ALIGN(ring->base_addr_unaligned, ring->base_align); ring->descs = (u8 *)ring->descs_unaligned + (ring->base_addr - ring->base_addr_unaligned); -- 1.8.1.4
[dpdk-dev] [PATCH v7 12/19] malloc: fix the issue of SOCKET_ID_ANY
Add a check on the return value of rte_socket_id() to avoid an unexpected
value such as (-1).

With rte_malloc_socket(), the socket id is given by socket_arg. If socket_arg
is set to SOCKET_ID_ANY, the allocator is expected to use the socket id to
which the current core belongs. As a thread may now be affinitized to a
cpuset, the cores in that cpuset may belong to different NUMA nodes. In that
case the value of _socket_id may be SOCKET_ID_ANY (-1), which the original
malloc_get_numa_socket() does not expect.

Signed-off-by: Cunming Liang
---
v7 changes:
  reword comments

 lib/librte_malloc/malloc_heap.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_malloc/malloc_heap.h b/lib/librte_malloc/malloc_heap.h
index b4aec45..a47136d 100644
--- a/lib/librte_malloc/malloc_heap.h
+++ b/lib/librte_malloc/malloc_heap.h
@@ -44,7 +44,12 @@ extern "C" {
 static inline unsigned
 malloc_get_numa_socket(void)
 {
-	return rte_socket_id();
+	unsigned socket_id = rte_socket_id();
+
+	if (socket_id == (unsigned)SOCKET_ID_ANY)
+		return 0;
+
+	return socket_id;
 }

 void *
-- 
1.8.1.4
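The resolution order described in the commit message can be written as one small function. This is a hedged sketch, not the librte_malloc code: resolve_socket() and its two parameters are hypothetical names, with thread_socket standing in for what rte_socket_id() would return.

```c
#define SOCKET_ID_ANY -1

/*
 * Hypothetical helper showing the order of decisions:
 *   1. an explicit socket_arg wins;
 *   2. otherwise use the socket of the calling thread;
 *   3. if the thread has no single socket, fall back to node 0.
 */
unsigned
resolve_socket(int socket_arg, int thread_socket)
{
	if (socket_arg != SOCKET_ID_ANY)
		return (unsigned)socket_arg;
	if (thread_socket == SOCKET_ID_ANY)
		return 0; /* cpuset spans nodes, or socket not yet assigned */
	return (unsigned)thread_socket;
}
```

For example, resolve_socket(SOCKET_ID_ANY, SOCKET_ID_ANY) yields 0, which is the case the patch adds.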
[dpdk-dev] [PATCH v7 13/19] log: fix the gap to support non-EAL thread
For a non-EAL thread, *_lcore_id* is invalid and is probably larger than
RTE_MAX_LCORE. The patch adds a check so that only EAL threads use the EAL
per-thread log level and log type. Other threads share the global log level.

Signed-off-by: Cunming Liang
---
 lib/librte_eal/common/eal_common_log.c  | 17 +++--
 lib/librte_eal/common/include/rte_log.h |  5 +
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
index cf57619..e8dc94a 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -193,11 +193,20 @@ rte_set_log_type(uint32_t type, int enable)
 		rte_logs.type &= (~type);
 }

+/* Get global log type */
+uint32_t
+rte_get_log_type(void)
+{
+	return rte_logs.type;
+}
+
 /* get the current loglevel for the message beeing processed */
 int rte_log_cur_msg_loglevel(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_level();
 	return log_cur_msg[lcore_id].loglevel;
 }

@@ -206,6 +215,8 @@ int rte_log_cur_msg_logtype(void)
 {
 	unsigned lcore_id;
 	lcore_id = rte_lcore_id();
+	if (lcore_id >= RTE_MAX_LCORE)
+		return rte_get_log_type();
 	return log_cur_msg[lcore_id].logtype;
 }

@@ -265,8 +276,10 @@ rte_vlog(__attribute__((unused)) uint32_t level,

 	/* save loglevel and logtype in a global per-lcore variable */
 	lcore_id = rte_lcore_id();
-	log_cur_msg[lcore_id].loglevel = level;
-	log_cur_msg[lcore_id].logtype = logtype;
+	if (lcore_id < RTE_MAX_LCORE) {
+		log_cur_msg[lcore_id].loglevel = level;
+		log_cur_msg[lcore_id].logtype = logtype;
+	}

 	ret = vfprintf(f, format, ap);
 	fflush(f);
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index db1ea08..f83a0d9 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -144,6 +144,11 @@ uint32_t rte_get_log_level(void);
 void rte_set_log_type(uint32_t type, int enable);

 /**
+ * Get the global log type.
+ */
+uint32_t rte_get_log_type(void);
+
+/**
  * Get the current loglevel for the message being processed.
  *
  * Before calling the user-defined stream for logging, the log
-- 
1.8.1.4
[dpdk-dev] [PATCH v7 14/19] eal: set _lcore_id and _socket_id to (-1) by default
For a non-EAL thread, *_lcore_id* shall always be LCORE_ID_ANY.
Libraries that use *_lcore_id* as an index need to take care of this.
*_socket_id* is always SOCKET_ID_ANY until the thread changes its
affinity with rte_thread_set_affinity().

Signed-off-by: Cunming Liang
---
v5 changes:
  define LCORE_ID_ANY as UINT32_MAX

 lib/librte_eal/bsdapp/eal/eal_thread.c    | 4 ++--
 lib/librte_eal/common/include/rte_lcore.h | 4 ++--
 lib/librte_eal/linuxapp/eal/eal_thread.c  | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index e16f685..ca95c72 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -56,8 +56,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"

-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);

 /*
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index 6a5bcbc..ad47221 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -48,7 +48,7 @@
 extern "C" {
 #endif

-#define LCORE_ID_ANY -1    /**< Any lcore. */
+#define LCORE_ID_ANY UINT32_MAX       /**< Any lcore. */

 #if defined(__linux__)
 	typedef cpu_set_t rte_cpuset_t;
@@ -87,7 +87,7 @@ RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset);  /**< Per thread "cpuset". */
 /**
  * Return the ID of the execution unit we are running on.
  * @return
- *  Logical core ID
+ *  Logical core ID(in EAL thread) or LCORE_ID_ANY(in non-EAL thread)
  */
 static inline unsigned
 rte_lcore_id(void)
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c b/lib/librte_eal/linuxapp/eal/eal_thread.c
index 57b0515..5635c7d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -56,8 +56,8 @@
 #include "eal_private.h"
 #include "eal_thread.h"

-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id);
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id);
+RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
 RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);

 /*
-- 
1.8.1.4
[dpdk-dev] [PATCH v7 15/19] eal: fix recursive spinlock in non-EAL thread
In a non-EAL thread, lcore_id is always LCORE_ID_ANY, so it cannot be used
as a unique id for the recursive spinlock. Use rte_gettid() instead.

Signed-off-by: Cunming Liang
---
 lib/librte_eal/common/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/generic/rte_spinlock.h b/lib/librte_eal/common/include/generic/rte_spinlock.h
index dea885c..c7fb0df 100644
--- a/lib/librte_eal/common/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/common/include/generic/rte_spinlock.h
@@ -179,7 +179,7 @@ static inline void rte_spinlock_recursive_init(rte_spinlock_recursive_t *slr)
  */
 static inline void rte_spinlock_recursive_lock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();

 	if (slr->user != id) {
 		rte_spinlock_lock(&slr->sl);
@@ -212,7 +212,7 @@ static inline void rte_spinlock_recursive_unlock(rte_spinlock_recursive_t *slr)
  */
 static inline int rte_spinlock_recursive_trylock(rte_spinlock_recursive_t *slr)
 {
-	int id = rte_lcore_id();
+	int id = rte_gettid();

 	if (slr->user != id) {
 		if (rte_spinlock_trylock(&slr->sl) == 0)
-- 
1.8.1.4
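To make it clearer why the owner id has to be unique per thread, here is a minimal recursive spinlock built on C11 atomics. It is a sketch under stated assumptions, not the rte_spinlock code: the type and function names are hypothetical, and the caller passes its own thread id (what rte_gettid() would supply) so that the example stays portable.

```c
#include <stdatomic.h>

/* Hypothetical recursive spinlock keyed by a caller-supplied thread id. */
typedef struct {
	atomic_flag locked;
	_Atomic int owner; /* -1 while nobody holds the lock */
	int count;         /* recursion depth, touched only by the owner */
} rec_spinlock_t;

#define REC_SPINLOCK_INIT { ATOMIC_FLAG_INIT, -1, 0 }

void
rec_lock(rec_spinlock_t *l, int tid)
{
	if (atomic_load(&l->owner) != tid) {
		while (atomic_flag_test_and_set_explicit(&l->locked,
				memory_order_acquire))
			; /* spin until the current owner releases */
		atomic_store(&l->owner, tid);
	}
	l->count++;
}

/* Returns 1 when the lock was taken (or re-entered), 0 otherwise. */
int
rec_trylock(rec_spinlock_t *l, int tid)
{
	if (atomic_load(&l->owner) != tid) {
		if (atomic_flag_test_and_set_explicit(&l->locked,
				memory_order_acquire))
			return 0;
		atomic_store(&l->owner, tid);
	}
	l->count++;
	return 1;
}

void
rec_unlock(rec_spinlock_t *l)
{
	if (--l->count == 0) {
		atomic_store(&l->owner, -1);
		atomic_flag_clear_explicit(&l->locked, memory_order_release);
	}
}
```

If two different threads passed the same id, as every non-EAL thread would when the id is LCORE_ID_ANY, the second one would skip the spin and enter the critical section while the first still holds the lock. A per-thread id such as the one from rte_gettid() avoids that.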
[dpdk-dev] [PATCH v7 16/19] mempool: add support to non-EAL thread
For a non-EAL thread, bypass the per-lcore cache and use the ring pool
directly. This allows rte_mempool to be used in either an EAL thread or any
user pthread. A non-EAL thread relies directly on rte_ring, which is
non-preemptive. Running multiple pthreads per cpu that compete for the same
rte_mempool is therefore not recommended: it performs badly and carries a
critical risk if the scheduling policy is RT.
mempool_perf_test shows no significant performance decrease.

Signed-off-by: Cunming Liang
---
v6 changes:
  rollback v5 changes

v5 changes:
  check __lcore_id with LCORE_ID_ANY instead of RTE_MAX_LCORE

 lib/librte_mempool/rte_mempool.h | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 3314651..4845f27 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -198,10 +198,12 @@ struct rte_mempool {
  *   Number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
-		unsigned __lcore_id = rte_lcore_id();		\
-		mp->stats[__lcore_id].name##_objs += n;		\
-		mp->stats[__lcore_id].name##_bulk += 1;		\
+#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
+		unsigned __lcore_id = rte_lcore_id();		\
+		if (__lcore_id < RTE_MAX_LCORE) {		\
+			mp->stats[__lcore_id].name##_objs += n;	\
+			mp->stats[__lcore_id].name##_bulk += 1;	\
+		}						\
 	} while(0)
 #else
 #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
@@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
 	__MEMPOOL_STAT_ADD(mp, put, n);

 #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
-	/* cache is not enabled or single producer */
-	if (unlikely(cache_size == 0 || is_mp == 0))
+	/* cache is not enabled or single producer or none EAL thread */
+	if (unlikely(cache_size == 0 || is_mp == 0 ||
+		     lcore_id >= RTE_MAX_LCORE))
 		goto ring_enqueue;

 	/* Go straight to ring if put would overflow mem allocated for cache */
@@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
 	uint32_t cache_size = mp->cache_size;

 	/* cache is not enabled or single consumer */
-	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
+	if (unlikely(cache_size == 0 || is_mc == 0 ||
+		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
 		goto ring_dequeue;

 	cache = &mp->local_cache[lcore_id];
-- 
1.8.1.4
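The decision taken on the get path can be pulled out into a plain predicate, which may help when reasoning about which threads ever touch local_cache[]. This is only a sketch: can_use_local_cache() and MAX_LCORE are hypothetical names, with MAX_LCORE standing in for RTE_MAX_LCORE, and the conditions mirror the ones in the __mempool_get_bulk() hunk above.

```c
#define MAX_LCORE 128 /* hypothetical, stands in for RTE_MAX_LCORE */

/*
 * Hypothetical predicate for the get path: returns 1 when the per-lcore
 * cache may be used, 0 when the request must go straight to the ring.
 */
int
can_use_local_cache(unsigned cache_size, int is_mc, unsigned n,
		    unsigned lcore_id)
{
	if (cache_size == 0 || !is_mc)
		return 0; /* cache disabled or single consumer */
	if (n >= cache_size)
		return 0; /* request too large for the cache */
	if (lcore_id >= MAX_LCORE)
		return 0; /* non-EAL thread: it owns no cache slot */
	return 1;
}
```

The last check is the one the patch adds. Without it, a non-EAL thread whose lcore id is LCORE_ID_ANY would index local_cache[] far out of bounds.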
[dpdk-dev] [PATCH v7 17/19] ring: add support to non-EAL thread
The ring debug statistics do not account for non-EAL threads.

Signed-off-by: Cunming Liang
---
v6 changes:
  rollback v5 changes

v5 changes:
  check __lcore_id with LCORE_ID_ANY instead of RTE_MAX_LCORE

 lib/librte_ring/rte_ring.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7cd5f2d..39bacdd 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -188,10 +188,12 @@ struct rte_ring {
  *   The number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_RING_DEBUG
-#define __RING_STAT_ADD(r, name, n) do {		\
-		unsigned __lcore_id = rte_lcore_id();	\
-		r->stats[__lcore_id].name##_objs += n;	\
-		r->stats[__lcore_id].name##_bulk += 1;	\
+#define __RING_STAT_ADD(r, name, n) do {			\
+		unsigned __lcore_id = rte_lcore_id();		\
+		if (__lcore_id < RTE_MAX_LCORE) {		\
+			r->stats[__lcore_id].name##_objs += n;	\
+			r->stats[__lcore_id].name##_bulk += 1;	\
+		}						\
 	} while(0)
 #else
 #define __RING_STAT_ADD(r, name, n) do {} while(0)
-- 
1.8.1.4
[dpdk-dev] [PATCH v7 18/19] ring: add sched_yield to avoid spin forever
Add a sched_yield() syscall if the thread spins for too long while waiting
for another thread to finish its operations on the ring. That gives a
pre-empted thread a chance to proceed and finish its ring enqueue/dequeue
operation. The purpose is to reduce contention on the ring.
ring_perf_test shows no additional performance penalty.

Signed-off-by: Cunming Liang
---
 v6 changes:
   rename RTE_RING_PAUSE_REP to RTE_RING_PAUSE_REP_COUNT
   set default value as '0' in configure file

 v5 changes:
   add RTE_RING_PAUSE_REP to config file

 v4 changes:
   update and add more comments on sched_yield()

 v3 changes:
   new patch adding sched_yield() in rte_ring to avoid long spin

 config/common_bsdapp       |  1 +
 config/common_linuxapp     |  1 +
 lib/librte_ring/rte_ring.h | 31 +++
 3 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 57bacb8..b9a9eeb 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -234,6 +234,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
 CONFIG_RTE_LIBRTE_RING=y
 CONFIG_RTE_LIBRTE_RING_DEBUG=n
 CONFIG_RTE_RING_SPLIT_PROD_CONS=n
+CONFIG_RTE_RING_PAUSE_REP_COUNT=0

 #
 # Compile librte_mempool
diff --git a/config/common_linuxapp b/config/common_linuxapp
index d428f84..abca5ff 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -242,6 +242,7 @@ CONFIG_RTE_PMD_PACKET_PREFETCH=y
 CONFIG_RTE_LIBRTE_RING=y
 CONFIG_RTE_LIBRTE_RING_DEBUG=n
 CONFIG_RTE_RING_SPLIT_PROD_CONS=n
+CONFIG_RTE_RING_PAUSE_REP_COUNT=0

 #
 # Compile librte_mempool
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 39bacdd..9bc1d5e 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -127,6 +127,11 @@ struct rte_ring_debug_stats {
 #define RTE_RING_NAMESIZE 32 /**< The maximum length of a ring name. */
 #define RTE_RING_MZ_PREFIX "RG_"

+#ifndef RTE_RING_PAUSE_REP_COUNT
+#define RTE_RING_PAUSE_REP_COUNT 0 /**< yield after pause num of times, no yield
+                                    *   if RTE_RING_PAUSE_REP not defined. */
+#endif
+
 /**
  * An RTE ring structure.
  *
@@ -410,7 +415,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t cons_tail, free_entries;
 	const unsigned max = n;
 	int success;
-	unsigned i;
+	unsigned i, rep = 0;
 	uint32_t mask = r->prod.mask;
 	int ret;

@@ -468,9 +473,18 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->prod.tail != prod_head))
+	while (unlikely(r->prod.tail != prod_head)) {
 		rte_pause();

+		/* Set RTE_RING_PAUSE_REP_COUNT to avoid spin too long waiting
+		 * for other thread finish. It gives pre-empted thread a chance
+		 * to proceed and finish with ring denqnue operation. */
+		if (RTE_RING_PAUSE_REP_COUNT &&
+		    ++rep == RTE_RING_PAUSE_REP_COUNT) {
+			rep = 0;
+			sched_yield();
+		}
+	}
 	r->prod.tail = prod_next;
 	return ret;
 }
@@ -589,7 +603,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_next, entries;
 	const unsigned max = n;
 	int success;
-	unsigned i;
+	unsigned i, rep = 0;
 	uint32_t mask = r->prod.mask;

 	/* move cons.head atomically */
@@ -634,9 +648,18 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * If there are other dequeues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->cons.tail != cons_head))
+	while (unlikely(r->cons.tail != cons_head)) {
 		rte_pause();

+		/* Set RTE_RING_PAUSE_REP_COUNT to avoid spin too long waiting
+		 * for other thread finish. It gives pre-empted thread a chance
+		 * to proceed and finish with ring denqnue operation. */
+		if (RTE_RING_PAUSE_REP_COUNT &&
+		    ++rep == RTE_RING_PAUSE_REP_COUNT) {
+			rep = 0;
+			sched_yield();
+		}
+	}
 	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;

--
1.8.1.4
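The pause-then-yield loop from the ring patch can be tried outside DPDK with the sketch below. It is only an approximation under stated assumptions: SKETCH_PAUSE_REP_COUNT stands in for RTE_RING_PAUSE_REP_COUNT (which defaults to 0, i.e. never yield, in the patch), rte_pause() is left out, and the max_spins bound is a hypothetical addition so that the example terminates on a single thread. The patched loops have no such bound and spin until the tail matches.

```c
#include <sched.h>
#include <stdint.h>

/* Assumed stand-in for RTE_RING_PAUSE_REP_COUNT; 0 would disable yielding. */
#define SKETCH_PAUSE_REP_COUNT 4u

/*
 * Spin until *tail equals expected or max_spins iterations have passed.
 * Every SKETCH_PAUSE_REP_COUNT iterations the thread calls sched_yield()
 * so that a pre-empted peer gets a chance to run.
 * Returns the number of sched_yield() calls made.
 */
static unsigned sketch_wait_for_tail(const volatile uint32_t *tail,
				     uint32_t expected, unsigned max_spins)
{
	unsigned rep = 0, yields = 0, spins = 0;

	while (*tail != expected && spins < max_spins) {
		spins++;
		/* the patch calls rte_pause() at this point */
		if (SKETCH_PAUSE_REP_COUNT &&
		    ++rep == SKETCH_PAUSE_REP_COUNT) {
			rep = 0;
			sched_yield();
			yields++;
		}
	}
	return yields;
}
```

With a repeat count of 4 and ten iterations in which the tail never matches, the loop yields on the 4th and 8th iterations.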
[dpdk-dev] [PATCH v7 19/19] timer: add support to non-EAL thread
Allow timers to be set up only for EAL (lcore) threads
(__lcore_id < MAX_LCORE_ID).
For example, a dynamically created thread will be able to reset/stop a timer
for an lcore thread, but it will not be allowed to set up a timer for itself
or for another non-lcore thread.
rte_timer_manage() for a non-lcore thread would simply do nothing and return
straight away.

Signed-off-by: Cunming Liang
---
 v6 changes:
   use 'RTE_MAX_LCORE' to check the EAL thread with valid lcore_id.
   use 'LCORE_ID_ANY' for any unspecified lcore_id assignment.

 v5 changes:
   add assert in rte_timer_manage
   remove duplicate check in timer_set_config_state

 lib/librte_timer/rte_timer.c | 32 +++-
 lib/librte_timer/rte_timer.h |  4 ++--
 2 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 269a992..76c9cae 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -35,6 +35,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -79,9 +80,10 @@ static struct priv_timer priv_timer[RTE_MAX_LCORE];

 /* when debug is enabled, store some statistics */
 #ifdef RTE_LIBRTE_TIMER_DEBUG
-#define __TIMER_STAT_ADD(name, n) do {	\
-		unsigned __lcore_id = rte_lcore_id();	\
-		priv_timer[__lcore_id].stats.name += (n);	\
+#define __TIMER_STAT_ADD(name, n) do {	\
+		unsigned __lcore_id = rte_lcore_id();	\
+		if (__lcore_id < RTE_MAX_LCORE)	\
+			priv_timer[__lcore_id].stats.name += (n);	\
 	} while(0)
 #else
 #define __TIMER_STAT_ADD(name, n) do {} while(0)
@@ -135,7 +137,7 @@ timer_set_config_state(struct rte_timer *tim,

 		/* timer is running on another core, exit */
 		if (prev_status.state == RTE_TIMER_RUNNING &&
-		    (unsigned)prev_status.owner != lcore_id)
+		    prev_status.owner != (uint16_t)lcore_id)
 			return -1;

 		/* timer is being configured on another core */
@@ -366,9 +368,16 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,

 	/* round robin for tim_lcore */
 	if (tim_lcore == (unsigned)LCORE_ID_ANY) {
-		tim_lcore = rte_get_next_lcore(priv_timer[lcore_id].prev_lcore,
-					       0, 1);
-		priv_timer[lcore_id].prev_lcore = tim_lcore;
+		if (lcore_id < RTE_MAX_LCORE) {
+			/* EAL thread with valid lcore_id */
+			tim_lcore = rte_get_next_lcore(
+				priv_timer[lcore_id].prev_lcore,
+				0, 1);
+			priv_timer[lcore_id].prev_lcore = tim_lcore;
+		} else
+			/* non-EAL thread do not run rte_timer_manage(),
+			 * so schedule the timer on the first enabled lcore. */
+			tim_lcore = rte_get_next_lcore(LCORE_ID_ANY, 0, 1);
 	}

 	/* wait that the timer is in correct status before update,
@@ -378,7 +387,8 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 		return -1;

 	__TIMER_STAT_ADD(reset, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id < RTE_MAX_LCORE) {
 		priv_timer[lcore_id].updated = 1;
 	}

@@ -455,7 +465,8 @@ rte_timer_stop(struct rte_timer *tim)
 		return -1;

 	__TIMER_STAT_ADD(stop, 1);
-	if (prev_status.state == RTE_TIMER_RUNNING) {
+	if (prev_status.state == RTE_TIMER_RUNNING &&
+	    lcore_id < RTE_MAX_LCORE) {
 		priv_timer[lcore_id].updated = 1;
 	}

@@ -499,6 +510,9 @@ void rte_timer_manage(void)
 	uint64_t cur_time;
 	int i, ret;

+	/* timer manager only runs on EAL thread with valid lcore_id */
+	assert(lcore_id < RTE_MAX_LCORE);
+
 	__TIMER_STAT_ADD(manage, 1);
 	/* optimize for the case where per-cpu list is empty */
 	if (priv_timer[lcore_id].pending_head.sl_next[0] == NULL)
diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
index 4907cf5..35b8719 100644
--- a/lib/librte_timer/rte_timer.h
+++ b/lib/librte_timer/rte_timer.h
@@ -76,7 +76,7 @@ extern "C" {
 #define RTE_TIMER_RUNNING 2 /**< State: timer function is running. */
 #define RTE_TIMER_CONFIG  3 /**< State: timer is being configured. */

-#define RTE_TIMER_NO_OWNER -1 /**< Timer has no owner. */
+#define RTE_TIMER_NO_OWNER -2 /**< Timer has no owner. */

 /**
  * Timer type: Periodic or single (one-shot).
@@ -310,7 +310,7 @@ int rte_timer_pending(struct rte_timer *tim);
 /**
  * Manage the timer list and execute callback functions.
  *
- * This function must be called periodically from all cores
+ * This function m
[dpdk-dev] [PATCH 1/4] xen: allow choosing dom0 support at runtime
Hi Stephen, What do you mean ' allow choosing dom0 support at runtime'? If you mean user can choose DPDK to run Xen Dom0 or not on DOM0 by a runtime flag, I don't think your change can achieve this goal. Thanks Jijiang Liu > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen Hemminger > Sent: Sunday, February 15, 2015 2:07 AM > To: dev at dpdk.org > Cc: Stephen Hemminger > Subject: [dpdk-dev] [PATCH 1/4] xen: allow choosing dom0 support at runtime > > The previous code would only allow building library and application so that it > ran on Xen DOM0 or not on DOM0. This changes that to a runtime flag. > > Signed-off-by: Stephen Hemminger > --- > lib/librte_eal/common/include/rte_memory.h | 4 +++ > lib/librte_eal/linuxapp/eal/eal_memory.c | 7 > lib/librte_ether/rte_ethdev.c | 22 > lib/librte_ether/rte_ethdev.h | 23 > lib/librte_mempool/rte_mempool.c | 26 +++--- > lib/librte_pmd_e1000/em_rxtx.c | 30 +++- > lib/librte_pmd_e1000/igb_rxtx.c| 52 +-- > lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 58 > +- > 8 files changed, 108 insertions(+), 114 deletions(-) > > diff --git a/lib/librte_eal/common/include/rte_memory.h > b/lib/librte_eal/common/include/rte_memory.h > index 7f8103f..ab6c1ff 100644 > --- a/lib/librte_eal/common/include/rte_memory.h > +++ b/lib/librte_eal/common/include/rte_memory.h > @@ -176,6 +176,10 @@ unsigned rte_memory_get_nchannel(void); unsigned > rte_memory_get_nrank(void); > > #ifdef RTE_LIBRTE_XEN_DOM0 > + > +/**< Internal use only - should DOM0 memory mapping be used */ extern > +int is_xen_dom0_supported(void); > + > /** > * Return the physical address of elt, which is an element of the pool mp. 
> * > diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c > b/lib/librte_eal/linuxapp/eal/eal_memory.c > index a67a1b0..4afda2a 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_memory.c > +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c > @@ -98,6 +98,13 @@ > #include "eal_filesystem.h" > #include "eal_hugepages.h" > > +#ifdef RTE_LIBRTE_XEN_DOM0 > +int is_xen_dom0_supported(void) > +{ > + return internal_config.xen_dom0_support; } #endif > + > /** > * @file > * Huge page mapping under linux > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c > index > ea3a1fb..457e0bc 100644 > --- a/lib/librte_ether/rte_ethdev.c > +++ b/lib/librte_ether/rte_ethdev.c > @@ -2825,6 +2825,27 @@ _rte_eth_dev_callback_process(struct rte_eth_dev > *dev, > } > rte_spinlock_unlock(&rte_eth_dev_cb_lock); > } > + > +const struct rte_memzone * > +rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char > *ring_name, > + uint16_t queue_id, size_t size, unsigned align, > + int socket_id) > +{ > + char z_name[RTE_MEMZONE_NAMESIZE]; > + const struct rte_memzone *mz; > + > + snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d", > + dev->driver->pci_drv.name, ring_name, > + dev->data->port_id, queue_id); > + > + mz = rte_memzone_lookup(z_name); > + if (mz) > + return mz; > + > + return rte_memzone_reserve_bounded(z_name, size, > +socket_id, 0, align, > RTE_PGSIZE_2M); } > + > #ifdef RTE_NIC_BYPASS > int rte_eth_dev_bypass_init(uint8_t port_id) { @@ -3003,6 +3024,7 @@ > rte_eth_dev_bypass_wd_reset(uint8_t port_id) > (*dev->dev_ops->bypass_wd_reset)(dev); > return 0; > } > + > #endif > > int > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > index > 1200c1c..747acb5 100644 > --- a/lib/librte_ether/rte_ethdev.h > +++ b/lib/librte_ether/rte_ethdev.h > @@ -3664,6 +3664,29 @@ int rte_eth_dev_filter_supported(uint8_t port_id, > enum rte_filter_type filter_ty int rte_eth_dev_filter_ctrl(uint8_t port_id, > enum rte_filter_type filter_type, > enum 
rte_filter_op filter_op, void *arg); > > +/** > + * Create memzone for HW rings. > + * malloc can't be used as the physical address is needed. > + * If the memzone is already created, then this function returns a ptr > + * to the old one. > + * > + * @param eth_dev > + * The *eth_dev* pointer is the address of the *rte_eth_dev* structure > + * @param name > + * The name of the memory zone > + * @param queue_id > + * The index of the queue to add to name > + * @param size > + * The sizeof of the memory area > + * @param align > + * Alignment for resulting memzone. Must be a power of 2. > + * @param socket_id > + * The *socket_id* argument is the socket identifier in case of NUMA. > + */ > +const struct rte_memzone * > +rte_eth_dma_zone_reserve(const struct rte_eth_dev *eth_dev, const char > *name, > + uint16_t queue_id, size_t size, > + unsigned align,
[dpdk-dev] [PATCH v2 0/5] Integrate flex filter in igb driver to new API
v2 changes:
 - split one patch to patch series
 - change the command's format in testpmd.
 - add doc changes in testpmd_funcs.rst
 - correct the errors reported by checkpatch.pl

The patch set uses the new filter_ctrl API to replace the old flex filter
APIs. It uses new functions and structures to replace the old ones in the
igb driver, new commands to replace the old ones in testpmd, and removes the
old APIs.

Jingjing Wu (5):
  ethdev: define flex filter type and its structure
  e1000: new functions replace old ones for flex filter
  testpmd: new commands for flex filter
  ethdev: remove old APIs and structures of flex filter
  doc: commands changed in testpmd_funcs for flex filter

 app/test-pmd/cmdline.c                      | 239
 app/test-pmd/config.c                       |  33 ---
 app/test-pmd/testpmd.h                      |   1 -
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  56 +
 lib/librte_ether/rte_eth_ctrl.h             |  20 ++
 lib/librte_ether/rte_ethdev.c               |  51 -
 lib/librte_ether/rte_ethdev.h               |  89
 lib/librte_pmd_e1000/e1000_ethdev.h         |  27 ++-
 lib/librte_pmd_e1000/igb_ethdev.c           | 332 +---
 9 files changed, 352 insertions(+), 496 deletions(-)

--
1.9.3
[dpdk-dev] [PATCH v2 1/5] ethdev: define flex filter type and its structure
This patch defines flex filter type RTE_ETH_FILTER_FLEXIBLE and its
structure rte_eth_flex_filter.

Signed-off-by: Jingjing Wu
---
 lib/librte_ether/rte_eth_ctrl.h | 20
 1 file changed, 20 insertions(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index 0ce241e..beacfa3 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -53,6 +53,7 @@ enum rte_filter_type {
 	RTE_ETH_FILTER_NONE = 0,
 	RTE_ETH_FILTER_MACVLAN,
 	RTE_ETH_FILTER_ETHERTYPE,
+	RTE_ETH_FILTER_FLEXIBLE,
 	RTE_ETH_FILTER_TUNNEL,
 	RTE_ETH_FILTER_FDIR,
 	RTE_ETH_FILTER_HASH,
@@ -116,6 +117,25 @@ struct rte_eth_ethertype_filter {
 	uint16_t queue; /**< Queue assigned to when match*/
 };

+#define RTE_FLEX_FILTER_MAXLEN	128	/**< bytes to use in flex filter. */
+#define RTE_FLEX_FILTER_MASK_SIZE	\
+	(RTE_ALIGN(RTE_FLEX_FILTER_MAXLEN, CHAR_BIT) / CHAR_BIT)
+					/**< mask bytes in flex filter. */
+
+/**
+ * A structure used to define the flex filter entry
+ * to support RTE_ETH_FILTER_FLEXIBLE with RTE_ETH_FILTER_ADD,
+ * RTE_ETH_FILTER_DELETE and RTE_ETH_FILTER_GET operations.
+ */
+struct rte_eth_flex_filter {
+	uint16_t len;
+	uint8_t bytes[RTE_FLEX_FILTER_MAXLEN];  /**< flex bytes in big endian.*/
+	uint8_t mask[RTE_FLEX_FILTER_MASK_SIZE];    /**< if mask bit is 1b, do
+					not compare corresponding byte. */
+	uint8_t priority;
+	uint16_t queue;       /**< Queue assigned to when match. */
+};
+
 /**
  * Tunneled type.
  */
--
1.9.3
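For readers checking the mask sizing, the arithmetic behind RTE_FLEX_FILTER_MASK_SIZE can be worked through in a few lines. This is a sketch, assuming RTE_ALIGN rounds its first argument up to a multiple of the second; the SKETCH_* macros and the helper name are hypothetical and exist only for this example.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_FLEX_MAXLEN 128	/* mirrors RTE_FLEX_FILTER_MAXLEN */

/* Assumed round-up behaviour of RTE_ALIGN, written out for the sketch. */
#define SKETCH_ALIGN_UP(v, a) ((((v) + (a) - 1) / (a)) * (a))

/* One mask bit per pattern byte: 128 bytes need 128 bits, i.e. 16 bytes. */
#define SKETCH_FLEX_MASK_SIZE \
	(SKETCH_ALIGN_UP(SKETCH_FLEX_MAXLEN, CHAR_BIT) / CHAR_BIT)

/* Number of mask bytes needed to cover a pattern of len bytes. */
static size_t sketch_mask_bytes_for_len(size_t len)
{
	return (len + CHAR_BIT - 1) / CHAR_BIT;
}
```

The sketch deliberately does not say which bit of a mask byte maps to which pattern byte, since the patch does not state that ordering.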
[dpdk-dev] [PATCH v2 3/5] testpmd: new commands for flex filter
Following commands of flex filter are removed: - add_flex_filter (port_id) len (len_value) bytes (bytes_string) mask (mask_value) priority (prio_value) queue (queue_id) - remove_flex_filter (port_id) index (idx) - get_flex_filter (port_id) index (idx) New command is added for flex filter by using filter_ctrl API and new flex filter structure: - flex_filter (port_id) (add|del) len (len_value) bytes (bytes_value) mask (mask_value) priority (prio_value) queue (queue_id) Signed-off-by: Jingjing Wu --- app/test-pmd/cmdline.c | 239 +++-- app/test-pmd/config.c | 33 --- app/test-pmd/testpmd.h | 1 - 3 files changed, 91 insertions(+), 182 deletions(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 590e427..e78c7b3 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -692,15 +692,10 @@ static void cmd_help_long_parsed(void *parsed_result, "get_syn_filter (port_id) " "get syn filter info.\n\n" - "add_flex_filter (port_id) len (len_value) bytes (bytes_string) mask (mask_value)" - " priority (prio_value) queue (queue_id) index (idx)\n" - "add a flex filter.\n\n" - - "remove_flex_filter (port_id) index (idx)\n" - "remove a flex filter.\n\n" - - "get_flex_filter (port_id) index (idx)\n" - "get info of a flex filter.\n\n" + "flex_filter (port_id) (add|del) len (len_value)" + " bytes (bytes_value) mask (mask_value)" + " priority (prio_value) queue (queue_id)\n" + "Add/Del a flex filter.\n\n" "flow_director_filter (port_id) (add|del)" " flow (ip4|ip4-frag|ip6|ip6-frag)" @@ -7742,6 +7737,7 @@ cmdline_parse_inst_t cmd_get_5tuple_filter = { /* *** ADD/REMOVE A flex FILTER *** */ struct cmd_flex_filter_result { cmdline_fixed_string_t filter; + cmdline_fixed_string_t ops; uint8_t port_id; cmdline_fixed_string_t len; uint8_t len_value; @@ -7753,8 +7749,6 @@ struct cmd_flex_filter_result { uint8_t priority_value; cmdline_fixed_string_t queue; uint16_t queue_id; - cmdline_fixed_string_t index; - uint16_t index_value; }; static int xdigit2val(unsigned char 
c) @@ -7775,113 +7769,106 @@ cmd_flex_filter_parsed(void *parsed_result, __attribute__((unused)) void *data) { int ret = 0; - struct rte_flex_filter filter; + struct rte_eth_flex_filter filter; struct cmd_flex_filter_result *res = parsed_result; char *bytes_ptr, *mask_ptr; - uint16_t len, i, j; + uint16_t len, i, j = 0; char c; - int val, mod = 0; - uint32_t dword = 0; + int val; uint8_t byte = 0; - uint8_t hex = 0; - if (!strcmp(res->filter, "add_flex_filter")) { - if (res->len_value > 128) { - printf("the len exceed the max length 128\n"); - return; - } - memset(&filter, 0, sizeof(struct rte_flex_filter)); - filter.len = res->len_value; - filter.priority = res->priority_value; - bytes_ptr = res->bytes_value; - mask_ptr = res->mask_value; - - j = 0; -/* translate bytes string to uint_32 array. */ - if (bytes_ptr[0] == '0' && ((bytes_ptr[1] == 'x') || - (bytes_ptr[1] == 'X'))) - bytes_ptr += 2; - len = strnlen(bytes_ptr, res->len_value * 2); - if (len == 0 || (len % 8 != 0)) { - printf("please check len and bytes input\n"); + if (res->len_value > RTE_FLEX_FILTER_MAXLEN) { + printf("the len exceed the max length 128\n"); + return; + } + memset(&filter, 0, sizeof(struct rte_eth_flex_filter)); + filter.len = res->len_value; + filter.priority = res->priority_value; + filter.queue = res->queue_id; + bytes_ptr = res->bytes_value; + mask_ptr = res->mask_value; + +/* translate bytes string to array. */ + if (bytes_ptr[0] == '0' && ((bytes_ptr[1] == 'x') || + (bytes_ptr[1] == 'X'))) + bytes_ptr += 2; + len = strnlen(bytes_ptr, res->len_value * 2); + if (len == 0 || (len % 8 != 0)) { + printf("please check len and bytes input\n"); + return; + } + for (i = 0; i < len; i++) { + c = bytes_ptr[i]; + if (isxdigit(c) == 0) { + /* invalid characters. */ + printf("invalid input\n"); return;
[dpdk-dev] [PATCH v2 4/5] ethdev: remove old APIs and structures of flex filter
Structure rte_flex_filter is removed. Following APIs are removed: - rte_eth_dev_add_flex_filter - rte_eth_dev_remove_flex_filter - rte_eth_dev_get_flex_filter Signed-off-by: Jingjing Wu --- lib/librte_ether/rte_ethdev.c | 51 - lib/librte_ether/rte_ethdev.h | 89 --- 2 files changed, 140 deletions(-) diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index ea3a1fb..f8b1e8a 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -3172,57 +3172,6 @@ rte_eth_dev_get_5tuple_filter(uint8_t port_id, uint16_t index, } int -rte_eth_dev_add_flex_filter(uint8_t port_id, uint16_t index, - struct rte_flex_filter *filter, uint16_t rx_queue) -{ - struct rte_eth_dev *dev; - - if (port_id >= nb_ports) { - PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); - return -ENODEV; - } - - dev = &rte_eth_devices[port_id]; - FUNC_PTR_OR_ERR_RET(*dev->dev_ops->add_flex_filter, -ENOTSUP); - return (*dev->dev_ops->add_flex_filter)(dev, index, filter, rx_queue); -} - -int -rte_eth_dev_remove_flex_filter(uint8_t port_id, uint16_t index) -{ - struct rte_eth_dev *dev; - - if (port_id >= nb_ports) { - PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); - return -ENODEV; - } - - dev = &rte_eth_devices[port_id]; - FUNC_PTR_OR_ERR_RET(*dev->dev_ops->remove_flex_filter, -ENOTSUP); - return (*dev->dev_ops->remove_flex_filter)(dev, index); -} - -int -rte_eth_dev_get_flex_filter(uint8_t port_id, uint16_t index, - struct rte_flex_filter *filter, uint16_t *rx_queue) -{ - struct rte_eth_dev *dev; - - if (filter == NULL || rx_queue == NULL) - return -EINVAL; - - if (port_id >= nb_ports) { - PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); - return -ENODEV; - } - - dev = &rte_eth_devices[port_id]; - FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_flex_filter, -ENOTSUP); - return (*dev->dev_ops->get_flex_filter)(dev, index, filter, - rx_queue); -} - -int rte_eth_dev_filter_supported(uint8_t port_id, enum rte_filter_type filter_type) { struct rte_eth_dev *dev; diff --git 
a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index 1200c1c..7d455b5 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -992,17 +992,6 @@ struct rte_2tuple_filter { }; /** - * A structure used to define a flex filter. - */ -struct rte_flex_filter { - uint16_t len; - uint32_t dwords[32]; /**< flex bytes in big endian. */ - uint8_t mask[16]; /**< if mask bit is 1b, do not compare - corresponding byte in dwords. */ - uint8_t priority; -}; - -/** * A structure used to define a 5tuple filter. */ struct rte_5tuple_filter { @@ -1391,20 +1380,6 @@ typedef int (*eth_get_5tuple_filter_t)(struct rte_eth_dev *dev, uint16_t *rx_queue); /**< @internal Get a 5tuple filter rule on an Ethernet device */ -typedef int (*eth_add_flex_filter_t)(struct rte_eth_dev *dev, - uint16_t index, struct rte_flex_filter *filter, - uint16_t rx_queue); -/**< @internal Setup a new flex filter rule on an Ethernet device */ - -typedef int (*eth_remove_flex_filter_t)(struct rte_eth_dev *dev, - uint16_t index); -/**< @internal Remove a flex filter rule on an Ethernet device */ - -typedef int (*eth_get_flex_filter_t)(struct rte_eth_dev *dev, - uint16_t index, struct rte_flex_filter *filter, - uint16_t *rx_queue); -/**< @internal Get a flex filter rule on an Ethernet device */ - typedef int (*eth_filter_ctrl_t)(struct rte_eth_dev *dev, enum rte_filter_type filter_type, enum rte_filter_op filter_op, @@ -1515,9 +1490,6 @@ struct eth_dev_ops { eth_add_5tuple_filter_tadd_5tuple_filter;/**< add 5tuple filter. */ eth_remove_5tuple_filter_t remove_5tuple_filter; /**< remove 5tuple filter. */ eth_get_5tuple_filter_tget_5tuple_filter;/**< get 5tuple filter. */ - eth_add_flex_filter_t add_flex_filter; /**< add flex filter. */ - eth_remove_flex_filter_t remove_flex_filter; /**< remove flex filter. */ - eth_get_flex_filter_t get_flex_filter; /**< get flex filter. 
*/ eth_filter_ctrl_t filter_ctrl; /**< common filter control*/ }; @@ -3568,67 +3540,6 @@ int rte_eth_dev_get_5tuple_filter(uint8_t port_id, uint16_t index, struct rte_5tuple_filter *filter, uint16_t *rx_queue); /** - * Add a new flex filter rule on an Ethernet device. - *
[dpdk-dev] [PATCH v2 5/5] doc: commands changed in testpmd_funcs for flex filter
document of new command: - flex_filter (add|del) (port_id) len (len_value) bytes (bytes_value) mask (mask_value) priority (prio_value) queue (queue_id) Signed-off-by: Jingjing Wu --- doc/guides/testpmd_app_ug/testpmd_funcs.rst | 56 ++--- 1 file changed, 10 insertions(+), 46 deletions(-) diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst index 218835a..9e38423 100644 --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst @@ -1595,15 +1595,14 @@ Example: syn filter: on, priority: high, queue: 3 -add_flex_filter +flex_filter ~~~ -Add a Flex filter, -which recognizes any arbitrary pattern within the first 128 bytes of the packet +By flex filter, packets can be recognized by any arbitrary pattern within the first 128 bytes of the packet and routes packets into one of the receive queues. -add_flex_filter (port_id) len (len_value) bytes (bytes_string) mask (mask_value) -priority (prio_value) queue (queue_id) index (idx) +flex_filter (add|del) (port_id) len (len_value) bytes (bytes_value) +mask (mask_value) priority (prio_value) queue (queue_id) The available information parameters are: @@ -1611,55 +1610,20 @@ The available information parameters are: * len_value: filter length in byte, no greater than 128. -* bytes_string: a sting in format of octal, means the value the flex filter need to match. +* bytes_value: a sting in format of octal, means the value the flex filter need to match. -* mask_value: a sting in format of octal, bit 1 means corresponding byte in DWORD participates in the match. +* mask_value: a sting in format of octal, bit 1 means corresponding byte participates in the match. * prio_value: the priority of this filter. * queue_id: The receive queue associated with this Flex filter. -* index: the index of this Flex filter - Example: .. 
code-block:: console - testpmd> add_flex_filter 0 len 16 bytes 0x0806 mask 000C priority 3 queue 3 index 0 - -Assign a packet whose 13th and 14th bytes are 0x0806 to queue 3. - -remove_flex_filter -~~ - -Remove a Flex filter - -remove_flex_filter (port_id) index (idx) + testpmd> flex_filter 0 add len 16 bytes 0x0806 +mask 000C priority 3 queue 3 -get_flex_filter -~~~ - -Get and display a Flex filter - -get_flex_filter (port_id) index (idx) - -Example: - -.. code-block:: console - -testpmd> get_flex_filter 0 index 0 - -filter[0]: - -length: 16 - -dword[]: 0x 0806 - - - - -mask[]: -0b11 -00 - -priority: 3 queue: 3 + testpmd> flex_filter 0 del len 16 bytes 0x0806 +mask 000C priority 3 queue 3 -- 1.9.3
[dpdk-dev] [PATCH v2 2/5] e1000: new functions replace old ones for flex filter
This patch defines new functions dealing with flex filter. It removes old functions of flex filter in igb driver. Syn filter is dealt with through entrance eth_igb_filter_ctrl. Signed-off-by: Jingjing Wu --- lib/librte_pmd_e1000/e1000_ethdev.h | 27 ++- lib/librte_pmd_e1000/igb_ethdev.c | 332 ++-- 2 files changed, 231 insertions(+), 128 deletions(-) diff --git a/lib/librte_pmd_e1000/e1000_ethdev.h b/lib/librte_pmd_e1000/e1000_ethdev.h index d155e77..b91fcad 100644 --- a/lib/librte_pmd_e1000/e1000_ethdev.h +++ b/lib/librte_pmd_e1000/e1000_ethdev.h @@ -78,11 +78,16 @@ #define E1000_MAX_TTQF_FILTERS 8 #define E1000_2TUPLE_MAX_PRI 7 -#define E1000_MAX_FLEXIBLE_FILTERS 8 +#define E1000_MAX_FLEX_FILTERS 8 #define E1000_MAX_FHFT 4 #define E1000_MAX_FHFT_EXT 4 +#define E1000_FHFT_SIZE_IN_DWD 64 #define E1000_MAX_FLEX_FILTER_PRI7 #define E1000_MAX_FLEX_FILTER_LEN128 +#define E1000_MAX_FLEX_FILTER_DWDS \ + (E1000_MAX_FLEX_FILTER_LEN / sizeof(uint32_t)) +#define E1000_FLEX_FILTERS_MASK_SIZE \ + (E1000_MAX_FLEX_FILTER_DWDS / 4) #define E1000_FHFT_QUEUEING_LEN 0x007F #define E1000_FHFT_QUEUEING_QUEUE0x0700 #define E1000_FHFT_QUEUEING_PRIO 0x0007 @@ -131,6 +136,24 @@ struct e1000_vf_info { uint16_t tx_rate; }; +TAILQ_HEAD(e1000_flex_filter_list, e1000_flex_filter); + +struct e1000_flex_filter_info { + uint16_t len; + uint32_t dwords[E1000_MAX_FLEX_FILTER_DWDS]; /* flex bytes in dword. */ + /* if mask bit is 1b, do not compare corresponding byte in dwords. */ + uint8_t mask[E1000_FLEX_FILTERS_MASK_SIZE]; + uint8_t priority; +}; + +/* Flex filter structure */ +struct e1000_flex_filter { + TAILQ_ENTRY(e1000_flex_filter) entries; + uint16_t index; /* index of flex filter */ + struct e1000_flex_filter_info filter_info; + uint16_t queue; /* rx queue assigned to */ +}; + /* * Structure to store filters' info. 
*/ @@ -138,6 +161,8 @@ struct e1000_filter_info { uint8_t ethertype_mask; /* Bit mask for every used ethertype filter */ /* store used ethertype filters*/ uint16_t ethertype_filters[E1000_MAX_ETQF_FILTERS]; + uint8_t flex_mask; /* Bit mask for every used flex filter */ + struct e1000_flex_filter_list flex_list; }; /* diff --git a/lib/librte_pmd_e1000/igb_ethdev.c b/lib/librte_pmd_e1000/igb_ethdev.c index 2a268b8..6e9240e 100644 --- a/lib/librte_pmd_e1000/igb_ethdev.c +++ b/lib/librte_pmd_e1000/igb_ethdev.c @@ -162,14 +162,14 @@ static int eth_igb_remove_2tuple_filter(struct rte_eth_dev *dev, static int eth_igb_get_2tuple_filter(struct rte_eth_dev *dev, uint16_t index, struct rte_2tuple_filter *filter, uint16_t *rx_queue); -static int eth_igb_add_flex_filter(struct rte_eth_dev *dev, - uint16_t index, - struct rte_flex_filter *filter, uint16_t rx_queue); -static int eth_igb_remove_flex_filter(struct rte_eth_dev *dev, - uint16_t index); +static int eth_igb_add_del_flex_filter(struct rte_eth_dev *dev, + struct rte_eth_flex_filter *filter, + bool add); static int eth_igb_get_flex_filter(struct rte_eth_dev *dev, - uint16_t index, - struct rte_flex_filter *filter, uint16_t *rx_queue); + struct rte_eth_flex_filter *filter); +static int eth_igb_flex_filter_handle(struct rte_eth_dev *dev, + enum rte_filter_op filter_op, + void *arg); static int eth_igb_add_5tuple_filter(struct rte_eth_dev *dev, uint16_t index, struct rte_5tuple_filter *filter, uint16_t rx_queue); @@ -271,9 +271,6 @@ static struct eth_dev_ops eth_igb_ops = { .add_2tuple_filter = eth_igb_add_2tuple_filter, .remove_2tuple_filter= eth_igb_remove_2tuple_filter, .get_2tuple_filter = eth_igb_get_2tuple_filter, - .add_flex_filter = eth_igb_add_flex_filter, - .remove_flex_filter = eth_igb_remove_flex_filter, - .get_flex_filter = eth_igb_get_flex_filter, .add_5tuple_filter = eth_igb_add_5tuple_filter, .remove_5tuple_filter= eth_igb_remove_5tuple_filter, .get_5tuple_filter = eth_igb_get_5tuple_filter, @@ -470,6 +467,8 
@@ eth_igb_dev_init(__attribute__((unused)) struct eth_driver *eth_drv, E1000_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private); struct e1000_vfta * shadow_vfta = E1000_DEV_PRIVATE_TO_VFTA(eth_dev->data->dev_private); + struct e1000_filter_info *filter_info = + E1000_DEV_PRIVATE_TO_FILTER_INFO(eth_dev->data->dev_private); uint32_t ctrl_ext; pci_dev =
[dpdk-dev] [PATCH v2 0/7] unified flow types and RSS offload types
> -----Original Message-----
> From: Zhang, Helin
> Sent: Wednesday, February 04, 2015 3:16 PM
> To: dev at dpdk.org
> Cc: Wu, Jingjing; Cao, Waterman; Zhang, Helin
> Subject: [PATCH v2 0/7] unified flow types and RSS offload types
>
> It unifies the flow types and RSS offload types for all PMDs. Previously, flow
> types were defined specifically for i40e, and there were different RSS offload
> types for 1/10G and 40G separately. This is not so convenient for application
> development, and not good for adding new PMDs.
> In addition, it enables new RSS offloads of 'tcp' and 'all' in testpmd.
>
> v2 changes:
> * Integrated with configuring hash functions.
> * Corrected the wrong help string of flow director parameters.
> * Renamed the flow types from ETH_FLOW_TYPE_ to RTE_ETH_FLOW_.
> * Removed useless annotations for flow type elements in rte_eth_ctrl.h.
>
> Helin Zhang (7):
>   app/test-pmd: code style fix
>   ethdev: code style fix
>   i40e: code style fix
>   ethdev: fix of calculating the size of flow type mask array
>   ethdev: unification of flow types
>   ethdev: unification of RSS offload types
>   app/testpmd: support new rss offloads
>
>  app/test-pipeline/init.c                |   2 +-
>  app/test-pmd/cmdline.c                  | 154 +++---
>  app/test-pmd/config.c                   | 137 +-
>  examples/distributor/main.c             |   9 +-
>  examples/ip_pipeline/init.c             |   2 +-
>  examples/l3fwd-acl/main.c               |   7 +-
>  lib/librte_ether/rte_eth_ctrl.h         |  94 ++
>  lib/librte_ether/rte_ethdev.h           | 147
>  lib/librte_pmd_e1000/e1000_ethdev.h     |  11 +++
>  lib/librte_pmd_e1000/igb_ethdev.c       |   1 +
>  lib/librte_pmd_e1000/igb_rxtx.c         |  27 ++
>  lib/librte_pmd_i40e/i40e_ethdev.c       | 164 +---
>  lib/librte_pmd_i40e/i40e_ethdev.h       |  52 +-
>  lib/librte_pmd_i40e/i40e_ethdev_vf.c    |   1 +
>  lib/librte_pmd_i40e/i40e_fdir.c         |  99 ++-
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c     |   1 +
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.h     |  11 +++
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c       |  27 ++
>  lib/librte_pmd_vmxnet3/vmxnet3_ethdev.c |   1 +
>  lib/librte_pmd_vmxnet3/vmxnet3_ethdev.h |   6 ++
>  lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c   |  10 +-
>  21 files changed, 525 insertions(+), 438 deletions(-)
>
> --
> 1.9.3

Acked-by: Jingjing Wu
[dpdk-dev] [PATCH v5 00/17] lib/librte_pmd_fm10k : fm10k pmd driver
On 2/13/2015 4:20 PM, Chen, Jing D wrote: > From: "Chen Jing D(Mark)" > > The patch set add poll mode driver for the host interface of Intel > Ethernet Switch FM1 Series of silicons, which integrate NIC and > switch functionalities. The patch set include below features: > > 1. Basic RX/TX functions for PF/VF. > 2. Interrupt handling mechanism for PF/VF. > 3. per queue start/stop functions for PF/VF. > 4. Mailbox handling between PF/VF and PF/Switch Manager. > 5. Receive Side Scaling (RSS) for PF/VF. > 6. Scatter receive function for PF/VF. > 7. reta update/query for PF/VF. > 8. VLAN filter set for PF. > 9. Link status query for PF/VF. > > Change in v5: > - Add sanity check for mbuf allocation. > - Add a new patch to claim fm10k driver review > - Change commit log. > - Add unlikely in func rx_desc_to_ol_flags to gain performance > - Add a new patch to add ABI version > > Change in v4: > - Change commit log to remove improper words. > > Changes in v3: > - Update base driver. > - Define several macros to pass base driver compile. > > Changes in v2: > - Merge 3 patches into 1 to configure fm10k compile environment. > - Rework on log code to follow style in ixgbe. > - Rework log message, remove redundant '\n' > - Update Copyright year from "2014" to "2015" > - Change base driver directory name from SHARED to base > - Add more description in log for patch "add PF and VF interrupt" > - Merge 2 patches into 1 to register fm10k driver > - Define macro to replace numeric for lower 32-bit mask. 
> > Chen Jing D(Mark) (1): > maintainers: claim for fm10k review > > Jeff Shaw (15): > fm10k: add base driver > eal: add fm10k device id > fm10k: register fm10k pmd PF driver > Change config files to add fm10k into compile > fm10k: add reta update/requery functions > fm10k: add rx_queue_setup/release function > fm10k: add tx_queue_setup/release function > fm10k: add RX/TX single queue start/stop function > fm10k: add dev start/stop functions > fm10k: add receive and tranmit function > fm10k: add PF RSS support > fm10k: Add scatter receive function > fm10k: add function to set vlan > fm10k: Add SRIOV-VF support > fm10k: add PF and VF interrupt handling function > > Michael Qiu (1): > fm10k: Add ABI version of librte_pmd_fm10k > > MAINTAINERS |4 + > config/common_bsdapp| 11 + > config/common_linuxapp | 11 + > lib/Makefile|1 + > lib/librte_eal/common/include/rte_pci_dev_ids.h | 22 + > lib/librte_pmd_fm10k/Makefile | 100 + > lib/librte_pmd_fm10k/base/fm10k_api.c | 341 > lib/librte_pmd_fm10k/base/fm10k_api.h | 61 + > lib/librte_pmd_fm10k/base/fm10k_common.c| 572 ++ > lib/librte_pmd_fm10k/base/fm10k_common.h| 52 + > lib/librte_pmd_fm10k/base/fm10k_mbx.c | 2185 > +++ > lib/librte_pmd_fm10k/base/fm10k_mbx.h | 329 > lib/librte_pmd_fm10k/base/fm10k_osdep.h | 148 ++ > lib/librte_pmd_fm10k/base/fm10k_pf.c| 1992 + > lib/librte_pmd_fm10k/base/fm10k_pf.h| 155 ++ > lib/librte_pmd_fm10k/base/fm10k_tlv.c | 914 ++ > lib/librte_pmd_fm10k/base/fm10k_tlv.h | 199 ++ > lib/librte_pmd_fm10k/base/fm10k_type.h | 937 ++ > lib/librte_pmd_fm10k/base/fm10k_vf.c| 641 +++ > lib/librte_pmd_fm10k/base/fm10k_vf.h| 91 + > lib/librte_pmd_fm10k/fm10k.h| 293 +++ > lib/librte_pmd_fm10k/fm10k_ethdev.c | 1868 +++ > lib/librte_pmd_fm10k/fm10k_logs.h | 78 + > lib/librte_pmd_fm10k/fm10k_rxtx.c | 459 + > lib/librte_pmd_fm10k/rte_pmd_fm10k_version.map |4 + > mk/rte.app.mk |4 + > 26 files changed, 11472 insertions(+), 0 deletions(-) > create mode 100644 lib/librte_pmd_fm10k/Makefile > create mode 100644 
lib/librte_pmd_fm10k/base/fm10k_api.c > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_api.h > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_common.c > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_common.h > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_mbx.c > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_mbx.h > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_osdep.h > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_pf.c > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_pf.h > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_tlv.c > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_tlv.h > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_type.h > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_vf.c > create mode 100644 lib/librte_pmd_fm10k/base/fm10k_vf.h > create mode 100644 lib/librte_pmd_fm10k/fm10k.h > create mode 100644 lib/librte_pmd_fm10k
[dpdk-dev] [PATCH v6 06/19] eal: new TLS definition and API declaration
On Sun, Feb 15, 2015 at 01:13:07AM +, Liang, Cunming wrote: > Hi, > > > -Original Message- > > From: Neil Horman [mailto:nhorman at tuxdriver.com] > > Sent: Friday, February 13, 2015 9:58 PM > > To: Liang, Cunming > > Cc: dev at dpdk.org > > Subject: Re: [dpdk-dev] [PATCH v6 06/19] eal: new TLS definition and API > > declaration > > > > On Fri, Feb 13, 2015 at 09:38:08AM +0800, Cunming Liang wrote: > > > 1. add two TLS *_socket_id* and *_cpuset* > > > 2. add two external API rte_thread_set/get_affinity > > > 3. add one internal API eal_thread_dump_affinity > > > > > > Signed-off-by: Cunming Liang > > > --- > > > v5 changes: > > >add comments for RTE_CPU_AFFINITY_STR_LEN > > >update comments for eal_thread_dump_affinity() > > >return void for rte_thread_get_affinity() > > >move rte_socket_id() change to another patch > > > > > > lib/librte_eal/bsdapp/eal/eal_thread.c| 2 ++ > > > lib/librte_eal/common/eal_thread.h| 36 > > +++ > > > lib/librte_eal/common/include/rte_lcore.h | 26 +- > > > lib/librte_eal/linuxapp/eal/eal_thread.c | 2 ++ > > > 4 files changed, 65 insertions(+), 1 deletion(-) > > > > > > diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c > > b/lib/librte_eal/bsdapp/eal/eal_thread.c > > > index ab05368..10220c7 100644 > > > --- a/lib/librte_eal/bsdapp/eal/eal_thread.c > > > +++ b/lib/librte_eal/bsdapp/eal/eal_thread.c > > > @@ -56,6 +56,8 @@ > > > #include "eal_thread.h" > > > > > > RTE_DEFINE_PER_LCORE(unsigned, _lcore_id); > > > +RTE_DEFINE_PER_LCORE(unsigned, _socket_id); > > > +RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset); > > > > > > /* > > > * Send a message to a slave lcore identified by slave_id to call a > > > diff --git a/lib/librte_eal/common/eal_thread.h > > b/lib/librte_eal/common/eal_thread.h > > > index f1ce0bd..e4e76b9 100644 > > > --- a/lib/librte_eal/common/eal_thread.h > > > +++ b/lib/librte_eal/common/eal_thread.h > > > @@ -34,6 +34,8 @@ > > > #ifndef EAL_THREAD_H > > > #define EAL_THREAD_H > > > > > > +#include > > > + > > > /** > > > 
* basic loop of thread, called for each thread by eal_init(). > > > * > > > @@ -61,4 +63,38 @@ void eal_thread_init_master(unsigned lcore_id); > > > */ > > > unsigned eal_cpu_socket_id(unsigned cpu_id); > > > > > > +/** > > > + * Get the NUMA socket id from cpuset. > > > + * This function is private to EAL. > > > + * > > > + * @param cpusetp > > > + * The point to a valid cpu set. > > > + * @return > > > + * socket_id or SOCKET_ID_ANY > > > + */ > > > +int eal_cpuset_socket_id(rte_cpuset_t *cpusetp); > > > + > > > +/** > > > + * Default buffer size to use with eal_thread_dump_affinity() > > > + */ > > > +#define RTE_CPU_AFFINITY_STR_LEN256 > > > + > > > +/** > > > + * Dump the current pthread cpuset. > > > + * This function is private to EAL. > > > + * > > > + * Note: > > > + * If the dump size is greater than the size of given buffer, > > > + * the string will be truncated and with '\0' at the end. > > > + * > > > + * @param str > > > + * The string buffer the cpuset will dump to. > > > + * @param size > > > + * The string buffer size. > > > + * @return > > > + * 0 for success, -1 if truncation happens. > > > + */ > > > +int > > > +eal_thread_dump_affinity(char *str, unsigned size); > > > + > > > #endif /* EAL_THREAD_H */ > > > diff --git a/lib/librte_eal/common/include/rte_lcore.h > > b/lib/librte_eal/common/include/rte_lcore.h > > > index 4c7d6bb..33f558e 100644 > > > --- a/lib/librte_eal/common/include/rte_lcore.h > > > +++ b/lib/librte_eal/common/include/rte_lcore.h > > > @@ -80,7 +80,9 @@ struct lcore_config { > > > */ > > > extern struct lcore_config lcore_config[RTE_MAX_LCORE]; > > > > > > -RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per core "core id". */ > > > +RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per thread "lcore id". > > */ > > > +RTE_DECLARE_PER_LCORE(unsigned, _socket_id); /**< Per thread "socket id". > > */ > > > +RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". 
> > */ > > > > > > /** > > > * Return the ID of the execution unit we are running on. > > > @@ -229,6 +231,28 @@ rte_get_next_lcore(unsigned i, int skip_master, int > > wrap) > > >i > >i = rte_get_next_lcore(i, 1, 0)) > > > > > > +/** > > > + * Set core affinity of the current thread. > > > + * Support both EAL and none-EAL thread and update TLS. > > > + * > > > + * @param cpusetp > > > + * Point to cpu_set_t for setting current thread affinity. > > > + * @return > > > + * On success, return 0; otherwise return -1; > > > + */ > > > +int rte_thread_set_affinity(rte_cpuset_t *cpusetp); > > > + > > > +/** > > > + * Get core affinity of the current thread. > > > + * > > > + * @param cpusetp > > > + * Point to cpu_set_t for getting current thread cpu affinity. > > > + * It presumes input is not NULL, otherwise it causes panic. > > > + * > > > + */ > > > +void rte_thre
[dpdk-dev] [PATCH v1] test: add ut for eal flags --lcores
The patch add unit test for the new eal option "--lcores". Signed-off-by: Cunming Liang --- It depends on the previous patch which enabling EAL "--lcores" option. http://dpdk.org/ml/archives/dev/2015-February/013204.html app/test/test_eal_flags.c | 95 --- 1 file changed, 81 insertions(+), 14 deletions(-) diff --git a/app/test/test_eal_flags.c b/app/test/test_eal_flags.c index 0a8269c..0352f87 100644 --- a/app/test/test_eal_flags.c +++ b/app/test/test_eal_flags.c @@ -512,47 +512,114 @@ test_missing_c_flag(void) /* -c flag but no coremask value */ const char *argv1[] = { prgname, prefix, mp_flag, "-n", "3", "-c"}; - /* No -c or -l flag at all */ + /* No -c, -l or --lcores flag at all */ const char *argv2[] = { prgname, prefix, mp_flag, "-n", "3"}; /* bad coremask value */ - const char *argv3[] = { prgname, prefix, mp_flag, "-n", "3", "-c", "error" }; + const char *argv3[] = { prgname, prefix, mp_flag, + "-n", "3", "-c", "error" }; /* sanity check of tests - valid coremask value */ - const char *argv4[] = { prgname, prefix, mp_flag, "-n", "3", "-c", "1" }; + const char *argv4[] = { prgname, prefix, mp_flag, + "-n", "3", "-c", "1" }; /* -l flag but no corelist value */ - const char *argv5[] = { prgname, prefix, mp_flag, "-n", "3", "-l"}; - const char *argv6[] = { prgname, prefix, mp_flag, "-n", "3", "-l", " " }; + const char *argv5[] = { prgname, prefix, mp_flag, + "-n", "3", "-l"}; + const char *argv6[] = { prgname, prefix, mp_flag, + "-n", "3", "-l", " " }; /* bad corelist values */ - const char *argv7[] = { prgname, prefix, mp_flag, "-n", "3", "-l", "error" }; - const char *argv8[] = { prgname, prefix, mp_flag, "-n", "3", "-l", "1-" }; - const char *argv9[] = { prgname, prefix, mp_flag, "-n", "3", "-l", "1," }; - const char *argv10[] = { prgname, prefix, mp_flag, "-n", "3", "-l", "1#2" }; + const char *argv7[] = { prgname, prefix, mp_flag, + "-n", "3", "-l", "error" }; + const char *argv8[] = { prgname, prefix, mp_flag, + "-n", "3", "-l", "1-" }; + const char 
*argv9[] = { prgname, prefix, mp_flag, + "-n", "3", "-l", "1," }; + const char *argv10[] = { prgname, prefix, mp_flag, +"-n", "3", "-l", "1#2" }; /* sanity check test - valid corelist value */ - const char *argv11[] = { prgname, prefix, mp_flag, "-n", "3", "-l", "1-2,3" }; + const char *argv11[] = { prgname, prefix, mp_flag, +"-n", "3", "-l", "1-2,3" }; + + /* --lcores flag but no lcores value */ + const char *argv12[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores" }; + const char *argv13[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", " " }; + /* bad lcores value */ + const char *argv14[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "1-3-5" }; + const char *argv15[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "0-1,,2" }; + const char *argv16[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "0-,1" }; + const char *argv17[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "(0-,2-4)" }; + const char *argv18[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "(-1,2)" }; + const char *argv19[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "(2-4)@(2-4-6)" }; + const char *argv20[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "(a,2)" }; + const char *argv21[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "1-3@(1,3)" }; + const char *argv22[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "3@((1,3)" }; + const char *argv23[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "(4-7)=(1,3)" }; + const char *argv24[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", "[4-7]@(1,3)" }; + /* sanity check of tests - valid lcores value */ + const char *argv25[] = { prgname, prefix, mp_flag, +"-n", "3", "--lcores", +"0-1,2@(5-7),(3-5)@(0,2),(0,6),7"}; if (launch_proc(argv1) == 0 || launch_proc(argv2) == 0 || launch_proc(argv3) == 0) { -
[dpdk-dev] [PATCH v6 06/19] eal: new TLS definition and API declaration
> -Original Message- > From: Neil Horman [mailto:nhorman at tuxdriver.com] > Sent: Sunday, February 15, 2015 1:17 PM > To: Liang, Cunming > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v6 06/19] eal: new TLS definition and API > declaration > [...] > > > > > > > > RTE_DEFINE_PER_LCORE(unsigned, _lcore_id); > > > > +RTE_DEFINE_PER_LCORE(unsigned, _socket_id); > > > > +RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset); > > > > > > > > /* > > > > * Send a message to a slave lcore identified by slave_id to call a > > > > -- > > > > 1.8.1.4 > > > > > > > > > > > All of these exported functions need to be exported in the version map. > > > Also, > I > > > don't think its a good idea to simply expose the per lcore cpuset > > > variables. It > > > would be far better to create an api around them > > [LCM] Thanks for the remind, I haven't taken care of the version map. > > The rte_thread_set/get_affinity() are the api around _cpuset, so do you > suggest we don't put 'per_lcore__cpuset' into rte_eal_version.map ? > > On this point, I agree with you and think we'd better not expose > 'per_lcore__socket_id' as well, what do you think ? > Yes, absolutely, you should wrap some API around them, and make them > defined > symbols, only inline them if they're going to be in the hot path. [LCM] _socket_id is wrapped by rte_socket_id() and rte_thread_set_affinity(). rte_socket_id() is defined as inline in rte_lcore.h. So finally two (rte_thread_set/get_affinity()) are added into version map. All these are updated in v7. > > Thanks > Neil > > > > > > > Neil > > > >
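For readers following the thread, the point about wrapping the per-thread variables in an API rather than exporting them can be sketched roughly as below. This is only an illustration under stated assumptions: the `demo_` names and `DEMO_SOCKET_ID_ANY` are hypothetical and are not the DPDK symbols, and `__thread` is the GCC/Clang thread-local storage extension.

```c
#include <assert.h>

/* Hypothetical stand-in for SOCKET_ID_ANY; not the DPDK definition. */
#define DEMO_SOCKET_ID_ANY (-1)

/*
 * The per-thread variable is static, so it is never exported from the
 * library. Only the accessor functions below would need an entry in a
 * version map.
 */
static __thread int demo_socket_id = DEMO_SOCKET_ID_ANY;

/* Read the calling thread's socket id (DEMO_SOCKET_ID_ANY if unset). */
int demo_thread_get_socket_id(void)
{
	return demo_socket_id;
}

/*
 * Update the calling thread's socket id.
 * Returns 0 on success, -1 if the value is below DEMO_SOCKET_ID_ANY.
 */
int demo_thread_set_socket_id(int socket_id)
{
	if (socket_id < DEMO_SOCKET_ID_ANY)
		return -1;

	demo_socket_id = socket_id;
	/* The setter must leave the TLS value equal to what was requested. */
	assert(demo_socket_id == socket_id);
	return 0;
}
```

With this shape the storage layout of the per-thread data can change later without breaking the ABI, since callers only ever see the two functions. A getter on a hot path could instead be made inline in a header, at the cost of exposing the variable again.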
[dpdk-dev] [PATCH] enic: silence log message
Hi Stephen, David, I agree with you and shall submit this change. Thanks, -Sujith On 09/02/15 9:41 pm, "Stephen Hemminger" wrote: >Agree it should not use printf. >If you insist on keeping the useless message then it should be log level >debug
[dpdk-dev] [PATCH v3 00/20] enhance tx checksum offload API
> -Original Message- > From: Olivier Matz [mailto:olivier.matz at 6wind.com] > Sent: Friday, February 13, 2015 5:23 PM > To: dev at dpdk.org > Cc: Ananyev, Konstantin; Liu, Jijiang; Zhang, Helin; olivier.matz at 6wind.com > Subject: [PATCH v3 00/20] enhance tx checksum offload API > > The goal of this series is to clarify and simplify the mbuf offload API. > > - simplify the definitions of PKT_TX_IP_CKSUM and PKT_TX_IPV4, each > flag has now only one meaning. No impact on the code. > > - add a feature flag for OUTER_IP_CHECKSUM (from Jijiang's patches) > > - remove the PKT_TX_UDP_TUNNEL_PKT flag: it is useless from an API point > of view. It was added because i40e need this info for some reason. We > have 3 solutions: > > - remove the flag and adapt the driver to the API (the choice I made > for this series). > > - remove the flag and stop advertising OUTER_IP_CHECKSUM in i40e > > - keep this flag, penalizing performance of drivers that do not > require the flag. It would also mean that drivers won't support > outer IP checksum for all tunnel types, but only for the tunnel > types having a flag. > > - a side effect of this API clarification is that there is only one > way for doing one operation. If the hardware has several ways to > do the same operation, a choice has to be made in the driver. > > The series also provide some enhancements and fixes related to this API > rework: > > - new tunnel types to testpmd csum forward engine. > - fixes in i40e to adapt to new api and support more tunnel types. 
> > [1] http://dpdk.org/ml/archives/dev/2015-January/011127.html > > Changes in v2: > - fix test of rx offload flag in parse_vlan() pointed out by Jijiang > > Changes in v3: > - more detailed API comments for PKT_TX_IPV4 and PKT_TX_IPV6 > - do not calculate the outer UDP checksum if packet is not UDP > - add a likely() in i40e > - remove a unlikely() in i40e > - fix a patch split issue > - rebase on head > > Jijiang Liu (2): > ethdev: add outer IP offload capability flag > i40e: advertise outer IPv4 checksum capability > > Olivier Matz (18): > mbuf: remove PKT_TX_IPV4_CSUM > mbuf: enhance the API documentation of offload flags > i40e: call i40e_txd_enable_checksum only for offloaded packets > i40e: remove the use of PKT_TX_UDP_TUNNEL_PKT flag > mbuf: remove PKT_TX_UDP_TUNNEL_PKT flag > testpmd: replace tx_checksum command by csum > testpmd: move csum_show in a function > testpmd: add csum parse_tunnel command > testpmd: rename vxlan in outer_ip in csum commands > testpmd: introduce parse_ipv* in csum fwd engine > testpmd: use a structure to store offload info in csum fwd engine > testpmd: introduce parse_vxlan in csum fwd engine > testpmd: support gre tunnels in csum fwd engine > testpmd: support ipip tunnel in csum forward engine > testpmd: add a warning if outer ip cksum requested but not supported > testpmd: fix TSO when using outer checksum offloads > i40e: fix offloading of outer checksum for ip in ip tunnels > i40e: add debug logs for tx context descriptors > > app/test-pmd/cmdline.c| 234 ++--- > app/test-pmd/csumonly.c | 425 ++- > --- > app/test-pmd/testpmd.h| 9 +- > lib/librte_ether/rte_ethdev.h | 1 + > lib/librte_mbuf/rte_mbuf.c| 1 - > lib/librte_mbuf/rte_mbuf.h| 51 +++-- > lib/librte_pmd_i40e/i40e_ethdev.c | 3 +- > lib/librte_pmd_i40e/i40e_rxtx.c | 55 +++-- > 8 files changed, 529 insertions(+), 250 deletions(-) > > -- > 2.1.4 Acked-by: Jijiang Liu < Jijiang.liu at intel.com>
[dpdk-dev] [PATCH 0/2] enable SRIOV switch in i40e driver
Test by: min.cao Patch name: [PATCH 0/2] enable SRIOV switch in i40e driver Test Flag: Tested-by Tester name:min.cao at intel.com Result summary: total 1 cases, 1 passed, 0 failed Test Case 1: Name: packet forwarding of SRIOV switch in i40e driver Environment:OS: Fedora20 3.11.10-301.fc20.x86_64 gcc (GCC) 4.8.2 CPU: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz NIC: Fortville eagle Test result:PASSED Detail: packet forwarding of SRIOV switch in i40e driver are successful between 2 vms. -Original Message- From: Wu, Jingjing Sent: Thursday, January 29, 2015 9:42 AM To: dev at dpdk.org Cc: Wu, Jingjing; Zhang, Helin; Chen, Jing D; Cao, Min Subject: [PATCH 0/2] enable SRIOV switch in i40e driver Enable SRIOV switch in i40e driver. With this patch set, SRIOV switch can be done on Fortville NICs. Jingjing Wu (2): i40e: fix the bug when configuring vsi i40e: enable internal switch of pf lib/librte_pmd_i40e/i40e_ethdev.c | 38 +- 1 file changed, 37 insertions(+), 1 deletion(-) -- 1.9.3
[dpdk-dev] [PATCH] enic: silence log message
Stephen, Saw your patch. Will take a look. Thanks, -Sujith On 15/02/15 11:43 am, "Sujith Sankar (ssujith)" wrote: >Hi Stephen, David, > >I agree with you and shall submit this change. > >Thanks, >-Sujith > >On 09/02/15 9:41 pm, "Stephen Hemminger" >wrote: > >>Agree it should not use printf. >>If you insist on keeping the useless message then it should be log level >>debug >
[dpdk-dev] [PATCH 0/2] enable SRIOV switch in i40e driver
Tested-by: min.cao Patch name: [PATCH 0/2] enable SRIOV switch in i40e driver Test Flag: Tested-by Tester name:min.cao at intel.com Result summary: total 1 cases, 1 passed, 0 failed Test Case 1: Name: packet forwarding of SRIOV switch in i40e driver Environment:OS: Fedora20 3.11.10-301.fc20.x86_64 gcc (GCC) 4.8.2 CPU: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz NIC: Fortville eagle Test result:PASSED Detail: packet forwarding of SRIOV switch in i40e driver are successful between 2 vms. -Original Message- From: Wu, Jingjing Sent: Thursday, January 29, 2015 9:42 AM To: dev at dpdk.org Cc: Wu, Jingjing; Zhang, Helin; Chen, Jing D; Cao, Min Subject: [PATCH 0/2] enable SRIOV switch in i40e driver Enable SRIOV switch in i40e driver. With this patch set, SRIOV switch can be done on Fortville NICs. Jingjing Wu (2): i40e: fix the bug when configuring vsi i40e: enable internal switch of pf lib/librte_pmd_i40e/i40e_ethdev.c | 38 +- 1 file changed, 37 insertions(+), 1 deletion(-) -- 1.9.3
[dpdk-dev] [PATCH v1] test: add ut for eal flags --lcores
Hi, Steve Why not post this patch within your enabling EAL "--lcores" option patch set? As it is not merged yet. Just a suggestion, depends you. Thanks, Michael On 2/15/2015 1:48 PM, Cunming Liang wrote: > The patch add unit test for the new eal option "--lcores". > > Signed-off-by: Cunming Liang > --- > It depends on the previous patch which enabling EAL "--lcores" option. > http://dpdk.org/ml/archives/dev/2015-February/013204.html > > app/test/test_eal_flags.c | 95 > --- > 1 file changed, 81 insertions(+), 14 deletions(-) > > diff --git a/app/test/test_eal_flags.c b/app/test/test_eal_flags.c > index 0a8269c..0352f87 100644 > --- a/app/test/test_eal_flags.c > +++ b/app/test/test_eal_flags.c > @@ -512,47 +512,114 @@ test_missing_c_flag(void) > > /* -c flag but no coremask value */ > const char *argv1[] = { prgname, prefix, mp_flag, "-n", "3", "-c"}; > - /* No -c or -l flag at all */ > + /* No -c, -l or --lcores flag at all */ > const char *argv2[] = { prgname, prefix, mp_flag, "-n", "3"}; > /* bad coremask value */ > - const char *argv3[] = { prgname, prefix, mp_flag, "-n", "3", "-c", > "error" }; > + const char *argv3[] = { prgname, prefix, mp_flag, > + "-n", "3", "-c", "error" }; > /* sanity check of tests - valid coremask value */ > - const char *argv4[] = { prgname, prefix, mp_flag, "-n", "3", "-c", "1" > }; > + const char *argv4[] = { prgname, prefix, mp_flag, > + "-n", "3", "-c", "1" }; > /* -l flag but no corelist value */ > - const char *argv5[] = { prgname, prefix, mp_flag, "-n", "3", "-l"}; > - const char *argv6[] = { prgname, prefix, mp_flag, "-n", "3", "-l", " " > }; > + const char *argv5[] = { prgname, prefix, mp_flag, > + "-n", "3", "-l"}; > + const char *argv6[] = { prgname, prefix, mp_flag, > + "-n", "3", "-l", " " }; > /* bad corelist values */ > - const char *argv7[] = { prgname, prefix, mp_flag, "-n", "3", "-l", > "error" }; > - const char *argv8[] = { prgname, prefix, mp_flag, "-n", "3", "-l", "1-" > }; > - const char *argv9[] = { prgname, 
prefix, mp_flag, "-n", "3", "-l", "1," > }; > - const char *argv10[] = { prgname, prefix, mp_flag, "-n", "3", "-l", > "1#2" }; > + const char *argv7[] = { prgname, prefix, mp_flag, > + "-n", "3", "-l", "error" }; > + const char *argv8[] = { prgname, prefix, mp_flag, > + "-n", "3", "-l", "1-" }; > + const char *argv9[] = { prgname, prefix, mp_flag, > + "-n", "3", "-l", "1," }; > + const char *argv10[] = { prgname, prefix, mp_flag, > + "-n", "3", "-l", "1#2" }; > /* sanity check test - valid corelist value */ > - const char *argv11[] = { prgname, prefix, mp_flag, "-n", "3", "-l", > "1-2,3" }; > + const char *argv11[] = { prgname, prefix, mp_flag, > + "-n", "3", "-l", "1-2,3" }; > + > + /* --lcores flag but no lcores value */ > + const char *argv12[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores" }; > + const char *argv13[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", " " }; > + /* bad lcores value */ > + const char *argv14[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "1-3-5" }; > + const char *argv15[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "0-1,,2" }; > + const char *argv16[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "0-,1" }; > + const char *argv17[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "(0-,2-4)" }; > + const char *argv18[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "(-1,2)" }; > + const char *argv19[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "(2-4)@(2-4-6)" }; > + const char *argv20[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "(a,2)" }; > + const char *argv21[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "1-3@(1,3)" }; > + const char *argv22[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "3@((1,3)" }; > + const char *argv23[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "(4-7)=(1,3)" }; > + const char *argv24[] = { prgname, prefix, mp_flag, > + "-n", "3", "--lcores", "[4-7]@(1,3)" }; > + /* 
sanity check of tests - valid lcores value */ > + const char *argv25[] = { prgname, prefix, mp_flag, > +
[dpdk-dev] [PULL REQUEST] i40e: Performance workaround for XL710, enable
The following changes since commit ed2547b68fb0b47d2b17ce6a16a5b8f299b0ead4: pci: fix max VFs for non igb_uio drivers (2015-02-13 14:48:16 +0100) are available in the git repository at: helin at dpdk.org:dpdk-i40e-next.git master for you to fetch changes up to 2f88e48bb4aaab118edf9344e895bf6bf2dd92bd: i40e: enable internal switch of pf (2015-02-15 01:39:43 -0500) Helin Zhang (1): i40e: workaround for XL710 performance Jingjing Wu (2): i40e: fix the bug when configuring vsi i40e: enable internal switch of pf lib/librte_pmd_i40e/i40e_ethdev.c | 82 --- 1 file changed, 67 insertions(+), 15 deletions(-)
[dpdk-dev] [PATCH v1] test: add ut for eal flags --lcores
Hi, > -Original Message- > From: Qiu, Michael > Sent: Sunday, February 15, 2015 2:59 PM > To: Liang, Cunming; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v1] test: add ut for eal flags --lcores > > Hi, Steve > > Why not post this patch within your enabling EAL "--lcores" option patch > set? > As it is not merged yet. > [LCM] As that patch series already go through several round review, I won't expect to involve the totally new update on it. > Just a suggestion, depends you. > > Thanks, > Michael > On 2/15/2015 1:48 PM, Cunming Liang wrote: > > The patch add unit test for the new eal option "--lcores". > > > > Signed-off-by: Cunming Liang > > --- > > It depends on the previous patch which enabling EAL "--lcores" option. > > http://dpdk.org/ml/archives/dev/2015-February/013204.html > > > > app/test/test_eal_flags.c | 95 > --- > > 1 file changed, 81 insertions(+), 14 deletions(-) > > > > diff --git a/app/test/test_eal_flags.c b/app/test/test_eal_flags.c > > index 0a8269c..0352f87 100644 > > --- a/app/test/test_eal_flags.c > > +++ b/app/test/test_eal_flags.c > > @@ -512,47 +512,114 @@ test_missing_c_flag(void) > > > > /* -c flag but no coremask value */ > > const char *argv1[] = { prgname, prefix, mp_flag, "-n", "3", "-c"}; > > - /* No -c or -l flag at all */ > > + /* No -c, -l or --lcores flag at all */ > > const char *argv2[] = { prgname, prefix, mp_flag, "-n", "3"}; > > /* bad coremask value */ > > - const char *argv3[] = { prgname, prefix, mp_flag, "-n", "3", "-c", > > "error" }; > > + const char *argv3[] = { prgname, prefix, mp_flag, > > + "-n", "3", "-c", "error" }; > > /* sanity check of tests - valid coremask value */ > > - const char *argv4[] = { prgname, prefix, mp_flag, "-n", "3", "-c", "1" > > }; > > + const char *argv4[] = { prgname, prefix, mp_flag, > > + "-n", "3", "-c", "1" }; > > /* -l flag but no corelist value */ > > - const char *argv5[] = { prgname, prefix, mp_flag, "-n", "3", "-l"}; > > - const char *argv6[] = { prgname, prefix, mp_flag, 
"-n", "3", "-l", " " > > }; > > + const char *argv5[] = { prgname, prefix, mp_flag, > > + "-n", "3", "-l"}; > > + const char *argv6[] = { prgname, prefix, mp_flag, > > + "-n", "3", "-l", " " }; > > /* bad corelist values */ > > - const char *argv7[] = { prgname, prefix, mp_flag, "-n", "3", "-l", > > "error" }; > > - const char *argv8[] = { prgname, prefix, mp_flag, "-n", "3", "-l", "1-" > > }; > > - const char *argv9[] = { prgname, prefix, mp_flag, "-n", "3", "-l", "1," > > }; > > - const char *argv10[] = { prgname, prefix, mp_flag, "-n", "3", "-l", > > "1#2" }; > > + const char *argv7[] = { prgname, prefix, mp_flag, > > + "-n", "3", "-l", "error" }; > > + const char *argv8[] = { prgname, prefix, mp_flag, > > + "-n", "3", "-l", "1-" }; > > + const char *argv9[] = { prgname, prefix, mp_flag, > > + "-n", "3", "-l", "1," }; > > + const char *argv10[] = { prgname, prefix, mp_flag, > > +"-n", "3", "-l", "1#2" }; > > /* sanity check test - valid corelist value */ > > - const char *argv11[] = { prgname, prefix, mp_flag, "-n", "3", "-l", > > "1-2,3" }; > > + const char *argv11[] = { prgname, prefix, mp_flag, > > +"-n", "3", "-l", "1-2,3" }; > > + > > + /* --lcores flag but no lcores value */ > > + const char *argv12[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores" }; > > + const char *argv13[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores", " " }; > > + /* bad lcores value */ > > + const char *argv14[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores", "1-3-5" }; > > + const char *argv15[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores", "0-1,,2" }; > > + const char *argv16[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores", "0-,1" }; > > + const char *argv17[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores", "(0-,2-4)" }; > > + const char *argv18[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores", "(-1,2)" }; > > + const char *argv19[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores", "(2-4)@(2-4-6)" }; 
> > + const char *argv20[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores", "(a,2)" }; > > + const char *argv21[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores", "1-3@(1,3)" }; > > + const char *argv22[] = { prgname, prefix, mp_flag, > > +"-n", "3", "--lcores", "3@((1,3
[dpdk-dev] ACL lookup doesn't work for some schemes
Hi, I noticed that ACL lookup doesn't work for some schemes. 1. If the first field is not uint8_t, even if all fields are wildcard, lookup doesn't find the matching rule. See acl_8last.c. 2. I prepended a uint8_t field and kept the other fields wildcard; lookup returns the correct result. See acl_8last2.c. 3. Then I changed the last field from 8bitmask_WILDCARD to 8bitmask(1, 0x1) (matches odd numbers) or 8bitmask(0, 0x1) (matches even numbers); lookup doesn't return the correct result. See acl_8last3.c. I noticed similar behavior for uint16_t ranges (data matches neither 0-0x8000 nor 0x8001-0x). The above behaviors are tricky. Does ACL make some undocumented assumptions about the table schema? Regards, Zhichang Yu
[dpdk-dev] ACL lookup doesn't work for some schemes
Sorry I forgot to attach the sample code in previous mail. See the attached. From: yuzhichang_...@hotmail.com To: dev at dpdk.org Subject: ACL lookup doesn't work for some schemes Date: Sun, 15 Feb 2015 17:18:55 +0800 Hi, I noticed that ACL lookup doesn't work for some schemes. 1. If the first field is not uint8_t, even all fields are wildcard, lookup doesn't find the matching rule. See acl_8last.c. 2. I prepended a uint8_t field, keep other fields be wildcard, lookup returns the correct result. See acl_8last2.c 3. Then I change last field from 8bitmask_WILDCARD to 8bitmask(1, 0x1) (matches odd numbers) or 8bitmask(0, 0x1) (match even numbers), lookup doesn't return the correct. See acl_8last3.c. And I noticed the similar behavior for uint16_t ranges (date doesn't match 0-0x8000 nor 0x8001-0x). Above behaviors are tricky. Does ACL do some undocumented assumptions or the table schema? Regards, Zhichang Yu -- next part -- A non-text attachment was scrubbed... Name: poc_acl_mp.zip Type: application/zip Size: 14549 bytes Desc: not available URL: <http://dpdk.org/ml/archives/dev/attachments/20150215/1553d3ad/attachment-0001.zip>
[dpdk-dev] ACL lookup doesn't work for some schemes
I tested against DPDK 1.7.0, 1.8.0 and trunk (ed2547b6).

> From: yuzhichang_scl at hotmail.com
> To: dev at dpdk.org
> Subject: RE: ACL lookup doesn't work for some schemes
> Date: Sun, 15 Feb 2015 17:23:53 +0800
>
> Sorry I forgot to attach the sample code in previous mail. See the attached.
>
>
> From: yuzhichang_scl at hotmail.com
> To: dev at dpdk.org
> Subject: ACL lookup doesn't work for some schemes
> Date: Sun, 15 Feb 2015 17:18:55 +0800
>
> Hi,
> I noticed that ACL lookup doesn't work for some schemes.
> 1. If the first field is not uint8_t, even all fields are wildcard,
> lookup doesn't find the matching rule. See acl_8last.c.
> 2. I prepended a uint8_t field, keep other fields be wildcard, lookup
> returns the correct result. See acl_8last2.c
> 3. Then I change last field from 8bitmask_WILDCARD to 8bitmask(1, 0x1)
> (matches odd numbers) or 8bitmask(0, 0x1) (match even numbers), lookup
> doesn't return the correct. See acl_8last3.c. And I noticed the
> similar behavior for uint16_t ranges(date doesn't match 0-0x8000 nor
> 0x8001-0x).
> Above behaviors are tricky. Does ACL do some
> undocumented assumptions or the table schema?
> Regards,
> Zhichang Yu
[dpdk-dev] [PATCH v6 12/19] malloc: fix the issue of SOCKET_ID_ANY
On Sun, Feb 15, 2015 at 12:43:03AM +, Liang, Cunming wrote:
> Hi,
>
> > -Original Message-
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Saturday, February 14, 2015 1:57 AM
> > To: Liang, Cunming
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v6 12/19] malloc: fix the issue of
> > SOCKET_ID_ANY
> >
> > On Fri, Feb 13, 2015 at 09:38:14AM +0800, Cunming Liang wrote:
> > > Add check for rte_socket_id(), avoid get unexpected return like (-1).
> > >
> > > Signed-off-by: Cunming Liang
> > > ---
> > > lib/librte_malloc/malloc_heap.h | 7 ++-
> > > 1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/lib/librte_malloc/malloc_heap.h
> > > b/lib/librte_malloc/malloc_heap.h
> > > index b4aec45..a47136d 100644
> > > --- a/lib/librte_malloc/malloc_heap.h
> > > +++ b/lib/librte_malloc/malloc_heap.h
> > > @@ -44,7 +44,12 @@ extern "C" {
> > > static inline unsigned
> > > malloc_get_numa_socket(void)
> > > {
> > > - return rte_socket_id();
> > > + unsigned socket_id = rte_socket_id();
> > > +
> > > + if (socket_id == (unsigned)SOCKET_ID_ANY)
> > > + return 0;
> > > +
> > > + return socket_id;
> > Why is -1 unexpected? Isn't it reasonable to assume that some memory is
> > equidistant from all cpu numa nodes?
> [LCM] One piece of memory will be whole allocated from one specific NUMA
> node. But won't be like some part from one and the other part from another.
> If no specific NUMA node assigned(SOCKET_ID_ANY/-1), it firstly asks for the
> current NUMA node where current core belongs to.
> 'malloc_get_numa_socket()' is called on that time. When the time 1:1
> thread/core mapping is assumed and the default value is 0, it always will
> return a none (-1) value.
> Now rte_socket_id() may return -1 in the case the pthread runs on multi-cores
> which are not belongs to one NUMA node, or in the case _socket_id is not yet
> assigned and the default value is (-1). So if current _socket_id is -1, then
> just pick up the first node as the candidate. Probably I shall add more
> comments for this.
>
>
Ok, but doesn't that provide an abnormal bias for node 0? I was thinking it
might be better to be honest with the application so that it can choose a node
according to its own policy.

Neil

> > Neil
> >
> > > }
> > >
> > > void *
> > > --
> > > 1.8.1.4
> > >
> > >
>
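To make the two positions in this exchange concrete, here is a small sketch under stated assumptions. The stub fake_rte_socket_id() and both function names are hypothetical stand-ins, not DPDK code; only the SOCKET_ID_ANY value of -1 and the (unsigned) comparison come from the quoted patch. The first function mirrors the patch (fall back to node 0); the second shows one possible shape of the alternative Neil raises, where "any" is passed through so the caller can apply its own policy.

```c
#define SOCKET_ID_ANY -1	/* "no specific socket", as in the quoted patch */

/* Hypothetical stand-in for rte_socket_id(); a test can set the value. */
static int fake_socket_id = SOCKET_ID_ANY;

static unsigned
fake_rte_socket_id(void)
{
	return (unsigned)fake_socket_id;
}

/* Behaviour of the patch: silently fall back to node 0. */
static unsigned
numa_socket_default_zero(void)
{
	unsigned socket_id = fake_rte_socket_id();

	if (socket_id == (unsigned)SOCKET_ID_ANY)
		return 0;

	return socket_id;
}

/* Alternative sketched from the reply: report "any", let the caller choose. */
static int
numa_socket_or_any(void)
{
	unsigned socket_id = fake_rte_socket_id();

	if (socket_id == (unsigned)SOCKET_ID_ANY)
		return SOCKET_ID_ANY;

	return (int)socket_id;
}
```

The trade-off is the one discussed above: the first form keeps callers simple but biases allocations toward node 0, while the second pushes the decision to the application.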
[dpdk-dev] [PATCH 1/4] xen: allow choosing dom0 support at runtime
On Sun, 15 Feb 2015 04:07:21 + "Liu, Jijiang" wrote:
> Hi Stephen,
>
> What do you mean ' allow choosing dom0 support at runtime'?
> If you mean user can choose DPDK to run Xen Dom0 or not on DOM0 by a runtime
> flag, I don't think your change can achieve this goal.
>
> Thanks
> Jijiang Liu

With the existing DPDK model, if an application is built with DOM0 support it will not work (it crashes) when run in a non-DOM0 environment (with real huge pages). Conversely, if an application is built without DOM0 support, it will crash when run in Xen Paravirt mode.

This patch allows the library to be built in such a way that only one version needs to be shipped, which is important for distros like RHEL that want to ship a shared library. It is also important for users like Brocade/Vyatta who build one binary that needs to work on bare Linux and in Xen PV mode.
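The difference between a build-time and a run-time choice can be sketched as follows. Everything prefixed sketch_ is hypothetical and only imitates the shape of the is_xen_dom0_supported() accessor added in patch 1/5; the address translation is a made-up fixed offset, not what Xen does.

```c
#include <stdint.h>

/* Hypothetical stand-in for the EAL internal configuration. */
struct sketch_config {
	unsigned xen_dom0_support;	/* set once at startup from a flag */
};

static struct sketch_config sketch_cfg;

/* Same shape as the accessor the patch adds: read the flag at run time. */
static int
sketch_is_xen_dom0_supported(void)
{
	return sketch_cfg.xen_dom0_support != 0;
}

/* Made-up translation: pretend machine addresses sit at a fixed offset. */
static uint64_t
sketch_phys_to_machine(uint64_t phys)
{
	return phys + 0x100000000ULL;
}

/* One binary, two behaviours: the branch is decided when the code runs. */
static uint64_t
sketch_dma_addr(uint64_t phys)
{
	if (sketch_is_xen_dom0_supported())
		return sketch_phys_to_machine(phys);

	return phys;
}
```

With a compile-time #ifdef alone, only one of the two return paths would exist in a given binary, which is the shipping problem described above.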
[dpdk-dev] [PATCH 1/5] xen: allow choosing dom0 support at runtime
The previous code would only allow building library and application so that it ran on Xen DOM0 or not on DOM0. This changes that to a runtime flag. Signed-off-by: Stephen Hemminger --- v2 -- fix i40e as well lib/librte_eal/common/include/rte_memory.h | 4 +++ lib/librte_eal/linuxapp/eal/eal_memory.c | 7 lib/librte_ether/rte_ethdev.c | 22 lib/librte_ether/rte_ethdev.h | 23 lib/librte_mempool/rte_mempool.c | 26 +++--- lib/librte_pmd_e1000/em_rxtx.c | 30 +++- lib/librte_pmd_e1000/igb_rxtx.c| 52 +-- lib/librte_pmd_i40e/i40e_ethdev.c | 16 + lib/librte_pmd_i40e/i40e_fdir.c| 8 +++-- lib/librte_pmd_i40e/i40e_rxtx.c| 57 + lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 58 +- 11 files changed, 156 insertions(+), 147 deletions(-) diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h index 7f8103f..ab6c1ff 100644 --- a/lib/librte_eal/common/include/rte_memory.h +++ b/lib/librte_eal/common/include/rte_memory.h @@ -176,6 +176,10 @@ unsigned rte_memory_get_nchannel(void); unsigned rte_memory_get_nrank(void); #ifdef RTE_LIBRTE_XEN_DOM0 + +/**< Internal use only - should DOM0 memory mapping be used */ +extern int is_xen_dom0_supported(void); + /** * Return the physical address of elt, which is an element of the pool mp. 
* diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index a67a1b0..4afda2a 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -98,6 +98,13 @@ #include "eal_filesystem.h" #include "eal_hugepages.h" +#ifdef RTE_LIBRTE_XEN_DOM0 +int is_xen_dom0_supported(void) +{ + return internal_config.xen_dom0_support; +} +#endif + /** * @file * Huge page mapping under linux diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index ea3a1fb..457e0bc 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -2825,6 +2825,27 @@ _rte_eth_dev_callback_process(struct rte_eth_dev *dev, } rte_spinlock_unlock(&rte_eth_dev_cb_lock); } + +const struct rte_memzone * +rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name, +uint16_t queue_id, size_t size, unsigned align, +int socket_id) +{ + char z_name[RTE_MEMZONE_NAMESIZE]; + const struct rte_memzone *mz; + + snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d", +dev->driver->pci_drv.name, ring_name, +dev->data->port_id, queue_id); + + mz = rte_memzone_lookup(z_name); + if (mz) + return mz; + + return rte_memzone_reserve_bounded(z_name, size, + socket_id, 0, align, RTE_PGSIZE_2M); +} + #ifdef RTE_NIC_BYPASS int rte_eth_dev_bypass_init(uint8_t port_id) { @@ -3003,6 +3024,7 @@ rte_eth_dev_bypass_wd_reset(uint8_t port_id) (*dev->dev_ops->bypass_wd_reset)(dev); return 0; } + #endif int diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index 1200c1c..747acb5 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -3664,6 +3664,29 @@ int rte_eth_dev_filter_supported(uint8_t port_id, enum rte_filter_type filter_ty int rte_eth_dev_filter_ctrl(uint8_t port_id, enum rte_filter_type filter_type, enum rte_filter_op filter_op, void *arg); +/** + * Create memzone for HW rings. + * malloc can't be used as the physical address is needed. 
+ * If the memzone is already created, then this function returns a ptr + * to the old one. + * + * @param eth_dev + * The *eth_dev* pointer is the address of the *rte_eth_dev* structure + * @param name + * The name of the memory zone + * @param queue_id + * The index of the queue to add to name + * @param size + * The sizeof of the memory area + * @param align + * Alignment for resulting memzone. Must be a power of 2. + * @param socket_id + * The *socket_id* argument is the socket identifier in case of NUMA. + */ +const struct rte_memzone * +rte_eth_dma_zone_reserve(const struct rte_eth_dev *eth_dev, const char *name, +uint16_t queue_id, size_t size, +unsigned align, int socket_id); #ifdef __cplusplus } #endif diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index 4cf6c25..5056a4f 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -372,19 +372,21 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size, int socket_id, unsigned flags) { #ifdef RTE_LIBRTE_XEN_DOM0 - return (rte_dom0_mempool_create(name, n, elt_size, - cache_si
[dpdk-dev] [PATCH 2/5] enic: fix device to work with Xen DOM0
It is possible to pass through a PCI device when running in Xen Paravirt
mode. The device driver has to accommodate this by using memory zones
differently. This patch models the memory allocation for the ENIC device
on the changes already done for ixgbe and igb.

Build tested only; has not been tested on ENIC hardware.
---
v2 -- this patch is added

 lib/librte_pmd_enic/enic_main.c | 19 ---
 lib/librte_pmd_enic/vnic/vnic_dev.c | 19 +++
 2 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/lib/librte_pmd_enic/enic_main.c b/lib/librte_pmd_enic/enic_main.c
index 48fdca2..0be5172 100644
--- a/lib/librte_pmd_enic/enic_main.c
+++ b/lib/librte_pmd_enic/enic_main.c
@@ -537,8 +537,14 @@ enic_alloc_consistent(__rte_unused void *priv, size_t size,
 	const struct rte_memzone *rz;
 	*dma_handle = 0;
-	rz = rte_memzone_reserve_aligned((const char *)name,
-		size, 0, 0, ENIC_ALIGN);
+#ifdef RTE_LIBRTE_XEN_DOM0
+	if (is_xen_dom0_supported())
+		rz = rte_memzone_reserve_bounded((char *)name, size,
+			0, 0, ENIC_ALIGN, RTE_PGSIZE_2M);
+	else
+#endif
+		rz = rte_memzone_reserve_aligned((char *)name, size,
+			0, 0, ENIC_ALIGN);
 	if (!rz) {
 		pr_err("%s : Failed to allocate memory requested for %s",
 			__func__, name);
@@ -546,7 +552,14 @@ enic_alloc_consistent(__rte_unused void *priv, size_t size,
 	}
 	vaddr = rz->addr;
-	*dma_handle = (dma_addr_t)rz->phys_addr;
+
+#ifdef RTE_LIBRTE_XEN_DOM0
+	if (is_xen_dom0_supported())
+		*dma_handle = rte_mem_phy2mch(rz->memseg_id,
+			rz->phys_addr);
+	else
+#endif
+		*dma_handle = (dma_addr_t)rz->phys_addr;
 	return vaddr;
 }
diff --git a/lib/librte_pmd_enic/vnic/vnic_dev.c b/lib/librte_pmd_enic/vnic/vnic_dev.c
index 6407994..e660aaf 100644
--- a/lib/librte_pmd_enic/vnic/vnic_dev.c
+++ b/lib/librte_pmd_enic/vnic/vnic_dev.c
@@ -276,9 +276,14 @@ int vnic_dev_alloc_desc_ring(__attribute__((unused)) struct vnic_dev *vdev,
 	vnic_dev_desc_ring_size(ring, desc_count, desc_size);
-	rz = rte_memzone_reserve_aligned(z_name,
-		ring->size_unaligned, socket_id,
-		0, ENIC_ALIGN);
+#ifdef RTE_LIBRTE_XEN_DOM0
+	if (is_xen_dom0_supported())
+		rz = rte_memzone_reserve_bounded(z_name, ring->size_unaligned,
+			socket_id, 0, ENIC_ALIGN, RTE_PGSIZE_2M);
+	else
+#endif
+		rz = rte_memzone_reserve_aligned(z_name, ring->size_unaligned,
+			socket_id, 0, ENIC_ALIGN);
 	if (!rz) {
 		pr_err("Failed to allocate ring (size=%d), aborting\n",
 			(int)ring->size);
@@ -292,7 +297,13 @@ int vnic_dev_alloc_desc_ring(__attribute__((unused)) struct vnic_dev *vdev,
 		return -ENOMEM;
 	}
-	ring->base_addr_unaligned = (dma_addr_t)rz->phys_addr;
+#ifdef RTE_LIBRTE_XEN_DOM0
+	if (is_xen_dom0_supported())
+		ring->base_addr_unaligned = rte_mem_phy2mch(rz->memseg_id,
+			rz->phys_addr);
+	else
+#endif
+		ring->base_addr_unaligned = (dma_addr_t)rz->phys_addr;
 	ring->base_addr = ALIGN(ring->base_addr_unaligned,
 		ring->base_align);
-- 
2.1.4
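In the DOM0 branch the patch passes RTE_PGSIZE_2M as the bound argument of rte_memzone_reserve_bounded(), which asks for a zone that does not cross a 2 MB boundary; presumably this is because machine contiguity is only tracked per 2 MB block. The property itself can be sketched as a check. The helper and macro names below are hypothetical and not part of DPDK.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PGSIZE_2M (1ULL << 21)	/* 2 MB, same value as RTE_PGSIZE_2M */

/*
 * Hypothetical check: true when [addr, addr + len) lies inside a single
 * bound-sized, bound-aligned window. bound must be a power of two.
 */
static bool
sketch_within_bound(uint64_t addr, uint64_t len, uint64_t bound)
{
	if (len == 0)
		return true;
	if (len > bound)
		return false;

	/* First and last byte must fall into the same window. */
	return (addr & ~(bound - 1)) == ((addr + len - 1) & ~(bound - 1));
}
```

A ring that starts 64 bytes before a 2 MB boundary and is 128 bytes long would fail this check, which is the situation the bounded reservation is meant to rule out.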
[dpdk-dev] [PATCH 3/5] xen: add phys-addr command line argument
Allow overriding default Xen DOM0 behavior to use physical addresses
instead of mfn

Signed-off-by: Stephen Hemminger
---
v2 -- no changes

 lib/librte_eal/common/eal_common_options.c | 5 +
 lib/librte_eal/common/eal_internal_cfg.h | 1 +
 lib/librte_eal/common/eal_options.h | 2 ++
 lib/librte_eal/common/include/rte_memory.h | 3 +++
 lib/librte_eal/linuxapp/eal/eal_memory.c | 5 +
 lib/librte_mempool/rte_dom0_mempool.c | 10 --
 6 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 67e02dc..1742364 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -83,6 +83,7 @@ eal_long_options[] = {
 	{OPT_LOG_LEVEL, 1, NULL, OPT_LOG_LEVEL_NUM},
 	{OPT_BASE_VIRTADDR, 1, 0, OPT_BASE_VIRTADDR_NUM},
 	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
+	{OPT_XEN_PHYS_ADDR, 0, 0, OPT_XEN_PHYS_ADDR_NUM},
 	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
 	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
 	{0, 0, 0, 0}
@@ -491,6 +492,10 @@ eal_parse_common_option(int opt, const char *optarg,
 		}
 		conf->log_level = log;
 		break;
+
+	case OPT_XEN_PHYS_ADDR_NUM:
+		conf->xen_phys_addr_support = 1;
+		break;
 	}
 	/* don't know what to do, leave this to caller */
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index e2ecb0d..41b4169 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -65,6 +65,7 @@ struct internal_config {
 	volatile unsigned force_nrank; /**< force number of ranks */
 	volatile unsigned no_hugetlbfs; /**< true to disable hugetlbfs */
 	volatile unsigned xen_dom0_support; /**< support app running on Xen Dom0*/
+	volatile unsigned xen_phys_addr_support; /**< support phys addr */
 	volatile unsigned no_pci; /**< true to disable PCI */
 	volatile unsigned no_hpet; /**< true to disable HPET */
 	volatile unsigned vmware_tsc_map; /**< true to use VMware TSC mapping
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e476f8d..8aee959 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -73,6 +73,8 @@ enum {
 	OPT_BASE_VIRTADDR_NUM,
 #define OPT_XEN_DOM0 "xen-dom0"
 	OPT_XEN_DOM0_NUM,
+#define OPT_XEN_PHYS_ADDR "xen-phys-addr"
+	OPT_XEN_PHYS_ADDR_NUM,
 #define OPT_CREATE_UIO_DEV "create-uio-dev"
 	OPT_CREATE_UIO_DEV_NUM,
 #define OPT_VFIO_INTR "vfio-intr"
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index ab6c1ff..c3b8a98 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -180,6 +180,9 @@ unsigned rte_memory_get_nrank(void);
 /**< Internal use only - should DOM0 memory mapping be used */
 extern int is_xen_dom0_supported(void);
+/**< Internal use only - should DOM0 use physical addresses instead of mfn */
+extern int is_xen_phys_addr_supported(void);
+
 /**
  * Return the physical address of elt, which is an element of the pool mp.
  *
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 4afda2a..a759ac9 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -103,6 +103,11 @@ int is_xen_dom0_supported(void)
 {
 	return internal_config.xen_dom0_support;
 }
+
+int is_xen_phys_addr_supported(void)
+{
+	return internal_config.xen_phys_addr_support;
+}
 #endif
 /**
diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c
index 9ec68fb..ab35826 100644
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ b/lib/librte_mempool/rte_dom0_mempool.c
@@ -74,8 +74,14 @@ get_phys_map(void *va, phys_addr_t pa[], uint32_t pg_num,
 	virt_addr =(uintptr_t) mcfg->memseg[memseg_id].addr;
 	for (i = 0; i != pg_num; i++) {
-		mfn_id = ((uintptr_t)va + i * pg_sz - virt_addr) / RTE_PGSIZE_2M;
-		pa[i] = mcfg->memseg[memseg_id].mfn[mfn_id] * page_size;
+		if (!is_xen_phys_addr_supported()) {
+			mfn_id = ((uintptr_t)va + i * pg_sz -
+				virt_addr) / RTE_PGSIZE_2M;
+			pa[i] = mcfg->memseg[memseg_id].mfn[mfn_id] * page_size;
+		} else {
+			pa[i] = mcfg->memseg[memseg_id].phys_addr + i * pg_sz +
+				(uintptr_t)va - virt_addr;
+		}
 	}
 }
-- 
2.1.4
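The address arithmetic in the get_phys_map() hunk can be tried in isolation with a simplified model. The struct and function below are hypothetical reductions of the DPDK structures (a single segment, plain integers for addresses); only the two formulas are taken from the patch, with the shared offset computed once.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PGSIZE_2M (1ULL << 21)	/* 2 MB, same value as RTE_PGSIZE_2M */

/* Hypothetical, simplified stand-in for one DPDK memory segment. */
struct sketch_memseg {
	uint64_t virt_addr;	/* start of the segment in virtual memory */
	uint64_t phys_addr;	/* start of the segment in physical memory */
	const uint64_t *mfn;	/* one machine frame number per 2 MB block */
};

static void
sketch_get_phys_map(const struct sketch_memseg *ms, uint64_t va,
		uint64_t pa[], size_t pg_num, uint64_t pg_sz,
		uint64_t page_size, int use_phys_addr)
{
	size_t i;

	for (i = 0; i != pg_num; i++) {
		/* Offset of page i from the start of the segment. */
		uint64_t off = va + i * pg_sz - ms->virt_addr;

		if (!use_phys_addr)
			/* Default DOM0 path: look up the 2 MB block's mfn. */
			pa[i] = ms->mfn[off / SKETCH_PGSIZE_2M] * page_size;
		else
			/* --xen-phys-addr path: offset from the physical base. */
			pa[i] = ms->phys_addr + off;
	}
}
```

With use_phys_addr set, the result depends only on the segment's physical base and the offset; the mfn table is not consulted at all, which is the override the commit message describes.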
[dpdk-dev] [PATCH 4/5] xen: add uio driver
New uio helper kernel driver for Xen netfront UIO poll mode driver. Signed-off-by: Stephen Hemminger --- v2 -- use PMD_REGISTER lib/librte_eal/linuxapp/Makefile | 3 + lib/librte_eal/linuxapp/xen_uio/Makefile | 55 ++ lib/librte_eal/linuxapp/xen_uio/xen_uio.c | 837 ++ 3 files changed, 895 insertions(+) create mode 100644 lib/librte_eal/linuxapp/xen_uio/Makefile create mode 100644 lib/librte_eal/linuxapp/xen_uio/xen_uio.c diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile index 8fcfdf6..d3893e5 100644 --- a/lib/librte_eal/linuxapp/Makefile +++ b/lib/librte_eal/linuxapp/Makefile @@ -41,5 +41,8 @@ endif ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y) DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += xen_dom0 endif +ifeq ($(CONFIG_RTE_LIBRTE_XEN_PMD),y) +DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += xen_uio +endif include $(RTE_SDK)/mk/rte.subdir.mk diff --git a/lib/librte_eal/linuxapp/xen_uio/Makefile b/lib/librte_eal/linuxapp/xen_uio/Makefile new file mode 100644 index 000..25a9f35 --- /dev/null +++ b/lib/librte_eal/linuxapp/xen_uio/Makefile @@ -0,0 +1,55 @@ +# BSD LICENSE +# +# Copyright (c) 2013-2015 Brocade Communications Systems, Inc. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. 
+# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +# + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# module name and path +# +MODULE = xen_uio +MODULE_PATH = drivers/net/xen_uio + +# +# CFLAGS +# +MODULE_CFLAGS += -I$(SRCDIR) --param max-inline-insns-single=100 +MODULE_CFLAGS += -I$(RTE_OUTPUT)/include +MODULE_CFLAGS += -Winline -Wall -Werror +MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h + +# +# all source are stored in SRCS-y +# +SRCS-y := xen_uio.c + + +include $(RTE_SDK)/mk/rte.module.mk diff --git a/lib/librte_eal/linuxapp/xen_uio/xen_uio.c b/lib/librte_eal/linuxapp/xen_uio/xen_uio.c new file mode 100644 index 000..b25b1f3 --- /dev/null +++ b/lib/librte_eal/linuxapp/xen_uio/xen_uio.c @@ -0,0 +1,837 @@ +/* + * Virtual network driver for conversing with remote driver backends. + * + * Copyright (c) 2002-2005, K A Fraser + * Copyright (c) 2005, XenSource Ltd + * Copyright (c) 2013-2015 Brocade Communications Systems, Inc. 
+ * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version 2 + * as published by the Free Software Foundation; or, when distributed + * separately from the Linux kernel or incorporated into other + * software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TO
[dpdk-dev] [PATCH 5/5] xen: net-front poll mode driver
This driver implements DPDK driver that has the same functionality as net-front driver in Linux kernel. Signed-off-by: Stephen Hemminger --- v2 -- no changes config/common_linuxapp| 6 + lib/Makefile | 1 + lib/librte_pmd_xen/Makefile | 30 ++ lib/librte_pmd_xen/virt_dev.c | 400 + lib/librte_pmd_xen/virt_dev.h | 30 ++ lib/librte_pmd_xen/xen_adapter_info.h | 64 lib/librte_pmd_xen/xen_dev.c | 375 +++ lib/librte_pmd_xen/xen_dev.h | 97 ++ lib/librte_pmd_xen/xen_logs.h | 23 ++ lib/librte_pmd_xen/xen_rxtx.c | 546 ++ lib/librte_pmd_xen/xen_rxtx.h | 110 +++ mk/rte.app.mk | 4 + 12 files changed, 1686 insertions(+) create mode 100644 lib/librte_pmd_xen/Makefile create mode 100644 lib/librte_pmd_xen/virt_dev.c create mode 100644 lib/librte_pmd_xen/virt_dev.h create mode 100644 lib/librte_pmd_xen/xen_adapter_info.h create mode 100644 lib/librte_pmd_xen/xen_dev.c create mode 100644 lib/librte_pmd_xen/xen_dev.h create mode 100644 lib/librte_pmd_xen/xen_logs.h create mode 100644 lib/librte_pmd_xen/xen_rxtx.c create mode 100644 lib/librte_pmd_xen/xen_rxtx.h diff --git a/config/common_linuxapp b/config/common_linuxapp index d428f84..668fc8d 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -232,6 +232,12 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y CONFIG_RTE_LIBRTE_PMD_XENVIRT=n # +# Compile XEN net-front PMD driver +# +CONFIG_RTE_LIBRTE_XEN_PMD=n +CONFIG_RTE_LIBRTE_XEN_DEBUG_INIT=n + +# # Do prefetch of packet data within PMD driver receive function # CONFIG_RTE_PMD_PACKET_PREFETCH=y diff --git a/lib/Makefile b/lib/Makefile index d617d81..f405e40 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -52,6 +52,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += librte_pmd_af_packet DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt +DIRS-$(CONFIG_RTE_LIBRTE_XEN_PMD) += librte_pmd_xen DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost 
DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm diff --git a/lib/librte_pmd_xen/Makefile b/lib/librte_pmd_xen/Makefile new file mode 100644 index 000..d294d03 --- /dev/null +++ b/lib/librte_pmd_xen/Makefile @@ -0,0 +1,30 @@ +# +# Copyright (c) 2013-2015 Brocade Communications Systems, Inc. +# All rights reserved. +# + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# library name +# +LIB = librte_pmd_xen.a + +CFLAGS += -O3 +CFLAGS += $(WERROR_FLAGS) + +VPATH += $(RTE_SDK)/lib/librte_pmd_xen + +# +# all source are stored in SRCS-y +# +SRCS-$(CONFIG_RTE_LIBRTE_XEN_PMD) += virt_dev.c +SRCS-$(CONFIG_RTE_LIBRTE_XEN_PMD) += xen_dev.c +SRCS-$(CONFIG_RTE_LIBRTE_XEN_PMD) += xen_rxtx.c + +# this lib depends upon: +DEPDIRS-$(CONFIG_RTE_LIBRTE_XEN_PMD) += lib/librte_eal lib/librte_ether +DEPDIRS-$(CONFIG_RTE_LIBRTE_XEN_PMD) += lib/librte_mempool lib/librte_mbuf +DEPDIRS-$(CONFIG_RTE_LIBRTE_XEN_PMD) += lib/librte_net lib/librte_malloc + +include $(RTE_SDK)/mk/rte.lib.mk diff --git a/lib/librte_pmd_xen/virt_dev.c b/lib/librte_pmd_xen/virt_dev.c new file mode 100644 index 000..f824977 --- /dev/null +++ b/lib/librte_pmd_xen/virt_dev.c @@ -0,0 +1,400 @@ +/* + * Copyright (c) 2013-2015 Brocade Communications Systems, Inc. + * All rights reserved. 
+ */ + +#include +#include +#include +#include + +#include +#include + +#include +#include + +#include "virt_dev.h" + +struct uio_map { + void *addr; + uint64_t offset; + uint64_t size; + uint64_t phaddr; +}; + +struct uio_resource { + TAILQ_ENTRY(uio_resource) next; + struct rte_pci_addr pci_addr; + char path[PATH_MAX]; + size_t nb_maps; + struct uio_map maps[PCI_MAX_RESOURCE]; +}; + +static int +virt_parse_sysfs_value(const char *filename, uint64_t *val) +{ + FILE *f; + char buf[BUFSIZ]; + char *end = NULL; + + f = fopen(filename, "r"); + if (f == NULL) { + RTE_LOG(ERR, EAL, "cannot open sysfs value %s", filename); + return -1; + } + + if (fgets(buf, sizeof(buf), f) == NULL) { + RTE_LOG(ERR, EAL, "cannot read sysfs value %s", filename); + fclose(f); + return -1; + } + + *val = strtoull(buf, &end, 0); + if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) { + RTE_LOG(ERR, EAL, "cannot parse sysfs value %s", filename); + fclose(f); + return -1; + } + + fclose(f); + return 0; +} + +#define OFF_MAX ((uint64_t)(off_t)-1) +static ssize_t +virt_uio_get_mappings(const char *devname, struct uio_map maps[], + size_t nb_maps) +{ + size_t i; + char dirname[PATH_MA
[dpdk-dev] [PATCH v2] testpmd: fix port parsing in show port info command
> > the port number type should be consistent with librte_cmdline, > > else there is potential endian issue. > > > > Signed-off-by: Xuelin Shi > > Acked-by: Olivier Matz Applied, thanks
[dpdk-dev] [PATCH] testpmd: remove duplicated parameter parsing
> > Several parameters were being parsed twice in testpmd, so this patch gets > > rid of the second parsing. > > > > Signed-off-by: Pablo de Lara > > > Acked-by: Sergio Gonzalez Monroy Applied, thanks
[dpdk-dev] [PATCH] testpmd: remove incorrect parameter limits in help command line
> > Ring threshold parameters an RX/TX queue (pthresh, wthresh and hthresh) > > had an incorrect range of values shown in help command line. > > > > Signed-off-by: Pablo de Lara > > > Acked-by: Sergio Gonzalez Monroy Applied, thanks
[dpdk-dev] [PATCH] testpmd: force user to stop forwarding when changing port/core list
> > Testpmd has the capability of changing the forwarding cores and ports in > > runtime. > > If these are changed when forwarding, two issues may be encountered: > > > > - If "show config fwd" is used, changes made in the core list are applied. > > Therefore, trying to stop forwarding may hang testpmd, > > since it could be waiting for cores to stop that are not actually running > > anything > > > > - If the port list is changed, when stopping forwarding, > > it may miss the stats of some of the ports that were actually being used. > > > > Signed-off-by: Pablo de Lara > Acked-by: Helin Zhang Applied, thanks
[dpdk-dev] [PATCH] testpmd: use default rx/tx port configuration values
> > Function to get rx/tx port configuration from the PMDs was added in > > previous release to simplify the port configuration in all sample apps, > > but testpmd was not modified. > > > > This patch makes testpmd get the default rx/tx port configuration, but > > still uses the parameters passed by the command line. > > > > This patch depends on patch "testpmd: remove duplicated parameter parsing" > > (http://dpdk.org/dev/patchwork/patch/3015) > > > > Signed-off-by: Pablo de Lara > > Acked-by: John McNamara Applied, thanks
[dpdk-dev] [PATCH v1] ixgbe: fix link issue in loopback mode
> > In loopback mode, it's expected force link up even when there's no cable > > connect. > > But in codes, setup_sfp() rewrites the related register. > > It causes in the case 'multispeed_fiber', it can't link up without cable > > connect. > > > > Signed-off-by: Cunming Liang > Acked-by: Patrick Lu Applied, thanks