Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
Hi > -Original Message- > From: Neil Horman [mailto:nhor...@tuxdriver.com] > Sent: Friday, December 1, 2017 2:10 PM > To: Gaëtan Rivet > Cc: Matan Azrad ; Thomas Monjalon > ; Jingjing Wu ; > dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote: > > Hello Matan, Neil, > > > > I like the port ownership concept. I think it is needed to clarify > > some operations and should be useful to several subsystems. > > > > This patch could certainly be sub-divided however, and your current > > 1/5 should probably come after this one. > > > > Some comments inline. > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote: > > > On Tue, Nov 28, 2017 at 11:57:58AM +, Matan Azrad wrote: > > > > The ownership of a port is implicit in DPDK. > > > > Making it explicit is better from the next reasons: > > > > 1. It may be convenient for multi-process applications to know which > > > >process is in charge of a port. > > > > 2. A library could work on top of a port. > > > > 3. A port can work on top of another port. > > > > > > > > Also in the fail-safe case, an issue has been met in testpmd. > > > > We need to check that the user is not trying to use a port which > > > > is already managed by fail-safe. > > > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple > > > > management of a device by different DPDK entities. > > > > > > > > A port owner is built from owner id(number) and owner name(string) > > > > while the owner id must be unique to distinguish between two > > > > identical entity instances and the owner name can be any name. > > > > The name helps to logically recognize the owner by different DPDK > > > > entities and allows easy debug. > > > > Each DPDK entity can allocate an owner unique identifier and can > > > > use it and its preferred name to owns valid ethdev ports. > > > > Each DPDK entity can get any port owner status to decide if it can > > > > manage the port or not. > > > > > > > > The current ethdev internal port management is not affected by > > > > this feature. > > > > > > > > The internal port management is not affected, but the external > > interface is, however. In order to respect port ownership, > > applications are forced to modify their port iterator, as shown by your > testpmd patch. > > > > I think it would be better to modify the current RTE_ETH_FOREACH_DEV > > to call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that > > would represent the application itself (probably with the ID 0 and an > > owner string ""). Only with specific additional configuration should > > this default subset of ethdev be divided. > > > > This would make this evolution seamless for applications, at no cost > > to the complexity of the design. > > > > > > Signed-off-by: Matan Azrad > > > > > > > > > This seems fairly racy. What if one thread attempts to set > > > ownership on a port, while another is checking it on another cpu in > > > parallel. It doesn't seem like it will protect against that at all. > > > It also doesn't protect against the possibility of multiple threads > > > attempting to poll for rx in parallel, which I think was part of > > > Thomas's origional statement regarding port ownership (he noted that > > > the lockless design implied only a single thread should be allowed to poll > for receive or make configuration changes at a time. > > > > > > Neil > > > > > > > Isn't this race already there for any configuration operation / > > polling function? The DPDK arch is expecting applications to solve it. > > Why should port ownership be designed differently from other DPDK > components? > > > Yes, but that doesn't mean it should exist in purpituity, nor does it mean > that > your new api should contain it as well. > > > Embedding checks for port ownership within operations will force > > everyone to bear their costs, even those not interested in using this > > API. These checks should be kept outside, within the entity claiming > > ownership of the port, in the form of using the proper port iterator > > IMO. > > > No. At the very least, you need to make the API itself exclusive. That is to > say, you should at least ensure that a port ownership get call doesn't race > with a port ownership set call. It seems rediculous to just leave that sort > of > locking as an exercize to the user. > > Neil > Neil, As Thomas mentioned, a DPDK port is designed to be managed by only one thread (or synchronized DPDK entity). So all the port management includes port ownership shouldn't be synchronized, i.e. locks are not needed. If some application want to do dual thread port management, the responsibility to synchronize the port ownership or any other port management is on this application. Port ownership doesn't come to allow synchronized management of the port by two DPDK entities in parallel, it is just nice way to answer next questions: 1. I
Re: [dpdk-dev] [PATCH] app/testpmd: app/testpmd: add device removal command
Hi Ferruh, Yes I believe this patch is no more required. Kindest regards, Raslan Darawsheh -Original Message- From: Ferruh Yigit [mailto:ferruh.yi...@intel.com] Sent: Wednesday, November 29, 2017 12:02 AM To: Wu, Jingjing ; Gaëtan Rivet Cc: Raslan Darawsheh ; Thomas Monjalon ; dev@dpdk.org; Saleh Alsouqi Subject: Re: [dpdk-dev] [PATCH] app/testpmd: app/testpmd: add device removal command On 9/7/2017 1:17 AM, Wu, Jingjing wrote: >>> >>> Since dealing with device is kind of new, it can be OK to create new >>> command tree, but there are already hotplug commands per port: >>> "port attach #PCI|#VDEV_NAME" >>> "port detach #P" >>> >> >> Those two commands deal with the etherdev hotplug API. >> The new command should test the rte_dev one. >> > I was confused. The forwarding in testpmd setup is based on port id, right? > And all the devices listed in testpmd is etherdev, and all functional > testings are all based on port id, right? > Then what is the difference to use port id or device name in testpmd? > And if no difference, what is the difference between detach and remove? > > The only difference here I think is > Remove command is using rte_bus hotplug framework. > Attach/detach is using rte_eth_dev_detach API. Hi Raslan, With latest code "detach" does rte_bus hotplug remove. So I believe this patch is no more required. Can you please confirm? Thanks, ferruh > > I think remove command should be an important command, and should not have > ambiguity. > At least we need to doc it clearly. > > >> As Thomas pointed out, the etherdev one deals only with eth ports, >> while hotplug could be generalized to other devices, such as cryptodev. >> >>> perhaps it can be good to keep "attach", "detach" keywords for >>> device to be consistent? >>> >>> "device attach #name" >>> "device detach #name" >>> >> >> I made a point of naming the hotplug operations in rte_bus >> plug/unplug to avoid the confusion with the etherdev API. hotplug_add >> / hotplug_remove also marks the distinction. >> >> I don't know if it would be helpful for a developer writing a PMD, >> searching for a way to test a functionality to have an API name >> mismatch. >> >>> Also a show equivalent can be added to work in device level: >>> "show device info" >>> >> >> I think it would be useful. >> >>> >>> [1] >>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdp >>> dk.org%2Fml%2Farchives%2Fdev%2F2017-August%2F072764.html&data=02%7C0 >>> 1%7Crasland%40mellanox.com%7Cdde3d3196af240dc46e508d536abb3fd%7Ca652 >>> 971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636475033483509610&sdata=tddN >>> jwaa3H%2BNh2n7%2F%2BraWc82h9jNuFp9JXZx%2F%2BASH8M%3D&reserved=0 >>> >>> >> >> -- >> Gaëtan Rivet >> 6WIND >
Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
Hi Matan, > -Original Message- > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Matan Azrad > Sent: Sunday, December 3, 2017 8:05 AM > To: Neil Horman ; Gaëtan Rivet > Cc: Thomas Monjalon ; Wu, Jingjing > ; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership > > Hi > > > -Original Message- > > From: Neil Horman [mailto:nhor...@tuxdriver.com] > > Sent: Friday, December 1, 2017 2:10 PM > > To: Gaëtan Rivet > > Cc: Matan Azrad ; Thomas Monjalon > > ; Jingjing Wu ; > > dev@dpdk.org > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote: > > > Hello Matan, Neil, > > > > > > I like the port ownership concept. I think it is needed to clarify > > > some operations and should be useful to several subsystems. > > > > > > This patch could certainly be sub-divided however, and your current > > > 1/5 should probably come after this one. > > > > > > Some comments inline. > > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote: > > > > On Tue, Nov 28, 2017 at 11:57:58AM +, Matan Azrad wrote: > > > > > The ownership of a port is implicit in DPDK. > > > > > Making it explicit is better from the next reasons: > > > > > 1. It may be convenient for multi-process applications to know which > > > > >process is in charge of a port. > > > > > 2. A library could work on top of a port. > > > > > 3. A port can work on top of another port. > > > > > > > > > > Also in the fail-safe case, an issue has been met in testpmd. > > > > > We need to check that the user is not trying to use a port which > > > > > is already managed by fail-safe. > > > > > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple > > > > > management of a device by different DPDK entities. > > > > > > > > > > A port owner is built from owner id(number) and owner name(string) > > > > > while the owner id must be unique to distinguish between two > > > > > identical entity instances and the owner name can be any name. > > > > > The name helps to logically recognize the owner by different DPDK > > > > > entities and allows easy debug. > > > > > Each DPDK entity can allocate an owner unique identifier and can > > > > > use it and its preferred name to owns valid ethdev ports. > > > > > Each DPDK entity can get any port owner status to decide if it can > > > > > manage the port or not. > > > > > > > > > > The current ethdev internal port management is not affected by > > > > > this feature. > > > > > > > > > > > The internal port management is not affected, but the external > > > interface is, however. In order to respect port ownership, > > > applications are forced to modify their port iterator, as shown by your > > testpmd patch. > > > > > > I think it would be better to modify the current RTE_ETH_FOREACH_DEV > > > to call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that > > > would represent the application itself (probably with the ID 0 and an > > > owner string ""). Only with specific additional configuration should > > > this default subset of ethdev be divided. > > > > > > This would make this evolution seamless for applications, at no cost > > > to the complexity of the design. > > > > > > > > Signed-off-by: Matan Azrad > > > > > > > > > > > > This seems fairly racy. What if one thread attempts to set > > > > ownership on a port, while another is checking it on another cpu in > > > > parallel. It doesn't seem like it will protect against that at all. > > > > It also doesn't protect against the possibility of multiple threads > > > > attempting to poll for rx in parallel, which I think was part of > > > > Thomas's origional statement regarding port ownership (he noted that > > > > the lockless design implied only a single thread should be allowed to > > > > poll > > for receive or make configuration changes at a time. > > > > > > > > Neil > > > > > > > > > > Isn't this race already there for any configuration operation / > > > polling function? The DPDK arch is expecting applications to solve it. > > > Why should port ownership be designed differently from other DPDK > > components? > > > > > Yes, but that doesn't mean it should exist in purpituity, nor does it mean > > that > > your new api should contain it as well. > > > > > Embedding checks for port ownership within operations will force > > > everyone to bear their costs, even those not interested in using this > > > API. These checks should be kept outside, within the entity claiming > > > ownership of the port, in the form of using the proper port iterator > > > IMO. > > > > > No. At the very least, you need to make the API itself exclusive. That is > > to > > say, you should at least ensure that a port ownership get call doesn't race > > with a port ownership set call. It seems rediculous to just leave that > > sort of > > locking as an exercize to the user. > > > > Neil > > > Neil, > As Thomas mention
Re: [dpdk-dev] [PATCH V6 1/3] eal/arm64: remove the braces {} for dmb() and dsb()
-Original Message- > Date: Sun, 26 Nov 2017 18:00:22 -0800 > From: Jia He > To: jerin.ja...@caviumnetworks.com, dev@dpdk.org, > bruce.richard...@intel.com, konstantin.anan...@intel.com > Cc: olivier.m...@6wind.com, jianbo@arm.com, hemant.agra...@nxp.com, Jia > He , sta...@dpdk.org, Jia He > Subject: [PATCH V6 1/3] eal/arm64: remove the braces {} for dmb() and dsb() > X-Mailer: git-send-email 2.7.4 > > for the code as follows: > if (condition) > rte_smp_rmb(); > else > rte_smp_wmb(); > Without this patch, compiler will report this error: > error: 'else' without a previous 'if' > > Fixes: 84733fd0d75e ("eal/arm: fix memory barrier definition") > Cc: sta...@dpdk.org > Signed-off-by: Jia He Please fix the below checkpatch errors. Wrong tag: Suggested-by: Ananyev, Konstantin Wrong 'Fixes' reference: Fixes: 84733fd0d75e ("eal/arm: fix memory barrier definition") With above fix: Acked-by: Jerin Jacob
Re: [dpdk-dev] [PATCH v6] net/i40e: determine number of queues per VF during run time
Hi Wei, > -Original Message- > From: Dai, Wei > Sent: Monday, November 27, 2017 8:09 AM > To: Wu, Jingjing ; Xing, Beilei > ; Ananyev, Konstantin > Cc: dev@dpdk.org; Dai, Wei > Subject: [PATCH v6] net/i40e: determine number of queues per VF during run > time > > Without this patch, the number of queues per i40e VF is defined as 4 > by CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=4 in config/common_base. > It is fixed value determined in building time and can't be changed > during run time. > With this patch, the number of queues per i40e VF can be determinated > during run time. For example, if the PCI address of an i40e VF is > :bb.cc, with the EAL parameter -w :bb.cc,queue-num-per-vf=8, > the number of queues per VF is 8. > If there is no "queue-num-per-vf" setting in EAL parameters, it is 4 > by default as before. And if the value after the "queue-num-per-vf" > is invalid, it is set as 4 forcibly. The valid values include 1, 2, 4, > 8, 16 . > > Signed-off-by: Wei Dai > > --- > v6: > fix a small bug when detecting end character of strtoul > v5: > fix git log message and WARNING of coding stype > v4: > use rte_kvargs instead of pervious parsing function; > use malloc/free instead of rte_zmalloc/rte_free. > v3: > fix WARNING of coding style issues from checkpa...@dpdk.org > v2: > fix WARNING of coding style issues from checkpa...@dpdk.org > --- > config/common_base | 1 - > drivers/net/i40e/i40e_ethdev.c | 67 > -- > 2 files changed, 65 insertions(+), 3 deletions(-) > > diff --git a/config/common_base b/config/common_base > index e74febe..4e20389 100644 > --- a/config/common_base > +++ b/config/common_base > @@ -208,7 +208,6 @@ CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC=y > CONFIG_RTE_LIBRTE_I40E_INC_VECTOR=y > CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n > CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF=64 > -CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=4 > CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM=4 > # interval up to 8160 us, aligned to 2 (or default value) > CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1 > diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c > index 811cc9f..6eceea1 100644 > --- a/drivers/net/i40e/i40e_ethdev.c > +++ b/drivers/net/i40e/i40e_ethdev.c > @@ -3971,6 +3971,67 @@ i40e_get_cap(struct i40e_hw *hw) > return ret; > } > > +#define RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF 4 > +#define QUEUE_NUM_PER_VF_ARG "queue-num-per-vf" > +static int i40e_pf_parse_vf_queue_number_handler(const char *key, > + const char *value, > + void *opaque) > +{ > + struct i40e_pf *pf; > + unsigned long num; > + char *end; > + > + pf = (struct i40e_pf *)opaque; > + RTE_SET_USED(key); > + > + errno = 0; > + num = strtoul(value, &end, 0); > + if (errno != 0 || end == value || *end != 0) { > + PMD_DRV_LOG(WARNING, "Wrong VF queue number = %s, Now it is " > + "kept the value = %hu", value, pf->vf_nb_qp_max); > + return -(EINVAL); > + } > + > + if (num <= 16 && rte_is_power_of_2(num)) As a nit - better to use some macro instead of '16' here. Apart from that - looks good to me. Acked-by: Konstantin Ananyev > + pf->vf_nb_qp_max = (uint16_t)num; > + else > + /* here return 0 to make next valid same argument work */ > + PMD_DRV_LOG(WARNING, "Wrong VF queue number = %lu, it must be " > + "power of 2 and equal or less than 16 !, Now it is " > + "kept the value = %hu", num, pf->vf_nb_qp_max); > + > + return 0; > +} > + > +static int i40e_pf_config_vf_rxq_number(struct rte_eth_dev *dev) > +{ > + static const char * const valid_keys[] = {QUEUE_NUM_PER_VF_ARG, ""}; > + struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); > + struct rte_kvargs *kvlist; > + > + /* set default queue number per VF as 4 */ > + pf->vf_nb_qp_max = RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF; > + > + if (dev->device->devargs == NULL) > + return 0; > + > + kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys); > + if (kvlist == NULL) > + return -(EINVAL); > + > + if (rte_kvargs_count(kvlist, QUEUE_NUM_PER_VF_ARG) > 1) > + PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only " > + "the first invalid or last valid one is used !", > + QUEUE_NUM_PER_VF_ARG); > + > + rte_kvargs_process(kvlist, QUEUE_NUM_PER_VF_ARG, > +i40e_pf_parse_vf_queue_number_handler, pf); > + > + rte_kvargs_free(kvlist); > + > + return 0; > +} > + > static int > i40e_pf_parameter_init(struct rte_eth_dev *dev) > { > @@ -3983,6 +4044,9 @@ i40e_pf_parameter_init(struct rte_eth_dev *dev) > PMD_INIT_LOG(ERR, "HW configuration doesn't support SRIOV"); >
Re: [dpdk-dev] [PATCH V6 2/3] ring: introduce new header file to include common functions
-Original Message- > Date: Sun, 26 Nov 2017 18:00:23 -0800 > From: Jia He > To: jerin.ja...@caviumnetworks.com, dev@dpdk.org, > bruce.richard...@intel.com, konstantin.anan...@intel.com > Cc: olivier.m...@6wind.com, jianbo@arm.com, hemant.agra...@nxp.com, Jia > He , Jia He > Subject: [PATCH V6 2/3] ring: introduce new header file to include common > functions > X-Mailer: git-send-email 2.7.4 > > move the common part of rte_ring.h into rte_ring_generic.h. > move the memory barrier part into update_tail(). > > no functional changes here. > > Signed-off-by: Jia He > Suggested-by: Jerin Jacob > Suggested-by: Ananyev, Konstantin Wrong tag: complaint from checkpatch. Suggested-by: Ananyev, Konstantin > --- > + */ > + > +#ifndef _RTE_RING_GENERIC_H_ > +#define _RTE_RING_GENERIC_H_ > + > +static __rte_always_inline void > +update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val, > + uint32_t single, uint32_t enqueue) > +{ How about making enqueue as const. ie. const uint32_t enqueue ? > + if (enqueue) > + rte_smp_wmb(); > + else > + rte_smp_rmb(); Other than that, it looks good to me. Acked-by: Jerin Jacob
Re: [dpdk-dev] [PATCH V6 3/3] ring: introduce new header file to support C11 memory model
-Original Message- > Date: Sun, 26 Nov 2017 18:00:24 -0800 > From: Jia He > To: jerin.ja...@caviumnetworks.com, dev@dpdk.org, > bruce.richard...@intel.com, konstantin.anan...@intel.com > Cc: olivier.m...@6wind.com, jianbo@arm.com, hemant.agra...@nxp.com, Jia > He , Jia He > Subject: [PATCH V6 3/3] ring: introduce new header file to support C11 > memory model > X-Mailer: git-send-email 2.7.4 > > To support C11 memory model barrier, 2 options are suggested by Jerin: > 1. use rte_smp_rmb > 2. use load_acquire/store_release(refer to [1]). > CONFIG_RTE_RING_USE_C11_MEM_MODEL is provided, and by default it is "y" > only on arm64 so far. > > The reason why providing 2 options is due to the performance benchmark > difference in different arm machines, refer to [2]. > > We haven't tested on ppc64. If anyone verifies it, he can add > CONFIG_RTE_RING_USE_C11_MEM_MODEL=y to ppc64 config files. > > [1] https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h#L170 > [2] http://dpdk.org/ml/archives/dev/2017-October/080861.html > > Signed-off-by: Jia He > Suggested-by: Jerin Jacob Acked-by: Jerin Jacob
Re: [dpdk-dev] [PATCH] arch/arm: optimization for memcpy on AArch64
Jerin, Thanks a lot for your review and comments. Please find my comments below inline. Best regards, Herbert > -Original Message- > From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com] > Sent: Wednesday, November 29, 2017 20:32 > To: Herbert Guan > Cc: Jianbo Liu ; dev@dpdk.org > Subject: Re: [PATCH] arch/arm: optimization for memcpy on AArch64 > > -Original Message- > > Date: Mon, 27 Nov 2017 15:49:45 +0800 > > From: Herbert Guan > > To: jerin.ja...@caviumnetworks.com, jianbo@arm.com, dev@dpdk.org > > CC: Herbert Guan > > Subject: [PATCH] arch/arm: optimization for memcpy on AArch64 > > X-Mailer: git-send-email 1.8.3.1 > > + > > +/** > > + * Beginning of customization section > > +**/ > > +#define ALIGNMENT_MASK 0x0F > > +#ifndef RTE_ARCH_ARM64_MEMCPY_STRICT_ALIGN > > +// Only src unalignment will be treaed as unaligned copy > > C++ style comments. It may generate check patch errors. I'll change it to use C style comment in the version 2. > > > +#define IS_UNALIGNED_COPY(dst, src) ((uintptr_t)(dst) & > > +ALIGNMENT_MASK) #else // Both dst and src unalignment will be treated > > +as unaligned copy #define IS_UNALIGNED_COPY(dst, src) \ > > +(((uintptr_t)(dst) | (uintptr_t)(src)) & ALIGNMENT_MASK) > #endif > > + > > + > > +// If copy size is larger than threshold, memcpy() will be used. > > +// Run "memcpy_perf_autotest" to determine the proper threshold. > > +#define ALIGNED_THRESHOLD ((size_t)(0x)) > > +#define UNALIGNED_THRESHOLD ((size_t)(0x)) > > Do you see any case where this threshold is useful. Yes, on some platforms, and/or with some glibc version, the glibc memcpy has better performance in larger size (e.g., >512, >4096...). So developers should run unit test to find the best threshold. The default value of 0x should be modified with evaluated values. > > > + > > +static inline void *__attribute__ ((__always_inline__)) > > +rte_memcpy(void *restrict dst, const void *restrict src, size_t n) > > +{ > > +if (n < 16) { > > +rte_memcpy_lt16((uint8_t *)dst, (const uint8_t *)src, n); > > +return dst; > > +} > > +if (n < 64) { > > +rte_memcpy_ge16_lt64((uint8_t *)dst, (const uint8_t *)src, > n); > > +return dst; > > +} > > Unfortunately we have 128B cache arm64 implementation too. Could you > please take care that based on RTE_CACHE_LINE_SIZE > Here the value of '64' is not the cache line size. But for the reason that prefetch itself will cost some cycles, it's not worthwhile to do prefetch for small size (e.g. < 64 bytes) copy. Per my test, prefetching for small size copy will actually lower the performance. In the other hand, I can only find one 128B cache line aarch64 machine here. And it do exist some specific optimization for this machine. Not sure if it'll be beneficial for other 128B cache machines or not. I prefer not to put it in this patch but in a later standalone specific patch for 128B cache machines. > > +__builtin_prefetch(src, 0, 0); // rte_prefetch_non_temporal(src); > > +__builtin_prefetch(dst, 1, 0); // * unchanged * > > See above point and Please use DPDK equivalents. rte_prefetch*() I can use the " rte_prefetch_non_temporal()" for read prefetch. However, there's no DPDK equivalents for the write prefetch. Would you suggest that we add one API for DPDK? BTW, the current DPDK rte_prefetch*() are using ASM instructions. It might be better to use __builtin_prefetch(src, 0, 0/1/2/3) for better compatibility of future aarch64 architectures. IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
[dpdk-dev] [PATCH] config/thunderx: disable c11 mem model ring implementation
On thunderx and octeontx, ring_perf_autotest and ring_pmd_perf_autotest test shows better performance when disabling CONFIG_RTE_RING_USE_C11_MEM_MODEL. On the other hand, Enabling CONFIG_RTE_RING_USE_C11_MEM_MODEL shows better performance on thunderx2. Since thunderx2 is using the default armv8 config, no particular change is required. Signed-off-by: Jerin Jacob --- This patch is depended on following patch from Jia He. support c11 memory model barrier in librte_ring http://dpdk.org/ml/archives/dev/2017-November/082368.html --- config/defconfig_arm64-thunderx-linuxapp-gcc | 1 + 1 file changed, 1 insertion(+) diff --git a/config/defconfig_arm64-thunderx-linuxapp-gcc b/config/defconfig_arm64-thunderx-linuxapp-gcc index 45038b119..60c68ec96 100644 --- a/config/defconfig_arm64-thunderx-linuxapp-gcc +++ b/config/defconfig_arm64-thunderx-linuxapp-gcc @@ -36,6 +36,7 @@ CONFIG_RTE_MACHINE="thunderx" CONFIG_RTE_CACHE_LINE_SIZE=128 CONFIG_RTE_MAX_NUMA_NODES=2 CONFIG_RTE_MAX_LCORE=96 +CONFIG_RTE_RING_USE_C11_MEM_MODEL=n # # Compile PMD for octeontx sso event device -- 2.15.1
Re: [dpdk-dev] [PATCH] arch/arm: optimization for memcpy on AArch64
Pavan, Thanks for review and comments. Please find my comments inline below. Best regards, Herbert > -Original Message- > From: Pavan Nikhilesh Bhagavatula > [mailto:pbhagavat...@caviumnetworks.com] > Sent: Saturday, December 2, 2017 15:33 > To: Herbert Guan ; Jianbo Liu > > Cc: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH] arch/arm: optimization for memcpy on > AArch64 > > On Mon, Nov 27, 2017 at 03:49:45PM +0800, Herbert Guan wrote: > > This patch provides an option to do rte_memcpy() using 'restrict' > > qualifier, which can induce GCC to do optimizations by using more > > efficient instructions, providing some performance gain over memcpy() > > on some AArch64 platforms/enviroments. > > > > The memory copy performance differs between different AArch64 > > platforms. And a more recent glibc (e.g. 2.23 or later) can provide a > > better memcpy() performance compared to old glibc versions. It's > > always suggested to use a more recent glibc if possible, from which > > the entire system can get benefit. If for some reason an old glibc has > > to be used, this patch is provided for an alternative. > > > > This implementation can improve memory copy on some AArch64 > platforms, > > when an old glibc (e.g. 2.19, 2.17...) is being used. > > It is disabled by default and needs "RTE_ARCH_ARM64_MEMCPY" > > defined to activate. It's not always proving better performance than > > memcpy() so users need to run DPDK unit test "memcpy_perf_autotest" > > and customize parameters in "customization section" in rte_memcpy_64.h > > for best performance. > > > > Compiler version will also impact the rte_memcpy() performance. > > It's observed on some platforms and with the same code, GCC 7.2.0 > > compiled binary can provide better performance than GCC 4.8.5. It's > > suggested to use GCC 5.4.0 or later. > > > > Signed-off-by: Herbert Guan > > --- > > .../common/include/arch/arm/rte_memcpy_64.h| 193 > + > > 1 file changed, 193 insertions(+) > > > > diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > > b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > > index b80d8ba..1f42b3c 100644 > > --- a/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > > +++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h > > @@ -42,6 +42,197 @@ > > > > #include "generic/rte_memcpy.h" > > > > +#ifdef RTE_ARCH_ARM64_MEMCPY > > There is an existing flag for arm32 to enable neon based memcpy > RTE_ARCH_ARM_NEON_MEMCPY we could reuse that here as restrict does > the same. > This implementation is actually not using ARM NEON instructions so the existing flag is not describing the option exactly. It'll be good if the existing flag is "RTE_ARCH_ARM_MEMCPY" but unfortunately it might be too late now to get the flags aligned. > > +#include > > +#include > > + > > > +/* > *** > > +*** > > + * The memory copy performance differs on different AArch64 micro- > architectures. > > + * And the most recent glibc (e.g. 2.23 or later) can provide a > > +better memcpy() > > + * performance compared to old glibc versions. It's always suggested > > +to use a > > + * more recent glibc if possible, from which the entire system can get > benefit. > > + * > > + * This implementation improves memory copy on some aarch64 > > +micro-architectures, > > + * when an old glibc (e.g. 2.19, 2.17...) is being used. It is > > +disabled by > > + * default and needs "RTE_ARCH_ARM64_MEMCPY" defined to activate. > > +It's not > > + * always providing better performance than memcpy() so users need to > > +run unit > > + * test "memcpy_perf_autotest" and customize parameters in > > +customization section > > + * below for best performance. > > + * > > + * Compiler version will also impact the rte_memcpy() performance. > > +It's observed > > + * on some platforms and with the same code, GCC 7.2.0 compiled > > +binaries can > > + * provide better performance than GCC 4.8.5 compiled binaries. > > + > > > +* > > > +*/ > > + > > +/** > > + * Beginning of customization section > > +**/ > > +#define ALIGNMENT_MASK 0x0F > > +#ifndef RTE_ARCH_ARM64_MEMCPY_STRICT_ALIGN > > +// Only src unalignment will be treaed as unaligned copy #define > > +IS_UNALIGNED_COPY(dst, src) ((uintptr_t)(dst) & ALIGNMENT_MASK) > > We can use existing `rte_is_aligned` function instead. The exising 'rte_is_aligned()' inline function is defined in a relatively complex way, and there will be more instructions generated (using GCC 7.2.0): : // using rte_is_aligned() 0:91003c01 addx1, x0, #0xf 4:927cec21 andx1, x1, #0xfff0 8:eb01001f cmpx0, x1 c:1a9f07e0 csetw0, ne // ne = any 10:d65f03c0 ret 14:d503201f nop 0018 : // using above expression 18:12000c00
Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
Hi Konstantine > -Original Message- > From: Ananyev, Konstantin [mailto:konstantin.anan...@intel.com] > Sent: Sunday, December 3, 2017 1:10 PM > To: Matan Azrad ; Neil Horman > ; Gaëtan Rivet > Cc: Thomas Monjalon ; Wu, Jingjing > ; dev@dpdk.org > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership > > > > Hi Matan, > > > -Original Message- > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Matan Azrad > > Sent: Sunday, December 3, 2017 8:05 AM > > To: Neil Horman ; Gaëtan Rivet > > > > Cc: Thomas Monjalon ; Wu, Jingjing > > ; dev@dpdk.org > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership > > > > Hi > > > > > -Original Message- > > > From: Neil Horman [mailto:nhor...@tuxdriver.com] > > > Sent: Friday, December 1, 2017 2:10 PM > > > To: Gaëtan Rivet > > > Cc: Matan Azrad ; Thomas Monjalon > > > ; Jingjing Wu ; > > > dev@dpdk.org > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership > > > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote: > > > > Hello Matan, Neil, > > > > > > > > I like the port ownership concept. I think it is needed to clarify > > > > some operations and should be useful to several subsystems. > > > > > > > > This patch could certainly be sub-divided however, and your > > > > current > > > > 1/5 should probably come after this one. > > > > > > > > Some comments inline. > > > > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote: > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +, Matan Azrad wrote: > > > > > > The ownership of a port is implicit in DPDK. > > > > > > Making it explicit is better from the next reasons: > > > > > > 1. It may be convenient for multi-process applications to know > which > > > > > >process is in charge of a port. > > > > > > 2. A library could work on top of a port. > > > > > > 3. A port can work on top of another port. > > > > > > > > > > > > Also in the fail-safe case, an issue has been met in testpmd. > > > > > > We need to check that the user is not trying to use a port > > > > > > which is already managed by fail-safe. > > > > > > > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid > > > > > > multiple management of a device by different DPDK entities. > > > > > > > > > > > > A port owner is built from owner id(number) and owner > > > > > > name(string) while the owner id must be unique to distinguish > > > > > > between two identical entity instances and the owner name can be > any name. > > > > > > The name helps to logically recognize the owner by different > > > > > > DPDK entities and allows easy debug. > > > > > > Each DPDK entity can allocate an owner unique identifier and > > > > > > can use it and its preferred name to owns valid ethdev ports. > > > > > > Each DPDK entity can get any port owner status to decide if it > > > > > > can manage the port or not. > > > > > > > > > > > > The current ethdev internal port management is not affected by > > > > > > this feature. > > > > > > > > > > > > > > The internal port management is not affected, but the external > > > > interface is, however. In order to respect port ownership, > > > > applications are forced to modify their port iterator, as shown by > > > > your > > > testpmd patch. > > > > > > > > I think it would be better to modify the current > > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY, and > > > > introduce a default owner that would represent the application > > > > itself (probably with the ID 0 and an owner string ""). Only with > > > > specific additional configuration should this default subset of ethdev > > > > be > divided. > > > > > > > > This would make this evolution seamless for applications, at no > > > > cost to the complexity of the design. > > > > > > > > > > Signed-off-by: Matan Azrad > > > > > > > > > > > > > > > This seems fairly racy. What if one thread attempts to set > > > > > ownership on a port, while another is checking it on another cpu > > > > > in parallel. It doesn't seem like it will protect against that at > > > > > all. > > > > > It also doesn't protect against the possibility of multiple > > > > > threads attempting to poll for rx in parallel, which I think was > > > > > part of Thomas's origional statement regarding port ownership > > > > > (he noted that the lockless design implied only a single thread > > > > > should be allowed to poll > > > for receive or make configuration changes at a time. > > > > > > > > > > Neil > > > > > > > > > > > > > Isn't this race already there for any configuration operation / > > > > polling function? The DPDK arch is expecting applications to solve it. > > > > Why should port ownership be designed differently from other DPDK > > > components? > > > > > > > Yes, but that doesn't mean it should exist in purpituity, nor does > > > it mean that your new api should contain it as well. > > > > > > > Embedding checks for port ownership within operations will force > > > > everyone
Re: [dpdk-dev] [PATCH] arch/arm: optimization for memcpy on AArch64
On Sun, Dec 03, 2017 at 12:38:35PM +, Herbert Guan wrote: > Pavan, > > Thanks for review and comments. Please find my comments inline below. > > Best regards, > Herbert > > > There is an existing flag for arm32 to enable neon based memcpy > > RTE_ARCH_ARM_NEON_MEMCPY we could reuse that here as restrict does > > the same. > > > This implementation is actually not using ARM NEON instructions so the > existing flag is not describing the option exactly. It'll be good if the > existing flag is "RTE_ARCH_ARM_MEMCPY" but unfortunately it might be too late > now to get the flags aligned. > Correct me if I'm wrong but doesn't restrict tell the compiler to do SIMD optimization? Anyway can we put RTE_ARCH_ARM64_MEMCPY into config/common_base as CONFIG_RTE_ARCH_ARM64_MEMCPY=n so that it would be easier to enable/disable. > > > +#include > > > +#include > > > + > > > > > +/* > > *** > > > +*** > > > + * The memory copy performance differs on different AArch64 micro- > > architectures. > > > + * And the most recent glibc (e.g. 2.23 or later) can provide a > > > +better memcpy() > > > + * performance compared to old glibc versions. It's always suggested > > > +to use a > > > + * more recent glibc if possible, from which the entire system can get > > benefit. > > > + * > > > + * This implementation improves memory copy on some aarch64 > > > +micro-architectures, > > > + * when an old glibc (e.g. 2.19, 2.17...) is being used. It is > > > +disabled by > > > + * default and needs "RTE_ARCH_ARM64_MEMCPY" defined to activate. > > > +It's not > > > + * always providing better performance than memcpy() so users need to > > > +run unit > > > + * test "memcpy_perf_autotest" and customize parameters in > > > +customization section > > > + * below for best performance. > > > + * > > > + * Compiler version will also impact the rte_memcpy() performance. > > > +It's observed > > > + * on some platforms and with the same code, GCC 7.2.0 compiled > > > +binaries can > > > + * provide better performance than GCC 4.8.5 compiled binaries. > > > + > > > > > +* > > > > > +*/ > > > + > > > +/** > > > + * Beginning of customization section > > > +**/ > > > +#define ALIGNMENT_MASK 0x0F > > > +#ifndef RTE_ARCH_ARM64_MEMCPY_STRICT_ALIGN > > > +// Only src unalignment will be treaed as unaligned copy #define > > > +IS_UNALIGNED_COPY(dst, src) ((uintptr_t)(dst) & ALIGNMENT_MASK) > > > > We can use existing `rte_is_aligned` function instead. > > The exising 'rte_is_aligned()' inline function is defined in a relatively > complex way, and there will be more instructions generated (using GCC 7.2.0): > > : // using rte_is_aligned() >0:91003c01 addx1, x0, #0xf >4:927cec21 andx1, x1, #0xfff0 >8:eb01001f cmpx0, x1 >c:1a9f07e0 csetw0, ne // ne = any > 10:d65f03c0 ret > 14:d503201f nop > > 0018 : // using above expression > 18:12000c00 andw0, w0, #0xf > 1c:d65f03c0 ret > > So to get better performance, it's better to use the simple logic. Agreed, I have noticed that too maybe we could change rte_is_aligned to be simpler (Not in this patch). > > Would doing this still benifit if size is compile time constant? i.e. when > > __builtin_constant_p(n) is true. > > > Yes, performance margin is observed if size is compile time constant on some > tested platforms. > Sorry I didn't get you but which is better? If size is compile time constant is using libc memcpy is better or going with restrict implementation better. If the former then we could do what 32bit rte_memcpy is using i.e. #define rte_memcpy(dst, src, n) \ __extension__ ({ \ (__builtin_constant_p(n)) ? \ memcpy((dst), (src), (n)) : \ rte_memcpy_func((dst), (src), (n)); }) Regards, Pavan.
[dpdk-dev] [PATCH V7 1/3] eal/arm64: remove the braces {} for dmb() and dsb()
for the code as follows: if (condition) rte_smp_rmb(); else rte_smp_wmb(); Without this patch, compiler will report this error: error: 'else' without a previous 'if' Fixes: 84733fd0d75e ("eal/arm64: fix memory barrier definition") Cc: sta...@dpdk.org Signed-off-by: Jia He Acked-by: Jerin Jacob --- lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h index 0b70d62..71da29c 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h @@ -43,8 +43,8 @@ extern "C" { #include "generic/rte_atomic.h" -#define dsb(opt) { asm volatile("dsb " #opt : : : "memory"); } -#define dmb(opt) { asm volatile("dmb " #opt : : : "memory"); } +#define dsb(opt) asm volatile("dsb " #opt : : : "memory") +#define dmb(opt) asm volatile("dmb " #opt : : : "memory") #define rte_mb() dsb(sy) -- 2.7.4
[dpdk-dev] [PATCH V7 0/3] support c11 memory model barrier in librte_ring
To support C11 memory model barrier, 2 options are suggested by Jerin: 1. use rte_smp_rmb 2. use load_acquire/store_release CONFIG_RTE_RING_USE_C11_MEM_MODEL is provided, and by default it is "y" only on arm64 so far. The reason why providing 2 options is due to the performance benchmark difference in different arm machines. Already fuctionally tested on the machines as follows: - on X86 - on arm64 with CONFIG_RTE_RING_USE_C11_MEM_MODEL=y - on arm64 with CONFIG_RTE_RING_USE_C11_MEM_MODEL=n --- Changelog: V7: fix check-git-log warnings which is suggested by Jerin V6: minor change in subject and log V5: split it into 2 patchset due to the milestone concerns, this is the 2st one. Also fix checkpatch.pl warnings V4: split into small patches V3: arch specific implementation for enqueue/dequeue barrier V2: let users choose whether using load_acquire/store_release V1: rte_smp_rmb() between 2 loads Jia He (3): eal/arm64: remove the braces {} for dmb() and dsb() ring: introduce new header file to include common functions ring: introduce new header file to support C11 memory model config/common_armv8a_linuxapp | 2 + .../common/include/arch/arm/rte_atomic_64.h| 4 +- lib/librte_eventdev/rte_event_ring.h | 6 +- lib/librte_ring/Makefile | 4 +- lib/librte_ring/rte_ring.h | 173 ++ lib/librte_ring/rte_ring_c11_mem.h | 186 lib/librte_ring/rte_ring_generic.h | 195 + 7 files changed, 406 insertions(+), 164 deletions(-) create mode 100644 lib/librte_ring/rte_ring_c11_mem.h create mode 100644 lib/librte_ring/rte_ring_generic.h -- 2.7.4
[dpdk-dev] [PATCH V7 3/3] ring: introduce new header file to support C11 memory model
To support C11 memory model barrier, 2 options are suggested by Jerin: 1. use rte_smp_rmb 2. use load_acquire/store_release(refer to [1]). CONFIG_RTE_RING_USE_C11_MEM_MODEL is provided, and by default it is "y" only on arm64 so far. The reason why providing 2 options is due to the performance benchmark difference in different arm machines, refer to [2]. We haven't tested on ppc64. If anyone verifies it, he can add CONFIG_RTE_RING_USE_C11_MEM_MODEL=y to ppc64 config files. [1] https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h#L170 [2] http://dpdk.org/ml/archives/dev/2017-October/080861.html Signed-off-by: Jia He Suggested-by: Jerin Jacob Acked-by: Jerin Jacob --- config/common_armv8a_linuxapp | 2 + lib/librte_ring/Makefile | 3 +- lib/librte_ring/rte_ring.h | 14 ++- lib/librte_ring/rte_ring_c11_mem.h | 186 + 4 files changed, 203 insertions(+), 2 deletions(-) create mode 100644 lib/librte_ring/rte_ring_c11_mem.h diff --git a/config/common_armv8a_linuxapp b/config/common_armv8a_linuxapp index 6732d1e..5b7b2eb 100644 --- a/config/common_armv8a_linuxapp +++ b/config/common_armv8a_linuxapp @@ -49,3 +49,5 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=n CONFIG_RTE_LIBRTE_AVP_PMD=n CONFIG_RTE_SCHED_VECTOR=n + +CONFIG_RTE_RING_USE_C11_MEM_MODEL=y diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile index c959945..a2682e7 100644 --- a/lib/librte_ring/Makefile +++ b/lib/librte_ring/Makefile @@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_RING) := rte_ring.c # install includes SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h \ - rte_ring_generic.h + rte_ring_generic.h \ + rte_ring_c11_mem.h include $(RTE_SDK)/mk/rte.lib.mk diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h index 519614c..3343eba 100644 --- a/lib/librte_ring/rte_ring.h +++ b/lib/librte_ring/rte_ring.h @@ -356,8 +356,20 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r); } \ } while (0) -/* Move common functions to generic file */ +/* Between load and load. there might be cpu reorder in weak model + * (powerpc/arm). + * There are 2 choices for the users + * 1.use rmb() memory barrier + * 2.use one-direcion load_acquire/store_release barrier,defined by + * CONFIG_RTE_RING_USE_C11_MEM_MODEL=y + * It depends on performance test results. + * By default, move common functions to rte_ring_generic.h + */ +#ifdef RTE_RING_USE_C11_MEM_MODEL +#include "rte_ring_c11_mem.h" +#else #include "rte_ring_generic.h" +#endif /** * @internal Enqueue several objects on the ring diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/rte_ring_c11_mem.h new file mode 100644 index 000..ca21e95 --- /dev/null +++ b/lib/librte_ring/rte_ring_c11_mem.h @@ -0,0 +1,186 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2017 hxt-semitech. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of hxt-semitech nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_RING_C11_MEM_H_ +#define _RTE_RING_C11_MEM_H_ + +static __rte_always_inline void +update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val, + uint32_t single, uint32_t enqueue) +{ + RTE_SET_USED(enqueue); + + /* +* If there are other enqueues/dequeues in progress that preceded us, +* we need to wait for them to complete
[dpdk-dev] [PATCH V7 2/3] ring: introduce new header file to include common functions
move the common part of rte_ring.h into rte_ring_generic.h. move the memory barrier part into update_tail(). no functional changes here. Signed-off-by: Jia He Suggested-by: Jerin Jacob Suggested-by: Ananyev Konstantin Acked-by: Jerin Jacob --- lib/librte_eventdev/rte_event_ring.h | 6 +- lib/librte_ring/Makefile | 3 +- lib/librte_ring/rte_ring.h | 161 + lib/librte_ring/rte_ring_generic.h | 195 +++ 4 files changed, 203 insertions(+), 162 deletions(-) create mode 100644 lib/librte_ring/rte_ring_generic.h diff --git a/lib/librte_eventdev/rte_event_ring.h b/lib/librte_eventdev/rte_event_ring.h index ea9b688..3e49458 100644 --- a/lib/librte_eventdev/rte_event_ring.h +++ b/lib/librte_eventdev/rte_event_ring.h @@ -126,9 +126,8 @@ rte_event_ring_enqueue_burst(struct rte_event_ring *r, goto end; ENQUEUE_PTRS(&r->r, &r[1], prod_head, events, n, struct rte_event); - rte_smp_wmb(); - update_tail(&r->r.prod, prod_head, prod_next, 1); + update_tail(&r->r.prod, prod_head, prod_next, 1, 1); end: if (free_space != NULL) *free_space = free_entries - n; @@ -168,9 +167,8 @@ rte_event_ring_dequeue_burst(struct rte_event_ring *r, goto end; DEQUEUE_PTRS(&r->r, &r[1], cons_head, events, n, struct rte_event); - rte_smp_rmb(); - update_tail(&r->r.cons, cons_head, cons_next, 1); + update_tail(&r->r.cons, cons_head, cons_next, 1, 0); end: if (available != NULL) diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile index e34d9d9..c959945 100644 --- a/lib/librte_ring/Makefile +++ b/lib/librte_ring/Makefile @@ -45,6 +45,7 @@ LIBABIVER := 1 SRCS-$(CONFIG_RTE_LIBRTE_RING) := rte_ring.c # install includes -SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h +SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h \ + rte_ring_generic.h include $(RTE_SDK)/mk/rte.lib.mk diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h index e924438..519614c 100644 --- a/lib/librte_ring/rte_ring.h +++ b/lib/librte_ring/rte_ring.h @@ -356,91 +356,8 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r); } \ } while (0) -static __rte_always_inline void -update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val, - uint32_t single) -{ - /* -* If there are other enqueues/dequeues in progress that preceded us, -* we need to wait for them to complete -*/ - if (!single) - while (unlikely(ht->tail != old_val)) - rte_pause(); - - ht->tail = new_val; -} - -/** - * @internal This function updates the producer head for enqueue - * - * @param r - * A pointer to the ring structure - * @param is_sp - * Indicates whether multi-producer path is needed or not - * @param n - * The number of elements we will want to enqueue, i.e. how far should the - * head be moved - * @param behavior - * RTE_RING_QUEUE_FIXED:Enqueue a fixed number of items from a ring - * RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring - * @param old_head - * Returns head value as it was before the move, i.e. where enqueue starts - * @param new_head - * Returns the current/new head value i.e. where enqueue finishes - * @param free_entries - * Returns the amount of free space in the ring BEFORE head was moved - * @return - * Actual number of objects enqueued. - * If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only. - */ -static __rte_always_inline unsigned int -__rte_ring_move_prod_head(struct rte_ring *r, int is_sp, - unsigned int n, enum rte_ring_queue_behavior behavior, - uint32_t *old_head, uint32_t *new_head, - uint32_t *free_entries) -{ - const uint32_t capacity = r->capacity; - unsigned int max = n; - int success; - - do { - /* Reset n to the initial burst count */ - n = max; - - *old_head = r->prod.head; - - /* add rmb barrier to avoid load/load reorder in weak -* memory model. It is noop on x86 -*/ - rte_smp_rmb(); - - const uint32_t cons_tail = r->cons.tail; - /* -* The subtraction is done between two unsigned 32bits value -* (the result is always modulo 32 bits even if we have -* *old_head > cons_tail). So 'free_entries' is always between 0 -* and capacity (which is < size). -*/ - *free_entries = (capacity + cons_tail - *old_head); - - /* check that we have enough room in ring */ - if (unlikely(n > *free_entries)) - n = (behavior == RTE_RING_QUEUE_FIXED) ? -
Re: [dpdk-dev] [PATCH 0/7] Sync with MUSDK-17.10
The 12/01/2017 16:19, Tomasz Duszynski wrote: > This patchset brings following changes: > > o Sync with MUSDK-17.10. Latest version of the library comes with many > improvements and fixes thus switching to it is beneficial. > > o A few code and documentation updates. > > Changes since v1: > o Add extra error log in case setting link up fails. > o Cram all MUSDK related headers into the same the driver header. > > Tomasz Duszynski (7): > net/mrvl: sync compilation with musdk-17.10 > net/mrvl: query link status using library API > net/mrvl: do not enable port after setting MAC address > net/mrvl: check if ppio is initialized > net/mrvl: add extra error logs > devtools/test-build: add MRVL NET PMD to test-build > net/mrvl: update MRVL NET PMD documentation > > devtools/test-build.sh | 1 + > doc/guides/nics/mrvl.rst | 69 --- > drivers/net/mrvl/Makefile | 4 +- > drivers/net/mrvl/mrvl_ethdev.c | 92 > +- > drivers/net/mrvl/mrvl_ethdev.h | 5 +++ > 5 files changed, 117 insertions(+), 54 deletions(-) > > -- > 2.7.4 > Acked-by: Jianbo Liu PS, don't forget to add version number in the mail subject next time :-) -- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Re: [dpdk-dev] [PATCH 0/2] Sync compilation with MUSDK-17.10
The 11/30/2017 14:32, Tomasz Duszynski wrote: > This patchset comes with the following changes: > > o MUSDK-17.10 is the latest version of the library. Since it brings > improvements and fixes switch is necessary. > > o Some minor updates to the documentation. > > Tomasz Duszynski (2): > crypto/mrvl: sync compilation with musdk-17.10 > crypto/mrvl: update MRVL CRYPTO PMD documentation > > doc/guides/cryptodevs/mrvl.rst| 28 > doc/guides/nics/mrvl.rst | 2 ++ > drivers/crypto/mrvl/Makefile | 3 ++- > drivers/crypto/mrvl/rte_mrvl_compat.h | 1 + > 4 files changed, 9 insertions(+), 25 deletions(-) > > -- > 2.7.4 > Acked-by: Jianbo Liu -- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Re: [dpdk-dev] [PATCH] eventdev: set rte errno in port link/unlink functions
-Original Message- > Date: Tue, 14 Nov 2017 16:44:10 -0600 > From: Gage Eads > To: dev@dpdk.org > CC: jerin.ja...@caviumnetworks.com, harry.van.haa...@intel.com, > bruce.richard...@intel.com, hemant.agra...@nxp.com, nipun.gu...@nxp.com, > santosh.shu...@caviumnetworks.com, pbhagavat...@caviumnetworks.com > Subject: [PATCH] eventdev: set rte errno in port link/unlink functions > X-Mailer: git-send-email 2.7.4 > > The return value for rte_event_port_{link, unlink}() is defined as the > "number of {links, unlinks} actually established." However, the eventdev > layer's error checking returns negative error values. This commit aligns > the eventdev code with the API definition by having it set rte_errno and > return 0 if it detects an error. > > Fixes: 4f0804bbdfb9 ("eventdev: implement the northbound APIs") > > Signed-off-by: Gage Eads Acked-by: Jerin Jacob
Re: [dpdk-dev] [PATCH] eventdev: set rte errno in port link/unlink functions
-Original Message- > Date: Mon, 4 Dec 2017 08:36:20 +0530 > From: Jerin Jacob > To: Gage Eads > CC: dev@dpdk.org, harry.van.haa...@intel.com, bruce.richard...@intel.com, > hemant.agra...@nxp.com, nipun.gu...@nxp.com, > santosh.shu...@caviumnetworks.com, pbhagavat...@caviumnetworks.com > Subject: Re: [dpdk-dev] [PATCH] eventdev: set rte errno in port link/unlink > functions > User-Agent: Mutt/1.9.1 (2017-09-22) > > -Original Message- > > Date: Tue, 14 Nov 2017 16:44:10 -0600 > > From: Gage Eads > > To: dev@dpdk.org > > CC: jerin.ja...@caviumnetworks.com, harry.van.haa...@intel.com, > > bruce.richard...@intel.com, hemant.agra...@nxp.com, nipun.gu...@nxp.com, > > santosh.shu...@caviumnetworks.com, pbhagavat...@caviumnetworks.com > > Subject: [PATCH] eventdev: set rte errno in port link/unlink functions > > X-Mailer: git-send-email 2.7.4 > > > > The return value for rte_event_port_{link, unlink}() is defined as the > > "number of {links, unlinks} actually established." However, the eventdev > > layer's error checking returns negative error values. This commit aligns > > the eventdev code with the API definition by having it set rte_errno and > > return 0 if it detects an error. > > > > Fixes: 4f0804bbdfb9 ("eventdev: implement the northbound APIs") > > > > Signed-off-by: Gage Eads > > Acked-by: Jerin Jacob Applied to dpdk-next-eventdev/master. Thanks. >
[dpdk-dev] [PATCH v2 1/2] net/virtio: make control queue thread-safe
The virtio_send_command function may be called from app's configuration routine, but also from an interrupt handler called when live migration is done on the backup side. So this patch makes control queue thread-safe first. Signed-off-by: Xiao Wang --- v2: - Use spaces instead of tabs between the code and comments. - Remove unnecessary parentheses. --- drivers/net/virtio/virtio_ethdev.c | 7 ++- drivers/net/virtio/virtio_rxtx.c | 1 + drivers/net/virtio/virtio_rxtx.h | 1 + 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index e0328f6..ac73950 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -177,6 +177,8 @@ struct rte_virtio_xstats_name_off { PMD_INIT_LOG(ERR, "Control queue is not supported."); return -1; } + + rte_spinlock_lock(&cvq->sl); vq = cvq->vq; head = vq->vq_desc_head_idx; @@ -184,8 +186,10 @@ struct rte_virtio_xstats_name_off { "vq->hw->cvq = %p vq = %p", vq->vq_desc_head_idx, status, vq->hw->cvq, vq); - if ((vq->vq_free_cnt < ((uint32_t)pkt_num + 2)) || (pkt_num < 1)) + if (vq->vq_free_cnt < pkt_num + 2 || pkt_num < 1) { + rte_spinlock_unlock(&cvq->sl); return -1; + } memcpy(cvq->virtio_net_hdr_mz->addr, ctrl, sizeof(struct virtio_pmd_ctrl)); @@ -261,6 +265,7 @@ struct rte_virtio_xstats_name_off { result = cvq->virtio_net_hdr_mz->addr; + rte_spinlock_unlock(&cvq->sl); return result->status; } diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index 390c137..6a24fde 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -407,6 +407,7 @@ struct virtio_hw *hw = dev->data->dev_private; if (hw->cvq && hw->cvq->vq) { + rte_spinlock_init(&hw->cvq->sl); VIRTQUEUE_DUMP((struct virtqueue *)hw->cvq->vq); } } diff --git a/drivers/net/virtio/virtio_rxtx.h b/drivers/net/virtio/virtio_rxtx.h index 54f1e84..71b5798 100644 --- a/drivers/net/virtio/virtio_rxtx.h +++ b/drivers/net/virtio/virtio_rxtx.h @@ -84,6 +84,7 @@ struct virtnet_ctl { rte_iova_t virtio_net_hdr_mem; /**< hdr for each xmit packet */ uint16_t port_id; /**< Device port identifier. */ const struct rte_memzone *mz; /**< mem zone to populate CTL ring. */ + rte_spinlock_t sl; /**< spinlock for control queue. */ }; int virtio_rxq_vec_setup(struct virtnet_rx *rxvq); -- 1.8.3.1
[dpdk-dev] [PATCH v2 0/2] net/virtio: support GUEST ANNOUNCE
When live migration is finished, the backup VM needs to proactively announce its new location. DPDK vhost has implemented VHOST_USER_PROTOCOL_F_RARP to generate a RARP packet to switch in dequeue path. Another method is to let the guest proactively send out RARP packet using VIRTIO_NET_F_GUEST_ANNOUNCE feature. This patch set enables this feature in virtio pmd, to support VM running virtio pmd be migrated without vhost supporting RARP generation. v2: - Use spaces instead of tabs between the code and comments. - Remove unnecessary parentheses. - Use rte_pktmbuf_mtod directly to get eth_hdr addr. - Fix virtio_dev_pause return value check. Xiao Wang (2): net/virtio: make control queue thread-safe net/virtio: support GUEST ANNOUNCE drivers/net/virtio/virtio_ethdev.c | 138 - drivers/net/virtio/virtio_ethdev.h | 4 ++ drivers/net/virtio/virtio_pci.h| 1 + drivers/net/virtio/virtio_rxtx.c | 82 ++ drivers/net/virtio/virtio_rxtx.h | 1 + drivers/net/virtio/virtqueue.h | 11 +++ 6 files changed, 234 insertions(+), 3 deletions(-) -- 1.8.3.1
[dpdk-dev] [PATCH v2 2/2] net/virtio: support GUEST ANNOUNCE
When live migration is done, for the backup VM, either the virtio frontend or the vhost backend needs to send out gratuitous RARP packet to announce its new network location. This patch enables VIRTIO_NET_F_GUEST_ANNOUNCE feature to support live migration scenario where the vhost backend doesn't have the ability to generate RARP packet. Brief introduction of the work flow: 1. QEMU finishes live migration, pokes the backup VM with an interrupt. 2. Virtio interrupt handler reads out the interrupt status value, and realizes it needs to send out RARP packet to announce its location. 3. Pause device to stop worker thread touching the queues. 4. Inject a RARP packet into a Tx Queue. 5. Ack the interrupt via control queue. 6. Resume device to continue packet processing. Signed-off-by: Xiao Wang --- v2: - Use rte_pktmbuf_mtod directly to get eth_hdr addr. - Fix virtio_dev_pause return value check. --- drivers/net/virtio/virtio_ethdev.c | 131 - drivers/net/virtio/virtio_ethdev.h | 4 ++ drivers/net/virtio/virtio_pci.h| 1 + drivers/net/virtio/virtio_rxtx.c | 81 +++ drivers/net/virtio/virtqueue.h | 11 5 files changed, 226 insertions(+), 2 deletions(-) diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index ac73950..4c937c6 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -48,6 +48,8 @@ #include #include #include +#include +#include #include #include #include @@ -55,6 +57,7 @@ #include #include #include +#include #include "virtio_ethdev.h" #include "virtio_pci.h" @@ -106,6 +109,13 @@ static int virtio_dev_queue_stats_mapping_set( uint8_t stat_idx, uint8_t is_rx); +static int make_rarp_packet(struct rte_mbuf *rarp_mbuf, + const struct ether_addr *mac); +static int virtio_dev_pause(struct rte_eth_dev *dev); +static void virtio_dev_resume(struct rte_eth_dev *dev); +static void generate_rarp(struct rte_eth_dev *dev); +static void virtnet_ack_link_announce(struct rte_eth_dev *dev); + /* * The set of PCI devices this driver supports */ @@ -1249,9 +1259,116 @@ static int virtio_dev_xstats_get_names(struct rte_eth_dev *dev, return 0; } +#define RARP_PKT_SIZE 64 + +static int +make_rarp_packet(struct rte_mbuf *rarp_mbuf, const struct ether_addr *mac) +{ + struct ether_hdr *eth_hdr; + struct arp_hdr *rarp; + + if (rarp_mbuf->buf_len < RARP_PKT_SIZE) { + PMD_DRV_LOG(ERR, "mbuf size too small %u (< %d)", + rarp_mbuf->buf_len, RARP_PKT_SIZE); + return -1; + } + + /* Ethernet header. */ + eth_hdr = rte_pktmbuf_mtod(rarp_mbuf, struct ether_hdr *); + memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN); + ether_addr_copy(mac, ð_hdr->s_addr); + eth_hdr->ether_type = htons(ETHER_TYPE_RARP); + + /* RARP header. */ + rarp = (struct arp_hdr *)(eth_hdr + 1); + rarp->arp_hrd = htons(ARP_HRD_ETHER); + rarp->arp_pro = htons(ETHER_TYPE_IPv4); + rarp->arp_hln = ETHER_ADDR_LEN; + rarp->arp_pln = 4; + rarp->arp_op = htons(ARP_OP_REVREQUEST); + + ether_addr_copy(mac, &rarp->arp_data.arp_sha); + ether_addr_copy(mac, &rarp->arp_data.arp_tha); + memset(&rarp->arp_data.arp_sip, 0x00, 4); + memset(&rarp->arp_data.arp_tip, 0x00, 4); + + rarp_mbuf->data_len = RARP_PKT_SIZE; + rarp_mbuf->pkt_len = RARP_PKT_SIZE; + + return 0; +} + +static int +virtio_dev_pause(struct rte_eth_dev *dev) +{ + struct virtio_hw *hw = dev->data->dev_private; + + if (hw->started == 0) + return -1; + hw->started = 0; + /* +* Prevent the worker thread from touching queues to avoid condition, +* 1 ms should be enough for the ongoing Tx function to finish. +*/ + rte_delay_ms(1); + return 0; +} + +static void +virtio_dev_resume(struct rte_eth_dev *dev) +{ + struct virtio_hw *hw = dev->data->dev_private; + + hw->started = 1; +} + +static void +generate_rarp(struct rte_eth_dev *dev) +{ + struct virtio_hw *hw = dev->data->dev_private; + struct rte_mbuf *rarp_mbuf = NULL; + struct virtnet_tx *txvq = dev->data->tx_queues[0]; + struct virtnet_rx *rxvq = dev->data->rx_queues[0]; + + rarp_mbuf = rte_mbuf_raw_alloc(rxvq->mpool); + if (rarp_mbuf == NULL) { + PMD_DRV_LOG(ERR, "mbuf allocate failed"); + return; + } + + if (make_rarp_packet(rarp_mbuf, (struct ether_addr *)hw->mac_addr)) { + rte_pktmbuf_free(rarp_mbuf); + rarp_mbuf = NULL; + return; + } + + /* If virtio port just stopped, no need to send RARP */ + if (virtio_dev_pause(dev) < 0) + return; + + virtio_inject_pkts(txvq, &rarp_mbuf, 1); + /* Recover the s
[dpdk-dev] [PATCH] maintainers: update for i40e
Signed-off-by: Jingjing Wu --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index f0baeb4..426764e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -414,8 +414,8 @@ F: doc/guides/nics/intel_vf.rst F: doc/guides/nics/features/ixgbe*.ini Intel i40e -M: Jingjing Wu M: Beilei Xing +M: Qi Zhang F: drivers/net/i40e/ F: doc/guides/nics/i40e.rst F: doc/guides/nics/intel_vf.rst -- 2.4.11
[dpdk-dev] [PATCH] maintainers: update for testpmd
Signed-off-by: Jingjing Wu --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index f0baeb4..460fa0f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -924,6 +924,7 @@ F: test/test/virtual_pmd.c F: test/test/virtual_pmd.h Driver testing tool +M: Wenzhuo Lu M: Jingjing Wu F: app/test-pmd/ F: doc/guides/testpmd_app_ug/ -- 2.4.11
Re: [dpdk-dev] [PATCH] maintainers: update for testpmd
> -Original Message- > From: Wu, Jingjing > Sent: Monday, December 4, 2017 2:18 PM > To: dev@dpdk.org > Cc: Lu, Wenzhuo; Zhang, Helin; Wu, Jingjing > Subject: [PATCH] maintainers: update for testpmd > > Signed-off-by: Jingjing Wu Acked-by: Helin Zhang > --- > MAINTAINERS | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/MAINTAINERS b/MAINTAINERS > index f0baeb4..460fa0f 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -924,6 +924,7 @@ F: test/test/virtual_pmd.c > F: test/test/virtual_pmd.h > > Driver testing tool > +M: Wenzhuo Lu > M: Jingjing Wu > F: app/test-pmd/ > F: doc/guides/testpmd_app_ug/ > -- > 2.4.11 Actually testpmd is a big application, I'd suggest to ask for more maintainers from other major vendor users, like Mellanox. Thanks, Helin
Re: [dpdk-dev] [PATCH] maintainers: update for i40e
> -Original Message- > From: Wu, Jingjing > Sent: Monday, December 4, 2017 2:16 PM > To: dev@dpdk.org > Cc: Zhang, Qi Z; Zhang, Helin; Wu, Jingjing > Subject: [PATCH] maintainers: update for i40e > > Signed-off-by: Jingjing Wu Acked-by: Helin Zhang > --- > MAINTAINERS | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/MAINTAINERS b/MAINTAINERS > index f0baeb4..426764e 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -414,8 +414,8 @@ F: doc/guides/nics/intel_vf.rst > F: doc/guides/nics/features/ixgbe*.ini > > Intel i40e > -M: Jingjing Wu > M: Beilei Xing > +M: Qi Zhang > F: drivers/net/i40e/ > F: doc/guides/nics/i40e.rst > F: doc/guides/nics/intel_vf.rst > -- > 2.4.11
[dpdk-dev] [PATCH] net/fm10k: remove RSS restriction with num of queues
FM10K HW does not have such restrictions. Enabling RSS with single queue is not used to distribute flow but to compute a RSS hash value. It can reduce cpu cycles of computing a hash value with five tuples. In addition, there is an explicit method to disable RSS instead of an obscure way. Signed-off-by: Yangchao Zhou --- drivers/net/fm10k/fm10k_ethdev.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c index 2d05a46..403f6c5 100644 --- a/drivers/net/fm10k/fm10k_ethdev.c +++ b/drivers/net/fm10k/fm10k_ethdev.c @@ -533,9 +533,8 @@ fm10k_dev_rss_configure(struct rte_eth_dev *dev) 0x6A, 0x42, 0xB7, 0x3B, 0xBE, 0xAC, 0x01, 0xFA, }; - if (dev->data->nb_rx_queues == 1 || - dev_conf->rxmode.mq_mode != ETH_MQ_RX_RSS || - dev_conf->rx_adv_conf.rss_conf.rss_hf == 0) { + if (dev_conf->rxmode.mq_mode != ETH_MQ_RX_RSS || + dev_conf->rx_adv_conf.rss_conf.rss_hf == 0) { FM10K_WRITE_REG(hw, FM10K_MRQC(0), 0); return; } -- 2.9.0.windows.1
Re: [dpdk-dev] [PATCH] arch/arm: optimization for memcpy on AArch64
> -Original Message- > From: Pavan Nikhilesh Bhagavatula > [mailto:pbhagavat...@caviumnetworks.com] > Sent: Sunday, December 3, 2017 22:21 > To: Herbert Guan ; Jianbo Liu > > Cc: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH] arch/arm: optimization for memcpy on > AArch64 > > On Sun, Dec 03, 2017 at 12:38:35PM +, Herbert Guan wrote: > > Pavan, > > > > Thanks for review and comments. Please find my comments inline below. > > > > Best regards, > > Herbert > > > > > > There is an existing flag for arm32 to enable neon based memcpy > > > RTE_ARCH_ARM_NEON_MEMCPY we could reuse that here as restrict > does > > > the same. > > > > > This implementation is actually not using ARM NEON instructions so the > existing flag is not describing the option exactly. It'll be good if the > existing > flag is "RTE_ARCH_ARM_MEMCPY" but unfortunately it might be too late > now to get the flags aligned. > > > > Correct me if I'm wrong but doesn't restrict tell the compiler to do SIMD > optimization? > Anyway can we put RTE_ARCH_ARM64_MEMCPY into config/common_base > as CONFIG_RTE_ARCH_ARM64_MEMCPY=n so that it would be easier to > enable/disable. > The result of using 'restrict' is to generate codes with ldp/stp instructions. These instructions actually belong to the "data transfer instructions", though they are loading/storing a pair of registers. 'ld1/st1' are SIMD (NEON) instructions. I can add CONFIG_RTE_ARCH_ARM64_MEMCPY=n into common_armv8a_linuxapp in the new version as you've suggested. > > > > +#include > > > > +#include > > > > + > > > > > > > > +/* > > > *** > > > > +*** > > > > + * The memory copy performance differs on different AArch64 > > > > +micro- > > > architectures. > > > > + * And the most recent glibc (e.g. 2.23 or later) can provide a > > > > +better memcpy() > > > > + * performance compared to old glibc versions. It's always > > > > +suggested to use a > > > > + * more recent glibc if possible, from which the entire system > > > > +can get > > > benefit. > > > > + * > > > > + * This implementation improves memory copy on some aarch64 > > > > +micro-architectures, > > > > + * when an old glibc (e.g. 2.19, 2.17...) is being used. It is > > > > +disabled by > > > > + * default and needs "RTE_ARCH_ARM64_MEMCPY" defined to > activate. > > > > +It's not > > > > + * always providing better performance than memcpy() so users > > > > +need to run unit > > > > + * test "memcpy_perf_autotest" and customize parameters in > > > > +customization section > > > > + * below for best performance. > > > > + * > > > > + * Compiler version will also impact the rte_memcpy() performance. > > > > +It's observed > > > > + * on some platforms and with the same code, GCC 7.2.0 compiled > > > > +binaries can > > > > + * provide better performance than GCC 4.8.5 compiled binaries. > > > > + > > > > > > > > +* > > > > > > > +*/ > > > > + > > > > +/** > > > > + * Beginning of customization section > > > > +**/ > > > > +#define ALIGNMENT_MASK 0x0F > > > > +#ifndef RTE_ARCH_ARM64_MEMCPY_STRICT_ALIGN > > > > +// Only src unalignment will be treaed as unaligned copy #define > > > > +IS_UNALIGNED_COPY(dst, src) ((uintptr_t)(dst) & ALIGNMENT_MASK) > > > > > > We can use existing `rte_is_aligned` function instead. > > > > The exising 'rte_is_aligned()' inline function is defined in a relatively > complex way, and there will be more instructions generated (using GCC > 7.2.0): > > > > : // using rte_is_aligned() > >0:91003c01 addx1, x0, #0xf > >4:927cec21 andx1, x1, #0xfff0 > >8:eb01001f cmpx0, x1 > >c:1a9f07e0 csetw0, ne // ne = any > > 10:d65f03c0 ret > > 14:d503201f nop > > > > 0018 : // using above expression > > 18:12000c00 andw0, w0, #0xf > > 1c:d65f03c0 ret > > > > So to get better performance, it's better to use the simple logic. > > Agreed, I have noticed that too maybe we could change rte_is_aligned to be > simpler (Not in this patch). > > > > > Would doing this still benifit if size is compile time constant? > > > i.e. when > > > __builtin_constant_p(n) is true. > > > > > Yes, performance margin is observed if size is compile time constant on > some tested platforms. > > > > Sorry I didn't get you but which is better? If size is compile time constant > is > using libc memcpy is better or going with restrict implementation better. > > If the former then we could do what 32bit rte_memcpy is using i.e. > > #define rte_memcpy(dst, src, n) \ > __extension__ ({ \ > (__builtin_constant_p(n)) ? \ > memcpy((dst), (src), (n)) : \ > rte_memcpy_func((dst), (src), (n)); }) > Per my test, it usually has the same direction. Means if the variable s
Re: [dpdk-dev] [PATCH v2] net/virtio: fix an incorrect behavior of device stop/start
On Sat, Dec 02, 2017 at 12:30:33PM +0800, Tiwei Bie wrote: > Hi Antonio, > > On Sat, Dec 02, 2017 at 01:17:58AM +0800, Fischetti, Antonio wrote: > > Hi All, > > I've got an update on this. > > I could replicate the same issue by using testpmd + a VM (= Virtual > > Machine). > > > > The test topology I'm using is: > > > > > > [Traffic gen][NIC port #0][testpmd][vhost port #2]+ > > | > > | > > [testpmd in > > the VM] > > | > > | > > [Traffic gen][NIC port #1][testpmd][vhost port #3]+ > > > > > > So there's no OvS now in the picture: one testpmd running on the host > > and one testpmd running on the VM. > > > > The issue is that no packet goes through testpmd in the VM. > > It seems this is happening after this patch was upstreamed. > > > > Please note > > --- > > To replicate this issue both the next 2 conditions must be met: > > - the traffic is already being sent before launching testpmd in the VM > > - there are at least 2 forwarding lcores. > > > [...] > > Do you see anything I missed? Or can you reproduce the issue with the > setup I'm using? > Hi Antonio, Are you using vector Rx in your test? After some further investigations, I found that the vector Rx could be broken if backend has consumed all the avail descs before the device is started. Because in current implementation, the vector Rx will return immediately without refilling the avail ring if the used ring is empty. So we have to refill the avail ring after flushing the elements in the used ring. Best regards, Tiwei Bie
Re: [dpdk-dev] [PATCH] mmap(2) returns MAP_FAILED, not NULL, on failure
On 02.12.2017 01:54, Ferruh Yigit wrote: On 11/30/2017 10:51 PM, Michael McConville wrote: Signed-off-by: Michael McConville --- lib/librte_eal/bsdapp/eal/eal_memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c index 6ba058578..2c8a4b592 100644 --- a/lib/librte_eal/bsdapp/eal/eal_memory.c +++ b/lib/librte_eal/bsdapp/eal/eal_memory.c @@ -155,7 +155,7 @@ rte_eal_hugepage_attach(void) /* Map the shared hugepage_info into the process address spaces */ hpi = mmap(NULL, sizeof(struct hugepage_info), PROT_READ, MAP_PRIVATE, fd_hugepage_info, 0); - if (hpi == NULL) { + if (hpi == MAP_FAILED) { Hi Matej, Can you fix same thing in szedata2 PMD please [1] ? [1] http://dpdk.org/browse/dpdk/tree/drivers/net/szedata2/rte_eth_szedata2.c?h=v17.11#n1556 Hi Ferruh, I will send a patch. Thanks, Matej RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path()); goto error; }
Re: [dpdk-dev] [PATCH v3] examples/ipsec-secgw: fix usage of incorrect port
Hi Anoob, On 11/29/2017 9:51 AM, Anoob Joseph wrote: Hi Akhil, On 24-11-2017 16:19, Akhil Goyal wrote: Hi Anoob, On 11/24/2017 3:28 PM, Anoob wrote: static inline void route4_pkts(struct rt_ctx *rt_ctx, struct rte_mbuf *pkts[], uint8_t nb_pkts) { uint32_t hop[MAX_PKT_BURST * 2]; uint32_t dst_ip[MAX_PKT_BURST * 2]; + int32_t pkt_hop = 0; uint16_t i, offset; + uint16_t lpm_pkts = 0; if (nb_pkts == 0) return; + /* Need to do an LPM lookup for non-offload packets. Offload packets + * will have port ID in the SA + */ + for (i = 0; i < nb_pkts; i++) { - offset = offsetof(struct ip, ip_dst); - dst_ip[i] = *rte_pktmbuf_mtod_offset(pkts[i], - uint32_t *, offset); - dst_ip[i] = rte_be_to_cpu_32(dst_ip[i]); + if (!(pkts[i]->ol_flags & PKT_TX_SEC_OFFLOAD)) { + /* Security offload not enabled. So an LPM lookup is + * required to get the hop + */ + offset = offsetof(struct ip, ip_dst); + dst_ip[lpm_pkts] = *rte_pktmbuf_mtod_offset(pkts[i], + uint32_t *, offset); + dst_ip[lpm_pkts] = rte_be_to_cpu_32(dst_ip[lpm_pkts]); + lpm_pkts++; + } } - rte_lpm_lookup_bulk((struct rte_lpm *)rt_ctx, dst_ip, hop, nb_pkts); + rte_lpm_lookup_bulk((struct rte_lpm *)rt_ctx, dst_ip, hop, lpm_pkts); + + lpm_pkts = 0; for (i = 0; i < nb_pkts; i++) { - if ((hop[i] & RTE_LPM_LOOKUP_SUCCESS) == 0) { + if (pkts[i]->ol_flags & PKT_TX_SEC_OFFLOAD) { + /* Read hop from the SA */ + pkt_hop = get_hop_for_offload_pkt(pkts[i]); + } else { + /* Need to use hop returned by lookup */ + pkt_hop = hop[lpm_pkts++]; + if ((pkt_hop & RTE_LPM_LOOKUP_SUCCESS) == 0) + pkt_hop = -1; + } + I believe the following check is redundant for non inline case. I believe get_hop_for_offload_pkt can also set the RTE_LPM_LOOKUP_SUCCESS if route is success and take the (pkt_hop & RTE_LPM_LOOKUP_SUCCESS) == 0 check outside the if else block and free the packet if it is unsuccessful. Same comment for route6_pkts. Checking with -1 may not be a good idea if we have a flag available for the same. Others can comment. The problem is ipv4 & ipv6 LPM lookups return different error values, but we are using a single routine to get the hop for offload packets. The flag(RTE_LPM_LOOKUP_SUCCESS) is only for ipv4 lookups. For ipv6, error is -1. If we need a cleaner solution, we can have ipv4 & ipv6 variants of "get_hop_for_offload_pkt". But that would be repetition of some code. my concern over this patch is that there is an addition of an extra check in the non inline case and we can get rid of that with some changes in the code(lib/app). Regarding route6_pkts, the code looks cleaner than route4_pkts If we have ipv4 and ipv6 variants of the "get_hop_for_offload_packet" function, the code would look much cleaner. Shall I update the patch with such a change and send v4? I believe we shall get rid of "RTE_LPM_LOOKUP_SUCCESS" from the rte_lpm_lookup_bulk(), we shall have similar error flags for v4 and v6 APIs. Either we can have RTE_LPM_LOOKUP_SUCCESS or -1 as check for errors. Sergio can comment on this. Duplicating code for get_hop_for_offload_packet may not be a good idea. -Akhil