date:20220811

[RFC] ethdev: add direction info when creating the transfer table

2022-08-11 Thread Rongwei Liu

A transfer domain rule can match traffic originating from either the
wire or a VF, so it consumes underlying resources for both directions.

In customer deployments, a single flow table usually matches traffic
from only one direction: either from the wire or from a VF.

Introduce a new member, transfer_mode, in rte_flow_attr to indicate
the direction of the flow table: from wire, from VF, or
bidirectional (default).

This saves underlying memory and also helps the insertion rate.

By default, the transfer domain is bidirectional, so there is no behavior change.

1. Match wire origin only
   flow template_table 0 create group 0 priority 0 transfer wire_orig...
2. Match vf origin only
   flow template_table 0 create group 0 priority 0 transfer vf_orig...

Signed-off-by: Rongwei Liu 
Acked-by: Ori Kam 
---
 app/test-pmd/cmdline_flow.c | 26 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  3 ++-
 lib/ethdev/rte_flow.h   |  9 ++-
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7f50028eb7..b25b595e82 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -177,6 +177,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+   TABLE_TRANSFER_WIRE_ORIG,
+   TABLE_TRANSFER_VF_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+   TABLE_TRANSFER_WIRE_ORIG,
+   TABLE_TRANSFER_VF_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+   [TABLE_TRANSFER_WIRE_ORIG] = {
+   .name = "wire_orig",
+   .help = "affect rule direction to transfer",
+   .next = NEXT(next_table_attr),
+   .call = parse_table,
+   },
+   [TABLE_TRANSFER_VF_ORIG] = {
+   .name = "vf_orig",
+   .help = "affect rule direction to transfer",
+   .next = NEXT(next_table_attr),
+   .call = parse_table,
+   },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table",
@@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+   case TABLE_TRANSFER_WIRE_ORIG:
+   if (!out->args.table.attr.flow_attr.transfer)
+   return -1;
+   out->args.table.attr.flow_attr.transfer_mode = 1;
+   return len;
+   case TABLE_TRANSFER_VF_ORIG:
+   if (!out->args.table.attr.flow_attr.transfer)
+   return -1;
+   out->args.table.attr.flow_attr.transfer_mode = 2;
+   return len;
default:
return -1;
}
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 330e34427d..603b7988dd 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3332,7 +3332,8 @@ It is bound to ``rte_flow_template_table_create()``::
 
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
-   [priority {level}] [ingress] [egress] [transfer]
+   [priority {level}] [ingress] [egress]
+   [transfer [vf_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index a79f1e7ef0..512b08d817 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -130,7 +130,14 @@ struct rte_flow_attr {
 * through a suitable port. @see rte_flow_pick_transfer_proxy().
 */
uint32_t transfer:1;
-   uint32_t reserved:29; /**< Reserved, must be zero. */
+   /**
+* 0 means bidirection,
+* 0x1 origin uplink,
+* 0x2 origin vport,
+* N/A both set.
+*/
+   uint32_t transfer_mode:2;
+   uint32_t reserved:27; /**< Reserved, must be zero. */
 };
 
 /**
-- 
2.27.0
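
For illustration only (not part of the patch): a minimal, self-contained sketch of how the proposed 2-bit transfer_mode field and the testpmd check in parse_table() fit together. The demo_* names and the enum are hypothetical; the RFC itself only documents the raw values 0, 0x1 and 0x2. Rejecting the value 3 is an assumption based on the "N/A both set" comment in the header.

```c
#include <stdint.h>

/* Hypothetical names for the values the RFC documents as 0, 0x1 and 0x2. */
enum demo_transfer_mode {
	DEMO_TRANSFER_BIDIR = 0,     /* default: both directions */
	DEMO_TRANSFER_WIRE_ORIG = 1, /* traffic originating from the wire (uplink) */
	DEMO_TRANSFER_VF_ORIG = 2,   /* traffic originating from a VF (vport) */
};

/* Simplified stand-in for the bit-field tail of struct rte_flow_attr. */
struct demo_flow_attr {
	uint32_t ingress:1;
	uint32_t egress:1;
	uint32_t transfer:1;
	uint32_t transfer_mode:2; /* the new member proposed by the RFC */
	uint32_t reserved:27;     /* must be zero */
};

/*
 * Same ordering rule as the testpmd parser in the RFC: a direction can
 * only be given once "transfer" has been set. Rejecting mode 3 (both
 * bits set) is an assumption drawn from the "N/A both set" comment.
 */
static int
demo_set_transfer_mode(struct demo_flow_attr *attr, unsigned int mode)
{
	if (!attr->transfer)
		return -1;
	if (mode > DEMO_TRANSFER_VF_ORIG)
		return -1;
	attr->transfer_mode = mode;
	return 0;
}
```

With this layout, a table created as "transfer wire_orig" would carry transfer = 1 and transfer_mode = 1, while a plain "transfer" table keeps transfer_mode = 0.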

RE: 19.11.13 patches review and test

2022-08-11 Thread Jiang, YuX

> -Original Message-
> From: christian.ehrha...@canonical.com 
> Sent: Thursday, August 4, 2022 4:22 PM
> To: sta...@dpdk.org
> Cc: dev@dpdk.org; Abhishek Marathe ;
> Ali Alnubani ; Walker, Benjamin
> ; David Christensen
> ; Hemant Agrawal ;
> Stokes, Ian ; Jerin Jacob ;
> Mcnamara, John ; Ju-Hyoung Lee
> ; Kevin Traynor ; Luca
> Boccassi ; Pei Zhang ; Xu, Qian
> Q ; Raslan Darawsheh ;
> Thomas Monjalon ; Yanghang Liu
> ; Peng, Yuan ; Chen,
> Zhaoyan 
> Subject: 19.11.13 patches review and test
> 
> Hi all,
> 
> This is -rc3 just after -rc2 as a build issue was hiding in the former.
> Sorry for the extra noise, but other than that it all stays the same.
> 
> there were three patches close to the deadline that I missed and I
> considered postponing them to 19.11.14 at first. But in the meantime there
> arrived 11 more and I think that justifies a new tag for 19.11.13.
> 
> We still have almost 4 weeks left - I hope that is ok.
> 
> Here is the combined list of patches (the same as before plus the new 14)
> targeted for stable release 19.11.13.
> 
> The planned date for the final release is August 29th.
> 
> Please help with testing and validation of your use cases and report any
> issues/results with reply-all to this mail. For the final release the fixes 
> and
> reported validations will be added to the release notes.
> 
> A release candidate tarball can be found at:
> 
> https://dpdk.org/browse/dpdk-stable/tag/?id=v19.11.13-rc3
> 
> These patches are located at branch 19.11 of dpdk-stable repo:
> https://dpdk.org/browse/dpdk-stable/
> 
> Thanks.
> 
> Christian Ehrhardt 
> 
> ---
Update on the test status for the Intel part: testing of DPDK 19.11.13-rc3 is
almost finished. No critical issues were found, but we did find some new bugs
on FreeBSD13.0.
Newly reported build defects:
  1, https://bugs.dpdk.org/show_bug.cgi?id=1064 [19.11.13-rc3] lib/eal make
     build failed with gcc10.3.0 and clang11.0.1 on FreeBSD13.0/64
> This issue was not found on FreeBSD13.1.
  2, https://bugs.dpdk.org/show_bug.cgi?id=1063 [19.11.13-rc3] drivers/net/i40e
     meson build failure with clang13.0 on FreeBSD13.1/64
> This issue was not found on FreeBSD13.0; Intel Dev is investigating.
Known build defects:
  1, [dpdk 19.11.13-rc1] lib/librte_eal meson build error with gcc12.1 on
     fedora36, similar to https://bugs.dpdk.org/show_bug.cgi?id=985
> This issue was not found on Fedora35. Dev said that it looks like a GCC13 bug.
  2, [dpdk 19.11.13-rc1] drivers/net/ena make build error with gcc12.1 on
     fedora36, similar to https://bugs.dpdk.org/show_bug.cgi?id=991
> No update in Bugzilla, no fix yet.
  3, https://bugs.dpdk.org/show_bug.cgi?id=912 [dpdk 19.11.13-rc1]
     drivers/net/qede make build error with clang14 on fedora36/redhat8.6/UB22.04
> Known issue, no update in Bugzilla, no fix yet.

# Basic Intel(R) NIC testing
* Build: covers build test combinations with the latest GCC/Clang versions and
  popular OS revisions such as Ubuntu20.04&22.04, Fedora36, RHEL8.4, etc.
- All tests done.
* PF&VF(i40e, ixgbe): test scenarios including RTE_FLOW/TSO/Jumboframe/checksum
  offload/VLAN/VXLAN, etc.
- All tests done. No new bugs found.
* PF/VF(ice): test scenarios including Switch features/Package Management/Flow
  Director/Advanced Tx, etc.
- All tests done. No new issues found.
- Known bug in [dpdk-19.11.12]
  metering_and_policing/ipv4_HASH_table_RFC2698: unable to forward packets
  normally. Intel Dev is still investigating.
* Intel NIC single core/NIC performance: test scenarios including PF/VF single
  core performance tests, etc.
- All tests done. No big performance drop.

# Basic cryptodev and virtio testing
* Virtio: both function and performance tests are covered, such as
  PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf testing, etc.
- All tests done. No new issues found.
* Cryptodev:
* Function test: test scenarios including Cryptodev API testing/CompressDev
  ISA-L/QAT/ZLIB PMD testing, etc.
- All tests done. No new issues found.
* Performance test: test scenarios including Throughput Performance/Cryptodev
  Latency, etc.
- All tests done. No performance drop.

Best regards,
Yu Jiang

Re: 19.11.13 patches review and test

2022-08-11 Thread Christian Ehrhardt

On Thu, Aug 11, 2022 at 9:37 AM Jiang, YuX  wrote:
>
> > -Original Message-
> > From: christian.ehrha...@canonical.com 
> > Sent: Thursday, August 4, 2022 4:22 PM
> > To: sta...@dpdk.org
> > Cc: dev@dpdk.org; Abhishek Marathe ;
> > Ali Alnubani ; Walker, Benjamin
> > ; David Christensen
> > ; Hemant Agrawal ;
> > Stokes, Ian ; Jerin Jacob ;
> > Mcnamara, John ; Ju-Hyoung Lee
> > ; Kevin Traynor ; Luca
> > Boccassi ; Pei Zhang ; Xu, Qian
> > Q ; Raslan Darawsheh ;
> > Thomas Monjalon ; Yanghang Liu
> > ; Peng, Yuan ; Chen,
> > Zhaoyan 
> > Subject: 19.11.13 patches review and test
> >
> > Hi all,
> >
> > This is -rc3 just after -rc2 as a build issue was hiding in the former.
> > Sorry for the extra noise, but other than that it all stays the same.
> >
> > there were three patches close to the deadline that I missed and I
> > considered postponing them to 19.11.14 at first. But in the meantime there
> > arrived 11 more and I think that justifies a new tag for 19.11.13.
> >
> > We still have almost 4 weeks left - I hope that is ok.
> >
> > Here is the combined list of patches (the same as before plus the new 14)
> > targeted for stable release 19.11.13.
> >
> > The planned date for the final release is August 29th.
> >
> > Please help with testing and validation of your use cases and report any
> > issues/results with reply-all to this mail. For the final release the fixes 
> > and
> > reported validations will be added to the release notes.
> >
> > A release candidate tarball can be found at:
> >
> > https://dpdk.org/browse/dpdk-stable/tag/?id=v19.11.13-rc3
> >
> > These patches are located at branch 19.11 of dpdk-stable repo:
> > https://dpdk.org/browse/dpdk-stable/
> >
> > Thanks.
> >
> > Christian Ehrhardt 
> >
> > ---
> Update the test status for Intel part. DPDK19.11.13-rc3 almost test finished, 
> no critical issue is found, but also find some new bugs on FreeBSD13.0.

So we are only seeing old and new build failures, which we are happy to
resolve if patches are provided but which are otherwise "ok to stay".
Great to know that there were no functional regressions found.

Thanks Yu Jiang!

> New report defects for build:
>   1, https://bugs.dpdk.org/show_bug.cgi?id=1064 [19.11.13-rc3] lib/eal make 
> build failed with gcc10.3.0 and clang11.0.1 on FreeBSD13.0/64
> > Not found such issue on FreeBSD13.1
>   2, https://bugs.dpdk.org/show_bug.cgi?id=1063 [19.11.13-rc3] 
> drivers/net/i40e on meson build failure with clang13.0 on FreeBSD13.1/64
> > Not found such issue on FreeBSD13.0, Intel Dev is investigating.
> Known defects for build:
>   1, [dpdk 19.11.13-rc1] lib/librte_eal meson build error with gcc12.1 on 
> fedora36, similar as https://bugs.dpdk.org/show_bug.cgi?id=985
> > Not found such issue on Fedora35. Dev said that looks like a GCC13 
> BUG
>   2, [dpdk 19.11.13-rc1] drivers/net/ena make build error with gcc12.1 on 
> fedora36, similar as https://bugs.dpdk.org/show_bug.cgi?id=991
> > No update in bugzilla, no fix yet.
>   3, https://bugs.dpdk.org/show_bug.cgi?id=912 [dpdk 19.11.13-rc1] 
> drivers/net/qede make build error with clang14 on fedora36/redhat8.6/UB22.04
> > Known issue, no update in bugzilla, no fix yet.
>
> # Basic Intel(R) NIC testing
> * Build: cover the build test combination with latest GCC/Clang version and 
> the popular OS revision such as Ubuntu20.04&22.04, Fedora36, RHEL8.4, etc.
> - All test done.
> * PF&VF(i40e, ixgbe): test scenarios including 
> RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
> - All test done. No new bug is found.
> * PF/VF(ice): test scenarios including Switch features/Package 
> Management/Flow Director/Advanced Tx, etc.
> - All test done. No new issue is found.
> - Known bug about [dpdk-19.11.12] 
> metering_and_policing/ipv4_HASH_table_RFC2698: unable to forward packets 
> normally. Intel Dev is still investigating.
> * Intel NIC single core/NIC performance: test scenarios including PF/VF 
> single core performance test etc.
> - All test done. No big performance drop.
>
> # Basic cryptodev and virtio testing
> * Virtio: both function and performance test are covered. Such as 
> PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf testing, etc.
> - All test done. No new issue is found.
> * Cryptodev:
> * Function test: test scenarios including Cryptodev API testing/CompressDev 
> ISA-L/QAT/ZLIB PMD Testing/ etc.
> - All test done. No new issue is found.
> * Performance test: test scenarios including Thoughput Performance /Cryptodev 
> Latency, etc.
> - All test done. No performance drop.
>
> Best regards,
> Yu Jiang



-- 
Christian Ehrhardt
Senior Staff Engineer, Ubuntu Server
Canonical Ltd

[PATCH] net/bnxt: fix null pointer dereference in bnxt_hwrm_port_led_cfg()

2022-08-11 Thread Mao YingMing

From: maoyingming 

On VFs, "bp->leds" is always NULL, so check that bp->leds is
not NULL before using bp->leds->num_leds.

Segfault backtrace in the TRex program when using a VF:
11: bnxt_hwrm_port_led_cfg (bp=0x23ffb2140, led_on=true)
10: bnxt_dev_led_on_op (dev=0x22d7780 )
 9: rte_eth_led_on (port_id=0)
 8: DpdkTRexPortAttr::set_led (this=0x23b6ce0, on=true)
 7: DpdkTRexPortAttr::DpdkTRexPortAttr
 6: CTRexExtendedDriverBnxt::create_port_attr
 5: CPhyEthIF::Create
 4: CGlobalTRex::device_start
 3: CGlobalTRex::Create
 2: main_test
 1: main

Fixes: d4d5a04 ("net/bnxt: fix unnecessary memory allocation")
Cc: sta...@dpdk.org

Signed-off-by: Mao YingMing 
---
 drivers/net/bnxt/bnxt_hwrm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 9c52573..41e6067 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -4535,7 +4535,7 @@ int bnxt_hwrm_port_led_cfg(struct bnxt *bp, bool led_on)
uint16_t duration = 0;
int rc, i;
 
-   if (!bp->leds->num_leds || BNXT_VF(bp))
+   if (BNXT_VF(bp) || (!bp->leds) || (!bp->leds->num_leds))
return -EOPNOTSUPP;
 
HWRM_PREP(&req, HWRM_PORT_LED_CFG, BNXT_USE_CHIMP_MB);
-- 
1.8.3.1
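
For illustration only (not part of the patch): the fix relies on the left-to-right short-circuit evaluation of ||, so the pointer is tested before it is dereferenced. Below is a minimal sketch of that guard ordering using hypothetical stand-in types (demo_port, demo_led_info) rather than the real struct bnxt.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver structures. */
struct demo_led_info {
	uint8_t num_leds;
};

struct demo_port {
	bool is_vf;                 /* plays the role of BNXT_VF(bp) */
	struct demo_led_info *leds; /* NULL on VFs, as in the report */
};

static int
demo_port_led_cfg(const struct demo_port *p)
{
	/*
	 * || stops at the first true operand, so p->leds->num_leds is
	 * only read after p->leds has been checked against NULL.
	 */
	if (p->is_vf || p->leds == NULL || p->leds->num_leds == 0)
		return -EOPNOTSUPP;

	return 0; /* the real function would go on to program the LEDs */
}
```

With the original ordering, the num_leds read came first, which is why a VF with a NULL leds pointer crashed before the VF check was reached.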

Reason to always build both static and shared libs

2022-08-11 Thread Jianshen Liu

Hi all,

Could I know the reason for always building both static and shared libs of
DPDK? I can find the patch that enables this behavior, but it does not seem
to mention the reason behind it. Also, if I propose a change to use "both"
as the default for default_library in meson's config file and still allow
users to choose either static or shared as they want, is there any reason
against that change?

Thanks,
Jianshen

Re: [PATCH] usertools: fix bind failure from dpdk to kernel

2022-08-11 Thread Krzysztof Kozlowski

On 09/08/2022 14:44, lihuisong (C) wrote:
> 
> On 2022/8/5 23:35, Stephen Hemminger wrote:
>> On Fri, 5 Aug 2022 11:10:22 +0800
>> Huisong Li  wrote:
>>
>>> Currently, the steps for binding device from dpdk driver to kernel
>>> driver is as follows:
>>> echo $BDF > /sys/bus/pci/drivers/vfio-pci/unbind
>>> echo $BDF > /sys/bus/pci/drivers/$kernel_driver/bind
>>>
>>> This steps cannot bind device from dpdk driver to kernel driver on
>>> platform with kernel 5.19. The 'driver_override' must be specify
>>> kernel driver before binding device to kernel driver.
>>>
>>> Fixes: 720b7a058260 ("usertools: fix device binding with kernel tools")
>>> Cc: sta...@dpdk.org
>>>
>>> Signed-off-by: Huisong Li 
>> Not sure exactly what you did and why.
>> The patch seems to just remove the check that the driver
>> is in the set of dpdk_drivers.
>> .
> Currently, the end of the operation binding device from kernel driver to
> dpdk driver write '\00' to driver_override file so as to this device can
> be bound to any other driver. 

This could have worked, but it was not the correct way to use
driver_override. The kernel ABI document clearly states:
"and may be cleared with an empty string (echo > driver_override)."
Documentation/ABI/testing/sysfs-bus-pci

Please use the kernel ABI as it is described. Using it the wrong way
might sometimes work and sometimes not.


> And perform following steps to
> bind device dpdk driver to kernel driver:
> echo $BDF > /sys/bus/pci/drivers/vfio-pci/unbind
> echo $BDF > /sys/bus/pci/drivers/$kernel_driver/bind
> 
> However, due to the patch[1] merged into 5.19 kernel, 'driver_override'
> in the pci_dev is no longer NULL by writing '\00' to driver_override file.
> This causes PCI match device failure and the device will never be bound to
> their kernel driver.
> 
> In 5.19 kernel, I found that dpdk-devbind.py need to write '\n' to
> driver_override file if we want to bind device to any other driver.
> But I think it is not necessary to write empty to driver_override
> file. 

It is necessary because PCI driver_override has been documented to work
like that since 2014. What you are implying here is that "it is not
necessary to follow the API and we can do it differently"...

> After all, the device has only one kernel driver, and binding
> to dpdk driver(like, vfio-pci) must specify driver_override.
> 
> [1] 23d99baf9d72 ("PCI: Use driver_set_override() instead of open-coding")


Best regards,
Krzysztof
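
For illustration only: a minimal sketch of the documented way to clear the override from C, by writing a lone newline, which is the byte that `echo > driver_override` produces. The helper name demo_clear_driver_override is hypothetical, the caller is assumed to pass the device's driver_override sysfs path, and this is not the dpdk-devbind.py code.

```c
#include <stdio.h>

/*
 * Clear a driver override by writing an empty line to the given file.
 * Returns 0 on success and -1 on any I/O error.
 */
static int
demo_clear_driver_override(const char *path)
{
	FILE *f = fopen(path, "w");

	if (f == NULL)
		return -1;

	/* Same content as `echo > driver_override`: just "\n". */
	if (fputc('\n', f) == EOF) {
		fclose(f);
		return -1;
	}

	/* fclose() flushes the buffer, so write errors surface here. */
	return fclose(f) == 0 ? 0 : -1;
}
```

A caller would pass something like /sys/bus/pci/devices/$BDF/driver_override, then unbind from vfio-pci and bind to the kernel driver as in the steps quoted above.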

[PATCH] bus/pci: 'RTE_PMD_REGISTER_PCI' support expand 'nm'

2022-08-11 Thread Lichao Liu

Previously, 'RTE_PMD_REGISTER_PCI' did not macro-expand its 'nm'
argument, so it did not work in a situation like this:
#define TEST(a, b) a##_##b
RTE_PMD_REGISTER_PCI(TEST(a, b), test_pmd_driver)

Signed-off-by: Lichao Liu 
---
 drivers/bus/pci/rte_bus_pci.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h
index 1c6a8fdd7b..4f11bda19d 100644
--- a/drivers/bus/pci/rte_bus_pci.h
+++ b/drivers/bus/pci/rte_bus_pci.h
@@ -270,8 +270,7 @@ int rte_pci_set_bus_master(struct rte_pci_device *dev, bool enable);
  */
 void rte_pci_register(struct rte_pci_driver *driver);
 
-/** Helper for PCI device registration from driver (eth, crypto) instance */
-#define RTE_PMD_REGISTER_PCI(nm, pci_drv) \
+#define _RTE_PMD_REGISTER_PCI(nm, pci_drv) \
 RTE_INIT(pciinitfn_ ##nm) \
 {\
(pci_drv).driver.name = RTE_STR(nm);\
@@ -279,6 +278,10 @@ RTE_INIT(pciinitfn_ ##nm) \
 } \
 RTE_PMD_EXPORT_NAME(nm, __COUNTER__)
 
+/** Helper for PCI device registration from driver (eth, crypto) instance */
+#define RTE_PMD_REGISTER_PCI(nm, pci_drv) \
+   _RTE_PMD_REGISTER_PCI(nm, pci_drv)
+
 /**
  * Unregister a PCI driver.
  *
-- 
2.20.1
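
For illustration only (not part of the patch): the extra macro level works because a macro argument is fully expanded before substitution unless it is an operand of # or ##. The sketch below shows the same two-level pattern with hypothetical DEMO_* macros that are independent of the DPDK headers.

```c
/* Stringify exactly the tokens given. */
#define DEMO_STR(x) #x

/* User-side helper that builds a name, like TEST(a, b) above. */
#define DEMO_NAME(a, b) a##_##b

/*
 * Inner macro: nm is pasted with ##, so it must already be a single
 * identifier by the time it arrives here.
 */
#define DEMO_REGISTER_IMPL(nm) \
	static const char *demo_name_##nm(void) { return DEMO_STR(nm); }

/*
 * Outer macro: nm is not an operand of # or ## here, so the
 * preprocessor expands DEMO_NAME(...) before calling the inner macro.
 */
#define DEMO_REGISTER(nm) DEMO_REGISTER_IMPL(nm)

/* Defines demo_name_net_dummy(), which returns "net_dummy". */
DEMO_REGISTER(DEMO_NAME(net, dummy))
```

Calling DEMO_REGISTER_IMPL(DEMO_NAME(net, dummy)) directly would instead paste demo_name_ onto the unexpanded token DEMO_NAME, which does not compile.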

Re: [PATCH] usertools: fix bind failure from dpdk to kernel

2022-08-11 Thread Krzysztof Kozlowski

On 09/08/2022 20:58, Stephen Hemminger wrote:
>>
>> However, due to the patch[1] merged into 5.19 kernel, 'driver_override'
>> in the pci_dev is no longer NULL by writing '\00' to driver_override file.
>> This causes PCI match device failure and the device will never be bound to
>> their kernel driver.
> 
> 
> Linux kernel does not look favorably on API changes and that looks like
> the kernel changed behavior. That should be reported and fixed there.

To clarify around this issue:

There were no API changes. The Linux kernel follows the API exactly as it
has been described in the API document since 2014:
Documentation/ABI/testing/sysfs-bus-pci

There was no change in kernel API.

What changed is the behavior of an undocumented, unsupported and wrong
usage of the driver_override API.

Best regards,
Krzysztof

[PATCH] common/cnxk: add CPT LF reset sequence

2022-08-11 Thread Srujana Challa

Add code to reset the CPT LF as part of cpt_lf_fini().

Signed-off-by: Srujana Challa 
---
 drivers/common/cnxk/roc_cpt.c  | 82 ++
 drivers/common/cnxk/roc_mbox.h |  6 +++
 2 files changed, 88 insertions(+)

diff --git a/drivers/common/cnxk/roc_cpt.c b/drivers/common/cnxk/roc_cpt.c
index f1be6a3401..a48696f379 100644
--- a/drivers/common/cnxk/roc_cpt.c
+++ b/drivers/common/cnxk/roc_cpt.c
@@ -712,6 +712,87 @@ roc_cpt_lf_ctx_reload(struct roc_cpt_lf *lf, void *cptr)
return 0;
 }
 
+static int
+cpt_lf_reset(struct roc_cpt_lf *lf)
+{
+   struct cpt_lf_rst_req *req;
+   struct dev *dev = lf->dev;
+
+   req = mbox_alloc_msg_cpt_lf_reset(dev->mbox);
+   if (req == NULL)
+   return -EIO;
+
+   req->slot = lf->lf_id;
+
+   return mbox_process(dev->mbox);
+}
+
+static void
+cpt_9k_lf_rst_lmtst(struct roc_cpt_lf *lf, uint8_t egrp)
+{
+   struct cpt_inst_s inst;
+   uint64_t lmt_status;
+
+   memset(&inst, 0, sizeof(struct cpt_inst_s));
+   inst.w7.s.egrp = egrp;
+
+   plt_io_wmb();
+
+   do {
+   /* Copy CPT command to LMTLINE */
+   roc_lmt_mov64((void *)lf->lmt_base, &inst);
+   lmt_status = roc_lmt_submit_ldeor(lf->io_addr);
+   } while (lmt_status == 0);
+}
+
+static void
+cpt_10k_lf_rst_lmtst(struct roc_cpt_lf *lf, uint8_t egrp)
+{
+   uint64_t lmt_base, lmt_arg, io_addr;
+   struct cpt_inst_s *inst;
+   uint16_t lmt_id;
+
+   lmt_base = lf->lmt_base;
+   io_addr = lf->io_addr;
+
+   io_addr |= ROC_CN10K_CPT_INST_DW_M1 << 4;
+   ROC_LMT_BASE_ID_GET(lmt_base, lmt_id);
+
+   inst = (struct cpt_inst_s *)lmt_base;
+   memset(inst, 0, sizeof(struct cpt_inst_s));
+   inst->w7.s.egrp = egrp;
+   lmt_arg = ROC_CN10K_CPT_LMT_ARG | (uint64_t)lmt_id;
+   roc_lmt_submit_steorl(lmt_arg, io_addr);
+}
+
+static void
+roc_cpt_iq_reset(struct roc_cpt_lf *lf)
+{
+   union cpt_lf_inprog lf_inprog = {.u = 0x0};
+   union cpt_lf_ctl lf_ctl = {.u = 0x0};
+
+   lf_inprog.u = plt_read64(lf->rbase + CPT_LF_INPROG);
+   if (((lf_inprog.s.gwb_cnt & 0x1) == 0x1) &&
+   (lf_inprog.s.grb_partial == 0x0)) {
+   lf_inprog.s.grp_drp = 1;
+   plt_write64(lf_inprog.u, lf->rbase + CPT_LF_INPROG);
+
+   lf_ctl.u = plt_read64(lf->rbase + CPT_LF_CTL);
+   lf_ctl.s.ena = 1;
+   plt_write64(lf_ctl.u, lf->rbase + CPT_LF_CTL);
+
+   if (roc_model_is_cn10k())
+   cpt_10k_lf_rst_lmtst(lf, ROC_CPT_DFLT_ENG_GRP_SE);
+   else
+   cpt_9k_lf_rst_lmtst(lf, ROC_CPT_DFLT_ENG_GRP_SE);
+
+   plt_read64(lf->rbase + CPT_LF_INPROG);
+   plt_delay_us(2);
+   }
+   if (cpt_lf_reset(lf))
+   plt_err("Invalid CPT LF to reset");
+}
+
 void
 cpt_lf_fini(struct roc_cpt_lf *lf)
 {
@@ -720,6 +801,7 @@ cpt_lf_fini(struct roc_cpt_lf *lf)
 
/* Disable IQ */
roc_cpt_iq_disable(lf);
+   roc_cpt_iq_reset(lf);
 
/* Free memory */
plt_free(lf->iq_vaddr);
diff --git a/drivers/common/cnxk/roc_mbox.h b/drivers/common/cnxk/roc_mbox.h
index 965c704322..b6dee69ac8 100644
--- a/drivers/common/cnxk/roc_mbox.h
+++ b/drivers/common/cnxk/roc_mbox.h
@@ -151,6 +151,7 @@ struct mbox_msghdr {
M(CPT_RXC_TIME_CFG, 0xA06, cpt_rxc_time_cfg, cpt_rxc_time_cfg_req, \
  msg_rsp) \
M(CPT_CTX_CACHE_SYNC, 0xA07, cpt_ctx_cache_sync, msg_req, msg_rsp) \
+   M(CPT_LF_RESET, 0xA08, cpt_lf_reset, cpt_lf_rst_req, msg_rsp)  \
M(CPT_RX_INLINE_LF_CFG, 0xBFE, cpt_rx_inline_lf_cfg,   \
  cpt_rx_inline_lf_cfg_msg, msg_rsp)   \
M(CPT_GET_CAPS, 0xBFD, cpt_caps_get, msg_req, cpt_caps_rsp_msg)\
@@ -1511,6 +1512,11 @@ struct cpt_eng_grp_rsp {
uint8_t __io eng_grp_num;
 };
 
+struct cpt_lf_rst_req {
+   struct mbox_msghdr hdr;
+   uint32_t __io slot;
+};
+
 /* REE mailbox error codes
  * Range 1001 - 1100.
  */
-- 
2.25.1

[PATCH] ethdev: remove header split Rx offload

2022-08-11 Thread xuan . ding

From: Xuan Ding 

As announced in the deprecation notice, this patch removes the Rx offload
flag 'RTE_ETH_RX_OFFLOAD_HEADER_SPLIT' and the 'split_hdr_size' field from
the structure 'rte_eth_rxmode'. Users can still use
'RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT' for per-queue packet split offload,
which is configured by 'rte_eth_rxseg_split'.

Signed-off-by: Xuan Ding 
---
 app/test-eventdev/test_perf_common.c|  1 -
 app/test-pipeline/init.c|  1 -
 app/test-pmd/cmdline.c  | 12 ++--
 app/test/test_link_bonding.c|  1 -
 app/test/test_link_bonding_mode4.c  |  1 -
 app/test/test_link_bonding_rssconf.c|  2 --
 app/test/test_pmd_perf.c|  1 -
 app/test/test_security_inline_proto.c   |  1 -
 doc/guides/nics/fm10k.rst   |  4 
 doc/guides/nics/ixgbe.rst   |  4 
 doc/guides/rel_notes/deprecation.rst|  6 --
 doc/guides/rel_notes/release_22_11.rst  |  5 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 ++--
 drivers/net/cnxk/cnxk_ethdev_ops.c  |  1 -
 drivers/net/failsafe/failsafe_ops.c |  2 --
 drivers/net/fm10k/fm10k_ethdev.c|  1 -
 drivers/net/fm10k/fm10k_rxtx_vec.c  |  4 
 drivers/net/i40e/i40e_rxtx_vec_common.h |  4 
 drivers/net/mvneta/mvneta_ethdev.c  |  5 -
 drivers/net/mvpp2/mrvl_ethdev.c |  5 -
 drivers/net/thunderx/nicvf_ethdev.c |  5 -
 examples/bond/main.c|  1 -
 examples/flow_filtering/main.c  |  3 ---
 examples/ip_fragmentation/main.c|  1 -
 examples/ip_pipeline/link.c |  1 -
 examples/ip_reassembly/main.c   |  1 -
 examples/ipsec-secgw/ipsec-secgw.c  |  1 -
 examples/l2fwd-event/l2fwd_common.c |  3 ---
 examples/l2fwd-jobstats/main.c  |  3 ---
 examples/l2fwd-keepalive/main.c |  3 ---
 examples/l2fwd/main.c   |  3 ---
 examples/link_status_interrupt/main.c   |  3 ---
 examples/multi_process/symmetric_mp/main.c  |  1 -
 examples/ntb/ntb_fwd.c  |  1 -
 examples/pipeline/obj.c |  1 -
 examples/qos_meter/main.c   |  1 -
 examples/qos_sched/init.c   |  3 ---
 examples/vhost/main.c   |  1 -
 lib/ethdev/rte_ethdev.c |  1 -
 lib/ethdev/rte_ethdev.h |  3 ---
 40 files changed, 13 insertions(+), 92 deletions(-)

diff --git a/app/test-eventdev/test_perf_common.c b/app/test-eventdev/test_perf_common.c
index 81420be73a..7474b9270a 100644
--- a/app/test-eventdev/test_perf_common.c
+++ b/app/test-eventdev/test_perf_common.c
@@ -1244,7 +1244,6 @@ perf_ethdev_setup(struct evt_test *test, struct evt_options *opt)
struct rte_eth_conf port_conf = {
.rxmode = {
.mq_mode = RTE_ETH_MQ_RX_RSS,
-   .split_hdr_size = 0,
},
.rx_adv_conf = {
.rss_conf = {
diff --git a/app/test-pipeline/init.c b/app/test-pipeline/init.c
index eee0719b67..d146c44be0 100644
--- a/app/test-pipeline/init.c
+++ b/app/test-pipeline/init.c
@@ -68,7 +68,6 @@ struct app_params app = {
 
 static struct rte_eth_conf port_conf = {
.rxmode = {
-   .split_hdr_size = 0,
.offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM,
},
.rx_adv_conf = {
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b4fe9dfb17..5787659c32 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -745,7 +745,7 @@ static void cmd_help_long_parsed(void *parsed_result,
 
"port config  rx_offload vlan_strip|"
"ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|"
-   "outer_ipv4_cksum|macsec_strip|header_split|"
+   "outer_ipv4_cksum|macsec_strip|"
"vlan_filter|vlan_extend|jumbo_frame|scatter|"
"buffer_split|timestamp|security|keep_crc on|off\n"
" Enable or disable a per port Rx offloading"
@@ -753,7 +753,7 @@ static void cmd_help_long_parsed(void *parsed_result,
 
"port (port_id) rxq (queue_id) rx_offload vlan_strip|"
"ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|"
-   "outer_ipv4_cksum|macsec_strip|header_split|"
+   "outer_ipv4_cksum|macsec_strip|"
"vlan_filter|vlan_extend|jumbo_frame|scatter|"
"buffer_split|timestamp|security|keep_crc on|off\n"
"Enable or disable a per queue Rx offloading"
@@ -12522,7 +12522,7 @@ static cmdline_parse_token_string_t cmd_config_per_port_rx_offload_result_offloa
(struct cmd_config_per_port_rx_offload_result,

[PATCH v2] examples/ethtool: adds promiscuous mode functionality

2022-08-11 Thread Muhammad Jawad Hussain

ethtool previously had no promiscuous mode functionality,
which is needed for viewing broadcast and multicast packets.
This patch allows the user to turn promiscuous mode on or off
on each port through the command line.

Signed-off-by: Muhammad Jawad Hussain 
---
 doc/guides/sample_app_ug/ethtool.rst  |  1 +
 examples/ethtool/ethtool-app/ethapp.c | 79 ++-
 examples/ethtool/lib/rte_ethtool.c| 24 
 examples/ethtool/lib/rte_ethtool.h|  2 +
 4 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/doc/guides/sample_app_ug/ethtool.rst b/doc/guides/sample_app_ug/ethtool.rst
index 159e9e0639..6edd9940b8 100644
--- a/doc/guides/sample_app_ug/ethtool.rst
+++ b/doc/guides/sample_app_ug/ethtool.rst
@@ -54,6 +54,7 @@ they do as follows:
 * ``regs``: Dump port register(s) to file
 * ``ringparam``: Get/set ring parameters
 * ``rxmode``: Toggle port Rx mode
+* ``set promisc``: Enable/Disable promiscuous mode on ports
 * ``stop``: Stop port
 * ``validate``: Check that given MAC address is valid unicast address
 * ``vlan``: Add/remove VLAN id
diff --git a/examples/ethtool/ethtool-app/ethapp.c b/examples/ethtool/ethtool-app/ethapp.c
index 78e86534e8..f89e4c4cf0 100644
--- a/examples/ethtool/ethtool-app/ethapp.c
+++ b/examples/ethtool/ethtool-app/ethapp.c
@@ -13,8 +13,16 @@
 #include "ethapp.h"
 
 #define EEPROM_DUMP_CHUNKSIZE 1024
-
-
+typedef uint16_t portid_t;
+
+/* *** PROMISC_MODE *** */
+struct cmd_set_promisc_mode_result {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t promisc;
+   cmdline_fixed_string_t port_all; /* valid if "allports" argument == 1 */
+   uint16_t port_num;   /* valid if "allports" argument == 0 */
+   cmdline_fixed_string_t mode;
+};
 struct pcmd_get_params {
cmdline_fixed_string_t cmd;
 };
@@ -133,6 +141,22 @@ cmdline_parse_token_string_t pcmd_vlan_token_mode =
 cmdline_parse_token_num_t pcmd_vlan_token_vid =
TOKEN_NUM_INITIALIZER(struct pcmd_vlan_params, vid, RTE_UINT16);
 
+/* promisc mode */
+
+cmdline_parse_token_string_t cmd_setpromisc_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_promisc_mode_result, set, "set");
+cmdline_parse_token_string_t cmd_setpromisc_promisc =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_promisc_mode_result, promisc,
+"promisc");
+cmdline_parse_token_string_t cmd_setpromisc_portall =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_promisc_mode_result, port_all,
+"all");
+cmdline_parse_token_num_t cmd_setpromisc_portnum =
+   TOKEN_NUM_INITIALIZER(struct cmd_set_promisc_mode_result, port_num,
+ RTE_UINT16);
+cmdline_parse_token_string_t cmd_setpromisc_mode =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_promisc_mode_result, mode,
+"on#off");
 
 static void
 pcmd_quit_callback(__rte_unused void *ptr_params,
@@ -142,6 +166,30 @@ pcmd_quit_callback(__rte_unused void *ptr_params,
cmdline_quit(ctx);
 }
 
+static void pcmd_set_promisc_mode_parsed(void *ptr_params,
+   __rte_unused struct cmdline *ctx,
+   void *allports)
+{
+   struct cmd_set_promisc_mode_result *res = ptr_params;
+   int enable;
+   portid_t i;
+   if (!strcmp(res->mode, "on"))
+   enable = 1;
+   else
+   enable = 0;
+
+   /* all ports */
+   if (allports) {
+   RTE_ETH_FOREACH_DEV(i)
+   eth_set_promisc_mode(i, enable);
+   } else {
+   eth_set_promisc_mode(res->port_num, enable);
+   }
+   if (enable)
+   printf("Promisc mode Enabled\n");
+   else
+   printf("Promisc mode Disabled\n");
+}
 
 static void
 pcmd_drvinfo_callback(__rte_unused void *ptr_params,
@@ -869,6 +917,31 @@ cmdline_parse_inst_t pcmd_vlan = {
},
 };
 
+cmdline_parse_inst_t cmd_set_promisc_mode_all = {
+   .f = pcmd_set_promisc_mode_parsed,
+   .data = (void *)1,
+   .help_str = "set promisc all \n Set promisc mode for all ports",
+   .tokens = {
+   (void *)&cmd_setpromisc_set,
+   (void *)&cmd_setpromisc_promisc,
+   (void *)&cmd_setpromisc_portall,
+   (void *)&cmd_setpromisc_mode,
+   NULL,
+   },
+};
+
+cmdline_parse_inst_t cmd_set_promisc_mode_one = {
+   .f = pcmd_set_promisc_mode_parsed,
+   .data = (void *)0,
+   .help_str = "set promisc  \n Set promisc mode on port_id",
+   .tokens = {
+   (void *)&cmd_setpromisc_set,
+   (void *)&cmd_setpromisc_promisc,
+   (void *)&cmd_setpromisc_portnum,
+   (void *)&cmd_setpromisc_mode,
+   NULL,
+   },
+};
 
 cmdline_parse_ctx_t list_prompt_commands[] = {
(cmdline_parse_inst_t *)&pcmd_drvinfo,
@@ -886,6 +959,8 @@ cmdline_parse_ctx_t list_prompt_commands[

[RFC] ethdev: add send to kernel action

2022-08-11 Thread Michael Savisko

In some cases an application may receive a packet that should have been
received by the kernel. In this case the application uses KNI or other means
to transfer the packet to the kernel.
This commit introduces an rte_flow action that the application may use
to route the packet to the kernel while it is still in the HW.

Signed-off-by: Michael Savisko 
---
 lib/librte_ethdev/rte_flow.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f92bef0184..969a607115 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2853,6 +2853,11 @@ enum rte_flow_action_type {
 * See file rte_mtr.h for MTR profile object configuration.
 */
RTE_FLOW_ACTION_TYPE_METER_MARK,
+
+   /*
+* Send traffic to kernel.
+*/
+   RTE_FLOW_ACTION_TYPE_SEND_TO_KERNEL,
 };
 
 /**
-- 
2.27.0

Re: [RFC v2] non-temporal memcpy

2022-08-11 Thread Mattias Rönnblom


On 2022-08-10 23:05, Honnappa Nagarahalli wrote:





+TO: @Honnappa, we need input from ARM


From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com]
Sent: Friday, 29 July 2022 21.49



From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com]
Sent: Friday, 29 July 2022 14.14


Sorry, missed that part.




Another question - who will do 'sfence' after the copying?
Would it be inside memcpy_nt (seems quite costly), or would it
be another API function for that: memcpy_nt_flush() or so?


Outside. Only the developer knows when it is required, so it wouldn't
make any sense to add the cost inside memcpy_nt().


I don't think we should add a flush function; it would just be another
name for an already existing function. Referring to the required
operation in the memcpy_nt() function documentation should suffice.




Ok, but again wouldn't it be arch specific?
AFAIK for x86 it needs to boil down to sfence, for other architectures
- I don't know.
If you think there already is some generic one (rte_wmb?) that would
always produce correct instructions - sure let's use it.



DPDK has generic functions to wrap architecture specific stuff like
memory barriers.

Because they are non-temporal stores, I suspect that rte_mb() is
required before reading the data from the location it was copied to.
Ensuring that STORE operations are ordered (rte_wmb) might not
suffice. However, I'm not a CPU expert, so I will seek advice from
more qualified people in the community on this.


I think for IA sfence is enough, see citation below, for other
architectures - no idea.
What I am trying to say - it needs to be the *same* function on all
archs we support.


Now I get it: rte_wmb() might be appropriate on x86, but if any other
architecture requires something else, we should add a new common function
for flushing, e.g. rte_memcpy_nt_flush().



IA SW optimization manual:
9.4.2 Streaming Store Usage Models
The two primary usage domains for streaming store are coherent
requests and non-coherent requests.
9.4.2.1 Coherent Requests
Coherent requests are normal loads and stores to system memory, which
may also hit cache lines present in another processor in a
multiprocessor environment. With coherent requests, a streaming store
can be used in the same way as a regular store that has been mapped
with a WC memory type (PAT or MTRR). An SFENCE instruction must be
used within a producer-consumer usage model in order to ensure
coherency and visibility of data between processors.
Within a single-processor system, the CPU can also re-read the same
memory location and be assured of coherence (that is, a single,
consistent view of this memory location).
The same is true for a multiprocessor
(MP) system, assuming an accepted MP software producer-consumer
synchronization policy is employed.



With this reference, I am convinced that you are right about the SFENCE. This
puts a checkmark on this item on my TODO list for the patch. Thank you,
Konstantin!

Any ARM CPU experts on the mailing list seeing this, not on vacation?
@Honnappa, I'm looking at you. :-)

Summing up, the question is:

After a bunch of *non-temporal* stores (STNP instruction) on ARM
architecture, does calling rte_wmb() suffice to ensure the data is visible
across the system?

Apologies for the late response; the docs did not have enough information.
The internal dialogue is still going on, but I have some information now.
There is some information in the ArmV8 programmer's guide [1], though it is
not complete.
In summary, rte_wmb()/rte_mb() would not suffice; we need new APIs.

From my perspective, I see several scenarios:
1) Need for ordering before the memcpy_nt. Here there are several cases:
   a. LD – LDNP/STNP – DMB NSHLD
   b. ST – LDNP/STNP – DMB NSH
2) Need for ordering after the memcpy. Again, we have similar use cases:
   a. LDNP/STNP – LD – DMB NSH
   b. LDNP/STNP – ST – DMB NSH

The 'ST - STNP' and 'STNP - ST' cases do not apply here, but it would be good
to add an API for completeness.

So maybe we could have rte_[r|w]mb_nt() APIs.



Is rte_smp_rmb()/rte_smp_wmb() also not enough on ARM?


[1] https://developer.arm.com/documentation/den0024/a/The-A64-instruction-set/Memory-access-instructions/Non-temporal-load-and-store-pair

Re: [RFC v2] non-temporal memcpy

2022-08-11 Thread Mattias Rönnblom


On 2022-08-10 23:20, Honnappa Nagarahalli wrote:






From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
Sent: Wednesday, 10 August 2022 13.56

On 2022-08-09 17:26, Stephen Hemminger wrote:


[...]



Alignment seems like a non-issue to me. A NT-store memcpy() can be
made free of alignment requirements, incurring only a very slight cost
for the always-aligned case (who has their data always 16-byte aligned
anyways?).

The memory barrier required on x86 seems like a bigger issue.


Maybe rte_non_cache_copy()?



rte_memcpy_nt_weakly_ordered(), or rte_memcpy_nt_weak(). And a
rte_memcpy_nt() with the sfence in place, which the user hopefully
will find first? I don't know. I would prefer not having the weak
variant at all.

I think providing a weakly ordered version is required to offset the cost of
the barriers. One might be able to copy multiple packets and then issue a
barrier.



On what architecture?

I assumed that only x86 had the peculiar property of having different 
memory models for regular and NT load/stores.




Accepting weak memory ordering (i.e., no sfence) could also be one of
the flags, assuming rte_memcpy_nt() would have a flags parameter.
Default is safe (=memcpy() semantics), but potentially slower.


Excellent idea!




Want to avoid the naive user just doing s/memcpy/rte_memcpy_nt/ and
expecting everything to work.
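
The flags idea discussed above can be sketched in portable C. This is only an
illustration under stated assumptions, not the proposed DPDK API: the names
memcpy_nt_sketch(), memcpy_nt_sketch_burst() and MEMCPY_NT_SKETCH_F_WEAK are
hypothetical, plain memcpy() stands in for the non-temporal stores, and a C11
release fence stands in for the architecture-specific barrier (sfence on x86).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag: the caller accepts weak ordering and fences later. */
#define MEMCPY_NT_SKETCH_F_WEAK (1u << 0)

/* Stand-in for the arch-specific store fence (sfence on x86). */
static inline void
memcpy_nt_sketch_fence(void)
{
	atomic_thread_fence(memory_order_release);
}

/*
 * Copy len bytes. A real implementation would use non-temporal stores;
 * plain memcpy() is used here so the sketch builds anywhere.
 * Without the WEAK flag the copy is followed by a fence (safe default).
 */
static inline void *
memcpy_nt_sketch(void *dst, const void *src, size_t len, unsigned int flags)
{
	assert(len == 0 || (dst != NULL && src != NULL));
	memcpy(dst, src, len);
	if (!(flags & MEMCPY_NT_SKETCH_F_WEAK))
		memcpy_nt_sketch_fence();
	return dst;
}

/* Batch copy: n weakly ordered copies followed by a single fence. */
static inline void
memcpy_nt_sketch_burst(void *const dst[], const void *const src[],
		       const size_t len[], unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		memcpy_nt_sketch(dst[i], src[i], len[i],
				 MEMCPY_NT_SKETCH_F_WEAK);
	memcpy_nt_sketch_fence();
}
```

With this shape, a plain s/memcpy/.../ replacement with flags set to 0 keeps
memcpy() semantics, while a caller copying several packets can opt in to the
weak variant and pay for one barrier per burst.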

[RFC] ethdev: introduce hairpin memory capabilities

2022-08-11 Thread Dariusz Sosnowski

This RFC introduces new hairpin queue configuration options through the
rte_eth_hairpin_conf struct, allowing the memory configuration of Rx and Tx
hairpin queues to be tuned. The hairpin configuration is extended with the
following fields:

- use_locked_device_memory - If set, PMD will use specialized on-device
  memory to store RX or TX hairpin queue data.
- use_rte_memory - If set, PMD will use DPDK-managed memory to store RX
  or TX hairpin queue data.
- force_memory - If set, PMD will be forced to use the provided memory
  settings. If no appropriate resources are available, then device start
  will fail. If unset and no resources are available, PMD will fall back
  to using the default type of resource for the given queue.

Hairpin capabilities are also extended, to allow verification of support
of given hairpin memory configurations. Struct rte_eth_hairpin_cap is
extended with two additional fields of type rte_eth_hairpin_queue_cap:

- rx_cap - memory capabilities of hairpin RX queues.
- tx_cap - memory capabilities of hairpin TX queues.

Struct rte_eth_hairpin_queue_cap exposes whether given queue type
supports use_locked_device_memory and use_rte_memory flags.
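
As an illustration of how these flags interact, the validation rules can be
sketched in self-contained C. The struct and function names below
(hairpin_conf_sketch, hairpin_queue_cap_sketch, hairpin_mem_conf_check) are
hypothetical stand-ins rather than the ethdev definitions, and the one-bit
field layout is an assumption; only the flag names and the four checks follow
this RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the per-direction capability struct (layout assumed). */
struct hairpin_queue_cap_sketch {
	uint32_t locked_device_memory:1;
	uint32_t rte_memory:1;
};

/* Stand-in for the new memory-related fields of the hairpin config. */
struct hairpin_conf_sketch {
	uint32_t use_locked_device_memory:1;
	uint32_t use_rte_memory:1;
	uint32_t force_memory:1;
};

/* Return 0 if the memory settings are acceptable, -EINVAL otherwise. */
static int
hairpin_mem_conf_check(const struct hairpin_conf_sketch *conf,
		       const struct hairpin_queue_cap_sketch *cap)
{
	assert(conf != NULL && cap != NULL);

	/* A requested memory type must be supported by the queue type. */
	if (conf->use_locked_device_memory && !cap->locked_device_memory)
		return -EINVAL;
	if (conf->use_rte_memory && !cap->rte_memory)
		return -EINVAL;
	/* The two memory types are mutually exclusive. */
	if (conf->use_locked_device_memory && conf->use_rte_memory)
		return -EINVAL;
	/* Forcing makes sense only if some memory type was selected. */
	if (conf->force_memory &&
	    !conf->use_locked_device_memory && !conf->use_rte_memory)
		return -EINVAL;
	return 0;
}
```

The same function would be called once with the Rx capabilities and once with
the Tx capabilities, matching the rx_cap and tx_cap fields described above.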

Signed-off-by: Dariusz Sosnowski 
---
 lib/ethdev/rte_ethdev.c | 44 
 lib/ethdev/rte_ethdev.h | 65 -
 2 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 1979dc0850..edcec08231 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1945,6 +1945,28 @@ rte_eth_rx_hairpin_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
conf->peer_count, cap.max_rx_2_tx);
return -EINVAL;
}
+   if (conf->use_locked_device_memory && !cap.rx_cap.locked_device_memory) {
+   RTE_ETHDEV_LOG(ERR,
+   "Attempt to use locked device memory for Rx queue, which is not supported");
+   return -EINVAL;
+   }
+   if (conf->use_rte_memory && !cap.rx_cap.rte_memory) {
+   RTE_ETHDEV_LOG(ERR,
+   "Attempt to use DPDK memory for Rx queue, which is not supported");
+   return -EINVAL;
+   }
+   if (conf->use_locked_device_memory && conf->use_rte_memory) {
+   RTE_ETHDEV_LOG(ERR,
+   "Attempt to use mutually exclusive memory settings for Rx queue");
+   return -EINVAL;
+   }
+   if (conf->force_memory &&
+   !conf->use_locked_device_memory &&
+   !conf->use_rte_memory) {
+   RTE_ETHDEV_LOG(ERR,
+   "Attempt to force Rx queue memory settings, but none is set");
+   return -EINVAL;
+   }
if (conf->peer_count == 0) {
RTE_ETHDEV_LOG(ERR,
"Invalid value for number of peers for Rx queue(=%u), should be: > 0",
@@ -2111,6 +2133,28 @@ rte_eth_tx_hairpin_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
conf->peer_count, cap.max_tx_2_rx);
return -EINVAL;
}
+   if (conf->use_locked_device_memory && !cap.tx_cap.locked_device_memory) {
+   RTE_ETHDEV_LOG(ERR,
+   "Attempt to use locked device memory for Tx queue, which is not supported");
+   return -EINVAL;
+   }
+   if (conf->use_rte_memory && !cap.tx_cap.rte_memory) {
+   RTE_ETHDEV_LOG(ERR,
+   "Attempt to use DPDK memory for Tx queue, which is not supported");
+   return -EINVAL;
+   }
+   if (conf->use_locked_device_memory && conf->use_rte_memory) {
+   RTE_ETHDEV_LOG(ERR,
+   "Attempt to use mutually exclusive memory settings for Tx queue");
+   return -EINVAL;
+   }
+   if (conf->force_memory &&
+   !conf->use_locked_device_memory &&
+   !conf->use_rte_memory) {
+   RTE_ETHDEV_LOG(ERR,
+   "Attempt to force Tx queue memory settings, but none is set");
+   return -EINVAL;
+   }
if (conf->peer_count == 0) {
RTE_ETHDEV_LOG(ERR,
"Invalid value for number of peers for Tx queue(=%u), should be: > 0",
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index de9e970d4d..e179b0e79b 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1273,6 +1273,28 @@ struct rte_eth_txconf {
void *reserved_ptrs[2];   /**< Reserved for future fields */
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * A structure used to return the Tx or Rx hairpin queue capabilities that are supported.
+ */
+struct rte_eth_hairpin_queue_cap {
+   /**
+* When set, a specialized on-device memory type can be used as a backing
+* storage for a given hairpin queue type.
+*/
+   uint32_t l

[PATCH v2] examples/ethtool: update rxmode to increase functionality

2022-08-11 Thread Muhammad Jawad Hussain

Previously, the rxmode functionality did not allow the user to choose
between different VFs, nor did it allow the user to choose rxmode settings.
By default it was set to vf = 0, rxmode = AUPE, and the on/off state was
toggled without letting the user know what state it was in.
There were also no error messages.

Changes:
- added flag for vf id
- added flags for AUPE|ROPE|BAM|MPE rxmodes
- added flag for on/off
- added error messages
- added info messages

Signed-off-by: Muhammad Jawad Hussain 
---
 doc/guides/sample_app_ug/ethtool.rst  |   2 +-
 examples/ethtool/ethtool-app/ethapp.c | 134 --
 examples/ethtool/lib/rte_ethtool.c|  38 
 examples/ethtool/meson.build  |   4 +-
 4 files changed, 110 insertions(+), 68 deletions(-)

diff --git a/doc/guides/sample_app_ug/ethtool.rst b/doc/guides/sample_app_ug/ethtool.rst
index 159e9e0639..2e53084cc3 100644
--- a/doc/guides/sample_app_ug/ethtool.rst
+++ b/doc/guides/sample_app_ug/ethtool.rst
@@ -53,7 +53,7 @@ they do as follows:
 * ``portstats``: Print port statistics
 * ``regs``: Dump port register(s) to file
 * ``ringparam``: Get/set ring parameters
-* ``rxmode``: Toggle port Rx mode
+* ``rxmode``: Set rxmode of vfs on ports
 * ``stop``: Stop port
 * ``validate``: Check that given MAC address is valid unicast address
 * ``vlan``: Add/remove VLAN id
diff --git a/examples/ethtool/ethtool-app/ethapp.c b/examples/ethtool/ethtool-app/ethapp.c
index 78e86534e8..214433a98b 100644
--- a/examples/ethtool/ethtool-app/ethapp.c
+++ b/examples/ethtool/ethtool-app/ethapp.c
@@ -8,12 +8,30 @@
 #include 
 #include 
 #include 
-
+#ifdef RTE_NET_IXGBE
+#include 
+#endif
+#ifdef RTE_NET_BNXT
+#include 
+#endif
 #include "rte_ethtool.h"
 #include "ethapp.h"
 
 #define EEPROM_DUMP_CHUNKSIZE 1024
 
+typedef uint16_t portid_t;
+
+/* *** CONFIGURE VF RECEIVE MODE *** */
+struct cmd_set_vf_rxmode {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t port;
+   portid_t port_id;
+   cmdline_fixed_string_t vf;
+   uint8_t vf_id;
+   cmdline_fixed_string_t what;
+   cmdline_fixed_string_t mode;
+   cmdline_fixed_string_t on;
+};
 
 struct pcmd_get_params {
cmdline_fixed_string_t cmd;
@@ -65,8 +83,6 @@ cmdline_parse_token_string_t pcmd_open_token_cmd =
TOKEN_STRING_INITIALIZER(struct pcmd_int_params, cmd, "open");
 cmdline_parse_token_string_t pcmd_stop_token_cmd =
TOKEN_STRING_INITIALIZER(struct pcmd_int_params, cmd, "stop");
-cmdline_parse_token_string_t pcmd_rxmode_token_cmd =
-   TOKEN_STRING_INITIALIZER(struct pcmd_int_params, cmd, "rxmode");
 cmdline_parse_token_string_t pcmd_portstats_token_cmd =
TOKEN_STRING_INITIALIZER(struct pcmd_int_params, cmd, "portstats");
 cmdline_parse_token_num_t pcmd_int_token_port =
@@ -133,6 +149,31 @@ cmdline_parse_token_string_t pcmd_vlan_token_mode =
 cmdline_parse_token_num_t pcmd_vlan_token_vid =
TOKEN_NUM_INITIALIZER(struct pcmd_vlan_params, vid, RTE_UINT16);
 
+/* rxmode */
+cmdline_parse_token_string_t cmd_set_vf_rxmode_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_rxmode,
+set, "set");
+cmdline_parse_token_string_t cmd_set_vf_rxmode_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_rxmode,
+port, "port");
+cmdline_parse_token_num_t cmd_set_vf_rxmode_portid =
+   TOKEN_NUM_INITIALIZER(struct cmd_set_vf_rxmode,
+ port_id, RTE_UINT16);
+cmdline_parse_token_string_t cmd_set_vf_rxmode_vf =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_rxmode,
+vf, "vf");
+cmdline_parse_token_num_t cmd_set_vf_rxmode_vfid =
+   TOKEN_NUM_INITIALIZER(struct cmd_set_vf_rxmode,
+ vf_id, RTE_UINT8);
+cmdline_parse_token_string_t cmd_set_vf_rxmode_what =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_rxmode,
+what, "rxmode");
+cmdline_parse_token_string_t cmd_set_vf_rxmode_mode =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_rxmode,
+mode, "AUPE#ROPE#BAM#MPE");
+cmdline_parse_token_string_t cmd_set_vf_rxmode_on =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_rxmode,
+on, "on#off");
 
 static void
 pcmd_quit_callback(__rte_unused void *ptr_params,
@@ -142,7 +183,6 @@ pcmd_quit_callback(__rte_unused void *ptr_params,
cmdline_quit(ctx);
 }
 
-
 static void
 pcmd_drvinfo_callback(__rte_unused void *ptr_params,
__rte_unused struct cmdline *ctx,
@@ -150,7 +190,6 @@ pcmd_drvinfo_callback(__rte_unused void *ptr_params,
 {
struct ethtool_drvinfo info;
uint16_t id_port;
-
RTE_ETH_FOREACH_DEV(id_port) {
memset(&info, 0, sizeof(info));
if (rte_ethtool_get_drvinfo(id_port, &info)) {
@@ -447,26 +486,57 @@ pcmd_stop_callback(__rte_unused void *ptr_params,
printf("Port %i: Error stopping device\n", params-

Re: [PATCH] multi_proc_support.rst: updated file location for config file in documentation

2022-08-11 Thread Muhammad Jawad Hussain

Hi,
I submitted a patch on 17 March 2022. One of the tests is failing,
and the failure is not related to the patch.
The following test is failing:
- ci/iol-broadcom-Functional

Can you please rerun the tests so that the patch can be submitted?

Regards,
-Jawad
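
For reference, the path layout that the quoted patch documents
(/var/run/dpdk/rte/config for root, /run/user/$EUID/dpdk/rte/config otherwise,
with "rte" replaceable through --file-prefix) can be sketched as below. The
function name runtime_config_path_sketch() is hypothetical, and the code only
mirrors the documented layout; it is not how the EAL itself computes the path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Build the shared config path following the documented layout.
 * prefix corresponds to the --file-prefix value ("rte" by default).
 * Returns the snprintf() result: the length needed, or negative on error.
 */
static int
runtime_config_path_sketch(char *buf, size_t len, uid_t euid,
			   const char *prefix)
{
	assert(buf != NULL && prefix != NULL && strlen(prefix) > 0);

	if (euid == 0)
		return snprintf(buf, len, "/var/run/dpdk/%s/config", prefix);
	return snprintf(buf, len, "/run/user/%lu/dpdk/%s/config",
			(unsigned long)euid, prefix);
}
```

A caller would typically pass geteuid() as the euid argument and check that
the returned length is smaller than the buffer size.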


On Thu, Mar 17, 2022 at 11:19 AM Muhammad Jawad Hussain <
jawad.huss...@emumba.com> wrote:

> Previously, .rte_config was created at /var/run/.rte_config
> for root users and $HOME/.rte_config for non-root users.
> Now the file is renamed to config
> and is created at /var/run/dpdk/rte/config
> for root users and /run/user/$EUID/dpdk/rte/config for non-root users.
> The documentation of multi_proc_support has been updated
> to reflect this change.
>
> Signed-off-by: Muhammad Jawad Hussain 
> ---
>  doc/guides/prog_guide/multi_proc_support.rst | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/doc/guides/prog_guide/multi_proc_support.rst
> b/doc/guides/prog_guide/multi_proc_support.rst
> index 815e8bdc43..a409769b9c 100644
> --- a/doc/guides/prog_guide/multi_proc_support.rst
> +++ b/doc/guides/prog_guide/multi_proc_support.rst
> @@ -113,8 +113,8 @@ Support for this usage scenario is provided using the
> ``--file-prefix`` paramete
>
>  By default, the EAL creates hugepage files on each hugetlbfs filesystem
> using the rtemap_X filename,
>  where X is in the range 0 to the maximum number of hugepages -1.
> -Similarly, it creates shared configuration files, memory mapped in each
> process, using the /var/run/.rte_config filename,
> -when run as root (or $HOME/.rte_config when run as a non-root user;
> +Similarly, it creates shared configuration files, memory mapped in each
> process, using the /var/run/dpdk/rte/config filename,
> +when run as root (or /run/user/$EUID/dpdk/rte/config when run as a
> non-root user;
>  if filesystem and device permissions are set up to allow this).
>  The rte part of the filenames of each of the above is configurable using
> the file-prefix parameter.
>
> --
> 2.32.0
>
>

RE: [PATCH v11 1/7] eventdev/eth_rx: add adapter instance get API

2022-08-11 Thread Kundapura, Ganapati

Hi Jerin,
Could you please review this?

Thanks,
Ganapati

> -Original Message-
> From: Ganapati Kundapura 
> Sent: 19 July 2022 13:56
> To: jer...@marvell.com; Jayatheerthan, Jay ;
> Naga Harish K, S V ; dev@dpdk.org
> Subject: [PATCH v11 1/7] eventdev/eth_rx: add adapter instance get API
> 
> Added rte_event_eth_rx_adapter_instance_get() to get adapter instance id
> for specified ethernet device id and rx queue index.
> 
> Signed-off-by: Ganapati Kundapura 
> 
> Reviewed-by: Naga Harish K S V 
> Acked-by: Jay Jayatheerthan 
> ---
> v11:
> * added instance_get under 22.11 in version.map
> 
> v10:
> * Add Review and Ack to series
> 
> v9:
> * Corrected rte_event_eth_tx_adapter_instanceget to
> * rte_event_eth_tx_adapter_instance_get in
> event_ethernet_tx_adapter.rst
> 
> v8:
> * Removed limits.h inclusion
> 
> v7:
> * Remove allocation of instance array and storage of instance id
> * in instance array
> * Use Rx adapter instance data to query instance id for specified
> * eth_dev_id and rx_queue_id
> 
> v6:
> * rx adapter changes removed from patch4 and moved to patch1
> 
> v5:
> * patch is split into separate patches
> 
> v4:
> * Moved instance array allocation and instance id storage
>   before adapter's nb_queue updation for handling the
>   error case  properly
> 
> v3:
> * Fixed checkpatch error
> 
> v2:
> * Fixed build issues
> * Added telemetry support for rte_event_eth_rx_adapter_instance_get
> * arranged functions in alphabetical order in version.map
> 
> diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h
> index 6940266..c58ba05 100644
> --- a/lib/eventdev/eventdev_pmd.h
> +++ b/lib/eventdev/eventdev_pmd.h
> @@ -888,6 +888,26 @@ typedef int
> (*eventdev_eth_rx_adapter_vector_limits_get_t)(
>   const struct rte_eventdev *dev, const struct rte_eth_dev *eth_dev,
>   struct rte_event_eth_rx_adapter_vector_limits *limits);
> 
> +/**
> + * Get Rx adapter instance id for Rx queue
> + *
> + * @param eth_dev_id
> + *  Port identifier of ethernet device
> + *
> + * @param rx_queue_id
> + *  Ethernet device Rx queue index
> + *
> + * @param[out] rxa_inst_id
> + *  Pointer to Rx adapter instance identifier.
> + *  Contains valid Rx adapter instance id when return value is 0
> + *
> + * @return
> + *   -  0: Success
> + *   - <0: Error code on failure
> + */
> +typedef int (*eventdev_eth_rx_adapter_instance_get_t)
> + (uint16_t eth_dev_id, uint16_t rx_queue_id, uint8_t *rxa_inst_id);
> +
>  typedef uint32_t rte_event_pmd_selftest_seqn_t;  extern int
> rte_event_pmd_selftest_seqn_dynfield_offset;
> 
> @@ -1321,6 +1341,8 @@ struct eventdev_ops {
>   eventdev_eth_rx_adapter_vector_limits_get_t
>   eth_rx_adapter_vector_limits_get;
>   /**< Get event vector limits for the Rx adapter */
> + eventdev_eth_rx_adapter_instance_get_t
> eth_rx_adapter_instance_get;
> + /**< Get Rx adapter instance id for Rx queue */
> 
>   eventdev_timer_adapter_caps_get_t timer_adapter_caps_get;
>   /**< Get timer adapter capabilities */ diff --git
> a/lib/eventdev/rte_event_eth_rx_adapter.c
> b/lib/eventdev/rte_event_eth_rx_adapter.c
> index bf8741d..ababe13 100644
> --- a/lib/eventdev/rte_event_eth_rx_adapter.c
> +++ b/lib/eventdev/rte_event_eth_rx_adapter.c
> @@ -1415,15 +1415,13 @@ rxa_service_func(void *args)
>   return 0;
>  }
> 
> -static int
> -rte_event_eth_rx_adapter_init(void)
> +static void *
> +rxa_memzone_array_get(const char *name, unsigned int elt_size, int
> +nb_elems)
>  {
> - const char *name = RXA_ADAPTER_ARRAY;
>   const struct rte_memzone *mz;
>   unsigned int sz;
> 
> - sz = sizeof(*event_eth_rx_adapter) *
> - RTE_EVENT_ETH_RX_ADAPTER_MAX_INSTANCE;
> + sz = elt_size * nb_elems;
>   sz = RTE_ALIGN(sz, RTE_CACHE_LINE_SIZE);
> 
>   mz = rte_memzone_lookup(name);
> @@ -1431,13 +1429,34 @@ rte_event_eth_rx_adapter_init(void)
>   mz = rte_memzone_reserve_aligned(name, sz,
> rte_socket_id(), 0,
>RTE_CACHE_LINE_SIZE);
>   if (mz == NULL) {
> - RTE_EDEV_LOG_ERR("failed to reserve memzone err
> = %"
> - PRId32, rte_errno);
> - return -rte_errno;
> + RTE_EDEV_LOG_ERR("failed to reserve memzone"
> +  " name = %s, err = %"
> +  PRId32, name, rte_errno);
> + return NULL;
>   }
>   }
> 
> - event_eth_rx_adapter = mz->addr;
> + return mz->addr;
> +}
> +
> +static int
> +rte_event_eth_rx_adapter_init(void)
> +{
> + uint8_t i;
> +
> + if (event_eth_rx_adapter == NULL) {
> + event_eth_rx_adapter =
> + rxa_memzone_array_get(RXA_ADAPTER_ARRAY,
> + sizeof(*event_eth_rx_adapter),
> +
>   RTE_EVENT_ETH_RX_ADAPTER_MAX_INSTANCE);
> + if (event_eth_rx_a

Re: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics

2022-08-11 Thread Stephen Hemminger

On Thu, 11 Aug 2022 06:31:31 +
Chaoyong He  wrote:

> > -Original Message-
> > From: Stephen Hemminger 
> > Sent: Thursday, August 11, 2022 12:25 PM
> > To: Chaoyong He 
> > Cc: Andrew Rybchenko ; Niklas
> > Soderlund ; dev@dpdk.org
> > Subject: Re: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
> > 
> > On Thu, 11 Aug 2022 01:26:49 +
> > Chaoyong He  wrote:
> >   
> > > > > The 'port_id' is the 'Device [external] port identifier', which
> > > > > related with the 'rte_ethdev_devices[]' I think.
> > > > > Here the ethdev we created is not exposed to the user and is not
> > > > > in the  
> > > > 'rte_ethdev_devices[]'  
> > > > > array, so it can't be invoked by the user at all.
> > > > > And we invoke this ethdev through a pointer in the `struct
> > > > > nfp_net_hw`, so I think there should no conflict with other ones
> > > > > in the  
> > > > system.
> > > >
> > > > DPDK already has a port ownership framework to deal with internal
> > > > ethernet device ports. Why was this not used?  
> > >
> > > Sorry I have no knowledge about this framework before. Any document
> > > link or logic about this framework will be greatly appreciated. Thanks!  
> > 
> > It is part of ethdev https://doc.dpdk.org/api/rte__ethdev_8h.html
> > 
> > See rte_eth_dev_owner_new, rte_eth_dev_owner_set, etc
> > https://doc.dpdk.org/api/rte__ethdev_8h.html#ad6817cc801bf0faa566f52d3
> > 82214457  
> 
> Thank you very much!
> 
> If the app uses the rte_eth_dev_owner_* APIs to check the ownership first, it
> can indeed protect the internal ethdev ports.
> But right now, ovs-dpdk doesn't seem to use these APIs at all, and it can use
> 'port_id' to get any ethdev port in the rte_ethdev_devices[] array.
> So maybe it's a good idea to keep our original logic and keep an eye on this
> area; once ovs-dpdk uses the rte_eth_dev_owner_* APIs, we'll update the logic
> here accordingly.
> Thanks again!

Once a device is owned by something, it is no longer shown in the FOREACH and
other iterators, so ovs-dpdk should be ok. This mechanism is how the bonding,
failsafe, and netvsc drivers handle sub devices. Therefore OVS should be smart
enough to handle it.
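
The effect described here can be modelled in a few lines of standalone C. This
is a simplified model, not ethdev code: the sketch_* names are hypothetical,
and the real mechanism is the rte_eth_dev_owner_new()/rte_eth_dev_owner_set()
pair mentioned earlier in the thread together with the ethdev iterators. The
point it shows is that an iterator which skips owned ports hides an internal
port from any application that enumerates devices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NO_OWNER 0	/* plays the role of "port has no owner" */

struct sketch_port {
	int attached;		/* non-zero if the port exists */
	uint64_t owner_id;	/* SKETCH_NO_OWNER or the owner's id */
};

/*
 * Return the first attached, unowned port at or after port_id,
 * or SKETCH_MAX_PORTS if there is none.
 */
static uint16_t
sketch_find_next_unowned(const struct sketch_port *ports, uint16_t port_id)
{
	assert(ports != NULL);
	while (port_id < SKETCH_MAX_PORTS &&
	       (!ports[port_id].attached ||
		ports[port_id].owner_id != SKETCH_NO_OWNER))
		port_id++;
	return port_id;
}

/* Iterate over the ports an application is allowed to see. */
#define SKETCH_FOREACH_DEV(ports, p) \
	for ((p) = sketch_find_next_unowned((ports), 0); \
	     (p) < SKETCH_MAX_PORTS; \
	     (p) = sketch_find_next_unowned((ports), (uint16_t)((p) + 1)))

/* Count the ports visible through the iterator. */
static unsigned int
sketch_count_visible(const struct sketch_port *ports)
{
	unsigned int n = 0;
	uint16_t p;

	SKETCH_FOREACH_DEV(ports, p)
		n++;
	return n;
}
```

In this model, a driver that sets owner_id on its internal port removes that
port from every loop built on the iterator, without the application having to
call any ownership API itself.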

RE: [PATCH v2 2/4] event/sw: report periodic event timer capability

2022-08-11 Thread Van Haaren, Harry

> -Original Message-
> From: Carrillo, Erik G 
> Sent: Wednesday, August 10, 2022 9:52 PM
> To: Naga Harish K, S V ; jer...@marvell.com; Van
> Haaren, Harry 
> Cc: dev@dpdk.org
> Subject: RE: [PATCH v2 2/4] event/sw: report periodic event timer capability
> 
> Hi Harish,
> 
> > -Original Message-
> > From: Naga Harish K, S V 
> > Sent: Wednesday, August 10, 2022 2:10 AM
> > To: Carrillo, Erik G ; jer...@marvell.com; Van
> > Haaren, Harry 
> > Cc: dev@dpdk.org
> > Subject: [PATCH v2 2/4] event/sw: report periodic event timer capability
> >
> > update the software eventdev pmd timer_adapter_caps_get callback
> > function to report the support of periodic event timer capability
> >
> > Signed-off-by: Naga Harish K S V 
> > ---
> >  drivers/event/sw/sw_evdev.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
> > index f93313b31b..89c07d30ae 100644
> > --- a/drivers/event/sw/sw_evdev.c
> > +++ b/drivers/event/sw/sw_evdev.c
> > @@ -564,7 +564,7 @@ sw_timer_adapter_caps_get(const struct
> > rte_eventdev *dev, uint64_t flags,  {
> > RTE_SET_USED(dev);
> > RTE_SET_USED(flags);
> > -   *caps = 0;
> > +   *caps = RTE_EVENT_TIMER_ADAPTER_CAP_PERIODIC;

Thanks Harish for the explanation as to why caps are exposed in the Eventdev
PMD for related timer/etc features; that makes sense.

> It looks like we can add:
> 
> #define RTE_EVENT_TIMER_ADAPTER_SW_CAP \
>   RTE_EVENT_TIMER_ADAPTER_CAP_PERIODIC
> 
> to eventdev_pmd.h (the same as RTE_EVENT_CRYPTO_ADAPTER_SW_CAP, for
> example),
> 
> and use that definition here, and in rte_event_timer_adapter_caps_get().
> 
> Thanks,
> Erik

Erik, I like the suggestion of a standardized set of "generic SW" caps flags at
the Eventdev level; all SW based PMDs can then use that #define, which avoids
each new cap needing changes in multiple drivers.

Good work, -Harry
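
A standalone sketch of the suggested pattern follows. All SKETCH_* and
sketch_* names are hypothetical, the bit value is illustrative, and the
callback signature is reduced to the caps output only; the real definition
would be the RTE_EVENT_TIMER_ADAPTER_SW_CAP macro Erik proposes for
eventdev_pmd.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for RTE_EVENT_TIMER_ADAPTER_CAP_PERIODIC (value illustrative). */
#define SKETCH_TIMER_ADAPTER_CAP_PERIODIC (1u << 1)

/* Single shared definition of what the software timer adapter supports. */
#define SKETCH_TIMER_ADAPTER_SW_CAP SKETCH_TIMER_ADAPTER_CAP_PERIODIC

/* caps_get callback of one software PMD, reduced to the caps output. */
static int
sketch_sw_timer_adapter_caps_get(uint32_t *caps)
{
	assert(caps != NULL);
	*caps = SKETCH_TIMER_ADAPTER_SW_CAP;
	return 0;
}

/* A second software PMD reports the same set through the same macro. */
static int
sketch_other_sw_timer_adapter_caps_get(uint32_t *caps)
{
	assert(caps != NULL);
	*caps = SKETCH_TIMER_ADAPTER_SW_CAP;
	return 0;
}
```

Adding a capability to the software adapter then means extending the one
macro rather than touching each driver's callback.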

[PATCH v3 1/4] eventdev/timer: add periodic event timer support

2022-08-11 Thread Naga Harish K S V

This patch adds support to configure and use periodic event
timers in software timer adapter.

The structure ``rte_event_timer_adapter_stats`` is extended
by adding a new field, ``evtim_drop_count``. This stat
represents the number of times an event_timer expiry event
is dropped by the event timer adapter.
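
The role of the new counter can be sketched as follows. This is a model under
assumptions, not the adapter code: the sketch_* names and rearm_count are
hypothetical, and the periodic branch is assumed to drop the expiry event and
count it, which matches the description of ``evtim_drop_count`` above but is
not taken from the code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_timer_type { SKETCH_SINGLE, SKETCH_PERIODICAL };

struct sketch_stats {
	uint64_t evtim_drop_count;	/* name taken from the description */
	uint64_t rearm_count;		/* hypothetical, for illustration */
};

/*
 * Handle one timer expiry. buffer_full models a failed attempt to add
 * the expiry event to the adapter's event buffer.
 * Returns 1 if the event was buffered, 0 otherwise.
 */
static int
sketch_on_expiry(enum sketch_timer_type type, int buffer_full,
		 struct sketch_stats *stats)
{
	assert(stats != NULL);

	if (!buffer_full)
		return 1;
	if (type == SKETCH_SINGLE)
		/* Single-shot: re-arm with immediate expiry and retry. */
		stats->rearm_count++;
	else
		/* Periodic (assumed): drop this expiry, the timer fires again. */
		stats->evtim_drop_count++;
	return 0;
}
```

Under this model a non-zero evtim_drop_count tells the application that
periodic expiries were lost because the event buffer could not absorb them.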

Signed-off-by: Naga Harish K S V 
---
 lib/eventdev/rte_event_timer_adapter.c | 106 -
 lib/eventdev/rte_event_timer_adapter.h |   2 +
 lib/eventdev/rte_eventdev.c|   6 +-
 3 files changed, 77 insertions(+), 37 deletions(-)

diff --git a/lib/eventdev/rte_event_timer_adapter.c b/lib/eventdev/rte_event_timer_adapter.c
index e0d978d641..d2480060c5 100644
--- a/lib/eventdev/rte_event_timer_adapter.c
+++ b/lib/eventdev/rte_event_timer_adapter.c
@@ -53,6 +53,14 @@ static const struct event_timer_adapter_ops swtim_ops;
 #define EVTIM_SVC_LOG_DBG(...) (void)0
 #endif
 
+static inline enum rte_timer_type
+get_timer_type(const struct rte_event_timer_adapter *adapter)
+{
+   return (adapter->data->conf.flags &
+   RTE_EVENT_TIMER_ADAPTER_F_PERIODIC) ?
+   PERIODICAL : SINGLE;
+}
+
 static int
 default_port_conf_cb(uint16_t id, uint8_t event_dev_id, uint8_t *event_port_id,
 void *conf_arg)
@@ -195,13 +203,14 @@ rte_event_timer_adapter_create_ext(
adapter->data->conf = *conf;  /* copy conf structure */
 
/* Query eventdev PMD for timer adapter capabilities and ops */
-   ret = dev->dev_ops->timer_adapter_caps_get(dev,
-  adapter->data->conf.flags,
-  &adapter->data->caps,
-  &adapter->ops);
-   if (ret < 0) {
-   rte_errno = -ret;
-   goto free_memzone;
+   if (dev->dev_ops->timer_adapter_caps_get) {
+   ret = dev->dev_ops->timer_adapter_caps_get(dev,
+   adapter->data->conf.flags,
+   &adapter->data->caps, &adapter->ops);
+   if (ret < 0) {
+   rte_errno = -ret;
+   goto free_memzone;
+   }
}
 
if (!(adapter->data->caps &
@@ -348,13 +357,14 @@ rte_event_timer_adapter_lookup(uint16_t adapter_id)
dev = &rte_eventdevs[adapter->data->event_dev_id];
 
/* Query eventdev PMD for timer adapter capabilities and ops */
-   ret = dev->dev_ops->timer_adapter_caps_get(dev,
-  adapter->data->conf.flags,
-  &adapter->data->caps,
-  &adapter->ops);
-   if (ret < 0) {
-   rte_errno = EINVAL;
-   return NULL;
+   if (dev->dev_ops->timer_adapter_caps_get) {
+   ret = dev->dev_ops->timer_adapter_caps_get(dev,
+   adapter->data->conf.flags,
+   &adapter->data->caps, &adapter->ops);
+   if (ret < 0) {
+   rte_errno = EINVAL;
+   return NULL;
+   }
}
 
/* If eventdev PMD did not provide ops, use default software
@@ -612,35 +622,44 @@ swtim_callback(struct rte_timer *tim)
uint64_t opaque;
int ret;
int n_lcores;
+   enum rte_timer_type type;
 
opaque = evtim->impl_opaque[1];
adapter = (struct rte_event_timer_adapter *)(uintptr_t)opaque;
sw = swtim_pmd_priv(adapter);
+   type = get_timer_type(adapter);
+
+   if (unlikely(sw->in_use[lcore].v == 0)) {
+   sw->in_use[lcore].v = 1;
+   n_lcores = __atomic_fetch_add(&sw->n_poll_lcores, 1,
+__ATOMIC_RELAXED);
+   __atomic_store_n(&sw->poll_lcores[n_lcores], lcore,
+   __ATOMIC_RELAXED);
+   }
 
ret = event_buffer_add(&sw->buffer, &evtim->ev);
if (ret < 0) {
-   /* If event buffer is full, put timer back in list with
-* immediate expiry value, so that we process it again on the
-* next iteration.
-*/
-   ret = rte_timer_alt_reset(sw->timer_data_id, tim, 0, SINGLE,
- lcore, NULL, evtim);
-   if (ret < 0) {
-   EVTIM_LOG_DBG("event buffer full, failed to reset "
- "timer with immediate expiry value");
+   if (type == SINGLE) {
+   /* If event buffer is full, put timer back in list with
+* immediate expiry value, so that we process it again
+* on the next iteration.
+*/
+   ret = rte_timer_alt_reset(sw->timer_data_id, tim, 0,
+

[PATCH v3 4/4] test/event: update periodic event timer tests

2022-08-11 Thread Naga Harish K S V

This patch updates the software timer adapter tests to
configure and use periodic event timers.

Signed-off-by: Naga Harish K S V 
---
 app/test/test_event_timer_adapter.c | 41 ++---
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/app/test/test_event_timer_adapter.c b/app/test/test_event_timer_adapter.c
index d6170bb589..654c412836 100644
--- a/app/test/test_event_timer_adapter.c
+++ b/app/test/test_event_timer_adapter.c
@@ -386,11 +386,22 @@ timdev_setup_msec(void)
 static int
 timdev_setup_msec_periodic(void)
 {
+   uint32_t caps = 0;
+   uint64_t max_tmo_ns;
+
uint64_t flags = RTE_EVENT_TIMER_ADAPTER_F_ADJUST_RES |
 RTE_EVENT_TIMER_ADAPTER_F_PERIODIC;
 
+   TEST_ASSERT_SUCCESS(rte_event_timer_adapter_caps_get(evdev, &caps),
+   "failed to get adapter capabilities");
+
+   if (caps & RTE_EVENT_TIMER_ADAPTER_CAP_INTERNAL_PORT)
+   max_tmo_ns = 0;
+   else
+   max_tmo_ns = 180 * NSECPERSEC;
+
/* Periodic mode with 100 ms resolution */
-   return _timdev_setup(0, NSECPERSEC / 10, flags);
+   return _timdev_setup(max_tmo_ns, NSECPERSEC / 10, flags);
 }
 
 static int
@@ -409,7 +420,7 @@ timdev_setup_sec_periodic(void)
 RTE_EVENT_TIMER_ADAPTER_F_PERIODIC;
 
/* Periodic mode with 1 sec resolution */
-   return _timdev_setup(0, NSECPERSEC, flags);
+   return _timdev_setup(180 * NSECPERSEC, NSECPERSEC, flags);
 }
 
 static int
@@ -561,12 +572,23 @@ test_timer_arm(void)
 static inline int
 test_timer_arm_periodic(void)
 {
+   uint32_t caps = 0;
+   uint32_t timeout_count = 0;
+
TEST_ASSERT_SUCCESS(_arm_timers(1, MAX_TIMERS),
"Failed to arm timers");
/* With a resolution of 100ms and wait time of 1sec,
 * there will be 10 * MAX_TIMERS periodic timer triggers.
 */
-   TEST_ASSERT_SUCCESS(_wait_timer_triggers(1, 10 * MAX_TIMERS, 0),
+   TEST_ASSERT_SUCCESS(rte_event_timer_adapter_caps_get(evdev, &caps),
+   "failed to get adapter capabilities");
+
+   if (caps & RTE_EVENT_TIMER_ADAPTER_CAP_INTERNAL_PORT)
+   timeout_count = 10;
+   else
+   timeout_count = 9;
+
+   TEST_ASSERT_SUCCESS(_wait_timer_triggers(1, timeout_count * MAX_TIMERS, 0),
"Timer triggered count doesn't match arm count");
return TEST_SUCCESS;
 }
@@ -649,12 +671,23 @@ test_timer_arm_burst(void)
 static inline int
 test_timer_arm_burst_periodic(void)
 {
+   uint32_t caps = 0;
+   uint32_t timeout_count = 0;
+
TEST_ASSERT_SUCCESS(_arm_timers_burst(1, MAX_TIMERS),
"Failed to arm timers");
/* With a resolution of 100ms and wait time of 1sec,
 * there will be 10 * MAX_TIMERS periodic timer triggers.
 */
-   TEST_ASSERT_SUCCESS(_wait_timer_triggers(1, 10 * MAX_TIMERS, 0),
+   TEST_ASSERT_SUCCESS(rte_event_timer_adapter_caps_get(evdev, &caps),
+   "failed to get adapter capabilities");
+
+   if (caps & RTE_EVENT_TIMER_ADAPTER_CAP_INTERNAL_PORT)
+   timeout_count = 10;
+   else
+   timeout_count = 9;
+
+   TEST_ASSERT_SUCCESS(_wait_timer_triggers(1, timeout_count * MAX_TIMERS, 0),
"Timer triggered count doesn't match arm count");
 
return TEST_SUCCESS;
-- 
2.25.1

[PATCH v3 2/4] event/sw: report periodic event timer capability

2022-08-11 Thread Naga Harish K S V

Update the software eventdev PMD timer_adapter_caps_get
callback function to report support for the periodic
event timer capability.

Signed-off-by: Naga Harish K S V 
---
 drivers/event/sw/sw_evdev.c | 2 +-
 lib/eventdev/eventdev_pmd.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
index f93313b31b..6eddf8bd93 100644
--- a/drivers/event/sw/sw_evdev.c
+++ b/drivers/event/sw/sw_evdev.c
@@ -564,7 +564,7 @@ sw_timer_adapter_caps_get(const struct rte_eventdev *dev, uint64_t flags,
 {
RTE_SET_USED(dev);
RTE_SET_USED(flags);
-   *caps = 0;
+   *caps = RTE_EVENT_TIMER_ADAPTER_SW_CAP;
 
/* Use default SW ops */
*ops = NULL;
diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h
index 69402668d8..9d43b73570 100644
--- a/lib/eventdev/eventdev_pmd.h
+++ b/lib/eventdev/eventdev_pmd.h
@@ -77,6 +77,9 @@ extern "C" {
 #define RTE_EVENT_CRYPTO_ADAPTER_SW_CAP \
RTE_EVENT_CRYPTO_ADAPTER_CAP_SESSION_PRIVATE_DATA
 
+#define RTE_EVENT_TIMER_ADAPTER_SW_CAP \
+   RTE_EVENT_TIMER_ADAPTER_CAP_PERIODIC
+
 /**< Ethernet Rx adapter cap to return If the packet transfers from
  * the ethdev to eventdev use a SW service function
  */
-- 
2.25.1

[PATCH v3 3/4] timer: fix function to stop all timers

2022-08-11 Thread Naga Harish K S V

This API can deadlock because it tries to acquire the same
spinlock in a nested manner.

If the lcore that is stopping the timer is different from the lcore
that owns the timer, the timer list lock is acquired in timer_del(),
even if local_is_locked is true. Because the same lock was already
acquired in rte_timer_stop_all(), the thread will hang.

This patch removes the nested lock acquisition.

Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a list")
Cc: sta...@dpdk.org

Signed-off-by: Naga Harish K S V 
---
 lib/timer/rte_timer.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/lib/timer/rte_timer.c b/lib/timer/rte_timer.c
index 9994813d0d..85d67573eb 100644
--- a/lib/timer/rte_timer.c
+++ b/lib/timer/rte_timer.c
@@ -580,7 +580,7 @@ rte_timer_reset_sync(struct rte_timer *tim, uint64_t ticks,
 }
 
 static int
-__rte_timer_stop(struct rte_timer *tim, int local_is_locked,
+__rte_timer_stop(struct rte_timer *tim,
 struct rte_timer_data *timer_data)
 {
union rte_timer_status prev_status, status;
@@ -602,7 +602,7 @@ __rte_timer_stop(struct rte_timer *tim, int local_is_locked,
 
/* remove it from list */
if (prev_status.state == RTE_TIMER_PENDING) {
-   timer_del(tim, prev_status, local_is_locked, priv_timer);
+   timer_del(tim, prev_status, 0, priv_timer);
__TIMER_STAT_ADD(priv_timer, pending, -1);
}
 
@@ -631,7 +631,7 @@ rte_timer_alt_stop(uint32_t timer_data_id, struct rte_timer *tim)
 
TIMER_DATA_VALID_GET_OR_ERR_RET(timer_data_id, timer_data, -EINVAL);
 
-   return __rte_timer_stop(tim, 0, timer_data);
+   return __rte_timer_stop(tim, timer_data);
 }
 
 /* loop until rte_timer_stop() succeed */
@@ -987,21 +987,16 @@ rte_timer_stop_all(uint32_t timer_data_id, unsigned int *walk_lcores,
walk_lcore = walk_lcores[i];
priv_timer = &timer_data->priv_timer[walk_lcore];
 
-   rte_spinlock_lock(&priv_timer->list_lock);
-
for (tim = priv_timer->pending_head.sl_next[0];
 tim != NULL;
 tim = next_tim) {
next_tim = tim->sl_next[0];
 
-   /* Call timer_stop with lock held */
-   __rte_timer_stop(tim, 1, timer_data);
+   __rte_timer_stop(tim, timer_data);
 
if (f)
f(tim, f_arg);
}
-
-   rte_spinlock_unlock(&priv_timer->list_lock);
}
 
return 0;
-- 
2.25.1
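
The nested acquisition described in the commit message can be reproduced in
miniature. The following is a minimal sketch, not DPDK code: toy_lock,
inner_del() and stop_all() are hypothetical stand-ins for rte_spinlock_t,
timer_del() and rte_timer_stop_all(), and a trylock is used so that the
would-be deadlock shows up as a false return value instead of a hang.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy non-recursive spinlock (hypothetical; not rte_spinlock_t). */
struct toy_lock {
	atomic_flag held;
};

/* Returns true if the lock was free and is now held by the caller. */
static bool toy_trylock(struct toy_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
						  memory_order_acquire);
}

static void toy_unlock(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Models the inner removal path, which takes the list lock itself.
 * Returns false where a blocking lock would spin forever.
 */
static bool inner_del(struct toy_lock *list_lock)
{
	if (!toy_trylock(list_lock))
		return false;
	/* ... the timer would be unlinked from the list here ... */
	toy_unlock(list_lock);
	return true;
}

/*
 * Models the outer walk. With outer_takes_lock set, the list lock is
 * held around the call into inner_del(), as it was before the fix.
 */
static bool stop_all(struct toy_lock *list_lock, bool outer_takes_lock)
{
	bool ok;

	if (outer_takes_lock && !toy_trylock(list_lock))
		return false;
	ok = inner_del(list_lock);
	if (outer_takes_lock)
		toy_unlock(list_lock);
	return ok;
}
```

Calling stop_all() with outer_takes_lock set models the old behaviour and
reports the conflict; with it cleared, only the inner path locks, which is
the shape the patch moves to.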

RE: [PATCH v2 3/4] timer: fix function to stop all timers

2022-08-11 Thread Naga Harish K, S V

Hi Gabe,

> -Original Message-
> From: Carrillo, Erik G 
> Sent: Thursday, August 11, 2022 1:00 AM
> To: Naga Harish K, S V 
> Cc: dev@dpdk.org; sta...@dpdk.org
> Subject: RE: [PATCH v2 3/4] timer: fix function to stop all timers
> 
> Hi Harish,
> 
> > -Original Message-
> > From: Naga Harish K, S V 
> > Sent: Wednesday, August 10, 2022 2:10 AM
> > To: Carrillo, Erik G 
> > Cc: dev@dpdk.org; sta...@dpdk.org
> > Subject: [PATCH v2 3/4] timer: fix function to stop all timers
> >
> > There is a possibility of deadlock in this API, as same spinlock is
> > tried to be acquired in nested manner.
> >
> > In timer_del function, if the previous owner and current owner lcore
> > are
> 
> It might be clearer to say something like:
> 
>  "If the lcore that is stopping the timer is different from the lcore that
> owns the timer, the timer list lock is acquired in timer_del(), even if
> local_is_locked is true.  Because the same lock was already acquired in
> rte_timer_stop_all(), the thread will hang."
> 

Incorporated this commit message in v3 of the patch.

> Thanks,
> Erik
> 
> > different, the lock is tried to be acquired even though the same lock
> > is already acquired by the caller of timer_del function.
> >
> > This patch removes the acquisition of nested locking.
> >
> > Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a
> > list")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Naga Harish K S V 
> > ---

RE: [PATCH v2 1/4] eventdev/timer: add periodic event timer support

2022-08-11 Thread Naga Harish K, S V

Hi Gabe,

> -Original Message-
> From: Carrillo, Erik G 
> Sent: Thursday, August 11, 2022 1:25 AM
> To: Naga Harish K, S V ; jer...@marvell.com
> Cc: pbhagavat...@marvell.com; sthot...@marvell.com; dev@dpdk.org
> Subject: RE: [PATCH v2 1/4] eventdev/timer: add periodic event timer
> support
> 
> Hi Harish,
> 
> > -Original Message-
> > From: Naga Harish K, S V 
> > Sent: Wednesday, August 10, 2022 2:07 AM
> > To: Carrillo, Erik G ; jer...@marvell.com
> > Cc: pbhagavat...@marvell.com; sthot...@marvell.com; dev@dpdk.org
> > Subject: [PATCH v2 1/4] eventdev/timer: add periodic event timer
> > support
> >
> > This patch adds support to configure and use periodic event timers in
> > software timer adapter.
> >
> > The structure ``rte_event_timer_adapter_stats`` is extended by adding
> > a new field, ``evtim_drop_count``. This stat represents the number of
> > times an event_timer expiry event is dropped by the event timer adapter.
> >
> > Signed-off-by: Naga Harish K S V 
> > ---
> >  lib/eventdev/rte_event_timer_adapter.c | 86 ++----
> >  lib/eventdev/rte_event_timer_adapter.h |  2 +
> >  lib/eventdev/rte_eventdev.c|  6 +-
> >  3 files changed, 67 insertions(+), 27 deletions(-)
> >
> > diff --git a/lib/eventdev/rte_event_timer_adapter.c
> > b/lib/eventdev/rte_event_timer_adapter.c
> > index e0d978d641..0de88dfc0f 100644
> > --- a/lib/eventdev/rte_event_timer_adapter.c
> > +++ b/lib/eventdev/rte_event_timer_adapter.c
> > @@ -53,6 +53,14 @@ static const struct event_timer_adapter_ops swtim_ops;
> >  #define EVTIM_SVC_LOG_DBG(...) (void)0
> >  #endif
> >
> > +static inline enum rte_timer_type
> > +get_event_timer_type(const struct rte_event_timer_adapter *adapter)
> > +{
> 
> Let's call this function "get_timer_type" since it is selecting a type for an
> rte_timer.
> 
Taken in v3 version of the patch

> > +   return (adapter->data->conf.flags &
> > +   RTE_EVENT_TIMER_ADAPTER_F_PERIODIC) ?
> > +   PERIODICAL : SINGLE;
> > +}
> > +
> >  static int
> >  default_port_conf_cb(uint16_t id, uint8_t event_dev_id, uint8_t
> > *event_port_id,
> >  void *conf_arg)
> > @@ -195,10 +203,11 @@ rte_event_timer_adapter_create_ext(
> > adapter->data->conf = *conf;  /* copy conf structure */
> >
> > /* Query eventdev PMD for timer adapter capabilities and ops */
> > -   ret = dev->dev_ops->timer_adapter_caps_get(dev,
> > +   ret = dev->dev_ops->timer_adapter_caps_get ?
> > +   dev->dev_ops->timer_adapter_caps_get(dev,
> >adapter->data->conf.flags,
> >&adapter->data->caps,
> > -  &adapter->ops);
> > +  &adapter->ops) : 0;
> > if (ret < 0) {
> > rte_errno = -ret;
> > goto free_memzone;
> 
> IMO, this hunk would read better as:
> 
> if (dev->dev_ops->timer_adapter_caps_get) {
> ret = dev->dev_ops->timer_adapter_caps_get(dev,
> adapter->data->conf.flags, 
> &adapter->data->caps,
> &adapter->ops);
> if (ret < 0) {
> rte_errno = -ret;
> goto free_memzone;
> }
> }
> 

Taken in v3 version of the patch

> > @@ -348,10 +357,11 @@ rte_event_timer_adapter_lookup(uint16_t
> > adapter_id)
> > dev = &rte_eventdevs[adapter->data->event_dev_id];
> >
> > /* Query eventdev PMD for timer adapter capabilities and ops */
> > -   ret = dev->dev_ops->timer_adapter_caps_get(dev,
> > +   ret = dev->dev_ops->timer_adapter_caps_get ?
> > +   dev->dev_ops->timer_adapter_caps_get(dev,
> >adapter->data->conf.flags,
> >&adapter->data->caps,
> > -  &adapter->ops);
> > +  &adapter->ops) : 0;
> > if (ret < 0) {
> > rte_errno = EINVAL;
> > return NULL;
> 
> Same comment as above for this hunk...
> 
> Thanks,
> Erik
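
The guarded call that the review asks for can be tried out in isolation. This
is a minimal sketch, assuming cut-down hypothetical types (toy_dev,
toy_dev_ops) and a hypothetical helper query_caps() in place of the real
eventdev structures; it only shows the control flow, where a missing callback
is treated as "no capabilities, no error".

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the eventdev structures. */
struct toy_dev;
typedef int (*toy_caps_get_t)(const struct toy_dev *dev, uint32_t *caps);

struct toy_dev_ops {
	toy_caps_get_t timer_adapter_caps_get; /* optional, may be NULL */
};

struct toy_dev {
	const struct toy_dev_ops *dev_ops;
};

/*
 * Returns 0 on success or a negative errno value from the driver.
 * *caps stays 0 when the driver does not implement the callback.
 */
static int query_caps(const struct toy_dev *dev, uint32_t *caps)
{
	int ret;

	*caps = 0;
	if (dev->dev_ops->timer_adapter_caps_get) {
		ret = dev->dev_ops->timer_adapter_caps_get(dev, caps);
		if (ret < 0)
			return ret;
	}
	return 0;
}

/* Two example driver callbacks: one succeeds, one fails. */
static int caps_ok(const struct toy_dev *dev, uint32_t *caps)
{
	(void)dev;
	*caps = 0x1;
	return 0;
}

static int caps_fail(const struct toy_dev *dev, uint32_t *caps)
{
	(void)dev;
	(void)caps;
	return -EINVAL;
}
```

Compared with the ternary form in v2, the error check sits inside the branch
that can actually produce an error, so the NULL-callback case never touches
ret.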

RE: [PATCH v2 2/4] event/sw: report periodic event timer capability

2022-08-11 Thread Naga Harish K, S V

Hi Gabe,

> -Original Message-
> From: Carrillo, Erik G 
> Sent: Thursday, August 11, 2022 2:22 AM
> To: Naga Harish K, S V ; jer...@marvell.com;
> Van Haaren, Harry 
> Cc: dev@dpdk.org
> Subject: RE: [PATCH v2 2/4] event/sw: report periodic event timer capability
> 
> Hi Harish,
> 
> > -Original Message-
> > From: Naga Harish K, S V 
> > Sent: Wednesday, August 10, 2022 2:10 AM
> > To: Carrillo, Erik G ; jer...@marvell.com;
> > Van Haaren, Harry 
> > Cc: dev@dpdk.org
> > Subject: [PATCH v2 2/4] event/sw: report periodic event timer
> > capability
> >
> > update the software eventdev pmd timer_adapter_caps_get callback
> > function to report the support of periodic event timer capability
> >
> > Signed-off-by: Naga Harish K S V 
> > ---
> >  drivers/event/sw/sw_evdev.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
> > index f93313b31b..89c07d30ae 100644
> > --- a/drivers/event/sw/sw_evdev.c
> > +++ b/drivers/event/sw/sw_evdev.c
> > @@ -564,7 +564,7 @@ sw_timer_adapter_caps_get(const struct rte_eventdev *dev, uint64_t flags,
> >  {
> > RTE_SET_USED(dev);
> > RTE_SET_USED(flags);
> > -   *caps = 0;
> > +   *caps = RTE_EVENT_TIMER_ADAPTER_CAP_PERIODIC;
> 
> It looks like we can add:
> 
> #define RTE_EVENT_TIMER_ADAPTER_SW_CAP \
>   RTE_EVENT_TIMER_ADAPTER_CAP_PERIODIC
> 
> to eventdev_pmd.h (the same as RTE_EVENT_CRYPTO_ADAPTER_SW_CAP,
> for example),
> 
> and use that definition here, and in rte_event_timer_adapter_caps_get().
> 

Taken in v3 version of the patch

> Thanks,
> Erik
> 
> >
> > /* Use default SW ops */
> > *ops = NULL;
> > --
> > 2.25.1

RE: [RFC v2] non-temporal memcpy

2022-08-11 Thread Honnappa Nagarahalli



> >>
> >> +TO: @Honnappa, we need input from ARM
> >>
> >>> From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com]
> >>> Sent: Friday, 29 July 2022 21.49
> 
> > From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com]
> > Sent: Friday, 29 July 2022 14.14
> >
> >
> > Sorry, missed that part.
> >
> >>
> >>> Another question - who will do 'sfence' after the copying?
> >>> Would it be inside memcpy_nt (seems quite costly), or would it
> >>> be another API function for that: memcpy_nt_flush() or so?
> >>
> >> Outside. Only the developer knows when it is required, so it
> >>> wouldn't
> > make any sense to add the cost inside memcpy_nt().
> >>
> >> I don't think we should add a flush function; it would just be
> > another name for an already existing function. Referring to the
> > required
> >> operation in the memcpy_nt() function documentation should
> >>> suffice.
> >>
> >
> > Ok, but again wouldn't it be arch specific?
> > AFAIK for x86 it needs to boil down to sfence, for other
> >>> architectures
> > - I don't know.
> > If you think there already is some generic one (rte_wmb?) that
> >>> would
> > always produce
> > correct instructions - sure let's use it.
> >
> 
>  DPDK has generic functions to wrap architecture specific stuff like
> >>> memory barriers.
> 
>  Because they are non-temporal stores, I suspect that rte_mb() is
> >>> required before reading the data from the location it was copied to.
>  Ensuring that STORE operations are ordered (rte_wmb) might not
> >>> suffice. However, I'm not a CPU expert, so I will seek advice from
>  more qualified people in the community on this.
> >>>
> >>> I think for IA sfence is enough, see citation below, for other
> >>> architectures - no idea.
> >>> What I am trying to say - it needs to be the *same* function on all
> >>> archs we support.
> >>
> >> Now I get it: rte_wmb() might be appropriate on x86, but if any other
> >> architecture requires something else, we should add a new common
> >> function for flushing, e.g. rte_memcpy_nt_flush().
> >>
> >>>
> >>> IA SW optimization manual:
> >>> 9.4.2 Streaming Store Usage Models
> >>> The two primary usage domains for streaming store are coherent
> >>> requests and non-coherent requests.
> >>> 9.4.2.1 Coherent Requests
> >>> Coherent requests are normal loads and stores to system memory,
> >>> which may also hit cache lines present in another processor in a
> >>> multiprocessor environment. With coherent requests, a streaming
> >>> store can be used in the same way as a regular store that has been
> >>> mapped with a WC memory type (PAT or MTRR). An SFENCE instruction
> >>> must be used within a producer-consumer usage model in order to
> >>> ensure coherency and visibility of data between processors.
> >>> Within a single-processor system, the CPU can also re-read the same
> >>> memory location and be assured of coherence (that is, a single,
> >>> consistent view of this memory location).
> >>> The same is true for a multiprocessor
> >>> (MP) system, assuming an accepted MP software producer-consumer
> >>> synchronization policy is employed.
> >>>
> >>
> >> With this reference, I am convinced that you are right about the
> >> SFENCE. This puts a checkmark on this item on my TODO list for the
> >> patch. Thank you, Konstantin!
> >>
> >> Any ARM CPU experts on the mailing list seeing this, not on vacation?
> >> @Honnappa, I'm looking at you. :-)
> >>
> >> Summing up, the question is:
> >>
> >> After a bunch of *non-temporal* stores (STNP instruction) on ARM
> >> architecture, does calling rte_wmb() suffice to ensure the data is
> >> visible across the system?
> > Apologies for the late response, the docs did not have enough information.
> > The internal dialogue is still going on, but I have some information now.
> > There is some information in ArmV8 programmer's guide [1], though it is not
> > complete.
> > In summary, rte_wmb()/rte_mb() would not suffice, we need new APIs.
> >
> >  From my perspective, I see several scenarios:
> > 1)  Need for ordering before the memcpy_nt. Here there are several cases:
> > a.  LD – LDNP/STNP – DMB NSHLD
> > b.  ST – LDNP/STNP – DMB NSH
> > 2)  Need for ordering after the memcpy. Again, we have the similar use cases:
> > a.  LDNP/STNP – LD – DMB NSH
> > b.  LDNP/STNP – ST – DMB NSH
> >
> > The 'ST - STNP' and 'STNP - ST' do not apply here, but good to add an API
> > for completion.
> >
> > So, maybe we could have rte_[r|w]mb_nt() APIs.
> >
> 
> Is rte_smp_rmb()/rte_smp_wmb() also not enough on ARM?
No, they are not, as they fall under the inner shareable domain, whereas
non-temporal loads/stores fall under the non-shareable domain.

> 
> > [1]
> > https://developer.arm.com/documentation/den0024/a/The-A64-instruction-set/Memory-access-instructions/Non-temporal-load-and-store-pair
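
To make the proposal concrete, here is a rough sketch of what a dedicated
post-copy barrier could look like. Everything in it is an assumption for
illustration: toy_wmb_nt() is a hypothetical name (the thread only floats
rte_[r|w]mb_nt()), the x86 branch follows the SFENCE citation earlier in the
thread, the Arm branch follows the "DMB NSH" entries listed above, and plain
memcpy() stands in for the non-temporal copy.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>
#if defined(__SSE__)
#include <xmmintrin.h> /* _mm_sfence() */
#endif

/* Hypothetical barrier to issue after a burst of non-temporal stores. */
static inline void toy_wmb_nt(void)
{
#if defined(__SSE__)
	_mm_sfence();                                  /* x86: SFENCE */
#elif defined(__aarch64__)
	__asm__ __volatile__("dmb nsh" ::: "memory"); /* per the list above */
#else
	atomic_thread_fence(memory_order_seq_cst);    /* conservative fallback */
#endif
}

/* Hypothetical message layout: payload first, then a ready flag. */
struct toy_msg {
	char payload[64];
	volatile int ready;
};

/* Copy the payload, order it, then publish the flag. */
static void toy_publish(struct toy_msg *m, const void *src, size_t len)
{
	if (len > sizeof(m->payload))
		len = sizeof(m->payload);
	memcpy(m->payload, src, len); /* stand-in for a non-temporal copy */
	toy_wmb_nt();
	m->ready = 1;
}
```

The point of the sketch is only the placement: the barrier is issued by the
caller after the copy, not inside the copy routine, and it is one function
name across architectures.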

RE: [PATCH v3 1/4] eventdev/timer: add periodic event timer support

2022-08-11 Thread Carrillo, Erik G

Hi Harish,

> -Original Message-
> From: Naga Harish K, S V 
> Sent: Thursday, August 11, 2022 10:37 AM
> To: Carrillo, Erik G ; jer...@marvell.com
> Cc: pbhagavat...@marvell.com; sthot...@marvell.com; dev@dpdk.org
> Subject: [PATCH v3 1/4] eventdev/timer: add periodic event timer support
> 
> This patch adds support to configure and use periodic event timers in
> software timer adapter.
> 
> The structure ``rte_event_timer_adapter_stats`` is extended by adding a
> new field, ``evtim_drop_count``. This stat represents the number of times an
> event_timer expiry event is dropped by the event timer adapter.
> 
> Signed-off-by: Naga Harish K S V 
> ---

<... snipped ...>
 
> diff --git a/lib/eventdev/rte_eventdev.c b/lib/eventdev/rte_eventdev.c
> index 1dc4f966be..4a2a1178da 100644
> --- a/lib/eventdev/rte_eventdev.c
> +++ b/lib/eventdev/rte_eventdev.c
> @@ -139,7 +139,11 @@ rte_event_timer_adapter_caps_get(uint8_t dev_id, uint32_t *caps)
> 
>   if (caps == NULL)
>   return -EINVAL;
> - *caps = 0;
> +
> + if (dev->dev_ops->timer_adapter_caps_get == NULL)
> + *caps = RTE_EVENT_TIMER_ADAPTER_CAP_PERIODIC;

I think we should move the definition of RTE_EVENT_TIMER_ADAPTER_SW_CAP to this 
patch, and use that macro here as well.  With that change, this looks good to 
me.

Thanks,
Erik

> + else
> + *caps = 0;
> 
>   return dev->dev_ops->timer_adapter_caps_get ?
>   (*dev->dev_ops->timer_adapter_caps_get)(dev,
> --
> 2.25.1

[dpdk-kmods v2] windows/netuio: fix BAR parsing

2022-08-11 Thread Pallavi Kadam

The current code always checked 'prev_bar & PCI_TYPE_64BIT',
although only the first BAR slot of a 64-bit BAR contains flags.
Also, for certain PCIe devices, BAR values were not contiguous.
This patch fixes the check and maps the BAR addresses
correctly.

Reported-by: Qiao Liu 
Suggested-by: Dmitry Kozlyuk 
Signed-off-by: Dmitry Kozlyuk 
Tested-by: Pallavi Kadam 
---
 windows/netuio/netuio_dev.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/windows/netuio/netuio_dev.c b/windows/netuio/netuio_dev.c
index b2deb10..073fac8 100644
--- a/windows/netuio/netuio_dev.c
+++ b/windows/netuio/netuio_dev.c
@@ -170,8 +170,6 @@ netuio_map_hw_resources(WDFDEVICE Device, WDFCMRESLIST Resources, WDFCMRESLIST R
 
 PCM_PARTIAL_RESOURCE_DESCRIPTOR descriptor;
 ULONG next_descriptor = 0;
-ULONG curr_bar = 0;
-ULONG prev_bar = 0;
 
/*
 * ResourcesTranslated report MMIO BARs in the correct order, but their
@@ -195,9 +193,9 @@ netuio_map_hw_resources(WDFDEVICE Device, WDFCMRESLIST Resources, WDFCMRESLIST R
 * searching for the next MMIO resource each time.
 */
 for (INT bar_index = 0; bar_index < PCI_MAX_BAR; bar_index++) {
-prev_bar = curr_bar;
-curr_bar = pci_config.u.type0.BaseAddresses[bar_index];
-if (curr_bar == 0 || (prev_bar & PCI_TYPE_64BIT)) {
+ULONG bar_value = pci_config.u.type0.BaseAddresses[bar_index];
+
+if (bar_value == 0) {
 continue;
 }
 
@@ -236,6 +234,11 @@ netuio_map_hw_resources(WDFDEVICE Device, WDFCMRESLIST Resources, WDFCMRESLIST R
 }
 
 ctx->dpdk_hw[bar_index].mem.size = ctx->bar[bar_index].size;
+
+// Skip the next BAR slot used by the current 64-bit address.
+if (bar_value & PCI_TYPE_64BIT) {
+bar_index++;
+}
 } // for bar_index
 
 status = STATUS_SUCCESS;
-- 
2.31.1.windows.1
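
The slot-skipping logic of this patch can be exercised without a device. This
is a minimal sketch with hypothetical names (TOY_MAX_BAR, TOY_TYPE_64BIT,
toy_collect_bars); TOY_TYPE_64BIT is assumed to mirror PCI_TYPE_64BIT, and the
resource-list matching done by the real driver is left out.

```c
#include <stdint.h>

#define TOY_MAX_BAR    6
#define TOY_TYPE_64BIT 0x4u /* assumed to mirror PCI_TYPE_64BIT */

/*
 * Walks the raw BAR slots and records the slot index at which each BAR
 * starts. Returns the number of BARs found.
 */
static int toy_collect_bars(const uint32_t slots[TOY_MAX_BAR],
			    int starts[TOY_MAX_BAR])
{
	int n = 0;

	for (int i = 0; i < TOY_MAX_BAR; i++) {
		uint32_t bar_value = slots[i];

		if (bar_value == 0)
			continue;

		starts[n++] = i;

		/* The next slot holds the upper 32 address bits, not flags. */
		if (bar_value & TOY_TYPE_64BIT)
			i++;
	}
	return n;
}
```

With slots { 0xF000000C, 0x00000004, 0xE0000000, 0, 0xD0000004, 0 } the walk
reports BARs starting at slots 0, 2 and 4. The upper half in slot 1 happens
to have the 64-bit bit set, but it is never inspected as a flags word, so the
32-bit BAR in slot 2 is not skipped.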

Re: [dpdk-kmods] windows/netuio: fix bar parsing

2022-08-11 Thread Kadam, Pallavi



On 8/9/2022 2:15 AM, Dmitry Kozlyuk wrote:

2022-08-08 17:33 (UTC-0700), Kadam, Pallavi:
[...]

Hi Pallavi,

In the first place, it was wrong to always test `prev_bar & PCI_TYPE_64BIT`
because only the first BAR slot of a 64-bit BAR contains flags.
The current code has a state to track (curr_bar, prev_bar),
and the fix is complicating it even more without solving the root cause.
I suggest a simpler fix (not tested!)
that eliminates both the incorrectness and the state to maintain:

Thank you. This change works for us.

Please let me know if you would like to submit this change as a new patch or if 
I should include it as a v2 of this same patch.

Please send v2.
You can include my Signed-off-by.
You might also update the commit message and capitalize "BAR" in the title.


Thanks, Dmitry. Have sent v2.

RE: [RFC v2] non-temporal memcpy

2022-08-11 Thread Honnappa Nagarahalli



> >
> >>
> >>> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> >>> Sent: Wednesday, 10 August 2022 13.56
> >>>
> >>> On 2022-08-09 17:26, Stephen Hemminger wrote:
> >>
> >> [...]
> >>
> >>>
> >>> Alignment seems like a non-issue to me. A NT-store memcpy() can be
> >>> made free of alignment requirements, incurring only a very slight
> >>> cost for the always-aligned case (who has their data always 16-byte
> >>> aligned anyways?).
> >>>
> >>> The memory barrier required on x86 seems like a bigger issue.
> >>>
>  Maybe rte_non_cache_copy()?
> 
> >>>
> >>> rte_memcpy_nt_weakly_ordered(), or rte_memcpy_nt_weak(). And a
> >>> rte_memcpy_nt() with the sfence in place, which the user hopefully
> >>> will find first? I don't know. I would prefer not having the weak
> >>> variant at all.
> > I think providing weakly ordered version is required to offset the cost of
> > the barriers. One might be able to copy multiple packets and then issue a
> > barrier.
> >
> 
> On what architecture?
I am talking about Arm architecture. Arm architecture needs barriers between 
normal and NT operations.

> 
> I assumed that only x86 had the peculiar property of having different memory
> models for regular and NT load/stores.
> 
> >>>
> >>> Accepting weak memory ordering (i.e., no sfence) could also be one
> >>> of the flags, assuming rte_memcpy_nt() would have a flags parameter.
> >>> Default is safe (=memcpy() semantics), but potentially slower.
> >>
> >> Excellent idea!
> >>
> >>>
>  Want to avoid the naive user just doing s/memcpy/rte_memcpy_nt/ and
> >>> expect
>  everything to work.
> >

[PATCH] net/iavf: fix VLAN insertion

2022-08-11 Thread Yiding Zhou

When the driver tells the VF to insert the VLAN tag using the L2TAG2 field,
the vector Tx path does not use a Tx context descriptor, which causes the
VLAN tag to be inserted into the wrong location.

This commit fixes the issue by using the normal Tx path to handle the L2TAG2
case.

Fixes: 3aa957338503 ("net/iavf: fix VLAN insert")
Cc: sta...@dpdk.org

Signed-off-by: Yiding Zhou 
---
 drivers/net/iavf/iavf_rxtx_vec_common.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/iavf/iavf_rxtx_vec_common.h b/drivers/net/iavf/iavf_rxtx_vec_common.h
index a59cb2ceee..4ab22c6b2b 100644
--- a/drivers/net/iavf/iavf_rxtx_vec_common.h
+++ b/drivers/net/iavf/iavf_rxtx_vec_common.h
@@ -253,6 +253,9 @@ iavf_tx_vec_queue_default(struct iavf_tx_queue *txq)
if (txq->offloads & IAVF_TX_NO_VECTOR_FLAGS)
return -1;
 
+   if (txq->vlan_flag == IAVF_TX_FLAGS_VLAN_TAG_LOC_L2TAG2)
+   return -1;
+
if (txq->offloads & IAVF_TX_VECTOR_OFFLOAD)
return IAVF_VECTOR_OFFLOAD_PATH;
 
-- 
2.34.1

RE: [PATCH v2] net/iavf: fix Tx L3 checksum offload flag

2022-08-11 Thread Zhang, Qi Z




> -Original Message-
> From: Ke Zhang 
> Sent: Wednesday, August 10, 2022 5:57 PM
> To: Wu, Jingjing ; Xing, Beilei
> ; dev@dpdk.org
> Cc: Zhang, Ke1X ; sta...@dpdk.org
> Subject: [PATCH v2] net/iavf: fix Tx L3 checksum offload flag
> 
> When ol_flag is only RTE_MBUF_F_TX_IPV4, the Tx L3 checksum offload is still
> configured to IIPT in the command field of Tx data descriptor.
> 
> This patch is to fix the issue to make the Tx L3 checksum offload flags and Tx
> data descriptor consistent.
> 
> Fixes: 1e728b01120c ("net/iavf: rework Tx path")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Ke Zhang 

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi

RE: [PATCH] net/ice: remove deprecated VF flow action

2022-08-11 Thread Zhang, Qi Z




> -Original Message-
> From: Zeng, ZhichaoX 
> Sent: Wednesday, August 10, 2022 2:50 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming ; Zhou, YidingX
> ; Zeng, ZhichaoX ;
> Zhang, Qi Z 
> Subject: [PATCH] net/ice: remove deprecated VF flow action
> 
> From: Zhichao Zeng 
> 
> According to the ABI and API Deprecation, remove deprecated VF action as
> hard-to-use / ambiguous.
> 
> Action REPRESENTED_PORT should be used instead.
> 
> Signed-off-by: Zhichao Zeng 

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi

[PATCH] net/iavf: fix VLAN insertion

2022-08-11 Thread Yiding Zhou

When the PF driver tells the VF to insert the VLAN tag using the L2TAG2 field,
the vector Tx path does not use a Tx context descriptor, which causes the
VLAN tag to be inserted into the wrong location.

This commit fixes the issue by using the normal Tx path to handle the L2TAG2
case.

Fixes: 3aa957338503 ("net/iavf: fix VLAN insert")
Cc: sta...@dpdk.org

Signed-off-by: Yiding Zhou 
---
 drivers/net/iavf/iavf_rxtx_vec_common.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/iavf/iavf_rxtx_vec_common.h b/drivers/net/iavf/iavf_rxtx_vec_common.h
index a59cb2ceee..4ab22c6b2b 100644
--- a/drivers/net/iavf/iavf_rxtx_vec_common.h
+++ b/drivers/net/iavf/iavf_rxtx_vec_common.h
@@ -253,6 +253,9 @@ iavf_tx_vec_queue_default(struct iavf_tx_queue *txq)
if (txq->offloads & IAVF_TX_NO_VECTOR_FLAGS)
return -1;
 
+   if (txq->vlan_flag == IAVF_TX_FLAGS_VLAN_TAG_LOC_L2TAG2)
+   return -1;
+
if (txq->offloads & IAVF_TX_VECTOR_OFFLOAD)
return IAVF_VECTOR_OFFLOAD_PATH;
 
-- 
2.34.1

RE: [PATCH] net/iavf: fix VLAN insertion

2022-08-11 Thread Zhang, Qi Z




> -Original Message-
> From: Yiding Zhou 
> Sent: Friday, August 12, 2022 10:53 AM
> To: dev@dpdk.org
> Cc: Wu, Jingjing ; Xing, Beilei
> ; sta...@dpdk.org; Zhou, YidingX
> 
> Subject: [PATCH] net/iavf: fix VLAN insertion
> 
> When the PF driver tells the VF to insert VLAN tag using the L2TAG2 field,
> vector Tx path does not use Tx context descriptor and would cause VLAN tag
> inserted into the wrong location.
> 
> This commit is to fix the issue by using normal Tx path to handle L2TAG2 case.
> 
> Fixes: 3aa957338503 ("net/iavf: fix VLAN insert")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Yiding Zhou 

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi

RE: [PATCH] net/igc: add support for Ethernet Controller I225-IT

2022-08-11 Thread Zhang, Qi Z




> -Original Message-
> From: Guo, Junfeng 
> Sent: Wednesday, August 10, 2022 2:07 PM
> To: Zhang, Qi Z ; Yang, Qiming
> ; Su, Simei 
> Cc: dev@dpdk.org; Guo, Junfeng 
> Subject: [PATCH] net/igc: add support for Ethernet Controller I225-IT
> 
> Add device id for Ethernet Controller (2) I225-IT.
> 
> Signed-off-by: Junfeng Guo 

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi

[PATCH v1] net/ice/base: fix switch rules not cleared on warm reset

2022-08-11 Thread Steve Yang

When users kill the app forcibly (e.g.: kill -9 pid), the driver reset
cannot restore all NIC registers to their initial state.
For example, switch filter rules that involve a VLAN tag
can no longer be added.

Tell the firmware to shut down the AdminQ to avoid possible errors
when the process was killed abnormally.

Fixes: 453d087ccaff ("net/ice/base: add common functions")
Cc: sta...@dpdk.org

Signed-off-by: Steve Yang 
---
 drivers/net/ice/base/ice_common.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ice/base/ice_common.c b/drivers/net/ice/base/ice_common.c
index db87bacd97..66b51be29d 100644
--- a/drivers/net/ice/base/ice_common.c
+++ b/drivers/net/ice/base/ice_common.c
@@ -926,6 +926,11 @@ enum ice_status ice_init_hw(struct ice_hw *hw)
if (status)
goto err_unroll_cqinit;
 
+   /* Tell the Firmware to shut down the AdminQ to avoid possible error
+* when process was killed abnormally.
+*/
+   ice_aq_q_shutdown(hw, true);
+
status = ice_init_nvm(hw);
if (status)
goto err_unroll_cqinit;
-- 
2.25.1

[PATCH v2] ethdev: remove header split Rx offload

2022-08-11 Thread xuan . ding

From: Xuan Ding 

As announced in the deprecation note, this patch removes the Rx offload
flag 'RTE_ETH_RX_OFFLOAD_HEADER_SPLIT' and the 'split_hdr_size' field from
the structure 'rte_eth_rxmode'. The places where the examples
and apps initialize the 'split_hdr_size' field, and where the drivers
check whether the 'split_hdr_size' value is 0, are also removed.

Users can still use `RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT` for per-queue packet
split offload, which is configured by 'rte_eth_rxseg_split'.

Signed-off-by: Xuan Ding 
---
v2:
* fix CI build error
---
 app/test-eventdev/test_perf_common.c|  1 -
 app/test-pipeline/init.c|  1 -
 app/test-pmd/cmdline.c  | 12 ++--
 app/test/test_link_bonding.c|  1 -
 app/test/test_link_bonding_mode4.c  |  1 -
 app/test/test_link_bonding_rssconf.c|  2 --
 app/test/test_pmd_perf.c|  1 -
 app/test/test_security_inline_proto.c   |  1 -
 doc/guides/nics/fm10k.rst   |  4 
 doc/guides/nics/ixgbe.rst   |  4 
 doc/guides/rel_notes/deprecation.rst|  6 --
 doc/guides/rel_notes/release_22_11.rst  |  5 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 ++--
 drivers/net/cnxk/cnxk_ethdev_ops.c  |  1 -
 drivers/net/failsafe/failsafe_ops.c |  2 --
 drivers/net/fm10k/fm10k_ethdev.c|  1 -
 drivers/net/fm10k/fm10k_rxtx_vec.c  |  4 
 drivers/net/i40e/i40e_rxtx_vec_common.h |  4 
 drivers/net/mvneta/mvneta_ethdev.c  |  5 -
 drivers/net/mvpp2/mrvl_ethdev.c |  5 -
 drivers/net/thunderx/nicvf_ethdev.c |  5 -
 examples/bbdev_app/main.c   |  1 -
 examples/bond/main.c|  1 -
 examples/flow_filtering/main.c  |  3 ---
 examples/ip_fragmentation/main.c|  1 -
 examples/ip_pipeline/link.c |  1 -
 examples/ip_reassembly/main.c   |  1 -
 examples/ipsec-secgw/ipsec-secgw.c  |  1 -
 examples/ipv4_multicast/main.c  |  1 -
 examples/l2fwd-crypto/main.c|  1 -
 examples/l2fwd-event/l2fwd_common.c |  3 ---
 examples/l2fwd-jobstats/main.c  |  3 ---
 examples/l2fwd-keepalive/main.c |  3 ---
 examples/l2fwd/main.c   |  3 ---
 examples/l3fwd-graph/main.c |  1 -
 examples/l3fwd-power/main.c |  1 -
 examples/l3fwd/main.c   |  1 -
 examples/link_status_interrupt/main.c   |  3 ---
 examples/multi_process/symmetric_mp/main.c  |  1 -
 examples/ntb/ntb_fwd.c  |  1 -
 examples/pipeline/obj.c |  1 -
 examples/qos_meter/main.c   |  1 -
 examples/qos_sched/init.c   |  3 ---
 examples/vhost/main.c   |  1 -
 examples/vmdq/main.c|  1 -
 examples/vmdq_dcb/main.c|  1 -
 lib/ethdev/rte_ethdev.c |  1 -
 lib/ethdev/rte_ethdev.h |  3 ---
 48 files changed, 13 insertions(+), 100 deletions(-)

diff --git a/app/test-eventdev/test_perf_common.c b/app/test-eventdev/test_perf_common.c
index 81420be73a..7474b9270a 100644
--- a/app/test-eventdev/test_perf_common.c
+++ b/app/test-eventdev/test_perf_common.c
@@ -1244,7 +1244,6 @@ perf_ethdev_setup(struct evt_test *test, struct evt_options *opt)
struct rte_eth_conf port_conf = {
.rxmode = {
.mq_mode = RTE_ETH_MQ_RX_RSS,
-   .split_hdr_size = 0,
},
.rx_adv_conf = {
.rss_conf = {
diff --git a/app/test-pipeline/init.c b/app/test-pipeline/init.c
index eee0719b67..d146c44be0 100644
--- a/app/test-pipeline/init.c
+++ b/app/test-pipeline/init.c
@@ -68,7 +68,6 @@ struct app_params app = {
 
 static struct rte_eth_conf port_conf = {
.rxmode = {
-   .split_hdr_size = 0,
.offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM,
},
.rx_adv_conf = {
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b4fe9dfb17..5787659c32 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -745,7 +745,7 @@ static void cmd_help_long_parsed(void *parsed_result,
 
"port config  rx_offload vlan_strip|"
"ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|"
-   "outer_ipv4_cksum|macsec_strip|header_split|"
+   "outer_ipv4_cksum|macsec_strip|"
"vlan_filter|vlan_extend|jumbo_frame|scatter|"
"buffer_split|timestamp|security|keep_crc on|off\n"
" Enable or disable a per port Rx offloading"
@@ -753,7 +753,7 @@ static void cmd_help_long_parsed(void *parsed_result,
 
"port (port_id) rxq (queue_id) rx_offload vlan

RE: [PATCH] net/ice: support disabling ACL engine in DCF via devargs

2022-08-11 Thread Zhang, Qi Z




> -Original Message-
> From: Zeng, ZhichaoX 
> Sent: Monday, July 25, 2022 11:15 AM
> To: dev@dpdk.org
> Cc: Yang, Qiming ; Zeng, ZhichaoX
> ; Zhang, Qi Z 
> Subject: [PATCH] net/ice: support disabling ACL engine in DCF via devargs
> 
> From: Zhichao Zeng 
> 
> Support disabling DCF ACL engine via devarg "acl=off" in cmdline, aiming to
> shorten the DCF startup time.
> 
> Signed-off-by: Zhichao Zeng 

The patch looks good, but the new devarg needs to be documented in the
"Device Config Function (DCF)" section of ice.rst.

RE: [PATCH v1] net/iavf: fix pattern check for flow director parser

2022-08-11 Thread Zhang, Qi Z




> -Original Message-
> From: Steve Yang 
> Sent: Wednesday, August 10, 2022 2:48 PM
> To: dev@dpdk.org
> Cc: Wu, Jingjing ; Xing, Beilei
> ; Yang, SteveX ;
> sta...@dpdk.org
> Subject: [PATCH v1] net/iavf: fix pattern check for flow director parser
> 
> FDIR rules with masks are not supported in current code. Thus add pattern
> check for IPv4/UDP/TCP/SCTP addr/port to terminate the FDIR programming
> stage.
> 
> Fixes: d5eb3e600d9e ("net/iavf: support flow director basic rule")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Steve Yang 

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi

[PATCH] vhost: support CPU copy for small packets

2022-08-11 Thread Wenwu Ma

Offloading small packets to DMA degrades throughput by 10%~20%,
because DMA offloading is not free and DMA is not good at
processing small packets. In addition, control plane packets are
usually small, and assigning those packets to DMA significantly
increases latency, which may cause timeouts, for example for
TCP handshake packets. Therefore, this patch uses the CPU to
perform small copies in vhost.
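
A minimal standalone sketch of the threshold decision described above,
not the vhost code itself. 'struct seg', 'fake_dma_copy()' and
'transfer_one()' are hypothetical names used only for illustration;
the fake DMA call copies synchronously so the example runs without a
DMA device. Only CPU_COPY_THRESHOLD_LEN and the overall control flow
are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CPU_COPY_THRESHOLD_LEN 256 /* same value as in the patch */

/* Hypothetical stand-in for struct vhost_iovec, not the DPDK type. */
struct seg {
	const void *src;
	void *dst;
	size_t len;
};

/* Number of copies handed to the fake DMA engine so far. */
static int dma_enqueued;

/*
 * Hypothetical stand-in for rte_dma_copy(): copies synchronously and
 * returns the index of the slot it used.
 */
static int
fake_dma_copy(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return dma_enqueued++;
}

/*
 * Copy all segments of one packet. Segments longer than
 * CPU_COPY_THRESHOLD_LEN go to the (fake) DMA engine, the rest are
 * copied by the CPU. Returns nr_segs if at least one segment went to
 * DMA; otherwise marks the packet completed and returns 0.
 */
static unsigned int
transfer_one(const struct seg *iov, unsigned int nr_segs, bool *completed)
{
	bool cpu_copy = true;
	unsigned int i;

	assert(iov != NULL && completed != NULL);

	for (i = 0; i < nr_segs; i++) {
		if (iov[i].len > CPU_COPY_THRESHOLD_LEN) {
			(void)fake_dma_copy(iov[i].dst, iov[i].src, iov[i].len);
			cpu_copy = false;
		} else {
			memcpy(iov[i].dst, iov[i].src, iov[i].len);
		}
	}

	if (cpu_copy) {
		/* Nothing is in flight, so no DMA completion will arrive. */
		*completed = true;
		return 0;
	}
	return nr_segs;
}
```

A packet made only of small segments is finished as soon as the
function returns, while a packet with at least one large segment still
has to wait for the DMA completion.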

Signed-off-by: Wenwu Ma 
---
 lib/vhost/vhost.h  |  6 ++-
 lib/vhost/virtio_net.c | 87 +++---
 2 files changed, 61 insertions(+), 32 deletions(-)

diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index 40fac3b7c6..b4523175a8 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -142,8 +142,10 @@ struct virtqueue_stats {
  * iovec
  */
 struct vhost_iovec {
-   void *src_addr;
-   void *dst_addr;
+   void *src_iov_addr;
+   void *dst_iov_addr;
+   void *src_virt_addr;
+   void *dst_virt_addr;
size_t len;
 };
 
diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 35fa4670fd..b3bed93de7 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -26,6 +26,8 @@
 
 #define MAX_BATCH_LEN 256
 
+#define CPU_COPY_THRESHOLD_LEN 256
+
 static __rte_always_inline uint16_t
 async_poll_dequeue_completed(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mbuf **pkts, uint16_t count, int16_t dma_id,
@@ -114,29 +116,36 @@ vhost_async_dma_transfer_one(struct virtio_net *dev, struct vhost_virtqueue *vq,
int copy_idx = 0;
uint32_t nr_segs = pkt->nr_segs;
uint16_t i;
+   bool cpu_copy = true;
 
if (rte_dma_burst_capacity(dma_id, vchan_id) < nr_segs)
return -1;
 
for (i = 0; i < nr_segs; i++) {
-   copy_idx = rte_dma_copy(dma_id, vchan_id, (rte_iova_t)iov[i].src_addr,
-   (rte_iova_t)iov[i].dst_addr, iov[i].len, RTE_DMA_OP_FLAG_LLC);
-   /**
-* Since all memory is pinned and DMA vChannel
-* ring has enough space, failure should be a
-* rare case. If failure happens, it means DMA
-* device encounters serious errors; in this
-* case, please stop async data-path and check
-* what has happened to DMA device.
-*/
-   if (unlikely(copy_idx < 0)) {
-   if (!vhost_async_dma_copy_log) {
-   VHOST_LOG_DATA(dev->ifname, ERR,
-   "DMA copy failed for channel %d:%u\n",
-   dma_id, vchan_id);
-   vhost_async_dma_copy_log = true;
+   if (iov[i].len > CPU_COPY_THRESHOLD_LEN) {
+   copy_idx = rte_dma_copy(dma_id, vchan_id, (rte_iova_t)iov[i].src_iov_addr,
+   (rte_iova_t)iov[i].dst_iov_addr,
+   iov[i].len, RTE_DMA_OP_FLAG_LLC);
+   /**
+* Since all memory is pinned and DMA vChannel
+* ring has enough space, failure should be a
+* rare case. If failure happens, it means DMA
+* device encounters serious errors; in this
+* case, please stop async data-path and check
+* what has happened to DMA device.
+*/
+   if (unlikely(copy_idx < 0)) {
+   if (!vhost_async_dma_copy_log) {
+   VHOST_LOG_DATA(dev->ifname, ERR,
+   "DMA copy failed for channel %d:%u\n",
+   dma_id, vchan_id);
+   vhost_async_dma_copy_log = true;
+   }
+   return -1;
}
-   return -1;
+   cpu_copy = false;
+   } else {
+   rte_memcpy(iov[i].dst_virt_addr, iov[i].src_virt_addr, iov[i].len);
}
}
 
@@ -144,7 +153,13 @@ vhost_async_dma_transfer_one(struct virtio_net *dev, struct vhost_virtqueue *vq,
 * Only store packet completion flag address in the last copy's
 * slot, and other slots are set to NULL.
 */
-   dma_info->pkts_cmpl_flag_addr[copy_idx & ring_mask] = &vq->async->pkts_cmpl_flag[flag_idx];
+   if (cpu_copy == false) {
+   dma_info->pkts_cmpl_flag_addr[copy_idx & ring_mask] =
+   &vq->async->pkts_cmpl_flag[flag_idx];
+   } else {
+   vq->async->pkts_cmpl_flag[flag_idx] = true;
+   nr_segs = 0;
+   }
 
return nr_segs;
 }
@@ -1008,7 +1023,7 @@ async_iter_initialize(struct virtio_net *dev, struct 
vhost_async *a
