RE: [PATCH RESEND] config/arm: add PHYTIUM fts2500

2022-09-08 Thread Ruifeng Wang
> -Original Message-
> From: luzhipeng 
> Sent: Wednesday, September 7, 2022 4:11 PM
> To: dev@dpdk.org
> Cc: Jan Viktorin ; Ruifeng Wang 
> ; Bruce
> Richardson ; luzhipeng 
> Subject: [PATCH RESEND] config/arm: add PHYTIUM fts2500
> 
> This adds configs for the PHYTIUM server.
> 
> Signed-off-by: luzhipeng 
> ---
>  config/arm/arm64_fts2500_linux_gcc | 16 
>  config/arm/meson.build | 22 --
>  2 files changed, 36 insertions(+), 2 deletions(-)
>  create mode 100644 config/arm/arm64_fts2500_linux_gcc
> 
> diff --git a/config/arm/arm64_fts2500_linux_gcc 
> b/config/arm/arm64_fts2500_linux_gcc
> new file mode 100644
> index 00..d43c7aad3a
> --- /dev/null
> +++ b/config/arm/arm64_fts2500_linux_gcc
> @@ -0,0 +1,16 @@
> +[binaries]
> +c = 'aarch64-linux-gnu-gcc'
Ccache was enabled to speed up cross builds.
To be consistent with the other SoCs, please add it here as well.

Thanks.
> +cpp = 'aarch64-linux-gnu-g++'
> +ar = 'aarch64-linux-gnu-gcc-ar'
> +strip = 'aarch64-linux-gnu-strip'
> +pkgconfig = 'aarch64-linux-gnu-pkg-config'
> +pcap-config = ''
> +
> +[host_machine]
> +system = 'linux'
> +cpu_family = 'aarch64'
> +cpu = 'armv8-a'
> +endian = 'little'
> +
> +[properties]
> +platform = 'fts2500'
> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index 9f1636e0d5..ae0777b46c 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -203,13 +203,22 @@ implementer_phytium = {
>  ['RTE_MACHINE', '"armv8a"'],
>  ['RTE_USE_C11_MEM_MODEL', true],
>  ['RTE_CACHE_LINE_SIZE', 64],
> -['RTE_MAX_LCORE', 64],
> -['RTE_MAX_NUMA_NODES', 8]
>  ],
>  'part_number_config': {
>  '0x662': {
>  'machine_args': ['-march=armv8-a+crc'],
> +'flags': [
> +['RTE_MAX_LCORE', 64],
> +['RTE_MAX_NUMA_NODES', 8]
> + ]
>  },
> +   '0x663': {
> +'machine_args': ['-march=armv8-a+crc'],
> +'flags': [
> +['RTE_MAX_LCORE', 128],
> +['RTE_MAX_NUMA_NODES', 16]
> +]
> +}
>  }
>  }
> 
> @@ -328,6 +337,13 @@ soc_ft2000plus = {
>  'numa': true
>  }
> 
> +soc_fts2500 = {
> +'description': 'Phytium FT-S2500',
> +'implementer': '0x70',
> +'part_number': '0x663',
> +'numa': true
> +}
> +
>  soc_graviton2 = {
>  'description': 'AWS Graviton2',
>  'implementer': '0x41',
> @@ -414,6 +430,7 @@ cn10k:   Marvell OCTEON 10
>  dpaa:NXP DPAA
>  emag:Ampere eMAG
>  ft2000plus:  Phytium FT-2000+
> +fts2500: Phytium FT-S2500
>  graviton2:   AWS Graviton2
>  kunpeng920:  HiSilicon Kunpeng 920
>  kunpeng930:  HiSilicon Kunpeng 930
> @@ -438,6 +455,7 @@ socs = {
>  'dpaa': soc_dpaa,
>  'emag': soc_emag,
>  'ft2000plus': soc_ft2000plus,
> +'fts2500': soc_fts2500,
>  'graviton2': soc_graviton2,
>  'kunpeng920': soc_kunpeng920,
>  'kunpeng930': soc_kunpeng930,
> --
> 2.27.0
> 
> 



Re: [PATCH RESEND] config/arm: add PHYTIUM fts2500

2022-09-08 Thread 解建华
Hello Zhipeng, please see inline. 

Thanks a lot,
Jianhua


> -Original Message-
> From: luzhipeng
> Sent: 2022-09-07 16:10:55 (Wednesday)
> To: dev@dpdk.org
> Cc: "Jan Viktorin", "Ruifeng Wang", "Bruce Richardson", luzhipeng
> Subject: [PATCH RESEND] config/arm: add PHYTIUM fts2500
> 
> This adds configs for the PHYTIUM server.
> 
> Signed-off-by: luzhipeng 
> ---
>  config/arm/arm64_fts2500_linux_gcc | 16 
>  config/arm/meson.build | 22 --
>  2 files changed, 36 insertions(+), 2 deletions(-)
>  create mode 100644 config/arm/arm64_fts2500_linux_gcc
> 
> diff --git a/config/arm/arm64_fts2500_linux_gcc 
> b/config/arm/arm64_fts2500_linux_gcc

Phytium has released three CPU series: TengYun S (server), TengRui D (desktop)
and TengLong E (embedded). Please refer to the introduction link:
https://www.phytium.com.cn/en/class/11

So it would be better to rename config/arm/arm64_fts2500_linux_gcc to
config/arm/arm64_tys2500_linux_gcc.

> new file mode 100644
> index 00..d43c7aad3a
> --- /dev/null
> +++ b/config/arm/arm64_fts2500_linux_gcc
> @@ -0,0 +1,16 @@
> +[binaries]
> +c = 'aarch64-linux-gnu-gcc'
> +cpp = 'aarch64-linux-gnu-g++'
> +ar = 'aarch64-linux-gnu-gcc-ar'
> +strip = 'aarch64-linux-gnu-strip'
> +pkgconfig = 'aarch64-linux-gnu-pkg-config'
> +pcap-config = ''
> +
> +[host_machine]
> +system = 'linux'
> +cpu_family = 'aarch64'
> +cpu = 'armv8-a'
> +endian = 'little'
> +
> +[properties]
> +platform = 'fts2500'

tys2500 looks better.


> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index 9f1636e0d5..ae0777b46c 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -203,13 +203,22 @@ implementer_phytium = {
>  ['RTE_MACHINE', '"armv8a"'],
>  ['RTE_USE_C11_MEM_MODEL', true],
>  ['RTE_CACHE_LINE_SIZE', 64],
> -['RTE_MAX_LCORE', 64],
> -['RTE_MAX_NUMA_NODES', 8]
>  ],
>  'part_number_config': {
>  '0x662': {
>  'machine_args': ['-march=armv8-a+crc'],

Please split machine_args like this:
-'machine_args': ['-march=armv8-a+crc'],
+'march': 'armv8-a',
+'march_features': ['crc'],


> +'flags': [
> +['RTE_MAX_LCORE', 64],
> +['RTE_MAX_NUMA_NODES', 8]
> + ]
>  },
> +   '0x663': {
> +'machine_args': ['-march=armv8-a+crc'],

Please split machine_args like this:
-'machine_args': ['-march=armv8-a+crc'],
+'march': 'armv8-a',
+'march_features': ['crc'],


> +'flags': [
> +['RTE_MAX_LCORE', 128],
> +['RTE_MAX_NUMA_NODES', 16]

+['RTE_MAX_LCORE', 256],
+['RTE_MAX_NUMA_NODES', 32]

The Phytium TengYun S2500 server series has 2P_128core, 4P_256core
and up to 8P_512core SKUs. Each processor is an ARMv8-a part with
part number 0x663, 8 NUMA nodes and 64 cores.

You may add the Phytium TengYun S2500 servers with the
max configuration 4P_256core_32NUMA, and ignore 8P_512core_64NUMA
since this SKU has not been found on the current market and cannot be tested.


> +]
> +}
>  }
>  }
>  
> @@ -328,6 +337,13 @@ soc_ft2000plus = {
>  'numa': true
>  }
>  
> +soc_fts2500 = {

+soc_tys2500

> +'description': 'Phytium FT-S2500',

+'description': 'Phytium TengYun S2500',


> +'implementer': '0x70',
> +'part_number': '0x663',
> +'numa': true
> +}
> +
>  soc_graviton2 = {
>  'description': 'AWS Graviton2',
>  'implementer': '0x41',
> @@ -414,6 +430,7 @@ cn10k:   Marvell OCTEON 10
>  dpaa:NXP DPAA
>  emag:Ampere eMAG
>  ft2000plus:  Phytium FT-2000+
> +fts2500: Phytium FT-S2500

+tys2500: Phytium TengYun S2500

>  graviton2:   AWS Graviton2
>  kunpeng920:  HiSilicon Kunpeng 920
>  kunpeng930:  HiSilicon Kunpeng 930
> @@ -438,6 +455,7 @@ socs = {
>  'dpaa': soc_dpaa,
>  'emag': soc_emag,
>  'ft2000plus': soc_ft2000plus,
> +'fts2500': soc_fts2500,

+'tys2500': soc_tys2500,

>  'graviton2': soc_graviton2,
>  'kunpeng920': soc_kunpeng920,
>  'kunpeng930': soc_kunpeng930,
> -- 
> 2.27.0
> 
> 


Information Security Notice: The information contained in this mail is solely
the property of the sender's organization. This mail communication is
confidential. Recipients named above are obligated to maintain secrecy and are
not permitted to disclose the contents of this communication to others.

Re: DPDK Summit 2022 is live

2022-09-08 Thread Thomas Monjalon
For the second day of presentations, we have a new YouTube link.

Feel free to watch (with best quality) on
https://www.youtube.com/watch?v=wRiDhfTKPWw
or join and ask questions on
https://zoom.us/j/96829774063?pwd=OGJESU5HN1h0Y29DQjJjd2owT0xHUT09

You can find the schedule and slides at
https://dpdkuserspace22.sched.com/




RE: [PATCH v2 02/10] net/gve: add logs and OS specific implementation

2022-09-08 Thread Guo, Junfeng


> -Original Message-
> From: Ferruh Yigit 
> Sent: Wednesday, September 7, 2022 19:17
> To: Guo, Junfeng ; Zhang, Qi Z
> ; Wu, Jingjing 
> Cc: dev@dpdk.org; Li, Xiaoyun ;
> awogbem...@google.com; Richardson, Bruce
> ; Wang, Haiyue 
> Subject: Re: [PATCH v2 02/10] net/gve: add logs and OS specific
> implementation
> 
> On 9/7/2022 7:58 AM, Guo, Junfeng wrote:
> >
> >
> >> -Original Message-
> >> From: Ferruh Yigit 
> >> Sent: Friday, September 2, 2022 01:21
> >> To: Guo, Junfeng ; Zhang, Qi Z
> >> ; Wu, Jingjing 
> >> Cc: dev@dpdk.org; Li, Xiaoyun ;
> >> awogbem...@google.com; Richardson, Bruce
> >> ; Wang, Haiyue
> 
> >> Subject: Re: [PATCH v2 02/10] net/gve: add logs and OS specific
> >> implementation
> >>
> >> On 8/29/2022 9:41 AM, Junfeng Guo wrote:
> >>
> >>>
> >>> Add GVE PMD logs.
> >>> Add some MACRO definitions and memory operations which are
> specific
> >>> for DPDK.
> >>>
> >>> Signed-off-by: Haiyue Wang 
> >>> Signed-off-by: Xiaoyun Li 
> >>> Signed-off-by: Junfeng Guo 
> >>
> >> <...>
> >>
> >>> diff --git a/drivers/net/gve/gve_logs.h b/drivers/net/gve/gve_logs.h
> >>> new file mode 100644
> >>> index 00..a050253f59
> >>> --- /dev/null
> >>> +++ b/drivers/net/gve/gve_logs.h
> >>> @@ -0,0 +1,22 @@
> >>> +/* SPDX-License-Identifier: BSD-3-Clause
> >>> + * Copyright(C) 2022 Intel Corporation
> >>> + */
> >>> +
> >>> +#ifndef _GVE_LOGS_H_
> >>> +#define _GVE_LOGS_H_
> >>> +
> >>> +extern int gve_logtype_init;
> >>> +extern int gve_logtype_driver;
> >>> +
> >>> +#define PMD_INIT_LOG(level, fmt, args...) \
> >>> +   rte_log(RTE_LOG_ ## level, gve_logtype_init, "%s(): " fmt "\n", \
> >>> +   __func__, ##args)
> >>> +
> >>> +#define PMD_DRV_LOG_RAW(level, fmt, args...) \
> >>> +   rte_log(RTE_LOG_ ## level, gve_logtype_driver, "%s(): " fmt, \
> >>> +   __func__, ## args)
> >>> +
> >>> +#define PMD_DRV_LOG(level, fmt, args...) \
> >>> +   PMD_DRV_LOG_RAW(level, fmt "\n", ## args)
> >>> +
> >>
> >> Why is 'PMD_DRV_LOG_RAW' needed, why not directly use
> >> 'PMD_DRV_LOG'?
> >
> > It seems that the _RAW macro was first introduced in the i40e driver logs
> > file. Since the trailing '\n' is sometimes added at the end of the log
> > message in the base code, the PMD_DRV_LOG_RAW macro, which does not add
> > one, is used to keep the newline handling consistent.
> >
> > Well, it looks like the macro PMD_DRV_LOG_RAW is somewhat redundant.
> > I think it's OK to remove PMD_DRV_LOG_RAW and keep all the log messages
> > ending without the trailing '\n'. Thanks!
> >
> 
> Or you can add '\n' to 'PMD_DRV_LOG', to not change all logs. Only
> having two macros seems unnecessary.

Yes, I already did it this way in the upcoming version of the gve PMD code. Thanks!

> 
> >>
> >>
> >> Do you really need two different log types? How do you differentiate
> >> 'init' & 'driver' types? As far as I can see there is mixed usage of them.
> >
> > The PMD_INIT_LOG is used at the init stage, while the PMD_DRV_LOG
> > is used during the driver's normal running stage. I agree that there might be
> > mixed usage of these two macros. I'll try to check all these usages and
> > update them to the correct conditions in the coming versions.
> > If you insist that only one log type is needed to keep the code clean,
> > then I could update them as you expect. Thanks!
> >
> 
> I do not insist, but it looks like you are complicating things. Is there
> really a benefit to having two different log types?

Well, these two types could be used to show init and driver logs, respectively.
But it seems there is no specific need for two log types in the GVE PMD.
Anyway, I think it is a good time to keep the code clean and not just
inherit from previous drivers. We can add a new log type in the future
if it's required. Thanks!
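
For reference, a minimal sketch of what the consolidation discussed above could
look like: a single log type and one macro that appends the newline itself. The
names gve_logtype and GVE_LOG are illustrative and may differ from the final
driver code.

#include <rte_log.h>

/* Single driver-wide log type; registered elsewhere, e.g. with
 * RTE_LOG_REGISTER_SUFFIX().
 */
extern int gve_logtype;

/* One macro; the newline is appended here so call sites never pass '\n'. */
#define GVE_LOG(level, fmt, args...) \
	rte_log(RTE_LOG_ ## level, gve_logtype, "%s(): " fmt "\n", \
		__func__, ## args)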


Re: [PATCH v4 3/9] dts: add basic logging facility

2022-09-08 Thread Bruce Richardson
On Fri, Jul 29, 2022 at 10:55:44AM +, Juraj Linkeš wrote:
> The logging module provides loggers distinguished by two attributes,
> a custom format and a verbosity switch. The loggers log to both console
> and more verbosely to files.
> 
> Signed-off-by: Owen Hilyard 
> Signed-off-by: Juraj Linkeš 

Few small comments inline below.

Thanks,
/Bruce

> ---
>  dts/framework/__init__.py |   3 +
>  dts/framework/logger.py   | 124 ++
>  2 files changed, 127 insertions(+)
>  create mode 100644 dts/framework/__init__.py
>  create mode 100644 dts/framework/logger.py
> 
> diff --git a/dts/framework/__init__.py b/dts/framework/__init__.py
> new file mode 100644
> index 00..3c30bccf43
> --- /dev/null
> +++ b/dts/framework/__init__.py
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2022 PANTHEON.tech s.r.o.
> +#
> diff --git a/dts/framework/logger.py b/dts/framework/logger.py
> new file mode 100644
> index 00..920ce0fb15
> --- /dev/null
> +++ b/dts/framework/logger.py
> @@ -0,0 +1,124 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2010-2014 Intel Corporation
> +# Copyright(c) 2022 PANTHEON.tech s.r.o.
> +# Copyright(c) 2022 University of New Hampshire
> +#
> +
> +import logging
> +import os.path
> +from typing import TypedDict
> +
> +"""
> +DTS logger module with several log levels. DTS framework and TestSuite logs
> +will be saved into different log files.
> +"""
> +verbose = False
> +date_fmt = "%d/%m/%Y %H:%M:%S"

Please use Year-month-day ordering for dates, since it's unambiguous - as
well as being an ISO standard date format. (ISO 8601)

> +stream_fmt = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
> +
> +
> +class LoggerDictType(TypedDict):
> +logger: "DTSLOG"
> +name: str
> +node: str
> +
> +
> +# List for saving all using loggers
> +global Loggers
> +Loggers: list[LoggerDictType] = []
> +
> +
> +def set_verbose() -> None:
> +global verbose
> +verbose = True
> +

Is there a need for a clear_verbose() or "set_not_verbose()" API?

> +
> +class DTSLOG(logging.LoggerAdapter):
> +"""
> +DTS log class for framework and testsuite.
> +"""
> +
> +node: str
> +logger: logging.Logger
> +sh: logging.StreamHandler
> +fh: logging.FileHandler
> +verbose_handler: logging.FileHandler
> +
> +def __init__(self, logger: logging.Logger, node: str = "suite"):
> +global log_dir
> +
> +self.logger = logger
> +self.logger.setLevel(1)  # 1 means log everything
> +
> +self.node = node
> +
> +# add handler to emit to stdout
> +sh = logging.StreamHandler()
> +sh.setFormatter(logging.Formatter(stream_fmt, date_fmt))
> +
> +sh.setLevel(logging.DEBUG)  # file handler default level
> +global verbose
> +if verbose is True:
> +sh.setLevel(logging.DEBUG)
> +else:
> +sh.setLevel(logging.INFO)  # console handler defaultlevel

The global declaration should be at the top of the function.
It looks like some of the setLevel calls are unnecessary; two should be enough
rather than three. For example:

sh.setLevel(logging.INFO)
if verbose:
    sh.setLevel(logging.DEBUG)

> +
> +self.logger.addHandler(sh)
> +self.sh = sh
> +
> +if not os.path.exists("output"):
> +os.mkdir("output")
> +
> +fh = logging.FileHandler(f"output/{node}.log")
> +fh.setFormatter(
> +logging.Formatter(
> +fmt="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
> +datefmt=date_fmt,
> +)
> +)
> +
> +fh.setLevel(1)  # We want all the logs we can get in the file
> +self.logger.addHandler(fh)
> +self.fh = fh
> +
> +# This outputs EVERYTHING, intended for post-mortem debugging
> +# Also optimized for processing via AWK (awk -F '|' ...)
> +verbose_handler = logging.FileHandler(f"output/{node}.verbose.log")
> +verbose_handler.setFormatter(
> +logging.Formatter(
> +
> fmt="%(asctime)s|%(name)s|%(levelname)s|%(pathname)s|%(lineno)d|%(funcName)s|"
> +"%(process)d|%(thread)d|%(threadName)s|%(message)s",
> +datefmt=date_fmt,
> +)
> +)
> +
> +verbose_handler.setLevel(1)  # We want all the logs we can get in 
> the file
> +self.logger.addHandler(verbose_handler)
> +self.verbose_handler = verbose_handler
> +
> +super(DTSLOG, self).__init__(self.logger, dict(node=self.node))
> +
> +def logger_exit(self) -> None:
> +"""
> +Remove stream handler and logfile handler.
> +"""
> +for handler in (self.sh, self.fh, self.verbose_handler):
> +handler.flush()
> +self.logger.removeHandler(handler)
> +
> +
> +def getLogger(name: str, node: str = "suite") -> DTSLOG:
> +"""
> + 

[PATCH v8 00/12] preparation for the rte_flow offload of nfp PMD

2022-09-08 Thread Chaoyong He
This is the first patch series to add support for rte_flow offload in the
nfp PMD. It includes:
Add support for the flower firmware
Add support for representor ports
Add the flower service infrastructure
Add the cmsg interactive channels between the PMD and the firmware

* Changes since v7
- Adjust the logic to make sure the PCI probe process is not broken
- Change 'app' to 'app_fw' in all logic to avoid confusion
- Fix a problem with the log level

* Changes since v6
- Fix the compile error

* Changes since v5
- Compare integer with 0 explicitly
- Change helper macro to function
- Implement the dummy functions
- Remove some unnecessary logics

* Changes since v4
- Remove the unneeded '__rte_unused' attribute
- Fixup a potential memory leak problem

* Changes since v3
- Add the 'Depends-on' tag

* Changes since v2
- Remove the use of rte_panic()

* Changes since v1
- Fix the compile error

Depends-on: series-23707 ("Add support of NFP3800 chip and firmware with NFDk")

Chaoyong He (12):
  net/nfp: move app specific attributes to own struct
  net/nfp: simplify initialization and remove dead code
  net/nfp: move app specific init logic to own function
  net/nfp: add initial flower firmware support
  net/nfp: add flower PF setup logic
  net/nfp: add flower PF related routines
  net/nfp: add flower ctrl VNIC related logics
  net/nfp: move common rxtx function for flower use
  net/nfp: add flower ctrl VNIC rxtx logic
  net/nfp: add flower representor framework
  net/nfp: move rxtx function to header file
  net/nfp: add flower PF rxtx logic

 doc/guides/rel_notes/release_22_11.rst  |5 +
 drivers/net/nfp/flower/nfp_flower.c | 1324 +++
 drivers/net/nfp/flower/nfp_flower.h |   62 ++
 drivers/net/nfp/flower/nfp_flower_cmsg.c|  186 
 drivers/net/nfp/flower/nfp_flower_cmsg.h|  173 +++
 drivers/net/nfp/flower/nfp_flower_ctrl.c|  250 +
 drivers/net/nfp/flower/nfp_flower_ctrl.h|   13 +
 drivers/net/nfp/flower/nfp_flower_ovs_compat.h  |   37 +
 drivers/net/nfp/flower/nfp_flower_representor.c |  664 
 drivers/net/nfp/flower/nfp_flower_representor.h |   39 +
 drivers/net/nfp/meson.build |4 +
 drivers/net/nfp/nfp_common.c|2 +-
 drivers/net/nfp/nfp_common.h|   35 +-
 drivers/net/nfp/nfp_cpp_bridge.c|   87 +-
 drivers/net/nfp/nfp_cpp_bridge.h|6 +-
 drivers/net/nfp/nfp_ethdev.c|  345 +++---
 drivers/net/nfp/nfp_ethdev_vf.c |2 +-
 drivers/net/nfp/nfp_rxtx.c  |  123 +--
 drivers/net/nfp/nfp_rxtx.h  |  121 +++
 drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c  |   31 +-
 20 files changed, 3228 insertions(+), 281 deletions(-)
 create mode 100644 drivers/net/nfp/flower/nfp_flower.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ovs_compat.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.h

-- 
1.8.3.1



[PATCH v8 01/12] net/nfp: move app specific attributes to own struct

2022-09-08 Thread Chaoyong He
The NFP card can load different firmware applications. Currently
only the CoreNIC application is supported. This commit makes the
infrastructure changes needed to support other firmware
applications too.

A clearer separation is made between the PF device and any
application-specific concepts. The PF struct is now generic regardless of the
application loaded. A new struct is also added for the CoreNIC
application. Future additions to support other applications should
also add an application-specific struct.

Signed-off-by: Chaoyong He 
Signed-off-by: Heinrich Kuhn 
Reviewed-by: Niklas Söderlund 
---
 drivers/net/nfp/nfp_common.h |  30 ++-
 drivers/net/nfp/nfp_ethdev.c | 197 +++
 2 files changed, 152 insertions(+), 75 deletions(-)

diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
index 6d917e4..bea5f95 100644
--- a/drivers/net/nfp/nfp_common.h
+++ b/drivers/net/nfp/nfp_common.h
@@ -111,6 +111,11 @@
 #include 
 #include 
 
+/* Firmware application ID's */
+enum nfp_app_fw_id {
+   NFP_APP_FW_CORE_NIC   = 0x1,
+};
+
 /* nfp_qcp_ptr - Read or Write Pointer of a queue */
 enum nfp_qcp_ptr {
NFP_QCP_READ_PTR = 0,
@@ -121,8 +126,10 @@ struct nfp_pf_dev {
/* Backpointer to associated pci device */
struct rte_pci_device *pci_dev;
 
-   /* Array of physical ports belonging to this PF */
-   struct nfp_net_hw *ports[NFP_MAX_PHYPORTS];
+   enum nfp_app_fw_id app_fw_id;
+
+   /* Pointer to the app running on the PF */
+   void *app_fw_priv;
 
/* Current values for control */
uint32_t ctrl;
@@ -151,8 +158,6 @@ struct nfp_pf_dev {
struct nfp_cpp_area *msix_area;
 
uint8_t *hw_queues;
-   uint8_t total_phyports;
-   boolmultiport;
 
union eth_table_entry *eth_table;
 
@@ -161,6 +166,20 @@ struct nfp_pf_dev {
uint32_t nfp_cpp_service_id;
 };
 
+struct nfp_app_fw_nic {
+   /* Backpointer to the PF device */
+   struct nfp_pf_dev *pf_dev;
+
+   /*
+* Array of physical ports belonging to this CoreNIC app
+* This is really a list of vNIC's. One for each physical port
+*/
+   struct nfp_net_hw *ports[NFP_MAX_PHYPORTS];
+
+   bool multiport;
+   uint8_t total_phyports;
+};
+
 struct nfp_net_hw {
/* Backpointer to the PF this port belongs to */
struct nfp_pf_dev *pf_dev;
@@ -424,6 +443,9 @@ int nfp_net_rss_hash_conf_get(struct rte_eth_dev *dev,
 #define NFP_NET_DEV_PRIVATE_TO_PF(dev_priv)\
(((struct nfp_net_hw *)dev_priv)->pf_dev)
 
+#define NFP_PRIV_TO_APP_FW_NIC(app_fw_priv)\
+   ((struct nfp_app_fw_nic *)app_fw_priv)
+
 #endif /* _NFP_COMMON_H_ */
 /*
  * Local variables:
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index e9d01f4..bd9cf67 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -39,15 +39,15 @@
 #include "nfp_cpp_bridge.h"
 
 static int
-nfp_net_pf_read_mac(struct nfp_pf_dev *pf_dev, int port)
+nfp_net_pf_read_mac(struct nfp_app_fw_nic *app_hw_nic, int port)
 {
struct nfp_eth_table *nfp_eth_table;
struct nfp_net_hw *hw = NULL;
 
/* Grab a pointer to the correct physical port */
-   hw = pf_dev->ports[port];
+   hw = app_hw_nic->ports[port];
 
-   nfp_eth_table = nfp_eth_read_ports(pf_dev->cpp);
+   nfp_eth_table = nfp_eth_read_ports(app_hw_nic->pf_dev->cpp);
 
nfp_eth_copy_mac((uint8_t *)&hw->mac_addr,
 (uint8_t *)&nfp_eth_table->ports[port].mac_addr);
@@ -64,6 +64,7 @@
uint32_t new_ctrl, update = 0;
struct nfp_net_hw *hw;
struct nfp_pf_dev *pf_dev;
+   struct nfp_app_fw_nic *app_hw_nic;
struct rte_eth_conf *dev_conf;
struct rte_eth_rxmode *rxmode;
uint32_t intr_vector;
@@ -71,6 +72,7 @@
 
hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+   app_hw_nic = NFP_PRIV_TO_APP_FW_NIC(pf_dev->app_fw_priv);
 
PMD_INIT_LOG(DEBUG, "Start");
 
@@ -82,7 +84,7 @@
 
/* check and configure queue intr-vector mapping */
if (dev->data->dev_conf.intr_conf.rxq != 0) {
-   if (pf_dev->multiport) {
+   if (app_hw_nic->multiport) {
PMD_INIT_LOG(ERR, "PMD rx interrupt is not supported "
  "with NFP multiport PF");
return -EINVAL;
@@ -250,6 +252,7 @@
struct nfp_net_hw *hw;
struct rte_pci_device *pci_dev;
struct nfp_pf_dev *pf_dev;
+   struct nfp_app_fw_nic *app_hw_nic;
int i;
 
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
@@ -260,6 +263,7 @@
pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
pci_dev = RTE_ETH_DEV_TO_PCI(dev);
+   app_

[PATCH v8 02/12] net/nfp: simplify initialization and remove dead code

2022-09-08 Thread Chaoyong He
Calling nfp_net_init() is only done for the corenic firmware flavor
and it is guaranteed to always be called from the primary process,
so the explicit check for RTE_PROC_PRIMARY can be dropped.

The call graph of nfp_net_init() already guarantees that resources are
freed when it fails, so the now-unnecessary free logic inside it is removed.

While at it, remove the unused member is_phyport from struct nfp_net_hw.

Signed-off-by: Chaoyong He 
Reviewed-by: Niklas Söderlund 
---
 drivers/net/nfp/nfp_common.h |  1 -
 drivers/net/nfp/nfp_ethdev.c | 40 +++-
 2 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
index bea5f95..6af8481 100644
--- a/drivers/net/nfp/nfp_common.h
+++ b/drivers/net/nfp/nfp_common.h
@@ -235,7 +235,6 @@ struct nfp_net_hw {
uint8_t idx;
/* Internal port number as seen from NFP */
uint8_t nfp_idx;
-   boolis_phyport;
 
union eth_table_entry *eth_table;
 
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index bd9cf67..955b214 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -417,7 +417,6 @@
uint32_t start_q;
int stride = 4;
int port = 0;
-   int err;
 
PMD_INIT_FUNC_TRACE();
 
@@ -452,10 +451,6 @@
PMD_INIT_LOG(DEBUG, "Working with physical port number: %d, "
"NFP internal port number: %d", port, hw->nfp_idx);
 
-   /* For secondary processes, the primary has done all the work */
-   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-   return 0;
-
rte_eth_copy_pci_info(eth_dev, pci_dev);
 
hw->device_id = pci_dev->id.device_id;
@@ -506,8 +501,7 @@
break;
default:
PMD_DRV_LOG(ERR, "nfp_net: no device ID matching");
-   err = -ENODEV;
-   goto dev_err_ctrl_map;
+   return -ENODEV;
}
 
PMD_INIT_LOG(DEBUG, "tx_bar_off: 0x%" PRIx64 "", tx_bar_off);
@@ -573,8 +567,7 @@
   RTE_ETHER_ADDR_LEN, 0);
if (eth_dev->data->mac_addrs == NULL) {
PMD_INIT_LOG(ERR, "Failed to space for MAC address");
-   err = -ENOMEM;
-   goto dev_err_queues_map;
+   return -ENOMEM;
}
 
nfp_net_pf_read_mac(app_hw_nic, port);
@@ -604,24 +597,15 @@
 hw->mac_addr[0], hw->mac_addr[1], hw->mac_addr[2],
 hw->mac_addr[3], hw->mac_addr[4], hw->mac_addr[5]);
 
-   if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-   /* Registering LSC interrupt handler */
-   rte_intr_callback_register(pci_dev->intr_handle,
-   nfp_net_dev_interrupt_handler, (void *)eth_dev);
-   /* Telling the firmware about the LSC interrupt entry */
-   nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
-   /* Recording current stats counters values */
-   nfp_net_stats_reset(eth_dev);
-   }
+   /* Registering LSC interrupt handler */
+   rte_intr_callback_register(pci_dev->intr_handle,
+   nfp_net_dev_interrupt_handler, (void *)eth_dev);
+   /* Telling the firmware about the LSC interrupt entry */
+   nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
+   /* Recording current stats counters values */
+   nfp_net_stats_reset(eth_dev);
 
return 0;
-
-dev_err_queues_map:
-   nfp_cpp_area_free(hw->hwqueues_area);
-dev_err_ctrl_map:
-   nfp_cpp_area_free(hw->ctrl_area);
-
-   return err;
 }
 
 #define DEFAULT_FW_PATH   "/lib/firmware/netronome"
@@ -820,7 +804,6 @@
hw->eth_dev = eth_dev;
hw->idx = i;
hw->nfp_idx = nfp_eth_table->ports[i].index;
-   hw->is_phyport = true;
 
eth_dev->device = &pf_dev->pci_dev->device;
 
@@ -886,8 +869,7 @@
 
if (cpp == NULL) {
PMD_INIT_LOG(ERR, "A CPP handle can not be obtained");
-   ret = -EIO;
-   goto error;
+   return -EIO;
}
 
hwinfo = nfp_hwinfo_read(cpp);
@@ -1008,7 +990,7 @@
free(hwinfo);
 cpp_cleanup:
nfp_cpp_free(cpp);
-error:
+
return ret;
 }
 
-- 
1.8.3.1



[PATCH v8 03/12] net/nfp: move app specific init logic to own function

2022-09-08 Thread Chaoyong He
The NFP card can load different firmware applications.
This commit moves the secondary-process init logic of the
corenic app into its own function.

Signed-off-by: Chaoyong He 
Reviewed-by: Niklas Söderlund 
---
 drivers/net/nfp/nfp_ethdev.c | 90 +---
 1 file changed, 60 insertions(+), 30 deletions(-)

diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 955b214..19b26cb 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -994,6 +994,49 @@
return ret;
 }
 
+static int
+nfp_secondary_init_app_fw_nic(struct rte_pci_device *pci_dev,
+   struct nfp_rtsym_table *sym_tbl,
+   struct nfp_cpp *cpp)
+{
+   int i;
+   int err = 0;
+   int ret = 0;
+   int total_vnics;
+   struct nfp_net_hw *hw;
+
+   /* Read the number of vNIC's created for the PF */
+   total_vnics = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
+   if (err != 0 || total_vnics <= 0 || total_vnics > 8) {
+   PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong 
value");
+   return -ENODEV;
+   }
+
+   for (i = 0; i < total_vnics; i++) {
+   struct rte_eth_dev *eth_dev;
+   char port_name[RTE_ETH_NAME_MAX_LEN];
+   snprintf(port_name, sizeof(port_name), "%s_port%d",
+   pci_dev->device.name, i);
+
+   PMD_INIT_LOG(DEBUG, "Secondary attaching to port %s", 
port_name);
+   eth_dev = rte_eth_dev_attach_secondary(port_name);
+   if (eth_dev == NULL) {
+   PMD_INIT_LOG(ERR, "Secondary process attach to port %s 
failed", port_name);
+   ret = -ENODEV;
+   break;
+   }
+
+   eth_dev->process_private = cpp;
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
+   if (nfp_net_ethdev_ops_mount(hw, eth_dev))
+   return -EINVAL;
+
+   rte_eth_dev_probing_finish(eth_dev);
+   }
+
+   return ret;
+}
+
 /*
  * When attaching to the NFP4000/6000 PF on a secondary process there
  * is no need to initialise the PF again. Only minimal work is required
@@ -1002,12 +1045,10 @@
 static int
 nfp_pf_secondary_init(struct rte_pci_device *pci_dev)
 {
-   int i;
int err = 0;
int ret = 0;
-   int total_ports;
struct nfp_cpp *cpp;
-   struct nfp_net_hw *hw;
+   enum nfp_app_fw_id app_fw_id;
struct nfp_rtsym_table *sym_tbl;
 
if (pci_dev == NULL)
@@ -1041,37 +1082,26 @@
return -EIO;
}
 
-   total_ports = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
-   if (err != 0 || total_ports <= 0 || total_ports > 8) {
-   PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong 
value");
-   ret = -ENODEV;
+   /* Read the app ID of the firmware loaded */
+   app_fw_id = nfp_rtsym_read_le(sym_tbl, "_pf0_net_app_id", &err);
+   if (err != 0) {
+   PMD_INIT_LOG(ERR, "Couldn't read app_fw_id from fw");
goto sym_tbl_cleanup;
}
 
-   for (i = 0; i < total_ports; i++) {
-   struct rte_eth_dev *eth_dev;
-   char port_name[RTE_ETH_NAME_MAX_LEN];
-
-   snprintf(port_name, sizeof(port_name), "%s_port%d",
-pci_dev->device.name, i);
-
-   PMD_DRV_LOG(DEBUG, "Secondary attaching to port %s", port_name);
-   eth_dev = rte_eth_dev_attach_secondary(port_name);
-   if (eth_dev == NULL) {
-   RTE_LOG(ERR, EAL,
-   "secondary process attach failed, ethdev 
doesn't exist");
-   ret = -ENODEV;
-   break;
+   switch (app_fw_id) {
+   case NFP_APP_FW_CORE_NIC:
+   PMD_INIT_LOG(INFO, "Initializing coreNIC");
+   ret = nfp_secondary_init_app_fw_nic(pci_dev, sym_tbl, cpp);
+   if (ret != 0) {
+   PMD_INIT_LOG(ERR, "Could not initialize coreNIC!");
+   goto sym_tbl_cleanup;
}
-
-   hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
-
-   if (nfp_net_ethdev_ops_mount(hw, eth_dev))
-   return -EINVAL;
-
-   eth_dev->process_private = cpp;
-
-   rte_eth_dev_probing_finish(eth_dev);
+   break;
+   default:
+   PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
+   ret = -EINVAL;
+   goto sym_tbl_cleanup;
}
 
if (ret != 0)
-- 
1.8.3.1



[PATCH v8 04/12] net/nfp: add initial flower firmware support

2022-09-08 Thread Chaoyong He
Adds the basic probing infrastructure to support the flower firmware.

Adds the basic infrastructure needed by the flower firmware to operate.
The firmware requires threads to service both the PF vNIC and the ctrl
vNIC. The PF is responsible for handling any fallback traffic and the
ctrl vNIC is used to communicate various control messages to and
from the smartNIC. rte_services are used to facilitate this logic.

Adds the cpp service, used for some user tools.
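
A rough sketch of how the rte_services mentioned above could be wired up for
the ctrl vNIC poll loop. This is illustrative only; the callback and helper
names (flower_ctrl_service_cb, flower_register_ctrl_service) are placeholders,
not the identifiers used in this series.

#include <string.h>
#include <stdio.h>
#include <rte_service.h>
#include <rte_service_component.h>

#include "nfp_flower.h"   /* struct nfp_app_fw_flower, from this patch */

/* Service callback: poll the ctrl vNIC once per invocation. */
static int32_t
flower_ctrl_service_cb(void *arg)
{
	struct nfp_app_fw_flower *app_fw_flower = arg;

	/* The real ctrl vNIC Rx poll routine is added later in the series. */
	(void)app_fw_flower;
	return 0;
}

static int
flower_register_ctrl_service(struct nfp_app_fw_flower *app_fw_flower)
{
	struct rte_service_spec spec;
	uint32_t service_id;
	int ret;

	memset(&spec, 0, sizeof(spec));
	snprintf(spec.name, sizeof(spec.name), "flower_ctrl_service");
	spec.callback = flower_ctrl_service_cb;
	spec.callback_userdata = app_fw_flower;

	/* Register the component and mark it runnable so a service
	 * lcore can execute the callback.
	 */
	ret = rte_service_component_register(&spec, &service_id);
	if (ret != 0)
		return ret;

	return rte_service_component_runstate_set(service_id, 1);
}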

Signed-off-by: Chaoyong He 
Signed-off-by: Heinrich Kuhn 
Reviewed-by: Niklas Söderlund 
---
 doc/guides/rel_notes/release_22_11.rst |  3 ++
 drivers/net/nfp/flower/nfp_flower.c| 59 +++
 drivers/net/nfp/flower/nfp_flower.h| 18 +++
 drivers/net/nfp/meson.build|  1 +
 drivers/net/nfp/nfp_common.h   |  1 +
 drivers/net/nfp/nfp_cpp_bridge.c   | 87 +-
 drivers/net/nfp/nfp_cpp_bridge.h   |  6 ++-
 drivers/net/nfp/nfp_ethdev.c   | 28 ++-
 8 files changed, 187 insertions(+), 16 deletions(-)
 create mode 100644 drivers/net/nfp/flower/nfp_flower.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower.h

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index f601617..bb170e3 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -55,6 +55,9 @@ New Features
  Also, make sure to start the actual text at the margin.
  ===
 
+   * **Updated Netronome nfp driver.**
+ Added the support of flower firmware.
+ Added the flower service infrastructure.
 
 Removed Items
 -
diff --git a/drivers/net/nfp/flower/nfp_flower.c 
b/drivers/net/nfp/flower/nfp_flower.c
new file mode 100644
index 000..e0506bd
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../nfp_common.h"
+#include "../nfp_logs.h"
+#include "../nfp_ctrl.h"
+#include "../nfp_cpp_bridge.h"
+#include "nfp_flower.h"
+
+int
+nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev)
+{
+   int ret;
+   unsigned int numa_node;
+   struct nfp_net_hw *pf_hw;
+   struct nfp_app_fw_flower *app_fw_flower;
+
+   numa_node = rte_socket_id();
+
+   /* Allocate memory for the Flower app */
+   app_fw_flower = rte_zmalloc_socket("nfp_app_fw_flower", 
sizeof(*app_fw_flower),
+   RTE_CACHE_LINE_SIZE, numa_node);
+   if (app_fw_flower == NULL) {
+   ret = -ENOMEM;
+   goto done;
+   }
+
+   pf_dev->app_fw_priv = app_fw_flower;
+
+   /* Allocate memory for the PF AND ctrl vNIC here (hence the * 2) */
+   pf_hw = rte_zmalloc_socket("nfp_pf_vnic", 2 * sizeof(struct 
nfp_net_adapter),
+   RTE_CACHE_LINE_SIZE, numa_node);
+   if (pf_hw == NULL) {
+   ret = -ENOMEM;
+   goto app_cleanup;
+   }
+
+   return 0;
+
+app_cleanup:
+   rte_free(app_fw_flower);
+done:
+   return ret;
+}
+
+int
+nfp_secondary_init_app_fw_flower(__rte_unused struct nfp_cpp *cpp)
+{
+   PMD_INIT_LOG(ERR, "Flower firmware not supported");
+   return -ENOTSUP;
+}
diff --git a/drivers/net/nfp/flower/nfp_flower.h 
b/drivers/net/nfp/flower/nfp_flower.h
new file mode 100644
index 000..51e05e8
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#ifndef _NFP_FLOWER_H_
+#define _NFP_FLOWER_H_
+
+/* The flower application's private structure */
+struct nfp_app_fw_flower {
+   /* Pointer to the PF vNIC */
+   struct nfp_net_hw *pf_hw;
+};
+
+int nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev);
+int nfp_secondary_init_app_fw_flower(struct nfp_cpp *cpp);
+
+#endif /* _NFP_FLOWER_H_ */
diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
index 810f02a..7ae3115 100644
--- a/drivers/net/nfp/meson.build
+++ b/drivers/net/nfp/meson.build
@@ -6,6 +6,7 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
 reason = 'only supported on 64-bit Linux'
 endif
 sources = files(
+'flower/nfp_flower.c',
 'nfpcore/nfp_cpp_pcie_ops.c',
 'nfpcore/nfp_nsp.c',
 'nfpcore/nfp_cppcore.c',
diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
index 6af8481..cefe717 100644
--- a/drivers/net/nfp/nfp_common.h
+++ b/drivers/net/nfp/nfp_common.h
@@ -114,6 +114,7 @@
 /* Firmware application ID's */
 enum nfp_app_fw_id {
NFP_APP_FW_CORE_NIC   = 0x1,
+   NFP_APP_FW_FLOWER_NIC = 0x3,
 };
 
 /* nfp_qcp_ptr - Read or Write Pointer of a queue */
diff --git a/drivers/net/nfp/nfp_cpp_bridge.c b/drivers/net/nfp/nfp_cpp_bridge.c
index 092

[PATCH v8 05/12] net/nfp: add flower PF setup logic

2022-09-08 Thread Chaoyong He
Adds the vNIC initialization logic for the flower PF vNIC. The flower
firmware exposes this vNIC for the purposes of fallback traffic in the
switchdev use-case.

Adds minimal dev_ops for this PF device. Because the device is being
exposed externally to DPDK, it should also be configured using DPDK
helpers like rte_eth_dev_configure(). For these helpers to work, the flower
logic needs to implement a minimal set of dev_ops.
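
As a hedged illustration of the generic ethdev flow the commit message refers
to (not code from this patch; the port id and queue counts are arbitrary), an
application-side setup could look like this:

#include <rte_ethdev.h>

/* Configure the exposed PF vNIC through the standard ethdev helpers;
 * this works only because the vNIC registers a minimal set of dev_ops.
 */
static int
setup_pf_vnic(uint16_t port_id)
{
	struct rte_eth_conf conf = {0};
	struct rte_eth_dev_info info;
	int ret;

	/* Served by the .dev_infos_get callback added in this patch */
	ret = rte_eth_dev_info_get(port_id, &info);
	if (ret != 0)
		return ret;

	/* One Rx/Tx queue pair is enough for fallback traffic */
	return rte_eth_dev_configure(port_id, 1, 1, &conf);
}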

Signed-off-by: Chaoyong He 
Reviewed-by: Niklas Söderlund 
---
 drivers/net/nfp/flower/nfp_flower.c| 398 -
 drivers/net/nfp/flower/nfp_flower.h|   6 +
 drivers/net/nfp/flower/nfp_flower_ovs_compat.h |  37 +++
 drivers/net/nfp/nfp_common.h   |   3 +
 4 files changed, 441 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ovs_compat.h

diff --git a/drivers/net/nfp/flower/nfp_flower.c 
b/drivers/net/nfp/flower/nfp_flower.c
index e0506bd..ad0dff1 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -13,7 +13,353 @@
 #include "../nfp_logs.h"
 #include "../nfp_ctrl.h"
 #include "../nfp_cpp_bridge.h"
+#include "../nfp_rxtx.h"
+#include "../nfpcore/nfp_mip.h"
+#include "../nfpcore/nfp_rtsym.h"
+#include "../nfpcore/nfp_nsp.h"
 #include "nfp_flower.h"
+#include "nfp_flower_ovs_compat.h"
+
+#define MAX_PKT_BURST 32
+#define MEMPOOL_CACHE_SIZE 512
+#define DEFAULT_FLBUF_SIZE 9216
+
+#define PF_VNIC_NB_DESC 1024
+
+static const struct rte_eth_rxconf rx_conf = {
+   .rx_free_thresh = DEFAULT_RX_FREE_THRESH,
+   .rx_drop_en = 1,
+};
+
+static const struct rte_eth_txconf tx_conf = {
+   .tx_thresh = {
+   .pthresh  = DEFAULT_TX_PTHRESH,
+   .hthresh = DEFAULT_TX_HTHRESH,
+   .wthresh = DEFAULT_TX_WTHRESH,
+   },
+   .tx_free_thresh = DEFAULT_TX_FREE_THRESH,
+};
+
+static const struct eth_dev_ops nfp_flower_pf_vnic_ops = {
+   .dev_infos_get  = nfp_net_infos_get,
+};
+
+static void
+nfp_flower_pf_mp_init(__rte_unused struct rte_mempool *mp,
+   __rte_unused void *opaque_arg,
+   void *packet,
+   __rte_unused unsigned int i)
+{
+   struct dp_packet *pkt = packet;
+   pkt->source  = DPBUF_DPDK;
+   pkt->l2_pad_size = 0;
+   pkt->l2_5_ofs= UINT16_MAX;
+   pkt->l3_ofs  = UINT16_MAX;
+   pkt->l4_ofs  = UINT16_MAX;
+   pkt->packet_type = 0; /* PT_ETH */
+}
+
+static struct rte_mempool *
+nfp_flower_pf_mp_create(void)
+{
+   uint32_t nb_mbufs;
+   uint32_t pkt_size;
+   unsigned int numa_node;
+   uint32_t aligned_mbuf_size;
+   uint32_t mbuf_priv_data_len;
+   struct rte_mempool *pktmbuf_pool;
+   uint32_t n_rxd = PF_VNIC_NB_DESC;
+   uint32_t n_txd = PF_VNIC_NB_DESC;
+
+   nb_mbufs = RTE_MAX(n_rxd + n_txd + MAX_PKT_BURST + MEMPOOL_CACHE_SIZE, 
81920U);
+
+   /*
+* The size of the mbuf's private area (i.e. area that holds OvS'
+* dp_packet data)
+*/
+   mbuf_priv_data_len = sizeof(struct dp_packet) - sizeof(struct rte_mbuf);
+   /* The size of the entire dp_packet. */
+   pkt_size = sizeof(struct dp_packet) + RTE_MBUF_DEFAULT_BUF_SIZE;
+   /* mbuf size, rounded up to cacheline size. */
+   aligned_mbuf_size = RTE_CACHE_LINE_ROUNDUP(pkt_size);
+   mbuf_priv_data_len += (aligned_mbuf_size - pkt_size);
+
+   numa_node = rte_socket_id();
+   pktmbuf_pool = rte_pktmbuf_pool_create("flower_pf_mbuf_pool", nb_mbufs,
+   MEMPOOL_CACHE_SIZE, mbuf_priv_data_len,
+   RTE_MBUF_DEFAULT_BUF_SIZE, numa_node);
+   if (pktmbuf_pool == NULL) {
+   PMD_INIT_LOG(ERR, "Cannot init pf vnic mbuf pool");
+   return NULL;
+   }
+
+   rte_mempool_obj_iter(pktmbuf_pool, nfp_flower_pf_mp_init, NULL);
+
+   return pktmbuf_pool;
+}
+
+static int
+nfp_flower_init_vnic_common(struct nfp_net_hw *hw, const char *vnic_type)
+{
+   uint32_t start_q;
+   uint64_t rx_bar_off;
+   uint64_t tx_bar_off;
+   const int stride = 4;
+   struct nfp_pf_dev *pf_dev;
+   struct rte_pci_device *pci_dev;
+
+   pf_dev = hw->pf_dev;
+   pci_dev = hw->pf_dev->pci_dev;
+
+   /* NFP can not handle DMA addresses requiring more than 40 bits */
+   if (rte_mem_check_dma_mask(40)) {
+   PMD_INIT_LOG(ERR, "Device %s can not be used: restricted dma 
mask to 40 bits!\n",
+   pci_dev->device.name);
+   return -ENODEV;
+   };
+
+   hw->device_id = pci_dev->id.device_id;
+   hw->vendor_id = pci_dev->id.vendor_id;
+   hw->subsystem_device_id = pci_dev->id.subsystem_device_id;
+   hw->subsystem_vendor_id = pci_dev->id.subsystem_vendor_id;
+
+   PMD_INIT_LOG(DEBUG, "%s vNIC ctrl bar: %p", vnic_type, hw->ctrl_bar);
+
+   /* Read the number of available rx/tx queues from hardware */
+   hw->max_rx_queues = nn

[PATCH v8 06/12] net/nfp: add flower PF related routines

2022-09-08 Thread Chaoyong He
Adds the start/stop/close routines of the flower PF vNIC.
The configure and link_update routines are reused.

Signed-off-by: Chaoyong He 
Reviewed-by: Niklas Söderlund 
---
 drivers/net/nfp/flower/nfp_flower.c | 185 
 1 file changed, 185 insertions(+)

diff --git a/drivers/net/nfp/flower/nfp_flower.c 
b/drivers/net/nfp/flower/nfp_flower.c
index ad0dff1..e0c0ab3 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -40,8 +41,168 @@
.tx_free_thresh = DEFAULT_TX_FREE_THRESH,
 };
 
+static int
+nfp_flower_pf_start(struct rte_eth_dev *dev)
+{
+   int ret;
+   uint32_t new_ctrl;
+   uint32_t update = 0;
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   /* Disabling queues just in case... */
+   nfp_net_disable_queues(dev);
+
+   /* Enabling the required queues in the device */
+   nfp_net_enable_queues(dev);
+
+   new_ctrl = nfp_check_offloads(dev);
+
+   /* Writing configuration parameters in the device */
+   nfp_net_params_setup(hw);
+
+   nfp_net_rss_config_default(dev);
+   update |= NFP_NET_CFG_UPDATE_RSS;
+
+   if (hw->cap & NFP_NET_CFG_CTRL_RSS2)
+   new_ctrl |= NFP_NET_CFG_CTRL_RSS2;
+   else
+   new_ctrl |= NFP_NET_CFG_CTRL_RSS;
+
+   /* Enable device */
+   new_ctrl |= NFP_NET_CFG_CTRL_ENABLE;
+
+   update |= NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING;
+
+   if (hw->cap & NFP_NET_CFG_CTRL_RINGCFG)
+   new_ctrl |= NFP_NET_CFG_CTRL_RINGCFG;
+
+   nn_cfg_writel(hw, NFP_NET_CFG_CTRL, new_ctrl);
+
+   /* If an error when reconfig we avoid to change hw state */
+   ret = nfp_net_reconfig(hw, new_ctrl, update);
+   if (ret != 0) {
+   PMD_INIT_LOG(ERR, "Failed to reconfig PF vnic");
+   return -EIO;
+   }
+
+   hw->ctrl = new_ctrl;
+
+   /* Setup the freelist ring */
+   ret = nfp_net_rx_freelist_setup(dev);
+   if (ret != 0) {
+   PMD_INIT_LOG(ERR, "Error with flower PF vNIC freelist setup");
+   return -EIO;
+   }
+
+   return 0;
+}
+
+/* Stop device: disable rx and tx functions to allow for reconfiguring. */
+static int
+nfp_flower_pf_stop(struct rte_eth_dev *dev)
+{
+   uint16_t i;
+   struct nfp_net_hw *hw;
+   struct nfp_net_txq *this_tx_q;
+   struct nfp_net_rxq *this_rx_q;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   nfp_net_disable_queues(dev);
+
+   /* Clear queues */
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   this_tx_q = (struct nfp_net_txq *)dev->data->tx_queues[i];
+   nfp_net_reset_tx_queue(this_tx_q);
+   }
+
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   this_rx_q = (struct nfp_net_rxq *)dev->data->rx_queues[i];
+   nfp_net_reset_rx_queue(this_rx_q);
+   }
+
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+   /* Configure the physical port down */
+   nfp_eth_set_configured(hw->cpp, hw->nfp_idx, 0);
+   else
+   nfp_eth_set_configured(dev->process_private, hw->nfp_idx, 0);
+
+   return 0;
+}
+
+/* Reset and stop device. The device can not be restarted. */
+static int
+nfp_flower_pf_close(struct rte_eth_dev *dev)
+{
+   uint16_t i;
+   struct nfp_net_hw *hw;
+   struct nfp_pf_dev *pf_dev;
+   struct nfp_net_txq *this_tx_q;
+   struct nfp_net_rxq *this_rx_q;
+   struct rte_pci_device *pci_dev;
+   struct nfp_app_fw_flower *app_fw_flower;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return 0;
+
+   pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   pci_dev = RTE_ETH_DEV_TO_PCI(dev);
+   app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(pf_dev->app_fw_priv);
+
+   /*
+* We assume that the DPDK application is stopping all the
+* threads/queues before calling the device close function.
+*/
+   nfp_net_disable_queues(dev);
+
+   /* Clear queues */
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   this_tx_q = (struct nfp_net_txq *)dev->data->tx_queues[i];
+   nfp_net_reset_tx_queue(this_tx_q);
+   }
+
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   this_rx_q = (struct nfp_net_rxq *)dev->data->rx_queues[i];
+   nfp_net_reset_rx_queue(this_rx_q);
+   }
+
+   /* Cancel possible impending LSC work here before releasing the port*/
+   rte_eal_alarm_cancel(nfp_net_dev_interrupt_delayed_handler, (void 
*)dev);
+
+   nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);
+
+   rte_eth_dev_release_port(dev);
+
+   /* Now it is safe to free all PF resource

[PATCH v8 07/12] net/nfp: add flower ctrl VNIC related logics

2022-09-08 Thread Chaoyong He
Adds the setup/start logic for the ctrl vNIC. This vNIC is used by
the PMD and flower firmware as a communication channel between driver
and firmware. In the case of OVS it is also used to communicate flow
statistics from hardware to the driver.

An rte_eth device is not exposed to DPDK for this vNIC as it is strictly
used internally by the flower logic. Rx and Tx logic will be added later for
this vNIC.

Because of the addition of the ctrl vNIC, a new PCItoCPPBar is needed;
the related logic is modified accordingly.

Signed-off-by: Chaoyong He 
Reviewed-by: Niklas Söderlund 
---
 drivers/net/nfp/flower/nfp_flower.c| 264 +
 drivers/net/nfp/flower/nfp_flower.h|   6 +
 drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c |  31 ++--
 3 files changed, 289 insertions(+), 12 deletions(-)

diff --git a/drivers/net/nfp/flower/nfp_flower.c 
b/drivers/net/nfp/flower/nfp_flower.c
index e0c0ab3..4d07416 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -26,6 +26,7 @@
 #define DEFAULT_FLBUF_SIZE 9216
 
 #define PF_VNIC_NB_DESC 1024
+#define CTRL_VNIC_NB_DESC 512
 
 static const struct rte_eth_rxconf rx_conf = {
.rx_free_thresh = DEFAULT_RX_FREE_THRESH,
@@ -205,6 +206,11 @@
.dev_close  = nfp_flower_pf_close,
 };
 
+static const struct eth_dev_ops nfp_flower_ctrl_vnic_ops = {
+   .dev_infos_get  = nfp_net_infos_get,
+   .dev_configure  = nfp_net_configure,
+};
+
 static void
 nfp_flower_pf_mp_init(__rte_unused struct rte_mempool *mp,
__rte_unused void *opaque_arg,
@@ -489,6 +495,176 @@
 }
 
 static int
+nfp_flower_init_ctrl_vnic(struct nfp_net_hw *hw)
+{
+   uint32_t i;
+   int ret = 0;
+   uint16_t n_txq;
+   uint16_t n_rxq;
+   uint16_t port_id;
+   unsigned int numa_node;
+   struct rte_mempool *mp;
+   struct nfp_pf_dev *pf_dev;
+   struct rte_eth_dev *eth_dev;
+   struct nfp_app_fw_flower *app_fw_flower;
+
+   static const struct rte_eth_conf port_conf = {
+   .rxmode = {
+   .mq_mode = RTE_ETH_MQ_RX_NONE,
+   },
+   .txmode = {
+   .mq_mode = RTE_ETH_MQ_TX_NONE,
+   },
+   };
+
+   /* Set up some pointers here for ease of use */
+   pf_dev = hw->pf_dev;
+   app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(pf_dev->app_fw_priv);
+
+   ret = nfp_flower_init_vnic_common(hw, "ctrl_vnic");
+   if (ret != 0)
+   goto done;
+
+   /* Allocate memory for the eth_dev of the vNIC */
+   hw->eth_dev = rte_eth_dev_allocate("nfp_ctrl_vnic");
+   if (hw->eth_dev == NULL) {
+   ret = -ENOMEM;
+   goto done;
+   }
+
+   /* Grab the pointer to the newly created rte_eth_dev here */
+   eth_dev = hw->eth_dev;
+
+   numa_node = rte_socket_id();
+
+   /* Fill in some of the eth_dev fields */
+   eth_dev->device = &pf_dev->pci_dev->device;
+   eth_dev->data->dev_private = hw;
+
+   /* Create a mbuf pool for the ctrl vNIC */
+   app_fw_flower->ctrl_pktmbuf_pool = 
rte_pktmbuf_pool_create("ctrl_mbuf_pool",
+   4 * CTRL_VNIC_NB_DESC, 64, 0, 9216, numa_node);
+   if (app_fw_flower->ctrl_pktmbuf_pool == NULL) {
+   PMD_INIT_LOG(ERR, "create mbuf pool for ctrl vnic failed");
+   ret = -ENOMEM;
+   goto port_release;
+   }
+
+   mp = app_fw_flower->ctrl_pktmbuf_pool;
+
+   eth_dev->dev_ops = &nfp_flower_ctrl_vnic_ops;
+   rte_eth_dev_probing_finish(eth_dev);
+
+   /* Configure the ctrl vNIC device */
+   n_rxq = hw->max_rx_queues;
+   n_txq = hw->max_tx_queues;
+   port_id = hw->eth_dev->data->port_id;
+
+   ret = rte_eth_dev_configure(port_id, n_rxq, n_txq, &port_conf);
+   if (ret != 0) {
+   PMD_INIT_LOG(ERR, "Could not configure ctrl vNIC device %d", 
ret);
+   goto mempool_cleanup;
+   }
+
+   /* Set up the Rx queues */
+   for (i = 0; i < n_rxq; i++) {
+   ret = nfp_net_rx_queue_setup(eth_dev, i, CTRL_VNIC_NB_DESC, 
numa_node,
+   &rx_conf, mp);
+   if (ret) {
+   PMD_INIT_LOG(ERR, "Configure ctrl vNIC Rx queue %d 
failed", i);
+   goto rx_queue_cleanup;
+   }
+   }
+
+   /* Set up the Tx queues */
+   for (i = 0; i < n_txq; i++) {
+   ret = nfp_net_nfd3_tx_queue_setup(eth_dev, i, 
CTRL_VNIC_NB_DESC, numa_node,
+   &tx_conf);
+   if (ret) {
+   PMD_INIT_LOG(ERR, "Configure ctrl vNIC Tx queue %d 
failed", i);
+   goto tx_queue_cleanup;
+   }
+   }
+
+   return 0;
+
+tx_queue_cleanup:
+   for (i = 0; i < n_txq; i++)
+   nfp_net_tx_queue_release(eth_dev, i);
+rx_queue_cleanup:
+   for (i = 0; i < n_rxq; i++)
+  

[PATCH v8 08/12] net/nfp: move common rxtx function for flower use

2022-09-08 Thread Chaoyong He
Move some common Rx and Tx logic to the rxtx header file so that
it can be re-used by the flower Tx and Rx logic.

Signed-off-by: Chaoyong He 
Signed-off-by: Heinrich Kuhn 
Reviewed-by: Niklas Söderlund 
---
 drivers/net/nfp/nfp_rxtx.c | 32 +---
 drivers/net/nfp/nfp_rxtx.h | 31 +++
 2 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/drivers/net/nfp/nfp_rxtx.c b/drivers/net/nfp/nfp_rxtx.c
index 8429b44..8d63a7b 100644
--- a/drivers/net/nfp/nfp_rxtx.c
+++ b/drivers/net/nfp/nfp_rxtx.c
@@ -116,12 +116,6 @@
return count;
 }
 
-static inline void
-nfp_net_mbuf_alloc_failed(struct nfp_net_rxq *rxq)
-{
-   rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed++;
-}
-
 /*
  * nfp_net_set_hash - Set mbuf hash data
  *
@@ -583,7 +577,7 @@
  * @txq: TX queue to work with
  * Returns number of descriptors freed
  */
-static int
+int
 nfp_net_tx_free_bufs(struct nfp_net_txq *txq)
 {
uint32_t qcp_rd_p;
@@ -774,30 +768,6 @@
return 0;
 }
 
-/* Leaving always free descriptors for avoiding wrapping confusion */
-static inline
-uint32_t nfp_net_nfd3_free_tx_desc(struct nfp_net_txq *txq)
-{
-   if (txq->wr_p >= txq->rd_p)
-   return txq->tx_count - (txq->wr_p - txq->rd_p) - 8;
-   else
-   return txq->rd_p - txq->wr_p - 8;
-}
-
-/*
- * nfp_net_txq_full - Check if the TX queue free descriptors
- * is below tx_free_threshold
- *
- * @txq: TX queue to check
- *
- * This function uses the host copy* of read/write pointers
- */
-static inline
-uint32_t nfp_net_nfd3_txq_full(struct nfp_net_txq *txq)
-{
-   return (nfp_net_nfd3_free_tx_desc(txq) < txq->tx_free_thresh);
-}
-
 /* nfp_net_tx_tso - Set TX descriptor for TSO */
 static inline void
 nfp_net_nfd3_tx_tso(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc *txd,
diff --git a/drivers/net/nfp/nfp_rxtx.h b/drivers/net/nfp/nfp_rxtx.h
index 5c005d7..a30171f 100644
--- a/drivers/net/nfp/nfp_rxtx.h
+++ b/drivers/net/nfp/nfp_rxtx.h
@@ -330,6 +330,36 @@ struct nfp_net_rxq {
int rx_qcidx;
 } __rte_aligned(64);
 
+static inline void
+nfp_net_mbuf_alloc_failed(struct nfp_net_rxq *rxq)
+{
+   rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed++;
+}
+
+/* Leaving always free descriptors for avoiding wrapping confusion */
+static inline uint32_t
+nfp_net_nfd3_free_tx_desc(struct nfp_net_txq *txq)
+{
+   if (txq->wr_p >= txq->rd_p)
+   return txq->tx_count - (txq->wr_p - txq->rd_p) - 8;
+   else
+   return txq->rd_p - txq->wr_p - 8;
+}
+
+/*
+ * nfp_net_nfd3_txq_full - Check if the TX queue free descriptors
+ * is below tx_free_threshold
+ *
+ * @txq: TX queue to check
+ *
+ * This function uses the host copy* of read/write pointers
+ */
+static inline uint32_t
+nfp_net_nfd3_txq_full(struct nfp_net_txq *txq)
+{
+   return (nfp_net_nfd3_free_tx_desc(txq) < txq->tx_free_thresh);
+}
+
 int nfp_net_rx_freelist_setup(struct rte_eth_dev *dev);
 uint32_t nfp_net_rx_queue_count(void *rx_queue);
 uint16_t nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
@@ -355,6 +385,7 @@ int nfp_net_nfdk_tx_queue_setup(struct rte_eth_dev *dev,
 uint16_t nfp_net_nfdk_xmit_pkts(void *tx_queue,
struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);
+int nfp_net_tx_free_bufs(struct nfp_net_txq *txq);
 
 #endif /* _NFP_RXTX_H_ */
 /*
-- 
1.8.3.1



[PATCH v8 09/12] net/nfp: add flower ctrl VNIC rxtx logic

2022-09-08 Thread Chaoyong He
Adds an Rx and a Tx function for the ctrl VNIC. The logic is mostly
identical to the normal Rx and Tx functionality of the NFP PMD.

Make use of the ctrl VNIC service logic to service the ctrl VNIC Rx
path.

Signed-off-by: Chaoyong He 
Signed-off-by: Heinrich Kuhn 
Reviewed-by: Niklas Söderlund 
---
 doc/guides/rel_notes/release_22_11.rst   |   1 +
 drivers/net/nfp/flower/nfp_flower.c  |   3 +
 drivers/net/nfp/flower/nfp_flower.h  |  14 ++
 drivers/net/nfp/flower/nfp_flower_ctrl.c | 250 +++
 drivers/net/nfp/flower/nfp_flower_ctrl.h |  13 ++
 drivers/net/nfp/meson.build  |   1 +
 6 files changed, 282 insertions(+)
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.h

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index bb170e3..0fe928a 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -58,6 +58,7 @@ New Features
* **Updated Netronome nfp driver.**
  Added the support of flower firmware.
  Added the flower service infrastructure.
+ Added the control message interactive channels between PMD and firmware.
 
 Removed Items
 -
diff --git a/drivers/net/nfp/flower/nfp_flower.c 
b/drivers/net/nfp/flower/nfp_flower.c
index 4d07416..b873eba 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -20,6 +20,7 @@
 #include "../nfpcore/nfp_nsp.h"
 #include "nfp_flower.h"
 #include "nfp_flower_ovs_compat.h"
+#include "nfp_flower_ctrl.h"
 
 #define MAX_PKT_BURST 32
 #define MEMPOOL_CACHE_SIZE 512
@@ -708,6 +709,8 @@
goto ctrl_vnic_cleanup;
}
 
+   nfp_flower_ctrl_vnic_poll(app_fw_flower);
+
return 0;
 
 ctrl_vnic_cleanup:
diff --git a/drivers/net/nfp/flower/nfp_flower.h 
b/drivers/net/nfp/flower/nfp_flower.h
index e18703e..e96d3b2 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -6,6 +6,14 @@
 #ifndef _NFP_FLOWER_H_
 #define _NFP_FLOWER_H_
 
+/*
+ * Flower fallback and ctrl path always adds and removes
+ * 8 bytes of prepended data. Tx descriptors must point
+ * to the correct packet data offset after metadata has
+ * been added
+ */
+#define FLOWER_PKT_DATA_OFFSET 8
+
 /* The flower application's private structure */
 struct nfp_app_fw_flower {
/* Pointer to a mempool for the PF vNIC */
@@ -22,6 +30,12 @@ struct nfp_app_fw_flower {
 
/* the eth table as reported by firmware */
struct nfp_eth_table *nfp_eth_table;
+
+   /* Ctrl vNIC Rx counter */
+   uint64_t ctrl_vnic_rx_count;
+
+   /* Ctrl vNIC Tx counter */
+   uint64_t ctrl_vnic_tx_count;
 };
 
 int nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev);
diff --git a/drivers/net/nfp/flower/nfp_flower_ctrl.c 
b/drivers/net/nfp/flower/nfp_flower_ctrl.c
new file mode 100644
index 000..0c04f75
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_ctrl.c
@@ -0,0 +1,250 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include 
+#include 
+
+#include "../nfp_common.h"
+#include "../nfp_logs.h"
+#include "../nfp_ctrl.h"
+#include "../nfp_rxtx.h"
+#include "nfp_flower.h"
+#include "nfp_flower_ctrl.h"
+
+#define MAX_PKT_BURST 32
+
+static uint16_t
+nfp_flower_ctrl_vnic_recv(void *rx_queue,
+   struct rte_mbuf **rx_pkts,
+   uint16_t nb_pkts)
+{
+   uint64_t dma_addr;
+   uint16_t avail = 0;
+   struct rte_mbuf *mb;
+   uint16_t nb_hold = 0;
+   struct nfp_net_hw *hw;
+   struct nfp_net_rxq *rxq;
+   struct rte_mbuf *new_mb;
+   struct nfp_net_rx_buff *rxb;
+   struct nfp_net_rx_desc *rxds;
+
+   rxq = rx_queue;
+   if (unlikely(rxq == NULL)) {
+   /*
+* DPDK just checks the queue is lower than max queues
+* enabled. But the queue needs to be configured
+*/
+   PMD_RX_LOG(ERR, "RX Bad queue");
+   return 0;
+   }
+
+   hw = rxq->hw;
+   while (avail < nb_pkts) {
+   rxb = &rxq->rxbufs[rxq->rd_p];
+   if (unlikely(rxb == NULL)) {
+   PMD_RX_LOG(ERR, "rxb does not exist!");
+   break;
+   }
+
+   rxds = &rxq->rxds[rxq->rd_p];
+   if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) == 0)
+   break;
+
+   /*
+* Memory barrier to ensure that we won't do other
+* reads before the DD bit.
+*/
+   rte_rmb();
+
+   /*
+* We got a packet. Let's alloc a new mbuf for refilling the
+* free descriptor ring as soon as possible
+*/
+   new_mb = rte_pktmbuf_alloc(rxq->mem_pool);
+   if (unlikely(new_mb == NULL)) {

[PATCH v8 10/12] net/nfp: add flower representor framework

2022-09-08 Thread Chaoyong He
Adds the framework to support flower representors. The number of VF
representors is parsed from the command line. For physical port
representors, the current logic aims to create a representor for
each physical port present on the hardware.

An eth_dev is created for each physical port and VF, and the flower
firmware requires a MAC repr cmsg to be sent to it
with info about the number of physical ports configured.

Reify messages are sent to hardware for each physical port representor.
An rte_ring is also created per representor so that traffic can be
pushed to and pulled from this interface.

To bring the real device represented by a flower representor port up or down,
a port mod message is used to convey that info to the firmware. This
message will be used in the dev_ops callbacks of flower representors.

Each cmsg generated by the driver is prepended with a cmsg header.
This commit also adds the logic to fill in the header of cmsgs.

Also add the Rx and Tx paths for flower representors. For Rx, packets are
dequeued from the representor ring and passed to the eth_dev. For Tx,
the first queue of the PF vNIC is used. Metadata about the representor
is added before the packet is sent down to the firmware.
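
As a rough illustration of the ring-based Rx path described above (this is
not the driver's actual code; the context struct and field names here are
hypothetical), a representor's receive burst essentially just drains the
per-representor rte_ring filled by the PF poll loop:

#include <rte_ring.h>
#include <rte_mbuf.h>

/* Hypothetical per-representor context holding the fallback-traffic ring. */
struct repr_ctx {
    struct rte_ring *ring;    /* filled by the multiplexing PF Rx poll */
};

/* Sketch of a representor rx_burst: pull the mbufs the PF poll pushed in. */
static uint16_t
repr_rx_burst(struct repr_ctx *repr, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
    return (uint16_t)rte_ring_dequeue_burst(repr->ring,
            (void **)rx_pkts, nb_pkts, NULL);
}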

Signed-off-by: Chaoyong He 
Reviewed-by: Niklas Söderlund 
---
 doc/guides/rel_notes/release_22_11.rst  |   1 +
 drivers/net/nfp/flower/nfp_flower.c |   7 +
 drivers/net/nfp/flower/nfp_flower.h |  18 +
 drivers/net/nfp/flower/nfp_flower_cmsg.c| 186 +++
 drivers/net/nfp/flower/nfp_flower_cmsg.h| 173 ++
 drivers/net/nfp/flower/nfp_flower_representor.c | 664 
 drivers/net/nfp/flower/nfp_flower_representor.h |  39 ++
 drivers/net/nfp/meson.build |   2 +
 8 files changed, 1090 insertions(+)
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.h

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 0fe928a..f2e4649 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -59,6 +59,7 @@ New Features
  Added the support of flower firmware.
  Added the flower service infrastructure.
  Added the control message interactive channels between PMD and firmware.
+ Added the support of representor port.
 
 Removed Items
 -
diff --git a/drivers/net/nfp/flower/nfp_flower.c 
b/drivers/net/nfp/flower/nfp_flower.c
index b873eba..3f1b7d1 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -21,6 +21,7 @@
 #include "nfp_flower.h"
 #include "nfp_flower_ovs_compat.h"
 #include "nfp_flower_ctrl.h"
+#include "nfp_flower_representor.h"
 
 #define MAX_PKT_BURST 32
 #define MEMPOOL_CACHE_SIZE 512
@@ -709,6 +710,12 @@
goto ctrl_vnic_cleanup;
}
 
+   ret = nfp_flower_repr_create(app_fw_flower);
+   if (ret != 0) {
+   PMD_INIT_LOG(ERR, "Could not create representor port");
+   goto ctrl_vnic_cleanup;
+   }
+
nfp_flower_ctrl_vnic_poll(app_fw_flower);
 
return 0;
diff --git a/drivers/net/nfp/flower/nfp_flower.h 
b/drivers/net/nfp/flower/nfp_flower.h
index e96d3b2..5d43656 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -14,8 +14,20 @@
  */
 #define FLOWER_PKT_DATA_OFFSET 8
 
+#define MAX_FLOWER_PHYPORTS 8
+#define MAX_FLOWER_VFS 64
+
 /* The flower application's private structure */
 struct nfp_app_fw_flower {
+   /* switch domain for this app */
+   uint16_t switch_domain_id;
+
+   /* Number of VF representors */
+   uint8_t num_vf_reprs;
+
+   /* Number of phyport representors */
+   uint8_t num_phyport_reprs;
+
/* Pointer to a mempool for the PF vNIC */
struct rte_mempool *pf_pktmbuf_pool;
 
@@ -36,6 +48,12 @@ struct nfp_app_fw_flower {
 
/* Ctrl vNIC Tx counter */
uint64_t ctrl_vnic_tx_count;
+
+   /* Array of phyport representors */
+   struct nfp_flower_representor *phy_reprs[MAX_FLOWER_PHYPORTS];
+
+   /* Array of VF representors */
+   struct nfp_flower_representor *vf_reprs[MAX_FLOWER_VFS];
 };
 
 int nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev);
diff --git a/drivers/net/nfp/flower/nfp_flower_cmsg.c 
b/drivers/net/nfp/flower/nfp_flower_cmsg.c
new file mode 100644
index 000..d16d373
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_cmsg.c
@@ -0,0 +1,186 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include "../nfpcore/nfp_nsp.h"
+#include "../nfp_logs.h"
+#include "../nfp_common.h"
+#include "nfp_flower.h"
+#include "nfp_flower_cmsg.h"
+#include "nfp_flower_ctrl.h"
+#include "nfp_flower_representor.h"
+
+static void

[PATCH v8 11/12] net/nfp: move rxtx function to header file

2022-09-08 Thread Chaoyong He
Flower uses the same Rx and Tx checksum logic as the normal PMD.
Expose it so that flower can make use of it.

Signed-off-by: Chaoyong He 
Signed-off-by: Heinrich Kuhn 
Reviewed-by: Niklas Söderlund 
---
 drivers/net/nfp/nfp_common.c|  2 +-
 drivers/net/nfp/nfp_ethdev.c|  2 +-
 drivers/net/nfp/nfp_ethdev_vf.c |  2 +-
 drivers/net/nfp/nfp_rxtx.c  | 91 +
 drivers/net/nfp/nfp_rxtx.h  | 90 
 5 files changed, 94 insertions(+), 93 deletions(-)

diff --git a/drivers/net/nfp/nfp_common.c b/drivers/net/nfp/nfp_common.c
index 0e55f0c..e86929c 100644
--- a/drivers/net/nfp/nfp_common.c
+++ b/drivers/net/nfp/nfp_common.c
@@ -38,9 +38,9 @@
 #include "nfpcore/nfp_nsp.h"
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 #include "nfp_cpp_bridge.h"
 
 #include 
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index fea41b7..13fed4b 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -33,9 +33,9 @@
 #include "nfpcore/nfp_nsp.h"
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 #include "nfp_cpp_bridge.h"
 
 #include "flower/nfp_flower.h"
diff --git a/drivers/net/nfp/nfp_ethdev_vf.c b/drivers/net/nfp/nfp_ethdev_vf.c
index d304d78..ceaf618 100644
--- a/drivers/net/nfp/nfp_ethdev_vf.c
+++ b/drivers/net/nfp/nfp_ethdev_vf.c
@@ -19,9 +19,9 @@
 #include "nfpcore/nfp_rtsym.h"
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 
 static void
 nfp_netvf_read_mac(struct nfp_net_hw *hw)
diff --git a/drivers/net/nfp/nfp_rxtx.c b/drivers/net/nfp/nfp_rxtx.c
index 8d63a7b..95403a3 100644
--- a/drivers/net/nfp/nfp_rxtx.c
+++ b/drivers/net/nfp/nfp_rxtx.c
@@ -17,9 +17,9 @@
 #include 
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 #include "nfpcore/nfp_mip.h"
 #include "nfpcore/nfp_rtsym.h"
 #include "nfpcore/nfp-common/nfp_platform.h"
@@ -208,34 +208,6 @@
}
 }
 
-/* nfp_net_rx_cksum - set mbuf checksum flags based on RX descriptor flags */
-static inline void
-nfp_net_rx_cksum(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd,
-struct rte_mbuf *mb)
-{
-   struct nfp_net_hw *hw = rxq->hw;
-
-   if (!(hw->ctrl & NFP_NET_CFG_CTRL_RXCSUM))
-   return;
-
-   /* If IPv4 and IP checksum error, fail */
-   if (unlikely((rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM) &&
-   !(rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM_OK)))
-   mb->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_BAD;
-   else
-   mb->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_GOOD;
-
-   /* If neither UDP nor TCP return */
-   if (!(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) &&
-   !(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM))
-   return;
-
-   if (likely(rxd->rxd.flags & PCIE_DESC_RX_L4_CSUM_OK))
-   mb->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_GOOD;
-   else
-   mb->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_BAD;
-}
-
 /*
  * RX path design:
  *
@@ -768,67 +740,6 @@
return 0;
 }
 
-/* nfp_net_tx_tso - Set TX descriptor for TSO */
-static inline void
-nfp_net_nfd3_tx_tso(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc *txd,
-  struct rte_mbuf *mb)
-{
-   uint64_t ol_flags;
-   struct nfp_net_hw *hw = txq->hw;
-
-   if (!(hw->cap & NFP_NET_CFG_CTRL_LSO_ANY))
-   goto clean_txd;
-
-   ol_flags = mb->ol_flags;
-
-   if (!(ol_flags & RTE_MBUF_F_TX_TCP_SEG))
-   goto clean_txd;
-
-   txd->l3_offset = mb->l2_len;
-   txd->l4_offset = mb->l2_len + mb->l3_len;
-   txd->lso_hdrlen = mb->l2_len + mb->l3_len + mb->l4_len;
-   txd->mss = rte_cpu_to_le_16(mb->tso_segsz);
-   txd->flags = PCIE_DESC_TX_LSO;
-   return;
-
-clean_txd:
-   txd->flags = 0;
-   txd->l3_offset = 0;
-   txd->l4_offset = 0;
-   txd->lso_hdrlen = 0;
-   txd->mss = 0;
-}
-
-/* nfp_net_tx_cksum - Set TX CSUM offload flags in TX descriptor */
-static inline void
-nfp_net_nfd3_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc 
*txd,
-struct rte_mbuf *mb)
-{
-   uint64_t ol_flags;
-   struct nfp_net_hw *hw = txq->hw;
-
-   if (!(hw->cap & NFP_NET_CFG_CTRL_TXCSUM))
-   return;
-
-   ol_flags = mb->ol_flags;
-
-   /* IPv6 does not need checksum */
-   if (ol_flags & RTE_MBUF_F_TX_IP_CKSUM)
-   txd->flags |= PCIE_DESC_TX_IP4_CSUM;
-
-   switch (ol_flags & RTE_MBUF_F_TX_L4_MASK) {
-   case RTE_MBUF_F_TX_UDP_CKSUM:
-   txd->flags |= PCIE_DESC_TX_UDP_CSUM;
-   break;
-   case RTE_MBUF_F_TX_TCP_CKSUM:
-   txd->flags |= PCIE_DESC_TX_TCP_CSUM;
-   break;
-   }
-
-

[PATCH v8 12/12] net/nfp: add flower PF rxtx logic

2022-09-08 Thread Chaoyong He
Implements the flower Rx logic. Fallback packets are multiplexed to the
correct representor port based on the prepended metadata. The Rx poll
is set to run on the existing service infrastructure.

For Tx, the existing NFP Tx logic is duplicated to keep the two Tx paths
distinct. Flower fallback also adds 8 bytes of metadata to the start of
the packet, which has to be adjusted for in the Tx descriptor.
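
This is not the driver's implementation (the exact metadata layout is
internal to the driver and firmware), but as a rough sketch, prepending
8 bytes of metadata with standard mbuf helpers looks roughly like the
following; the constant and function names here are ours:

#include <rte_mbuf.h>
#include <rte_byteorder.h>

#define FALLBACK_META_LEN 8    /* mirrors FLOWER_PKT_DATA_OFFSET in this series */

/* Prepend fallback metadata in front of the packet data; the Tx descriptor
 * then has to account for this extra offset/length. */
static int
prepend_fallback_meta(struct rte_mbuf *m, uint32_t portid)
{
    uint32_t *meta = (uint32_t *)rte_pktmbuf_prepend(m, FALLBACK_META_LEN);

    if (meta == NULL)
        return -1;    /* not enough headroom */

    meta[0] = rte_cpu_to_be_32(portid);
    meta[1] = 0;      /* rest of the 8 bytes left zeroed in this sketch */
    return 0;
}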

Signed-off-by: Chaoyong He 
Reviewed-by: Niklas Söderlund 
---
 drivers/net/nfp/flower/nfp_flower.c | 414 
 1 file changed, 414 insertions(+)

diff --git a/drivers/net/nfp/flower/nfp_flower.c 
b/drivers/net/nfp/flower/nfp_flower.c
index 3f1b7d1..b6824b9 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -22,6 +22,7 @@
 #include "nfp_flower_ovs_compat.h"
 #include "nfp_flower_ctrl.h"
 #include "nfp_flower_representor.h"
+#include "nfp_flower_cmsg.h"
 
 #define MAX_PKT_BURST 32
 #define MEMPOOL_CACHE_SIZE 512
@@ -213,6 +214,383 @@
.dev_configure  = nfp_net_configure,
 };
 
+static inline void
+nfp_flower_parse_metadata(struct nfp_net_rxq *rxq,
+   struct nfp_net_rx_desc *rxd,
+   struct rte_mbuf *mbuf,
+   uint32_t *portid)
+{
+   uint32_t meta_info;
+   uint8_t *meta_offset;
+   struct nfp_net_hw *hw;
+
+   hw = rxq->hw;
+   if (!((hw->ctrl & NFP_NET_CFG_CTRL_RSS) ||
+   (hw->ctrl & NFP_NET_CFG_CTRL_RSS2)))
+   return;
+
+   meta_offset = rte_pktmbuf_mtod(mbuf, uint8_t *);
+   meta_offset -= NFP_DESC_META_LEN(rxd);
+   meta_info = rte_be_to_cpu_32(*(uint32_t *)meta_offset);
+   meta_offset += 4;
+
+   while (meta_info) {
+   switch (meta_info & NFP_NET_META_FIELD_MASK) {
+   /* Expect flower firmware to only send packets with META_PORTID 
*/
+   case NFP_NET_META_PORTID:
+   *portid = rte_be_to_cpu_32(*(uint32_t *)meta_offset);
+   meta_offset += 4;
+   meta_info >>= NFP_NET_META_FIELD_SIZE;
+   break;
+   default:
+   /* Unsupported metadata can be a performance issue */
+   return;
+   }
+   }
+}
+
+static inline struct nfp_flower_representor *
+nfp_flower_get_repr(struct nfp_net_hw *hw,
+   uint32_t port_id)
+{
+   uint8_t port;
+   struct nfp_app_fw_flower *app_fw_flower;
+
+   /* Obtain handle to app_fw_flower here */
+   app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(hw->pf_dev->app_fw_priv);
+
+   switch (NFP_FLOWER_CMSG_PORT_TYPE(port_id)) {
+   case NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT:
+   port =  NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(port_id);
+   return app_fw_flower->phy_reprs[port];
+   case NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT:
+   port = NFP_FLOWER_CMSG_PORT_VNIC(port_id);
+   return app_fw_flower->vf_reprs[port];
+   default:
+   break;
+   }
+
+   return NULL;
+}
+
+static uint16_t
+nfp_flower_pf_recv_pkts(void *rx_queue,
+   struct rte_mbuf **rx_pkts,
+   uint16_t nb_pkts)
+{
+   /*
+* We need different counters for packets given to the caller
+* and packets sent to representors
+*/
+   int avail = 0;
+   int avail_multiplexed = 0;
+   uint64_t dma_addr;
+   uint32_t meta_portid;
+   uint16_t nb_hold = 0;
+   struct rte_mbuf *mb;
+   struct nfp_net_hw *hw;
+   struct rte_mbuf *new_mb;
+   struct nfp_net_rxq *rxq;
+   struct nfp_net_rx_buff *rxb;
+   struct nfp_net_rx_desc *rxds;
+   struct nfp_flower_representor *repr;
+
+   rxq = rx_queue;
+   if (unlikely(rxq == NULL)) {
+   /*
+* DPDK just checks the queue is lower than max queues
+* enabled. But the queue needs to be configured
+*/
+   RTE_LOG_DP(ERR, PMD, "RX Bad queue\n");
+   return 0;
+   }
+
+   hw = rxq->hw;
+
+   /*
+* This is tunable as we could allow to receive more packets than
+* requested if most are multiplexed.
+*/
+   while (avail + avail_multiplexed < nb_pkts) {
+   rxb = &rxq->rxbufs[rxq->rd_p];
+   if (unlikely(rxb == NULL)) {
+   RTE_LOG_DP(ERR, PMD, "rxb does not exist!\n");
+   break;
+   }
+
+   rxds = &rxq->rxds[rxq->rd_p];
+   if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) == 0)
+   break;
+
+   /*
+* Memory barrier to ensure that we won't do other
+* reads before the DD bit.
+*/
+   rte_rmb();
+
+   /*
+* We got a packet. Let's alloc a new mbuf for refilling the
+* free descrip

Re: [PATCH v4 4/9] dts: add ssh pexpect library

2022-09-08 Thread Bruce Richardson
On Fri, Jul 29, 2022 at 10:55:45AM +, Juraj Linkeš wrote:
> The library uses the pexpect python library and implements connection to
> a node and two ways to interact with the node:
> 1. Send a string with specified prompt which will be matched after
>the string has been sent to the node.
> 2. Send a command to be executed. No prompt is specified here.
> 
> Signed-off-by: Owen Hilyard 
> Signed-off-by: Juraj Linkeš 

Comments inline below.
Thanks,
/Bruce

> ---
>  dts/framework/exception.py   |  57 ++
>  dts/framework/ssh_pexpect.py | 205 +++
>  dts/framework/utils.py   |  12 ++
>  3 files changed, 274 insertions(+)
>  create mode 100644 dts/framework/exception.py
>  create mode 100644 dts/framework/ssh_pexpect.py
>  create mode 100644 dts/framework/utils.py
> 
> diff --git a/dts/framework/exception.py b/dts/framework/exception.py
> new file mode 100644
> index 00..35e81a4d99
> --- /dev/null
> +++ b/dts/framework/exception.py
> @@ -0,0 +1,57 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2010-2014 Intel Corporation
> +# Copyright(c) 2022 PANTHEON.tech s.r.o.
> +# Copyright(c) 2022 University of New Hampshire
> +#
> +
> +"""
> +User-defined exceptions used across the framework.
> +"""
> +
> +
> +class TimeoutException(Exception):
> +"""
> +Command execution timeout.
> +"""
> +
> +command: str
> +output: str
> +
> +def __init__(self, command: str, output: str):
> +self.command = command
> +self.output = output
> +
> +def __str__(self) -> str:
> +return f"TIMEOUT on {self.command}"
> +
> +def get_output(self) -> str:
> +return self.output
> +
> +
> +class SSHConnectionException(Exception):
> +"""
> +SSH connection error.
> +"""
> +
> +host: str
> +
> +def __init__(self, host: str):
> +self.host = host
> +
> +def __str__(self) -> str:
> +return f"Error trying to connect with {self.host}"
> +
> +
> +class SSHSessionDeadException(Exception):
> +"""
> +SSH session is not alive.
> +It can no longer be used.
> +"""
> +
> +host: str
> +
> +def __init__(self, host: str):
> +self.host = host
> +
> +def __str__(self) -> str:
> +return f"SSH session with {self.host} has died"
> diff --git a/dts/framework/ssh_pexpect.py b/dts/framework/ssh_pexpect.py
> new file mode 100644
> index 00..e8f64515c0
> --- /dev/null
> +++ b/dts/framework/ssh_pexpect.py
> @@ -0,0 +1,205 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2010-2014 Intel Corporation
> +# Copyright(c) 2022 PANTHEON.tech s.r.o.
> +# Copyright(c) 2022 University of New Hampshire
> +#
> +
> +import time
> +from typing import Optional
> +
> +from pexpect import pxssh
> +
> +from .exception import SSHConnectionException, SSHSessionDeadException, 
> TimeoutException
> +from .logger import DTSLOG
> +from .utils import GREEN, RED
> +
> +"""
> +The module handles ssh sessions to TG and SUT.
> +It implements the send_expect function to send commands and get output data.
> +"""
> +
> +
> +class SSHPexpect:
> +username: str
> +password: str
> +node: str
> +logger: DTSLOG
> +magic_prompt: str
> +
> +def __init__(
> +self,
> +node: str,
> +username: str,
> +password: Optional[str],
> +logger: DTSLOG,
> +):
> +self.magic_prompt = "MAGIC PROMPT"
> +self.logger = logger
> +
> +self.node = node
> +self.username = username
> +self.password = password or ""
> +self.logger.info(f"ssh {self.username}@{self.node}")
> +
> +self._connect_host()
> +
> +def _connect_host(self) -> None:
> +"""
> +Create connection to assigned node.
> +"""
> +retry_times = 10
> +try:
> +if ":" in self.node:

Should this check, and the splitting below that assigns to self.ip and
self.port, not be done at init when the node is passed in? At the very least,
the splitting should probably be done outside the loop, rather than re-doing
the split into ip and port 10 times.

> +while retry_times:
> +self.ip = self.node.split(":")[0]
> +self.port = int(self.node.split(":")[1])
> +self.session = pxssh.pxssh(encoding="utf-8")
> +try:
> +self.session.login(
> +self.ip,
> +self.username,
> +self.password,
> +original_prompt="[$#>]",
> +port=self.port,
> +login_timeout=20,
> +password_regex=r"(?i)(?:password:)|(?:passphrase 
> for key)|(?i)(password for .+:)",
> +)
> +except Exception as e:
> +print(e)
> +tim

Running DPDK application with memory Sanitizer

2022-09-08 Thread bhargav M.P
Hi,
I am trying to run a DPDK (20.11) application with the memory sanitizer. I have
taken the patch from the upstream branch:
https://github.com/DPDK/dpdk/commit/6cc51b1293ceac4a77d4bf7ac91a8bbd59e1f78c
and made a build with -fsanitize=address. The gcc version is gcc 6.3.0.
But the application doesn't abort when an invalid memory access is made.
I am starting the application with the command below:

./dpdk-app -l 2-4 -a :00:05.0 -a :00:06.0 -a :00:07.0 -a
:00:08.0 --socket-mem=3072

With this, an invalid memory access from the application does not cause an abort.

For example, doing the following in the application does not trigger an abort:

char *p = rte_zmalloc(NULL, 9, 0);

rte_free(p);

*p = 'a';

Am I missing something here?

Is DPDK ASan supposed to be used only from the 21.11 release onwards? I would really
appreciate your help on this, as we are trying to debug a possible
memory corruption in a DPDK application and thought the DPDK memory sanitizer
support would be of great help.

Thanks for the help,
Bhargav


[PATCH] net/axgbe: remove freeing buffer in scattered rx

2022-09-08 Thread Bhagyada Modali
Removed freeing of the mbuf in scattered Rx, as it should not be freed in the Rx path.

Fixes: 965b3127d425 ("net/axgbe: support scattered Rx")
Cc: sta...@dpdk.org

Signed-off-by: Bhagyada Modali 
---
 drivers/net/axgbe/axgbe_rxtx.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/axgbe/axgbe_rxtx.c b/drivers/net/axgbe/axgbe_rxtx.c
index 8b43e8160b..d4224992ee 100644
--- a/drivers/net/axgbe/axgbe_rxtx.c
+++ b/drivers/net/axgbe/axgbe_rxtx.c
@@ -458,14 +458,11 @@ uint16_t eth_axgbe_recv_scattered_pkts(void *rx_queue,
memset((void *)(&desc->read.desc2), 0, 8);
AXGMAC_SET_BITS_LE(desc->read.desc3, RX_NORMAL_DESC3, OWN, 1);
 
-   if (!eop) {
-   rte_pktmbuf_free(mbuf);
+   if (!eop)
goto next_desc;
-   }
 
first_seg->pkt_len = pkt_len;
rxq->bytes += pkt_len;
-   mbuf->next = NULL;
 
first_seg->port = rxq->port_id;
if (rxq->pdata->rx_csum_enable) {
-- 
2.25.1



SWX table question

2022-09-08 Thread Morten Brørup
Thank you for an interesting presentation today, Cristian.

It made me aware of the existence of the SWX Table, which could be used for 
connection tracking.

I hadn't noticed the library before, because it is documented (and named) as 
part of the SWX/Pipeline Framework, and we don't use the SWX/Pipeline 
Framework, so I have ignored anything related to that framework.

My question is: Is the SWX Table library tied to the SWX/Pipeline Framework, or 
can a DPDK application use it independently of that framework, like any other 
DPDK library? If so, are there any limitations or restrictions - e.g. is it 
lockless, and can it be accessed by multiple cores simultaneously?


Med venlig hilsen / Kind regards,
-Morten Brørup



RE: [PATCH v2 1/3] net/axgbe: reset the end of packet in scattered rx

2022-09-08 Thread Namburu, Chandu-babu

For the series,
Acked-by: Chandubabu Namburu 

-Original Message-
From: Modali, Bhagyada  
Sent: Thursday, September 8, 2022 9:01 AM
To: Namburu, Chandu-babu ; Yigit, Ferruh 
Cc: dev@dpdk.org; sta...@dpdk.org; Modali, Bhagyada 
Subject: [PATCH v2 1/3] net/axgbe: reset the end of packet in scattered rx

Reset the eop in the failure scenario and also after the last segment.
Removed the explicit packet length update, as it is done during chaining.

Fixes: 965b3127d425 ("net/axgbe: support scattered Rx")
Cc: sta...@dpdk.org

Signed-off-by: Bhagyada Modali 
---
 drivers/net/axgbe/axgbe_rxtx.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/axgbe/axgbe_rxtx.c b/drivers/net/axgbe/axgbe_rxtx.c 
index 8b43e8160b..e1488483bc 100644
--- a/drivers/net/axgbe/axgbe_rxtx.c
+++ b/drivers/net/axgbe/axgbe_rxtx.c
@@ -346,10 +346,11 @@ uint16_t eth_axgbe_recv_scattered_pkts(void *rx_queue,
uint32_t error_status = 0;
uint16_t idx, pidx, data_len = 0, pkt_len = 0;
uint64_t offloads;
+   bool eop = 0;
 
idx = AXGBE_GET_DESC_IDX(rxq, rxq->cur);
+
while (nb_rx < nb_pkts) {
-   bool eop = 0;
 next_desc:
idx = AXGBE_GET_DESC_IDX(rxq, rxq->cur);
 
@@ -416,9 +417,12 @@ uint16_t eth_axgbe_recv_scattered_pkts(void *rx_queue,
mbuf->pkt_len = data_len;
 
if (first_seg != NULL) {
-   if (rte_pktmbuf_chain(first_seg, mbuf) != 0)
+   if (rte_pktmbuf_chain(first_seg, mbuf) != 0) {
rte_mempool_put(rxq->mb_pool,
first_seg);
+   eop = 0;
+   break;
+   }
} else {
first_seg = mbuf;
}
@@ -462,8 +466,8 @@ uint16_t eth_axgbe_recv_scattered_pkts(void *rx_queue,
rte_pktmbuf_free(mbuf);
goto next_desc;
}
+   eop = 0;
 
-   first_seg->pkt_len = pkt_len;
rxq->bytes += pkt_len;
mbuf->next = NULL;
 
--
2.25.1


RE: [PATCH] net/axgbe: remove freeing buffer in scattered rx

2022-09-08 Thread Namburu, Chandu-babu

Acked-by: Chandubabu Namburu 

-Original Message-
From: Modali, Bhagyada  
Sent: Thursday, September 8, 2022 6:12 PM
To: Namburu, Chandu-babu ; Yigit, Ferruh 
Cc: dev@dpdk.org; sta...@dpdk.org; Modali, Bhagyada 
Subject: [PATCH] net/axgbe: remove freeing buffer in scattered rx

Removed freeing of the mbuf in scattered Rx, as it should not be freed in the Rx path.

Fixes: 965b3127d425 ("net/axgbe: support scattered Rx")
Cc: sta...@dpdk.org

Signed-off-by: Bhagyada Modali 
---
 drivers/net/axgbe/axgbe_rxtx.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/axgbe/axgbe_rxtx.c b/drivers/net/axgbe/axgbe_rxtx.c 
index 8b43e8160b..d4224992ee 100644
--- a/drivers/net/axgbe/axgbe_rxtx.c
+++ b/drivers/net/axgbe/axgbe_rxtx.c
@@ -458,14 +458,11 @@ uint16_t eth_axgbe_recv_scattered_pkts(void *rx_queue,
memset((void *)(&desc->read.desc2), 0, 8);
AXGMAC_SET_BITS_LE(desc->read.desc3, RX_NORMAL_DESC3, OWN, 1);
 
-   if (!eop) {
-   rte_pktmbuf_free(mbuf);
+   if (!eop)
goto next_desc;
-   }
 
first_seg->pkt_len = pkt_len;
rxq->bytes += pkt_len;
-   mbuf->next = NULL;
 
first_seg->port = rxq->port_id;
if (rxq->pdata->rx_csum_enable) {
--
2.25.1


Re: [PATCH] net/axgbe: remove freeing buffer in scattered rx

2022-09-08 Thread Ferruh Yigit

On 9/8/2022 2:57 PM, Namburu, Chandu-babu wrote:


-Original Message-
From: Modali, Bhagyada 
Sent: Thursday, September 8, 2022 6:12 PM
To: Namburu, Chandu-babu ; Yigit, Ferruh 
Cc: dev@dpdk.org; sta...@dpdk.org; Modali, Bhagyada 
Subject: [PATCH] net/axgbe: remove freeing buffer in scattered rx

Removed freeing of the mbuf in scattered Rx, as it should not be freed in the Rx path.

Fixes: 965b3127d425 ("net/axgbe: support scattered Rx")
Cc: sta...@dpdk.org

Signed-off-by: Bhagyada Modali 

Acked-by: Chandubabu Namburu 


Applied to dpdk-next-net/main, thanks.


RE: SWX table question

2022-09-08 Thread Dumitrescu, Cristian
Hi Morten,

> -Original Message-
> From: Morten Brørup 
> Sent: Thursday, September 8, 2022 2:49 PM
> To: Dumitrescu, Cristian 
> Cc: dev@dpdk.org
> Subject: SWX table question
> 
> Thank you for an interesting presentation today, Cristian.
> 
> It made me aware of the existence of the SWX Table, which could be used
> for connection tracking.
> 
> I hadn't noticed the library before, because it is documented (and named) as
> part of the SWX/Pipeline Framework, and we don't use the SWX/Pipeline
> Framework, so I have ignored anything related to that framework.
> 
> My question is: Is the SWX Table library tied to the SWX/Pipeline Framework,
> or can a DPDK application use it independently of that framework, like any
> other DPDK library? If so, are there any limitations or restrictions - e.g. 
> is it
> lockless, and can it be accessed by multiple cores simultaneously?
> 

Yes, the table library can absolutely be used directly by the applications.

No, the table library is not lockless. For a given table object, the create, 
add, delete and free operations need to be serialized. For the lookup 
operation it depends on the table type: for some table types, the lookup 
operation is read-only, hence multiple lookup operations launched from 
different threads can overlap; for some other table types, notably the 
learner table (used to implement the P4 PNA add-on-miss tables), the lookup 
operation also modifies the data structures (e.g. the key timestamp), hence 
this operation also needs serial access to the table.
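
For illustration, one simple way for an application to serialize access to a
single table object (e.g. a learner table used for connection tracking) is a
per-table lock. This is just a sketch; the wrapper below is not part of the
SWX API:

#include <rte_spinlock.h>

/* Hypothetical wrapper: one lock per table object. */
struct locked_table {
    void *table;           /* table object created through the table ops */
    rte_spinlock_t lock;
};

static inline void
locked_table_init(struct locked_table *lt, void *table)
{
    lt->table = table;
    rte_spinlock_init(&lt->lock);
}

/* Take the lock around any operation that may modify the table:
 * add/delete/free, and also lookup in the learner table case. */
static inline void
locked_table_enter(struct locked_table *lt)
{
    rte_spinlock_lock(&lt->lock);
}

static inline void
locked_table_exit(struct locked_table *lt)
{
    rte_spinlock_unlock(&lt->lock);
}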

> 
> Med venlig hilsen / Kind regards,
> -Morten Brørup

Regards,
Cristian


Re: [PATCH v8 05/12] net/nfp: add flower PF setup logic

2022-09-08 Thread Ferruh Yigit

On 9/8/2022 9:44 AM, Chaoyong He wrote:

Adds the vNIC initialization logic for the flower PF vNIC. The flower
firmware exposes this vNIC for the purposes of fallback traffic in the
switchdev use-case.

Adds minimal dev_ops for this PF device. Because the device is being
exposed externally to DPDK it should also be configured using DPDK
helpers like rte_eth_configure(). For these helpers to work the flower
logic needs to implements a minimal set of dev_ops.

Signed-off-by: Chaoyong He 
Reviewed-by: Niklas Söderlund 
---
  drivers/net/nfp/flower/nfp_flower.c| 398 -
  drivers/net/nfp/flower/nfp_flower.h|   6 +
  drivers/net/nfp/flower/nfp_flower_ovs_compat.h |  37 +++


Can you please detail why an OVS-specific header is required? Having
application-specific code in a PMD can be a sign of a design issue, which
is why I would like you to explain in more detail what it does.


<...>


+static int
+nfp_flower_init_pf_vnic(struct nfp_net_hw *hw)
+{
+   int ret;
+   uint16_t i;
+   uint16_t n_txq;
+   uint16_t n_rxq;
+   uint16_t port_id;
+   unsigned int numa_node;
+   struct rte_mempool *mp;
+   struct nfp_pf_dev *pf_dev;
+   struct rte_eth_dev *eth_dev;
+   struct nfp_app_fw_flower *app_fw_flower;
+
+   static const struct rte_eth_conf port_conf = {
+   .rxmode = {
+   .mq_mode  = RTE_ETH_MQ_RX_RSS,
+   .offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM,
+   },
+   .txmode = {
+   .mq_mode = RTE_ETH_MQ_TX_NONE,
+   },
+   };
+
+   /* Set up some pointers here for ease of use */
+   pf_dev = hw->pf_dev;
+   app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(pf_dev->app_fw_priv);
+
+   /*
+* Perform the "common" part of setting up a flower vNIC.
+* Mostly reading configuration from hardware.
+*/
+   ret = nfp_flower_init_vnic_common(hw, "pf_vnic");
+   if (ret != 0)
+   goto done;
+
+   hw->eth_dev = rte_eth_dev_allocate("nfp_pf_vnic");
+   if (hw->eth_dev == NULL) {
+   ret = -ENOMEM;
+   goto done;
+   }
+
+   /* Grab the pointer to the newly created rte_eth_dev here */
+   eth_dev = hw->eth_dev;
+
+   numa_node = rte_socket_id();
+
+   /* Fill in some of the eth_dev fields */
+   eth_dev->device = &pf_dev->pci_dev->device;
+   eth_dev->data->dev_private = hw;
+
+   /* Create a mbuf pool for the PF */
+   app_fw_flower->pf_pktmbuf_pool = nfp_flower_pf_mp_create();
+   if (app_fw_flower->pf_pktmbuf_pool == NULL) {
+   ret = -ENOMEM;
+   goto port_release;
+   }
+
+   mp = app_fw_flower->pf_pktmbuf_pool;
+
+   /* Add Rx/Tx functions */
+   eth_dev->dev_ops = &nfp_flower_pf_vnic_ops;
+
+   /* PF vNIC gets a random MAC */
+   eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", RTE_ETHER_ADDR_LEN, 
0);
+   if (eth_dev->data->mac_addrs == NULL) {
+   ret = -ENOMEM;
+   goto mempool_cleanup;
+   }
+
+   rte_eth_random_addr(eth_dev->data->mac_addrs->addr_bytes);
+   rte_eth_dev_probing_finish(eth_dev);
+
+   /* Configure the PF device now */
+   n_rxq = hw->max_rx_queues;
+   n_txq = hw->max_tx_queues;
+   port_id = hw->eth_dev->data->port_id;
+
+   ret = rte_eth_dev_configure(port_id, n_rxq, n_txq, &port_conf);


I am still not sure about the PMD calling 'rte_eth_dev_configure()'; can you
please give more details on what specific configuration is expected from
that call?




Re: [PATCH v8 01/12] net/nfp: move app specific attributes to own struct

2022-09-08 Thread Ferruh Yigit

On 9/8/2022 9:44 AM, Chaoyong He wrote:

The NFP card can load different firmware applications. Currently
only the CoreNIC application is supported. This commit makes
needed infrastructure changes in order to support other firmware
applications too.

Clearer separation is made between the PF device and any application
specific concepts. The PF struct is now generic regardless of the
application loaded. A new struct is also made for the CoreNIC
application. Future additions to support other applications should
also add an applications specific struct.



What do you think about replacing the use of 'application' in the commit log
with 'application firmware'?


<...>


diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index e9d01f4..bd9cf67 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -39,15 +39,15 @@
  #include "nfp_cpp_bridge.h"
  
  static int

-nfp_net_pf_read_mac(struct nfp_pf_dev *pf_dev, int port)
+nfp_net_pf_read_mac(struct nfp_app_fw_nic *app_hw_nic, int port)


Is it intentional that the struct name is 'nfp_app_fw_nic' but the variable
name is 'app_hw_nic'? Why the app_fw vs app_hw difference?


<...>


@@ -890,27 +937,12 @@
}
  
  	/* Populate the newly created PF device */

+   pf_dev->app_fw_id = app_hw_id;


ditto.


Re: [PATCH v8 04/12] net/nfp: add initial flower firmware support

2022-09-08 Thread Ferruh Yigit

On 9/8/2022 9:44 AM, Chaoyong He wrote:

Adds the basic probing infrastructure to support the flower firmware.

Adds the basic infrastructure needed by the flower firmware to operate.
The firmware requires threads to service both the PF vNIC and the ctrl
vNIC. The PF is responsible for handling any fallback traffic and the
ctrl vNIC is used to communicate various control messages to and
from the smartNIC. rte_services are used to facilitate this logic.

Adds the cpp service, used for some user tools.

Signed-off-by: Chaoyong He 
Signed-off-by: Heinrich Kuhn 
Reviewed-by: Niklas Söderlund 
---
  doc/guides/rel_notes/release_22_11.rst |  3 ++


Can you please update driver documentation too, 
'doc/guides/nics/nfp.rst', to document these new features?
Please update it gradually in each relevant patch, as you are already 
doing for release notes updates.



  drivers/net/nfp/flower/nfp_flower.c| 59 +++
  drivers/net/nfp/flower/nfp_flower.h| 18 +++
  drivers/net/nfp/meson.build|  1 +
  drivers/net/nfp/nfp_common.h   |  1 +
  drivers/net/nfp/nfp_cpp_bridge.c   | 87 +-
  drivers/net/nfp/nfp_cpp_bridge.h   |  6 ++-
  drivers/net/nfp/nfp_ethdev.c   | 28 ++-
  8 files changed, 187 insertions(+), 16 deletions(-)
  create mode 100644 drivers/net/nfp/flower/nfp_flower.c
  create mode 100644 drivers/net/nfp/flower/nfp_flower.h

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index f601617..bb170e3 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -55,6 +55,9 @@ New Features
   Also, make sure to start the actual text at the margin.
   ===
  
+   * **Updated Netronome nfp driver.**


This is still inside the section comment; you need to fix the indentation to
move it out of the comment.


And please put an empty line after this line.

Please compile and verify the documentation updates.


+ Added the support of flower firmware.
+ Added the flower service infrastructure.


In the output these lines are joined; if the intention is to have them as
bullets, you need to add a '*' prefix, like:

```
 * **Updated Netronome nfp driver.**

 * Added the support of flower firmware.
 * Added the flower service infrastructure.
 * Added the control message interactive channels between PMD and 
firmware.

 * Added the support of representor port.
```


[PATCH] net/axgbe: support segmented Tx

2022-09-08 Thread Bhagyada Modali
Enable segmented Tx support and add jumbo packet transmit capability.
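
For context (not part of this patch): the segmented transmit function added
below is only selected when the application requests multi-segment Tx in its
port configuration, roughly as in this sketch (the port and queue counts are
placeholders):

#include <rte_ethdev.h>

/* Minimal port configuration requesting multi-segment Tx, so that the
 * driver installs axgbe_xmit_pkts_seg() as the transmit function. */
static int
configure_port_multi_seg(uint16_t port_id)
{
    struct rte_eth_conf conf = {
        .txmode = {
            .offloads = RTE_ETH_TX_OFFLOAD_MULTI_SEGS,
        },
    };

    return rte_eth_dev_configure(port_id, 1, 1, &conf);
}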

Signed-off-by: Bhagyada Modali 
---
 drivers/net/axgbe/axgbe_ethdev.c |   1 +
 drivers/net/axgbe/axgbe_ethdev.h |   1 +
 drivers/net/axgbe/axgbe_rxtx.c   | 215 ++-
 drivers/net/axgbe/axgbe_rxtx.h   |   4 +
 4 files changed, 220 insertions(+), 1 deletion(-)

diff --git a/drivers/net/axgbe/axgbe_ethdev.c b/drivers/net/axgbe/axgbe_ethdev.c
index e6822fa711..b071e4e460 100644
--- a/drivers/net/axgbe/axgbe_ethdev.c
+++ b/drivers/net/axgbe/axgbe_ethdev.c
@@ -1228,6 +1228,7 @@ axgbe_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
RTE_ETH_TX_OFFLOAD_VLAN_INSERT |
RTE_ETH_TX_OFFLOAD_QINQ_INSERT |
RTE_ETH_TX_OFFLOAD_IPV4_CKSUM  |
+   RTE_ETH_TX_OFFLOAD_MULTI_SEGS  |
RTE_ETH_TX_OFFLOAD_UDP_CKSUM   |
RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
 
diff --git a/drivers/net/axgbe/axgbe_ethdev.h b/drivers/net/axgbe/axgbe_ethdev.h
index e06d40f9eb..7f19321d88 100644
--- a/drivers/net/axgbe/axgbe_ethdev.h
+++ b/drivers/net/axgbe/axgbe_ethdev.h
@@ -582,6 +582,7 @@ struct axgbe_port {
unsigned int tx_pbl;
unsigned int tx_osp_mode;
unsigned int tx_max_fifo_size;
+   unsigned int multi_segs_tx;
 
/* Rx settings */
unsigned int rx_sf_mode;
diff --git a/drivers/net/axgbe/axgbe_rxtx.c b/drivers/net/axgbe/axgbe_rxtx.c
index 8b43e8160b..c32ebe24bb 100644
--- a/drivers/net/axgbe/axgbe_rxtx.c
+++ b/drivers/net/axgbe/axgbe_rxtx.c
@@ -544,6 +544,7 @@ int axgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, 
uint16_t queue_idx,
unsigned int tsize;
const struct rte_memzone *tz;
uint64_t offloads;
+   struct rte_eth_dev_data *dev_data = dev->data;
 
tx_desc = nb_desc;
pdata = dev->data->dev_private;
@@ -611,7 +612,13 @@ int axgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, 
uint16_t queue_idx,
if (!pdata->tx_queues)
pdata->tx_queues = dev->data->tx_queues;
 
-   if (txq->vector_disable ||
+   if ((dev_data->dev_conf.txmode.offloads &
+   RTE_ETH_TX_OFFLOAD_MULTI_SEGS))
+   pdata->multi_segs_tx = true;
+
+   if (pdata->multi_segs_tx)
+   dev->tx_pkt_burst = &axgbe_xmit_pkts_seg;
+   else if (txq->vector_disable ||
rte_vect_get_max_simd_bitwidth() < RTE_VECT_SIMD_128)
dev->tx_pkt_burst = &axgbe_xmit_pkts;
else
@@ -762,6 +769,29 @@ void axgbe_dev_enable_tx(struct rte_eth_dev *dev)
AXGMAC_IOWRITE_BITS(pdata, MAC_TCR, TE, 1);
 }
 
+/* Free Tx conformed mbufs segments */
+static void
+axgbe_xmit_cleanup_seg(struct axgbe_tx_queue *txq)
+{
+   volatile struct axgbe_tx_desc *desc;
+   uint16_t idx;
+
+   idx = AXGBE_GET_DESC_IDX(txq, txq->dirty);
+   while (txq->cur != txq->dirty) {
+   if (unlikely(idx == txq->nb_desc))
+   idx = 0;
+   desc = &txq->desc[idx];
+   /* Check for ownership */
+   if (AXGMAC_GET_BITS_LE(desc->desc3, TX_NORMAL_DESC3, OWN))
+   return;
+   memset((void *)&desc->desc2, 0, 8);
+   /* Free mbuf */
+   rte_pktmbuf_free_seg(txq->sw_ring[idx]);
+   txq->sw_ring[idx++] = NULL;
+   txq->dirty++;
+   }
+}
+
 /* Free Tx conformed mbufs */
 static void axgbe_xmit_cleanup(struct axgbe_tx_queue *txq)
 {
@@ -854,6 +884,189 @@ static int axgbe_xmit_hw(struct axgbe_tx_queue *txq,
return 0;
 }
 
+/* Tx Descriptor formation for segmented mbuf
+ * Each mbuf will require multiple descriptors
+ */
+
+static int
+axgbe_xmit_hw_seg(struct axgbe_tx_queue *txq,
+   struct rte_mbuf *mbuf)
+{
+   volatile struct axgbe_tx_desc *desc;
+   uint16_t idx;
+   uint64_t mask;
+   int start_index;
+   uint32_t pkt_len = 0;
+   int nb_desc_free;
+   struct rte_mbuf  *tx_pkt;
+
+   nb_desc_free = txq->nb_desc - (txq->cur - txq->dirty);
+
+   if (mbuf->nb_segs > nb_desc_free) {
+   axgbe_xmit_cleanup_seg(txq);
+   nb_desc_free = txq->nb_desc - (txq->cur - txq->dirty);
+   if (unlikely(mbuf->nb_segs > nb_desc_free))
+   return RTE_ETH_TX_DESC_UNAVAIL;
+   }
+
+   idx = AXGBE_GET_DESC_IDX(txq, txq->cur);
+   desc = &txq->desc[idx];
+   /* Saving the start index for setting the OWN bit finally */
+   start_index = idx;
+
+   tx_pkt = mbuf;
+   /* Max_pkt len = 9018 ; need to update it according to Jumbo pkt size */
+   pkt_len = tx_pkt->pkt_len;
+
+   /* Update buffer address  and length */
+   desc->baddr = rte_mbuf_data_iova(tx_pkt);
+   AXGMAC_SET_BITS_LE(desc->desc2, TX_NORMAL_DESC2, HL_B1L,
+  tx_pkt->data_len);
+   /* Total msg length to transmit */
+   AXGMAC_SET

[PATCH 1/3] eventdev/eth_tx: add queue start stop API

2022-09-08 Thread Naga Harish K S V
This patch adds support for starting or stopping a particular queue
that is associated with the adapter.

The start function enables the Tx adapter to start enqueueing
packets to the Tx queue.

The stop function stops the Tx adapter from transmitting any
mbufs to the Tx queue. The Tx adapter also frees any mbufs
that it may have buffered for this queue. All inflight packets
destined for the queue are freed until the queue is started again.
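
A minimal usage sketch of the proposed APIs (signatures as exercised by the
test case later in this series; the adapter, port and queue ids are
placeholders):

#include <rte_event_eth_tx_adapter.h>

#define ADAPTER_ID   0
#define ETH_PORT_ID  0
#define TX_QUEUE_ID  0

static int
pause_and_resume_tx_queue(void)
{
    int ret;

    /* The queue must already have been added to the adapter instance. */
    ret = rte_event_eth_tx_adapter_queue_add(ADAPTER_ID, ETH_PORT_ID,
                                             TX_QUEUE_ID);
    if (ret != 0)
        return ret;

    /* Stop: the adapter stops transmitting and frees any buffered mbufs. */
    ret = rte_event_eth_tx_adapter_queue_stop(ETH_PORT_ID, TX_QUEUE_ID);
    if (ret != 0)
        return ret;

    /* ... e.g. reconfigure or drain the ethdev Tx queue here ... */

    /* Start: the adapter resumes enqueueing packets to the queue. */
    return rte_event_eth_tx_adapter_queue_start(ETH_PORT_ID, TX_QUEUE_ID);
}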

Signed-off-by: Naga Harish K S V 
---
 lib/eventdev/eventdev_pmd.h |  41 +
 lib/eventdev/rte_event_eth_tx_adapter.c | 113 +++-
 lib/eventdev/rte_event_eth_tx_adapter.h |  39 
 lib/eventdev/version.map|   2 +
 4 files changed, 191 insertions(+), 4 deletions(-)

diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h
index f514a37575..a27c0883c6 100644
--- a/lib/eventdev/eventdev_pmd.h
+++ b/lib/eventdev/eventdev_pmd.h
@@ -1294,6 +1294,43 @@ typedef int 
(*eventdev_eth_tx_adapter_stats_reset_t)(uint8_t id,
 typedef int (*eventdev_eth_tx_adapter_instance_get_t)
(uint16_t eth_dev_id, uint16_t tx_queue_id, uint8_t *txa_inst_id);
 
+/**
+ * Start a Tx queue that is assigned to TX adapter instance
+ *
+ * @param id
+ *  Adapter identifier
+ *
+ * @param eth_dev_id
+ *  Port identifier of Ethernet device
+ *
+ * @param tx_queue_id
+ *  Ethernet device TX queue index
+ *
+ * @return
+ *  -  0: Success
+ *  - <0: Error code on failure
+ */
+typedef int (*eventdev_eth_tx_adapter_queue_start)
+   (uint8_t id, uint16_t eth_dev_id, uint16_t tx_queue_id);
+
+/**
+ * Stop a Tx queue that is assigned to TX adapter instance
+ *
+ * @param id
+ *  Adapter identifier
+ *
+ * @param eth_dev_id
+ *  Port identifier of Ethernet device
+ *
+ * @param tx_queue_id
+ *  Ethernet device TX queue index
+ *
+ * @return
+ *  -  0: Success
+ *  - <0: Error code on failure
+ */
+typedef int (*eventdev_eth_tx_adapter_queue_stop)
+   (uint8_t id, uint16_t eth_dev_id, uint16_t tx_queue_id);
 
 /** Event device operations function pointer table */
 struct eventdev_ops {
@@ -1409,6 +1446,10 @@ struct eventdev_ops {
/**< Reset eth Tx adapter statistics */
eventdev_eth_tx_adapter_instance_get_t eth_tx_adapter_instance_get;
/**< Get Tx adapter instance id for Tx queue */
+   eventdev_eth_tx_adapter_queue_start eth_tx_adapter_queue_start;
+   /**< Start Tx queue assigned to Tx adapter instance */
+   eventdev_eth_tx_adapter_queue_stop eth_tx_adapter_queue_stop;
+   /**< Stop Tx queue assigned to Tx adapter instance */
 
eventdev_selftest dev_selftest;
/**< Start eventdev Selftest */
diff --git a/lib/eventdev/rte_event_eth_tx_adapter.c 
b/lib/eventdev/rte_event_eth_tx_adapter.c
index aaef352f5c..d0ed11ade5 100644
--- a/lib/eventdev/rte_event_eth_tx_adapter.c
+++ b/lib/eventdev/rte_event_eth_tx_adapter.c
@@ -47,6 +47,12 @@
 #define txa_dev_instance_get(id) \
txa_evdev(id)->dev_ops->eth_tx_adapter_instance_get
 
+#define txa_dev_queue_start(id) \
+   txa_evdev(id)->dev_ops->eth_tx_adapter_queue_start
+
+#define txa_dev_queue_stop(id) \
+   txa_evdev(id)->dev_ops->eth_tx_adapter_queue_stop
+
 #define RTE_EVENT_ETH_TX_ADAPTER_ID_VALID_OR_ERR_RET(id, retval) \
 do { \
if (!txa_valid_id(id)) { \
@@ -94,6 +100,8 @@ struct txa_retry {
 struct txa_service_queue_info {
/* Queue has been added */
uint8_t added;
+   /* Queue is stopped */
+   bool stopped;
/* Retry callback argument */
struct txa_retry txa_retry;
/* Tx buffer */
@@ -557,7 +565,7 @@ txa_process_event_vector(struct txa_service_data *txa,
port = vec->port;
queue = vec->queue;
tqi = txa_service_queue(txa, port, queue);
-   if (unlikely(tqi == NULL || !tqi->added)) {
+   if (unlikely(tqi == NULL || !tqi->added || tqi->stopped)) {
rte_pktmbuf_free_bulk(mbufs, vec->nb_elem);
rte_mempool_put(rte_mempool_from_obj(vec), vec);
return 0;
@@ -571,7 +579,8 @@ txa_process_event_vector(struct txa_service_data *txa,
port = mbufs[i]->port;
queue = rte_event_eth_tx_adapter_txq_get(mbufs[i]);
tqi = txa_service_queue(txa, port, queue);
-   if (unlikely(tqi == NULL || !tqi->added)) {
+   if (unlikely(tqi == NULL || !tqi->added ||
+tqi->stopped)) {
rte_pktmbuf_free(mbufs[i]);
continue;
}
@@ -608,7 +617,8 @@ txa_service_tx(struct txa_service_data *txa, struct 
rte_event *ev,
queue = rte_event_eth_tx_adapter_txq_get(m);
 
tqi = txa_service_queue(txa, port, queue);
-   if (unlikely(tqi == NULL 

[PATCH 2/3] test/eth_tx: add testcase for queue start stop APIs

2022-09-08 Thread Naga Harish K S V
Added testcase for rte_event_eth_tx_adapter_queue_start()
and rte_event_eth_tx_adapter_queue_stop() APIs.

Signed-off-by: Naga Harish K S V 
---
 app/test/test_event_eth_tx_adapter.c | 86 
 1 file changed, 86 insertions(+)

diff --git a/app/test/test_event_eth_tx_adapter.c 
b/app/test/test_event_eth_tx_adapter.c
index 98debfdd2c..c19a87a86a 100644
--- a/app/test/test_event_eth_tx_adapter.c
+++ b/app/test/test_event_eth_tx_adapter.c
@@ -711,6 +711,90 @@ tx_adapter_instance_get(void)
return TEST_SUCCESS;
 }
 
+static int
+tx_adapter_queue_start_stop(void)
+{
+   int err;
+   uint16_t eth_dev_id;
+   struct rte_eth_dev_info dev_info;
+
+   /* Case 1: Test without adding eth Tx queue */
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Case 2: Test with wrong eth port */
+   eth_dev_id = rte_eth_dev_count_total() + 1;
+   err = rte_event_eth_tx_adapter_queue_start(eth_dev_id,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(eth_dev_id,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Case 3: Test with wrong tx queue */
+   err = rte_eth_dev_info_get(TEST_ETHDEV_ID, &dev_info);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   dev_info.max_tx_queues + 1);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   dev_info.max_tx_queues + 1);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Case 4: Test with right instance, port & rxq */
+   /* Add queue to tx adapter */
+   err = rte_event_eth_tx_adapter_queue_add(TEST_INST_ID,
+TEST_ETHDEV_ID,
+TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   /* Add another queue to tx adapter */
+   err = rte_event_eth_tx_adapter_queue_add(TEST_INST_ID,
+TEST_ETHDEV_ID,
+TEST_ETH_QUEUE_ID + 1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   /* Case 5: Test with right instance, port & wrong rxq */
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 2);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 2);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Delete all queues from the Tx adapter */
+   err = rte_event_eth_tx_adapter_queue_del(TEST_INST_ID,
+TEST_ETHDEV_ID,
+-1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   return TEST_SUCCESS;
+}
+
 static int
 tx_adapter_dynamic_device(void)
 {
@@ -770,6 +854,8 @@ static struct unit_test_suite event_eth_tx_tests = {
tx_adapter_service),
TEST_CASE_ST(tx_adapter_create, tx_adapter_free,
tx_adapter_instance_get),
+   TEST_CASE_ST(tx_adapter_create, tx_adapter_free,
+   tx_adapter_queue_start_stop)

[PATCH 3/3] doc: added eth Tx adapter queue start stop APIs

2022-09-08 Thread Naga Harish K S V
Added Tx adapter queue start - rte_event_eth_tx_adapter_queue_start()
and Tx adapter queue stop - rte_event_eth_tx_adapter_queue_stop().

Signed-off-by: Naga Harish K S V 
---
 doc/guides/rel_notes/release_22_11.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index c32c18ff49..dc1060660c 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -29,6 +29,10 @@ New Features
   ethernet device id and Rx queue index.
   Added ``rte_event_eth_tx_adapter_instance_get`` to get the Tx adapter 
instance id for specified
   ethernet device id and Tx queue index.
+  Added ``rte_event_eth_tx_adapter_queue_start`` to start enqueueing packets 
to the Tx queue by
+  Tx adapter.
+  Added ``rte_event_eth_tx_adapter_queue_stop`` to stop the Tx adapter from 
transmitting any
+  mbufs to the Tx queue.
 
 .. This section should contain new features added in this release.
Sample format:
-- 
2.25.1



[PATCH v2] net/axgbe: support segmented Tx

2022-09-08 Thread Bhagyada Modali
Enable segmented Tx support and add jumbo packet transmit capability.

Signed-off-by: Bhagyada Modali 
---
 drivers/net/axgbe/axgbe_ethdev.c |   1 +
 drivers/net/axgbe/axgbe_ethdev.h |   1 +
 drivers/net/axgbe/axgbe_rxtx.c   | 213 ++-
 drivers/net/axgbe/axgbe_rxtx.h   |   4 +
 4 files changed, 218 insertions(+), 1 deletion(-)

diff --git a/drivers/net/axgbe/axgbe_ethdev.c b/drivers/net/axgbe/axgbe_ethdev.c
index e6822fa711..b071e4e460 100644
--- a/drivers/net/axgbe/axgbe_ethdev.c
+++ b/drivers/net/axgbe/axgbe_ethdev.c
@@ -1228,6 +1228,7 @@ axgbe_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
RTE_ETH_TX_OFFLOAD_VLAN_INSERT |
RTE_ETH_TX_OFFLOAD_QINQ_INSERT |
RTE_ETH_TX_OFFLOAD_IPV4_CKSUM  |
+   RTE_ETH_TX_OFFLOAD_MULTI_SEGS  |
RTE_ETH_TX_OFFLOAD_UDP_CKSUM   |
RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
 
diff --git a/drivers/net/axgbe/axgbe_ethdev.h b/drivers/net/axgbe/axgbe_ethdev.h
index e06d40f9eb..7f19321d88 100644
--- a/drivers/net/axgbe/axgbe_ethdev.h
+++ b/drivers/net/axgbe/axgbe_ethdev.h
@@ -582,6 +582,7 @@ struct axgbe_port {
unsigned int tx_pbl;
unsigned int tx_osp_mode;
unsigned int tx_max_fifo_size;
+   unsigned int multi_segs_tx;
 
/* Rx settings */
unsigned int rx_sf_mode;
diff --git a/drivers/net/axgbe/axgbe_rxtx.c b/drivers/net/axgbe/axgbe_rxtx.c
index 8b43e8160b..881ffa01db 100644
--- a/drivers/net/axgbe/axgbe_rxtx.c
+++ b/drivers/net/axgbe/axgbe_rxtx.c
@@ -544,6 +544,7 @@ int axgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, 
uint16_t queue_idx,
unsigned int tsize;
const struct rte_memzone *tz;
uint64_t offloads;
+   struct rte_eth_dev_data *dev_data = dev->data;
 
tx_desc = nb_desc;
pdata = dev->data->dev_private;
@@ -611,7 +612,13 @@ int axgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, 
uint16_t queue_idx,
if (!pdata->tx_queues)
pdata->tx_queues = dev->data->tx_queues;
 
-   if (txq->vector_disable ||
+   if ((dev_data->dev_conf.txmode.offloads &
+   RTE_ETH_TX_OFFLOAD_MULTI_SEGS))
+   pdata->multi_segs_tx = true;
+
+   if (pdata->multi_segs_tx)
+   dev->tx_pkt_burst = &axgbe_xmit_pkts_seg;
+   else if (txq->vector_disable ||
rte_vect_get_max_simd_bitwidth() < RTE_VECT_SIMD_128)
dev->tx_pkt_burst = &axgbe_xmit_pkts;
else
@@ -762,6 +769,29 @@ void axgbe_dev_enable_tx(struct rte_eth_dev *dev)
AXGMAC_IOWRITE_BITS(pdata, MAC_TCR, TE, 1);
 }
 
+/* Free Tx conformed mbufs segments */
+static void
+axgbe_xmit_cleanup_seg(struct axgbe_tx_queue *txq)
+{
+   volatile struct axgbe_tx_desc *desc;
+   uint16_t idx;
+
+   idx = AXGBE_GET_DESC_IDX(txq, txq->dirty);
+   while (txq->cur != txq->dirty) {
+   if (unlikely(idx == txq->nb_desc))
+   idx = 0;
+   desc = &txq->desc[idx];
+   /* Check for ownership */
+   if (AXGMAC_GET_BITS_LE(desc->desc3, TX_NORMAL_DESC3, OWN))
+   return;
+   memset((void *)&desc->desc2, 0, 8);
+   /* Free mbuf */
+   rte_pktmbuf_free_seg(txq->sw_ring[idx]);
+   txq->sw_ring[idx++] = NULL;
+   txq->dirty++;
+   }
+}
+
 /* Free Tx conformed mbufs */
 static void axgbe_xmit_cleanup(struct axgbe_tx_queue *txq)
 {
@@ -854,6 +884,187 @@ static int axgbe_xmit_hw(struct axgbe_tx_queue *txq,
return 0;
 }
 
+/* Tx Descriptor formation for segmented mbuf
+ * Each mbuf will require multiple descriptors
+ */
+
+static int
+axgbe_xmit_hw_seg(struct axgbe_tx_queue *txq,
+   struct rte_mbuf *mbuf)
+{
+   volatile struct axgbe_tx_desc *desc;
+   uint16_t idx;
+   uint64_t mask;
+   int start_index;
+   uint32_t pkt_len = 0;
+   int nb_desc_free;
+   struct rte_mbuf  *tx_pkt;
+
+   nb_desc_free = txq->nb_desc - (txq->cur - txq->dirty);
+
+   if (mbuf->nb_segs > nb_desc_free) {
+   axgbe_xmit_cleanup_seg(txq);
+   nb_desc_free = txq->nb_desc - (txq->cur - txq->dirty);
+   if (unlikely(mbuf->nb_segs > nb_desc_free))
+   return RTE_ETH_TX_DESC_UNAVAIL;
+   }
+
+   idx = AXGBE_GET_DESC_IDX(txq, txq->cur);
+   desc = &txq->desc[idx];
+   /* Saving the start index for setting the OWN bit finally */
+   start_index = idx;
+
+   tx_pkt = mbuf;
+   /* Max_pkt len = 9018 ; need to update it according to Jumbo pkt size */
+   pkt_len = tx_pkt->pkt_len;
+
+   /* Update buffer address  and length */
+   desc->baddr = rte_mbuf_data_iova(tx_pkt);
+   AXGMAC_SET_BITS_LE(desc->desc2, TX_NORMAL_DESC2, HL_B1L,
+  tx_pkt->data_len);
+   /* Total msg length to transmit */
+   AXGMAC_SET

Re: [PATCH v2 1/5] examples/l3fwd: fix port group mask generation

2022-09-08 Thread David Christensen




On 9/2/22 2:18 AM, pbhagavat...@marvell.com wrote:

From: Pavan Nikhilesh 

Fix port group mask generation in altivec, vec_any_eq returns
0 or 1 while port_groupx4 expects comparison mask result.

Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on powerpc")
Cc: sta...@dpdk.org

Signed-off-by: Pavan Nikhilesh 
---
  v2 Changes:
  - Fix PPC, RISC-V, aarch32 compilation.

  examples/common/altivec/port_group.h | 11 +--
  1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/examples/common/altivec/port_group.h 
b/examples/common/altivec/port_group.h
index 5e209b02fa..592ef80b7f 100644
--- a/examples/common/altivec/port_group.h
+++ b/examples/common/altivec/port_group.h
@@ -26,12 +26,19 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp,
uint16_t u16[FWDSTEP + 1];
uint64_t u64;
} *pnum = (void *)pn;
+   union u_vec {
+   __vector unsigned short v_us;
+   unsigned short s[8];
+   };

+   union u_vec res;
int32_t v;

-   v = vec_any_eq(dp1, dp2);
-
+   dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);


Altivec vec_cmpeq() is similar to Intel _mm_cmpeq_*(), so this looks 
right to me.



+   res.v_us = dp1;

+   v = (res.s[0] & 0x1) | (res.s[1] & 0x2) | (res.s[2] & 0x4) |
+   (res.s[3] & 0x8);


This can be vectorized too.  The Intel _mm_unpacklo_epi16() intrinsic 
can be replaced with the following Altivec code:


extern __inline __m128i __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))

_mm_unpacklo_epi16 (__m128i __A, __m128i __B)
{
  return (__m128i) vec_mergeh ((__v8hi)__A, (__v8hi)__B);
}

The Intel _mm_movemask_ps() intrinsic can be replaced with the following 
Altivec implementation:


/* Creates a 4-bit mask from the most significant bits of the SPFP 
values.  */
extern __inline int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))

_mm_movemask_ps (__m128  __A)
{
  __vector unsigned long long result;
  static const __vector unsigned int perm_mask =
{
#ifdef __LITTLE_ENDIAN__
0x00204060, 0x80808080, 0x80808080, 0x80808080
#else
  0x80808080, 0x80808080, 0x80808080, 0x00204060
#endif
};

  result = ((__vector unsigned long long)
vec_vbpermq ((__vector unsigned char) __A,
 (__vector unsigned char) perm_mask));

#ifdef __LITTLE_ENDIAN__
  return result[1];
#else
  return result[0];
#endif
}

Dave


[Patch v8 00/18] Introduce Microsoft Azure Network Adapter (MANA) PMD

2022-09-08 Thread longli
From: Long Li 

MANA is a network interface card to be used in the Azure cloud environment.
MANA provides safe access to user memory through memory registration. It has
IOMMU built into the hardware.

MANA uses IB verbs and RDMA layer to configure hardware resources. It
requires the corresponding RDMA kernel-mode and user-mode drivers.

The MANA RDMA kernel-mode driver is being reviewed at:
https://patchwork.kernel.org/project/netdevbpf/cover/1655345240-26411-1-git-send-email-lon...@linuxonhyperv.com/

The MANA RDMA user-mode driver is being reviewed at:
https://github.com/linux-rdma/rdma-core/pull/1177

Long Li (18):
  net/mana: add basic driver with build environment and doc
  net/mana: add device configuration and stop
  net/mana: add function to report support ptypes
  net/mana: add link update
  net/mana: add function for device removal interrupts
  net/mana: add device info
  net/mana: add function to configure RSS
  net/mana: add function to configure Rx queues
  net/mana: add function to configure Tx queues
  net/mana: implement memory registration
  net/mana: implement the hardware layer operations
  net/mana: add function to start/stop Tx queues
  net/mana: add function to start/stop Rx queues
  net/mana: add function to receive packets
  net/mana: add function to send packets
  net/mana: add function to start/stop device
  net/mana: add function to report queue stats
  net/mana: add function to support Rx interrupts

 MAINTAINERS   |6 +
 doc/guides/nics/features/mana.ini |   20 +
 doc/guides/nics/index.rst |1 +
 doc/guides/nics/mana.rst  |   69 ++
 drivers/net/mana/gdma.c   |  301 ++
 drivers/net/mana/mana.c   | 1501 +
 drivers/net/mana/mana.h   |  548 +++
 drivers/net/mana/meson.build  |   48 +
 drivers/net/mana/mp.c |  336 +++
 drivers/net/mana/mr.c |  348 +++
 drivers/net/mana/rx.c |  531 ++
 drivers/net/mana/tx.c |  415 
 drivers/net/mana/version.map  |3 +
 drivers/net/meson.build   |1 +
 14 files changed, 4128 insertions(+)
 create mode 100644 doc/guides/nics/features/mana.ini
 create mode 100644 doc/guides/nics/mana.rst
 create mode 100644 drivers/net/mana/gdma.c
 create mode 100644 drivers/net/mana/mana.c
 create mode 100644 drivers/net/mana/mana.h
 create mode 100644 drivers/net/mana/meson.build
 create mode 100644 drivers/net/mana/mp.c
 create mode 100644 drivers/net/mana/mr.c
 create mode 100644 drivers/net/mana/rx.c
 create mode 100644 drivers/net/mana/tx.c
 create mode 100644 drivers/net/mana/version.map

-- 
2.17.1



[Patch v8 01/18] net/mana: add basic driver with build environment and doc

2022-09-08 Thread longli
From: Long Li 

MANA is a PCI device. It uses IB verbs to access the hardware through the
kernel RDMA layer. This patch introduces the build environment and basic
device probe functions.

Signed-off-by: Long Li 
---
Change log:
v2:
Fix typos.
Make the driver build only on x86-64 and Linux.
Remove unused header files.
Change port definition to uint16_t or uint8_t (for IB).
Use getline() in place of fgets() to read and truncate a line.
v3:
Add meson build check for required functions from RDMA direct verb header file
v4:
Remove extra "\n" in logging code.
Use "r" in place of "rb" in fopen() to read text files.
v7:
Remove RTE_ETH_TX_OFFLOAD_TCP_TSO from offload cap.
v8:
Add clarification on driver args usage to nics guide.
Fix coding style on function definitions.
Use different variable names in MANA_MKSTR.
Use MANA_ prefix for all macros.
Use RTE_PMD_REGISTER_PCI in place of rte_pci_register.
Add .vendor_id = 0 to the end of PCI table.
Remove RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS from dev_flags.

 MAINTAINERS   |   6 +
 doc/guides/nics/features/mana.ini |  10 +
 doc/guides/nics/index.rst |   1 +
 doc/guides/nics/mana.rst  |  69 +++
 drivers/net/mana/mana.c   | 728 ++
 drivers/net/mana/mana.h   | 207 +
 drivers/net/mana/meson.build  |  44 ++
 drivers/net/mana/mp.c | 241 ++
 drivers/net/mana/version.map  |   3 +
 drivers/net/meson.build   |   1 +
 10 files changed, 1310 insertions(+)
 create mode 100644 doc/guides/nics/features/mana.ini
 create mode 100644 doc/guides/nics/mana.rst
 create mode 100644 drivers/net/mana/mana.c
 create mode 100644 drivers/net/mana/mana.h
 create mode 100644 drivers/net/mana/meson.build
 create mode 100644 drivers/net/mana/mp.c
 create mode 100644 drivers/net/mana/version.map
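
For context, device probing follows the usual DPDK PCI PMD pattern
(RTE_PMD_REGISTER_PCI, as noted in the v8 change log). A minimal sketch of
that boilerplate is below; the example_ names and the PCI IDs are placeholders
for illustration only, and the PCI bus header location differs between DPDK
releases:

#include <rte_common.h>
#include <rte_pci.h>
#include <rte_bus_pci.h>

/* Hypothetical vendor/device IDs, for illustration only. */
static const struct rte_pci_id example_pci_id_map[] = {
    { RTE_PCI_DEVICE(0x1414, 0x00ba) },
    { .vendor_id = 0 },    /* sentinel entry, as requested in review */
};

static int
example_pci_probe(struct rte_pci_driver *drv __rte_unused,
                  struct rte_pci_device *pci_dev __rte_unused)
{
    /* open the IB verbs device and allocate the ethdev port(s) here */
    return 0;
}

static int
example_pci_remove(struct rte_pci_device *pci_dev __rte_unused)
{
    /* release the ethdev port(s) and close the verbs context here */
    return 0;
}

static struct rte_pci_driver example_pci_driver = {
    .id_table = example_pci_id_map,
    .probe = example_pci_probe,
    .remove = example_pci_remove,
};

RTE_PMD_REGISTER_PCI(net_example, example_pci_driver);
RTE_PMD_REGISTER_PCI_TABLE(net_example, example_pci_id_map);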

diff --git a/MAINTAINERS b/MAINTAINERS
index 18d9edaf88..b8bda48a33 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -837,6 +837,12 @@ F: buildtools/options-ibverbs-static.sh
 F: doc/guides/nics/mlx5.rst
 F: doc/guides/nics/features/mlx5.ini
 
+Microsoft mana
+M: Long Li 
+F: drivers/net/mana
+F: doc/guides/nics/mana.rst
+F: doc/guides/nics/features/mana.ini
+
 Microsoft vdev_netvsc - EXPERIMENTAL
 M: Matan Azrad 
 F: drivers/net/vdev_netvsc/
diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini
new file mode 100644
index 00..b92a27374c
--- /dev/null
+++ b/doc/guides/nics/features/mana.ini
@@ -0,0 +1,10 @@
+;
+; Supported features of the 'mana' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Linux= Y
+Multiprocess aware   = Y
+Usage doc= Y
+x86-64   = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 1c94caccea..2725d1d9f0 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -41,6 +41,7 @@ Network Interface Controller Drivers
 intel_vf
 kni
 liquidio
+mana
 memif
 mlx4
 mlx5
diff --git a/doc/guides/nics/mana.rst b/doc/guides/nics/mana.rst
new file mode 100644
index 00..075cbf092d
--- /dev/null
+++ b/doc/guides/nics/mana.rst
@@ -0,0 +1,69 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+Copyright 2022 Microsoft Corporation
+
+MANA poll mode driver library
+=============================
+
+The MANA poll mode driver library (**librte_net_mana**) implements support
+for Microsoft Azure Network Adapter VF in SR-IOV context.
+
+Features
+--------
+
+Features of the MANA Ethdev PMD are:
+
+Prerequisites
+-------------
+
+This driver relies on external libraries and kernel drivers for resources
+allocations and initialization. The following dependencies are not part of
+DPDK and must be installed separately:
+
+- **libibverbs** (provided by rdma-core package)
+
+  User space verbs framework used by librte_net_mana. This library provides
+  a generic interface between the kernel and low-level user space drivers
+  such as libmana.
+
+  It allows slow and privileged operations (context initialization, hardware
+  resources allocations) to be managed by the kernel and fast operations to
+  never leave user space.
+
+- **libmana** (provided by rdma-core package)
+
+  Low-level user space driver library for Microsoft Azure Network Adapter
+  devices, it is automatically loaded by libibverbs. The minimal version of
+  rdma-core with libmana is v43.
+
+- **Kernel modules**
+
+  They provide the kernel-side verbs API and low level device drivers that
+  manage actual hardware initialization and resources sharing with user
+  space processes.
+
+  Unlike most other PMDs, these modules must remain loaded and bound to
+  their devices:
+
+  - mana: Ethernet device driver that provides kernel network interfaces.
+  - mana_ib: InfiniBand device driver.
+  - ib_uverbs: user space driver for verbs (entry point for libibverbs).
+
+Driver compilation and testing
+------------------------------
+
+Refer

[Patch v8 02/18] net/mana: add device configuration and stop

2022-09-08 Thread longli
From: Long Li 

MANA defines its own memory allocation functions to override the IB layer's
default functions for allocating device queues. This patch adds the code for
device configuration and stop.

Signed-off-by: Long Li 
---
v2:
Removed validation for offload settings in mana_dev_configure().
v8:
Fix coding style of function definitions.

 drivers/net/mana/mana.c | 81 -
 drivers/net/mana/mana.h |  3 ++
 2 files changed, 82 insertions(+), 2 deletions(-)
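
As the configure hook below enforces, the application must request the same
number of Rx and Tx queues and the count must be a power of two. A minimal
application-side sketch using the generic ethdev API (port id and queue count
are illustrative, not part of this patch):

#include <rte_ethdev.h>

static int
example_configure(uint16_t port_id)
{
    const uint16_t nb_queues = 4;    /* equal Rx/Tx count, power of two */
    struct rte_eth_conf conf = {
        .rxmode = { .mq_mode = RTE_ETH_MQ_RX_RSS },
    };

    return rte_eth_dev_configure(port_id, nb_queues, nb_queues, &conf);
}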

diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 8b9fa9bd07..d522294bd0 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -42,7 +42,85 @@ static rte_spinlock_t mana_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
 int mana_logtype_driver;
 int mana_logtype_init;
 
+/*
+ * Callback from rdma-core to allocate a buffer for a queue.
+ */
+void *
+mana_alloc_verbs_buf(size_t size, void *data)
+{
+   void *ret;
+   size_t alignment = rte_mem_page_size();
+   int socket = (int)(uintptr_t)data;
+
+   DRV_LOG(DEBUG, "size=%zu socket=%d", size, socket);
+
+   if (alignment == (size_t)-1) {
+   DRV_LOG(ERR, "Failed to get mem page size");
+   rte_errno = ENOMEM;
+   return NULL;
+   }
+
+   ret = rte_zmalloc_socket("mana_verb_buf", size, alignment, socket);
+   if (!ret && size)
+   rte_errno = ENOMEM;
+   return ret;
+}
+
+void
+mana_free_verbs_buf(void *ptr, void *data __rte_unused)
+{
+   rte_free(ptr);
+}
+
+static int
+mana_dev_configure(struct rte_eth_dev *dev)
+{
+   struct mana_priv *priv = dev->data->dev_private;
+   struct rte_eth_conf *dev_conf = &dev->data->dev_conf;
+
+   if (dev_conf->rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG)
+   dev_conf->rxmode.offloads |= DEV_RX_OFFLOAD_RSS_HASH;
+
+   if (dev->data->nb_rx_queues != dev->data->nb_tx_queues) {
+   DRV_LOG(ERR, "Only support equal number of rx/tx queues");
+   return -EINVAL;
+   }
+
+   if (!rte_is_power_of_2(dev->data->nb_rx_queues)) {
+   DRV_LOG(ERR, "number of TX/RX queues must be power of 2");
+   return -EINVAL;
+   }
+
+   priv->num_queues = dev->data->nb_rx_queues;
+
+   manadv_set_context_attr(priv->ib_ctx, MANADV_CTX_ATTR_BUF_ALLOCATORS,
+   (void *)((uintptr_t)&(struct manadv_ctx_allocators){
+   .alloc = &mana_alloc_verbs_buf,
+   .free = &mana_free_verbs_buf,
+   .data = 0,
+   }));
+
+   return 0;
+}
+
+static int
+mana_dev_close(struct rte_eth_dev *dev)
+{
+   struct mana_priv *priv = dev->data->dev_private;
+   int ret;
+
+   ret = ibv_close_device(priv->ib_ctx);
+   if (ret) {
+   ret = errno;
+   return ret;
+   }
+
+   return 0;
+}
+
 static const struct eth_dev_ops mana_dev_ops = {
+   .dev_configure  = mana_dev_configure,
+   .dev_close  = mana_dev_close,
 };
 
 static const struct eth_dev_ops mana_dev_secondary_ops = {
@@ -649,8 +727,7 @@ mana_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 static int
 mana_dev_uninit(struct rte_eth_dev *dev)
 {
-   RTE_SET_USED(dev);
-   return 0;
+   return mana_dev_close(dev);
 }
 
 /*
diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h
index 098819e61e..d4a2fe7603 100644
--- a/drivers/net/mana/mana.h
+++ b/drivers/net/mana/mana.h
@@ -204,4 +204,7 @@ int mana_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
 
 void mana_mp_req_on_rxtx(struct rte_eth_dev *dev, enum mana_mp_req_type type);
 
+void *mana_alloc_verbs_buf(size_t size, void *data);
+void mana_free_verbs_buf(void *ptr, void *data __rte_unused);
+
 #endif
-- 
2.17.1



[Patch v8 03/18] net/mana: add function to report supported ptypes

2022-09-08 Thread longli
From: Long Li 

Report supported protocol types.

Signed-off-by: Long Li 
---
Change log.
v7: change link_speed to RTE_ETH_SPEED_NUM_100G

 drivers/net/mana/mana.c | 17 +
 1 file changed, 17 insertions(+)
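
Applications can retrieve this list through the generic ethdev call; a short
sketch (buffer size arbitrary, illustrative only):

#include <stdio.h>
#include <rte_common.h>
#include <rte_ethdev.h>
#include <rte_mbuf_ptype.h>

static void
example_print_ptypes(uint16_t port_id)
{
    uint32_t ptypes[16];
    int i, n;

    n = rte_eth_dev_get_supported_ptypes(port_id, RTE_PTYPE_ALL_MASK,
                                         ptypes, RTE_DIM(ptypes));
    for (i = 0; i < n && i < (int)RTE_DIM(ptypes); i++)
        printf("supported ptype 0x%08x\n", ptypes[i]);
}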

diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index d522294bd0..112d58a5d3 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -118,9 +118,26 @@ mana_dev_close(struct rte_eth_dev *dev)
return 0;
 }
 
+static const uint32_t *
+mana_supported_ptypes(struct rte_eth_dev *dev __rte_unused)
+{
+   static const uint32_t ptypes[] = {
+   RTE_PTYPE_L2_ETHER,
+   RTE_PTYPE_L3_IPV4_EXT_UNKNOWN,
+   RTE_PTYPE_L3_IPV6_EXT_UNKNOWN,
+   RTE_PTYPE_L4_FRAG,
+   RTE_PTYPE_L4_TCP,
+   RTE_PTYPE_L4_UDP,
+   RTE_PTYPE_UNKNOWN
+   };
+
+   return ptypes;
+}
+
 static const struct eth_dev_ops mana_dev_ops = {
.dev_configure  = mana_dev_configure,
.dev_close  = mana_dev_close,
+   .dev_supported_ptypes_get = mana_supported_ptypes,
 };
 
 static const struct eth_dev_ops mana_dev_secondary_ops = {
-- 
2.17.1



[Patch v8 04/18] net/mana: add link update

2022-09-08 Thread longli
From: Long Li 

The carrier state is managed by the Azure host. MANA runs as a VF and
always reports "up".

Signed-off-by: Long Li 
---
 doc/guides/nics/features/mana.ini |  1 +
 drivers/net/mana/mana.c   | 18 ++
 2 files changed, 19 insertions(+)
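
Since the PMD always reports the link as up, a non-blocking query is enough
on the application side; a sketch using the standard API (illustrative only):

#include <stdio.h>
#include <rte_ethdev.h>

static void
example_check_link(uint16_t port_id)
{
    struct rte_eth_link link;

    if (rte_eth_link_get_nowait(port_id, &link) == 0)
        printf("port %u: link %s, %u Mbps\n", port_id,
               link.link_status == RTE_ETH_LINK_UP ? "up" : "down",
               link.link_speed);
}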

diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini
index b92a27374c..62554b0a0a 100644
--- a/doc/guides/nics/features/mana.ini
+++ b/doc/guides/nics/features/mana.ini
@@ -4,6 +4,7 @@
 ; Refer to default.ini for the full list of available PMD features.
 ;
 [Features]
+Link status  = P
 Linux= Y
 Multiprocess aware   = Y
 Usage doc= Y
diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 112d58a5d3..714e4ede28 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -134,10 +134,28 @@ mana_supported_ptypes(struct rte_eth_dev *dev __rte_unused)
return ptypes;
 }
 
+static int
+mana_dev_link_update(struct rte_eth_dev *dev,
+int wait_to_complete __rte_unused)
+{
+   struct rte_eth_link link;
+
+   /* MANA has no concept of carrier state, always reporting UP */
+   link = (struct rte_eth_link) {
+   .link_duplex = RTE_ETH_LINK_FULL_DUPLEX,
+   .link_autoneg = RTE_ETH_LINK_SPEED_FIXED,
+   .link_speed = RTE_ETH_SPEED_NUM_100G,
+   .link_status = RTE_ETH_LINK_UP,
+   };
+
+   return rte_eth_linkstatus_set(dev, &link);
+}
+
 static const struct eth_dev_ops mana_dev_ops = {
.dev_configure  = mana_dev_configure,
.dev_close  = mana_dev_close,
.dev_supported_ptypes_get = mana_supported_ptypes,
+   .link_update= mana_dev_link_update,
 };
 
 static const struct eth_dev_ops mana_dev_secondary_ops = {
-- 
2.17.1



[Patch v8 05/18] net/mana: add function for device removal interrupts

2022-09-08 Thread longli
From: Long Li 

MANA supports PCI hot plug events. Add this interrupt to DPDK core so its
parent PMD can detect device removal during Azure servicing or live
migration.

Signed-off-by: Long Li 
---
Change log:
v8:
fix coding style of function definitions.

 doc/guides/nics/features/mana.ini |   1 +
 drivers/net/mana/mana.c   | 103 ++
 drivers/net/mana/mana.h   |   1 +
 3 files changed, 105 insertions(+)
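
Applications receive this as a standard ethdev event; a minimal sketch of
registering for it (the callback body is illustrative, and intr_conf.rmv must
be enabled in the port configuration for the PMD to forward the event, per the
dev_conf check in the handler below):

#include <stdio.h>
#include <rte_common.h>
#include <rte_ethdev.h>

static int
example_rmv_cb(uint16_t port_id, enum rte_eth_event_type event,
               void *cb_arg __rte_unused, void *ret_param __rte_unused)
{
    if (event == RTE_ETH_EVENT_INTR_RMV)
        printf("port %u: device removal reported\n", port_id);
    return 0;
}

static int
example_watch_removal(uint16_t port_id)
{
    return rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_INTR_RMV,
                                         example_rmv_cb, NULL);
}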

diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini
index 62554b0a0a..8043e11f99 100644
--- a/doc/guides/nics/features/mana.ini
+++ b/doc/guides/nics/features/mana.ini
@@ -7,5 +7,6 @@
 Link status  = P
 Linux= Y
 Multiprocess aware   = Y
+Removal event= Y
 Usage doc= Y
 x86-64   = Y
diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 714e4ede28..8081a28acb 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -103,12 +103,18 @@ mana_dev_configure(struct rte_eth_dev *dev)
return 0;
 }
 
+static int mana_intr_uninstall(struct mana_priv *priv);
+
 static int
 mana_dev_close(struct rte_eth_dev *dev)
 {
struct mana_priv *priv = dev->data->dev_private;
int ret;
 
+   ret = mana_intr_uninstall(priv);
+   if (ret)
+   return ret;
+
ret = ibv_close_device(priv->ib_ctx);
if (ret) {
ret = errno;
@@ -340,6 +346,96 @@ mana_ibv_device_to_pci_addr(const struct ibv_device *device,
return 0;
 }
 
+/*
+ * Interrupt handler from IB layer to notify this device is being removed.
+ */
+static void
+mana_intr_handler(void *arg)
+{
+   struct mana_priv *priv = arg;
+   struct ibv_context *ctx = priv->ib_ctx;
+   struct ibv_async_event event;
+
+   /* Read and ack all messages from IB device */
+   while (true) {
+   if (ibv_get_async_event(ctx, &event))
+   break;
+
+   if (event.event_type == IBV_EVENT_DEVICE_FATAL) {
+   struct rte_eth_dev *dev;
+
+   dev = &rte_eth_devices[priv->port_id];
+   if (dev->data->dev_conf.intr_conf.rmv)
+   rte_eth_dev_callback_process(dev,
+   RTE_ETH_EVENT_INTR_RMV, NULL);
+   }
+
+   ibv_ack_async_event(&event);
+   }
+}
+
+static int
+mana_intr_uninstall(struct mana_priv *priv)
+{
+   int ret;
+
+   ret = rte_intr_callback_unregister(priv->intr_handle,
+  mana_intr_handler, priv);
+   if (ret <= 0) {
+   DRV_LOG(ERR, "Failed to unregister intr callback ret %d", ret);
+   return ret;
+   }
+
+   rte_intr_instance_free(priv->intr_handle);
+
+   return 0;
+}
+
+static int
+mana_intr_install(struct mana_priv *priv)
+{
+   int ret, flags;
+   struct ibv_context *ctx = priv->ib_ctx;
+
+   priv->intr_handle = rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_SHARED);
+   if (!priv->intr_handle) {
+   DRV_LOG(ERR, "Failed to allocate intr_handle");
+   rte_errno = ENOMEM;
+   return -ENOMEM;
+   }
+
+   rte_intr_fd_set(priv->intr_handle, -1);
+
+   flags = fcntl(ctx->async_fd, F_GETFL);
+   ret = fcntl(ctx->async_fd, F_SETFL, flags | O_NONBLOCK);
+   if (ret) {
+   DRV_LOG(ERR, "Failed to change async_fd to NONBLOCK");
+   goto free_intr;
+   }
+
+   rte_intr_fd_set(priv->intr_handle, ctx->async_fd);
+   rte_intr_type_set(priv->intr_handle, RTE_INTR_HANDLE_EXT);
+
+   ret = rte_intr_callback_register(priv->intr_handle,
+mana_intr_handler, priv);
+   if (ret) {
+   DRV_LOG(ERR, "Failed to register intr callback");
+   rte_intr_fd_set(priv->intr_handle, -1);
+   goto restore_fd;
+   }
+
+   return 0;
+
+restore_fd:
+   fcntl(ctx->async_fd, F_SETFL, flags);
+
+free_intr:
+   rte_intr_instance_free(priv->intr_handle);
+   priv->intr_handle = NULL;
+
+   return ret;
+}
+
 static int
 mana_proc_priv_init(struct rte_eth_dev *dev)
 {
@@ -667,6 +763,13 @@ mana_pci_probe_mac(struct rte_pci_device *pci_dev,
name, priv->max_rx_queues, priv->max_rx_desc,
priv->max_send_sge);
 
+   /* Create async interrupt handler */
+   ret = mana_intr_install(priv);
+   if (ret) {
+   DRV_LOG(ERR, "Failed to install intr handler");
+   goto failed;
+   }
+
rte_spinlock_lock(&mana_shared_data->lock);
mana_shared_data->primary_cnt++;
rte_spinlock_unlock(&mana_shared_data->lock);
diff --git a/drivers/net/mana/mana.h b

[Patch v8 06/18] net/mana: add device info

2022-09-08 Thread longli
From: Long Li 

Add the function to get device info.

Signed-off-by: Long Li 
---
Change log:
v8:
use new macro definition start with "MANA_"
fix coding style of function definitions

 doc/guides/nics/features/mana.ini |  1 +
 drivers/net/mana/mana.c   | 83 +++
 2 files changed, 84 insertions(+)
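
A sketch of reading these limits back from the application side with the
generic API (illustrative only):

#include <stdio.h>
#include <rte_ethdev.h>

static void
example_print_limits(uint16_t port_id)
{
    struct rte_eth_dev_info info;

    if (rte_eth_dev_info_get(port_id, &info) != 0)
        return;

    printf("max rxq %u, max txq %u, reta size %u, RSS key %u bytes\n",
           info.max_rx_queues, info.max_tx_queues,
           info.reta_size, info.hash_key_size);
}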

diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini
index 8043e11f99..566b3e8770 100644
--- a/doc/guides/nics/features/mana.ini
+++ b/doc/guides/nics/features/mana.ini
@@ -8,5 +8,6 @@ Link status  = P
 Linux= Y
 Multiprocess aware   = Y
 Removal event= Y
+Speed capabilities   = P
 Usage doc= Y
 x86-64   = Y
diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 8081a28acb..9610782d6f 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -124,6 +124,87 @@ mana_dev_close(struct rte_eth_dev *dev)
return 0;
 }
 
+static int
+mana_dev_info_get(struct rte_eth_dev *dev,
+ struct rte_eth_dev_info *dev_info)
+{
+   struct mana_priv *priv = dev->data->dev_private;
+
+   dev_info->max_mtu = RTE_ETHER_MTU;
+
+   /* RX params */
+   dev_info->min_rx_bufsize = MIN_RX_BUF_SIZE;
+   dev_info->max_rx_pktlen = MAX_FRAME_SIZE;
+
+   dev_info->max_rx_queues = priv->max_rx_queues;
+   dev_info->max_tx_queues = priv->max_tx_queues;
+
+   dev_info->max_mac_addrs = MANA_MAX_MAC_ADDR;
+   dev_info->max_hash_mac_addrs = 0;
+
+   dev_info->max_vfs = 1;
+
+   /* Offload params */
+   dev_info->rx_offload_capa = MANA_DEV_RX_OFFLOAD_SUPPORT;
+
+   dev_info->tx_offload_capa = MANA_DEV_TX_OFFLOAD_SUPPORT;
+
+   /* RSS */
+   dev_info->reta_size = INDIRECTION_TABLE_NUM_ELEMENTS;
+   dev_info->hash_key_size = TOEPLITZ_HASH_KEY_SIZE_IN_BYTES;
+   dev_info->flow_type_rss_offloads = MANA_ETH_RSS_SUPPORT;
+
+   /* Thresholds */
+   dev_info->default_rxconf = (struct rte_eth_rxconf){
+   .rx_thresh = {
+   .pthresh = 8,
+   .hthresh = 8,
+   .wthresh = 0,
+   },
+   .rx_free_thresh = 32,
+   /* If no descriptors available, pkts are dropped by default */
+   .rx_drop_en = 1,
+   };
+
+   dev_info->default_txconf = (struct rte_eth_txconf){
+   .tx_thresh = {
+   .pthresh = 32,
+   .hthresh = 0,
+   .wthresh = 0,
+   },
+   .tx_rs_thresh = 32,
+   .tx_free_thresh = 32,
+   };
+
+   /* Buffer limits */
+   dev_info->rx_desc_lim.nb_min = MIN_BUFFERS_PER_QUEUE;
+   dev_info->rx_desc_lim.nb_max = priv->max_rx_desc;
+   dev_info->rx_desc_lim.nb_align = MIN_BUFFERS_PER_QUEUE;
+   dev_info->rx_desc_lim.nb_seg_max = priv->max_recv_sge;
+   dev_info->rx_desc_lim.nb_mtu_seg_max = priv->max_recv_sge;
+
+   dev_info->tx_desc_lim.nb_min = MIN_BUFFERS_PER_QUEUE;
+   dev_info->tx_desc_lim.nb_max = priv->max_tx_desc;
+   dev_info->tx_desc_lim.nb_align = MIN_BUFFERS_PER_QUEUE;
+   dev_info->tx_desc_lim.nb_seg_max = priv->max_send_sge;
+   dev_info->tx_desc_lim.nb_mtu_seg_max = priv->max_send_sge;
+
+   /* Speed */
+   dev_info->speed_capa = ETH_LINK_SPEED_100G;
+
+   /* RX params */
+   dev_info->default_rxportconf.burst_size = 1;
+   dev_info->default_rxportconf.ring_size = MAX_RECEIVE_BUFFERS_PER_QUEUE;
+   dev_info->default_rxportconf.nb_queues = 1;
+
+   /* TX params */
+   dev_info->default_txportconf.burst_size = 1;
+   dev_info->default_txportconf.ring_size = MAX_SEND_BUFFERS_PER_QUEUE;
+   dev_info->default_txportconf.nb_queues = 1;
+
+   return 0;
+}
+
 static const uint32_t *
 mana_supported_ptypes(struct rte_eth_dev *dev __rte_unused)
 {
@@ -160,11 +241,13 @@ mana_dev_link_update(struct rte_eth_dev *dev,
 static const struct eth_dev_ops mana_dev_ops = {
.dev_configure  = mana_dev_configure,
.dev_close  = mana_dev_close,
+   .dev_infos_get  = mana_dev_info_get,
.dev_supported_ptypes_get = mana_supported_ptypes,
.link_update= mana_dev_link_update,
 };
 
 static const struct eth_dev_ops mana_dev_secondary_ops = {
+   .dev_infos_get = mana_dev_info_get,
 };
 
 uint16_t
-- 
2.17.1



[Patch v8 07/18] net/mana: add function to configure RSS

2022-09-08 Thread longli
From: Long Li 

Currently this PMD supports RSS configuration when the device is stopped.
Configuring RSS in running state will be supported in the future.

Signed-off-by: Long Li 
---
change log:
v8:
fix coding style of function definitions

 doc/guides/nics/features/mana.ini |  1 +
 drivers/net/mana/mana.c   | 65 ++-
 drivers/net/mana/mana.h   |  1 +
 3 files changed, 66 insertions(+), 1 deletion(-)
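
The hash key is a 40-byte Toeplitz key and updates are only accepted while the
port is stopped, as checked below. An application-side sketch (key contents
arbitrary, illustrative only):

#include <rte_ethdev.h>

static int
example_set_rss(uint16_t port_id)
{
    /* 40-byte Toeplitz key; contents here are arbitrary */
    static uint8_t key[40] = { 0x6d, 0x5a };
    struct rte_eth_rss_conf conf = {
        .rss_key = key,
        .rss_key_len = sizeof(key),
        .rss_hf = RTE_ETH_RSS_NONFRAG_IPV4_TCP |
                  RTE_ETH_RSS_NONFRAG_IPV4_UDP,
    };

    /* only valid while the port is stopped */
    return rte_eth_dev_rss_hash_update(port_id, &conf);
}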

diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini
index 566b3e8770..a59c21cc10 100644
--- a/doc/guides/nics/features/mana.ini
+++ b/doc/guides/nics/features/mana.ini
@@ -8,6 +8,7 @@ Link status  = P
 Linux= Y
 Multiprocess aware   = Y
 Removal event= Y
+RSS hash = Y
 Speed capabilities   = P
 Usage doc= Y
 x86-64   = Y
diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 9610782d6f..fe7eb19626 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -221,9 +221,70 @@ mana_supported_ptypes(struct rte_eth_dev *dev __rte_unused)
return ptypes;
 }
 
+static int
+mana_rss_hash_update(struct rte_eth_dev *dev,
+struct rte_eth_rss_conf *rss_conf)
+{
+   struct mana_priv *priv = dev->data->dev_private;
+
+   /* Currently can only update RSS hash when device is stopped */
+   if (dev->data->dev_started) {
+   DRV_LOG(ERR, "Can't update RSS after device has started");
+   return -ENODEV;
+   }
+
+   if (rss_conf->rss_hf & ~MANA_ETH_RSS_SUPPORT) {
+   DRV_LOG(ERR, "Port %u invalid RSS HF 0x%" PRIx64,
+   dev->data->port_id, rss_conf->rss_hf);
+   return -EINVAL;
+   }
+
+   if (rss_conf->rss_key && rss_conf->rss_key_len) {
+   if (rss_conf->rss_key_len != TOEPLITZ_HASH_KEY_SIZE_IN_BYTES) {
+   DRV_LOG(ERR, "Port %u key len must be %u long",
+   dev->data->port_id,
+   TOEPLITZ_HASH_KEY_SIZE_IN_BYTES);
+   return -EINVAL;
+   }
+
+   priv->rss_conf.rss_key_len = rss_conf->rss_key_len;
+   priv->rss_conf.rss_key =
+   rte_zmalloc("mana_rss", rss_conf->rss_key_len,
+   RTE_CACHE_LINE_SIZE);
+   if (!priv->rss_conf.rss_key)
+   return -ENOMEM;
+   memcpy(priv->rss_conf.rss_key, rss_conf->rss_key,
+  rss_conf->rss_key_len);
+   }
+   priv->rss_conf.rss_hf = rss_conf->rss_hf;
+
+   return 0;
+}
+
+static int
+mana_rss_hash_conf_get(struct rte_eth_dev *dev,
+  struct rte_eth_rss_conf *rss_conf)
+{
+   struct mana_priv *priv = dev->data->dev_private;
+
+   if (!rss_conf)
+   return -EINVAL;
+
+   if (rss_conf->rss_key &&
+   rss_conf->rss_key_len >= priv->rss_conf.rss_key_len) {
+   memcpy(rss_conf->rss_key, priv->rss_conf.rss_key,
+  priv->rss_conf.rss_key_len);
+   }
+
+   rss_conf->rss_key_len = priv->rss_conf.rss_key_len;
+   rss_conf->rss_hf = priv->rss_conf.rss_hf;
+
+   return 0;
+}
+
 static int
 mana_dev_link_update(struct rte_eth_dev *dev,
-int wait_to_complete __rte_unused)
+   int wait_to_complete __rte_unused)
 {
struct rte_eth_link link;
 
@@ -243,6 +304,8 @@ static const struct eth_dev_ops mana_dev_ops = {
.dev_close  = mana_dev_close,
.dev_infos_get  = mana_dev_info_get,
.dev_supported_ptypes_get = mana_supported_ptypes,
+   .rss_hash_update= mana_rss_hash_update,
+   .rss_hash_conf_get  = mana_rss_hash_conf_get,
.link_update= mana_dev_link_update,
 };
 
diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h
index 4a84c6e778..04ccdfa0d1 100644
--- a/drivers/net/mana/mana.h
+++ b/drivers/net/mana/mana.h
@@ -71,6 +71,7 @@ struct mana_priv {
uint8_t ind_table_key[40];
struct ibv_qp *rwq_qp;
void *db_page;
+   struct rte_eth_rss_conf rss_conf;
struct rte_intr_handle *intr_handle;
int max_rx_queues;
int max_tx_queues;
-- 
2.17.1



[Patch v8 08/18] net/mana: add function to configure Rx queues

2022-09-08 Thread longli
From: Long Li 

The Rx hardware queue is allocated when the queue is started. This function
handles queue configuration before the queue is started.

Signed-off-by: Long Li 
---
Change log:
v8:
fix coding style of function definitions

 drivers/net/mana/mana.c | 72 -
 1 file changed, 71 insertions(+), 1 deletion(-)
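
An application-side sketch of setting up one Rx queue against this hook (pool
sizing and descriptor count are illustrative; nb_desc is subject to the limits
reported by dev_info):

#include <rte_ethdev.h>
#include <rte_mbuf.h>

static int
example_setup_rxq(uint16_t port_id, uint16_t queue_id, int socket)
{
    struct rte_mempool *mp;

    mp = rte_pktmbuf_pool_create("example_rx_pool", 8192, 256, 0,
                                 RTE_MBUF_DEFAULT_BUF_SIZE, socket);
    if (mp == NULL)
        return -1;

    return rte_eth_rx_queue_setup(port_id, queue_id, 256, socket,
                                  NULL, mp);
}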

diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index fe7eb19626..15bd7ea550 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -205,6 +205,17 @@ mana_dev_info_get(struct rte_eth_dev *dev,
return 0;
 }
 
+static void
+mana_dev_rx_queue_info(struct rte_eth_dev *dev, uint16_t queue_id,
+  struct rte_eth_rxq_info *qinfo)
+{
+   struct mana_rxq *rxq = dev->data->rx_queues[queue_id];
+
+   qinfo->mp = rxq->mp;
+   qinfo->nb_desc = rxq->num_desc;
+   qinfo->conf.offloads = dev->data->dev_conf.rxmode.offloads;
+}
+
 static const uint32_t *
 mana_supported_ptypes(struct rte_eth_dev *dev __rte_unused)
 {
@@ -282,9 +293,65 @@ mana_rss_hash_conf_get(struct rte_eth_dev *dev,
return 0;
 }
 
+static int
+mana_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+   uint16_t nb_desc, unsigned int socket_id,
+   const struct rte_eth_rxconf *rx_conf __rte_unused,
+   struct rte_mempool *mp)
+{
+   struct mana_priv *priv = dev->data->dev_private;
+   struct mana_rxq *rxq;
+   int ret;
+
+   rxq = rte_zmalloc_socket("mana_rxq", sizeof(*rxq), 0, socket_id);
+   if (!rxq) {
+   DRV_LOG(ERR, "failed to allocate rxq");
+   return -ENOMEM;
+   }
+
+   DRV_LOG(DEBUG, "idx %u nb_desc %u socket %u",
+   queue_idx, nb_desc, socket_id);
+
+   rxq->socket = socket_id;
+
+   rxq->desc_ring = rte_zmalloc_socket("mana_rx_mbuf_ring",
+   sizeof(struct mana_rxq_desc) *
+   nb_desc,
+   RTE_CACHE_LINE_SIZE, socket_id);
+
+   if (!rxq->desc_ring) {
+   DRV_LOG(ERR, "failed to allocate rxq desc_ring");
+   ret = -ENOMEM;
+   goto fail;
+   }
+
+   rxq->num_desc = nb_desc;
+
+   rxq->priv = priv;
+   rxq->num_desc = nb_desc;
+   rxq->mp = mp;
+   dev->data->rx_queues[queue_idx] = rxq;
+
+   return 0;
+
+fail:
+   rte_free(rxq->desc_ring);
+   rte_free(rxq);
+   return ret;
+}
+
+static void
+mana_dev_rx_queue_release(struct rte_eth_dev *dev, uint16_t qid)
+{
+   struct mana_rxq *rxq = dev->data->rx_queues[qid];
+
+   rte_free(rxq->desc_ring);
+   rte_free(rxq);
+}
+
 static int
 mana_dev_link_update(struct rte_eth_dev *dev,
-   int wait_to_complete __rte_unused)
+int wait_to_complete __rte_unused)
 {
struct rte_eth_link link;
 
@@ -303,9 +370,12 @@ static const struct eth_dev_ops mana_dev_ops = {
.dev_configure  = mana_dev_configure,
.dev_close  = mana_dev_close,
.dev_infos_get  = mana_dev_info_get,
+   .rxq_info_get   = mana_dev_rx_queue_info,
.dev_supported_ptypes_get = mana_supported_ptypes,
.rss_hash_update= mana_rss_hash_update,
.rss_hash_conf_get  = mana_rss_hash_conf_get,
+   .rx_queue_setup = mana_dev_rx_queue_setup,
+   .rx_queue_release   = mana_dev_rx_queue_release,
.link_update= mana_dev_link_update,
 };
 
-- 
2.17.1



[Patch v8 09/18] net/mana: add function to configure Tx queues

2022-09-08 Thread longli
From: Long Li 

The Tx hardware queue is allocated when the queue is started; this function
handles configuration before the queue is started.

Signed-off-by: Long Li 
---
change log:
v8:
fix coding style of function definitions

 drivers/net/mana/mana.c | 67 +
 1 file changed, 67 insertions(+)
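
The matching application-side call for this hook (descriptor count
illustrative and subject to the dev_info limits; NULL selects the default
txconf reported by the PMD):

#include <rte_ethdev.h>

static int
example_setup_txq(uint16_t port_id, uint16_t queue_id, int socket)
{
    return rte_eth_tx_queue_setup(port_id, queue_id, 256, socket, NULL);
}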

diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 15bd7ea550..bc8238a02b 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -205,6 +205,16 @@ mana_dev_info_get(struct rte_eth_dev *dev,
return 0;
 }
 
+static void
+mana_dev_tx_queue_info(struct rte_eth_dev *dev, uint16_t queue_id,
+  struct rte_eth_txq_info *qinfo)
+{
+   struct mana_txq *txq = dev->data->tx_queues[queue_id];
+
+   qinfo->conf.offloads = dev->data->dev_conf.txmode.offloads;
+   qinfo->nb_desc = txq->num_desc;
+}
+
 static void
 mana_dev_rx_queue_info(struct rte_eth_dev *dev, uint16_t queue_id,
   struct rte_eth_rxq_info *qinfo)
@@ -293,6 +303,60 @@ mana_rss_hash_conf_get(struct rte_eth_dev *dev,
return 0;
 }
 
+static int
+mana_dev_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+   uint16_t nb_desc, unsigned int socket_id,
+   const struct rte_eth_txconf *tx_conf __rte_unused)
+
+{
+   struct mana_priv *priv = dev->data->dev_private;
+   struct mana_txq *txq;
+   int ret;
+
+   txq = rte_zmalloc_socket("mana_txq", sizeof(*txq), 0, socket_id);
+   if (!txq) {
+   DRV_LOG(ERR, "failed to allocate txq");
+   return -ENOMEM;
+   }
+
+   txq->socket = socket_id;
+
+   txq->desc_ring = rte_malloc_socket("mana_tx_desc_ring",
+  sizeof(struct mana_txq_desc) *
+   nb_desc,
+  RTE_CACHE_LINE_SIZE, socket_id);
+   if (!txq->desc_ring) {
+   DRV_LOG(ERR, "failed to allocate txq desc_ring");
+   ret = -ENOMEM;
+   goto fail;
+   }
+
+   DRV_LOG(DEBUG, "idx %u nb_desc %u socket %u txq->desc_ring %p",
+   queue_idx, nb_desc, socket_id, txq->desc_ring);
+
+   txq->desc_ring_head = 0;
+   txq->desc_ring_tail = 0;
+   txq->priv = priv;
+   txq->num_desc = nb_desc;
+   dev->data->tx_queues[queue_idx] = txq;
+
+   return 0;
+
+fail:
+   rte_free(txq->desc_ring);
+   rte_free(txq);
+   return ret;
+}
+
+static void
+mana_dev_tx_queue_release(struct rte_eth_dev *dev, uint16_t qid)
+{
+   struct mana_txq *txq = dev->data->tx_queues[qid];
+
+   rte_free(txq->desc_ring);
+   rte_free(txq);
+}
+
 static int
 mana_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
uint16_t nb_desc, unsigned int socket_id,
@@ -370,10 +434,13 @@ static const struct eth_dev_ops mana_dev_ops = {
.dev_configure  = mana_dev_configure,
.dev_close  = mana_dev_close,
.dev_infos_get  = mana_dev_info_get,
+   .txq_info_get   = mana_dev_tx_queue_info,
.rxq_info_get   = mana_dev_rx_queue_info,
.dev_supported_ptypes_get = mana_supported_ptypes,
.rss_hash_update= mana_rss_hash_update,
.rss_hash_conf_get  = mana_rss_hash_conf_get,
+   .tx_queue_setup = mana_dev_tx_queue_setup,
+   .tx_queue_release   = mana_dev_tx_queue_release,
.rx_queue_setup = mana_dev_rx_queue_setup,
.rx_queue_release   = mana_dev_rx_queue_release,
.link_update= mana_dev_link_update,
-- 
2.17.1



[Patch v8 10/18] net/mana: implement memory registration

2022-09-08 Thread longli
From: Long Li 

MANA hardware has a built-in IOMMU that provides safe hardware access to
user memory through memory registration. Since memory registration is an
expensive operation, this patch implements a two-level memory registration
cache mechanism, one for each queue and one for each port.

Signed-off-by: Long Li 
---
Change log:
v2:
Change all header file functions to start with mana_.
Use spinlock in place of rwlock for memory cache access.
Remove unused header files.
v4:
Remove extra "\n" in logging function.
v8:
Fix coding style of function definitions.

 drivers/net/mana/mana.c  |  20 ++
 drivers/net/mana/mana.h  |  39 
 drivers/net/mana/meson.build |   1 +
 drivers/net/mana/mp.c|  92 +
 drivers/net/mana/mr.c| 348 +++
 5 files changed, 500 insertions(+)
 create mode 100644 drivers/net/mana/mr.c
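
Conceptually, each cache level is a table of registered address ranges
searched by starting address; a simplified sketch of the lookup step (not the
driver's actual data layout, illustration only):

#include <stddef.h>
#include <stdint.h>

struct example_mr {
    uintptr_t addr;    /* start of registered range */
    size_t len;        /* length of registered range */
    uint32_t lkey;     /* key handed to the hardware */
};

/* Binary search a table sorted by addr for an entry covering [addr, addr + len). */
static const struct example_mr *
example_mr_lookup(const struct example_mr *tbl, int n,
                  uintptr_t addr, size_t len)
{
    int lo = 0, hi = n - 1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;

        if (addr < tbl[mid].addr)
            hi = mid - 1;
        else if (addr + len > tbl[mid].addr + tbl[mid].len)
            lo = mid + 1;
        else
            return &tbl[mid];    /* hit: reuse the cached lkey */
    }
    return NULL;    /* miss: register the mbuf's mempool chunk and insert */
}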

diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index bc8238a02b..67bef6bd32 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -111,6 +111,8 @@ mana_dev_close(struct rte_eth_dev *dev)
struct mana_priv *priv = dev->data->dev_private;
int ret;
 
+   mana_remove_all_mr(priv);
+
ret = mana_intr_uninstall(priv);
if (ret)
return ret;
@@ -331,6 +333,13 @@ mana_dev_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
goto fail;
}
 
+   ret = mana_mr_btree_init(&txq->mr_btree,
+MANA_MR_BTREE_PER_QUEUE_N, socket_id);
+   if (ret) {
+   DRV_LOG(ERR, "Failed to init TXQ MR btree");
+   goto fail;
+   }
+
DRV_LOG(DEBUG, "idx %u nb_desc %u socket %u txq->desc_ring %p",
queue_idx, nb_desc, socket_id, txq->desc_ring);
 
@@ -353,6 +362,8 @@ mana_dev_tx_queue_release(struct rte_eth_dev *dev, uint16_t qid)
 {
struct mana_txq *txq = dev->data->tx_queues[qid];
 
+   mana_mr_btree_free(&txq->mr_btree);
+
rte_free(txq->desc_ring);
rte_free(txq);
 }
@@ -389,6 +400,13 @@ mana_dev_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
goto fail;
}
 
+   ret = mana_mr_btree_init(&rxq->mr_btree,
+MANA_MR_BTREE_PER_QUEUE_N, socket_id);
+   if (ret) {
+   DRV_LOG(ERR, "Failed to init RXQ MR btree");
+   goto fail;
+   }
+
rxq->num_desc = nb_desc;
 
rxq->priv = priv;
@@ -409,6 +427,8 @@ mana_dev_rx_queue_release(struct rte_eth_dev *dev, uint16_t qid)
 {
struct mana_rxq *rxq = dev->data->rx_queues[qid];
 
+   mana_mr_btree_free(&rxq->mr_btree);
+
rte_free(rxq->desc_ring);
rte_free(rxq);
 }
diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h
index 04ccdfa0d1..964c30551b 100644
--- a/drivers/net/mana/mana.h
+++ b/drivers/net/mana/mana.h
@@ -49,6 +49,22 @@ struct mana_shared_data {
 #define MAX_RECEIVE_BUFFERS_PER_QUEUE  256
 #define MAX_SEND_BUFFERS_PER_QUEUE 256
 
+struct mana_mr_cache {
+   uint32_tlkey;
+   uintptr_t   addr;
+   size_t  len;
+   void*verb_obj;
+};
+
+#define MANA_MR_BTREE_CACHE_N  512
+struct mana_mr_btree {
+   uint16_tlen;/* Used entries */
+   uint16_tsize;   /* Total entries */
+   int overflow;
+   int socket;
+   struct mana_mr_cache *table;
+};
+
 struct mana_process_priv {
void *db_page;
 };
@@ -81,6 +97,8 @@ struct mana_priv {
int max_recv_sge;
int max_mr;
uint64_t max_mr_size;
+   struct mana_mr_btree mr_btree;
+   rte_spinlock_t  mr_btree_lock;
 };
 
 struct mana_txq_desc {
@@ -130,6 +148,7 @@ struct mana_txq {
uint32_t desc_ring_head, desc_ring_tail;
 
struct mana_stats stats;
+   struct mana_mr_btree mr_btree;
unsigned int socket;
 };
 
@@ -152,6 +171,7 @@ struct mana_rxq {
struct mana_gdma_queue gdma_cq;
 
struct mana_stats stats;
+   struct mana_mr_btree mr_btree;
 
unsigned int socket;
 };
@@ -175,6 +195,24 @@ uint16_t mana_rx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts,
 uint16_t mana_tx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts,
   uint16_t pkts_n);
 
+struct mana_mr_cache *mana_find_pmd_mr(struct mana_mr_btree *local_tree,
+  struct mana_priv *priv,
+  struct rte_mbuf *mbuf);
+int mana_new_pmd_mr(struct mana_mr_btree *local_tree, struct mana_priv *priv,
+   struct rte_mempool *pool);
+void mana_remove_all_mr(struct mana_priv *priv);
+void mana_del_pmd_mr(struct mana_mr_cache *mr);
+
+void mana_mempool_chunk_cb(struct rte_mempool *mp, void *opaque,
+  struct rte_mempool_memhdr *memhdr, unsigned int idx);
+
+struct mana_mr_cache *mana_mr_btree_lookup(struct mana_mr_btree *bt,
+

[Patch v8 11/18] net/mana: implement the hardware layer operations

2022-09-08 Thread longli
From: Long Li 

The hardware layer of MANA understands the device queue and doorbell
formats. Those functions are implemented for use by packet RX/TX code.

Signed-off-by: Long Li 
---
Change log:
v2:
Remove unused header files.
Rename a camel case.
v5:
Use RTE_BIT32() instead of defining a new BIT()
v6:
add rte_rmb() after reading owner bits
v8:
fix coding style of function definitions.
use capital letters for all enum names

 drivers/net/mana/gdma.c  | 301 +++
 drivers/net/mana/mana.h  | 183 +
 drivers/net/mana/meson.build |   1 +
 3 files changed, 485 insertions(+)
 create mode 100644 drivers/net/mana/gdma.c
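
One detail worth calling out: the work queue is treated as a power-of-two ring
of fixed alignment units, so the byte offset of the next WQE is computed with a
mask, as in gdma_get_wqe_pointer() below. A tiny standalone sketch of that
arithmetic (the 32-byte unit size is assumed here for illustration):

#include <stdint.h>

#define EXAMPLE_WQE_UNIT 32u    /* assumed alignment unit size */

/* head counts alignment units; size is the ring size in bytes (power of two) */
static inline uint32_t
example_wqe_offset(uint32_t head, uint32_t size)
{
    return (head * EXAMPLE_WQE_UNIT) & (size - 1);
}

/* e.g. head = 130, size = 4096: (130 * 32) & 4095 = 4160 & 4095 = 64 */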

diff --git a/drivers/net/mana/gdma.c b/drivers/net/mana/gdma.c
new file mode 100644
index 00..3f937d6c93
--- /dev/null
+++ b/drivers/net/mana/gdma.c
@@ -0,0 +1,301 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2022 Microsoft Corporation
+ */
+
+#include 
+#include 
+
+#include "mana.h"
+
+uint8_t *
+gdma_get_wqe_pointer(struct mana_gdma_queue *queue)
+{
+   uint32_t offset_in_bytes =
+   (queue->head * GDMA_WQE_ALIGNMENT_UNIT_SIZE) &
+   (queue->size - 1);
+
+   DRV_LOG(DEBUG, "txq sq_head %u sq_size %u offset_in_bytes %u",
+   queue->head, queue->size, offset_in_bytes);
+
+   if (offset_in_bytes + GDMA_WQE_ALIGNMENT_UNIT_SIZE > queue->size)
+   DRV_LOG(ERR, "fatal error: offset_in_bytes %u too big",
+   offset_in_bytes);
+
+   return ((uint8_t *)queue->buffer) + offset_in_bytes;
+}
+
+static uint32_t
+write_dma_client_oob(uint8_t *work_queue_buffer_pointer,
+const struct gdma_work_request *work_request,
+uint32_t client_oob_size)
+{
+   uint8_t *p = work_queue_buffer_pointer;
+
+   struct gdma_wqe_dma_oob *header = (struct gdma_wqe_dma_oob *)p;
+
+   memset(header, 0, sizeof(struct gdma_wqe_dma_oob));
+   header->num_sgl_entries = work_request->num_sgl_elements;
+   header->inline_client_oob_size_in_dwords =
+   client_oob_size / sizeof(uint32_t);
+   header->client_data_unit = work_request->client_data_unit;
+
+   DRV_LOG(DEBUG, "queue buf %p sgl %u oob_h %u du %u oob_buf %p oob_b %u",
+   work_queue_buffer_pointer, header->num_sgl_entries,
+   header->inline_client_oob_size_in_dwords,
+   header->client_data_unit, work_request->inline_oob_data,
+   work_request->inline_oob_size_in_bytes);
+
+   p += sizeof(struct gdma_wqe_dma_oob);
+   if (work_request->inline_oob_data &&
+   work_request->inline_oob_size_in_bytes > 0) {
+   memcpy(p, work_request->inline_oob_data,
+  work_request->inline_oob_size_in_bytes);
+   if (client_oob_size > work_request->inline_oob_size_in_bytes)
+   memset(p + work_request->inline_oob_size_in_bytes, 0,
+  client_oob_size -
+  work_request->inline_oob_size_in_bytes);
+   }
+
+   return sizeof(struct gdma_wqe_dma_oob) + client_oob_size;
+}
+
+static uint32_t
+write_scatter_gather_list(uint8_t *work_queue_head_pointer,
+ uint8_t *work_queue_end_pointer,
+ uint8_t *work_queue_cur_pointer,
+ struct gdma_work_request *work_request)
+{
+   struct gdma_sgl_element *sge_list;
+   struct gdma_sgl_element dummy_sgl[1];
+   uint8_t *address;
+   uint32_t size;
+   uint32_t num_sge;
+   uint32_t size_to_queue_end;
+   uint32_t sge_list_size;
+
+   DRV_LOG(DEBUG, "work_queue_cur_pointer %p work_request->flags %x",
+   work_queue_cur_pointer, work_request->flags);
+
+   num_sge = work_request->num_sgl_elements;
+   sge_list = work_request->sgl;
+   size_to_queue_end = (uint32_t)(work_queue_end_pointer -
+  work_queue_cur_pointer);
+
+   if (num_sge == 0) {
+   /* Per spec, the case of an empty SGL should be handled as
+* follows to avoid corrupted WQE errors:
+* Write one dummy SGL entry
+* Set the address to 1, leave the rest as 0
+*/
+   dummy_sgl[num_sge].address = 1;
+   dummy_sgl[num_sge].size = 0;
+   dummy_sgl[num_sge].memory_key = 0;
+   num_sge++;
+   sge_list = dummy_sgl;
+   }
+
+   sge_list_size = 0;
+   {
+   address = (uint8_t *)sge_list;
+   size = sizeof(struct gdma_sgl_element) * num_sge;
+   if (size_to_queue_end < size) {
+   memcpy(work_queue_cur_pointer, address,
+  size_to_queue_end);
+   work_queue_cur_pointer = work_queue_head_pointer;
+   address += size_to_queue_end;
+   size -

[Patch v8 12/18] net/mana: add function to start/stop Tx queues

2022-09-08 Thread longli
From: Long Li 

MANA allocates device queues through the IB layer when starting Tx queues.
When the device is stopped, all the queues are unmapped and freed.

Signed-off-by: Long Li 
---
Change log:
v2:
Add prefix mana_ to all function names.
Remove unused header files.
v8:
fix coding style of function definitions.

 doc/guides/nics/features/mana.ini |   1 +
 drivers/net/mana/mana.h   |   4 +
 drivers/net/mana/meson.build  |   1 +
 drivers/net/mana/tx.c | 166 ++
 4 files changed, 172 insertions(+)
 create mode 100644 drivers/net/mana/tx.c

diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini
index a59c21cc10..821443b292 100644
--- a/doc/guides/nics/features/mana.ini
+++ b/doc/guides/nics/features/mana.ini
@@ -7,6 +7,7 @@
 Link status  = P
 Linux= Y
 Multiprocess aware   = Y
+Queue start/stop = Y
 Removal event= Y
 RSS hash = Y
 Speed capabilities   = P
diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h
index 5abebe8e21..6a28f7c261 100644
--- a/drivers/net/mana/mana.h
+++ b/drivers/net/mana/mana.h
@@ -378,6 +378,10 @@ uint16_t mana_tx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts,
 int gdma_poll_completion_queue(struct mana_gdma_queue *cq,
   struct gdma_comp *comp);
 
+int mana_start_tx_queues(struct rte_eth_dev *dev);
+
+int mana_stop_tx_queues(struct rte_eth_dev *dev);
+
 struct mana_mr_cache *mana_find_pmd_mr(struct mana_mr_btree *local_tree,
   struct mana_priv *priv,
   struct rte_mbuf *mbuf);
diff --git a/drivers/net/mana/meson.build b/drivers/net/mana/meson.build
index 364d57a619..031f443d16 100644
--- a/drivers/net/mana/meson.build
+++ b/drivers/net/mana/meson.build
@@ -11,6 +11,7 @@ deps += ['pci', 'bus_pci', 'net', 'eal', 'kvargs']
 
 sources += files(
'mana.c',
+   'tx.c',
'mr.c',
'gdma.c',
'mp.c',
diff --git a/drivers/net/mana/tx.c b/drivers/net/mana/tx.c
new file mode 100644
index 00..e4ff0fbf56
--- /dev/null
+++ b/drivers/net/mana/tx.c
@@ -0,0 +1,166 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2022 Microsoft Corporation
+ */
+
+#include 
+
+#include 
+#include 
+
+#include "mana.h"
+
+int
+mana_stop_tx_queues(struct rte_eth_dev *dev)
+{
+   struct mana_priv *priv = dev->data->dev_private;
+   int i, ret;
+
+   for (i = 0; i < priv->num_queues; i++) {
+   struct mana_txq *txq = dev->data->tx_queues[i];
+
+   if (txq->qp) {
+   ret = ibv_destroy_qp(txq->qp);
+   if (ret)
+   DRV_LOG(ERR, "tx_queue destroy_qp failed %d",
+   ret);
+   txq->qp = NULL;
+   }
+
+   if (txq->cq) {
+   ret = ibv_destroy_cq(txq->cq);
+   if (ret)
+   DRV_LOG(ERR, "tx_queue destroy_cp failed %d",
+   ret);
+   txq->cq = NULL;
+   }
+
+   /* Drain and free posted WQEs */
+   while (txq->desc_ring_tail != txq->desc_ring_head) {
+   struct mana_txq_desc *desc =
+   &txq->desc_ring[txq->desc_ring_tail];
+
+   rte_pktmbuf_free(desc->pkt);
+
+   txq->desc_ring_tail =
+   (txq->desc_ring_tail + 1) % txq->num_desc;
+   }
+   txq->desc_ring_head = 0;
+   txq->desc_ring_tail = 0;
+
+   memset(&txq->gdma_sq, 0, sizeof(txq->gdma_sq));
+   memset(&txq->gdma_cq, 0, sizeof(txq->gdma_cq));
+   }
+
+   return 0;
+}
+
+int
+mana_start_tx_queues(struct rte_eth_dev *dev)
+{
+   struct mana_priv *priv = dev->data->dev_private;
+   int ret, i;
+
+   /* start TX queues */
+   for (i = 0; i < priv->num_queues; i++) {
+   struct mana_txq *txq;
+   struct ibv_qp_init_attr qp_attr = { 0 };
+   struct manadv_obj obj = {};
+   struct manadv_qp dv_qp;
+   struct manadv_cq dv_cq;
+
+   txq = dev->data->tx_queues[i];
+
+   manadv_set_context_attr(priv->ib_ctx,
+   MANADV_CTX_ATTR_BUF_ALLOCATORS,
+   (void *)((uintptr_t)&(struct manadv_ctx_allocators){
+   .alloc = &mana_alloc_verbs_buf,
+   .free = &mana_free_verbs_buf,
+   .data = (void *)(uintptr_t)txq->socket,
+   }));
+
+   txq->cq = ibv_create_cq(priv->ib_ctx, txq->num_desc,
+   NULL, NULL, 0);
+   if (!txq->cq) {
+   DRV_LOG(ERR, "failed to create cq queue inde

[Patch v8 13/18] net/mana: add function to start/stop Rx queues

2022-09-08 Thread longli
From: Long Li 

MANA allocates device queues through the IB layer when starting Rx queues.
When the device is stopped, all the queues are unmapped and freed.

Signed-off-by: Long Li 
---
Change log:
v2:
Add prefix mana_ to all function names.
Remove unused header files.
v4:
Move definition "uint32_t i" from inside "for ()" to outside
v8:
Fix coding style of function definitions.

 drivers/net/mana/mana.h  |   3 +
 drivers/net/mana/meson.build |   1 +
 drivers/net/mana/rx.c| 354 +++
 3 files changed, 358 insertions(+)
 create mode 100644 drivers/net/mana/rx.c

diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h
index 6a28f7c261..27fff3 100644
--- a/drivers/net/mana/mana.h
+++ b/drivers/net/mana/mana.h
@@ -363,6 +363,7 @@ extern int mana_logtype_init;
 
 int mana_ring_doorbell(void *db_page, enum gdma_queue_types queue_type,
   uint32_t queue_id, uint32_t tail);
+int mana_rq_ring_doorbell(struct mana_rxq *rxq);
 
 int gdma_post_work_request(struct mana_gdma_queue *queue,
   struct gdma_work_request *work_req,
@@ -378,8 +379,10 @@ uint16_t mana_tx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts,
 int gdma_poll_completion_queue(struct mana_gdma_queue *cq,
   struct gdma_comp *comp);
 
+int mana_start_rx_queues(struct rte_eth_dev *dev);
 int mana_start_tx_queues(struct rte_eth_dev *dev);
 
+int mana_stop_rx_queues(struct rte_eth_dev *dev);
 int mana_stop_tx_queues(struct rte_eth_dev *dev);
 
 struct mana_mr_cache *mana_find_pmd_mr(struct mana_mr_btree *local_tree,
diff --git a/drivers/net/mana/meson.build b/drivers/net/mana/meson.build
index 031f443d16..62e103a510 100644
--- a/drivers/net/mana/meson.build
+++ b/drivers/net/mana/meson.build
@@ -11,6 +11,7 @@ deps += ['pci', 'bus_pci', 'net', 'eal', 'kvargs']
 
 sources += files(
'mana.c',
+   'rx.c',
'tx.c',
'mr.c',
'gdma.c',
diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c
new file mode 100644
index 00..968e50686d
--- /dev/null
+++ b/drivers/net/mana/rx.c
@@ -0,0 +1,354 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2022 Microsoft Corporation
+ */
+#include 
+
+#include 
+#include 
+
+#include "mana.h"
+
+static uint8_t mana_rss_hash_key_default[TOEPLITZ_HASH_KEY_SIZE_IN_BYTES] = {
+   0x2c, 0xc6, 0x81, 0xd1,
+   0x5b, 0xdb, 0xf4, 0xf7,
+   0xfc, 0xa2, 0x83, 0x19,
+   0xdb, 0x1a, 0x3e, 0x94,
+   0x6b, 0x9e, 0x38, 0xd9,
+   0x2c, 0x9c, 0x03, 0xd1,
+   0xad, 0x99, 0x44, 0xa7,
+   0xd9, 0x56, 0x3d, 0x59,
+   0x06, 0x3c, 0x25, 0xf3,
+   0xfc, 0x1f, 0xdc, 0x2a,
+};
+
+int
+mana_rq_ring_doorbell(struct mana_rxq *rxq)
+{
+   struct mana_priv *priv = rxq->priv;
+   int ret;
+   void *db_page = priv->db_page;
+
+   if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+   struct rte_eth_dev *dev =
+   &rte_eth_devices[priv->dev_data->port_id];
+   struct mana_process_priv *process_priv = dev->process_private;
+
+   db_page = process_priv->db_page;
+   }
+
+   ret = mana_ring_doorbell(db_page, GDMA_QUEUE_RECEIVE,
+rxq->gdma_rq.id,
+rxq->gdma_rq.head *
+   GDMA_WQE_ALIGNMENT_UNIT_SIZE);
+
+   if (ret)
+   DRV_LOG(ERR, "failed to ring RX doorbell ret %d", ret);
+
+   return ret;
+}
+
+static int
+mana_alloc_and_post_rx_wqe(struct mana_rxq *rxq)
+{
+   struct rte_mbuf *mbuf = NULL;
+   struct gdma_sgl_element sgl[1];
+   struct gdma_work_request request = {0};
+   struct gdma_posted_wqe_info wqe_info = {0};
+   struct mana_priv *priv = rxq->priv;
+   int ret;
+   struct mana_mr_cache *mr;
+
+   mbuf = rte_pktmbuf_alloc(rxq->mp);
+   if (!mbuf) {
+   rxq->stats.nombuf++;
+   return -ENOMEM;
+   }
+
+   mr = mana_find_pmd_mr(&rxq->mr_btree, priv, mbuf);
+   if (!mr) {
+   DRV_LOG(ERR, "failed to register RX MR");
+   rte_pktmbuf_free(mbuf);
+   return -ENOMEM;
+   }
+
+   request.gdma_header.struct_size = sizeof(request);
+   wqe_info.gdma_header.struct_size = sizeof(wqe_info);
+
+   sgl[0].address = rte_cpu_to_le_64(rte_pktmbuf_mtod(mbuf, uint64_t));
+   sgl[0].memory_key = mr->lkey;
+   sgl[0].size =
+   rte_pktmbuf_data_room_size(rxq->mp) -
+   RTE_PKTMBUF_HEADROOM;
+
+   request.sgl = sgl;
+   request.num_sgl_elements = 1;
+   request.inline_oob_data = NULL;
+   request.inline_oob_size_in_bytes = 0;
+   request.flags = 0;
+   request.client_data_unit = NOT_USING_CLIENT_DATA_UNIT;
+
+   ret = gdma_post_work_request(&rxq->gdma_rq, &request, &wqe_info);
+   if (!ret) {
+   struct mana_rxq_desc *desc =
+   &rxq->desc_ring[rxq->desc_ri

[Patch v8 14/18] net/mana: add function to receive packets

2022-09-08 Thread longli
From: Long Li 

With all the RX queues created, MANA can use those queues to receive
packets.

Signed-off-by: Long Li 
---
Change log:
v2:
Add mana_ to all function names.
Rename a camel case.
v8:
Fix coding style of function definitions.

 doc/guides/nics/features/mana.ini |   2 +
 drivers/net/mana/mana.c   |   2 +
 drivers/net/mana/mana.h   |  37 +++
 drivers/net/mana/mp.c |   2 +
 drivers/net/mana/rx.c | 105 ++
 5 files changed, 148 insertions(+)
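
On the application side this is consumed through the usual polling loop; a
minimal sketch (burst size illustrative). The checksum results land in each
mbuf's ol_flags and the RSS hash in hash.rss, matching the offload and RSS
features advertised in the .ini:

#include <rte_common.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

static void
example_poll_rx(uint16_t port_id, uint16_t queue_id)
{
    struct rte_mbuf *pkts[32];
    uint16_t i, n;

    n = rte_eth_rx_burst(port_id, queue_id, pkts, RTE_DIM(pkts));
    for (i = 0; i < n; i++) {
        /* inspect pkts[i]->ol_flags / pkts[i]->hash.rss here */
        rte_pktmbuf_free(pkts[i]);
    }
}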

diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini
index 821443b292..fdbf22d335 100644
--- a/doc/guides/nics/features/mana.ini
+++ b/doc/guides/nics/features/mana.ini
@@ -6,6 +6,8 @@
 [Features]
 Link status  = P
 Linux= Y
+L3 checksum offload  = Y
+L4 checksum offload  = Y
 Multiprocess aware   = Y
 Queue start/stop = Y
 Removal event= Y
diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 67bef6bd32..7ed6063cc3 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -990,6 +990,8 @@ mana_pci_probe_mac(struct rte_pci_device *pci_dev,
/* fd is not used after mapping doorbell */
close(fd);
 
+   eth_dev->rx_pkt_burst = mana_rx_burst;
+
rte_spinlock_lock(&mana_shared_data->lock);
mana_shared_data->secondary_cnt++;
mana_local_data.secondary_cnt++;
diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h
index 27fff3..c2ffa14009 100644
--- a/drivers/net/mana/mana.h
+++ b/drivers/net/mana/mana.h
@@ -177,6 +177,11 @@ struct gdma_work_request {
 
 enum mana_cqe_type {
CQE_INVALID = 0,
+
+   CQE_RX_OKAY = 1,
+   CQE_RX_COALESCED_4  = 2,
+   CQE_RX_OBJECT_FENCE = 3,
+   CQE_RX_TRUNCATED= 4,
 };
 
 struct mana_cqe_header {
@@ -202,6 +207,35 @@ struct mana_cqe_header {
(NDIS_HASH_TCP_IPV4 | NDIS_HASH_UDP_IPV4 | NDIS_HASH_TCP_IPV6 |  \
 NDIS_HASH_UDP_IPV6 | NDIS_HASH_TCP_IPV6_EX | NDIS_HASH_UDP_IPV6_EX)
 
+struct mana_rx_comp_per_packet_info {
+   uint32_t packet_length  : 16;
+   uint32_t reserved0  : 16;
+   uint32_t reserved1;
+   uint32_t packet_hash;
+}; /* HW DATA */
+#define RX_COM_OOB_NUM_PACKETINFO_SEGMENTS 4
+
+struct mana_rx_comp_oob {
+   struct mana_cqe_header cqe_hdr;
+
+   uint32_t rx_vlan_id : 12;
+   uint32_t rx_vlan_tag_present: 1;
+   uint32_t rx_outer_ip_header_checksum_succeeded  : 1;
+   uint32_t rx_outer_ip_header_checksum_failed : 1;
+   uint32_t reserved   : 1;
+   uint32_t rx_hash_type   : 9;
+   uint32_t rx_ip_header_checksum_succeeded: 1;
+   uint32_t rx_ip_header_checksum_failed   : 1;
+   uint32_t rx_tcp_checksum_succeeded  : 1;
+   uint32_t rx_tcp_checksum_failed : 1;
+   uint32_t rx_udp_checksum_succeeded  : 1;
+   uint32_t rx_udp_checksum_failed : 1;
+   uint32_t reserved1  : 1;
+   struct mana_rx_comp_per_packet_info
+   packet_info[RX_COM_OOB_NUM_PACKETINFO_SEGMENTS];
+   uint32_t received_wqe_offset;
+}; /* HW DATA */
+
 struct gdma_wqe_dma_oob {
uint32_t reserved:24;
uint32_t last_v_bytes:8;
@@ -370,6 +404,9 @@ int gdma_post_work_request(struct mana_gdma_queue *queue,
   struct gdma_posted_wqe_info *wqe_info);
 uint8_t *gdma_get_wqe_pointer(struct mana_gdma_queue *queue);
 
+uint16_t mana_rx_burst(void *dpdk_rxq, struct rte_mbuf **rx_pkts,
+  uint16_t pkts_n);
+
 uint16_t mana_rx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts,
   uint16_t pkts_n);
 
diff --git a/drivers/net/mana/mp.c b/drivers/net/mana/mp.c
index a3b5ede559..feda30623a 100644
--- a/drivers/net/mana/mp.c
+++ b/drivers/net/mana/mp.c
@@ -141,6 +141,8 @@ mana_mp_secondary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
case MANA_MP_REQ_START_RXTX:
DRV_LOG(INFO, "Port %u starting datapath", dev->data->port_id);
 
+   dev->rx_pkt_burst = mana_rx_burst;
+
rte_mb();
 
res->result = 0;
diff --git a/drivers/net/mana/rx.c b/drivers/net/mana/rx.c
index 968e50686d..b80a5d1c7a 100644
--- a/drivers/net/mana/rx.c
+++ b/drivers/net/mana/rx.c
@@ -352,3 +352,108 @@ mana_start_rx_queues(struct rte_eth_dev *dev)
mana_stop_rx_queues(dev);
return ret;
 }
+
+uint16_t
+mana_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
+{
+   uint16_t pkt_received = 0, cqe_processed = 0;
+   struct mana_rxq *rxq 

[Patch v8 15/18] net/mana: add function to send packets

2022-09-08 Thread longli
From: Long Li 

With all the TX queues created, MANA can send packets over those queues.

Signed-off-by: Long Li 
---
Change log:
v2: rename all camel cases.
v7: return the correct number of packets sent
v8:
fix coding style of function definitions.
change enum names to use capital letters.

 doc/guides/nics/features/mana.ini |   1 +
 drivers/net/mana/mana.c   |   1 +
 drivers/net/mana/mana.h   |  65 
 drivers/net/mana/mp.c |   1 +
 drivers/net/mana/tx.c | 248 ++
 5 files changed, 316 insertions(+)
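
The transmit side mirrors it; a minimal application sketch (illustrative
only). Note that mbufs the PMD does not accept in this call remain owned by
the caller:

#include <rte_ethdev.h>
#include <rte_mbuf.h>

static void
example_send(uint16_t port_id, uint16_t queue_id,
             struct rte_mbuf **pkts, uint16_t nb)
{
    uint16_t sent = rte_eth_tx_burst(port_id, queue_id, pkts, nb);

    while (sent < nb)
        rte_pktmbuf_free(pkts[sent++]);    /* drop what was not queued */
}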

diff --git a/doc/guides/nics/features/mana.ini b/doc/guides/nics/features/mana.ini
index fdbf22d335..7922816d66 100644
--- a/doc/guides/nics/features/mana.ini
+++ b/doc/guides/nics/features/mana.ini
@@ -4,6 +4,7 @@
 ; Refer to default.ini for the full list of available PMD features.
 ;
 [Features]
+Free Tx mbuf on demand = Y
 Link status  = P
 Linux= Y
 L3 checksum offload  = Y
diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 7ed6063cc3..92692037b1 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -990,6 +990,7 @@ mana_pci_probe_mac(struct rte_pci_device *pci_dev,
/* fd is not used after mapping doorbell */
close(fd);
 
+   eth_dev->tx_pkt_burst = mana_tx_burst;
eth_dev->rx_pkt_burst = mana_rx_burst;
 
rte_spinlock_lock(&mana_shared_data->lock);
diff --git a/drivers/net/mana/mana.h b/drivers/net/mana/mana.h
index c2ffa14009..83e3be0d6d 100644
--- a/drivers/net/mana/mana.h
+++ b/drivers/net/mana/mana.h
@@ -61,6 +61,47 @@ struct mana_shared_data {
 
 #define NOT_USING_CLIENT_DATA_UNIT 0
 
+enum tx_packet_format_v2 {
+   SHORT_PACKET_FORMAT = 0,
+   LONG_PACKET_FORMAT = 1
+};
+
+struct transmit_short_oob_v2 {
+   enum tx_packet_format_v2 packet_format : 2;
+   uint32_t tx_is_outer_ipv4 : 1;
+   uint32_t tx_is_outer_ipv6 : 1;
+   uint32_t tx_compute_IP_header_checksum : 1;
+   uint32_t tx_compute_TCP_checksum : 1;
+   uint32_t tx_compute_UDP_checksum : 1;
+   uint32_t suppress_tx_CQE_generation : 1;
+   uint32_t VCQ_number : 24;
+   uint32_t tx_transport_header_offset : 10;
+   uint32_t VSQ_frame_num : 14;
+   uint32_t short_vport_offset : 8;
+};
+
+struct transmit_long_oob_v2 {
+   uint32_t tx_is_encapsulated_packet : 1;
+   uint32_t tx_inner_is_ipv6 : 1;
+   uint32_t tx_inner_TCP_options_present : 1;
+   uint32_t inject_vlan_prior_tag : 1;
+   uint32_t reserved1 : 12;
+   uint32_t priority_code_point : 3;
+   uint32_t drop_eligible_indicator : 1;
+   uint32_t vlan_identifier : 12;
+   uint32_t tx_inner_frame_offset : 10;
+   uint32_t tx_inner_IP_header_relative_offset : 6;
+   uint32_t long_vport_offset : 12;
+   uint32_t reserved3 : 4;
+   uint32_t reserved4 : 32;
+   uint32_t reserved5 : 32;
+};
+
+struct transmit_oob_v2 {
+   struct transmit_short_oob_v2 short_oob;
+   struct transmit_long_oob_v2 long_oob;
+};
+
 enum gdma_queue_types {
GDMA_QUEUE_TYPE_INVALID  = 0,
GDMA_QUEUE_SEND,
@@ -182,6 +223,17 @@ enum mana_cqe_type {
CQE_RX_COALESCED_4  = 2,
CQE_RX_OBJECT_FENCE = 3,
CQE_RX_TRUNCATED= 4,
+
+   CQE_TX_OKAY = 32,
+   CQE_TX_SA_DROP  = 33,
+   CQE_TX_MTU_DROP = 34,
+   CQE_TX_INVALID_OOB  = 35,
+   CQE_TX_INVALID_ETH_TYPE = 36,
+   CQE_TX_HDR_PROCESSING_ERROR = 37,
+   CQE_TX_VF_DISABLED  = 38,
+   CQE_TX_VPORT_IDX_OUT_OF_RANGE   = 39,
+   CQE_TX_VPORT_DISABLED   = 40,
+   CQE_TX_VLAN_TAGGING_VIOLATION   = 41,
 };
 
 struct mana_cqe_header {
@@ -190,6 +242,17 @@ struct mana_cqe_header {
uint32_t vendor_err  : 24;
 }; /* HW DATA */
 
+struct mana_tx_comp_oob {
+   struct mana_cqe_header cqe_hdr;
+
+   uint32_t tx_data_offset;
+
+   uint32_t tx_sgl_offset   : 5;
+   uint32_t tx_wqe_offset   : 27;
+
+   uint32_t reserved[12];
+}; /* HW DATA */
+
 /* NDIS HASH Types */
 #define BIT(nr)(1 << (nr))
 #define NDIS_HASH_IPV4  BIT(0)
@@ -406,6 +469,8 @@ uint8_t *gdma_get_wqe_pointer(struct mana_gdma_queue *queue);
 
 uint16_t mana_rx_burst(void *dpdk_rxq, struct rte_mbuf **rx_pkts,
   uint16_t pkts_n);
+uint16_t mana_tx_burst(void *dpdk_txq, struct rte_mbuf **tx_pkts,
+  uint16_t pkts_n);
 
 uint16_t mana_rx_burst_removed(void *dpdk_rxq, struct rte_mbuf **pkts,
   uint16_t pkts_n);
diff --git a/drivers/net/mana/mp.c b/drivers/net/mana/mp.c
index feda30623a..92432c431d 100644
--- a/drivers/net/mana/mp.c
+++ b/drivers/net/mana/mp.c
@@ -141,6 +141,7 @@ mana_mp_secondary_han

[Patch v8 16/18] net/mana: add function to start/stop device

2022-09-08 Thread longli
From: Long Li 

Add support for starting/stopping the device.

Signed-off-by: Long Li 
---
Change log:
v2:
Use spinlock for memory registration cache.
Add prefix mana_ to all function names.
v6:
Roll back device state on error in mana_dev_start()

 drivers/net/mana/mana.c | 77 +
 1 file changed, 77 insertions(+)
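
From the application's point of view this is driven by the standard start/stop
calls, after configure and queue setup have run; a minimal sketch
(illustrative only):

#include <rte_ethdev.h>

static int
example_bring_up(uint16_t port_id)
{
    int ret;

    /* rte_eth_dev_configure() and the queue setups must already have run */
    ret = rte_eth_dev_start(port_id);
    if (ret != 0)
        return ret;

    /* ... packet I/O ... */

    return rte_eth_dev_stop(port_id);
}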

diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 92692037b1..63937410b8 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -105,6 +105,81 @@ mana_dev_configure(struct rte_eth_dev *dev)
 
 static int mana_intr_uninstall(struct mana_priv *priv);
 
+static int
+mana_dev_start(struct rte_eth_dev *dev)
+{
+   int ret;
+   struct mana_priv *priv = dev->data->dev_private;
+
+   rte_spinlock_init(&priv->mr_btree_lock);
+   ret = mana_mr_btree_init(&priv->mr_btree, MANA_MR_BTREE_CACHE_N,
+dev->device->numa_node);
+   if (ret) {
+   DRV_LOG(ERR, "Failed to init device MR btree %d", ret);
+   return ret;
+   }
+
+   ret = mana_start_tx_queues(dev);
+   if (ret) {
+   DRV_LOG(ERR, "failed to start tx queues %d", ret);
+   goto failed_tx;
+   }
+
+   ret = mana_start_rx_queues(dev);
+   if (ret) {
+   DRV_LOG(ERR, "failed to start rx queues %d", ret);
+   goto failed_rx;
+   }
+
+   rte_wmb();
+
+   dev->tx_pkt_burst = mana_tx_burst;
+   dev->rx_pkt_burst = mana_rx_burst;
+
+   DRV_LOG(INFO, "TX/RX queues have started");
+
+   /* Enable datapath for secondary processes */
+   mana_mp_req_on_rxtx(dev, MANA_MP_REQ_START_RXTX);
+
+   return 0;
+
+failed_rx:
+   mana_stop_tx_queues(dev);
+
+failed_tx:
+   mana_mr_btree_free(&priv->mr_btree);
+
+   return ret;
+}
+
+static int
+mana_dev_stop(struct rte_eth_dev *dev __rte_unused)
+{
+   int ret;
+
+   dev->tx_pkt_burst = mana_tx_burst_removed;
+   dev->rx_pkt_burst = mana_rx_burst_removed;
+
+   /* Stop datapath on secondary processes */
+   mana_mp_req_on_rxtx(dev, MANA_MP_REQ_STOP_RXTX);
+
+   rte_wmb();
+
+   ret = mana_stop_tx_queues(dev);
+   if (ret) {
+   DRV_LOG(ERR, "failed to stop tx queues");
+   return ret;
+   }
+
+   ret = mana_stop_rx_queues(dev);
+   if (ret) {
+   DRV_LOG(ERR, "failed to stop rx queues");
+   return ret;
+   }
+
+   return 0;
+}
+
 static int
 mana_dev_close(struct rte_eth_dev *dev)
 {
@@ -452,6 +527,8 @@ mana_dev_link_update(struct rte_eth_dev *dev,
 
 static const struct eth_dev_ops mana_dev_ops = {
.dev_configure  = mana_dev_configure,
+   .dev_start  = mana_dev_start,
+   .dev_stop   = mana_dev_stop,
.dev_close  = mana_dev_close,
.dev_infos_get  = mana_dev_info_get,
.txq_info_get   = mana_dev_tx_queue_info,
-- 
2.17.1



[Patch v8 17/18] net/mana: add function to report queue stats

2022-09-08 Thread longli
From: Long Li 

Report packet statistics.

Signed-off-by: Long Li 
---
Change log:
v5:
Fixed calculation of stats packets/bytes/errors by adding them over the queue 
stats.
v8:
Fixed coding style on function definitions.

 doc/guides/nics/features/mana.ini |  1 +
 drivers/net/mana/mana.c   | 77 +++
 2 files changed, 78 insertions(+)

diff --git a/doc/guides/nics/features/mana.ini 
b/doc/guides/nics/features/mana.ini
index 7922816d66..81ebc9c365 100644
--- a/doc/guides/nics/features/mana.ini
+++ b/doc/guides/nics/features/mana.ini
@@ -4,6 +4,7 @@
 ; Refer to default.ini for the full list of available PMD features.
 ;
 [Features]
+Basic stats  = Y
 Free Tx mbuf on demand = Y
 Link status  = P
 Linux= Y
diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 63937410b8..70695d215d 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -525,6 +525,79 @@ mana_dev_link_update(struct rte_eth_dev *dev,
return rte_eth_linkstatus_set(dev, &link);
 }
 
+static int
+mana_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+   unsigned int i;
+
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   struct mana_txq *txq = dev->data->tx_queues[i];
+
+   if (!txq)
+   continue;
+
+   stats->opackets += txq->stats.packets;
+   stats->obytes += txq->stats.bytes;
+   stats->oerrors += txq->stats.errors;
+
+   if (i < RTE_ETHDEV_QUEUE_STAT_CNTRS) {
+   stats->q_opackets[i] = txq->stats.packets;
+   stats->q_obytes[i] = txq->stats.bytes;
+   }
+   }
+
+   stats->rx_nombuf = 0;
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   struct mana_rxq *rxq = dev->data->rx_queues[i];
+
+   if (!rxq)
+   continue;
+
+   stats->ipackets += rxq->stats.packets;
+   stats->ibytes += rxq->stats.bytes;
+   stats->ierrors += rxq->stats.errors;
+
+   /* There is no good way to get stats->imissed, not setting it */
+
+   if (i < RTE_ETHDEV_QUEUE_STAT_CNTRS) {
+   stats->q_ipackets[i] = rxq->stats.packets;
+   stats->q_ibytes[i] = rxq->stats.bytes;
+   }
+
+   stats->rx_nombuf += rxq->stats.nombuf;
+   }
+
+   return 0;
+}
+
+static int
+mana_dev_stats_reset(struct rte_eth_dev *dev __rte_unused)
+{
+   unsigned int i;
+
+   PMD_INIT_FUNC_TRACE();
+
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   struct mana_txq *txq = dev->data->tx_queues[i];
+
+   if (!txq)
+   continue;
+
+   memset(&txq->stats, 0, sizeof(txq->stats));
+   }
+
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   struct mana_rxq *rxq = dev->data->rx_queues[i];
+
+   if (!rxq)
+   continue;
+
+   memset(&rxq->stats, 0, sizeof(rxq->stats));
+   }
+
+   return 0;
+}
+
 static const struct eth_dev_ops mana_dev_ops = {
.dev_configure  = mana_dev_configure,
.dev_start  = mana_dev_start,
@@ -541,9 +614,13 @@ static const struct eth_dev_ops mana_dev_ops = {
.rx_queue_setup = mana_dev_rx_queue_setup,
.rx_queue_release   = mana_dev_rx_queue_release,
.link_update= mana_dev_link_update,
+   .stats_get  = mana_dev_stats_get,
+   .stats_reset= mana_dev_stats_reset,
 };
 
 static const struct eth_dev_ops mana_dev_secondary_ops = {
+   .stats_get = mana_dev_stats_get,
+   .stats_reset = mana_dev_stats_reset,
.dev_infos_get = mana_dev_info_get,
 };
 
-- 
2.17.1



[Patch v8 18/18] net/mana: add function to support Rx interrupts

2022-09-08 Thread longli
From: Long Li 

mana can receive Rx interrupts from the kernel through the RDMA verbs
interface. Implement Rx interrupt support in the driver.
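
For reference, a minimal app-side sketch of how such Rx interrupts are
typically consumed via the generic ethdev interrupt API (not part of this
patch; assumes the port was configured with intr_conf.rxq = 1, and error
handling is omitted):

/* Register the queue's Rx interrupt with the per-thread epoll fd, arm it,
 * sleep until traffic arrives (or a 10 ms timeout), then disarm.
 */
#include <rte_ethdev.h>
#include <rte_interrupts.h>

static void
sleep_until_rx(uint16_t port_id, uint16_t queue_id)
{
        struct rte_epoll_event ev;

        rte_eth_dev_rx_intr_ctl_q(port_id, queue_id, RTE_EPOLL_PER_THREAD,
                                  RTE_INTR_EVENT_ADD, NULL);

        rte_eth_dev_rx_intr_enable(port_id, queue_id);
        rte_epoll_wait(RTE_EPOLL_PER_THREAD, &ev, 1, 10 /* ms */);
        rte_eth_dev_rx_intr_disable(port_id, queue_id);
}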

Signed-off-by: Long Li 
---
Change log:
v5:
New patch added to the series
v8:
Fix coding style on function definitions.

 doc/guides/nics/features/mana.ini |   1 +
 drivers/net/mana/gdma.c   |  10 +--
 drivers/net/mana/mana.c   | 128 ++
 drivers/net/mana/mana.h   |   9 ++-
 drivers/net/mana/rx.c |  94 +++---
 drivers/net/mana/tx.c |   3 +-
 6 files changed, 211 insertions(+), 34 deletions(-)

diff --git a/doc/guides/nics/features/mana.ini 
b/doc/guides/nics/features/mana.ini
index 81ebc9c365..5fb62ea85d 100644
--- a/doc/guides/nics/features/mana.ini
+++ b/doc/guides/nics/features/mana.ini
@@ -14,6 +14,7 @@ Multiprocess aware   = Y
 Queue start/stop = Y
 Removal event= Y
 RSS hash = Y
+Rx interrupt = Y
 Speed capabilities   = P
 Usage doc= Y
 x86-64   = Y
diff --git a/drivers/net/mana/gdma.c b/drivers/net/mana/gdma.c
index 3f937d6c93..c67c5af2f9 100644
--- a/drivers/net/mana/gdma.c
+++ b/drivers/net/mana/gdma.c
@@ -213,7 +213,7 @@ union gdma_doorbell_entry {
  */
 int
 mana_ring_doorbell(void *db_page, enum gdma_queue_types queue_type,
-  uint32_t queue_id, uint32_t tail)
+  uint32_t queue_id, uint32_t tail, uint8_t arm)
 {
uint8_t *addr = db_page;
union gdma_doorbell_entry e = {};
@@ -228,14 +228,14 @@ mana_ring_doorbell(void *db_page, enum gdma_queue_types 
queue_type,
case GDMA_QUEUE_RECEIVE:
e.rq.id = queue_id;
e.rq.tail_ptr = tail;
-   e.rq.wqe_cnt = 1;
+   e.rq.wqe_cnt = arm;
addr += DOORBELL_OFFSET_RQ;
break;
 
case GDMA_QUEUE_COMPLETION:
e.cq.id = queue_id;
e.cq.tail_ptr = tail;
-   e.cq.arm = 1;
+   e.cq.arm = arm;
addr += DOORBELL_OFFSET_CQ;
break;
 
@@ -247,8 +247,8 @@ mana_ring_doorbell(void *db_page, enum gdma_queue_types 
queue_type,
/* Ensure all writes are done before ringing doorbell */
rte_wmb();
 
-   DRV_LOG(DEBUG, "db_page %p addr %p queue_id %u type %u tail %u",
-   db_page, addr, queue_id, queue_type, tail);
+   DRV_LOG(DEBUG, "db_page %p addr %p queue_id %u type %u tail %u arm %u",
+   db_page, addr, queue_id, queue_type, tail, arm);
 
rte_write64(e.as_uint64, addr);
return 0;
diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c
index 70695d215d..8bfccaf013 100644
--- a/drivers/net/mana/mana.c
+++ b/drivers/net/mana/mana.c
@@ -103,7 +103,72 @@ mana_dev_configure(struct rte_eth_dev *dev)
return 0;
 }
 
-static int mana_intr_uninstall(struct mana_priv *priv);
+static void
+rx_intr_vec_disable(struct mana_priv *priv)
+{
+   struct rte_intr_handle *intr_handle = priv->intr_handle;
+
+   rte_intr_free_epoll_fd(intr_handle);
+   rte_intr_vec_list_free(intr_handle);
+   rte_intr_nb_efd_set(intr_handle, 0);
+}
+
+static int
+rx_intr_vec_enable(struct mana_priv *priv)
+{
+   unsigned int i;
+   unsigned int rxqs_n = priv->dev_data->nb_rx_queues;
+   unsigned int n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+   struct rte_intr_handle *intr_handle = priv->intr_handle;
+   int ret;
+
+   rx_intr_vec_disable(priv);
+
+   if (rte_intr_vec_list_alloc(intr_handle, NULL, n)) {
+   DRV_LOG(ERR, "Failed to allocate memory for interrupt vector");
+   return -ENOMEM;
+   }
+
+   for (i = 0; i < n; i++) {
+   struct mana_rxq *rxq = priv->dev_data->rx_queues[i];
+
+   ret = rte_intr_vec_list_index_set(intr_handle, i,
+ RTE_INTR_VEC_RXTX_OFFSET + i);
+   if (ret) {
+   DRV_LOG(ERR, "Failed to set intr vec %u", i);
+   return ret;
+   }
+
+   ret = rte_intr_efds_index_set(intr_handle, i, rxq->channel->fd);
+   if (ret) {
+   DRV_LOG(ERR, "Failed to set FD at intr %u", i);
+   return ret;
+   }
+   }
+
+   return rte_intr_nb_efd_set(intr_handle, n);
+}
+
+static void
+rxq_intr_disable(struct mana_priv *priv)
+{
+   int err = rte_errno;
+
+   rx_intr_vec_disable(priv);
+   rte_errno = err;
+}
+
+static int
+rxq_intr_enable(struct mana_priv *priv)
+{
+   const struct rte_eth_intr_conf *const intr_conf =
+   &priv->dev_data->dev_conf.intr_conf;
+
+   if (!intr_conf->rxq)
+   return 0;
+
+   return rx_intr_vec_enable(priv);
+}
 
 static int
 mana_dev_start(struct rte_eth_dev *dev)
@@ -141,8 +206,17 @@ mana_dev_start(struct rte_eth_dev *dev)
/* Enable datapath for secon

RE: [PATCH v2 1/8] vdpa/ifc: add new device ID

2022-09-08 Thread Xia, Chenbo
Hi Andy,

> -Original Message-
> From: Pei, Andy 
> Sent: Thursday, September 8, 2022 1:54 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Xu, Rosen ;
> Huang, Wei ; Cao, Gang ;
> maxime.coque...@redhat.com; Huang Wei 
> Subject: [PATCH v2 1/8] vdpa/ifc: add new device ID

Title could be: add new device ID for legacy network device

> 
> From: Huang Wei 
> 
> Add new device id to support IFCVF_NET_TRANSITIONAL_DEVICE_ID (0x1000).
> 
> Signed-off-by: Huang Wei 
> Signed-off-by: Andy Pei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.h | 4 +++-
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 9 -
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
> index 9d95aac..7ede738 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.h
> +++ b/drivers/vdpa/ifc/base/ifcvf.h
> @@ -12,11 +12,13 @@
>  #define IFCVF_BLK1
> 
>  #define IFCVF_VENDOR_ID 0x1AF4
> -#define IFCVF_NET_DEVICE_ID 0x1041
> +#define IFCVF_NET_MODERN_DEVICE_ID  0x1041
>  #define IFCVF_BLK_MODERN_DEVICE_ID  0x1042
> +#define IFCVF_NET_TRANSITIONAL_DEVICE_ID0x1000
>  #define IFCVF_BLK_TRANSITIONAL_DEVICE_ID0x1001
>  #define IFCVF_SUBSYS_VENDOR_ID  0x8086
>  #define IFCVF_SUBSYS_DEVICE_ID  0x001A
> +#define IFCVF_NET_DEVICE_ID 0x0001

For the subsystem device ID, I suggest adding _SUBSYS_; please check all
subsystem device IDs and make all the names well-defined.

Thanks,
Chenbo

>  #define IFCVF_BLK_DEVICE_ID 0x0002
> 
>  #define IFCVF_MAX_QUEUES 1
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index ac42de9..61d0250 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -1684,13 +1684,20 @@ struct rte_vdpa_dev_info dev_info[] = {
>  static const struct rte_pci_id pci_id_ifcvf_map[] = {
>   { .class_id = RTE_CLASS_ANY_ID,
> .vendor_id = IFCVF_VENDOR_ID,
> -   .device_id = IFCVF_NET_DEVICE_ID,
> +   .device_id = IFCVF_NET_MODERN_DEVICE_ID,
> .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
> .subsystem_device_id = IFCVF_SUBSYS_DEVICE_ID,
>   },
> 
>   { .class_id = RTE_CLASS_ANY_ID,
> .vendor_id = IFCVF_VENDOR_ID,
> +   .device_id = IFCVF_NET_TRANSITIONAL_DEVICE_ID,
> +   .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
> +   .subsystem_device_id = IFCVF_NET_DEVICE_ID,
> + },
> +
> + { .class_id = RTE_CLASS_ANY_ID,
> +   .vendor_id = IFCVF_VENDOR_ID,
> .device_id = IFCVF_BLK_TRANSITIONAL_DEVICE_ID,
> .subsystem_vendor_id = IFCVF_SUBSYS_VENDOR_ID,
> .subsystem_device_id = IFCVF_BLK_DEVICE_ID,
> --
> 1.8.3.1



RE: [PATCH v2 2/8] vdpa/ifc: add multi queue support

2022-09-08 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Thursday, September 8, 2022 1:54 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Xu, Rosen ;
> Huang, Wei ; Cao, Gang ;
> maxime.coque...@redhat.com; Huang Wei 
> Subject: [PATCH v2 2/8] vdpa/ifc: add multi queue support

multi-queue

> 
> Enable VHOST_USER_PROTOCOL_F_MQ feature.
> ExposeIFCVF_MQ_OFFSET register to enable multi queue.

Please rephrase it in a better way; at least add a space before
IFCVF_MQ_OFFSET...

> 
> Signed-off-by: Andy Pei 
> Signed-off-by: Huang Wei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.c | 5 +
>  drivers/vdpa/ifc/base/ifcvf.h | 2 ++
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 1 +
>  3 files changed, 8 insertions(+)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> index f1e1474..34c8226 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.c
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -90,6 +90,11 @@
>   if (!hw->lm_cfg)
>   WARNINGOUT("HW support live migration not support!\n");
> 
> + if (hw->mem_resource[4].addr)
> + hw->mq_cfg = hw->mem_resource[4].addr + IFCVF_MQ_OFFSET;
> + else
> + hw->mq_cfg = NULL;

Could you help me understand the logic here? Are there two cases, one where
BAR 4 is mmap-able and one where it is not?

Thanks,
Chenbo

> +
>   if (hw->common_cfg == NULL || hw->notify_base == NULL ||
>   hw->isr == NULL || hw->dev_cfg == NULL) {
>   DEBUGOUT("capability incomplete\n");
> diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
> index 7ede738..ad505f1 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.h
> +++ b/drivers/vdpa/ifc/base/ifcvf.h
> @@ -50,6 +50,7 @@
> 
>  #define IFCVF_LM_CFG_SIZE0x40
>  #define IFCVF_LM_RING_STATE_OFFSET   0x20
> +#define IFCVF_MQ_OFFSET  0x28
> 
>  #define IFCVF_LM_LOGGING_CTRL0x0
> 
> @@ -149,6 +150,7 @@ struct ifcvf_hw {
>   u16*notify_base;
>   u16*notify_addr[IFCVF_MAX_QUEUES * 2];
>   u8 *lm_cfg;
> + u8 *mq_cfg;
>   struct vring_info vring[IFCVF_MAX_QUEUES * 2];
>   u8 nr_vring;
>   int device_type;
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index 61d0250..2d165c0 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -1248,6 +1248,7 @@ struct rte_vdpa_dev_info {
>1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
>1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
>1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD | \
> +  1ULL << VHOST_USER_PROTOCOL_F_MQ | \
>1ULL << VHOST_USER_PROTOCOL_F_STATUS)
> 
>  #define VDPA_BLK_PROTOCOL_FEATURES \
> --
> 1.8.3.1



RE: [PATCH v8 01/12] net/nfp: move app specific attributes to own struct

2022-09-08 Thread Chaoyong He
> On 9/8/2022 9:44 AM, Chaoyong He wrote:
> > The NFP card can load different firmware applications. Currently only
> > the CoreNIC application is supported. This commit makes needed
> > infrastructure changes in order to support other firmware applications
> > too.
> >
> > Clearer separation is made between the PF device and any application
> > specific concepts. The PF struct is now generic regardless of the
> > application loaded. A new struct is also made for the CoreNIC
> > application. Future additions to support other applications should
> > also add an applications specific struct.
> >
> 
> What do you think to replace 'application' usage in the commit log with
> 'application firmware'?
> 
> <...>
> 
> > diff --git a/drivers/net/nfp/nfp_ethdev.c
> > b/drivers/net/nfp/nfp_ethdev.c index e9d01f4..bd9cf67 100644
> > --- a/drivers/net/nfp/nfp_ethdev.c
> > +++ b/drivers/net/nfp/nfp_ethdev.c
> > @@ -39,15 +39,15 @@
> >   #include "nfp_cpp_bridge.h"
> >
> >   static int
> > -nfp_net_pf_read_mac(struct nfp_pf_dev *pf_dev, int port)
> > +nfp_net_pf_read_mac(struct nfp_app_fw_nic *app_hw_nic, int port)
> 
> Is this intentional that struct name is 'nfp_app_fw_nic' but variable name is
> 'app_hw_nic'? Why is app_fw vs app_hw difference?
> 
Sorry, I'm not quite sure I understand your concern.
Do you mean I should just use `app_hw` as the variable name if the function
only processes one type of application firmware?

> <...>
> 
> > @@ -890,27 +937,12 @@
> > }
> >
> > /* Populate the newly created PF device */
> > +   pf_dev->app_fw_id = app_hw_id;
> 
> ditto.
Our PMD driver can support two different application firmwares now.
We get the application firmware's type from the firmware in the probe function
and store it in the PF device structure. Then we can invoke different
initialization logic according to the application firmware's type.

We have a `struct nfp_pf_dev`, which is used to store the common information.
We defined a structure for each type of application firmware to keep the
firmware-specific information.
`struct nfp_pf_dev` has a `void *app_fw_priv` field, which can point to a
different structure based on the type of application firmware.

So, do you mean we need not store it in the PF device structure, as no other
logic outside this function will use it?
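
For illustration, a minimal sketch of this pattern; the enum values and
helper name below are simplified assumptions, not the exact driver code:

/* A generic PF struct records which application firmware is loaded and
 * points to a firmware-specific struct through a void pointer.
 */
#include <errno.h>

enum nfp_app_fw_id {
        NFP_APP_FW_CORE_NIC,
        NFP_APP_FW_FLOWER_NIC,
};

struct nfp_pf_dev {
        enum nfp_app_fw_id app_fw_id; /* which application firmware is loaded */
        void *app_fw_priv;            /* firmware-specific struct (CoreNIC, flower, ...) */
        /* ... fields common to all application firmwares ... */
};

static int
nfp_init_app_fw_priv(struct nfp_pf_dev *pf_dev)
{
        switch (pf_dev->app_fw_id) {
        case NFP_APP_FW_CORE_NIC:
                /* allocate and init the CoreNIC-specific struct,
                 * store it in pf_dev->app_fw_priv
                 */
                break;
        case NFP_APP_FW_FLOWER_NIC:
                /* allocate and init the flower-specific struct */
                break;
        default:
                return -EINVAL;
        }
        return 0;
}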



RE: [PATCH v8 05/12] net/nfp: add flower PF setup logic

2022-09-08 Thread Chaoyong He
> On 9/8/2022 9:44 AM, Chaoyong He wrote:
> > Adds the vNIC initialization logic for the flower PF vNIC. The flower
> > firmware exposes this vNIC for the purposes of fallback traffic in the
> > switchdev use-case.
> >
> > Adds minimal dev_ops for this PF device. Because the device is being
> > exposed externally to DPDK it should also be configured using DPDK
> > helpers like rte_eth_configure(). For these helpers to work the flower
> > logic needs to implements a minimal set of dev_ops.
> >
> > Signed-off-by: Chaoyong He 
> > Reviewed-by: Niklas Söderlund 
> > ---
> >   drivers/net/nfp/flower/nfp_flower.c| 398
> -
> >   drivers/net/nfp/flower/nfp_flower.h|   6 +
> >   drivers/net/nfp/flower/nfp_flower_ovs_compat.h |  37 +++
> 
> Can you please detail why OVS specific header is required? Having application
> specific code in PMD can be sign of some design issue, that is why can you
> please explain more what it does?
> 

Basically, there exist two layers polling mode to move a pkt from firmware to 
OVS.

When our card using flower application firmware receive pkt and find the pkt 
can't be offloaded, 
it will record the input port in a place of the pkt, we call it metadata.

There exist a rte_ring for each representor port.

We use the pf device as a multiplexer, which keeps polling pkts from the 
firmware. 
Based on the metadata, it will enqueue the pkt into the rte_ring of the 
corresponding representor port.

On the OVS side, it will keeps try to dequeue the pkt from the rte_ring of the 
representor port.
Once it gets the pkt, the OVS will go its logic and treat the pkt as `struct 
dp_packet`.

So we copy the definition of `struct dp_packet` from OVS to prevent the 
coredump caused by memory read/write out of range.

Another possible way is defining a big enough mbuf_priv_len using macro to 
prevent this structure definition from OVS.
Is this the right way? 

> <...>
> 
> > +static int
> > +nfp_flower_init_pf_vnic(struct nfp_net_hw *hw) {
> > +   int ret;
> > +   uint16_t i;
> > +   uint16_t n_txq;
> > +   uint16_t n_rxq;
> > +   uint16_t port_id;
> > +   unsigned int numa_node;
> > +   struct rte_mempool *mp;
> > +   struct nfp_pf_dev *pf_dev;
> > +   struct rte_eth_dev *eth_dev;
> > +   struct nfp_app_fw_flower *app_fw_flower;
> > +
> > +   static const struct rte_eth_conf port_conf = {
> > +   .rxmode = {
> > +   .mq_mode  = RTE_ETH_MQ_RX_RSS,
> > +   .offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM,
> > +   },
> > +   .txmode = {
> > +   .mq_mode = RTE_ETH_MQ_TX_NONE,
> > +   },
> > +   };
> > +
> > +   /* Set up some pointers here for ease of use */
> > +   pf_dev = hw->pf_dev;
> > +   app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(pf_dev-
> >app_fw_priv);
> > +
> > +   /*
> > +* Perform the "common" part of setting up a flower vNIC.
> > +* Mostly reading configuration from hardware.
> > +*/
> > +   ret = nfp_flower_init_vnic_common(hw, "pf_vnic");
> > +   if (ret != 0)
> > +   goto done;
> > +
> > +   hw->eth_dev = rte_eth_dev_allocate("nfp_pf_vnic");
> > +   if (hw->eth_dev == NULL) {
> > +   ret = -ENOMEM;
> > +   goto done;
> > +   }
> > +
> > +   /* Grab the pointer to the newly created rte_eth_dev here */
> > +   eth_dev = hw->eth_dev;
> > +
> > +   numa_node = rte_socket_id();
> > +
> > +   /* Fill in some of the eth_dev fields */
> > +   eth_dev->device = &pf_dev->pci_dev->device;
> > +   eth_dev->data->dev_private = hw;
> > +
> > +   /* Create a mbuf pool for the PF */
> > +   app_fw_flower->pf_pktmbuf_pool = nfp_flower_pf_mp_create();
> > +   if (app_fw_flower->pf_pktmbuf_pool == NULL) {
> > +   ret = -ENOMEM;
> > +   goto port_release;
> > +   }
> > +
> > +   mp = app_fw_flower->pf_pktmbuf_pool;
> > +
> > +   /* Add Rx/Tx functions */
> > +   eth_dev->dev_ops = &nfp_flower_pf_vnic_ops;
> > +
> > +   /* PF vNIC gets a random MAC */
> > +   eth_dev->data->mac_addrs = rte_zmalloc("mac_addr",
> RTE_ETHER_ADDR_LEN, 0);
> > +   if (eth_dev->data->mac_addrs == NULL) {
> > +   ret = -ENOMEM;
> > +   goto mempool_cleanup;
> > +   }
> > +
> > +   rte_eth_random_addr(eth_dev->data->mac_addrs->addr_bytes);
> > +   rte_eth_dev_probing_finish(eth_dev);
> > +
> > +   /* Configure the PF device now */
> > +   n_rxq = hw->max_rx_queues;
> > +   n_txq = hw->max_tx_queues;
> > +   port_id = hw->eth_dev->data->port_id;
> > +
> > +   ret = rte_eth_dev_configure(port_id, n_rxq, n_txq, &port_conf);
> 
> Still not sure about PMD calling 'rte_eth_dev_configure()', can you please
> give more details on what specific configuration is expected with that call?

The main configuration we need is the number of Rx/Tx queues.
So should we use the internal API `eth_dev_rx/tx_queue_config` instead?



[PATCH 1/3] eventdev/eth_tx: add queue start stop API

2022-09-08 Thread Naga Harish K S V
This patch adds support to start or stop a particular queue
that is associated with the adapter.

The start function enables the Tx Adapter to start enqueueing
packets to the Tx queue.

The stop function stops the Tx Adapter from transmitting any
mbufs to the Tx queue. The Tx Adapter also frees any mbufs
that it may have buffered for this queue. All inflight packets
destined for the queue are freed until the queue is started again.
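
For illustration, a minimal usage sketch of the two new APIs (the
surrounding helper is an assumption, not part of this patch):

/* Pause the adapter's transmission on one Tx queue, do some
 * application-specific work, then resume. Assumes the queue was already
 * added with rte_event_eth_tx_adapter_queue_add(); error handling is minimal.
 */
#include <rte_event_eth_tx_adapter.h>

static int
pause_and_resume_txq(uint16_t eth_dev_id, uint16_t tx_queue_id)
{
        int ret;

        /* Adapter stops enqueueing to this queue and frees buffered and
         * inflight mbufs destined for it.
         */
        ret = rte_event_eth_tx_adapter_queue_stop(eth_dev_id, tx_queue_id);
        if (ret < 0)
                return ret;

        /* ... e.g. reconfigure or drain the ethdev Tx queue here ... */

        /* Adapter resumes enqueueing packets to the Tx queue. */
        return rte_event_eth_tx_adapter_queue_start(eth_dev_id, tx_queue_id);
}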

Signed-off-by: Naga Harish K S V 
---
 lib/eventdev/eventdev_pmd.h |  41 +
 lib/eventdev/rte_event_eth_tx_adapter.c | 114 +++-
 lib/eventdev/rte_event_eth_tx_adapter.h |  39 
 lib/eventdev/version.map|   2 +
 4 files changed, 192 insertions(+), 4 deletions(-)

diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h
index f514a37575..a27c0883c6 100644
--- a/lib/eventdev/eventdev_pmd.h
+++ b/lib/eventdev/eventdev_pmd.h
@@ -1294,6 +1294,43 @@ typedef int 
(*eventdev_eth_tx_adapter_stats_reset_t)(uint8_t id,
 typedef int (*eventdev_eth_tx_adapter_instance_get_t)
(uint16_t eth_dev_id, uint16_t tx_queue_id, uint8_t *txa_inst_id);
 
+/**
+ * Start a Tx queue that is assigned to TX adapter instance
+ *
+ * @param id
+ *  Adapter identifier
+ *
+ * @param eth_dev_id
+ *  Port identifier of Ethernet device
+ *
+ * @param tx_queue_id
+ *  Ethernet device TX queue index
+ *
+ * @return
+ *  -  0: Success
+ *  - <0: Error code on failure
+ */
+typedef int (*eventdev_eth_tx_adapter_queue_start)
+   (uint8_t id, uint16_t eth_dev_id, uint16_t tx_queue_id);
+
+/**
+ * Stop a Tx queue that is assigned to TX adapter instance
+ *
+ * @param id
+ *  Adapter identifier
+ *
+ * @param eth_dev_id
+ *  Port identifier of Ethernet device
+ *
+ * @param tx_queue_id
+ *  Ethernet device TX queue index
+ *
+ * @return
+ *  -  0: Success
+ *  - <0: Error code on failure
+ */
+typedef int (*eventdev_eth_tx_adapter_queue_stop)
+   (uint8_t id, uint16_t eth_dev_id, uint16_t tx_queue_id);
 
 /** Event device operations function pointer table */
 struct eventdev_ops {
@@ -1409,6 +1446,10 @@ struct eventdev_ops {
/**< Reset eth Tx adapter statistics */
eventdev_eth_tx_adapter_instance_get_t eth_tx_adapter_instance_get;
/**< Get Tx adapter instance id for Tx queue */
+   eventdev_eth_tx_adapter_queue_start eth_tx_adapter_queue_start;
+   /**< Start Tx queue assigned to Tx adapter instance */
+   eventdev_eth_tx_adapter_queue_stop eth_tx_adapter_queue_stop;
+   /**< Stop Tx queue assigned to Tx adapter instance */
 
eventdev_selftest dev_selftest;
/**< Start eventdev Selftest */
diff --git a/lib/eventdev/rte_event_eth_tx_adapter.c 
b/lib/eventdev/rte_event_eth_tx_adapter.c
index aaef352f5c..f2e426d6ab 100644
--- a/lib/eventdev/rte_event_eth_tx_adapter.c
+++ b/lib/eventdev/rte_event_eth_tx_adapter.c
@@ -47,6 +47,12 @@
 #define txa_dev_instance_get(id) \
txa_evdev(id)->dev_ops->eth_tx_adapter_instance_get
 
+#define txa_dev_queue_start(id) \
+   txa_evdev(id)->dev_ops->eth_tx_adapter_queue_start
+
+#define txa_dev_queue_stop(id) \
+   txa_evdev(id)->dev_ops->eth_tx_adapter_queue_stop
+
 #define RTE_EVENT_ETH_TX_ADAPTER_ID_VALID_OR_ERR_RET(id, retval) \
 do { \
if (!txa_valid_id(id)) { \
@@ -94,6 +100,8 @@ struct txa_retry {
 struct txa_service_queue_info {
/* Queue has been added */
uint8_t added;
+   /* Queue is stopped */
+   bool stopped;
/* Retry callback argument */
struct txa_retry txa_retry;
/* Tx buffer */
@@ -557,7 +565,7 @@ txa_process_event_vector(struct txa_service_data *txa,
port = vec->port;
queue = vec->queue;
tqi = txa_service_queue(txa, port, queue);
-   if (unlikely(tqi == NULL || !tqi->added)) {
+   if (unlikely(tqi == NULL || !tqi->added || tqi->stopped)) {
rte_pktmbuf_free_bulk(mbufs, vec->nb_elem);
rte_mempool_put(rte_mempool_from_obj(vec), vec);
return 0;
@@ -571,7 +579,8 @@ txa_process_event_vector(struct txa_service_data *txa,
port = mbufs[i]->port;
queue = rte_event_eth_tx_adapter_txq_get(mbufs[i]);
tqi = txa_service_queue(txa, port, queue);
-   if (unlikely(tqi == NULL || !tqi->added)) {
+   if (unlikely(tqi == NULL || !tqi->added ||
+tqi->stopped)) {
rte_pktmbuf_free(mbufs[i]);
continue;
}
@@ -608,7 +617,8 @@ txa_service_tx(struct txa_service_data *txa, struct 
rte_event *ev,
queue = rte_event_eth_tx_adapter_txq_get(m);
 
tqi = txa_service_queue(txa, port, queue);
-   if (unlikely(tqi == NULL 

[PATCH 2/3] test/eth_tx: add testcase for queue start stop APIs

2022-09-08 Thread Naga Harish K S V
Added testcase for rte_event_eth_tx_adapter_queue_start()
and rte_event_eth_tx_adapter_queue_stop() APIs.

Signed-off-by: Naga Harish K S V 
---
 app/test/test_event_eth_tx_adapter.c | 86 
 1 file changed, 86 insertions(+)

diff --git a/app/test/test_event_eth_tx_adapter.c 
b/app/test/test_event_eth_tx_adapter.c
index 98debfdd2c..c19a87a86a 100644
--- a/app/test/test_event_eth_tx_adapter.c
+++ b/app/test/test_event_eth_tx_adapter.c
@@ -711,6 +711,90 @@ tx_adapter_instance_get(void)
return TEST_SUCCESS;
 }
 
+static int
+tx_adapter_queue_start_stop(void)
+{
+   int err;
+   uint16_t eth_dev_id;
+   struct rte_eth_dev_info dev_info;
+
+   /* Case 1: Test without adding eth Tx queue */
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Case 2: Test with wrong eth port */
+   eth_dev_id = rte_eth_dev_count_total() + 1;
+   err = rte_event_eth_tx_adapter_queue_start(eth_dev_id,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(eth_dev_id,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Case 3: Test with wrong tx queue */
+   err = rte_eth_dev_info_get(TEST_ETHDEV_ID, &dev_info);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   dev_info.max_tx_queues + 1);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   dev_info.max_tx_queues + 1);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Case 4: Test with right instance, port & rxq */
+   /* Add queue to tx adapter */
+   err = rte_event_eth_tx_adapter_queue_add(TEST_INST_ID,
+TEST_ETHDEV_ID,
+TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   /* Add another queue to tx adapter */
+   err = rte_event_eth_tx_adapter_queue_add(TEST_INST_ID,
+TEST_ETHDEV_ID,
+TEST_ETH_QUEUE_ID + 1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   /* Case 5: Test with right instance, port & wrong rxq */
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 2);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 2);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Delete all queues from the Tx adapter */
+   err = rte_event_eth_tx_adapter_queue_del(TEST_INST_ID,
+TEST_ETHDEV_ID,
+-1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   return TEST_SUCCESS;
+}
+
 static int
 tx_adapter_dynamic_device(void)
 {
@@ -770,6 +854,8 @@ static struct unit_test_suite event_eth_tx_tests = {
tx_adapter_service),
TEST_CASE_ST(tx_adapter_create, tx_adapter_free,
tx_adapter_instance_get),
+   TEST_CASE_ST(tx_adapter_create, tx_adapter_free,
+   tx_adapter_queue_start_stop)

[PATCH 3/3] doc: added eth Tx adapter queue start stop APIs

2022-09-08 Thread Naga Harish K S V
Added tx adapter queue start - rte_event_eth_tx_adapter_queue_start()
and tx adapter queue stop - rte_event_eth_tx_adapter_queue_stop()

Signed-off-by: Naga Harish K S V 
---
 doc/guides/rel_notes/release_22_11.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index c32c18ff49..dc1060660c 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -29,6 +29,10 @@ New Features
   ethernet device id and Rx queue index.
   Added ``rte_event_eth_tx_adapter_instance_get`` to get the Tx adapter 
instance id for specified
   ethernet device id and Tx queue index.
+  Added ``rte_event_eth_tx_adapter_queue_start`` to start enqueueing packets 
to the Tx queue by
+  Tx adapter.
+  Added ``rte_event_eth_tx_adapter_queue_stop`` to stop the Tx Adapter from 
transmitting any
+  mbufs to the Tx queue.
 
 .. This section should contain new features added in this release.
Sample format:
-- 
2.25.1



OVSCI Automation Upgrade planned for Friday 9/9/22 net-virt-staff

2022-09-08 Thread Michael Santana
Hi all,

We are planning on doing a system upgrade on our OVSCI automation
system this Friday 9/9/22.

This will affect our upstream robots (i.e. ovs, ovn, dpdk). This
affects patchwork sync and github actions sync, etc.
After the upgrade is completed, the robots will be started again and are
expected to work normally. Any obvious issues that appear will be
resolved on demand.

The upgrade is estimated to start at 9 AM EST and last for the entire day.

The highlights are:
* First and foremost, backup. If we run into any major roadblocks we
can always roll back
* Upgrade from rhel7.9 to rhel8.4
* After the upgrade Jenkins jobs will be enabled manually one by one
for testing and making sure they work properly
* We have done some testing with the upgrade ahead of time so we know
what to expect. However, there is still a chance of unexpected things
happening. This is why we are enabling jobs one by one. This will
allow us to fix any issues on demand



[Bug 1079] [dpdk 22.11] kernel/linux/kni meson build failed with gcc 7.5 on suse15.4

2022-09-08 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1079

Bug ID: 1079
   Summary: [dpdk 22.11] kernel/linux/kni meson build failed with
gcc 7.5 on suse15.4
   Product: DPDK
   Version: unspecified
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: core
  Assignee: dev@dpdk.org
  Reporter: daxuex@intel.com
  Target Milestone: ---

[DPDK version]:
commit 72206323a5dd3182b13f61b25a64abdddfee595c (HEAD -> main, origin/main,
origin/for-next-net, origin/HEAD)
Author: David Marchand 
Date:   Sat Jul 9 10:43:09 2022 +0200

version: 22.11-rc0

Start a new release cycle with empty release notes.

The ABI version becomes 23.0.
The map files are updated to the new ABI major number (23).
The ABI exceptions are dropped and CI ABI checks are disabled because
compatibility is not preserved.
Special handling of removed drivers is also dropped in check-abi.sh and
a note has been added in libabigail.abignore as a reminder.

Signed-off-by: David Marchand 
Acked-by: Thomas Monjalon 

[OS version]:
openSUSE Leap 15.4/Linux 5.14.21-150400.24.18-default
gcc version 7.5 20220909

[Test Setup]:
sed -i "" "/#define RTE_LIBRTE_PMD_SKELETON_EVENTDEV_DEBUG/d"
config/rte_config.h

CC=gcc meson --werror -Denable_kmods=True -Dlibdir=lib -Dexamples=all
--default-library=static x86_64-native-linuxapp-gcc 
ninja -j 10 -C x86_64-native-linuxapp-gcc/


[Log]
ninja: Entering directory `x86_64-native-linuxapp-gcc'
[3575/3580] Generating rte_kni with a custom command
FAILED: kernel/linux/kni/rte_kni.ko
/usr/bin/make -j4 -C /lib/modules/5.14.21-150400.24.18-default/build
M=/opt/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni
src=/opt/dpdk/kernel/linux/kni 'MODULE_CFLAGS= -DHAVE_ARG_TX_QUEUE -include
/opt/dpdk/config/rte_config.h -I/opt/dpdk/lib/eal/include -I/opt/dpdk/lib/kni
-I/opt/dpdk/x86_64-native-linuxapp-gcc -I/opt/dpdk/kernel/linux/kni' modules
make: Entering directory
'/usr/src/linux-5.14.21-150400.24.18-obj/x86_64/default'
  CC [M]  /opt/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni/kni_net.o
  CC [M]  /opt/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni/kni_misc.o
In file included from
/usr/src/linux-5.14.21-150400.24.18/include/linux/mm_types.h:14:0,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/buildid.h:5,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/module.h:14,
 from /opt/dpdk/kernel/linux/kni/kni_misc.c:7:
/usr/src/linux-5.14.21-150400.24.18/include/linux/uprobes.h:91:1: internal
compiler error: Segmentation fault
 };
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
make[2]: *** [/usr/src/linux-5.14.21-150400.24.18/scripts/Makefile.build:272:
/opt/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni/kni_misc.o] Error 1
make[2]: *** Waiting for unfinished jobs
In file included from
/usr/src/linux-5.14.21-150400.24.18/include/linux/cred.h:13:0,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/sched/signal.h:10,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/rcuwait.h:6,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/percpu-rwsem.h:7,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/fs.h:33,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/huge_mm.h:8,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/mm.h:728,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/bvec.h:14,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/skbuff.h:17,
 from
/usr/src/linux-5.14.21-150400.24.18/include/net/net_namespace.h:39,
 from
/usr/src/linux-5.14.21-150400.24.18/include/linux/netdevice.h:37,
 from /opt/dpdk/kernel/linux/kni/kni_net.c:14:
/usr/src/linux-5.14.21-150400.24.18/include/linux/key.h:247:3: internal
compiler error: Segmentation fault
   };
   ^
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
make[2]: *** [/usr/src/linux-5.14.21-150400.24.18/scripts/Makefile.build:272:
/opt/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni/kni_net.o] Error 1
make[1]: *** [/usr/src/linux-5.14.21-150400.24.18/Makefile:1885:
/opt/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni] Error 2
make: *** [../../../linux-5.14.21-150400.24.18/Makefile:220: __sub-make] Error
2
make: Leaving directory
'/usr/src/linux-5.14.21-150400.24.18-obj/x86_64/default'
[3580/3580] Linking target examples/dpdk-vmdq_dcb
ninja: build stopped: subcommand failed.

[Bad commit]
This problem is only found on the new OS; it is not found on the old OS.

-- 
You are receiving this mail because:
You are the assignee for the bug.

RE: [PATCH v4 2/4] event/sw: report periodic event timer capability

2022-09-08 Thread Naga Harish K, S V
Hi Jerin,
This patch set (all 4 patches in the series) is acked by the respective 
maintainers.
Please review and do the needful.

-Harish

> -Original Message-
> From: Van Haaren, Harry 
> Sent: Tuesday, September 6, 2022 1:40 PM
> To: Naga Harish K, S V ; Carrillo, Erik G
> ; jer...@marvell.com
> Cc: dev@dpdk.org
> Subject: RE: [PATCH v4 2/4] event/sw: report periodic event timer capability
> 
> > -Original Message-
> > From: Naga Harish K, S V 
> > Sent: Friday, August 12, 2022 5:08 PM
> > To: Carrillo, Erik G ; jer...@marvell.com;
> > Van Haaren, Harry 
> > Cc: dev@dpdk.org
> > Subject: [PATCH v4 2/4] event/sw: report periodic event timer
> > capability
> >
> > update the software eventdev pmd timer_adapter_caps_get callback
> > function to report the support of periodic event timer capability
> >
> > Signed-off-by: Naga Harish K S V 
> 
> Thanks for explaining how things work on the v2 & follow-up reworks;
> Acked-by: Harry van Haaren 


RE: [PATCH 1/2] eventdev/eth_tx: add spinlock for adapter start/stop

2022-09-08 Thread Naga Harish K, S V
Hi Jerin,
   This patch set is acked by maintainers.
Please review and do the needful.

-Harish

> -Original Message-
> From: Jayatheerthan, Jay 
> Sent: Monday, August 1, 2022 12:23 PM
> To: Naga Harish K, S V ; jer...@marvell.com
> Cc: dev@dpdk.org; sta...@dpdk.org
> Subject: RE: [PATCH 1/2] eventdev/eth_tx: add spinlock for adapter
> start/stop
> 
> > -Original Message-
> > From: Naga Harish K, S V 
> > Sent: Tuesday, July 26, 2022 9:52 AM
> > To: Jayatheerthan, Jay ;
> > jer...@marvell.com
> > Cc: dev@dpdk.org; sta...@dpdk.org
> > Subject: [PATCH 1/2] eventdev/eth_tx: add spinlock for adapter
> > start/stop
> >
> > add spinlock protection for tx adapter stop and start APIs add null
> > check for tx adapter service pointer in adapter start/stop apis.
> >
> > Fixes: a3bbf2e09756 ("eventdev: add eth Tx adapter implementation")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Naga Harish K S V 
> > ---
> >  lib/eventdev/rte_event_eth_tx_adapter.c | 7 +--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/eventdev/rte_event_eth_tx_adapter.c
> > b/lib/eventdev/rte_event_eth_tx_adapter.c
> > index a237e8edba..3251dad61f 100644
> > --- a/lib/eventdev/rte_event_eth_tx_adapter.c
> > +++ b/lib/eventdev/rte_event_eth_tx_adapter.c
> > @@ -44,7 +44,7 @@
> >  #define RTE_EVENT_ETH_TX_ADAPTER_ID_VALID_OR_ERR_RET(id, retval)
> \
> > do { \
> > if (!txa_valid_id(id)) { \
> > -   RTE_EDEV_LOG_ERR("Invalid eth Rx adapter id = %d", id); \
> > +   RTE_EDEV_LOG_ERR("Invalid eth Tx adapter id = %d", id); \
> > return retval; \
> > } \
> >  } while (0)
> > @@ -468,10 +468,13 @@ txa_service_ctrl(uint8_t id, int start)
> > struct txa_service_data *txa;
> >
> > txa = txa_service_id_to_data(id);
> > -   if (txa->service_id == TXA_INVALID_SERVICE_ID)
> > +   if (txa == NULL || txa->service_id == TXA_INVALID_SERVICE_ID)
> > return 0;
> >
> > +   rte_spinlock_lock(&txa->tx_lock);
> > ret = rte_service_runstate_set(txa->service_id, start);
> > +   rte_spinlock_unlock(&txa->tx_lock);
> > +
> > if (ret == 0 && !start) {
> > while (rte_service_may_be_active(txa->service_id))
> > rte_pause();
> > --
> > 2.23.0
> 
> There are three different changes in this patch. But since they are quite
> small, it should be ok.
> 
> Acked-by: Jay Jayatheerthan 



RE: [PATCH v8 01/12] net/nfp: move app specific attributes to own struct

2022-09-08 Thread Chaoyong He
> > On 9/8/2022 9:44 AM, Chaoyong He wrote:
> > > The NFP card can load different firmware applications. Currently
> > > only the CoreNIC application is supported. This commit makes needed
> > > infrastructure changes in order to support other firmware
> > > applications too.
> > >
> > > Clearer separation is made between the PF device and any application
> > > specific concepts. The PF struct is now generic regardless of the
> > > application loaded. A new struct is also made for the CoreNIC
> > > application. Future additions to support other applications should
> > > also add an applications specific struct.
> > >
> >
> > What do you think to replace 'application' usage in the commit log
> > with 'application firmware'?
> >
> > <...>
> >
> > > diff --git a/drivers/net/nfp/nfp_ethdev.c
> > > b/drivers/net/nfp/nfp_ethdev.c index e9d01f4..bd9cf67 100644
> > > --- a/drivers/net/nfp/nfp_ethdev.c
> > > +++ b/drivers/net/nfp/nfp_ethdev.c
> > > @@ -39,15 +39,15 @@
> > >   #include "nfp_cpp_bridge.h"
> > >
> > >   static int
> > > -nfp_net_pf_read_mac(struct nfp_pf_dev *pf_dev, int port)
> > > +nfp_net_pf_read_mac(struct nfp_app_fw_nic *app_hw_nic, int port)
> >
> > Is this intentional that struct name is 'nfp_app_fw_nic' but variable
> > name is 'app_hw_nic'? Why is app_fw vs app_hw difference?
> >
> Sorry, I'm not quite sure I catch your doubt.
> Do you mean I should just use `app_hw` as variable name if the function only
> process one type of the application firmware?
> 
Oh, sorry, I understand now.
I misspelled 'app_fw' as 'app_hw' in some places; I'll revise and check it.
Thanks!

> > <...>
> >
> > > @@ -890,27 +937,12 @@
> > >   }
> > >
> > >   /* Populate the newly created PF device */
> > > + pf_dev->app_fw_id = app_hw_id;
> >
> > ditto.
Our PMD driver can support two different application firmwares now.
We get the application firmware's type from the firmware in the probe
function and store it in the PF device structure. Then we can invoke
different initialization logic according to the application firmware's type.

We have a `struct nfp_pf_dev`, which is used to store the common
information.
We defined a structure for each type of application firmware to keep the
firmware-specific information.
`struct nfp_pf_dev` has a `void *app_fw_priv` field, which can point
to a different structure based on the type of application firmware.

So, do you mean we need not store it in the PF device structure, as no
other logic outside this function will use it?



RE: [PATCH v2 3/8] vdpa/ifc: set max queues according to HW spec

2022-09-08 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Thursday, September 8, 2022 1:54 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Xu, Rosen ;
> Huang, Wei ; Cao, Gang ;
> maxime.coque...@redhat.com; Huang Wei 
> Subject: [PATCH v2 3/8] vdpa/ifc: set max queues according to HW spec

vdpa/ifc: set max queues based on virtio spec

> 
> Set max_queues according to virtio HW spec.
> For virtio BLK device, set max_queues to the value of "num_queues".
> "num_queues" is element of struct virtio_blk_config.

Both virtio-net/blk should be described.

> 
> Signed-off-by: Andy Pei 
> Signed-off-by: Huang Wei 

Email is wrong, please fix all in next version

> ---
>  drivers/vdpa/ifc/base/ifcvf.h |  2 +-
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 21 -
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
> index ad505f1..c17bf2a 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.h
> +++ b/drivers/vdpa/ifc/base/ifcvf.h
> @@ -21,7 +21,7 @@
>  #define IFCVF_NET_DEVICE_ID 0x0001
>  #define IFCVF_BLK_DEVICE_ID 0x0002
> 
> -#define IFCVF_MAX_QUEUES 1
> +#define IFCVF_MAX_QUEUES 32
> 
>  #ifndef VIRTIO_F_IOMMU_PLATFORM
>  #define VIRTIO_F_IOMMU_PLATFORM  33
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index 2d165c0..2b42850 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -26,6 +26,18 @@
> 
>  #include "base/ifcvf.h"
> 
> +/*
> + * RTE_MAX() and RTE_MIN() cannot be used since braced-group within
> + * expression allowed only inside a function, but MAX() is used as
> + * a number of elements in array.
> + */
> +#ifndef MAX
> +#define MAX(v1, v2)  ((v1) > (v2) ? (v1) : (v2))
> +#endif
> +#ifndef MIN
> +#define MIN(v1, v2)  ((v1) < (v2) ? (v1) : (v2))
> +#endif

Above ifndef is not needed?

Seems MAX is not used, so remove it

> +
>  RTE_LOG_REGISTER(ifcvf_vdpa_logtype, pmd.vdpa.ifcvf, NOTICE);
>  #define DRV_LOG(level, fmt, args...) \
>   rte_log(RTE_LOG_ ## level, ifcvf_vdpa_logtype, \
> @@ -1512,6 +1524,7 @@ struct rte_vdpa_dev_info dev_info[] = {
>   uint64_t capacity = 0;
>   uint8_t *byte;
>   uint32_t i;
> + uint16_t queue_pairs;
> 
>   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>   return 0;
> @@ -1559,7 +1572,6 @@ struct rte_vdpa_dev_info dev_info[] = {
>   }
> 
>   internal->configured = 0;
> - internal->max_queues = IFCVF_MAX_QUEUES;
>   features = ifcvf_get_features(&internal->hw);
> 
>   device_id = ifcvf_pci_get_device_type(pci_dev);
> @@ -1570,6 +1582,10 @@ struct rte_vdpa_dev_info dev_info[] = {
> 
>   if (device_id == VIRTIO_ID_NET) {
>   internal->hw.device_type = IFCVF_NET;
> + queue_pairs = (internal->hw.common_cfg->num_queues - 1) / 2;

Please note this logic assumes CTRL_VQ is always there. If, for the real
hardware, that is the case, then it's fine. You can decide yourself whether
to check if the CTRL_VQ feature is there or not.

Thanks,
Chenbo
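
A small sketch of the suggested check, for illustration only (assumes
VIRTIO_NET_F_CTRL_VQ, bit 17 per the virtio spec, is visible to the driver):

/* Only account for the control virtqueue when the device actually offers
 * VIRTIO_NET_F_CTRL_VQ.
 */
#include <stdint.h>

#ifndef VIRTIO_NET_F_CTRL_VQ
#define VIRTIO_NET_F_CTRL_VQ 17
#endif

static inline uint16_t
ifcvf_net_queue_pairs(uint16_t num_queues, uint64_t features)
{
        if (features & (1ULL << VIRTIO_NET_F_CTRL_VQ))
                return (num_queues - 1) / 2;
        return num_queues / 2;
}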

> + DRV_LOG(INFO, "%s support %u queue pairs", pci_dev->name,
> + queue_pairs);
> + internal->max_queues = MIN(IFCVF_MAX_QUEUES, queue_pairs);
>   internal->features = features &
>   ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
>   internal->features |= dev_info[IFCVF_NET].features;
> @@ -1609,6 +1625,9 @@ struct rte_vdpa_dev_info dev_info[] = {
>   internal->hw.blk_cfg->geometry.sectors);
>   DRV_LOG(DEBUG, "num_queues: 0x%08x",
>   internal->hw.blk_cfg->num_queues);
> +
> + internal->max_queues = MIN(IFCVF_MAX_QUEUES,
> + internal->hw.blk_cfg->num_queues);
>   }
> 
>   list->internal = internal;
> --
> 1.8.3.1



RE: [EXT] Re: [PATCH v2 1/5] examples/l3fwd: fix port group mask generation

2022-09-08 Thread Pavan Nikhilesh Bhagavatula
> On 9/2/22 2:18 AM, pbhagavat...@marvell.com wrote:
> > From: Pavan Nikhilesh 
> >
> > Fix port group mask generation in altivec, vec_any_eq returns
> > 0 or 1 while port_groupx4 expects comparison mask result.
> >
> > Fixes: 2193b7467f7a ("examples/l3fwd: optimize packet processing on
> powerpc")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Pavan Nikhilesh 
> > ---
> >   v2 Changes:
> >   - Fix PPC, RISC-V, aarch32 compilation.
> >
> >   examples/common/altivec/port_group.h | 11 +--
> >   1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/examples/common/altivec/port_group.h
> b/examples/common/altivec/port_group.h
> > index 5e209b02fa..592ef80b7f 100644
> > --- a/examples/common/altivec/port_group.h
> > +++ b/examples/common/altivec/port_group.h
> > @@ -26,12 +26,19 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t
> *lp,
> > uint16_t u16[FWDSTEP + 1];
> > uint64_t u64;
> > } *pnum = (void *)pn;
> > +   union u_vec {
> > +   __vector unsigned short v_us;
> > +   unsigned short s[8];
> > +   };
> >
> > +   union u_vec res;
> > int32_t v;
> >
> > -   v = vec_any_eq(dp1, dp2);
> > -
> > +   dp1 = (__vector unsigned short)vec_cmpeq(dp1, dp2);
> 
> Altivec vec_cmpeq() is similar to Intel _mm_cmpeq_*(), so this looks
> right to me.
> 
> > +   res.v_us = dp1;
> >
> > +   v = (res.s[0] & 0x1) | (res.s[1] & 0x2) | (res.s[2] & 0x4) |
> > +   (res.s[3] & 0x8);
> 
> This can be vectorized too.  The Intel _mm_unpacklo_epi16() intrinsic
> can be replaced with the following Altivec code:
> 
> extern __inline __m128i __attribute__((__gnu_inline__,
> __always_inline__, __artificial__))
> _mm_unpacklo_epi16 (__m128i __A, __m128i __B)
> {
>return (__m128i) vec_mergeh ((__v8hi)__A, (__v8hi)__B);
> }
> 
> The Intel _mm_movemask_ps() intrinsic can be replaced with the following
> Altivec implementation:
> 
> /* Creates a 4-bit mask from the most significant bits of the SPFP
> values.  */
> extern __inline int __attribute__((__gnu_inline__, __always_inline__,
> __artificial__))
> _mm_movemask_ps (__m128  __A)
> {
>__vector unsigned long long result;
>static const __vector unsigned int perm_mask =
>  {
> #ifdef __LITTLE_ENDIAN__
>  0x00204060, 0x80808080, 0x80808080, 0x80808080
> #else
>0x80808080, 0x80808080, 0x80808080, 0x00204060
> #endif
>  };
> 
>result = ((__vector unsigned long long)
>  vec_vbpermq ((__vector unsigned char) __A,
>   (__vector unsigned char) perm_mask));
> 
> #ifdef __LITTLE_ENDIAN__
>return result[1];
> #else
>return result[0];
> #endif
> }
> 

Sure I will add this to the next version.

> Dave

Thanks, 
Pavan.