Hi Stephen, Some time ago we received the change request about usage of threads inside of the ntnic PMD driver. I would like to discuss this request. In our case, some threads are part of the design of the protocols between the driver and the FPGA, and are required for both functionality, robustness, and performance.
We currently have 3 distinct threads: Monitoring thread, for health and statistics. Link manager thread, for automatic control of link of the physical ports. Flow thread, to service the flow related FIFOs of the FPGA. The monitoring thread could be redesigned to API calls instead, for example rte_alarm, though the work will take some time. The link manager and flow thread can’t be redesigned without 1) significant changes to our FPGA or 2) introduction of a kernel driver to handle the functionality. Due to this, we want to leave usage of threads as it is. We wonder if removing threads from the PMD driver is a strict requirement or a "nice to have feature?" What are the technical concerns behind this request? What would the impact be if usage of threads remains the same(will the driver be excluded from the repo)? Any thoughts about it are valuable; we will gladly discuss them. Thanks, Serhii ________________________________ From: Stephen Hemminger <step...@networkplumber.org> Sent: 22 February 2025 23:41 To: Serhii Iliushyk <sil-...@napatech.com> Cc: dev@dpdk.org <dev@dpdk.org>; Mykola Kostenok <mko-...@napatech.com>; Christian Koue Muf <c...@napatech.com> Subject: Re: [PATCH v1 00/32] add new adapter NT400D13 On Thu, 20 Feb 2025 23:03:24 +0100 Serhii Iliushyk <sil-...@napatech.com> wrote: > This patchset adds support for the new adapter NT400D13. > > Danylo Vodopianov (23): > net/ntnic: add link agx 100g > net/ntnic: add link state machine > net/ntnic: add rpf and gfg init > net/ntnic: add agx setup for port > net/ntnic: add host loopback init > net/ntnic: add line loopback init > net/ntnic: add 100 gbps port init > net/ntnic: add port post init > net/ntnic: add nim low power API > net/ntnic: add link handling API > net/ntnic: add port init to the state machine > net/ntnic: add port disable API > net/ntnic: add nt400d13 pcm init > net/ntnic: add HIF clock test > net/ntnic: add nt400d13 PRM module init > net/ntnic: add nt400d13 PRM module reset > net/ntnic: add SPI v3 support for FPGA > net/ntnic: add i2cm init > net/ntnic: add pca init > net/ntnic: add pcal init > net/ntnic: add reset PHY init > net/ntnic: add igam module init > net/ntnic: init IGAM and config PLL for FPGA > > Serhii Iliushyk (9): > net/ntnic: add minimal initialization new NIC NT400D13 > net/ntnic: add minimal reset FPGA > net/ntnic: add FPGA modules and registers > net/ntnic: add setup for fpga reset > net/ntnic: add default reset setting for NT400D13 > net/ntnic: add DDR calibration to reset stage > net/ntnic: add PHY ftile reset > net/ntnic: add clock init > net/ntnic: revert untrusted loop bound > > doc/guides/nics/ntnic.rst | 7 +- > doc/guides/rel_notes/release_25_03.rst | 4 + > drivers/net/ntnic/adapter/nt4ga_adapter.c | 9 + > drivers/net/ntnic/include/nt4ga_link.h | 7 + > drivers/net/ntnic/include/nthw_gfg.h | 33 + > drivers/net/ntnic/include/ntnic_nim.h | 5 + > .../include/ntnic_nthw_fpga_rst_nt400dxx.h | 34 + > .../link_agx_100g/nt4ga_agx_link_100g.c | 1029 ++++++ > > drivers/net/ntnic/https://linkprotect.cudasvc.com/url?a=https%3a%2f%2fmeson.build&c=E,1,V2jW0UmjAZTalpELBq8Dn7dVNfOTq_s_dwmagWH1e90cq826b3Jzkh8fF9_OEiNqN1Y6yB9ByFEKV0xnAbe5QV4HLrYCdQV_kB9SynmAtvcFYup0pmk,&typo=1 > | 16 + > drivers/net/ntnic/nim/i2c_nim.c | 158 +- > drivers/net/ntnic/nim/i2c_nim.h | 6 + > ...00D13_U62_Si5332-GM2-RevD-1_V5-Registers.h | 425 +++ > .../net/ntnic/nthw/core/include/nthw_fpga.h | 10 + > .../net/ntnic/nthw/core/include/nthw_gmf.h | 2 + > .../net/ntnic/nthw/core/include/nthw_hif.h | 4 + > .../net/ntnic/nthw/core/include/nthw_i2cm.h | 3 + > .../net/ntnic/nthw/core/include/nthw_igam.h | 40 + > .../ntnic/nthw/core/include/nthw_pca9532.h | 25 + > .../ntnic/nthw/core/include/nthw_pcal6416a.h | 33 + > .../nthw/core/include/nthw_pcm_nt400dxx.h | 40 + > .../ntnic/nthw/core/include/nthw_phy_tile.h | 156 + > .../nthw/core/include/nthw_prm_nt400dxx.h | 32 + > .../nthw/core/include/nthw_si5332_si5156.h | 63 + > .../net/ntnic/nthw/core/include/nthw_spi_v3.h | 107 + > .../net/ntnic/nthw/core/include/nthw_spim.h | 58 + > .../net/ntnic/nthw/core/include/nthw_spis.h | 63 + > .../nthw/core/nt400dxx/nthw_fpga_nt400dxx.c | 220 ++ > .../core/nt400dxx/reset/nthw_fpga_rst9574.c | 377 ++ > .../nt400dxx/reset/nthw_fpga_rst_nt400dxx.c | 427 +++ > drivers/net/ntnic/nthw/core/nthw_fpga.c | 464 +++ > drivers/net/ntnic/nthw/core/nthw_gfg.c | 340 ++ > drivers/net/ntnic/nthw/core/nthw_gmf.c | 41 + > drivers/net/ntnic/nthw/core/nthw_hif.c | 92 + > drivers/net/ntnic/nthw/core/nthw_i2cm.c | 139 + > drivers/net/ntnic/nthw/core/nthw_igam.c | 93 + > drivers/net/ntnic/nthw/core/nthw_pca9532.c | 60 + > drivers/net/ntnic/nthw/core/nthw_pcal6416a.c | 103 + > .../net/ntnic/nthw/core/nthw_pcm_nt400dxx.c | 80 + > drivers/net/ntnic/nthw/core/nthw_phy_tile.c | 1242 +++++++ > .../net/ntnic/nthw/core/nthw_prm_nt400dxx.c | 55 + > .../net/ntnic/nthw/core/nthw_si5332_si5156.c | 142 + > drivers/net/ntnic/nthw/core/nthw_spi_v3.c | 358 ++ > drivers/net/ntnic/nthw/core/nthw_spim.c | 113 + > drivers/net/ntnic/nthw/core/nthw_spis.c | 121 + > drivers/net/ntnic/nthw/nthw_drv.h | 31 + > drivers/net/ntnic/nthw/nthw_platform.c | 3 + > drivers/net/ntnic/nthw/nthw_platform_drv.h | 2 + > .../supported/nthw_fpga_9574_055_049_0000.c | 3124 +++++++++++++++++ > .../nthw/supported/nthw_fpga_instances.c | 5 +- > .../nthw/supported/nthw_fpga_instances.h | 1 + > .../ntnic/nthw/supported/nthw_fpga_mod_defs.h | 11 + > .../nthw/supported/nthw_fpga_mod_str_map.c | 11 + > .../ntnic/nthw/supported/nthw_fpga_reg_defs.h | 11 + > .../nthw/supported/nthw_fpga_reg_defs_igam.h | 32 + > .../supported/nthw_fpga_reg_defs_pci_ta.h | 33 + > .../nthw_fpga_reg_defs_pcm_nt400dxx.h | 29 + > .../nthw/supported/nthw_fpga_reg_defs_pdi.h | 49 + > .../supported/nthw_fpga_reg_defs_phy_tile.h | 213 ++ > .../nthw_fpga_reg_defs_prm_nt400dxx.h | 26 + > .../nthw/supported/nthw_fpga_reg_defs_rfd.h | 38 + > .../supported/nthw_fpga_reg_defs_rst9574.h | 35 + > .../nthw/supported/nthw_fpga_reg_defs_spim.h | 76 + > .../nthw/supported/nthw_fpga_reg_defs_spis.h | 51 + > .../nthw/supported/nthw_fpga_reg_defs_tint.h | 28 + > drivers/net/ntnic/ntnic_ethdev.c | 1 + > drivers/net/ntnic/ntnic_filter/ntnic_filter.c | 2 +- > drivers/net/ntnic/ntnic_mod_reg.c | 47 + > drivers/net/ntnic/ntnic_mod_reg.h | 25 + > 68 files changed, 10709 insertions(+), 11 deletions(-) > create mode 100644 drivers/net/ntnic/include/nthw_gfg.h > create mode 100644 drivers/net/ntnic/include/ntnic_nthw_fpga_rst_nt400dxx.h > create mode 100644 > drivers/net/ntnic/link_mgmt/link_agx_100g/nt4ga_agx_link_100g.c > create mode 100644 > drivers/net/ntnic/nthw/core/include/NT400D13_U62_Si5332-GM2-RevD-1_V5-Registers.h > create mode 100644 drivers/net/ntnic/nthw/core/include/nthw_igam.h > create mode 100644 drivers/net/ntnic/nthw/core/include/nthw_pca9532.h > create mode 100644 drivers/net/ntnic/nthw/core/include/nthw_pcal6416a.h > create mode 100644 drivers/net/ntnic/nthw/core/include/nthw_pcm_nt400dxx.h > create mode 100644 drivers/net/ntnic/nthw/core/include/nthw_phy_tile.h > create mode 100644 drivers/net/ntnic/nthw/core/include/nthw_prm_nt400dxx.h > create mode 100644 drivers/net/ntnic/nthw/core/include/nthw_si5332_si5156.h > create mode 100644 drivers/net/ntnic/nthw/core/include/nthw_spi_v3.h > create mode 100644 drivers/net/ntnic/nthw/core/include/nthw_spim.h > create mode 100644 drivers/net/ntnic/nthw/core/include/nthw_spis.h > create mode 100644 drivers/net/ntnic/nthw/core/nt400dxx/nthw_fpga_nt400dxx.c > create mode 100644 > drivers/net/ntnic/nthw/core/nt400dxx/reset/nthw_fpga_rst9574.c > create mode 100644 > drivers/net/ntnic/nthw/core/nt400dxx/reset/nthw_fpga_rst_nt400dxx.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_gfg.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_igam.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_pca9532.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_pcal6416a.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_pcm_nt400dxx.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_phy_tile.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_prm_nt400dxx.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_si5332_si5156.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_spi_v3.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_spim.c > create mode 100644 drivers/net/ntnic/nthw/core/nthw_spis.c > create mode 100644 > drivers/net/ntnic/nthw/supported/nthw_fpga_9574_055_049_0000.c > create mode 100644 drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_igam.h > create mode 100644 > drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_pci_ta.h > create mode 100644 > drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_pcm_nt400dxx.h > create mode 100644 drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_pdi.h > create mode 100644 > drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_phy_tile.h > create mode 100644 > drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_prm_nt400dxx.h > create mode 100644 drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_rfd.h > create mode 100644 > drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_rst9574.h > create mode 100644 drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_spim.h > create mode 100644 drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_spis.h > create mode 100644 drivers/net/ntnic/nthw/supported/nthw_fpga_reg_defs_tint.h > I will merge this for next-net BUT The driver is better after this patch series, but still low quality. Several pre-existing issues in this driver; looks like it did not get enough review during 24.11 release when it was merged. ✔ passed ✘ Failed Basic hygiene ✔ Look at CI results in patchwork ✔ Merge cleanly with git am ✔ Run checkpatches ✔ Run check-git-log ✔ Run https://linkprotect.cudasvc.com/url?a=https%3a%2f%2fcheck-symbol-maps.sh&c=E,1,_xJVo6bETFeApP-wtnvbWAvxaCa7qbnLMt7zRlpiRqjS0m95ct6mvHNzJet-519faqsNDIQ1syIb2Zy2Ofg6Ow0SCxip8zUntKKIuMak&typo=1 ✔ Run check-doc-vs-code ✔ Run check-spdk-tag Builds ✔ Normal Gcc build; make sure driver is built! ✔ Use latest experimental Gcc 15 ✔ Clang build using current version (clang-19) ✔ Doc build o Build for 32 bit x86 o Cross build for Windows (if applicable) ✔ Debug build ✔ Enable asserts ✔ Test meson builds Experimental builds: ✔ Enable address sanitizer ✘ Enable extra warnings (edit https://linkprotect.cudasvc.com/url?a=https%3a%2f%2fmeson.build&c=E,1,mL1G4BVjr4bxM-sD_s9RIW1LCs5wCyCL61shJTZo3iTRNFbe50BYXiKNbPD6OaCA84bSRTFNPk1K6QYBvlf7Q3XzyW8lLPKkgH9tUdywAJZkhd-X5iW-f4bjPA,,&typo=1) for -Wvla, -Wformat-truncation, -Waddress-of-packed-member Let's not add more VLA's, these could be arrays with a fixed size. [1470/3259] Compiling C object drivers...ofile_inline_flow_api_hw_db_inline.c.o ../drivers/net/ntnic/nthw/flow_api/profile_inline/flow_api_hw_db_inline.c: In function ‘hw_db_inline_alloc_prioritized_cfn’: ../drivers/net/ntnic/nthw/flow_api/profile_inline/flow_api_hw_db_inline.c:1346:9: warning: ISO C90 forbids variable length array ‘sorted_priority’ [-Wvla] 1346 | } sorted_priority[db->nb_cat]; | ^ [1479/3259] Compiling C object drivers...ile_inline_flow_api_profile_inline.c.o ../drivers/net/ntnic/nthw/flow_api/profile_inline/flow_api_profile_inline.c: In function ‘setup_db_qsl_data’: ../drivers/net/ntnic/nthw/flow_api/profile_inline/flow_api_profile_inline.c:3193:17: warning: ISO C90 forbids variable length array ‘ports’ [-Wvla] 3193 | uint32_t ports[fd->dst_num_avail]; | ^~~~~~~~ ../drivers/net/ntnic/nthw/flow_api/profile_inline/flow_api_profile_inline.c:3194:17: warning: ISO C90 forbids variable length array ‘queues’ [-Wvla] 3194 | uint32_t queues[fd->dst_num_avail]; | ^~~~~~~~ This is not doing what you expect, ports is uint32_t array uint32_t ports[fd->dst_num_avail]; uint32_t queues[fd->dst_num_avail]; memset(ports, 0, fd->dst_num_avail); memset(queues, 0, fd->dst_num_avail); Instead use HW_DB_INLINE_MAX_QST_PER_QSL Look for anti-patterns: ✘ Driver must not disable warnings with compiler flags or pragma's Using pragma pack() ✘ Driver must not use thread and signal Using thread to monitor, should not be done by PMD specific thread. ✘ Driver should not call abort() or assert() directly Is using assert() when should be using RTE_ASSERT() ✘ Review exposed symbol names The driver exposes lots of global symbols (when statically linked) that do not have a consistent prefix of nthw_... Examples: get_rx_idle(), set_rx_idle(), dev_flow_init() ✔ Apply coccinelle scripts ✘ Review use of malloc Several places call malloc but do not check return value ✘ Review use of memset The code related to stats has several issues: - function returns -1 but never checked by callers - stats structure is already zero'd by ethdev - if queue is greater than RTE_ETHDEV_QUEUE_STAT_CNTRS the statistics should still be counted for that queue, just no per-queue stats - the use of term if_index is potentially confusing; normally if_index refers to the interface index assigned by the OS used for ioctl's etc. In this driver it appears to be the index of the phy. static int dpdk_stats_collect(struct pmd_internals *internals, struct rte_eth_stats *stats) { const struct ntnic_filter_ops *ntnic_filter_ops = get_ntnic_filter_ops(); if (ntnic_filter_ops == NULL) { NT_LOG_DBGX(ERR, NTNIC, "ntnic_filter_ops uninitialized"); return -1; } ... ntnic_filter_ops->poll_statistics(internals); memset(stats, 0, sizeof(*stats)); for (i = 0; i < RTE_ETHDEV_QUEUE_STAT_CNTRS && i < internals->nb_rx_queues; i++) { stats->q_ipackets[i] = internals->rxq_scg[i].rx_pkts; stats->q_ibytes[i] = internals->rxq_scg[i].rx_bytes; rx_total += stats->q_ipackets[i]; rx_total_b += stats->q_ibytes[i]; } for (i = 0; i < RTE_ETHDEV_QUEUE_STAT_CNTRS && i < internals->nb_tx_queues; i++) { stats->q_opackets[i] = internals->txq_scg[i].tx_pkts; stats->q_obytes[i] = internals->txq_scg[i].tx_bytes; stats->q_errors[i] = internals->txq_scg[i].err_pkts; tx_total += stats->q_opackets[i]; tx_total_b += stats->q_obytes[i]; tx_err_total += stats->q_errors[i]; } Other: The handling of queue start/stop in this device is odd. Doesn't do deferred start. When Rx is stopped most drivers do some action to stop the hardware. Given the use of thread, driver should *not* have been merged in 24.11. The DPDK has a set of assumptions about thread and process model, and drivers making their own threads can cause problems in applications. (Other drivers use alarm for this type of port monitor function).