Re: [Intel-wired-lan] [PATCH iwl-next v4 1/2] ixgbe: Refactor overtemp event handling

2023-12-13 Thread Przemek Kitszel

On 12/12/23 11:46, Jedrzej Jagielski wrote:

Currently ixgbe driver is notified of overheating events
via internal IXGBE_ERR_OVERTEMP error code.

Change the approach for handle_lasi() to use freshly introduced
is_overtemp function parameter which set when such event occurs.
Change check_overtemp() to bool and return true if overtemp
event occurs.

Signed-off-by: Jedrzej Jagielski 
---
v2: change aproach to use additional function parameter to notify when overheat
v4: change check_overtemp to bool

https://lore.kernel.org/netdev/20231208090055.303507-1-jedrzej.jagiel...@intel.com/T/
---
  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 19 
  drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c  | 26 ++-
  drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h  |  2 +-
  drivers/net/ethernet/intel/ixgbe/ixgbe_type.h |  4 +-
  drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 45 +++
  5 files changed, 54 insertions(+), 42 deletions(-)


Reviewed-by: Przemek Kitszel 

___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH v2 iwl-next] ice: Fix some null pointer dereference issues in ice_ptp.c

2023-12-13 Thread Przemek Kitszel

On 12/12/23 03:40, Kunwu Chan wrote:

devm_kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure.

Fixes: d938a8cca88a ("ice: Auxbus devices & driver for E822 TS")
Cc: Kunwu Chan 
Suggested-by: Przemek Kitszel 


You found the bug (or some some static analysis tool in that case);
there is no need to add Suggested-by for every person that suggests
something during review - the tag is for "person/s that suggested
making such change in the repo".

Subject line would be better if less generic, eg:
ice: avoid null deref of ptp auxbus name


Signed-off-by: Kunwu Chan 
---
  drivers/net/ethernet/intel/ice/ice_ptp.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c 
b/drivers/net/ethernet/intel/ice/ice_ptp.c
index e9e59f4b5580..848e3e063e64 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -2743,6 +2743,8 @@ static int ice_ptp_register_auxbus_driver(struct ice_pf 
*pf)
name = devm_kasprintf(dev, GFP_KERNEL, "ptp_aux_dev_%u_%u_clk%u",
  pf->pdev->bus->number, PCI_SLOT(pf->pdev->devfn),
  ice_get_ptp_src_clock_index(&pf->hw));
+   if (!name)
+   return -ENOMEM;
  
  	aux_driver->name = name;

aux_driver->shutdown = ice_ptp_auxbus_shutdown;
@@ -2989,6 +2991,8 @@ static int ice_ptp_create_auxbus_device(struct ice_pf *pf)
name = devm_kasprintf(dev, GFP_KERNEL, "ptp_aux_dev_%u_%u_clk%u",
  pf->pdev->bus->number, PCI_SLOT(pf->pdev->devfn),
  ice_get_ptp_src_clock_index(&pf->hw));
+   if (!name)
+   return -ENOMEM;
  
  	aux_dev->name = name;

aux_dev->id = id;


Reviewed-by: Przemek Kitszel 

Regarding iwl-next vs iwl-net: this bug is really unlikely to manifest,
as we take care of both earlier and future mem allocs for ptp auxbus,
and auxiliary_device_init() checks for null name, so no big deal,
so: -next is fine
___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH net-next v6 08/12] libie: add Rx buffer management (via Page Pool)

2023-12-13 Thread Alexander Lobakin
From: Jakub Kicinski 
Date: Mon, 11 Dec 2023 11:23:32 -0800

> On Mon, 11 Dec 2023 11:16:20 +0100 Alexander Lobakin wrote:
>> Ideally, I'd like to pass a CPU ID this queue will be run on and use
>> cpu_to_node(), but currently there's no NUMA-aware allocations in the
>> Intel drivers and Rx queues don't get the corresponding CPU ID when
>> configuring. I may revisit this later, but for now NUMA_NO_NODE is the
>> most optimal solution here.
> 
> Hm, I've been wondering about persistent IRQ mappings. Drivers
> resetting IRQ mapping on reconfiguration is a major PITA in production
> clusters. You change the RSS hash and some NICs suddenly forget
> affinitization 🤯️
> 
> The connection with memory allocations changes the math on that a bit.
> 
> The question is really whether we add CPU <> NAPI config as a netdev
> Netlink API or build around the generic IRQ affinity API. The latter 
> is definitely better from "don't duplicate uAPI" perspective.
> But we need to reset the queues and reallocate their state when 
> the mapping is changed. And shutting down queues on 
> 
>   echo $cpu > /../smp_affinity_list
> 
> seems moderately insane. Perhaps some middle-ground exists.
> 
> Anyway, if you do find cycles to tackle this - pls try to do it
> generically not just for Intel? :)

Sounds good, adding to my fathomless backlog :>

Thanks,
Olek
___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


[Intel-wired-lan] [PATCH net-next v7 01/12] page_pool: make sure frag API fields don't span between cachelines

2023-12-13 Thread Alexander Lobakin
After commit 5027ec19f104 ("net: page_pool: split the page_pool_params
into fast and slow") that made &page_pool contain only "hot" params at
the start, cacheline boundary chops frag API fields group in the middle
again.
To not bother with this each time fast params get expanded or shrunk,
let's just align them to `4 * sizeof(long)`, the closest upper pow-2 to
their actual size (2 longs + 2 ints). This ensures 16-byte alignment for
the 32-bit architectures and 32-byte alignment for the 64-bit ones,
excluding unnecessary false-sharing.

Signed-off-by: Alexander Lobakin 
---
 include/net/page_pool/types.h | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index ac286ea8ce2d..35ab82da7f2a 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -130,7 +130,16 @@ struct page_pool {
 
bool has_init_callback;
 
-   long frag_users;
+   /* The following block must stay within one cacheline. On 32-bit
+* systems, sizeof(long) == sizeof(int), so that the block size is
+* precisely ``4 * sizeof(long)``. On 64-bit systems, the actual size
+* is ``2 * sizeof(long) + 2 * sizeof(int)``, i.e. 24 bytes, but the
+* closest pow-2 to that is 32 bytes, which also equals to
+* ``4 * sizeof(long)``, so just use that one for simplicity.
+* Having it aligned to a cacheline boundary may be excessive and
+* doesn't bring any good.
+*/
+   long frag_users __aligned(4 * sizeof(long));
struct page *frag_page;
unsigned int frag_offset;
u32 pages_state_hold_cnt;
-- 
2.43.0

___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


[Intel-wired-lan] [PATCH net-next v7 00/12] net: intel: start The Great Code Dedup + Page Pool for iavf

2023-12-13 Thread Alexander Lobakin
Here's a two-shot: introduce Intel Ethernet common library (libie) and
switch iavf to Page Pool. Details are in the commit messages; here's
a summary:

Not a secret there's a ton of code duplication between two and more Intel
ethernet modules. Before introducing new changes, which would need to be
copied over again, start decoupling the already existing duplicate
functionality into a new module, which will be shared between several
Intel Ethernet drivers. The first name that came to my mind was
"libie" -- "Intel Ethernet common library". Also this sounds like
"lovelie" (-> one word, no "lib I E" pls) and can be expanded as
"lib Internet Explorer" :P
The series is only the beginning. From now on, adding every new feature
or doing any good driver refactoring will remove much more lines than add
for quite some time. There's a basic roadmap with some deduplications
planned already, not speaking of that touching every line now asks:
"can I share this?". The final destination is very ambitious: have only
one unified driver for at least i40e, ice, iavf, and idpf with a struct
ops for each generation. That's never gonna happen, right? But you still
can at least try.
PP conversion for iavf lands within the same series as these two are tied
closely. libie will support Page Pool model only, so that a driver can't
use much of the lib until it's converted. iavf is only the example, the
rest will eventually be converted soon on a per-driver basis. That is
when it gets really interesting. Stay tech.

Alexander Lobakin (12):
  page_pool: make sure frag API fields don't span between cachelines
  page_pool: don't use driver-set flags field directly
  net: intel: introduce Intel Ethernet common library
  iavf: kill "legacy-rx" for good
  iavf: drop page splitting and recycling
  page_pool: constify some read-only function arguments
  page_pool: add DMA-sync-for-CPU inline helper
  libie: add Rx buffer management (via Page Pool)
  iavf: pack iavf_ring more efficiently
  iavf: switch to Page Pool
  libie: add common queue stats
  iavf: switch queue stats to libie

 MAINTAINERS   |   3 +-
 drivers/net/ethernet/intel/Kconfig|   6 +
 drivers/net/ethernet/intel/Makefile   |   1 +
 drivers/net/ethernet/intel/i40e/i40e_common.c | 253 ---
 drivers/net/ethernet/intel/i40e/i40e_main.c   |   1 +
 .../net/ethernet/intel/i40e/i40e_prototype.h  |   7 -
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  72 +-
 drivers/net/ethernet/intel/i40e/i40e_type.h   |  88 ---
 drivers/net/ethernet/intel/iavf/iavf.h|   2 +-
 drivers/net/ethernet/intel/iavf/iavf_common.c | 253 ---
 .../net/ethernet/intel/iavf/iavf_ethtool.c| 234 +--
 drivers/net/ethernet/intel/iavf/iavf_main.c   |  42 +-
 .../net/ethernet/intel/iavf/iavf_prototype.h  |   7 -
 drivers/net/ethernet/intel/iavf/iavf_txrx.c   | 624 --
 drivers/net/ethernet/intel/iavf/iavf_txrx.h   | 174 +
 drivers/net/ethernet/intel/iavf/iavf_type.h   |  90 ---
 .../net/ethernet/intel/iavf/iavf_virtchnl.c   |  17 +-
 .../net/ethernet/intel/ice/ice_lan_tx_rx.h| 316 -
 drivers/net/ethernet/intel/ice/ice_main.c |   1 +
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c |  74 +--
 drivers/net/ethernet/intel/libie/Kconfig  |   9 +
 drivers/net/ethernet/intel/libie/Makefile |   7 +
 drivers/net/ethernet/intel/libie/rx.c | 179 +
 drivers/net/ethernet/intel/libie/stats.c  | 121 
 include/linux/net/intel/libie/rx.h| 259 
 include/linux/net/intel/libie/stats.h | 179 +
 include/net/page_pool/helpers.h   |  34 +-
 include/net/page_pool/types.h |  19 +-
 net/core/page_pool.c  |  42 +-
 29 files changed, 1058 insertions(+), 2056 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/libie/Kconfig
 create mode 100644 drivers/net/ethernet/intel/libie/Makefile
 create mode 100644 drivers/net/ethernet/intel/libie/rx.c
 create mode 100644 drivers/net/ethernet/intel/libie/stats.c
 create mode 100644 include/linux/net/intel/libie/rx.h
 create mode 100644 include/linux/net/intel/libie/stats.h

---
>From v6[0]:
* #04: resolve ethtool_puts() Git conflict (Jakub);
* #06: pick RB from Ilias;
* no functional changes.

>From v5[1]:
* drop Page Pool DMA shortcut: will pick up Eric's more global DMA sync
  optimization[1] and expand it to cover both IOMMU and direct DMA a bit
  later (Yunsheng);
* drop per-queue Page Pool Ethtool stats: they are now exported via
  generic Netlink interface (Jakub);
* #01: leave a comment why exactly this alignment (Jakub, Yunsheng);
* #08: make use of page_pool_params::netdev when calculating PP params;
* #08: rename ``libie_rx_queue`` -> ``libie_buf_queue``.

>From v4[3]:
* make use of Jakub's &page_pool_params split;
* #01: prevent frag fields from spanning into 2 cachelines after
  splitting &page_pool_params into fast and slow;
* #02-03: bring back the DMA sync shortcut, now as a 

[Intel-wired-lan] [PATCH net-next v7 02/12] page_pool: don't use driver-set flags field directly

2023-12-13 Thread Alexander Lobakin
page_pool::p is driver-defined params, copied directly from the
structure passed to page_pool_create(). The structure isn't meant
to be modified by the Page Pool core code and this even might look
confusing[0][1].
In order to be able to alter some flags, let's define our own, internal
fields the same way as the already existing one (::has_init_callback).
They are defined as bits in the driver-set params, leave them so here
as well, to not waste byte-per-bit or so. Almost 30 bits are still free
for future extensions.
We could've defined only new flags here or only the ones we may need
to alter, but checking some flags in one place while others in another
doesn't sound convenient or intuitive. ::flags passed by the driver can
now go to the "slow" PP params.

Suggested-by: Jakub Kicinski 
Link[0]: https://lore.kernel.org/netdev/20230703133207.4f0c5...@kernel.org
Suggested-by: Alexander Duyck 
Link[1]: 
https://lore.kernel.org/netdev/cakgt0ufzcgnwgoh96e4gv3zp6llbrohm7she8nkwq+exx+g...@mail.gmail.com
Signed-off-by: Alexander Lobakin 
---
 include/net/page_pool/types.h |  8 +---
 net/core/page_pool.c  | 34 ++
 2 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index 35ab82da7f2a..51f6cdd60757 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -44,7 +44,6 @@ struct pp_alloc_cache {
 
 /**
  * struct page_pool_params - page pool parameters
- * @flags: PP_FLAG_DMA_MAP, PP_FLAG_DMA_SYNC_DEV
  * @order: 2^order pages on allocation
  * @pool_size: size of the ptr_ring
  * @nid:   NUMA node id to allocate from pages from
@@ -54,10 +53,10 @@ struct pp_alloc_cache {
  * @dma_dir:   DMA mapping direction
  * @max_len:   max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
  * @offset:DMA sync address offset for PP_FLAG_DMA_SYNC_DEV
+ * @flags: PP_FLAG_DMA_MAP, PP_FLAG_DMA_SYNC_DEV
  */
 struct page_pool_params {
struct_group_tagged(page_pool_params_fast, fast,
-   unsigned intflags;
unsigned intorder;
unsigned intpool_size;
int nid;
@@ -68,6 +67,7 @@ struct page_pool_params {
unsigned intoffset;
);
struct_group_tagged(page_pool_params_slow, slow,
+   unsigned intflags;
struct net_device *netdev;
 /* private: used by test code only */
void (*init_callback)(struct page *page, void *arg);
@@ -128,7 +128,9 @@ struct page_pool_stats {
 struct page_pool {
struct page_pool_params_fast p;
 
-   bool has_init_callback;
+   bool dma_map:1; /* Perform DMA mapping */
+   bool dma_sync:1;/* Perform DMA sync */
+   bool has_init_callback:1;   /* slow.init_callback is set */
 
/* The following block must stay within one cacheline. On 32-bit
 * systems, sizeof(long) == sizeof(int), so that the block size is
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index c2e7c9a6efbe..59aca3339222 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -179,7 +179,7 @@ static int page_pool_init(struct page_pool *pool,
memcpy(&pool->slow, ¶ms->slow, sizeof(pool->slow));
 
/* Validate only known flags were used */
-   if (pool->p.flags & ~(PP_FLAG_ALL))
+   if (pool->slow.flags & ~(PP_FLAG_ALL))
return -EINVAL;
 
if (pool->p.pool_size)
@@ -193,22 +193,26 @@ static int page_pool_init(struct page_pool *pool,
 * DMA_BIDIRECTIONAL is for allowing page used for DMA sending,
 * which is the XDP_TX use-case.
 */
-   if (pool->p.flags & PP_FLAG_DMA_MAP) {
+   if (pool->slow.flags & PP_FLAG_DMA_MAP) {
if ((pool->p.dma_dir != DMA_FROM_DEVICE) &&
(pool->p.dma_dir != DMA_BIDIRECTIONAL))
return -EINVAL;
+
+   pool->dma_map = true;
}
 
-   if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV) {
+   if (pool->slow.flags & PP_FLAG_DMA_SYNC_DEV) {
/* In order to request DMA-sync-for-device the page
 * needs to be mapped
 */
-   if (!(pool->p.flags & PP_FLAG_DMA_MAP))
+   if (!(pool->slow.flags & PP_FLAG_DMA_MAP))
return -EINVAL;
 
if (!pool->p.max_len)
return -EINVAL;
 
+   pool->dma_sync = true;
+
/* pool->p.offset has to be set according to the address
 * offset used by the DMA engine to start copying rx data
 */
@@ -234,7 +238,7 @@ static int page_pool_init(struct page_pool *pool,
/* Driver calling page_pool_create() also call page_pool_destroy() */
refcount_set(&pool->user_cnt, 1);
 
-   if (pool->p.flags & PP_FLAG_DMA_MAP)
+   if (pool->dma_

[Intel-wired-lan] [PATCH net-next v7 03/12] net: intel: introduce Intel Ethernet common library

2023-12-13 Thread Alexander Lobakin
Not a secret there's a ton of code duplication between two and more Intel
ethernet modules.
Before introducing new changes, which would need to be copied over again,
start decoupling the already existing duplicate functionality into a new
module, which will be shared between several Intel Ethernet drivers.
Add the lookup table which converts 8/10-bit hardware packet type into
a parsed bitfield structure for easy checking packet format parameters,
such as payload level, IP version, etc. This is currently used by i40e,
ice and iavf and it's all the same in all three drivers.
The only difference introduced in this implementation is that instead of
defining a 256 (or 1024 in case of ice) element array, add unlikely()
condition to limit the input to 154 (current maximum non-reserved packet
type). There's no reason to waste 600 (or even 3600) bytes only to not
hurt very unlikely exception packets.
The hash computation function now takes payload level directly as a
pkt_hash_type. There's a couple cases when non-IP ptypes are marked as
L3 payload and in the previous versions their hash level would be 2, not
3. But skb_set_hash() only sees difference between L4 and non-L4, thus
this won't change anything at all.
The module is behind the hidden Kconfig symbol, which the drivers will
select when needed. The exports are behind 'LIBIE' namespace to limit
the scope of the functions.

Signed-off-by: Alexander Lobakin 
---
 MAINTAINERS   |   3 +-
 drivers/net/ethernet/intel/Kconfig|   6 +
 drivers/net/ethernet/intel/Makefile   |   1 +
 drivers/net/ethernet/intel/i40e/i40e_common.c | 253 --
 drivers/net/ethernet/intel/i40e/i40e_main.c   |   1 +
 .../net/ethernet/intel/i40e/i40e_prototype.h  |   7 -
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  72 +---
 drivers/net/ethernet/intel/i40e/i40e_type.h   |  88 -
 drivers/net/ethernet/intel/iavf/iavf_common.c | 253 --
 drivers/net/ethernet/intel/iavf/iavf_main.c   |   1 +
 .../net/ethernet/intel/iavf/iavf_prototype.h  |   7 -
 drivers/net/ethernet/intel/iavf/iavf_txrx.c   |  70 +---
 drivers/net/ethernet/intel/iavf/iavf_type.h   |  88 -
 .../net/ethernet/intel/ice/ice_lan_tx_rx.h| 316 --
 drivers/net/ethernet/intel/ice/ice_main.c |   1 +
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c |  74 +---
 drivers/net/ethernet/intel/libie/Kconfig  |   8 +
 drivers/net/ethernet/intel/libie/Makefile |   6 +
 drivers/net/ethernet/intel/libie/rx.c | 110 ++
 include/linux/net/intel/libie/rx.h| 128 +++
 20 files changed, 315 insertions(+), 1178 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/libie/Kconfig
 create mode 100644 drivers/net/ethernet/intel/libie/Makefile
 create mode 100644 drivers/net/ethernet/intel/libie/rx.c
 create mode 100644 include/linux/net/intel/libie/rx.h

diff --git a/MAINTAINERS b/MAINTAINERS
index a151988646fe..95f2eff1b1e9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10650,7 +10650,8 @@ F:  
Documentation/networking/device_drivers/ethernet/intel/
 F: drivers/net/ethernet/intel/
 F: drivers/net/ethernet/intel/*/
 F: include/linux/avf/virtchnl.h
-F: include/linux/net/intel/iidc.h
+F: include/linux/net/intel/
+F: include/linux/net/intel/*/
 
 INTEL ETHERNET PROTOCOL DRIVER FOR RDMA
 M: Mustafa Ismail 
diff --git a/drivers/net/ethernet/intel/Kconfig 
b/drivers/net/ethernet/intel/Kconfig
index d55638ad8704..c7da7d05d93e 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -16,6 +16,8 @@ config NET_VENDOR_INTEL
 
 if NET_VENDOR_INTEL
 
+source "drivers/net/ethernet/intel/libie/Kconfig"
+
 config E100
tristate "Intel(R) PRO/100+ support"
depends on PCI
@@ -225,6 +227,7 @@ config I40E
depends on PTP_1588_CLOCK_OPTIONAL
depends on PCI
select AUXILIARY_BUS
+   select LIBIE
select NET_DEVLINK
help
  This driver supports Intel(R) Ethernet Controller XL710 Family of
@@ -253,6 +256,8 @@ config I40E_DCB
 # so that CONFIG_IAVF symbol will always mirror the state of CONFIG_I40EVF
 config IAVF
tristate
+   select LIBIE
+
 config I40EVF
tristate "Intel(R) Ethernet Adaptive Virtual Function support"
select IAVF
@@ -283,6 +288,7 @@ config ICE
depends on GNSS || GNSS = n
select AUXILIARY_BUS
select DIMLIB
+   select LIBIE
select NET_DEVLINK
select PLDMFW
select DPLL
diff --git a/drivers/net/ethernet/intel/Makefile 
b/drivers/net/ethernet/intel/Makefile
index dacb481ee5b1..60ca00a01de4 100644
--- a/drivers/net/ethernet/intel/Makefile
+++ b/drivers/net/ethernet/intel/Makefile
@@ -16,3 +16,4 @@ obj-$(CONFIG_IAVF) += iavf/
 obj-$(CONFIG_FM10K) += fm10k/
 obj-$(CONFIG_ICE) += ice/
 obj-$(CONFIG_IDPF) += idpf/
+obj-$(CONFIG_LIBIE) += libie/
diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c 
b/drivers/net/ethernet/int

[Intel-wired-lan] [PATCH net-next v7 04/12] iavf: kill "legacy-rx" for good

2023-12-13 Thread Alexander Lobakin
Ever since build_skb() became stable, the old way with allocating an skb
for storing the headers separately, which will be then copied manually,
was slower, less flexible, and thus obsolete.

* It had higher pressure on MM since it actually allocates new pages,
  which then get split and refcount-biased (NAPI page cache);
* It implies memcpy() of packet headers (40+ bytes per each frame);
* the actual header length was calculated via eth_get_headlen(), which
  invokes Flow Dissector and thus wastes a bunch of CPU cycles;
* XDP makes it even more weird since it requires headroom for long and
  also tailroom for some time (since mbuf landed). Take a look at the
  ice driver, which is built around work-arounds to make XDP work with
  it.

Even on some quite low-end hardware (not a common case for 100G NICs) it
was performing worse.
The only advantage "legacy-rx" had is that it didn't require any
reserved headroom and tailroom. But iavf didn't use this, as it always
splits pages into two halves of 2k, while that save would only be useful
when striding. And again, XDP effectively removes that sole pro.

There's a train of features to land in IAVF soon: Page Pool, XDP, XSk,
multi-buffer etc. Each new would require adding more and more Danse
Macabre for absolutely no reason, besides making hotpath less and less
effective.
Remove the "feature" with all the related code. This includes at least
one very hot branch (typically hit on each new frame), which was either
always-true or always-false at least for a complete NAPI bulk of 64
frames, the whole private flags cruft, and so on. Some stats:

Function: add/remove: 0/4 grow/shrink: 0/7 up/down: 0/-721 (-721)
RO Data: add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-40 (-40)

Reviewed-by: Alexander Duyck 
Signed-off-by: Alexander Lobakin 
---
 drivers/net/ethernet/intel/iavf/iavf.h|   2 +-
 .../net/ethernet/intel/iavf/iavf_ethtool.c| 140 --
 drivers/net/ethernet/intel/iavf/iavf_main.c   |  10 +-
 drivers/net/ethernet/intel/iavf/iavf_txrx.c   |  84 +--
 drivers/net/ethernet/intel/iavf/iavf_txrx.h   |  27 +---
 .../net/ethernet/intel/iavf/iavf_virtchnl.c   |   3 +-
 6 files changed, 8 insertions(+), 258 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf.h 
b/drivers/net/ethernet/intel/iavf/iavf.h
index e7ab89dc883a..64c94e9a6b2d 100644
--- a/drivers/net/ethernet/intel/iavf/iavf.h
+++ b/drivers/net/ethernet/intel/iavf/iavf.h
@@ -287,7 +287,7 @@ struct iavf_adapter {
 #define IAVF_FLAG_RESET_PENDINGBIT(4)
 #define IAVF_FLAG_RESET_NEEDED BIT(5)
 #define IAVF_FLAG_WB_ON_ITR_CAPABLEBIT(6)
-#define IAVF_FLAG_LEGACY_RXBIT(15)
+/* BIT(15) is free, was IAVF_FLAG_LEGACY_RX */
 #define IAVF_FLAG_REINIT_ITR_NEEDEDBIT(16)
 #define IAVF_FLAG_QUEUES_DISABLED  BIT(17)
 #define IAVF_FLAG_SETUP_NETDEV_FEATURESBIT(18)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c 
b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
index 631c7cfdc814..4b8b068e9d11 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
@@ -239,29 +239,6 @@ static const struct iavf_stats iavf_gstrings_stats[] = {
 
 #define IAVF_QUEUE_STATS_LEN   ARRAY_SIZE(iavf_gstrings_queue_stats)
 
-/* For now we have one and only one private flag and it is only defined
- * when we have support for the SKIP_CPU_SYNC DMA attribute.  Instead
- * of leaving all this code sitting around empty we will strip it unless
- * our one private flag is actually available.
- */
-struct iavf_priv_flags {
-   char flag_string[ETH_GSTRING_LEN];
-   u32 flag;
-   bool read_only;
-};
-
-#define IAVF_PRIV_FLAG(_name, _flag, _read_only) { \
-   .flag_string = _name, \
-   .flag = _flag, \
-   .read_only = _read_only, \
-}
-
-static const struct iavf_priv_flags iavf_gstrings_priv_flags[] = {
-   IAVF_PRIV_FLAG("legacy-rx", IAVF_FLAG_LEGACY_RX, 0),
-};
-
-#define IAVF_PRIV_FLAGS_STR_LEN ARRAY_SIZE(iavf_gstrings_priv_flags)
-
 /**
  * iavf_get_link_ksettings - Get Link Speed and Duplex settings
  * @netdev: network interface device structure
@@ -341,8 +318,6 @@ static int iavf_get_sset_count(struct net_device *netdev, 
int sset)
return IAVF_STATS_LEN +
(IAVF_QUEUE_STATS_LEN * 2 *
 netdev->real_num_tx_queues);
-   else if (sset == ETH_SS_PRIV_FLAGS)
-   return IAVF_PRIV_FLAGS_STR_LEN;
else
return -EINVAL;
 }
@@ -384,21 +359,6 @@ static void iavf_get_ethtool_stats(struct net_device 
*netdev,
rcu_read_unlock();
 }
 
-/**
- * iavf_get_priv_flag_strings - Get private flag strings
- * @netdev: network interface device structure
- * @data: buffer for string data
- *
- * Builds the private flags string table
- **/
-static void iavf_get_priv_flag_strings(struct net_device *netdev, u8 *data)
-{
-   unsigned int i;
-
-

[Intel-wired-lan] [PATCH net-next v7 05/12] iavf: drop page splitting and recycling

2023-12-13 Thread Alexander Lobakin
As an intermediate step, remove all page splitting/recycling code. Just
always allocate a new page and don't touch its refcount, so that it gets
freed by the core stack later.
Same for the "in-place" recycling, i.e. when an unused buffer gets
assigned to a first needs-refilling descriptor. In some cases, this
was leading to moving up to 63 &iavf_rx_buf structures around the ring
on a per-field basis -- not something wanted on hotpath.
The change allows to greatly simplify certain parts of the code:

Function: add/remove: 0/2 grow/shrink: 0/7 up/down: 0/-744 (-744)

Although the array of &iavf_rx_buf is barely used now and could be
replaced with just page pointer array, don't touch it now to not
complicate replacing it with libie Rx buffer struct later on.
No surprise perf loses up to 30% here, but that regression will
go away once PP lands.
Note that iavf_rx_pg_*() definitions are left to reduce diffstat.
They will be removed with the conversion to Page Pool.

Signed-off-by: Alexander Lobakin 
---
 drivers/net/ethernet/intel/iavf/iavf_main.c   |  24 +--
 drivers/net/ethernet/intel/iavf/iavf_txrx.c   | 152 +-
 drivers/net/ethernet/intel/iavf/iavf_txrx.h   |  65 
 drivers/net/ethernet/intel/iavf/iavf_type.h   |   2 -
 .../net/ethernet/intel/iavf/iavf_virtchnl.c   |   8 +-
 5 files changed, 10 insertions(+), 241 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c 
b/drivers/net/ethernet/intel/iavf/iavf_main.c
index f81144466496..060002d1ebf0 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -736,32 +736,10 @@ static void iavf_configure_tx(struct iavf_adapter 
*adapter)
  **/
 static void iavf_configure_rx(struct iavf_adapter *adapter)
 {
-   unsigned int rx_buf_len = IAVF_RXBUFFER_2048;
struct iavf_hw *hw = &adapter->hw;
-   int i;
-
-   if (PAGE_SIZE < 8192) {
-   struct net_device *netdev = adapter->netdev;
 
-   /* For jumbo frames on systems with 4K pages we have to use
-* an order 1 page, so we might as well increase the size
-* of our Rx buffer to make better use of the available space
-*/
-   rx_buf_len = IAVF_RXBUFFER_3072;
-
-   /* We use a 1536 buffer size for configurations with
-* standard Ethernet mtu.  On x86 this gives us enough room
-* for shared info and 192 bytes of padding.
-*/
-   if (!IAVF_2K_TOO_SMALL_WITH_PADDING &&
-   (netdev->mtu <= ETH_DATA_LEN))
-   rx_buf_len = IAVF_RXBUFFER_1536 - NET_IP_ALIGN;
-   }
-
-   for (i = 0; i < adapter->num_active_queues; i++) {
+   for (u32 i = 0; i < adapter->num_active_queues; i++)
adapter->rx_rings[i].tail = hw->hw_addr + IAVF_QRX_TAIL1(i);
-   adapter->rx_rings[i].rx_buf_len = rx_buf_len;
-   }
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c 
b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
index 27cea26cc53e..665ee1feb877 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
@@ -714,7 +714,7 @@ static void iavf_clean_rx_ring(struct iavf_ring *rx_ring)
dma_sync_single_range_for_cpu(rx_ring->dev,
  rx_bi->dma,
  rx_bi->page_offset,
- rx_ring->rx_buf_len,
+ IAVF_RXBUFFER_3072,
  DMA_FROM_DEVICE);
 
/* free resources associated with mapping */
@@ -723,7 +723,7 @@ static void iavf_clean_rx_ring(struct iavf_ring *rx_ring)
 DMA_FROM_DEVICE,
 IAVF_RX_DMA_ATTR);
 
-   __page_frag_cache_drain(rx_bi->page, rx_bi->pagecnt_bias);
+   __free_page(rx_bi->page);
 
rx_bi->page = NULL;
rx_bi->page_offset = 0;
@@ -735,7 +735,6 @@ static void iavf_clean_rx_ring(struct iavf_ring *rx_ring)
/* Zero out the descriptor ring */
memset(rx_ring->desc, 0, rx_ring->size);
 
-   rx_ring->next_to_alloc = 0;
rx_ring->next_to_clean = 0;
rx_ring->next_to_use = 0;
 }
@@ -791,7 +790,6 @@ int iavf_setup_rx_descriptors(struct iavf_ring *rx_ring)
goto err;
}
 
-   rx_ring->next_to_alloc = 0;
rx_ring->next_to_clean = 0;
rx_ring->next_to_use = 0;
 
@@ -811,9 +809,6 @@ static void iavf_release_rx_desc(struct iavf_ring *rx_ring, 
u32 val)
 {
rx_ring->next_to_use = val;
 
-   /* update next to alloc since we have filled the ring */
-   rx_ring->next_to_alloc = val;
-
/* Force memory writes to complete before letting h/w
 * know there are new descriptors to fetch.  (Only
 * applica

[Intel-wired-lan] [PATCH net-next v7 06/12] page_pool: constify some read-only function arguments

2023-12-13 Thread Alexander Lobakin
There are several functions taking pointers to data they don't modify.
This includes statistics fetching, page and page_pool parameters, etc.
Constify the pointers, so that call sites will be able to pass const
pointers as well.
No functional changes, no visible changes in functions sizes.

Reviewed-by: Ilias Apalodimas 
Signed-off-by: Alexander Lobakin 
---
 include/net/page_pool/helpers.h | 10 +-
 net/core/page_pool.c|  8 
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 7dc65774cde5..c860fad50d00 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -58,7 +58,7 @@
 /* Deprecated driver-facing API, use netlink instead */
 int page_pool_ethtool_stats_get_count(void);
 u8 *page_pool_ethtool_stats_get_strings(u8 *data);
-u64 *page_pool_ethtool_stats_get(u64 *data, void *stats);
+u64 *page_pool_ethtool_stats_get(u64 *data, const void *stats);
 
 bool page_pool_get_stats(const struct page_pool *pool,
 struct page_pool_stats *stats);
@@ -73,7 +73,7 @@ static inline u8 *page_pool_ethtool_stats_get_strings(u8 
*data)
return data;
 }
 
-static inline u64 *page_pool_ethtool_stats_get(u64 *data, void *stats)
+static inline u64 *page_pool_ethtool_stats_get(u64 *data, const void *stats)
 {
return data;
 }
@@ -204,8 +204,8 @@ static inline void *page_pool_dev_alloc_va(struct page_pool 
*pool,
  * Get the stored dma direction. A driver might decide to store this locally
  * and avoid the extra cache line from page_pool to determine the direction.
  */
-static
-inline enum dma_data_direction page_pool_get_dma_dir(struct page_pool *pool)
+static inline enum dma_data_direction
+page_pool_get_dma_dir(const struct page_pool *pool)
 {
return pool->p.dma_dir;
 }
@@ -357,7 +357,7 @@ static inline void page_pool_free_va(struct page_pool 
*pool, void *va,
  * Fetch the DMA address of the page. The page pool to which the page belongs
  * must had been created with PP_FLAG_DMA_MAP.
  */
-static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
+static inline dma_addr_t page_pool_get_dma_addr(const struct page *page)
 {
dma_addr_t ret = page->dma_addr;
 
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 59aca3339222..4295aec0be40 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -121,9 +121,9 @@ int page_pool_ethtool_stats_get_count(void)
 }
 EXPORT_SYMBOL(page_pool_ethtool_stats_get_count);
 
-u64 *page_pool_ethtool_stats_get(u64 *data, void *stats)
+u64 *page_pool_ethtool_stats_get(u64 *data, const void *stats)
 {
-   struct page_pool_stats *pool_stats = stats;
+   const struct page_pool_stats *pool_stats = stats;
 
*data++ = pool_stats->alloc_stats.fast;
*data++ = pool_stats->alloc_stats.slow;
@@ -360,8 +360,8 @@ static struct page *__page_pool_get_cached(struct page_pool 
*pool)
return page;
 }
 
-static void page_pool_dma_sync_for_device(struct page_pool *pool,
- struct page *page,
+static void page_pool_dma_sync_for_device(const struct page_pool *pool,
+ const struct page *page,
  unsigned int dma_sync_size)
 {
dma_addr_t dma_addr = page_pool_get_dma_addr(page);
-- 
2.43.0

___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


[Intel-wired-lan] [PATCH net-next v7 07/12] page_pool: add DMA-sync-for-CPU inline helper

2023-12-13 Thread Alexander Lobakin
Each driver is responsible for syncing buffers written by HW for CPU
before accessing them. Almost each PP-enabled driver uses the same
pattern, which could be shorthanded into a static inline to make driver
code a little bit more compact.
Introduce a simple helper which performs DMA synchronization for the
size passed from the driver. It can be used even when the pool doesn't
manage DMA-syncs-for-device, just make sure the page has a correct DMA
address set via page_pool_set_dma_addr().

Signed-off-by: Alexander Lobakin 
---
 include/net/page_pool/helpers.h | 24 
 1 file changed, 24 insertions(+)

diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index c860fad50d00..73ef4d1e12d4 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -52,6 +52,8 @@
 #ifndef _NET_PAGE_POOL_HELPERS_H
 #define _NET_PAGE_POOL_HELPERS_H
 
+#include 
+
 #include 
 
 #ifdef CONFIG_PAGE_POOL_STATS
@@ -382,6 +384,28 @@ static inline bool page_pool_set_dma_addr(struct page 
*page, dma_addr_t addr)
return false;
 }
 
+/**
+ * page_pool_dma_sync_for_cpu - sync Rx page for CPU after it's written by HW
+ * @pool: &page_pool the @page belongs to
+ * @page: page to sync
+ * @offset: offset from page start to "hard" start if using PP frags
+ * @dma_sync_size: size of the data written to the page
+ *
+ * Can be used as a shorthand to sync Rx pages before accessing them in the
+ * driver. Caller must ensure the pool was created with ``PP_FLAG_DMA_MAP``.
+ * Note that this version performs DMA sync unconditionally, even if the
+ * associated PP doesn't perform sync-for-device.
+ */
+static inline void page_pool_dma_sync_for_cpu(const struct page_pool *pool,
+ const struct page *page,
+ u32 offset, u32 dma_sync_size)
+{
+   dma_sync_single_range_for_cpu(pool->p.dev,
+ page_pool_get_dma_addr(page),
+ offset + pool->p.offset, dma_sync_size,
+ page_pool_get_dma_dir(pool));
+}
+
 static inline bool page_pool_put(struct page_pool *pool)
 {
return refcount_dec_and_test(&pool->user_cnt);
-- 
2.43.0

___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


[Intel-wired-lan] [PATCH net-next v7 08/12] libie: add Rx buffer management (via Page Pool)

2023-12-13 Thread Alexander Lobakin
Add a couple intuitive helpers to hide Rx buffer implementation details
in the library and not multiplicate it between drivers. The settings are
optimized for Intel hardware, but nothing really HW-specific here.
Use the new page_pool_dev_alloc() to dynamically switch between
split-page and full-page modes depending on MTU, page size, required
headroom etc. For example, on x86_64 with the default driver settings
each page is shared between 2 buffers. Turning on XDP (not in this
series) -> increasing headroom requirement pushes truesize out of 2048
boundary, leading to that each buffer starts getting a full page.
The "ceiling" limit is %PAGE_SIZE, as only order-0 pages are used to
avoid compound overhead. For the above architecture, this means maximum
linear frame size of 3712 w/o XDP.
Not that &libie_buf_queue is not a complete queue/ring structure for
now, rather a shim, but eventually the libie-enabled drivers will move
to it, with iavf being the first one.

Signed-off-by: Alexander Lobakin 
---
 drivers/net/ethernet/intel/libie/Kconfig |   1 +
 drivers/net/ethernet/intel/libie/rx.c|  69 
 include/linux/net/intel/libie/rx.h   | 133 ++-
 3 files changed, 202 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/libie/Kconfig 
b/drivers/net/ethernet/intel/libie/Kconfig
index 1eda4a5faa5a..6e0162fb94d2 100644
--- a/drivers/net/ethernet/intel/libie/Kconfig
+++ b/drivers/net/ethernet/intel/libie/Kconfig
@@ -3,6 +3,7 @@
 
 config LIBIE
tristate
+   select PAGE_POOL
help
  libie (Intel Ethernet library) is a common library containing
  routines shared between several Intel Ethernet drivers.
diff --git a/drivers/net/ethernet/intel/libie/rx.c 
b/drivers/net/ethernet/intel/libie/rx.c
index f503476d8eef..04037973cd87 100644
--- a/drivers/net/ethernet/intel/libie/rx.c
+++ b/drivers/net/ethernet/intel/libie/rx.c
@@ -3,6 +3,75 @@
 
 #include 
 
+/* Rx buffer management */
+
+/**
+ * libie_rx_hw_len - get the actual buffer size to be passed to HW
+ * @pp: &page_pool_params of the netdev to calculate the size for
+ *
+ * Return: HW-writeable length per one buffer to pass it to the HW accounting:
+ * MTU the @dev has, HW required alignment, minimum and maximum allowed values,
+ * and system's page size.
+ */
+static u32 libie_rx_hw_len(const struct page_pool_params *pp)
+{
+   u32 len;
+
+   len = READ_ONCE(pp->netdev->mtu) + LIBIE_RX_LL_LEN;
+   len = ALIGN(len, LIBIE_RX_BUF_LEN_ALIGN);
+   len = clamp(len, LIBIE_MIN_RX_BUF_LEN, pp->max_len);
+
+   return len;
+}
+
+/**
+ * libie_rx_page_pool_create - create a PP with the default libie settings
+ * @bq: buffer queue struct to fill
+ * @napi: &napi_struct covering this PP (no usage outside its poll loops)
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int libie_rx_page_pool_create(struct libie_buf_queue *bq,
+ struct napi_struct *napi)
+{
+   struct page_pool_params pp = {
+   .flags  = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
+   .order  = LIBIE_RX_PAGE_ORDER,
+   .pool_size  = bq->count,
+   .nid= NUMA_NO_NODE,
+   .dev= napi->dev->dev.parent,
+   .netdev = napi->dev,
+   .napi   = napi,
+   .dma_dir= DMA_FROM_DEVICE,
+   .offset = LIBIE_SKB_HEADROOM,
+   };
+
+   /* HW-writeable / syncable length per one page */
+   pp.max_len = LIBIE_RX_BUF_LEN(pp.offset);
+
+   /* HW-writeable length per buffer */
+   bq->rx_buf_len = libie_rx_hw_len(&pp);
+   /* Buffer size to allocate */
+   bq->truesize = roundup_pow_of_two(SKB_HEAD_ALIGN(pp.offset +
+bq->rx_buf_len));
+
+   bq->pp = page_pool_create(&pp);
+
+   return PTR_ERR_OR_ZERO(bq->pp);
+}
+EXPORT_SYMBOL_NS_GPL(libie_rx_page_pool_create, LIBIE);
+
+/**
+ * libie_rx_page_pool_destroy - destroy a &page_pool created by libie
+ * @bq: buffer queue to process
+ */
+void libie_rx_page_pool_destroy(struct libie_buf_queue *bq)
+{
+   page_pool_destroy(bq->pp);
+   bq->pp = NULL;
+}
+EXPORT_SYMBOL_NS_GPL(libie_rx_page_pool_destroy, LIBIE);
+
 /* O(1) converting i40e/ice/iavf's 8/10-bit hardware packet type to a parsed
  * bitfield struct.
  */
diff --git a/include/linux/net/intel/libie/rx.h 
b/include/linux/net/intel/libie/rx.h
index 55263930aa99..71bc9a1a9856 100644
--- a/include/linux/net/intel/libie/rx.h
+++ b/include/linux/net/intel/libie/rx.h
@@ -4,7 +4,138 @@
 #ifndef __LIBIE_RX_H
 #define __LIBIE_RX_H
 
-#include 
+#include 
+#include 
+
+/* Rx MTU/buffer/truesize helpers. Mostly pure software-side; HW-defined values
+ * are valid for all Intel HW.
+ */
+
+/* Space reserved in front of each frame */
+#define LIBIE_SKB_HEADROOM (NET_SKB_PAD + NET_IP_ALIGN)
+/* Maximum headroom to calculate max MT

[Intel-wired-lan] [PATCH net-next v7 09/12] iavf: pack iavf_ring more efficiently

2023-12-13 Thread Alexander Lobakin
Before replacing the Rx buffer management with libie, clean up
&iavf_ring a bit.
There are several fields not used anywhere in the code -- simply remove
them. Move ::tail up to remove a hole. Replace ::arm_wb boolean with
1-bit flag in ::flags to free 1 more byte. Finally, move ::prev_pkt_ctr
out of &iavf_tx_queue_stats -- it doesn't belong there (used for Tx
stall detection). Place it next to the stats on the ring itself to fill
the 4-byte slot.
The result: no holes and all the hot fields fit into the first 64-byte
cacheline.

Signed-off-by: Alexander Lobakin 
---
 drivers/net/ethernet/intel/iavf/iavf_txrx.c | 12 +--
 drivers/net/ethernet/intel/iavf/iavf_txrx.h | 22 +++--
 2 files changed, 9 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c 
b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
index 665ee1feb877..62f976d322ab 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
@@ -184,7 +184,7 @@ void iavf_detect_recover_hung(struct iavf_vsi *vsi)
 * pending work.
 */
packets = tx_ring->stats.packets & INT_MAX;
-   if (tx_ring->tx_stats.prev_pkt_ctr == packets) {
+   if (tx_ring->prev_pkt_ctr == packets) {
iavf_force_wb(vsi, tx_ring->q_vector);
continue;
}
@@ -193,7 +193,7 @@ void iavf_detect_recover_hung(struct iavf_vsi *vsi)
 * to iavf_get_tx_pending()
 */
smp_rmb();
-   tx_ring->tx_stats.prev_pkt_ctr =
+   tx_ring->prev_pkt_ctr =
  iavf_get_tx_pending(tx_ring, true) ? packets : -1;
}
}
@@ -319,7 +319,7 @@ static bool iavf_clean_tx_irq(struct iavf_vsi *vsi,
((j / WB_STRIDE) == 0) && (j > 0) &&
!test_bit(__IAVF_VSI_DOWN, vsi->state) &&
(IAVF_DESC_UNUSED(tx_ring) != tx_ring->count))
-   tx_ring->arm_wb = true;
+   tx_ring->flags |= IAVF_TXR_FLAGS_ARM_WB;
}
 
/* notify netdev of completed buffers */
@@ -674,7 +674,7 @@ int iavf_setup_tx_descriptors(struct iavf_ring *tx_ring)
 
tx_ring->next_to_use = 0;
tx_ring->next_to_clean = 0;
-   tx_ring->tx_stats.prev_pkt_ctr = -1;
+   tx_ring->prev_pkt_ctr = -1;
return 0;
 
 err:
@@ -1494,8 +1494,8 @@ int iavf_napi_poll(struct napi_struct *napi, int budget)
clean_complete = false;
continue;
}
-   arm_wb |= ring->arm_wb;
-   ring->arm_wb = false;
+   arm_wb |= !!(ring->flags & IAVF_TXR_FLAGS_ARM_WB);
+   ring->flags &= ~IAVF_TXR_FLAGS_ARM_WB;
}
 
/* Handle case where we are called by netpoll with a budget of 0 */
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.h 
b/drivers/net/ethernet/intel/iavf/iavf_txrx.h
index e01777531635..ed559fa6f214 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.h
+++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.h
@@ -227,7 +227,6 @@ struct iavf_tx_queue_stats {
u64 tx_done_old;
u64 tx_linearize;
u64 tx_force_wb;
-   int prev_pkt_ctr;
u64 tx_lost_interrupt;
 };
 
@@ -237,12 +236,6 @@ struct iavf_rx_queue_stats {
u64 alloc_buff_failed;
 };
 
-enum iavf_ring_state_t {
-   __IAVF_TX_FDIR_INIT_DONE,
-   __IAVF_TX_XPS_INIT_DONE,
-   __IAVF_RING_STATE_NBITS /* must be last */
-};
-
 /* some useful defines for virtchannel interface, which
  * is the only remaining user of header split
  */
@@ -264,10 +257,8 @@ struct iavf_ring {
struct iavf_tx_buffer *tx_bi;
struct iavf_rx_buffer *rx_bi;
};
-   DECLARE_BITMAP(state, __IAVF_RING_STATE_NBITS);
-   u16 queue_index;/* Queue number of ring */
-   u8 dcb_tc;  /* Traffic class of ring */
u8 __iomem *tail;
+   u16 queue_index;/* Queue number of ring */
 
/* high bit set means dynamic, use accessors routines to read/write.
 * hardware only supports 2us resolution for the ITR registers.
@@ -277,22 +268,14 @@ struct iavf_ring {
u16 itr_setting;
 
u16 count;  /* Number of descriptors */
-   u16 reg_idx;/* HW register index of the ring */
 
/* used in interrupt processing */
u16 next_to_use;
u16 next_to_clean;
 
-   u8 atr_sample_rate;
-   u8 atr_count;
-
-   bool ring_active;   /* is ring online or not */
-   bool arm_wb;/* do something to arm write back */
-   u8 packet_stride;
-
u16 flags;
 #define IAVF_TXR_FLAGS_WB_ON_ITR   BIT(0)
-/

[Intel-wired-lan] [PATCH net-next v7 10/12] iavf: switch to Page Pool

2023-12-13 Thread Alexander Lobakin
Now that the IAVF driver simply uses dev_alloc_page() + free_page() with
no custom recycling logics, it can easily be switched to using Page
Pool / libie API instead.
This allows to removing the whole dancing around headroom, HW buffer
size, and page order. All DMA-for-device is now done in the PP core,
for-CPU -- in the libie helper.
Use skb_mark_for_recycle() to bring back the recycling and restore the
performance. Speaking of performance: on par with the baseline and
faster with the PP optimization series applied. But the memory usage for
1500b MTU is now almost 2x lower (x86_64) thanks to allocating a page
every second descriptor.

Signed-off-by: Alexander Lobakin 
---
 drivers/net/ethernet/intel/iavf/iavf_main.c   |   7 +-
 drivers/net/ethernet/intel/iavf/iavf_txrx.c   | 246 ++
 drivers/net/ethernet/intel/iavf/iavf_txrx.h   |  34 +--
 .../net/ethernet/intel/iavf/iavf_virtchnl.c   |  10 +-
 4 files changed, 92 insertions(+), 205 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c 
b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 060002d1ebf0..0becfdf2ed99 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright(c) 2013 - 2018 Intel Corporation. */
 
+#include 
+
 #include "iavf.h"
 #include "iavf_prototype.h"
 /* All iavf tracepoints are defined by the include below, which must
@@ -1605,7 +1607,6 @@ static int iavf_alloc_queues(struct iavf_adapter *adapter)
rx_ring = &adapter->rx_rings[i];
rx_ring->queue_index = i;
rx_ring->netdev = adapter->netdev;
-   rx_ring->dev = &adapter->pdev->dev;
rx_ring->count = adapter->rx_desc_count;
rx_ring->itr_setting = IAVF_ITR_RX_DEF;
}
@@ -2637,9 +2638,9 @@ static void iavf_init_config_adapter(struct iavf_adapter 
*adapter)
iavf_set_ethtool_ops(netdev);
netdev->watchdog_timeo = 5 * HZ;
 
-   /* MTU range: 68 - 9710 */
+   /* MTU range: 68 - 16356 */
netdev->min_mtu = ETH_MIN_MTU;
-   netdev->max_mtu = IAVF_MAX_RXBUFFER - IAVF_PACKET_HDR_PAD;
+   netdev->max_mtu = LIBIE_MAX_MTU;
 
if (!is_valid_ether_addr(adapter->hw.mac.addr)) {
dev_info(&pdev->dev, "Invalid MAC address %pM, using random\n",
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c 
b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
index 62f976d322ab..aa86a0f926c6 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
@@ -689,9 +689,6 @@ int iavf_setup_tx_descriptors(struct iavf_ring *tx_ring)
  **/
 static void iavf_clean_rx_ring(struct iavf_ring *rx_ring)
 {
-   unsigned long bi_size;
-   u16 i;
-
/* ring already cleared, nothing to do */
if (!rx_ring->rx_bi)
return;
@@ -701,40 +698,16 @@ static void iavf_clean_rx_ring(struct iavf_ring *rx_ring)
rx_ring->skb = NULL;
}
 
-   /* Free all the Rx ring sk_buffs */
-   for (i = 0; i < rx_ring->count; i++) {
-   struct iavf_rx_buffer *rx_bi = &rx_ring->rx_bi[i];
+   /* Free all the Rx ring buffers */
+   for (u32 i = rx_ring->next_to_clean; i != rx_ring->next_to_use; ) {
+   const struct libie_rx_buffer *rx_bi = &rx_ring->rx_bi[i];
 
-   if (!rx_bi->page)
-   continue;
+   page_pool_put_full_page(rx_ring->pp, rx_bi->page, false);
 
-   /* Invalidate cache lines that may have been written to by
-* device so that we avoid corrupting memory.
-*/
-   dma_sync_single_range_for_cpu(rx_ring->dev,
- rx_bi->dma,
- rx_bi->page_offset,
- IAVF_RXBUFFER_3072,
- DMA_FROM_DEVICE);
-
-   /* free resources associated with mapping */
-   dma_unmap_page_attrs(rx_ring->dev, rx_bi->dma,
-iavf_rx_pg_size(rx_ring),
-DMA_FROM_DEVICE,
-IAVF_RX_DMA_ATTR);
-
-   __free_page(rx_bi->page);
-
-   rx_bi->page = NULL;
-   rx_bi->page_offset = 0;
+   if (unlikely(++i == rx_ring->count))
+   i = 0;
}
 
-   bi_size = sizeof(struct iavf_rx_buffer) * rx_ring->count;
-   memset(rx_ring->rx_bi, 0, bi_size);
-
-   /* Zero out the descriptor ring */
-   memset(rx_ring->desc, 0, rx_ring->size);
-
rx_ring->next_to_clean = 0;
rx_ring->next_to_use = 0;
 }
@@ -747,15 +720,22 @@ static void iavf_clean_rx_ring(struct iavf_ring *rx_ring)
  **/
 void iavf_free_rx_resources(struct iavf_ring *rx_ring)
 {
+   struct libie_buf_queue bq = 

[Intel-wired-lan] [PATCH net-next v7 11/12] libie: add common queue stats

2023-12-13 Thread Alexander Lobakin
Next stop, per-queue private stats. They have only subtle differences
from driver to driver and can easily be resolved.
Define common structures, inline helpers and Ethtool helpers to collect,
update and export the statistics. Use u64_stats_t right from the start,
as well as the corresponding helpers to ensure tear-free operations.
For the NAPI parts of both Rx and Tx, also define small onstack
containers to update them in polling loops and then sync the actual
containers once a loop ends.
The drivers will be switched to use this API later on a per-driver
basis, along with conversion to PP.

Signed-off-by: Alexander Lobakin 
---
 drivers/net/ethernet/intel/libie/Makefile |   1 +
 drivers/net/ethernet/intel/libie/stats.c  | 121 +++
 include/linux/net/intel/libie/stats.h | 179 ++
 3 files changed, 301 insertions(+)
 create mode 100644 drivers/net/ethernet/intel/libie/stats.c
 create mode 100644 include/linux/net/intel/libie/stats.h

diff --git a/drivers/net/ethernet/intel/libie/Makefile 
b/drivers/net/ethernet/intel/libie/Makefile
index 95e81d09b474..76f32253481b 100644
--- a/drivers/net/ethernet/intel/libie/Makefile
+++ b/drivers/net/ethernet/intel/libie/Makefile
@@ -4,3 +4,4 @@
 obj-$(CONFIG_LIBIE)+= libie.o
 
 libie-objs += rx.o
+libie-objs += stats.o
diff --git a/drivers/net/ethernet/intel/libie/stats.c 
b/drivers/net/ethernet/intel/libie/stats.c
new file mode 100644
index ..85f5d279406f
--- /dev/null
+++ b/drivers/net/ethernet/intel/libie/stats.c
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2023 Intel Corporation. */
+
+#include 
+
+#include 
+#include 
+
+/* Rx per-queue stats */
+
+static const char * const libie_rq_stats_str[] = {
+#define act(s) __stringify(s),
+   DECLARE_LIBIE_RQ_STATS(act)
+#undef act
+};
+
+#define LIBIE_RQ_STATS_NUM ARRAY_SIZE(libie_rq_stats_str)
+
+/**
+ * libie_rq_stats_get_sset_count - get the number of Ethtool RQ stats provided
+ *
+ * Return: number of per-queue Rx stats supported by the library.
+ */
+u32 libie_rq_stats_get_sset_count(void)
+{
+   return LIBIE_RQ_STATS_NUM;
+}
+EXPORT_SYMBOL_NS_GPL(libie_rq_stats_get_sset_count, LIBIE);
+
+/**
+ * libie_rq_stats_get_strings - get the name strings of Ethtool RQ stats
+ * @data: reference to the cursor pointing to the output buffer
+ * @qid: RQ number to print in the prefix
+ */
+void libie_rq_stats_get_strings(u8 **data, u32 qid)
+{
+   for (u32 i = 0; i < LIBIE_RQ_STATS_NUM; i++)
+   ethtool_sprintf(data, "rq%u_%s", qid, libie_rq_stats_str[i]);
+}
+EXPORT_SYMBOL_NS_GPL(libie_rq_stats_get_strings, LIBIE);
+
+/**
+ * libie_rq_stats_get_data - get the RQ stats in Ethtool format
+ * @data: reference to the cursor pointing to the output array
+ * @stats: RQ stats container from the queue
+ */
+void libie_rq_stats_get_data(u64 **data, const struct libie_rq_stats *stats)
+{
+   u64 sarr[LIBIE_RQ_STATS_NUM];
+   u32 start;
+
+   do {
+   start = u64_stats_fetch_begin(&stats->syncp);
+
+   for (u32 i = 0; i < LIBIE_RQ_STATS_NUM; i++)
+   sarr[i] = u64_stats_read(&stats->raw[i]);
+   } while (u64_stats_fetch_retry(&stats->syncp, start));
+
+   for (u32 i = 0; i < LIBIE_RQ_STATS_NUM; i++)
+   (*data)[i] += sarr[i];
+
+   *data += LIBIE_RQ_STATS_NUM;
+}
+EXPORT_SYMBOL_NS_GPL(libie_rq_stats_get_data, LIBIE);
+
+/* Tx per-queue stats */
+
+static const char * const libie_sq_stats_str[] = {
+#define act(s) __stringify(s),
+   DECLARE_LIBIE_SQ_STATS(act)
+#undef act
+};
+
+#define LIBIE_SQ_STATS_NUM ARRAY_SIZE(libie_sq_stats_str)
+
+/**
+ * libie_sq_stats_get_sset_count - get the number of Ethtool SQ stats provided
+ *
+ * Return: number of per-queue Tx stats supported by the library.
+ */
+u32 libie_sq_stats_get_sset_count(void)
+{
+   return LIBIE_SQ_STATS_NUM;
+}
+EXPORT_SYMBOL_NS_GPL(libie_sq_stats_get_sset_count, LIBIE);
+
+/**
+ * libie_sq_stats_get_strings - get the name strings of Ethtool SQ stats
+ * @data: reference to the cursor pointing to the output buffer
+ * @qid: SQ number to print in the prefix
+ */
+void libie_sq_stats_get_strings(u8 **data, u32 qid)
+{
+   for (u32 i = 0; i < LIBIE_SQ_STATS_NUM; i++)
+   ethtool_sprintf(data, "sq%u_%s", qid, libie_sq_stats_str[i]);
+}
+EXPORT_SYMBOL_NS_GPL(libie_sq_stats_get_strings, LIBIE);
+
+/**
+ * libie_sq_stats_get_data - get the SQ stats in Ethtool format
+ * @data: reference to the cursor pointing to the output array
+ * @stats: SQ stats container from the queue
+ */
+void libie_sq_stats_get_data(u64 **data, const struct libie_sq_stats *stats)
+{
+   u64 sarr[LIBIE_SQ_STATS_NUM];
+   u32 start;
+
+   do {
+   start = u64_stats_fetch_begin(&stats->syncp);
+
+   for (u32 i = 0; i < LIBIE_SQ_STATS_NUM; i++)
+   sarr[i] = u64_stats_read(&stats->raw[i]);
+   } while (u64_stats_fetch_r

[Intel-wired-lan] [PATCH net-next v7 12/12] iavf: switch queue stats to libie

2023-12-13 Thread Alexander Lobakin
iavf is pretty much ready for using the generic libie stats, so drop all
the custom code and just use generic definitions. The only thing is that
it previously lacked the counter of Tx queue stops. It's present in the
other drivers, so add it here as well.
The rest is straightforward. Note that it makes the ring structure 1 CL
bigger, but no layout changes since the stats structures are 32-byte
aligned.

Signed-off-by: Alexander Lobakin 
---
 .../net/ethernet/intel/iavf/iavf_ethtool.c| 94 +--
 drivers/net/ethernet/intel/iavf/iavf_main.c   |  2 +
 drivers/net/ethernet/intel/iavf/iavf_txrx.c   | 84 +
 drivers/net/ethernet/intel/iavf/iavf_txrx.h   | 28 +-
 4 files changed, 72 insertions(+), 136 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c 
b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
index 4b8b068e9d11..87e514e4d371 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
@@ -2,9 +2,10 @@
 /* Copyright(c) 2013 - 2018 Intel Corporation. */
 
 /* ethtool support for iavf */
-#include "iavf.h"
 
-#include 
+#include 
+
+#include "iavf.h"
 
 /* ethtool statistics helpers */
 
@@ -46,16 +47,6 @@ struct iavf_stats {
.stat_offset = offsetof(_type, _stat) \
 }
 
-/* Helper macro for defining some statistics related to queues */
-#define IAVF_QUEUE_STAT(_name, _stat) \
-   IAVF_STAT(struct iavf_ring, _name, _stat)
-
-/* Stats associated with a Tx or Rx ring */
-static const struct iavf_stats iavf_gstrings_queue_stats[] = {
-   IAVF_QUEUE_STAT("%s-%u.packets", stats.packets),
-   IAVF_QUEUE_STAT("%s-%u.bytes", stats.bytes),
-};
-
 /**
  * iavf_add_one_ethtool_stat - copy the stat into the supplied buffer
  * @data: location to store the stat value
@@ -141,43 +132,6 @@ __iavf_add_ethtool_stats(u64 **data, void *pointer,
 #define iavf_add_ethtool_stats(data, pointer, stats) \
__iavf_add_ethtool_stats(data, pointer, stats, ARRAY_SIZE(stats))
 
-/**
- * iavf_add_queue_stats - copy queue statistics into supplied buffer
- * @data: ethtool stats buffer
- * @ring: the ring to copy
- *
- * Queue statistics must be copied while protected by
- * u64_stats_fetch_begin, so we can't directly use iavf_add_ethtool_stats.
- * Assumes that queue stats are defined in iavf_gstrings_queue_stats. If the
- * ring pointer is null, zero out the queue stat values and update the data
- * pointer. Otherwise safely copy the stats from the ring into the supplied
- * buffer and update the data pointer when finished.
- *
- * This function expects to be called while under rcu_read_lock().
- **/
-static void
-iavf_add_queue_stats(u64 **data, struct iavf_ring *ring)
-{
-   const unsigned int size = ARRAY_SIZE(iavf_gstrings_queue_stats);
-   const struct iavf_stats *stats = iavf_gstrings_queue_stats;
-   unsigned int start;
-   unsigned int i;
-
-   /* To avoid invalid statistics values, ensure that we keep retrying
-* the copy until we get a consistent value according to
-* u64_stats_fetch_retry. But first, make sure our ring is
-* non-null before attempting to access its syncp.
-*/
-   do {
-   start = !ring ? 0 : u64_stats_fetch_begin(&ring->syncp);
-   for (i = 0; i < size; i++)
-   iavf_add_one_ethtool_stat(&(*data)[i], ring, &stats[i]);
-   } while (ring && u64_stats_fetch_retry(&ring->syncp, start));
-
-   /* Once we successfully copy the stats in, update the data pointer */
-   *data += size;
-}
-
 /**
  * __iavf_add_stat_strings - copy stat strings into ethtool buffer
  * @p: ethtool supplied buffer
@@ -237,8 +191,6 @@ static const struct iavf_stats iavf_gstrings_stats[] = {
 
 #define IAVF_STATS_LEN ARRAY_SIZE(iavf_gstrings_stats)
 
-#define IAVF_QUEUE_STATS_LEN   ARRAY_SIZE(iavf_gstrings_queue_stats)
-
 /**
  * iavf_get_link_ksettings - Get Link Speed and Duplex settings
  * @netdev: network interface device structure
@@ -308,18 +260,22 @@ static int iavf_get_link_ksettings(struct net_device 
*netdev,
  **/
 static int iavf_get_sset_count(struct net_device *netdev, int sset)
 {
-   /* Report the maximum number queues, even if not every queue is
-* currently configured. Since allocation of queues is in pairs,
-* use netdev->real_num_tx_queues * 2. The real_num_tx_queues is set
-* at device creation and never changes.
-*/
+   u32 num;
 
-   if (sset == ETH_SS_STATS)
-   return IAVF_STATS_LEN +
-   (IAVF_QUEUE_STATS_LEN * 2 *
-netdev->real_num_tx_queues);
-   else
+   switch (sset) {
+   case ETH_SS_STATS:
+   /* Per-queue */
+   num = libie_rq_stats_get_sset_count();
+   num += libie_sq_stats_get_sset_count();
+   num *= netdev->real_num_tx_queues;
+
+   /* Global */
+   num += IAVF_STATS_LEN;
+
+

Re: [Intel-wired-lan] [PATCH iwl-net] idpf: enable WB_ON_ITR

2023-12-13 Thread Michal Kubiak
On Tue, Dec 12, 2023 at 05:50:55PM +0100, Paul Menzel wrote:
> Dear Michal, dear Joshua,
> 
> 
> Thank you for your patch.
> 
> On 12/12/23 15:55, Michal Kubiak wrote:
> > From: Joshua Hay 
> > 
> > Tell hardware to writeback completed descriptors even when interrupts
> 
> Should you resend, the verb is spelled with a space: write back.

Sure, I will fix it.

> 
> > are disabled. Otherwise, descriptors might not be written back until
> > the hardware can flush a full cacheline of descriptors. This can cause
> > unnecessary delays when traffic is light (or even trigger Tx queue
> > timeout).
> 
> How can the problem be reproduced and the patch be verified?
> 
> 
> Kind regards,
> 
> Paul
> 
> 

Hi Paul,

To be honest, I have noticed the problem during the implementation of
AF_XDP feature for IDPF driver. In my scenario, I had 2 Tx queues:
 - regular LAN Tx queue
 - and XDP Tx queue
added to the same q_vector attached to the same NAPI, so those 2 Tx
queues were handled in the same NAPI poll loop.
Then, when I started a huge Tx zero-copy trafic using AF_XDP (on the XDP
queue), and, at the same time, tried to xmit a few packets using the second
(non-XDP) queue (e.g. with scapy), I was getting the Tx timeout on that regular
LAN Tx queue.
That is why I decided to upstream this fix. With disabled writebacks,
there is no chance to get the completion descriptor for the queue where
the traffic is much lighter.

I have never tried to reproduce the scenario described by Joshua
in his original patch ("unnecessary delays when traffic is light").

Thanks,
Michal

___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [net PATCH] i40e: fix use-after-free in i40e_aqc_add_filters()

2023-12-13 Thread Sokolowski, Jan
>Commit 3116f59c12bd ("i40e: fix use-after-free in
>i40e_sync_filters_subtask()") avoided use-after-free issues,
>by increasing refcount during update the VSI filter list to
>the HW. However, it missed the unicast situation.
>
>When deleting an unicast FDB entry, the i40e driver will release
>the mac_filter, and i40e_service_task will concurrently request
>firmware to add the mac_filter, which will lead to the following
>use-after-free issue.
>
>Fix again for both netdev->uc and netdev->mc.
>
>BUG: KASAN: use-after-free in i40e_aqc_add_filters+0x55c/0x5b0 [i40e]
>Read of size 2 at addr 888eb3452d60 by task kworker/8:7/6379
>
>CPU: 8 PID: 6379 Comm: kworker/8:7 Kdump: loaded Tainted: G
>Workqueue: i40e i40e_service_task [i40e]
>Call Trace:
> dump_stack+0x71/0xab
> print_address_description+0x6b/0x290
> kasan_report+0x14a/0x2b0
> i40e_aqc_add_filters+0x55c/0x5b0 [i40e]
> i40e_sync_vsi_filters+0x1676/0x39c0 [i40e]
> i40e_service_task+0x1397/0x2bb0 [i40e]
> process_one_work+0x56a/0x11f0
> worker_thread+0x8f/0xf40
> kthread+0x2a0/0x390
> ret_from_fork+0x1f/0x40
>
>Allocated by task 21948:
> kasan_kmalloc+0xa6/0xd0
> kmem_cache_alloc_trace+0xdb/0x1c0
> i40e_add_filter+0x11e/0x520 [i40e]
> i40e_addr_sync+0x37/0x60 [i40e]
> __hw_addr_sync_dev+0x1f5/0x2f0
> i40e_set_rx_mode+0x61/0x1e0 [i40e]
> dev_uc_add_excl+0x137/0x190
> i40e_ndo_fdb_add+0x161/0x260 [i40e]
> rtnl_fdb_add+0x567/0x950
> rtnetlink_rcv_msg+0x5db/0x880
> netlink_rcv_skb+0x254/0x380
> netlink_unicast+0x454/0x610
> netlink_sendmsg+0x747/0xb00
> sock_sendmsg+0xe2/0x120
> __sys_sendto+0x1ae/0x290
> __x64_sys_sendto+0xdd/0x1b0
> do_syscall_64+0xa0/0x370
> entry_SYSCALL_64_after_hwframe+0x65/0xca
>
>Freed by task 21948:
> __kasan_slab_free+0x137/0x190
> kfree+0x8b/0x1b0
> __i40e_del_filter+0x116/0x1e0 [i40e]
> i40e_del_mac_filter+0x16c/0x300 [i40e]
> i40e_addr_unsync+0x134/0x1b0 [i40e]
> __hw_addr_sync_dev+0xff/0x2f0
> i40e_set_rx_mode+0x61/0x1e0 [i40e]
> dev_uc_del+0x77/0x90
> rtnl_fdb_del+0x6a5/0x860
> rtnetlink_rcv_msg+0x5db/0x880
> netlink_rcv_skb+0x254/0x380
> netlink_unicast+0x454/0x610
> netlink_sendmsg+0x747/0xb00
> sock_sendmsg+0xe2/0x120
> __sys_sendto+0x1ae/0x290
> __x64_sys_sendto+0xdd/0x1b0
> do_syscall_64+0xa0/0x370
> entry_SYSCALL_64_after_hwframe+0x65/0xca
>
>Fixes: 3116f59c12bd ("i40e: fix use-after-free in i40e_sync_filters_subtask()")
>Fixes: 41c445ff0f48 ("i40e: main driver core")
>Signed-off-by: Ke Xiao 
>Signed-off-by: Ding Hui 
>Cc: Di Zhu 
>---
> drivers/net/ethernet/intel/i40e/i40e_main.c | 8 +++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
>b/drivers/net/ethernet/intel/i40e/i40e_main.c
>index 1ab8dbe2d880..16b574d69843 100644
>--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
>+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
>@@ -108,11 +108,17 @@ static void netdev_hw_addr_refcnt(struct i40e_mac_filter 
>*f,
> struct net_device *netdev, int delta)
> {
>   struct netdev_hw_addr *ha;
>+  struct netdev_hw_addr_list *ha_list;
> 
>   if (!f || !netdev)
>   return;
> 
>-  netdev_for_each_mc_addr(ha, netdev) {
>+  if (is_unicast_ether_addr(f->macaddr) || 
>is_link_local_ether_addr(f->macaddr))
>+  ha_list = &netdev->uc;
>+  else
>+  ha_list = &netdev->mc;
>+
>+  netdev_hw_addr_list_for_each(ha, ha_list) {
>   if (ether_addr_equal(ha->addr, f->macaddr)) {
>   ha->refcount += delta;
>   if (ha->refcount <= 0)
>-- 
>2.17.1


Reviewed-by: Jan Sokolowski 
___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH iwl-net] idpf: enable WB_ON_ITR

2023-12-13 Thread Michal Kubiak
On Wed, Dec 13, 2023 at 02:23:19PM +0100, Michal Kubiak wrote:
> On Tue, Dec 12, 2023 at 05:50:55PM +0100, Paul Menzel wrote:
> > Dear Michal, dear Joshua,
> > 
> > 
> > Thank you for your patch.
> > 
> > On 12/12/23 15:55, Michal Kubiak wrote:
> > > From: Joshua Hay 
> > > 
> > > Tell hardware to writeback completed descriptors even when interrupts
> > 
> > Should you resend, the verb is spelled with a space: write back.
> 
> Sure, I will fix it.

Hi Tony,

Could you please add a space ("writeback" -> "write back") when taking the patch
to your tree?

Thanks,
Michal

___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH iwl-next v2 14/15] ice: cleanup inconsistent code

2023-12-13 Thread Kalesh Anakkur Purayil
On Wed, Dec 6, 2023 at 6:32 AM Jesse Brandeburg 
wrote:

> It was found while doing further testing of the previous commit
> fbf32a9bab91 ("ice: field get conversion") that one of the FIELD_GET
> conversions should really be a FIELD_PREP. The previous code was styled
> as a match to the FIELD_GET conversion, which always worked because the
> shift value was 0.  The code makes way more sense as a FIELD_PREP and
> was in fact the only FIELD_GET with two constant arguments in this
> series.
>
> Didn't squash this patch to make it easier to call out the
> (non-impactful) bug.
>
> Signed-off-by: Jesse Brandeburg 
> ---
>  drivers/net/ethernet/intel/ice/ice_dcb.c | 2 +-
>  drivers/net/ethernet/intel/ice/ice_lib.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_dcb.c
> b/drivers/net/ethernet/intel/ice/ice_dcb.c
> index 7f3e00c187b4..74418c445cc4 100644
> --- a/drivers/net/ethernet/intel/ice/ice_dcb.c
> +++ b/drivers/net/ethernet/intel/ice/ice_dcb.c
> @@ -967,7 +967,7 @@ void ice_get_dcb_cfg_from_mib_change(struct
> ice_port_info *pi,
>
> mib = (struct ice_aqc_lldp_get_mib *)&event->desc.params.raw;
>
> -   change_type = FIELD_GET(ICE_AQ_LLDP_MIB_TYPE_M,  mib->type);
> +   change_type = FIELD_GET(ICE_AQ_LLDP_MIB_TYPE_M, mib->type);
>
[Kalesh]: I did not get what exactly changed here? Is that you just removed
one extra space before mib->type?

> if (change_type == ICE_AQ_LLDP_MIB_REMOTE)
> dcbx_cfg = &pi->qos_cfg.remote_dcbx_cfg;
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c
> b/drivers/net/ethernet/intel/ice/ice_lib.c
> index 8cdd53412748..d1c1e53fe15c 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_lib.c
> @@ -974,8 +974,8 @@ static void ice_set_dflt_vsi_ctx(struct ice_hw *hw,
> struct ice_vsi_ctx *ctxt)
> /* Traffic from VSI can be sent to LAN */
> ctxt->info.sw_flags2 = ICE_AQ_VSI_SW_FLAG_LAN_ENA;
> /* allow all untagged/tagged packets by default on Tx */
> -   ctxt->info.inner_vlan_flags =
> FIELD_GET(ICE_AQ_VSI_INNER_VLAN_TX_MODE_M,
> -
>  ICE_AQ_VSI_INNER_VLAN_TX_MODE_ALL);
> +   ctxt->info.inner_vlan_flags =
> FIELD_PREP(ICE_AQ_VSI_INNER_VLAN_TX_MODE_M,
> +
> ICE_AQ_VSI_INNER_VLAN_TX_MODE_ALL);
> /* SVM - by default bits 3 and 4 in inner_vlan_flags are 0's which
>  * results in legacy behavior (show VLAN, DEI, and UP) in
> descriptor.
>  *
> --
> 2.39.3
>
>
>

-- 
Regards,
Kalesh A P


smime.p7s
Description: S/MIME Cryptographic Signature
___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


[Intel-wired-lan] [net PATCH] i40e: fix use-after-free in i40e_aqc_add_filters()

2023-12-13 Thread Ke Xiao
Commit 3116f59c12bd ("i40e: fix use-after-free in
i40e_sync_filters_subtask()") avoided use-after-free issues,
by increasing refcount during update the VSI filter list to
the HW. However, it missed the unicast situation.

When deleting an unicast FDB entry, the i40e driver will release
the mac_filter, and i40e_service_task will concurrently request
firmware to add the mac_filter, which will lead to the following
use-after-free issue.

Fix again for both netdev->uc and netdev->mc.

BUG: KASAN: use-after-free in i40e_aqc_add_filters+0x55c/0x5b0 [i40e]
Read of size 2 at addr 888eb3452d60 by task kworker/8:7/6379

CPU: 8 PID: 6379 Comm: kworker/8:7 Kdump: loaded Tainted: G
Workqueue: i40e i40e_service_task [i40e]
Call Trace:
 dump_stack+0x71/0xab
 print_address_description+0x6b/0x290
 kasan_report+0x14a/0x2b0
 i40e_aqc_add_filters+0x55c/0x5b0 [i40e]
 i40e_sync_vsi_filters+0x1676/0x39c0 [i40e]
 i40e_service_task+0x1397/0x2bb0 [i40e]
 process_one_work+0x56a/0x11f0
 worker_thread+0x8f/0xf40
 kthread+0x2a0/0x390
 ret_from_fork+0x1f/0x40

Allocated by task 21948:
 kasan_kmalloc+0xa6/0xd0
 kmem_cache_alloc_trace+0xdb/0x1c0
 i40e_add_filter+0x11e/0x520 [i40e]
 i40e_addr_sync+0x37/0x60 [i40e]
 __hw_addr_sync_dev+0x1f5/0x2f0
 i40e_set_rx_mode+0x61/0x1e0 [i40e]
 dev_uc_add_excl+0x137/0x190
 i40e_ndo_fdb_add+0x161/0x260 [i40e]
 rtnl_fdb_add+0x567/0x950
 rtnetlink_rcv_msg+0x5db/0x880
 netlink_rcv_skb+0x254/0x380
 netlink_unicast+0x454/0x610
 netlink_sendmsg+0x747/0xb00
 sock_sendmsg+0xe2/0x120
 __sys_sendto+0x1ae/0x290
 __x64_sys_sendto+0xdd/0x1b0
 do_syscall_64+0xa0/0x370
 entry_SYSCALL_64_after_hwframe+0x65/0xca

Freed by task 21948:
 __kasan_slab_free+0x137/0x190
 kfree+0x8b/0x1b0
 __i40e_del_filter+0x116/0x1e0 [i40e]
 i40e_del_mac_filter+0x16c/0x300 [i40e]
 i40e_addr_unsync+0x134/0x1b0 [i40e]
 __hw_addr_sync_dev+0xff/0x2f0
 i40e_set_rx_mode+0x61/0x1e0 [i40e]
 dev_uc_del+0x77/0x90
 rtnl_fdb_del+0x6a5/0x860
 rtnetlink_rcv_msg+0x5db/0x880
 netlink_rcv_skb+0x254/0x380
 netlink_unicast+0x454/0x610
 netlink_sendmsg+0x747/0xb00
 sock_sendmsg+0xe2/0x120
 __sys_sendto+0x1ae/0x290
 __x64_sys_sendto+0xdd/0x1b0
 do_syscall_64+0xa0/0x370
 entry_SYSCALL_64_after_hwframe+0x65/0xca

Fixes: 3116f59c12bd ("i40e: fix use-after-free in i40e_sync_filters_subtask()")
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Ke Xiao 
Signed-off-by: Ding Hui 
Cc: Di Zhu 
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 1ab8dbe2d880..16b574d69843 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -108,11 +108,17 @@ static void netdev_hw_addr_refcnt(struct i40e_mac_filter 
*f,
  struct net_device *netdev, int delta)
 {
struct netdev_hw_addr *ha;
+   struct netdev_hw_addr_list *ha_list;
 
if (!f || !netdev)
return;
 
-   netdev_for_each_mc_addr(ha, netdev) {
+   if (is_unicast_ether_addr(f->macaddr) || 
is_link_local_ether_addr(f->macaddr))
+   ha_list = &netdev->uc;
+   else
+   ha_list = &netdev->mc;
+
+   netdev_hw_addr_list_for_each(ha, ha_list) {
if (ether_addr_equal(ha->addr, f->macaddr)) {
ha->refcount += delta;
if (ha->refcount <= 0)
-- 
2.17.1

___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH v2 iwl-next] ice: Fix some null pointer dereference issues in ice_ptp.c

2023-12-13 Thread Simon Horman
On Wed, Dec 13, 2023 at 10:49:10AM +0100, Przemek Kitszel wrote:
> On 12/12/23 03:40, Kunwu Chan wrote:
> > devm_kasprintf() returns a pointer to dynamically allocated memory
> > which can be NULL upon failure.
> > 
> > Fixes: d938a8cca88a ("ice: Auxbus devices & driver for E822 TS")
> > Cc: Kunwu Chan 
> > Suggested-by: Przemek Kitszel 
> 
> You found the bug (or some some static analysis tool in that case);
> there is no need to add Suggested-by for every person that suggests
> something during review - the tag is for "person/s that suggested
> making such change in the repo".
> 
> Subject line would be better if less generic, eg:
> ice: avoid null deref of ptp auxbus name
> 
> > Signed-off-by: Kunwu Chan 
> > ---
> >   drivers/net/ethernet/intel/ice/ice_ptp.c | 4 
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c 
> > b/drivers/net/ethernet/intel/ice/ice_ptp.c
> > index e9e59f4b5580..848e3e063e64 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
> > @@ -2743,6 +2743,8 @@ static int ice_ptp_register_auxbus_driver(struct 
> > ice_pf *pf)
> > name = devm_kasprintf(dev, GFP_KERNEL, "ptp_aux_dev_%u_%u_clk%u",
> >   pf->pdev->bus->number, PCI_SLOT(pf->pdev->devfn),
> >   ice_get_ptp_src_clock_index(&pf->hw));
> > +   if (!name)
> > +   return -ENOMEM;
> > aux_driver->name = name;
> > aux_driver->shutdown = ice_ptp_auxbus_shutdown;
> > @@ -2989,6 +2991,8 @@ static int ice_ptp_create_auxbus_device(struct ice_pf 
> > *pf)
> > name = devm_kasprintf(dev, GFP_KERNEL, "ptp_aux_dev_%u_%u_clk%u",
> >   pf->pdev->bus->number, PCI_SLOT(pf->pdev->devfn),
> >   ice_get_ptp_src_clock_index(&pf->hw));
> > +   if (!name)
> > +   return -ENOMEM;
> > aux_dev->name = name;
> > aux_dev->id = id;
> 
> Reviewed-by: Przemek Kitszel 
> 
> Regarding iwl-next vs iwl-net: this bug is really unlikely to manifest,
> as we take care of both earlier and future mem allocs for ptp auxbus,
> and auxiliary_device_init() checks for null name, so no big deal,
> so: -next is fine

Thanks. FWIIW, this looks good to me too.

Reviewed-by: Simon Horman 

___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH iwl-next v2 14/15] ice: cleanup inconsistent code

2023-12-13 Thread Jesse Brandeburg
Please don't use HTML email, your reply was likely dropped by most lists
that filter HTML.

On 12/12/2023 8:06 PM, Kalesh Anakkur Purayil wrote:
> -       change_type = FIELD_GET(ICE_AQ_LLDP_MIB_TYPE_M,  mib->type);
> +       change_type = FIELD_GET(ICE_AQ_LLDP_MIB_TYPE_M, mib->type);
> 
> [Kalesh]: I did not get what exactly changed here? Is that you just
> removed one extra space before mib->type?

Yes, there was a whitespace change missed in the GET series. I had
caught it only here. If you feel I need to I can resend to add a comment
to the commit message that this was added here.


___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


[Intel-wired-lan] [tnguy-net-queue:40GbE] BUILD SUCCESS 7ae42ef308ed0f6250b36f43e4eeb182ebbe6215

2023-12-13 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue.git 40GbE
branch HEAD: 7ae42ef308ed0f6250b36f43e4eeb182ebbe6215  iavf: Fix iavf_shutdown 
to call iavf_remove instead iavf_close

elapsed time: 1452m

configs tested: 197
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

tested configs:
alpha allnoconfig   gcc  
alphaallyesconfig   gcc  
alpha   defconfig   gcc  
arc  allmodconfig   gcc  
arc   allnoconfig   gcc  
arc  allyesconfig   gcc  
arc defconfig   gcc  
arc nsimosci_hs_defconfig   gcc  
arc   randconfig-001-20231213   gcc  
arc   randconfig-002-20231213   gcc  
arm  allmodconfig   gcc  
arm   allnoconfig   gcc  
arm  allyesconfig   gcc  
arm bcm2835_defconfig   clang
arm defconfig   clang
arm   randconfig-001-20231213   clang
arm   randconfig-002-20231213   clang
arm   randconfig-003-20231213   clang
arm   randconfig-004-20231213   clang
armshmobile_defconfig   gcc  
arm64 allnoconfig   gcc  
arm64   defconfig   gcc  
arm64 randconfig-001-20231213   clang
arm64 randconfig-002-20231213   clang
arm64 randconfig-003-20231213   clang
arm64 randconfig-004-20231213   clang
csky  allnoconfig   gcc  
cskydefconfig   gcc  
csky  randconfig-001-20231213   gcc  
csky  randconfig-002-20231213   gcc  
hexagon   allnoconfig   clang
hexagon defconfig   clang
hexagon   randconfig-001-20231213   clang
hexagon   randconfig-002-20231213   clang
i386 allmodconfig   clang
i386  allnoconfig   clang
i386 allyesconfig   clang
i386 buildonly-randconfig-001-20231213   clang
i386 buildonly-randconfig-002-20231213   clang
i386 buildonly-randconfig-003-20231213   clang
i386 buildonly-randconfig-004-20231213   clang
i386 buildonly-randconfig-005-20231213   clang
i386 buildonly-randconfig-006-20231213   clang
i386defconfig   gcc  
i386  randconfig-001-20231213   clang
i386  randconfig-002-20231213   clang
i386  randconfig-003-20231213   clang
i386  randconfig-004-20231213   clang
i386  randconfig-005-20231213   clang
i386  randconfig-006-20231213   clang
i386  randconfig-011-20231213   gcc  
i386  randconfig-011-20231214   clang
i386  randconfig-012-20231213   gcc  
i386  randconfig-012-20231214   clang
i386  randconfig-013-20231213   gcc  
i386  randconfig-013-20231214   clang
i386  randconfig-014-20231213   gcc  
i386  randconfig-014-20231214   clang
i386  randconfig-015-20231213   gcc  
i386  randconfig-015-20231214   clang
i386  randconfig-016-20231213   gcc  
i386  randconfig-016-20231214   clang
loongarchallmodconfig   gcc  
loongarch allnoconfig   gcc  
loongarchallyesconfig   gcc  
loongarch   defconfig   gcc  
loongarch randconfig-001-20231213   gcc  
loongarch randconfig-002-20231213   gcc  
m68k allmodconfig   gcc  
m68k  allnoconfig   gcc  
m68k allyesconfig   gcc  
m68kdefconfig   gcc  
m68k   m5249evb_defconfig   gcc  
m68kmac_defconfig   gcc  
m68k  multi_defconfig   gcc  
m68kstmark2_defconfig   gcc  
microblaze   allmodconfig   gcc  
microblazeallnoconfig   gcc  
microblaze   allyesconfig   gcc  
microblaze  defconfig   gcc  
mips allmodconfig   gcc  
mips  allnoconfig   clang
mips allyesconfig   gcc  
mips cu1000-neo_defconfig   clang
mips cu1830-neo_defconfig   clang
mips

Re: [Intel-wired-lan] [PATCH iwl-net v4] i40e: Restore VF MSI-X state during PCI reset

2023-12-13 Thread Jacob Keller



On 12/12/2023 4:24 AM, Andrii Staikov wrote:
> During a PCI FLR the MSI-X Enable flag in the VF PCI MSI-X capability
> register will be cleared. This can lead to issues when a VF is
> assigned to a VM because in these cases the VF driver receives no
> indication of the PF PCI error/reset and additionally it is incapable
> of restoring the cleared flag in the hypervisor configuration space
> without fully reinitializing the driver interrupt functionality.
> 
> Since the VF driver is unable to easily resolve this condition on its own,
> restore the VF MSI-X flag during the PF PCI reset handling.
> 
> Fixes: 19b7960b2da1 ("i40e: implement split PCI error reset handler")
> Co-developed-by: Karen Ostrowska 
> Signed-off-by: Karen Ostrowska 
> Co-developed-by: Mateusz Palczewski 
> Signed-off-by: Mateusz Palczewski 
> Reviewed-by: Drewek Wojciech 
> Reviewed-by: Kitszel Przemyslaw 
> Signed-off-by: Andrii Staikov 

The ice driver recently started caching the PCI device structure
pointers in their VF structure instead of having to do this sort of
lookup on the fly.

See 31642d2854e2 ("ice: store VF's pci_dev ptr in ice_vf") [1][2]

[1]:
https://lore.kernel.org/intel-wired-lan/20230912115626.105828-1-mateusz.polchlo...@intel.com/
[2]:
https://lore.kernel.org/netdev/20231019173227.3175575-4-jacob.e.kel...@intel.com/

Can we do something similar for i40e?

> ---
> v1 -> v2: Fix signed-off tags
> https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20231204131041.3369693-1-andrii.stai...@intel.com/
> 
> v2 -> v3: use @vf_dev in pci_get_device() instead of NULL and remove 
> unnecessary call
> https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20231206125127.218350-1-andrii.stai...@intel.com/
> 
> v3 -> v4: wrap the added functionality into the CONFIG_PCI_IOV define as
> this is VF-related functionality
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c   |  3 +++
>  .../ethernet/intel/i40e/i40e_virtchnl_pf.c| 26 +++
>  .../ethernet/intel/i40e/i40e_virtchnl_pf.h|  3 +++
>  3 files changed, 32 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
> b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index 7bb1f64833eb..bbe2d115fb15 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -16513,6 +16513,9 @@ static void i40e_pci_error_reset_done(struct pci_dev 
> *pdev)
>   return;
>  
>   i40e_reset_and_rebuild(pf, false, false);
> +#ifdef CONFIG_PCI_IOV
> + i40e_restore_all_vfs_msi_state(pdev);
> +#endif /* CONFIG_PCI_IOV */
>  }
>  
>  /**
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c 
> b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> index 3f99eb198245..d60f5419d6bd 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> @@ -154,6 +154,32 @@ void i40e_vc_notify_reset(struct i40e_pf *pf)
>(u8 *)&pfe, sizeof(struct virtchnl_pf_event));
>  }
>  
> +#ifdef CONFIG_PCI_IOV

Also noticed that i40e_virtchnl_pf.c is compiled always instead of only
conditionally when CONFIG_PCI_IOV is enabled. That's not really the
fault of this change, but I think it would be good cleanup to avoid
needing to compile any of this code if CONFIG_PCI_IOV is disabled.

> +void i40e_restore_all_vfs_msi_state(struct pci_dev *pdev)
> +{
> + u16 vf_id;
> + u16 pos;
> +
> + /* Continue only if this is a PF */
> + if (!pdev->is_physfn)
> + return;
> +

The function could also just pass the pf instead of pdev, and we'd know
its the physical function.

> + if (!pci_num_vf(pdev))
> + return;

If we switch to saving pdevs in our VF structure, this could be a simple
iteration loop that does nothing if the number of VFs is 0.

> +
> + pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
> + if (pos) {
> + struct pci_dev *vf_dev = NULL;
> +
> + pci_read_config_word(pdev, pos + PCI_SRIOV_VF_DID, &vf_id);
> + while ((vf_dev = pci_get_device(pdev->vendor, vf_id, vf_dev))) {
> + if (vf_dev->is_virtfn && vf_dev->physfn == pdev)
> + pci_restore_msi_state(vf_dev);
> + }
> + }
> +}
> +#endif /* CONFIG_PCI_IOV */
> +

Thanks,
Jake
___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH iwl-net] idpf: enable WB_ON_ITR

2023-12-13 Thread Nguyen, Anthony L



> -Original Message-
> From: Kubiak, Michal 
> Sent: Wednesday, December 13, 2023 5:51 AM
> 
> On Wed, Dec 13, 2023 at 02:23:19PM +0100, Michal Kubiak wrote:
> > On Tue, Dec 12, 2023 at 05:50:55PM +0100, Paul Menzel wrote:
> > > Dear Michal, dear Joshua,
> > >
> > >
> > > Thank you for your patch.
> > >
> > > On 12/12/23 15:55, Michal Kubiak wrote:
> > > > From: Joshua Hay 
> > > >
> > > > Tell hardware to writeback completed descriptors even when
> > > > interrupts
> > >
> > > Should you resend, the verb is spelled with a space: write back.
> >
> > Sure, I will fix it.
> 
> Hi Tony,
> 
> Could you please add a space ("writeback" -> "write back") when taking the
> patch to your tree?

Yep, I can do that.

Thanks,
Tony
___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


[Intel-wired-lan] [tnguy-next-queue:1GbE] BUILD SUCCESS bf88f7d920da2bbabef16404a449a4f72cc0ffcd

2023-12-13 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue.git 1GbE
branch HEAD: bf88f7d920da2bbabef16404a449a4f72cc0ffcd  e1000e: Use 
pcie_capability_read_word() for reading LNKSTA

elapsed time: 1481m

configs tested: 155
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

tested configs:
alpha allnoconfig   gcc  
alphaallyesconfig   gcc  
alpha   defconfig   gcc  
arc  allmodconfig   gcc  
arc   allnoconfig   gcc  
arc  allyesconfig   gcc  
arc defconfig   gcc  
arc   randconfig-001-20231213   gcc  
arc   randconfig-002-20231213   gcc  
arm  allmodconfig   gcc  
arm   allnoconfig   gcc  
arm  allyesconfig   gcc  
arm axm55xx_defconfig   gcc  
arm defconfig   clang
armhisi_defconfig   gcc  
arm lpc18xx_defconfig   gcc  
arm64allmodconfig   clang
arm64 allnoconfig   gcc  
arm64   defconfig   gcc  
csky allmodconfig   gcc  
csky  allnoconfig   gcc  
csky allyesconfig   gcc  
cskydefconfig   gcc  
csky  randconfig-001-20231213   gcc  
csky  randconfig-002-20231213   gcc  
hexagon  allmodconfig   clang
hexagon   allnoconfig   clang
hexagon  allyesconfig   clang
hexagon defconfig   clang
i386 allmodconfig   clang
i386  allnoconfig   clang
i386 allyesconfig   clang
i386defconfig   gcc  
i386  randconfig-011-20231213   gcc  
i386  randconfig-012-20231213   gcc  
i386  randconfig-013-20231213   gcc  
i386  randconfig-014-20231213   gcc  
i386  randconfig-015-20231213   gcc  
i386  randconfig-016-20231213   gcc  
loongarchallmodconfig   gcc  
loongarch allnoconfig   gcc  
loongarchallyesconfig   gcc  
loongarch   defconfig   gcc  
loongarch randconfig-001-20231213   gcc  
loongarch randconfig-002-20231213   gcc  
m68k allmodconfig   gcc  
m68k  allnoconfig   gcc  
m68k allyesconfig   gcc  
m68kdefconfig   gcc  
m68k   m5249evb_defconfig   gcc  
m68k  multi_defconfig   gcc  
microblaze   allmodconfig   gcc  
microblazeallnoconfig   gcc  
microblaze   allyesconfig   gcc  
microblaze  defconfig   gcc  
mips allmodconfig   gcc  
mips  allnoconfig   clang
mips allyesconfig   gcc  
mips decstation_r4k_defconfig   gcc  
nios2allmodconfig   gcc  
nios2 allnoconfig   gcc  
nios2allyesconfig   gcc  
nios2   defconfig   gcc  
nios2 randconfig-001-20231213   gcc  
nios2 randconfig-002-20231213   gcc  
openrisc allmodconfig   gcc  
openrisc  allnoconfig   gcc  
openrisc allyesconfig   gcc  
openriscdefconfig   gcc  
parisc   allmodconfig   gcc  
pariscallnoconfig   gcc  
parisc   allyesconfig   gcc  
parisc  defconfig   gcc  
pariscrandconfig-001-20231213   gcc  
pariscrandconfig-002-20231213   gcc  
parisc64defconfig   gcc  
powerpc  allmodconfig   clang
powerpc   allnoconfig   gcc  
powerpc  allyesconfig   clang
powerpc asp8347_defconfig   gcc  
powerpc  ppc6xx_defconfig   gcc  
powerpcsam440ep_defconfig   gcc  
powerpc tqm8541_defconfig   gcc  
powerpc  tqm8xx_defconfig   gcc  
riscvallmodconfig   gcc  
riscv

Re: [Intel-wired-lan] [PATCH iwl-net] idpf: enable WB_ON_ITR

2023-12-13 Thread Tony Nguyen




On 12/13/2023 2:22 PM, Nguyen, Anthony L wrote:




-Original Message-
From: Kubiak, Michal 
Sent: Wednesday, December 13, 2023 5:51 AM

On Wed, Dec 13, 2023 at 02:23:19PM +0100, Michal Kubiak wrote:

On Tue, Dec 12, 2023 at 05:50:55PM +0100, Paul Menzel wrote:

Dear Michal, dear Joshua,


Thank you for your patch.

On 12/12/23 15:55, Michal Kubiak wrote:

From: Joshua Hay 

Tell hardware to writeback completed descriptors even when
interrupts


Should you resend, the verb is spelled with a space: write back.


Sure, I will fix it.


Hi Tony,

Could you please add a space ("writeback" -> "write back") when taking the
patch to your tree?


Yep, I can do that.


Actually, looks like you missed updating kdocs

drivers/net/ethernet/intel/idpf/idpf_txrx.h:508: warning: Function 
parameter or member 'dyn_ctl_intena_msk_m' not described in 'idpf_intr_reg'
drivers/net/ethernet/intel/idpf/idpf_txrx.h:508: warning: Function 
parameter or member 'dyn_ctl_wb_on_itr_m' not described in 'idpf_intr_reg'
drivers/net/ethernet/intel/idpf/idpf_txrx.h:561: warning: Function 
parameter or member 'wb_on_itr' not described in 'idpf_q_vector'



Thanks,
Tony


___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH net-next 0/2] idpf: add get/set for Ethtool's header split ringparam

2023-12-13 Thread patchwork-bot+netdevbpf
Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski :

On Tue, 12 Dec 2023 15:27:50 +0100 you wrote:
> Currently, the header split feature (putting headers in one smaller
> buffer and then the data in a separate bigger one) is always enabled
> in idpf when supported.
> One may want to not have fragmented frames per each packet, for example,
> to avoid XDP frags. To better optimize setups for particular workloads,
> add ability to switch the header split state on and off via Ethtool's
> ringparams, as well as to query the current status.
> There's currently only GET in the Ethtool Netlink interface for now,
> so add SET first. I suspect idpf is not the only one supporting this.
> 
> [...]

Here is the summary with links:
  - [net-next,1/2] ethtool: add SET for TCP_DATA_SPLIT ringparam
https://git.kernel.org/netdev/net-next/c/50d73710715d
  - [net-next,2/2] idpf: add get/set for Ethtool's header split ringparam
https://git.kernel.org/netdev/net-next/c/9b1aa3ef2328

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH] igb: Add null pointer check to igb_set_fw_version

2023-12-13 Thread Kunwu Chan

Thanks for your reply.

I'll try to use snprintf in v2 patch.


On 2023/12/13 05:26, Jakub Kicinski wrote:

On Mon, 11 Dec 2023 11:13:36 +0800 Kunwu Chan wrote:

kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure.

Fixes: 1978d3ead82c ("intel: fix string truncation warnings")
Cc: Kunwu Chan 
Signed-off-by: Kunwu Chan 


The allocation is rather pointless here.
Can you convert this code to use snprintf() instead?

___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH net-next v9 0/8] Support symmetric-xor RSS hash

2023-12-13 Thread patchwork-bot+netdevbpf
Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski :

On Tue, 12 Dec 2023 17:33:13 -0700 you wrote:
> Patches 1 and 2 modify the get/set_rxh ethtool API to take a pointer to
> struct of parameters instead of individual params. This will allow future
> changes to the uAPI-shared struct ethtool_rxfh without changing the
> drivers' API.
> 
> Patch 3 adds the support at the Kernel level, allowing the user to set a
> symmetric-xor RSS hash for a netdevice via:
> 
> [...]

Here is the summary with links:
  - [net-next,v9,1/8] net: ethtool: pass a pointer to parameters to 
get/set_rxfh ethtool ops
https://git.kernel.org/netdev/net-next/c/fb6e30a72539
  - [net-next,v9,2/8] net: ethtool: get rid of get/set_rxfh_context functions
https://git.kernel.org/netdev/net-next/c/dcd8dbf9e734
  - [net-next,v9,3/8] net: ethtool: add support for symmetric-xor RSS hash
https://git.kernel.org/netdev/net-next/c/13e59344fb9d
  - [net-next,v9,4/8] ice: fix ICE_AQ_VSI_Q_OPT_RSS_* register values
https://git.kernel.org/netdev/net-next/c/20f73b60bb5c
  - [net-next,v9,5/8] ice: refactor RSS configuration
https://git.kernel.org/netdev/net-next/c/dc6e44c9d6d6
  - [net-next,v9,6/8] ice: refactor the FD and RSS flow ID generation
https://git.kernel.org/netdev/net-next/c/b1f5921a99ac
  - [net-next,v9,7/8] ice: enable symmetric-xor RSS for Toeplitz hash function
https://git.kernel.org/netdev/net-next/c/352e9bf23813
  - [net-next,v9,8/8] iavf: enable symmetric-xor RSS for Toeplitz hash function
https://git.kernel.org/netdev/net-next/c/4a3de3fb0eb6

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH] iavf: Fix null pointer dereference in iavf_print_link_message

2023-12-13 Thread Kunwu Chan

Thanks for your reply.
Sure, the only thing 'iavf_print_link_message' do is to print a msg by 
netdev_info.


The 'iavf_virtchnl_completion' assume that no errors will be returned.
Whether we could just execute 'netdev_info(netdev, "NIC Link is Up Speed 
is %s Full Duplex\n", speed? speed :"");' when 'speed' is null.



Before commit '1978d3ead82c8', the buffer size is '#define 
IAVF_MAX_SPEED_STRLEN  13', whether we could use a bigger buffer

size to avoid a null pointer.

Such as '#define IAVF_MAX_SPEED_STRLEN 48'.


On 2023/12/13 05:28, Jakub Kicinski wrote:

On Mon, 11 Dec 2023 10:59:27 +0800 Kunwu Chan wrote:

kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure.

Fixes: 1978d3ead82c ("intel: fix string truncation warnings")


No need for the allocation here, print to a buffer on the stack.

___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: [Intel-wired-lan] [PATCH iwl-next v2 14/15] ice: cleanup inconsistent code

2023-12-13 Thread Przemek Kitszel

On 12/13/23 19:27, Jesse Brandeburg wrote:

Please don't use HTML email, your reply was likely dropped by most lists
that filter HTML.

On 12/12/2023 8:06 PM, Kalesh Anakkur Purayil wrote:

 -       change_type = FIELD_GET(ICE_AQ_LLDP_MIB_TYPE_M,  mib->type);
 +       change_type = FIELD_GET(ICE_AQ_LLDP_MIB_TYPE_M, mib->type);

[Kalesh]: I did not get what exactly changed here? Is that you just
removed one extra space before mib->type?


Yes, there was a whitespace change missed in the GET series. I had
caught it only here. If you feel I need to I can resend to add a comment
to the commit message that this was added here.




I guess that NOT sending next revision is our only chance to fix this
particular white space error ;)
___
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan