[Intel-wired-lan] [PATCH iwl-next v1 1/3] ice: Add sync delay for E825C
From: Karol Kolacinski Implement setting GLTSYN_SYNC_DLAY for E825C products. This is the execution delay compensation of SYNC command between PHC and PHY. Also, refactor the code by changing ice_ptp_init_phc_eth56g function name to ice_ptp_init_phc_e825, to be consistent with the naming pattern for other devices. Reviewed-by: Przemek Kitszel Signed-off-by: Karol Kolacinski Signed-off-by: Grzegorz Nitka --- drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 14 +++--- drivers/net/ethernet/intel/ice/ice_ptp_hw.h | 3 +++ 2 files changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c index 3e824f7b30c0..c3dea6d61de4 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c @@ -2650,18 +2650,18 @@ static void ice_sb_access_ena_eth56g(struct ice_hw *hw, bool enable) } /** - * ice_ptp_init_phc_eth56g - Perform E82X specific PHC initialization + * ice_ptp_init_phc_e825 - Perform E825 specific PHC initialization * @hw: pointer to HW struct * - * Perform PHC initialization steps specific to E82X devices. + * Perform E825-specific PTP hardware clock initialization steps. * - * Return: - * * %0 - success - * * %other - failed to initialize CGU + * Return: 0 on success, negative error code otherwise. */ -static int ice_ptp_init_phc_eth56g(struct ice_hw *hw) +static int ice_ptp_init_phc_e825(struct ice_hw *hw) { ice_sb_access_ena_eth56g(hw, true); + ice_ptp_cfg_sync_delay(hw, ICE_E825_SYNC_DELAY); + /* Initialize the Clock Generation Unit */ return ice_init_cgu_e82x(hw); } @@ -6123,7 +6123,7 @@ int ice_ptp_init_phc(struct ice_hw *hw) case ICE_MAC_GENERIC: return ice_ptp_init_phc_e82x(hw); case ICE_MAC_GENERIC_3K_E825: - return ice_ptp_init_phc_eth56g(hw); + return ice_ptp_init_phc_e825(hw); default: return -EOPNOTSUPP; } diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h index 8442d1d60351..10295fa9a383 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h @@ -324,7 +324,10 @@ extern const struct ice_vernier_info_e82x e822_vernier[NUM_ICE_PTP_LNK_SPD]; */ #define ICE_E810_PLL_FREQ 81250 #define ICE_PTP_NOMINAL_INCVAL_E8100x13b13b13bULL + +/* PHC to PHY synchronization delay definitions */ #define ICE_E810_E830_SYNC_DELAY 0 +#define ICE_E825_SYNC_DELAY6 /* Device agnostic functions */ u8 ice_get_ptp_src_clock_index(struct ice_hw *hw); -- 2.39.3
[Intel-wired-lan] [PATCH iwl-next v1 2/3] ice: Refactor E825C PHY registers info struct
From: Karol Kolacinski Simplify ice_phy_reg_info_eth56g struct definition to include base address for the very first quad. Use base address info and 'step' value to determine address for specific PHY quad. Reviewed-by: Przemek Kitszel Signed-off-by: Karol Kolacinski Signed-off-by: Grzegorz Nitka --- .../net/ethernet/intel/ice/ice_ptp_consts.h | 75 --- drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 6 +- drivers/net/ethernet/intel/ice/ice_ptp_hw.h | 2 +- 3 files changed, 19 insertions(+), 64 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_consts.h b/drivers/net/ethernet/intel/ice/ice_ptp_consts.h index ac46d1183300..003cdfada3ca 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_consts.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp_consts.h @@ -10,70 +10,25 @@ /* Constants defined for the PTP 1588 clock hardware. */ const struct ice_phy_reg_info_eth56g eth56g_phy_res[NUM_ETH56G_PHY_RES] = { - /* ETH56G_PHY_REG_PTP */ - { - /* base_addr */ - { - 0x092000, - 0x126000, - 0x1BA000, - 0x24E000, - 0x2E2000, - }, - /* step */ - 0x98, + [ETH56G_PHY_REG_PTP] = { + .base_addr = 0x092000, + .step = 0x98, }, - /* ETH56G_PHY_MEM_PTP */ - { - /* base_addr */ - { - 0x093000, - 0x127000, - 0x1BB000, - 0x24F000, - 0x2E3000, - }, - /* step */ - 0x200, + [ETH56G_PHY_MEM_PTP] = { + .base_addr = 0x093000, + .step = 0x200, }, - /* ETH56G_PHY_REG_XPCS */ - { - /* base_addr */ - { - 0x00, - 0x009400, - 0x128000, - 0x1BC000, - 0x25, - }, - /* step */ - 0x21000, + [ETH56G_PHY_REG_XPCS] = { + .base_addr = 0x00, + .step = 0x21000, }, - /* ETH56G_PHY_REG_MAC */ - { - /* base_addr */ - { - 0x085000, - 0x119000, - 0x1AD000, - 0x241000, - 0x2D5000, - }, - /* step */ - 0x1000, + [ETH56G_PHY_REG_MAC] = { + .base_addr = 0x085000, + .step = 0x1000, }, - /* ETH56G_PHY_REG_GPCS */ - { - /* base_addr */ - { - 0x084000, - 0x118000, - 0x1AC000, - 0x24, - 0x2D4000, - }, - /* step */ - 0x400, + [ETH56G_PHY_REG_GPCS] = { + .base_addr = 0x084000, + .step = 0x400, }, }; diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c index c3dea6d61de4..5c61bc3a2c25 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c @@ -1010,7 +1010,7 @@ static int ice_phy_res_address_eth56g(struct ice_hw *hw, u8 lane, /* Lanes 4..7 are in fact 0..3 on a second PHY */ lane %= hw->ptp.ports_per_phy; - *addr = eth56g_phy_res[res_type].base[0] + + *addr = eth56g_phy_res[res_type].base_addr + lane * eth56g_phy_res[res_type].step + offset; return 0; @@ -1240,7 +1240,7 @@ static int ice_write_quad_ptp_reg_eth56g(struct ice_hw *hw, u8 port, if (port >= hw->ptp.num_lports) return -EIO; - addr = eth56g_phy_res[ETH56G_PHY_REG_PTP].base[0] + offset; + addr = eth56g_phy_res[ETH56G_PHY_REG_PTP].base_addr + offset; return ice_write_phy_eth56g(hw, port, addr, val); } @@ -1265,7 +1265,7 @@ static int ice_read_quad_ptp_reg_eth56g(struct ice_hw *hw, u8 port, if (port >= hw->ptp.num_lports) return -EIO; - addr = eth56g_phy_res[ETH56G_PHY_REG_PTP].base[0] + offset; + addr = eth56g_phy_res[ETH56G_PHY_REG_PTP].base_addr + offset; return ice_read_phy_eth56g(hw, port, addr, val); } diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h index 10295fa9a383..63a63ef64aaa 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h @@ -72,7 +72,7 @@ enum ice_eth56g_link_spd { * ETH56G devices */ struct ice_phy_reg_info_eth56g { - u32 base[NUM_ETH56G_PHY_RES]; + u32 base_addr; u32 step; }; -- 2.39.3
[Intel-wired-lan] [PATCH iwl-next v1 3/3] ice: E825C PHY register cleanup
From: Karol Kolacinski Minor PTP register refactor, including logical grouping E825C 1-step timestamping registers. Remove unused register definitions (PHY_REG_GPCS_BITSLIP, PHY_REG_REVISION). Also, apply preferred GENMASK macro (instead of ICE_M) for register fields definition affected by this patch. Reviewed-by: Przemek Kitszel Signed-off-by: Karol Kolacinski Signed-off-by: Grzegorz Nitka --- drivers/net/ethernet/intel/ice/ice_ptp_hw.h | 31 ++--- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h index 63a63ef64aaa..6ca1561ec5e8 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h @@ -783,36 +783,19 @@ static inline bool ice_is_dual(struct ice_hw *hw) #define PHY_MAC_XIF_TS_SFD_ENA_M ICE_M(0x1, 20) #define PHY_MAC_XIF_GMII_TS_SEL_M ICE_M(0x1, 21) -/* GPCS config register */ -#define PHY_GPCS_CONFIG_REG0 0x268 -#define PHY_GPCS_CONFIG_REG0_TX_THR_M ICE_M(0xF, 24) -#define PHY_GPCS_BITSLIP 0x5C - #define PHY_TS_INT_CONFIG_THRESHOLD_M ICE_M(0x3F, 0) #define PHY_TS_INT_CONFIG_ENA_MBIT(6) -/* 1-step PTP config */ -#define PHY_PTP_1STEP_CONFIG 0x270 -#define PHY_PTP_1STEP_T1S_UP64_M ICE_M(0xF, 4) -#define PHY_PTP_1STEP_T1S_DELTA_M ICE_M(0xF, 8) -#define PHY_PTP_1STEP_PEER_DELAY(_port)(0x274 + 4 * (_port)) -#define PHY_PTP_1STEP_PD_ADD_PD_M ICE_M(0x1, 0) -#define PHY_PTP_1STEP_PD_DELAY_M ICE_M(0x3fff, 1) -#define PHY_PTP_1STEP_PD_DLY_V_M ICE_M(0x1, 31) - /* Macros to derive offsets for TimeStampLow and TimeStampHigh */ #define PHY_TSTAMP_L(x) (((x) * 8) + 0) #define PHY_TSTAMP_U(x) (((x) * 8) + 4) -#define PHY_REG_REVISION 0x85000 - #define PHY_REG_DESKEW_0 0x94 #define PHY_REG_DESKEW_0_RLEVELGENMASK(6, 0) #define PHY_REG_DESKEW_0_RLEVEL_FRAC GENMASK(9, 7) #define PHY_REG_DESKEW_0_RLEVEL_FRAC_W 3 #define PHY_REG_DESKEW_0_VALID GENMASK(10, 10) -#define PHY_REG_GPCS_BITSLIP 0x5C #define PHY_REG_SD_BIT_SLIP(_port_offset) (0x29C + 4 * (_port_offset)) #define PHY_REVISION_ETH56G0x10200 #define PHY_VENDOR_TXLANE_THRESH 0x2000C @@ -832,7 +815,21 @@ static inline bool ice_is_dual(struct ice_hw *hw) #define PHY_MAC_BLOCKTIME 0x50 #define PHY_MAC_MARKERTIME 0x54 #define PHY_MAC_TX_OFFSET 0x58 +#define PHY_GPCS_BITSLIP 0x5C #define PHY_PTP_INT_STATUS 0x7FD140 +/* ETH56G registers shared per quad */ +/* GPCS config register */ +#define PHY_GPCS_CONFIG_REG0 0x268 +#define PHY_GPCS_CONFIG_REG0_TX_THR_M GENMASK(27, 24) +/* 1-step PTP config */ +#define PHY_PTP_1STEP_CONFIG 0x270 +#define PHY_PTP_1STEP_T1S_UP64_M GENMASK(7, 4) +#define PHY_PTP_1STEP_T1S_DELTA_M GENMASK(11, 8) +#define PHY_PTP_1STEP_PEER_DELAY(_quad_lane) (0x274 + 4 * (_quad_lane)) +#define PHY_PTP_1STEP_PD_ADD_PD_M BIT(0) +#define PHY_PTP_1STEP_PD_DELAY_M GENMASK(30, 1) +#define PHY_PTP_1STEP_PD_DLY_V_M BIT(31) + #endif /* _ICE_PTP_HW_H_ */ -- 2.39.3
[Intel-wired-lan] [PATCH iwl-next v1 0/3] E825C PTP cleanup
This patch series simplifies PTP code related to E825C products by simplifying PHY register info definition. Cleanup the code by removing unused register definitions. Also, add sync delay compensation between PHC and PHY for E825C. Karol Kolacinski (3): ice: Add sync delay for E825C ice: Refactor E825C PHY registers info struct ice: E825C PHY register cleanup .../net/ethernet/intel/ice/ice_ptp_consts.h | 75 --- drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 20 ++--- drivers/net/ethernet/intel/ice/ice_ptp_hw.h | 36 - 3 files changed, 43 insertions(+), 88 deletions(-) base-commit: 70bdf16570c2c207a562e996833ff196a4bd7029 -- 2.39.3
Re: [Intel-wired-lan] [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested
On Thu, Feb 06, 2025 at 02:16:08PM +0530, Bharat Bhushan wrote: > Hi Leon, > > On Wed, Feb 5, 2025 at 11:50 PM Leon Romanovsky wrote: > > > > From: Leon Romanovsky > > > > XFRM offload path is probed even if offload isn't needed at all. Let's > > make sure that x->type_offload pointer stays NULL for such path to > > reduce ambiguity. > > > > Fixes: 9d389d7f84bb ("xfrm: Add a xfrm type offload.") > > Signed-off-by: Leon Romanovsky > > --- > > include/net/xfrm.h | 12 +++- > > net/xfrm/xfrm_device.c | 14 +- > > net/xfrm/xfrm_state.c | 22 +- > > net/xfrm/xfrm_user.c | 2 +- > > 4 files changed, 30 insertions(+), 20 deletions(-) <...> > > + x->type_offload = xfrm_get_type_offload(x->id.proto, > > x->props.family); > > + if (!x->type_offload) { <...> > > + xfrm_put_type_offload(x->type_offload); > > + x->type_offload = NULL; > > We always set type_offload to NULL. Can we move type_offload = NULL in > xfrm_put_type_offload() ? We can, but it will require change to xfrm_get_type_offload() too, otherwise we will get asymmetrical get/put. Do you want something like that? int xfrm_get_type_offload(struct xfrm_state *x); void xfrm_put_type_offload(struct xfrm_state *x); Thansk > > Thanks > -Bharat > > > /* User explicitly requested packet offload mode and > > configured > > * policy in addition to the XFRM state. So be civil to > > users, > > * and return an error instead of taking fallback path. > > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c > > index ad2202fa82f3..568fe8df7741 100644 > > --- a/net/xfrm/xfrm_state.c > > +++ b/net/xfrm/xfrm_state.c > > @@ -424,11 +424,12 @@ void xfrm_unregister_type_offload(const struct > > xfrm_type_offload *type, > > } > > EXPORT_SYMBOL(xfrm_unregister_type_offload); > > > > -static const struct xfrm_type_offload * > > -xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load) > > +const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto, > > + unsigned short family) > > { > > const struct xfrm_type_offload *type = NULL; > > struct xfrm_state_afinfo *afinfo; > > + bool try_load = true; > > > > retry: > > afinfo = xfrm_state_get_afinfo(family); > > @@ -456,11 +457,7 @@ xfrm_get_type_offload(u8 proto, unsigned short family, > > bool try_load) > > > > return type; > > } > > - > > -static void xfrm_put_type_offload(const struct xfrm_type_offload *type) > > -{ > > - module_put(type->owner); > > -} > > +EXPORT_SYMBOL(xfrm_get_type_offload); > > > > static const struct xfrm_mode xfrm4_mode_map[XFRM_MODE_MAX] = { > > [XFRM_MODE_BEET] = { > > @@ -609,8 +606,6 @@ static void ___xfrm_state_destroy(struct xfrm_state *x) > > kfree(x->coaddr); > > kfree(x->replay_esn); > > kfree(x->preplay_esn); > > - if (x->type_offload) > > - xfrm_put_type_offload(x->type_offload); > > if (x->type) { > > x->type->destructor(x); > > xfrm_put_type(x->type); > > @@ -784,6 +779,9 @@ void xfrm_dev_state_free(struct xfrm_state *x) > > struct xfrm_dev_offload *xso = &x->xso; > > struct net_device *dev = READ_ONCE(xso->dev); > > > > + xfrm_put_type_offload(x->type_offload); > > + x->type_offload = NULL; > > + > > if (dev && dev->xfrmdev_ops) { > > spin_lock_bh(&xfrm_state_dev_gc_lock); > > if (!hlist_unhashed(&x->dev_gclist)) > > @@ -3122,7 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu) > > } > > EXPORT_SYMBOL_GPL(xfrm_state_mtu); > > > > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload, > > +int __xfrm_init_state(struct xfrm_state *x, bool init_replay, > > struct netlink_ext_ack *extack) > > { > > const struct xfrm_mode *inner_mode; > > @@ -3178,8 +3176,6 @@ int __xfrm_init_state(struct xfrm_state *x, bool > > init_replay, bool offload, > > goto error; > > } > > > > - x->type_offload = xfrm_get_type_offload(x->id.proto, family, > > offload); > > - > > err = x->type->init_state(x, extack); > > if (err) > > goto error; > > @@ -3229,7 +3225,7 @@ int xfrm_init_state(struct xfrm_state *x) > > { > > int err; > > > > - err = __xfrm_init_state(x, true, false, NULL); > > + err = __xfrm_init_state(x, true, NULL); > > if (!err) > > x->km.state = XFRM_STATE_VALID; > > > > diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c > > index 08c6d6f0179f..82a768500999 100644 > > --- a/net/xfrm/xfrm_user.c > > +++ b/net/xfrm/xfrm_user.c > > @@ -907,7 +907,7 @@ static struct xfrm_state *xfrm_state_construct(struct > > net *net, > > goto error; > > } > > > > - err = __x
Re: [Intel-wired-lan] [PATCH bpf-next v9 4/5] igc: Refactor empty frame insertion for launch time support
On Thu, Feb 06, 2025 at 02:04:07PM +0800, Song Yoong Siang wrote: > Refactor the code for inserting an empty frame into a new function > igc_insert_empty_frame(). This change extracts the logic for inserting > an empty packet from igc_xmit_frame_ring() into a separate function, > allowing it to be reused in future implementations, such as the XDP > zero copy transmit function. > > Remove the igc_desc_unused() checking in igc_init_tx_empty_descriptor() > because the number of descriptors needed is guaranteed. > > Ensure that skb allocation and DMA mapping work for the empty frame, > before proceeding to fill in igc_tx_buffer info, context descriptor, > and data descriptor. > > Rate limit the error messages for skb allocation and DMA mapping failures. > > Update the comment to indicate that the 2 descriptors needed by the empty > frame are already taken into consideration in igc_xmit_frame_ring(). > > Handle the case where the insertion of an empty frame fails and explain > the reason behind this handling. > > Signed-off-by: Song Yoong Siang Reviewed-by: Maciej Fijalkowski with one nit below. > --- > drivers/net/ethernet/intel/igc/igc_main.c | 84 ++- > 1 file changed, 52 insertions(+), 32 deletions(-) > > diff --git a/drivers/net/ethernet/intel/igc/igc_main.c > b/drivers/net/ethernet/intel/igc/igc_main.c > index 84307bb7313e..3df608601a4b 100644 > --- a/drivers/net/ethernet/intel/igc/igc_main.c > +++ b/drivers/net/ethernet/intel/igc/igc_main.c > @@ -1092,7 +1092,9 @@ static int igc_init_empty_frame(struct igc_ring *ring, > > dma = dma_map_single(ring->dev, skb->data, size, DMA_TO_DEVICE); > if (dma_mapping_error(ring->dev, dma)) { > - netdev_err_once(ring->netdev, "Failed to map DMA for TX\n"); > + if (net_ratelimit()) > + netdev_err(ring->netdev, > +"DMA mapping error for empty frame\n"); This is open-coded net_err_ratelimited(), no? > return -ENOMEM; > } > > @@ -1108,20 +1110,12 @@ static int igc_init_empty_frame(struct igc_ring *ring, > return 0; > } > > -static int igc_init_tx_empty_descriptor(struct igc_ring *ring, > - struct sk_buff *skb, > - struct igc_tx_buffer *first) > +static void igc_init_tx_empty_descriptor(struct igc_ring *ring, > + struct sk_buff *skb, > + struct igc_tx_buffer *first) > { > union igc_adv_tx_desc *desc; > u32 cmd_type, olinfo_status; > - int err; > - > - if (!igc_desc_unused(ring)) > - return -EBUSY; > - > - err = igc_init_empty_frame(ring, first, skb); > - if (err) > - return err; > > cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT | > IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD | > @@ -1140,8 +1134,6 @@ static int igc_init_tx_empty_descriptor(struct igc_ring > *ring, > ring->next_to_use++; > if (ring->next_to_use == ring->count) > ring->next_to_use = 0; > - > - return 0; > } > > #define IGC_EMPTY_FRAME_SIZE 60 > @@ -1567,6 +1559,41 @@ static bool igc_request_tx_tstamp(struct igc_adapter > *adapter, struct sk_buff *s > return false; > } > > +static int igc_insert_empty_frame(struct igc_ring *tx_ring) > +{ > + struct igc_tx_buffer *empty_info; > + struct sk_buff *empty_skb; > + void *data; > + int ret; > + > + empty_info = &tx_ring->tx_buffer_info[tx_ring->next_to_use]; > + empty_skb = alloc_skb(IGC_EMPTY_FRAME_SIZE, GFP_ATOMIC); > + if (unlikely(!empty_skb)) { > + if (net_ratelimit()) > + netdev_err(tx_ring->netdev, > +"skb alloc error for empty frame\n"); ditto > + return -ENOMEM; > + } > + > + data = skb_put(empty_skb, IGC_EMPTY_FRAME_SIZE); > + memset(data, 0, IGC_EMPTY_FRAME_SIZE); > + > + /* Prepare DMA mapping and Tx buffer information */ > + ret = igc_init_empty_frame(tx_ring, empty_info, empty_skb); > + if (unlikely(ret)) { > + dev_kfree_skb_any(empty_skb); > + return ret; > + } > + > + /* Prepare advanced context descriptor for empty packet */ > + igc_tx_ctxtdesc(tx_ring, 0, false, 0, 0, 0); > + > + /* Prepare advanced data descriptor for empty packet */ > + igc_init_tx_empty_descriptor(tx_ring, empty_skb, empty_info); > + > + return 0; > +} > + > static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, > struct igc_ring *tx_ring) > { > @@ -1586,6 +1613,7 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff > *skb, >* + 1 desc for skb_headlen/IGC_MAX_DATA_PER_TXD, >* + 2 desc gap to keep tail from touching head, >* + 1 desc for context descriptor, > + * + 2 desc for inserting an em
Re: [Intel-wired-lan] [PATCH iwl-next v2] ixgbe: add support for thermal sensor event reception
From: Andrew Lunn Sent: Tuesday, February 4, 2025 2:09 PM >On Tue, Feb 04, 2025 at 08:17:00AM +0100, Jedrzej Jagielski wrote: >> E610 NICs unlike the previous devices utilising ixgbe driver >> are notified in the case of overheatning by the FW ACI event. >> >> In event of overheat when treshold is exceeded, FW suspends all >> traffic and sends overtemp event to the driver. Then driver >> logs appropriate message and closes the adapter instance. >> The card remains in that state until the platform is rebooted. > >There is also an HWMON temp[1-*]_emergency_alarm you can set. I >_think_ that should also cause a udev event, so user space knows the >print^h^h^h^h^hnetwork is on fire. > > Andrew I am not sure whether HWMON is applicable in that case. Driver receives an async notification from the FW that an overheating occurred, so has to handle it. In that case - by printing msg and making the interface disabled for the user. FW is responsible for monitoring temperature itself. There's even no possibility to read temperature by the driver Thanks Jedrek
Re: [Intel-wired-lan] [PATCH iwl-next v2] ixgbe: add support for thermal sensor event reception
On Thu, Feb 06, 2025 at 01:05:27PM +, Jagielski, Jedrzej wrote: > From: Andrew Lunn > Sent: Tuesday, February 4, 2025 2:09 PM > >On Tue, Feb 04, 2025 at 08:17:00AM +0100, Jedrzej Jagielski wrote: > >> E610 NICs unlike the previous devices utilising ixgbe driver > >> are notified in the case of overheatning by the FW ACI event. > >> > >> In event of overheat when treshold is exceeded, FW suspends all > >> traffic and sends overtemp event to the driver. Then driver > >> logs appropriate message and closes the adapter instance. > >> The card remains in that state until the platform is rebooted. > > > >There is also an HWMON temp[1-*]_emergency_alarm you can set. I > >_think_ that should also cause a udev event, so user space knows the > >print^h^h^h^h^hnetwork is on fire. > > > > Andrew > > I am not sure whether HWMON is applicable in that case. > Driver receives an async notification from the FW that an overheating > occurred, so has to handle it. In that case - by printing msg > and making the interface disabled for the user. > FW is responsible for monitoring temperature itself. > There's even no possibility to read temperature by the driver https://elixir.bootlin.com/linux/v6.13.1/source/drivers/net/ethernet/intel/ixgbe/ixgbe_sysfs.c#L27 ixgbe_hwmon_show_temp() is some other temperature sensor? Which you do have HWMON support for? Or is the E610 not really an ixgbe, it has a different architecture, more stuff pushed into firmware, less visibility from the kernel, no temperature monitoring, just a NIC on fire indication? Andrew
Re: [Intel-wired-lan] [PATCH iwl-next v2] ixgbe: add support for thermal sensor event reception
From: Andrew Lunn Sent: Thursday, February 6, 2025 2:59 PM >On Thu, Feb 06, 2025 at 01:05:27PM +, Jagielski, Jedrzej wrote: >> From: Andrew Lunn >> Sent: Tuesday, February 4, 2025 2:09 PM >> >On Tue, Feb 04, 2025 at 08:17:00AM +0100, Jedrzej Jagielski wrote: >> >> E610 NICs unlike the previous devices utilising ixgbe driver >> >> are notified in the case of overheatning by the FW ACI event. >> >> >> >> In event of overheat when treshold is exceeded, FW suspends all >> >> traffic and sends overtemp event to the driver. Then driver >> >> logs appropriate message and closes the adapter instance. >> >> The card remains in that state until the platform is rebooted. >> > >> >There is also an HWMON temp[1-*]_emergency_alarm you can set. I >> >_think_ that should also cause a udev event, so user space knows the >> >print^h^h^h^h^hnetwork is on fire. >> > >> >Andrew >> >> I am not sure whether HWMON is applicable in that case. >> Driver receives an async notification from the FW that an overheating >> occurred, so has to handle it. In that case - by printing msg >> and making the interface disabled for the user. >> FW is responsible for monitoring temperature itself. >> There's even no possibility to read temperature by the driver > >https://elixir.bootlin.com/linux/v6.13.1/source/drivers/net/ethernet/intel/ixgbe/ixgbe_sysfs.c#L27 > >ixgbe_hwmon_show_temp() is some other temperature sensor? Which you do >have HWMON support for? This feature is not supported for E610 which has no support for reading temperature hw->mac.ops.get_thermal_sensor_data() callback used in ixgbe_hwmon_show_temp has no implementation for E610, as there is no such support from the FW side > >Or is the E610 not really an ixgbe, it has a different architecture, ixgbe is used by several adapters, each is slightly different in this case monitoring stuff is pushed into FW >more stuff pushed into firmware, less visibility from the kernel, no >temperature monitoring, just a NIC on fire indication? yeah, right Jedrek
Re: [Intel-wired-lan] [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification
Hi Vladimir, Thanks for the quick review, appreciate your help. On 6/2/2025 1:12 am, Vladimir Oltean wrote: On Wed, Feb 05, 2025 at 05:05:20AM -0500, Faizal Rahim wrote: This patch implements the "ethtool --set-mm" callback to trigger the frame preemption verification handshake. Uses the MAC Merge Software Verification (mmsv) mechanism in ethtool to perform the verification handshake for igc. The structure fpe.mmsv is set by mmsv in ethtool and should remain read-only for the driver. igc does not use two mmsv callbacks: a) configure_tx() - igc lacks registers to configure FPE in the transmit direction. Yes, maybe, but it's still important to handle this. It tells you when the preemptible traffic classes should be sent as preemptible on the wire (i.e. when the verification is either disabled, or it succeeded). There is a selftest called manual_failed_verification() which supposedly tests this exact condition: if verification fails, then packets sent to TC0 are supposed to bump the eMAC's TX counters, even though TC0 is configured as preemptible. Otherwise stated: even if the tc program says that a certain traffic class is preemptible, you don't want to actually send preemptible packets if you haven't verified the link partner can handle them, since it will likely drop them on RX otherwise. Even though fpe in tx direction isn't set in igc, it still checks ethtool_mmsv_is_tx_active() before setting a queue as preemptible. This is done in : igc_tsn_enable_offload(struct igc_adapter *adapter) { { if (ethtool_mmsv_is_tx_active(&adapter->fpe.mmsv) && ring->preemptible) txqctl |= IGC_TXQCTL_PREEMPTIBLE; Wouldn't this handle the situation mentioned ? Sorry if I miss something here. The problem with that selftest is that it relies on the driver's ability to report split ethtool counters for eMAC/pMAC, rather than just aggregate. In lack of that, you need to know through some other mechanism whether the link partner received those packets as preemptible or express, and that's kind of hard to add to a selftest. b) configure_pmac() - this callback dynamically controls pmac_enabled at runtime. For example, mmsv calls configure_pmac() and disables pmac_enabled when the link partner goes down, even if the user previously enabled it. The intention is to save power but it is not feasible in igc because it causes an endless adapter reset loop: 1) Board A and Board B complete the verification handshake. Tx mode register for both boards are in TSN mode. 2) Board B link goes down. On Board A: 3) mmsv calls configure_pmac() with pmac_enabled = false. 4) configure_pmac() in igc updates a new field based on pmac_enabled. Driver uses this field in igc_tsn_new_flags() to indicate that the user enabled/disabled FPE. 5) configure_pmac() in igc calls igc_tsn_offload_apply() to check whether an adapter reset is needed. Calls existing logic in igc_tsn_will_tx_mode_change() and igc_tsn_new_flags(). 6) Since pmac_enabled is now disabled and no other TSN feature is active, igc_tsn_will_tx_mode_change() evaluates to true because Tx mode will switch from TSN to Legacy. 7) Driver resets the adapter. 8) Registers are set, and Tx mode switches to Legacy. 9) When link partner is up, steps 3–8 repeat, but this time with pmac_enabled = true, reactivating TSN. igc_tsn_will_tx_mode_change() evaluates to true again, since Tx mode will switch from Legacy to TSN. 10) Driver resets the adapter. 11) Rest adapter completes, registers are set, and Tx mode switches to TSN. On Board B: 12) Adapter reset on Board A at step 10 causes it to detect its link partner as down. Is this using fiber/DAC, or twisted pair (RJ45 PHY)? It's using RJ45 PHY. 13) Repeats steps 3–8. 14) Once reset adapter on Board A is completed at step 11, it detects its link partner as up. 15) Repeats steps 9–11. - this cycle repeats indefinitely. To avoid this issue, IGC only uses mmsv.pmac_enabled to track whether FPE is enabled or disabled. Not that I couldn't eventually tolerate this workaround, but I figure it's worth asking anyway. Isn't there a way to do an adapter reset without losing link in the PHY (assuming it's the PHY that loses link)? Is that a consequence of software design that's not careful enough, or of hardware design? I assume the PHY is a discrete component. Avoiding the need to retrigger auto-negotiation saves a few seconds of waiting, so it's worth pursuing if it's possible at all. I briefly checked the driver code and the i226 SW User Manual. The code calls igc_reset_task() → igc_reset() → igc_reinit_locked() → igc_down() → igc_reset
[Intel-wired-lan] [tnguy-next-queue:for-next] BUILD SUCCESS d67627e7b53203ca150e54723abbed81a0716286
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue.git for-next branch HEAD: d67627e7b53203ca150e54723abbed81a0716286 ice: init flow director before RDMA elapsed time: 1444m configs tested: 99 configs skipped: 2 The following configs have been built successfully. More configs may be tested in the coming days. tested configs: alpha allnoconfiggcc-14.2.0 alphaallyesconfiggcc-14.2.0 arc allmodconfiggcc-13.2.0 arc allnoconfiggcc-13.2.0 arc allyesconfiggcc-13.2.0 arc randconfig-001-20250206gcc-13.2.0 arc randconfig-002-20250206gcc-13.2.0 arm allmodconfiggcc-14.2.0 arm allnoconfigclang-17 arm allyesconfiggcc-14.2.0 arm randconfig-001-20250206clang-21 arm randconfig-002-20250206clang-19 arm randconfig-003-20250206clang-21 arm randconfig-004-20250206gcc-14.2.0 arm64allmodconfigclang-18 arm64 allnoconfiggcc-14.2.0 arm64 randconfig-001-20250206clang-21 arm64 randconfig-002-20250206clang-21 arm64 randconfig-003-20250206clang-15 arm64 randconfig-004-20250206gcc-14.2.0 csky allnoconfiggcc-14.2.0 csky randconfig-001-20250206gcc-14.2.0 csky randconfig-002-20250206gcc-14.2.0 hexagon allmodconfigclang-21 hexagon allnoconfigclang-21 hexagon allyesconfigclang-18 hexagon randconfig-001-20250206clang-21 hexagon randconfig-002-20250206clang-18 i386 allmodconfiggcc-12 i386 allnoconfiggcc-12 i386 allyesconfiggcc-12 i386buildonly-randconfig-001-20250206clang-19 i386buildonly-randconfig-002-20250206gcc-12 i386buildonly-randconfig-003-20250206clang-19 i386buildonly-randconfig-004-20250206clang-19 i386buildonly-randconfig-005-20250206clang-19 i386buildonly-randconfig-006-20250206gcc-12 i386defconfigclang-19 loongarch allnoconfiggcc-14.2.0 loongarch randconfig-001-20250206gcc-14.2.0 loongarch randconfig-002-20250206gcc-14.2.0 m68k allmodconfiggcc-14.2.0 m68k allnoconfiggcc-14.2.0 m68k allyesconfiggcc-14.2.0 microblazeallnoconfiggcc-14.2.0 mips allnoconfiggcc-14.2.0 nios2 allnoconfiggcc-14.2.0 nios2 randconfig-001-20250206gcc-14.2.0 nios2 randconfig-002-20250206gcc-14.2.0 openrisc allnoconfiggcc-14.2.0 openrisc allyesconfiggcc-14.2.0 parisc allmodconfiggcc-14.2.0 pariscallnoconfiggcc-14.2.0 parisc allyesconfiggcc-14.2.0 pariscrandconfig-001-20250206gcc-14.2.0 pariscrandconfig-002-20250206gcc-14.2.0 powerpc allnoconfiggcc-14.2.0 powerpc randconfig-001-20250206clang-21 powerpc randconfig-002-20250206clang-19 powerpc randconfig-003-20250206clang-21 powerpc64 randconfig-001-20250206clang-15 powerpc64 randconfig-002-20250206clang-17 powerpc64 randconfig-003-20250206clang-19 riscv allnoconfiggcc-14.2.0 riscv randconfig-001-20250206clang-19 riscv randconfig-002-20250206clang-17 s390 allmodconfigclang-19 s390 allnoconfigclang-21 s390 allyesconfiggcc-14.2.0 s390 randconfig-001-20250206clang-21 s390 randconfig-002-20250206gcc-14.2.0 sh allmodconfiggcc-14.2.0 shallnoconfiggcc-14.2.0 sh allyesconfiggcc-14.2.0 shrandconfig-001-20250206gcc-14.2.0 shrandconfig-002-20250206gcc-14.2.0 sparcallmodconfiggcc-14.2.0 sparc allnoconfiggcc-14.2.0 sparc randconfig
Re: [Intel-wired-lan] [PATCH bpf-next v9 3/5] net: stmmac: Add launch time support to XDP ZC
On Thu, Feb 06, 2025 at 02:04:06PM +0800, Song Yoong Siang wrote: > Enable launch time (Time-Based Scheduling) support for XDP zero copy via > the XDP Tx metadata framework. > > This patch has been tested with tools/testing/selftests/bpf/xdp_hw_metadata > on Intel Tiger Lake platform. Below are the test steps and result. > > Test 1: Send a single packet with the launch time set to 1 s in the future. > > Test steps: > 1. On the DUT, start the xdp_hw_metadata selftest application: >$ sudo ./xdp_hw_metadata enp0s30f4 -l 10 -L 1 > > 2. On the Link Partner, send a UDP packet with VLAN priority 1 to port 9091 >of the DUT. > > Result: > When the launch time is set to 1 s in the future, the delta between the > launch time and the transmit hardware timestamp is 16.963 us, as shown in > printout of the xdp_hw_metadata application below. > 0x55b5864717a8: rx_desc[4]->addr=88100 addr=88100 comp_addr=88100 EoP > No rx_hash, err=-95 > HW RX-time: 1734579065767717328 (sec:1734579065.7677) > delta to User RX-time sec:0.0004 (375.624 usec) > XDP RX-time: 1734579065768004454 (sec:1734579065.7680) > delta to User RX-time sec:0.0001 (88.498 usec) > No rx_vlan_tci or rx_vlan_proto, err=-95 > 0x55b5864717a8: ping-pong with csum=5619 (want ) > csum_start=34 csum_offset=6 > HW RX-time: 1734579065767717328 (sec:1734579065.7677) > delta to HW Launch-time sec:1. (100.000 usec) > 0x55b5864717a8: complete tx idx=4 addr=4018 > HW Launch-time: 1734579066767717328 (sec:1734579066.7677) > delta to HW TX-complete-time sec:0. (16.963 usec) > HW TX-complete-time: 1734579066767734291 (sec:1734579066.7677) > delta to User TX-complete-time sec:0.0001 > (130.408 usec) > XDP RX-time: 1734579065768004454 (sec:1734579065.7680) > delta to User TX-complete-time sec:0. > (999860.245 usec) > HW RX-time: 1734579065767717328 (sec:1734579065.7677) > delta to HW TX-complete-time sec:1. (116.963 usec) > 0x55b5864717a8: complete rx idx=132 addr=88100 > > Test 2: Send 1000 packets with a 10 ms interval and the launch time set to > 500 us in the future. > > Test steps: > 1. On the DUT, start the xdp_hw_metadata selftest application: >$ sudo chrt -f 99 ./xdp_hw_metadata enp0s30f4 -l 50 -L 1 > \ > /dev/shm/result.log > > 2. On the Link Partner, send 1000 UDP packets with a 10 ms interval and >VLAN priority 1 to port 9091 of the DUT. > > Result: > When the launch time is set to 500 us in the future, the average delta > between the launch time and the transmit hardware timestamp is 13.854 us, > as shown in the analysis of /dev/shm/result.log below. The XDP launch time > works correctly in sending 1000 packets continuously. > Min delta: 08.410 us > Avr delta: 13.854 us > Max delta: 17.076 us > Total packets forwarded: 1000 > > Reviewed-by: Choong Yong Liang > Signed-off-by: Song Yoong Siang Sorry haven't looked here in previous review approaches :/ > --- > drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ > drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 + > 2 files changed, 15 insertions(+) > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h > b/drivers/net/ethernet/stmicro/stmmac/stmmac.h > index f05cae103d83..925d8b97a42b 100644 > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h > @@ -106,6 +106,8 @@ struct stmmac_metadata_request { > struct stmmac_priv *priv; > struct dma_desc *tx_desc; > bool *set_ic; > + struct dma_edesc *edesc; > + int tbs; wanted to comment you're introducing holes here but set_ic is a ptr:) > }; > > struct stmmac_xsk_tx_complete { > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c > b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c > index d04543e5697b..5e5d24924ce7 100644 > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c > @@ -2514,9 +2514,20 @@ static u64 stmmac_xsk_fill_timestamp(void *_priv) > return 0; > } > > +static void stmmac_xsk_request_launch_time(u64 launch_time, void *_priv) > +{ > + struct stmmac_metadata_request *meta_req = _priv; > + struct timespec64 ts = ns_to_timespec64(launch_time); nit: Apply reverse christmas tree rule here (order the variable definitions from longest to shortest), but it's not a showstopper I guess... > + > + if (meta_req->tbs & STMMAC_TBS_EN) > + stmmac_set_desc_tbs(meta_req->priv, meta_req->edesc, ts.tv_sec, > + ts.tv_nsec); > +} > + > static const struct xsk_tx_metadata_ops stmmac_xsk_tx_metadata_ops = { > .tmo_request_timestamp = stmmac_xsk_request_timestamp, > .tmo_fill_timestamp
Re: [Intel-wired-lan] [PATCH bpf-next v9 5/5] igc: Add launch time support to XDP ZC
On Thu, Feb 06, 2025 at 02:04:08PM +0800, Song Yoong Siang wrote: > Enable Launch Time Control (LTC) support for XDP zero copy via XDP Tx > metadata framework. > > This patch has been tested with tools/testing/selftests/bpf/xdp_hw_metadata > on Intel I225-LM Ethernet controller. Below are the test steps and result. > > Test 1: Send a single packet with the launch time set to 1 s in the future. > > Test steps: > 1. On the DUT, start the xdp_hw_metadata selftest application: >$ sudo ./xdp_hw_metadata enp2s0 -l 10 -L 1 > > 2. On the Link Partner, send a UDP packet with VLAN priority 1 to port 9091 >of the DUT. > > Result: > When the launch time is set to 1 s in the future, the delta between the > launch time and the transmit hardware timestamp is 0.016 us, as shown in > printout of the xdp_hw_metadata application below. > 0x562ff5dc8880: rx_desc[4]->addr=84110 addr=84110 comp_addr=84110 EoP > rx_hash: 0xE343384 with RSS type:0x1 > HW RX-time: 1734578015467548904 (sec:1734578015.4675) > delta to User RX-time sec:0.0002 (183.103 usec) > XDP RX-time: 1734578015467651698 (sec:1734578015.4677) > delta to User RX-time sec:0.0001 (80.309 usec) > No rx_vlan_tci or rx_vlan_proto, err=-95 > 0x562ff5dc8880: ping-pong with csum=561c (want c7dd) > csum_start=34 csum_offset=6 > HW RX-time: 1734578015467548904 (sec:1734578015.4675) > delta to HW Launch-time sec:1. (100.000 usec) > 0x562ff5dc8880: complete tx idx=4 addr=4018 > HW Launch-time: 1734578016467548904 (sec:1734578016.4675) > delta to HW TX-complete-time sec:0. (0.016 usec) > HW TX-complete-time: 1734578016467548920 (sec:1734578016.4675) > delta to User TX-complete-time sec:0. > (32.546 usec) > XDP RX-time: 1734578015467651698 (sec:1734578015.4677) > delta to User TX-complete-time sec:0. > (29.768 usec) > HW RX-time: 1734578015467548904 (sec:1734578015.4675) > delta to HW TX-complete-time sec:1. (100.016 usec) > 0x562ff5dc8880: complete rx idx=132 addr=84110 > > Test 2: Send 1000 packets with a 10 ms interval and the launch time set to > 500 us in the future. > > Test steps: > 1. On the DUT, start the xdp_hw_metadata selftest application: >$ sudo chrt -f 99 ./xdp_hw_metadata enp2s0 -l 50 -L 1 > \ > /dev/shm/result.log > > 2. On the Link Partner, send 1000 UDP packets with a 10 ms interval and >VLAN priority 1 to port 9091 of the DUT. > > Result: > When the launch time is set to 500 us in the future, the average delta > between the launch time and the transmit hardware timestamp is 0.016 us, > as shown in the analysis of /dev/shm/result.log below. The XDP launch time > works correctly in sending 1000 packets continuously. > Min delta: 0.005 us > Avr delta: 0.016 us > Max delta: 0.031 us > Total packets forwarded: 1000 > > Signed-off-by: Song Yoong Siang Reviewed-by: Maciej Fijalkowski also with one nit. Thanks! > --- > drivers/net/ethernet/intel/igc/igc.h | 1 + > drivers/net/ethernet/intel/igc/igc_main.c | 57 ++- > 2 files changed, 56 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/ethernet/intel/igc/igc.h > b/drivers/net/ethernet/intel/igc/igc.h > index b8111ad9a9a8..cd1d7b6c1782 100644 > --- a/drivers/net/ethernet/intel/igc/igc.h > +++ b/drivers/net/ethernet/intel/igc/igc.h > @@ -579,6 +579,7 @@ struct igc_metadata_request { > struct xsk_tx_metadata *meta; > struct igc_ring *tx_ring; > u32 cmd_type; > + u16 used_desc; > }; > > struct igc_q_vector { > diff --git a/drivers/net/ethernet/intel/igc/igc_main.c > b/drivers/net/ethernet/intel/igc/igc_main.c > index 3df608601a4b..f239f744247d 100644 > --- a/drivers/net/ethernet/intel/igc/igc_main.c > +++ b/drivers/net/ethernet/intel/igc/igc_main.c > @@ -2973,9 +2973,44 @@ static u64 igc_xsk_fill_timestamp(void *_priv) > return *(u64 *)_priv; > } > > +static void igc_xsk_request_launch_time(u64 launch_time, void *_priv) > +{ > + struct igc_metadata_request *meta_req = _priv; > + struct igc_ring *tx_ring = meta_req->tx_ring; > + __le32 launch_time_offset; > + bool insert_empty = false; > + bool first_flag = false; > + > + if (!tx_ring->launchtime_enable) > + return; > + > + launch_time_offset = igc_tx_launchtime(tx_ring, > +ns_to_ktime(launch_time), > +&first_flag, &insert_empty); > + if (insert_empty) { > + /* Disregard the launch time request if the required empty frame > + * fails to be inserted. > + */ > + if (igc_insert_empty_frame(tx_ring)) > + return; > + > + meta_req->tx_buffer = > + &tx_ring->
Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12
On Thu, 6 Feb 2025 15:17:00 +0200 "Lifshits, Vitaly" wrote: > On 2/6/2025 6:13 AM, Stephen Hemminger wrote: > > On Wed, 5 Feb 2025 12:36:31 +0200 > > "Lifshits, Vitaly" wrote: > > > >> On 1/31/2025 3:21 AM, Stephen Hemminger wrote: > >>> On Thu, 30 Jan 2025 21:17:30 +0200 > >>> "Lifshits, Vitaly" wrote: > >>> > On 1/30/2025 7:11 PM, Stephen Hemminger wrote: > > I am using: > > > > 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller > > I226-LM (rev 04) > > Subsystem: Intel Corporation Device > > Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group > > 20 > > Memory at 6c50 (32-bit, non-prefetchable) [size=1M] > > Memory at 6c60 (32-bit, non-prefetchable) [size=16K] > > Capabilities: [40] Power Management version 3 > > Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ > > Capabilities: [70] MSI-X: Enable+ Count=5 Masked- > > Capabilities: [a0] Express Endpoint, IntMsgNum 0 > > Capabilities: [100] Advanced Error Reporting > > Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d > > Capabilities: [1c0] Latency Tolerance Reporting > > Capabilities: [1f0] Precision Time Measurement > > Capabilities: [1e0] L1 PM Substates > > Kernel driver in use: igc > > Kernel modules: igc > > > > > > Using both Debian testing and my own kernel built from 6.12, the igc > > driver appears broken after resume. > > From which system state are you resuming? > > > > > After resuming the device is down and no address present. > > Attempts to set link up manually fail. > > Did you get any errors in the dmesg log? > What is the firmware version on your device (you can get it by running > ethtool -i)? > > > If I do rmmod/modprobe of igc it comes back. > > > > Doing a bit of bisectting but it is slow going. > > Meanwhile, we'll also try to reproduce this issue in our lab. Could you > share more details about your system so we can create a similar setup? > >>> > >>> Given that error reported is -ENODEV, might be a generic netdev problem > >>> not > >>> just for igc device. > >>> > >> > >> We weren't able to reproduce this issue on our systems, even though we > >> tried several suspend-resume cycles on different kernels and different > >> systems. > >> > >> However, a few days ago we received a comment in a BZ about an issue > >> similar to yours. In there adding a short delay in igc_resume function > >> https://bugzilla.kernel.org/show_bug.cgi?id=219143 > >> https://bugzilla.kernel.org/show_bug.cgi?id=219143#c123 > >> > >> > >> > >> Can you try to see if it fixes your issue as well? > > > > I tried the proposed delay and it had no impact. > > Any idea of other things to instrument? > > > > > Has the adapter worked with a different kernel? Can you try to reproduce > the issue over kernel 6.9? > > Is the LAN cable connected to the igc adapter? Does it maintain link > during suspend? > > Also, I saw that on your board you have three more adapters, I assume > that enp2s0f0np0 and enp2s0f0np1 are i40e adapters. Does this issue also > happen to enp87s0? This is a new machine, and not sure if it ever worked. I can boot some older distro via USB if that helps. The LAN cable is always connected (it is a desktop box), and the 10G NIC's are not used; they are connected by a loopback cable and used for DPDK testing occasionally. It does work in Windows...
Re: [Intel-wired-lan] suspend/resume broken of igc driver broken on 6.12
On 2/6/2025 6:13 AM, Stephen Hemminger wrote: On Wed, 5 Feb 2025 12:36:31 +0200 "Lifshits, Vitaly" wrote: On 1/31/2025 3:21 AM, Stephen Hemminger wrote: On Thu, 30 Jan 2025 21:17:30 +0200 "Lifshits, Vitaly" wrote: On 1/30/2025 7:11 PM, Stephen Hemminger wrote: I am using: 5a:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-LM (rev 04) Subsystem: Intel Corporation Device Flags: bus master, fast devsel, latency 0, IRQ 19, IOMMU group 20 Memory at 6c50 (32-bit, non-prefetchable) [size=1M] Memory at 6c60 (32-bit, non-prefetchable) [size=16K] Capabilities: [40] Power Management version 3 Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+ Capabilities: [70] MSI-X: Enable+ Count=5 Masked- Capabilities: [a0] Express Endpoint, IntMsgNum 0 Capabilities: [100] Advanced Error Reporting Capabilities: [140] Device Serial Number 58-47-ca-ff-ff-7a-98-3d Capabilities: [1c0] Latency Tolerance Reporting Capabilities: [1f0] Precision Time Measurement Capabilities: [1e0] L1 PM Substates Kernel driver in use: igc Kernel modules: igc Using both Debian testing and my own kernel built from 6.12, the igc driver appears broken after resume. From which system state are you resuming? After resuming the device is down and no address present. Attempts to set link up manually fail. Did you get any errors in the dmesg log? What is the firmware version on your device (you can get it by running ethtool -i)? If I do rmmod/modprobe of igc it comes back. Doing a bit of bisectting but it is slow going. Meanwhile, we'll also try to reproduce this issue in our lab. Could you share more details about your system so we can create a similar setup? Given that error reported is -ENODEV, might be a generic netdev problem not just for igc device. We weren't able to reproduce this issue on our systems, even though we tried several suspend-resume cycles on different kernels and different systems. However, a few days ago we received a comment in a BZ about an issue similar to yours. In there adding a short delay in igc_resume function https://bugzilla.kernel.org/show_bug.cgi?id=219143 https://bugzilla.kernel.org/show_bug.cgi?id=219143#c123 Can you try to see if it fixes your issue as well? I tried the proposed delay and it had no impact. Any idea of other things to instrument? Has the adapter worked with a different kernel? Can you try to reproduce the issue over kernel 6.9? Is the LAN cable connected to the igc adapter? Does it maintain link during suspend? Also, I saw that on your board you have three more adapters, I assume that enp2s0f0np0 and enp2s0f0np1 are i40e adapters. Does this issue also happen to enp87s0?
Re: [Intel-wired-lan] [PATCH iwl-next v1 01/13] ixgbe: add initial devlink support
On 2/3/2025 7:03 AM, Jedrzej Jagielski wrote: Add an initial support for devlink interface to ixgbe driver. Similarly to i40e driver the implementation doesn't enable devlink to manage device-wide configuration. Devlink instance is created for each physical function of PCIe device. Create separate directory for devlink related ixgbe files and use naming scheme similar to the one used in the ice driver. Add a stub for Documentation, to be extended by further patches. Reviewed-by: Mateusz Polchlopek Signed-off-by: Jedrzej Jagielski ... @@ -11283,6 +11284,10 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent) goto err_ioremap; } + err = ixgbe_allocate_devlink(adapter); + if (err) + goto err_ioremap; Looks like the unroll has an issue. Smatch reports: ../drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:11844 ixgbe_probe() warn: 'hw->hw_addr' from ioremap() not released on lines: 11844. Thanks, Tony + netdev->netdev_ops = &ixgbe_netdev_ops; ixgbe_set_ethtool_ops(netdev); netdev->watchdog_timeo = 5 * HZ;
Re: [Intel-wired-lan] [PATCH iwl-net v2] ice: health.c: fix compilation on gcc 7.5
> From: Kitszel, Przemyslaw > [...] > Subject: [PATCH iwl-net v2] ice: health.c: fix compilation on gcc 7.5 > > GCC 7 is not as good as GCC 8+ in telling what is a compile-time const, and > thus could be used for static storage. > Fortunately keeping strings as const arrays is enough to make old gcc happy. > > Excerpt from the report: > My GCC is: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0. > > CC [M] drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.o > drivers/net/ethernet/intel/ice/devlink/health.c:35:3: error: initializer > element > is not constant >ice_common_port_solutions, {ice_port_number_label}}, >^ > drivers/net/ethernet/intel/ice/devlink/health.c:35:3: note: (near > initialization > for 'ice_health_status_lookup[0].solution') > drivers/net/ethernet/intel/ice/devlink/health.c:35:31: error: initializer > element is not constant >ice_common_port_solutions, {ice_port_number_label}}, >^ > drivers/net/ethernet/intel/ice/devlink/health.c:35:31: note: (near > initialization > for 'ice_health_status_lookup[0].data_label[0]') > drivers/net/ethernet/intel/ice/devlink/health.c:37:46: error: initializer > element is not constant >"Change or replace the module or cable.", {ice_port_number_label}}, > ^ > drivers/net/ethernet/intel/ice/devlink/health.c:37:46: note: (near > initialization > for 'ice_health_status_lookup[1].data_label[0]') > drivers/net/ethernet/intel/ice/devlink/health.c:39:3: error: initializer > element > is not constant >ice_common_port_solutions, {ice_port_number_label}}, >^ > > Fixes: 85d6164ec56d ("ice: add fw and port health reporters") > Reported-by: Qiuxu Zhuo > Closes: > https://lore.kernel.org/netdev/CY8PR11MB7134BF7A46D71E50D25FA7A989F > 7...@cy8pr11mb7134.namprd11.prod.outlook.com > Reviewed-by: Michal Swiatkowski > Suggested-by: Simon Horman > Signed-off-by: Przemek Kitszel Tested-by: Qiuxu Zhuo Thanks!
[Intel-wired-lan] [PATCH bpf-next v10 5/5] igc: Add launch time support to XDP ZC
Enable Launch Time Control (LTC) support for XDP zero copy via XDP Tx metadata framework. This patch has been tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel I225-LM Ethernet controller. Below are the test steps and result. Test 1: Send a single packet with the launch time set to 1 s in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo ./xdp_hw_metadata enp2s0 -l 10 -L 1 2. On the Link Partner, send a UDP packet with VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 1 s in the future, the delta between the launch time and the transmit hardware timestamp is 0.016 us, as shown in printout of the xdp_hw_metadata application below. 0x562ff5dc8880: rx_desc[4]->addr=84110 addr=84110 comp_addr=84110 EoP rx_hash: 0xE343384 with RSS type:0x1 HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to User RX-time sec:0.0002 (183.103 usec) XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User RX-time sec:0.0001 (80.309 usec) No rx_vlan_tci or rx_vlan_proto, err=-95 0x562ff5dc8880: ping-pong with csum=561c (want c7dd) csum_start=34 csum_offset=6 HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW Launch-time sec:1. (100.000 usec) 0x562ff5dc8880: complete tx idx=4 addr=4018 HW Launch-time: 1734578016467548904 (sec:1734578016.4675) delta to HW TX-complete-time sec:0. (0.016 usec) HW TX-complete-time: 1734578016467548920 (sec:1734578016.4675) delta to User TX-complete-time sec:0. (32.546 usec) XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User TX-complete-time sec:0. (29.768 usec) HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW TX-complete-time sec:1. (100.016 usec) 0x562ff5dc8880: complete rx idx=132 addr=84110 Test 2: Send 1000 packets with a 10 ms interval and the launch time set to 500 us in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo chrt -f 99 ./xdp_hw_metadata enp2s0 -l 50 -L 1 > \ /dev/shm/result.log 2. On the Link Partner, send 1000 UDP packets with a 10 ms interval and VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 500 us in the future, the average delta between the launch time and the transmit hardware timestamp is 0.016 us, as shown in the analysis of /dev/shm/result.log below. The XDP launch time works correctly in sending 1000 packets continuously. Min delta: 0.005 us Avr delta: 0.016 us Max delta: 0.031 us Total packets forwarded: 1000 Reviewed-by: Maciej Fijalkowski Signed-off-by: Song Yoong Siang --- drivers/net/ethernet/intel/igc/igc.h | 1 + drivers/net/ethernet/intel/igc/igc_main.c | 61 ++- 2 files changed, 60 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h index b8111ad9a9a8..cd1d7b6c1782 100644 --- a/drivers/net/ethernet/intel/igc/igc.h +++ b/drivers/net/ethernet/intel/igc/igc.h @@ -579,6 +579,7 @@ struct igc_metadata_request { struct xsk_tx_metadata *meta; struct igc_ring *tx_ring; u32 cmd_type; + u16 used_desc; }; struct igc_q_vector { diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 1bfa71545e37..3044392e8ded 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -2971,9 +2971,48 @@ static u64 igc_xsk_fill_timestamp(void *_priv) return *(u64 *)_priv; } +static void igc_xsk_request_launch_time(u64 launch_time, void *_priv) +{ + struct igc_metadata_request *meta_req = _priv; + struct igc_ring *tx_ring = meta_req->tx_ring; + __le32 launch_time_offset; + bool insert_empty = false; + bool first_flag = false; + u16 used_desc = 0; + + if (!tx_ring->launchtime_enable) + return; + + launch_time_offset = igc_tx_launchtime(tx_ring, + ns_to_ktime(launch_time), + &first_flag, &insert_empty); + if (insert_empty) { + /* Disregard the launch time request if the required empty frame +* fails to be inserted. +*/ + if (igc_insert_empty_frame(tx_ring)) + return; + + meta_req->tx_buffer = + &tx_ring->tx_buffer_info[tx_ring->next_to_use]; + /* Inserting an empty packet requires two descriptors: +* one data descriptor and one context descriptor. +*/ + used_desc += 2; + } + + /* Us
[Intel-wired-lan] [PATCH iwl-net v2] ice: health.c: fix compilation on gcc 7.5
GCC 7 is not as good as GCC 8+ in telling what is a compile-time const, and thus could be used for static storage. Fortunately keeping strings as const arrays is enough to make old gcc happy. Excerpt from the report: My GCC is: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0. CC [M] drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.o drivers/net/ethernet/intel/ice/devlink/health.c:35:3: error: initializer element is not constant ice_common_port_solutions, {ice_port_number_label}}, ^ drivers/net/ethernet/intel/ice/devlink/health.c:35:3: note: (near initialization for 'ice_health_status_lookup[0].solution') drivers/net/ethernet/intel/ice/devlink/health.c:35:31: error: initializer element is not constant ice_common_port_solutions, {ice_port_number_label}}, ^ drivers/net/ethernet/intel/ice/devlink/health.c:35:31: note: (near initialization for 'ice_health_status_lookup[0].data_label[0]') drivers/net/ethernet/intel/ice/devlink/health.c:37:46: error: initializer element is not constant "Change or replace the module or cable.", {ice_port_number_label}}, ^ drivers/net/ethernet/intel/ice/devlink/health.c:37:46: note: (near initialization for 'ice_health_status_lookup[1].data_label[0]') drivers/net/ethernet/intel/ice/devlink/health.c:39:3: error: initializer element is not constant ice_common_port_solutions, {ice_port_number_label}}, ^ Fixes: 85d6164ec56d ("ice: add fw and port health reporters") Reported-by: Qiuxu Zhuo Closes: https://lore.kernel.org/netdev/cy8pr11mb7134bf7a46d71e50d25fa7a989...@cy8pr11mb7134.namprd11.prod.outlook.com Reviewed-by: Michal Swiatkowski Suggested-by: Simon Horman Signed-off-by: Przemek Kitszel --- v2: use static const char[] instead of #define - Simon +added RB tag from Michal, but not adding TB tag from Qiuxu v1: https://lore.kernel.org/netdev/20250205104252.30464-2-przemyslaw.kits...@intel.com CC: Kees Cook CC: Jiri Slaby --- drivers/net/ethernet/intel/ice/devlink/health.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/devlink/health.c b/drivers/net/ethernet/intel/ice/devlink/health.c index ea40f7941259..19c3d37aa768 100644 --- a/drivers/net/ethernet/intel/ice/devlink/health.c +++ b/drivers/net/ethernet/intel/ice/devlink/health.c @@ -25,10 +25,10 @@ struct ice_health_status { * The below lookup requires to be sorted by code. */ -static const char *const ice_common_port_solutions = +static const char ice_common_port_solutions[] = "Check your cable connection. Change or replace the module or cable. Manually set speed and duplex."; -static const char *const ice_port_number_label = "Port Number"; -static const char *const ice_update_nvm_solution = "Update to the latest NVM image."; +static const char ice_port_number_label[] = "Port Number"; +static const char ice_update_nvm_solution[] = "Update to the latest NVM image."; static const struct ice_health_status ice_health_status_lookup[] = { {ICE_AQC_HEALTH_STATUS_ERR_UNKNOWN_MOD_STRICT, "An unsupported module was detected.", base-commit: 4241a702e0d0c2ca9364cfac08dbf134264962de -- 2.46.0
[Intel-wired-lan] [PATCH bpf-next v10 1/5] xsk: Add launch time hardware offload support to XDP Tx metadata
Extend the XDP Tx metadata framework so that user can requests launch time hardware offload, where the Ethernet device will schedule the packet for transmission at a pre-determined time called launch time. The value of launch time is communicated from user space to Ethernet driver via launch_time field of struct xsk_tx_metadata. Suggested-by: Stanislav Fomichev Acked-by: Stanislav Fomichev Signed-off-by: Song Yoong Siang --- Documentation/netlink/specs/netdev.yaml | 4 ++ Documentation/networking/xsk-tx-metadata.rst | 62 include/net/xdp_sock.h | 10 include/net/xdp_sock_drv.h | 1 + include/uapi/linux/if_xdp.h | 10 include/uapi/linux/netdev.h | 3 + net/core/netdev-genl.c | 2 + net/xdp/xsk.c| 3 + tools/include/uapi/linux/if_xdp.h| 10 tools/include/uapi/linux/netdev.h| 3 + 10 files changed, 108 insertions(+) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index cbb544bd6c84..901b5afb3df0 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -70,6 +70,10 @@ definitions: name: tx-checksum doc: L3 checksum HW offload is supported by the driver. + - +name: tx-launch-time-fifo +doc: + Launch time HW offload is supported by the driver. - name: queue-type type: enum diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst index e76b0cfc32f7..df53a10ccac3 100644 --- a/Documentation/networking/xsk-tx-metadata.rst +++ b/Documentation/networking/xsk-tx-metadata.rst @@ -50,6 +50,10 @@ The flags field enables the particular offload: checksum. ``csum_start`` specifies byte offset of where the checksumming should start and ``csum_offset`` specifies byte offset where the device should store the computed checksum. +- ``XDP_TXMD_FLAGS_LAUNCH_TIME``: requests the device to schedule the + packet for transmission at a pre-determined time called launch time. The + value of launch time is indicated by ``launch_time`` field of + ``union xsk_tx_metadata``. Besides the flags above, in order to trigger the offloads, the first packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA`` @@ -65,6 +69,63 @@ In this case, when running in ``XDK_COPY`` mode, the TX checksum is calculated on the CPU. Do not enable this option in production because it will negatively affect performance. +Launch Time +=== + +The value of the requested launch time should be based on the device's PTP +Hardware Clock (PHC) to ensure accuracy. AF_XDP takes a different data path +compared to the ETF queuing discipline, which organizes packets and delays +their transmission. Instead, AF_XDP immediately hands off the packets to +the device driver without rearranging their order or holding them prior to +transmission. Since the driver maintains FIFO behavior and does not perform +packet reordering, a packet with a launch time request will block other +packets in the same Tx Queue until it is sent. Therefore, it is recommended +to allocate separate queue for scheduling traffic that is intended for +future transmission. + +In scenarios where the launch time offload feature is disabled, the device +driver is expected to disregard the launch time request. For correct +interpretation and meaningful operation, the launch time should never be +set to a value larger than the farthest programmable time in the future +(the horizon). Different devices have different hardware limitations on the +launch time offload feature. + +stmmac driver +- + +For stmmac, TSO and launch time (TBS) features are mutually exclusive for +each individual Tx Queue. By default, the driver configures Tx Queue 0 to +support TSO and the rest of the Tx Queues to support TBS. The launch time +hardware offload feature can be enabled or disabled by using the tc-etf +command to call the driver's ndo_setup_tc() callback. + +The value of the launch time that is programmed in the Enhanced Normal +Transmit Descriptors is a 32-bit value, where the most significant 8 bits +represent the time in seconds and the remaining 24 bits represent the time +in 256 ns increments. The programmed launch time is compared against the +PTP time (bits[39:8]) and rolls over after 256 seconds. Therefore, the +horizon of the launch time for dwmac4 and dwxlgmac2 is 128 seconds in the +future. + +igc driver +-- + +For igc, all four Tx Queues support the launch time feature. The launch +time hardware offload feature can be enabled or disabled by using the +tc-etf command to call the driver's ndo_setup_tc() callback. When entering +TSN mode, the igc driver will reset the device and create a default Qbv +schedule with a 1-second cycle time, with all Tx Queues
[Intel-wired-lan] [PATCH bpf-next v10 0/5] xsk: TX metadata Launch Time support
This series expands the XDP TX metadata framework to allow user applications to pass per packet 64-bit launch time directly to the kernel driver, requesting launch time hardware offload support. The XDP TX metadata framework will not perform any clock conversion or packet reordering. Please note that the role of Tx metadata is just to pass the launch time, not to enable the offload feature. Users will need to enable the launch time hardware offload feature of the device by using the respective command, such as the tc-etf command. Although some devices use the tc-etf command to enable their launch time hardware offload feature, xsk packets will not go through the etf qdisc. Therefore, in my opinion, the launch time should always be based on the PTP Hardware Clock (PHC). Thus, i did not include a clock ID to indicate the clock source. To simplify the test steps, I modified the xdp_hw_metadata bpf self-test tool in such a way that it will set the launch time based on the offset provided by the user and the value of the Receive Hardware Timestamp, which is against the PHC. This will eliminate the need to discipline System Clock with the PHC and then use clock_gettime() to get the time. Please note that AF_XDP lacks a feedback mechanism to inform the application if the requested launch time is invalid. So, users are expected to familiar with the horizon of the launch time of the device they use and not request a launch time that is beyond the horizon. Otherwise, the driver might interpret the launch time incorrectly and react wrongly. For stmmac and igc, where modulo computation is used, a launch time larger than the horizon will cause the device to transmit the packet earlier that the requested launch time. Although there is no feedback mechanism for the launch time request for now, user still can check whether the requested launch time is working or not, by requesting the Transmit Completion Hardware Timestamp. v10: - use net_err_ratelimited(), instead of net_ratelimit() (Maciej) - accumulate the amount of used descs in local variable and update the igc_metadata_request::used_desc once (Maciej) - Ensure reverse christmas tree rule (Maciej) V9: https://lore.kernel.org/netdev/20250206060408.808325-1-yoong.siang.s...@intel.com/ - Remove the igc_desc_unused() checking (Maciej) - Ensure that skb allocation and DMA mapping work before proceeding to fill in igc_tx_buffer info, context desc, and data desc (Maciej) - Rate limit the error messages (Maciej) - Update the comment to indicate that the 2 descriptors needed by the empty frame are already taken into consideration (Maciej) - Handle the case where the insertion of an empty frame fails and explain the reason behind (Maciej) - put self SOB tag as last tag (Maciej) V8: https://lore.kernel.org/netdev/20250205024116.798862-1-yoong.siang.s...@intel.com/ - check the number of used descriptor in xsk_tx_metadata_request() by using used_desc of struct igc_metadata_request, and then decreases the budget with it (Maciej) - submit another bug fix patch to set the buffer type for empty frame (Maciej): https://lore.kernel.org/netdev/20250205023603.798819-1-yoong.siang.s...@intel.com/ V7: https://lore.kernel.org/netdev/20250204004907.789330-1-yoong.siang.s...@intel.com/ - split the refactoring code of igc empty packet insertion into a separate commit (Faizal) - add explanation on why the value "4" is used as igc transmit budget (Faizal) - perform a stress test by sending 1000 packets with 10ms interval and launch time set to 500us in the future (Faizal & Yong Liang) V6: https://lore.kernel.org/netdev/20250116155350.555374-1-yoong.siang.s...@intel.com/ - fix selftest build errors by using asprintf() and realloc(), instead of managing the buffer sizes manually (Daniel, Stanislav) V5: https://lore.kernel.org/netdev/20250114152718.120588-1-yoong.siang.s...@intel.com/ - change netdev feature name from tx-launch-time to tx-launch-time-fifo to explicitly state the FIFO behaviour (Stanislav) - improve the looping of xdp_hw_metadata app to wait for packet tx completion to be more readable by using clock_gettime() (Stanislav) - add launch time setup steps into xdp_hw_metadata app (Stanislav) V4: https://lore.kernel.org/netdev/20250106135506.9687-1-yoong.siang.s...@intel.com/ - added XDP launch time support to the igc driver (Jesper & Florian) - added per-driver launch time limitation on xsk-tx-metadata.rst (Jesper) - added explanation on FIFO behavior on xsk-tx-metadata.rst (Jakub) - added step to enable launch time in the commit message (Jesper & Willem) - explicitly documented the type of launch_time and which clock source it is against (Willem) V3: https://lore.kernel.org/netdev/20231203165129.1740512-1-yoong.siang.s...@intel.com/ - renamed to use launch time (Jesper & Willem) - changed the default launch time in xdp_hw_metadata apps from 1s to 0.1s beca
[Intel-wired-lan] [PATCH bpf-next v10 2/5] selftests/bpf: Add launch time request to xdp_hw_metadata
Add launch time hardware offload request to xdp_hw_metadata. Users can configure the delta of launch time relative to HW RX-time using the "-l" argument. By default, the delta is set to 0 ns, which means the launch time is disabled. By setting the delta to a non-zero value, the launch time hardware offload feature will be enabled and requested. Additionally, users can configure the Tx Queue to be enabled with the launch time hardware offload using the "-L" argument. By default, Tx Queue 0 will be used. Acked-by: Stanislav Fomichev Signed-off-by: Song Yoong Siang --- tools/testing/selftests/bpf/xdp_hw_metadata.c | 168 +- 1 file changed, 163 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c index 6f7b15d6c6ed..3d8de0d4c96a 100644 --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c @@ -13,6 +13,7 @@ * - UDP 9091 packets trigger TX reply * - TX HW timestamp is requested and reported back upon completion * - TX checksum is requested + * - TX launch time HW offload is requested for transmission */ #include @@ -37,6 +38,15 @@ #include #include #include +#include +#include +#include +#include +#include +#include +#include +#include +#include #include "xdp_metadata.h" @@ -64,6 +74,18 @@ int rxq; bool skip_tx; __u64 last_hw_rx_timestamp; __u64 last_xdp_rx_timestamp; +__u64 last_launch_time; +__u64 launch_time_delta_to_hw_rx_timestamp; +int launch_time_queue; + +#define run_command(cmd, ...) \ +({ \ + char command[1024]; \ + memset(command, 0, sizeof(command));\ + snprintf(command, sizeof(command), cmd, ##__VA_ARGS__); \ + fprintf(stderr, "Running: %s\n", command); \ + system(command);\ +}) void test__fail(void) { /* for network_helpers.c */ } @@ -298,6 +320,12 @@ static bool complete_tx(struct xsk *xsk, clockid_t clock_id) if (meta->completion.tx_timestamp) { __u64 ref_tstamp = gettime(clock_id); + if (launch_time_delta_to_hw_rx_timestamp) { + print_tstamp_delta("HW Launch-time", + "HW TX-complete-time", + last_launch_time, + meta->completion.tx_timestamp); + } print_tstamp_delta("HW TX-complete-time", "User TX-complete-time", meta->completion.tx_timestamp, ref_tstamp); print_tstamp_delta("XDP RX-time", "User TX-complete-time", @@ -395,6 +423,17 @@ static void ping_pong(struct xsk *xsk, void *rx_packet, clockid_t clock_id) xsk, ntohs(udph->check), ntohs(want_csum), meta->request.csum_start, meta->request.csum_offset); + /* Set the value of launch time */ + if (launch_time_delta_to_hw_rx_timestamp) { + meta->flags |= XDP_TXMD_FLAGS_LAUNCH_TIME; + meta->request.launch_time = last_hw_rx_timestamp + + launch_time_delta_to_hw_rx_timestamp; + last_launch_time = meta->request.launch_time; + print_tstamp_delta("HW RX-time", "HW Launch-time", + last_hw_rx_timestamp, + meta->request.launch_time); + } + memcpy(data, rx_packet, len); /* don't share umem chunk for simplicity */ tx_desc->options |= XDP_TX_METADATA; tx_desc->len = len; @@ -407,6 +446,7 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t const struct xdp_desc *rx_desc; struct pollfd fds[rxq + 1]; __u64 comp_addr; + __u64 deadline; __u64 addr; __u32 idx = 0; int ret; @@ -477,9 +517,15 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t if (ret) printf("kick_tx ret=%d\n", ret); - for (int j = 0; j < 500; j++) { + /* wait 1 second + cover launch time */ + deadline = gettime(clock_id) + + NANOSEC_PER_SEC + + launch_time_delta_to_hw_rx_timestamp; + while (true) { if (complete_tx(xsk, clock_id)) break; + if (gettime(clock_id) >= deadline) +
[Intel-wired-lan] [PATCH bpf-next v10 3/5] net: stmmac: Add launch time support to XDP ZC
Enable launch time (Time-Based Scheduling) support for XDP zero copy via the XDP Tx metadata framework. This patch has been tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel Tiger Lake platform. Below are the test steps and result. Test 1: Send a single packet with the launch time set to 1 s in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo ./xdp_hw_metadata enp0s30f4 -l 10 -L 1 2. On the Link Partner, send a UDP packet with VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 1 s in the future, the delta between the launch time and the transmit hardware timestamp is 16.963 us, as shown in printout of the xdp_hw_metadata application below. 0x55b5864717a8: rx_desc[4]->addr=88100 addr=88100 comp_addr=88100 EoP No rx_hash, err=-95 HW RX-time: 1734579065767717328 (sec:1734579065.7677) delta to User RX-time sec:0.0004 (375.624 usec) XDP RX-time: 1734579065768004454 (sec:1734579065.7680) delta to User RX-time sec:0.0001 (88.498 usec) No rx_vlan_tci or rx_vlan_proto, err=-95 0x55b5864717a8: ping-pong with csum=5619 (want ) csum_start=34 csum_offset=6 HW RX-time: 1734579065767717328 (sec:1734579065.7677) delta to HW Launch-time sec:1. (100.000 usec) 0x55b5864717a8: complete tx idx=4 addr=4018 HW Launch-time: 1734579066767717328 (sec:1734579066.7677) delta to HW TX-complete-time sec:0. (16.963 usec) HW TX-complete-time: 1734579066767734291 (sec:1734579066.7677) delta to User TX-complete-time sec:0.0001 (130.408 usec) XDP RX-time: 1734579065768004454 (sec:1734579065.7680) delta to User TX-complete-time sec:0. (999860.245 usec) HW RX-time: 1734579065767717328 (sec:1734579065.7677) delta to HW TX-complete-time sec:1. (116.963 usec) 0x55b5864717a8: complete rx idx=132 addr=88100 Test 2: Send 1000 packets with a 10 ms interval and the launch time set to 500 us in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo chrt -f 99 ./xdp_hw_metadata enp0s30f4 -l 50 -L 1 > \ /dev/shm/result.log 2. On the Link Partner, send 1000 UDP packets with a 10 ms interval and VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 500 us in the future, the average delta between the launch time and the transmit hardware timestamp is 13.854 us, as shown in the analysis of /dev/shm/result.log below. The XDP launch time works correctly in sending 1000 packets continuously. Min delta: 08.410 us Avr delta: 13.854 us Max delta: 17.076 us Total packets forwarded: 1000 Reviewed-by: Choong Yong Liang Signed-off-by: Song Yoong Siang --- drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 + 2 files changed, 15 insertions(+) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h index f05cae103d83..925d8b97a42b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h @@ -106,6 +106,8 @@ struct stmmac_metadata_request { struct stmmac_priv *priv; struct dma_desc *tx_desc; bool *set_ic; + struct dma_edesc *edesc; + int tbs; }; struct stmmac_xsk_tx_complete { diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index d04543e5697b..ff67dc4ecf72 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -2514,9 +2514,20 @@ static u64 stmmac_xsk_fill_timestamp(void *_priv) return 0; } +static void stmmac_xsk_request_launch_time(u64 launch_time, void *_priv) +{ + struct timespec64 ts = ns_to_timespec64(launch_time); + struct stmmac_metadata_request *meta_req = _priv; + + if (meta_req->tbs & STMMAC_TBS_EN) + stmmac_set_desc_tbs(meta_req->priv, meta_req->edesc, ts.tv_sec, + ts.tv_nsec); +} + static const struct xsk_tx_metadata_ops stmmac_xsk_tx_metadata_ops = { .tmo_request_timestamp = stmmac_xsk_request_timestamp, .tmo_fill_timestamp = stmmac_xsk_fill_timestamp, + .tmo_request_launch_time= stmmac_xsk_request_launch_time, }; static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) @@ -2600,6 +2611,8 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) meta_req.priv = priv; meta_req.tx_desc = tx_desc; meta_req.set_ic = &set_ic; + meta_req.tbs = tx_q->tbs; + meta_req.edesc = &tx_q->dma_entx[entry];
[Intel-wired-lan] [PATCH bpf-next v10 4/5] igc: Refactor empty frame insertion for launch time support
Refactor the code for inserting an empty frame into a new function igc_insert_empty_frame(). This change extracts the logic for inserting an empty packet from igc_xmit_frame_ring() into a separate function, allowing it to be reused in future implementations, such as the XDP zero copy transmit function. Remove the igc_desc_unused() checking in igc_init_tx_empty_descriptor() because the number of descriptors needed is guaranteed. Ensure that skb allocation and DMA mapping work for the empty frame, before proceeding to fill in igc_tx_buffer info, context descriptor, and data descriptor. Rate limit the error messages for skb allocation and DMA mapping failures. Update the comment to indicate that the 2 descriptors needed by the empty frame are already taken into consideration in igc_xmit_frame_ring(). Handle the case where the insertion of an empty frame fails and explain the reason behind this handling. Reviewed-by: Maciej Fijalkowski Signed-off-by: Song Yoong Siang --- drivers/net/ethernet/intel/igc/igc_main.c | 82 ++- 1 file changed, 50 insertions(+), 32 deletions(-) diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 84307bb7313e..1bfa71545e37 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -1092,7 +1092,8 @@ static int igc_init_empty_frame(struct igc_ring *ring, dma = dma_map_single(ring->dev, skb->data, size, DMA_TO_DEVICE); if (dma_mapping_error(ring->dev, dma)) { - netdev_err_once(ring->netdev, "Failed to map DMA for TX\n"); + net_err_ratelimited("%s: DMA mapping error for empty frame\n", + netdev_name(ring->netdev)); return -ENOMEM; } @@ -1108,20 +1109,12 @@ static int igc_init_empty_frame(struct igc_ring *ring, return 0; } -static int igc_init_tx_empty_descriptor(struct igc_ring *ring, - struct sk_buff *skb, - struct igc_tx_buffer *first) +static void igc_init_tx_empty_descriptor(struct igc_ring *ring, +struct sk_buff *skb, +struct igc_tx_buffer *first) { union igc_adv_tx_desc *desc; u32 cmd_type, olinfo_status; - int err; - - if (!igc_desc_unused(ring)) - return -EBUSY; - - err = igc_init_empty_frame(ring, first, skb); - if (err) - return err; cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT | IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD | @@ -1140,8 +1133,6 @@ static int igc_init_tx_empty_descriptor(struct igc_ring *ring, ring->next_to_use++; if (ring->next_to_use == ring->count) ring->next_to_use = 0; - - return 0; } #define IGC_EMPTY_FRAME_SIZE 60 @@ -1567,6 +1558,40 @@ static bool igc_request_tx_tstamp(struct igc_adapter *adapter, struct sk_buff *s return false; } +static int igc_insert_empty_frame(struct igc_ring *tx_ring) +{ + struct igc_tx_buffer *empty_info; + struct sk_buff *empty_skb; + void *data; + int ret; + + empty_info = &tx_ring->tx_buffer_info[tx_ring->next_to_use]; + empty_skb = alloc_skb(IGC_EMPTY_FRAME_SIZE, GFP_ATOMIC); + if (unlikely(!empty_skb)) { + net_err_ratelimited("%s: skb alloc error for empty frame\n", + netdev_name(tx_ring->netdev)); + return -ENOMEM; + } + + data = skb_put(empty_skb, IGC_EMPTY_FRAME_SIZE); + memset(data, 0, IGC_EMPTY_FRAME_SIZE); + + /* Prepare DMA mapping and Tx buffer information */ + ret = igc_init_empty_frame(tx_ring, empty_info, empty_skb); + if (unlikely(ret)) { + dev_kfree_skb_any(empty_skb); + return ret; + } + + /* Prepare advanced context descriptor for empty packet */ + igc_tx_ctxtdesc(tx_ring, 0, false, 0, 0, 0); + + /* Prepare advanced data descriptor for empty packet */ + igc_init_tx_empty_descriptor(tx_ring, empty_skb, empty_info); + + return 0; +} + static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, struct igc_ring *tx_ring) { @@ -1586,6 +1611,7 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, * + 1 desc for skb_headlen/IGC_MAX_DATA_PER_TXD, * + 2 desc gap to keep tail from touching head, * + 1 desc for context descriptor, +* + 2 desc for inserting an empty packet for launch time, * otherwise try next time */ for (f = 0; f < skb_shinfo(skb)->nr_frags; f++) @@ -1605,24 +1631,16 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, launch_time = igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty); if
Re: [Intel-wired-lan] [PATCH net-next v7 1/5] net: move ARFS rmap management to core
On Tue, 4 Feb 2025 15:06:18 -0700 Ahmed Zaki wrote: > +void netif_napi_set_irq_locked(struct napi_struct *napi, int irq) > +{ > + int rc; > + > + /* Remove existing rmap entries */ > + if (napi->dev->rx_cpu_rmap_auto && > + napi->irq != irq && napi->irq > 0) this condition gets a bit hairy by the end of the series. could you add a napi state bit that indicates that a notifier is installed? Then here: if (napi->irq == irq) return; if (test_and_clear_bit(NAPI_STATE_HAS_NOTIFIER, &napi->state)) irq_set_affinity_notifier(napi->irq, NULL); if (irq < 0) return; And you can similarly simplify napi_disable_locked(). Speaking of which, why do the auto-removal in napi_disable() rather than netif_napi_del() ? We don't reinstall on napi_enable() and doing a disable() + enable() is fairly common during driver reconfig. > + irq_set_affinity_notifier(napi->irq, NULL); > + > + napi->irq = irq; > + if (irq > 0) { > + rc = napi_irq_cpu_rmap_add(napi, irq); > + if (rc) > + netdev_warn(napi->dev, "Unable to update ARFS map > (%d)\n", nit: not sure I'd grasp this message as a user, maybe: "Unable to install aRFS CPU to Rx queue mapping" ? Not great either, I guess.
Re: [Intel-wired-lan] [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config
On Wed, 5 Feb 2025 08:20:20 -0700 Ahmed Zaki wrote: > >> + if (napi->dev->rx_cpu_rmap_auto) { > >>rc = napi_irq_cpu_rmap_add(napi, irq); > >>if (rc) > >>netdev_warn(napi->dev, "Unable to update ARFS map > >> (%d)\n", > >>rc); > >> + } else if (napi->config && napi->dev->irq_affinity_auto) { > >> + napi->notify.notify = netif_napi_irq_notify; > >> + napi->notify.release = netif_napi_affinity_release; > >> + > >> + rc = irq_set_affinity_notifier(irq, &napi->notify); > >> + if (rc) > >> + netdev_warn(napi->dev, "Unable to set IRQ notifier > >> (%d)\n", > >> + rc); > >>} > > > > Should there be a WARN_ON or WARN_ON_ONCE in here somewhere if the > > driver calls netif_napi_set_irq_locked but did not link NAPI config > > with a call to netif_napi_add_config? > > > > It seems like in that case the driver is buggy and a warning might > > be helpful. > > > > I think that is a good idea, if there is a new version I can add this in > the second part of the if: > > > if (WARN_ON_ONCE(!napi->config)) > return; To be clear, this will make it illegal to set IRQ on a NAPI instance before it's listed. Probably for the best if we also have auto-remove in netif_napi_del().
Re: [Intel-wired-lan] [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config
On Tue, 4 Feb 2025 15:06:19 -0700 Ahmed Zaki wrote: > + * @irq_affinity_auto: driver wants the core to manage the IRQ affinity. "manage" is probably too strong? "store" or "remember" ? Your commit message explains it quite nicely. > + * Set by netif_enable_irq_affinity(), then driver must > + * create persistent napi by netif_napi_add_config() > + * and finally bind napi to IRQ (netif_napi_set_irq). > + * > * @rx_cpu_rmap_auto: driver wants the core to manage the ARFS rmap. > * Set by calling netif_enable_cpu_rmap(). > * > @@ -2402,6 +2406,7 @@ struct net_device { > struct lock_class_key *qdisc_tx_busylock; > boolproto_down; > boolthreaded; > + boolirq_affinity_auto; > boolrx_cpu_rmap_auto; > > /* priv_flags_slow, ungrouped to save space */ > @@ -2662,6 +2667,11 @@ static inline void netdev_set_ml_priv(struct > net_device *dev, > dev->ml_priv_type = type; > } > > +static inline void netif_enable_irq_affinity(struct net_device *dev) Similar here, "enable affinity" is a bit strong. netif_remember_irq_affinity() would be more accurate IMHO > +{ > + dev->irq_affinity_auto = true; > +}
[Intel-wired-lan] [tnguy-next-queue:100GbE] BUILD SUCCESS d67c484647606b83c9e5cc7d445ff7d7d2012fa2
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue.git 100GbE branch HEAD: d67c484647606b83c9e5cc7d445ff7d7d2012fa2 ice: Implement PTP support for E830 devices elapsed time: 1448m configs tested: 93 configs skipped: 2 The following configs have been built successfully. More configs may be tested in the coming days. tested configs: alpha allnoconfiggcc-14.2.0 alphaallyesconfiggcc-14.2.0 arc allmodconfiggcc-13.2.0 arc allnoconfiggcc-13.2.0 arc allyesconfiggcc-13.2.0 arc randconfig-001-20250206gcc-13.2.0 arc randconfig-002-20250206gcc-13.2.0 arm allmodconfiggcc-14.2.0 arm allnoconfigclang-17 arm allyesconfiggcc-14.2.0 arm randconfig-001-20250206clang-21 arm randconfig-002-20250206clang-19 arm randconfig-003-20250206clang-21 arm randconfig-004-20250206gcc-14.2.0 arm64allmodconfigclang-18 arm64 allnoconfiggcc-14.2.0 arm64 randconfig-001-20250206clang-21 arm64 randconfig-002-20250206clang-21 arm64 randconfig-003-20250206clang-15 arm64 randconfig-004-20250206gcc-14.2.0 csky allnoconfiggcc-14.2.0 csky randconfig-001-20250206gcc-14.2.0 csky randconfig-002-20250206gcc-14.2.0 hexagon allmodconfigclang-21 hexagon allnoconfigclang-21 hexagon allyesconfigclang-18 hexagon randconfig-001-20250206clang-21 hexagon randconfig-002-20250206clang-18 i386 allmodconfiggcc-12 i386 allnoconfiggcc-12 i386 allyesconfiggcc-12 i386buildonly-randconfig-001-20250206clang-19 i386buildonly-randconfig-002-20250206gcc-12 i386buildonly-randconfig-003-20250206clang-19 i386buildonly-randconfig-004-20250206clang-19 i386buildonly-randconfig-005-20250206clang-19 i386buildonly-randconfig-006-20250206gcc-12 i386defconfigclang-19 loongarch allnoconfiggcc-14.2.0 loongarch randconfig-001-20250206gcc-14.2.0 loongarch randconfig-002-20250206gcc-14.2.0 m68k allnoconfiggcc-14.2.0 microblazeallnoconfiggcc-14.2.0 mips allnoconfiggcc-14.2.0 nios2 allnoconfiggcc-14.2.0 nios2 randconfig-001-20250206gcc-14.2.0 nios2 randconfig-002-20250206gcc-14.2.0 openrisc allyesconfiggcc-14.2.0 openriscdefconfiggcc-14.2.0 parisc allmodconfiggcc-14.2.0 parisc allyesconfiggcc-14.2.0 parisc defconfiggcc-14.2.0 pariscrandconfig-001-20250206gcc-14.2.0 pariscrandconfig-002-20250206gcc-14.2.0 powerpc randconfig-001-20250206clang-21 powerpc randconfig-002-20250206clang-19 powerpc randconfig-003-20250206clang-21 powerpc64 randconfig-001-20250206clang-15 powerpc64 randconfig-002-20250206clang-17 powerpc64 randconfig-003-20250206clang-19 riscv randconfig-001-20250206clang-19 riscv randconfig-002-20250206clang-17 s390 allmodconfigclang-19 s390 allyesconfiggcc-14.2.0 s390 randconfig-001-20250206clang-21 s390 randconfig-002-20250206gcc-14.2.0 sh allmodconfiggcc-14.2.0 shallnoconfiggcc-14.2.0 sh allyesconfiggcc-14.2.0 shrandconfig-001-20250206gcc-14.2.0 shrandconfig-002-20250206gcc-14.2.0 sparcallmodconfiggcc-14.2.0 sparc allnoconfiggcc-14.2.0 sparc randconfig-001-20250206gcc-14.2.0 sparc randconfig-002-20250206gcc-14.2.0 sparc64 randconfig-001-20250206gcc-14.2.0 sparc64 randconfig-002-20250206gcc-14.2.0 um allmodconfigclang-21 um
Re: [Intel-wired-lan] [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config
On Tue, Feb 04, 2025 at 03:06:17PM -0700, Ahmed Zaki wrote: > Drivers usually need to re-apply the user-set IRQ affinity to their IRQs > after reset. However, since there can be only one IRQ affinity notifier > for each IRQ, registering IRQ notifiers conflicts with the ARFS rmap > management in the core (which also registers separate IRQ affinity > notifiers). > > Move the IRQ affinity management to the napi struct. This way we can have > a unified IRQ notifier to re-apply the user-set affinity and also manage > the ARFS rmaps. The first patch moves the ARFS rmap management to CORE. > The second patch adds the IRQ affinity mask to napi_config and re-applies > the mask after reset. Patches 3-5 use the new API for bnxt, ice and idpf > drivers. If there's another version maybe adding this to netdevsim might be good? Was just thinking that if one day in the distant future netdev-genl was extended to expose the per NAPI affinity mask, a test could probably be written.
Re: [Intel-wired-lan] [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested
On Thu, Feb 6, 2025 at 2:24 PM Leon Romanovsky wrote: > > On Thu, Feb 06, 2025 at 02:16:08PM +0530, Bharat Bhushan wrote: > > Hi Leon, > > > > On Wed, Feb 5, 2025 at 11:50 PM Leon Romanovsky wrote: > > > > > > From: Leon Romanovsky > > > > > > XFRM offload path is probed even if offload isn't needed at all. Let's > > > make sure that x->type_offload pointer stays NULL for such path to > > > reduce ambiguity. > > > > > > Fixes: 9d389d7f84bb ("xfrm: Add a xfrm type offload.") > > > Signed-off-by: Leon Romanovsky > > > --- > > > include/net/xfrm.h | 12 +++- > > > net/xfrm/xfrm_device.c | 14 +- > > > net/xfrm/xfrm_state.c | 22 +- > > > net/xfrm/xfrm_user.c | 2 +- > > > 4 files changed, 30 insertions(+), 20 deletions(-) > > <...> > > > > + x->type_offload = xfrm_get_type_offload(x->id.proto, > > > x->props.family); > > > + if (!x->type_offload) { > > <...> > > > > + xfrm_put_type_offload(x->type_offload); > > > + x->type_offload = NULL; > > > > We always set type_offload to NULL. Can we move type_offload = NULL in > > xfrm_put_type_offload() ? > > We can, but it will require change to xfrm_get_type_offload() too, > otherwise we will get asymmetrical get/put. "x->type_offload = NULL" is always set after the put() function. so I thought that maybe moving "x->type_offload = NULL" to the put() function would simplify. Yes, get/put will be asymmetric. Maybe setting "x->type_offload" can be done in get/put(). Anyway it is not a major comment. ignore if this does not simplify. Thanks -Bharat > > Do you want something like that? > int xfrm_get_type_offload(struct xfrm_state *x); > void xfrm_put_type_offload(struct xfrm_state *x); > > Thansk > > > > > Thanks > > -Bharat > > > > > /* User explicitly requested packet offload mode and > > > configured > > > * policy in addition to the XFRM state. So be civil to > > > users, > > > * and return an error instead of taking fallback path. > > > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c > > > index ad2202fa82f3..568fe8df7741 100644 > > > --- a/net/xfrm/xfrm_state.c > > > +++ b/net/xfrm/xfrm_state.c > > > @@ -424,11 +424,12 @@ void xfrm_unregister_type_offload(const struct > > > xfrm_type_offload *type, > > > } > > > EXPORT_SYMBOL(xfrm_unregister_type_offload); > > > > > > -static const struct xfrm_type_offload * > > > -xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load) > > > +const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto, > > > + unsigned short > > > family) > > > { > > > const struct xfrm_type_offload *type = NULL; > > > struct xfrm_state_afinfo *afinfo; > > > + bool try_load = true; > > > > > > retry: > > > afinfo = xfrm_state_get_afinfo(family); > > > @@ -456,11 +457,7 @@ xfrm_get_type_offload(u8 proto, unsigned short > > > family, bool try_load) > > > > > > return type; > > > } > > > - > > > -static void xfrm_put_type_offload(const struct xfrm_type_offload *type) > > > -{ > > > - module_put(type->owner); > > > -} > > > +EXPORT_SYMBOL(xfrm_get_type_offload); > > > > > > static const struct xfrm_mode xfrm4_mode_map[XFRM_MODE_MAX] = { > > > [XFRM_MODE_BEET] = { > > > @@ -609,8 +606,6 @@ static void ___xfrm_state_destroy(struct xfrm_state > > > *x) > > > kfree(x->coaddr); > > > kfree(x->replay_esn); > > > kfree(x->preplay_esn); > > > - if (x->type_offload) > > > - xfrm_put_type_offload(x->type_offload); > > > if (x->type) { > > > x->type->destructor(x); > > > xfrm_put_type(x->type); > > > @@ -784,6 +779,9 @@ void xfrm_dev_state_free(struct xfrm_state *x) > > > struct xfrm_dev_offload *xso = &x->xso; > > > struct net_device *dev = READ_ONCE(xso->dev); > > > > > > + xfrm_put_type_offload(x->type_offload); > > > + x->type_offload = NULL; > > > + > > > if (dev && dev->xfrmdev_ops) { > > > spin_lock_bh(&xfrm_state_dev_gc_lock); > > > if (!hlist_unhashed(&x->dev_gclist)) > > > @@ -3122,7 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu) > > > } > > > EXPORT_SYMBOL_GPL(xfrm_state_mtu); > > > > > > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool > > > offload, > > > +int __xfrm_init_state(struct xfrm_state *x, bool init_replay, > > > struct netlink_ext_ack *extack) > > > { > > > const struct xfrm_mode *inner_mode; > > > @@ -3178,8 +3176,6 @@ int __xfrm_init_state(struct xfrm_state *x, bool > > > init_replay, bool offload, > > > goto error; > > > } > > > > > > - x->type_offload = xfrm_get_type_offload(x->id.proto, family, > > > offload); > > > - > > > err = x->type->init_state(x, extack); > > > if (err) > > >
Re: [Intel-wired-lan] [PATCH iwl-net] ice: health.c: fix compilation on gcc 7.5
On Wed, 5 Feb 2025 20:45:46 + Simon Horman wrote: > + Jiri > > On Wed, Feb 05, 2025 at 11:42:12AM +0100, Przemek Kitszel wrote: > > GCC 7 is not as good as GCC 8+ in telling what is a compile-time const, > > and thus could be used for static storage. So we could not use variables > > for that, no matter how much "const" keyword is sprinkled around. > > > > Excerpt from the report: > > My GCC is: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0. > > > > CC [M] drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.o > > drivers/net/ethernet/intel/ice/devlink/health.c:35:3: error: initializer > > element is not constant > >ice_common_port_solutions, {ice_port_number_label}}, > >^ > > drivers/net/ethernet/intel/ice/devlink/health.c:35:3: note: (near > > initialization for 'ice_health_status_lookup[0].solution') > > drivers/net/ethernet/intel/ice/devlink/health.c:35:31: error: initializer > > element is not constant > >ice_common_port_solutions, {ice_port_number_label}}, > >^ > > drivers/net/ethernet/intel/ice/devlink/health.c:35:31: note: (near > > initialization for 'ice_health_status_lookup[0].data_label[0]') > > drivers/net/ethernet/intel/ice/devlink/health.c:37:46: error: initializer > > element is not constant > >"Change or replace the module or cable.", {ice_port_number_label}}, > > ^ > > drivers/net/ethernet/intel/ice/devlink/health.c:37:46: note: (near > > initialization for 'ice_health_status_lookup[1].data_label[0]') > > drivers/net/ethernet/intel/ice/devlink/health.c:39:3: error: initializer > > element is not constant > >ice_common_port_solutions, {ice_port_number_label}}, > >^ > > > > Fixes: 85d6164ec56d ("ice: add fw and port health reporters") > > Reported-by: Qiuxu Zhuo > > Closes: > > https://lore.kernel.org/netdev/cy8pr11mb7134bf7a46d71e50d25fa7a989...@cy8pr11mb7134.namprd11.prod.outlook.com > > Signed-off-by: Przemek Kitszel > > --- > > I would really like to bump min gcc to 8.5 (RH 8 family), > > instead of supporting old Ubuntu. However SLES 15 is also stuck with gcc > > 7.5 :( > > > > CC: Linus Torvalds > > CC: Kees Cook > > CC: Nick Desaulniers > > Hi Prezemek, > > I ran into a similar problem not so long ago and I'm wondering if > the following, based on a suggestion by Jiri Slaby, resolves your > problem. I'm sure I remember from somewhere that although the variables are 'static const' they have to be real variables because they can still be patched. Which stops you using their contents as initialisers. Maybe I'm mis-remembering it. David > > diff --git a/drivers/net/ethernet/intel/ice/devlink/health.c > b/drivers/net/ethernet/intel/ice/devlink/health.c > index ea40f7941259..19c3d37aa768 100644 > --- a/drivers/net/ethernet/intel/ice/devlink/health.c > +++ b/drivers/net/ethernet/intel/ice/devlink/health.c > @@ -25,10 +25,10 @@ struct ice_health_status { > * The below lookup requires to be sorted by code. > */ > > -static const char *const ice_common_port_solutions = > +static const char ice_common_port_solutions[] = > "Check your cable connection. Change or replace the module or cable. > Manually set speed and duplex."; > -static const char *const ice_port_number_label = "Port Number"; > -static const char *const ice_update_nvm_solution = "Update to the latest NVM > image."; > +static const char ice_port_number_label[] = "Port Number"; > +static const char ice_update_nvm_solution[] = "Update to the latest NVM > image."; > > static const struct ice_health_status ice_health_status_lookup[] = { > {ICE_AQC_HEALTH_STATUS_ERR_UNKNOWN_MOD_STRICT, "An unsupported module > was detected.", > > > Link: > https://lore.kernel.org/netdev/485dbc5a-a04b-40c2-9481-955eaa5ce...@kernel.org/ > Link: https://git.kernel.org/netdev/net-next/c/36fb51479e3c >
Re: [Intel-wired-lan] [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested
Hi Leon, On Wed, Feb 5, 2025 at 11:50 PM Leon Romanovsky wrote: > > From: Leon Romanovsky > > XFRM offload path is probed even if offload isn't needed at all. Let's > make sure that x->type_offload pointer stays NULL for such path to > reduce ambiguity. > > Fixes: 9d389d7f84bb ("xfrm: Add a xfrm type offload.") > Signed-off-by: Leon Romanovsky > --- > include/net/xfrm.h | 12 +++- > net/xfrm/xfrm_device.c | 14 +- > net/xfrm/xfrm_state.c | 22 +- > net/xfrm/xfrm_user.c | 2 +- > 4 files changed, 30 insertions(+), 20 deletions(-) > > diff --git a/include/net/xfrm.h b/include/net/xfrm.h > index ed4b83696c77..28355a5be5b9 100644 > --- a/include/net/xfrm.h > +++ b/include/net/xfrm.h > @@ -464,6 +464,16 @@ struct xfrm_type_offload { > > int xfrm_register_type_offload(const struct xfrm_type_offload *type, > unsigned short family); > void xfrm_unregister_type_offload(const struct xfrm_type_offload *type, > unsigned short family); > +const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto, > + unsigned short family); > +static inline void xfrm_put_type_offload(const struct xfrm_type_offload > *type) > +{ > + if (!type) > + return; > + > + module_put(type->owner); > +} > + > > /** > * struct xfrm_mode_cbs - XFRM mode callbacks > @@ -1760,7 +1770,7 @@ void xfrm_spd_getinfo(struct net *net, struct > xfrmk_spdinfo *si); > u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq); > int xfrm_init_replay(struct xfrm_state *x, struct netlink_ext_ack *extack); > u32 xfrm_state_mtu(struct xfrm_state *x, int mtu); > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload, > +int __xfrm_init_state(struct xfrm_state *x, bool init_replay, > struct netlink_ext_ack *extack); > int xfrm_init_state(struct xfrm_state *x); > int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type); > diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c > index d1fa94e52cea..e01a7f5a4c75 100644 > --- a/net/xfrm/xfrm_device.c > +++ b/net/xfrm/xfrm_device.c > @@ -244,11 +244,6 @@ int xfrm_dev_state_add(struct net *net, struct > xfrm_state *x, > xfrm_address_t *daddr; > bool is_packet_offload; > > - if (!x->type_offload) { > - NL_SET_ERR_MSG(extack, "Type doesn't support offload"); > - return -EINVAL; > - } > - > if (xuo->flags & > ~(XFRM_OFFLOAD_IPV6 | XFRM_OFFLOAD_INBOUND | > XFRM_OFFLOAD_PACKET)) { > NL_SET_ERR_MSG(extack, "Unrecognized flags in offload > request"); > @@ -310,6 +305,13 @@ int xfrm_dev_state_add(struct net *net, struct > xfrm_state *x, > return -EINVAL; > } > > + x->type_offload = xfrm_get_type_offload(x->id.proto, x->props.family); > + if (!x->type_offload) { > + NL_SET_ERR_MSG(extack, "Type doesn't support offload"); > + dev_put(dev); > + return -EINVAL; > + } > + > xso->dev = dev; > netdev_tracker_alloc(dev, &xso->dev_tracker, GFP_ATOMIC); > xso->real_dev = dev; > @@ -332,6 +334,8 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state > *x, > netdev_put(dev, &xso->dev_tracker); > xso->type = XFRM_DEV_OFFLOAD_UNSPECIFIED; > > + xfrm_put_type_offload(x->type_offload); > + x->type_offload = NULL; We always set type_offload to NULL. Can we move type_offload = NULL in xfrm_put_type_offload() ? Thanks -Bharat > /* User explicitly requested packet offload mode and > configured > * policy in addition to the XFRM state. So be civil to users, > * and return an error instead of taking fallback path. > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c > index ad2202fa82f3..568fe8df7741 100644 > --- a/net/xfrm/xfrm_state.c > +++ b/net/xfrm/xfrm_state.c > @@ -424,11 +424,12 @@ void xfrm_unregister_type_offload(const struct > xfrm_type_offload *type, > } > EXPORT_SYMBOL(xfrm_unregister_type_offload); > > -static const struct xfrm_type_offload * > -xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load) > +const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto, > + unsigned short family) > { > const struct xfrm_type_offload *type = NULL; > struct xfrm_state_afinfo *afinfo; > + bool try_load = true; > > retry: > afinfo = xfrm_state_get_afinfo(family); > @@ -456,11 +457,7 @@ xfrm_get_type_offload(u8 proto, unsigned short family, > bool try_load) > > return type; > } > - > -static void xfrm_put_type_offload(const struct xfrm_type_offload *type) > -{ > - module_put(type->owner); > -} > +EXPORT_SYMBOL(xfrm_get_type_offload); > > static const struct xfrm_mode xfrm4_mode_map[XFRM_MOD
Re: [Intel-wired-lan] [PATCH] igc: Fix HW RX timestamp when passed by ZC XDP
On 28/01/2025 14:26, Zdenek Bouska wrote: Fixes HW RX timestamp in the following scenario: - AF_PACKET socket with enabled HW RX timestamps is created - AF_XDP socket with enabled zero copy is created - frame is forwarded to the BPF program, where the timestamp should still be readable (extracted by igc_xdp_rx_timestamp(), kfunc behind bpf_xdp_metadata_rx_timestamp()) - the frame got XDP_PASS from BPF program, redirecting to the stack - AF_PACKET socket receives the frame with HW RX timestamp Moves the skb timestamp setting from igc_dispatch_skb_zc() to igc_construct_skb_zc() so that igc_construct_skb_zc() is similar to igc_construct_skb(). This issue can also be reproduced by running: # tools/testing/selftests/bpf/xdp_hw_metadata enp1s0 When a frame with the wrong port 9092 (instead of 9091) is used: # echo -n xdp | nc -u -q1 192.168.10.9 9092 then the RX timestamp is missing and xdp_hw_metadata prints: skb hwtstamp is not found! With this fix or when copy mode is used: # tools/testing/selftests/bpf/xdp_hw_metadata -c enp1s0 then RX timestamp is found and xdp_hw_metadata prints: found skb hwtstamp = 1736509937.852786132 Fixes: 069b142f5819 ("igc: Add support for PTP .getcyclesx64()") Signed-off-by: Zdenek Bouska Acked-by: Vinicius Costa Gomes Reviewed-by: Simon Horman Reviewed-by: Florian Bezdeka Reviewed-by: Song Yoong Siang --- drivers/net/ethernet/intel/igc/igc_main.c | 21 - 1 file changed, 12 insertions(+), 9 deletions(-) Tested-by: Mor Bar-Gabay
Re: [Intel-wired-lan] [PATCH ipsec-next 1/5] xfrm: delay initialization of offload path till its actually requested
On Thu, Feb 06, 2025 at 07:29:13PM +0530, Bharat Bhushan wrote: > On Thu, Feb 6, 2025 at 2:24 PM Leon Romanovsky wrote: > > > > On Thu, Feb 06, 2025 at 02:16:08PM +0530, Bharat Bhushan wrote: > > > Hi Leon, > > > > > > On Wed, Feb 5, 2025 at 11:50 PM Leon Romanovsky wrote: > > > > > > > > From: Leon Romanovsky > > > > > > > > XFRM offload path is probed even if offload isn't needed at all. Let's > > > > make sure that x->type_offload pointer stays NULL for such path to > > > > reduce ambiguity. > > > > > > > > Fixes: 9d389d7f84bb ("xfrm: Add a xfrm type offload.") > > > > Signed-off-by: Leon Romanovsky > > > > --- > > > > include/net/xfrm.h | 12 +++- > > > > net/xfrm/xfrm_device.c | 14 +- > > > > net/xfrm/xfrm_state.c | 22 +- > > > > net/xfrm/xfrm_user.c | 2 +- > > > > 4 files changed, 30 insertions(+), 20 deletions(-) > > > > <...> > > > > > > + x->type_offload = xfrm_get_type_offload(x->id.proto, > > > > x->props.family); > > > > + if (!x->type_offload) { > > > > <...> > > > > > > + xfrm_put_type_offload(x->type_offload); > > > > + x->type_offload = NULL; > > > > > > We always set type_offload to NULL. Can we move type_offload = NULL in > > > xfrm_put_type_offload() ? > > > > We can, but it will require change to xfrm_get_type_offload() too, > > otherwise we will get asymmetrical get/put. > > "x->type_offload = NULL" is always set after the put() function. so I > thought that maybe moving "x->type_offload = NULL" to the put() > function would simplify. > Yes, get/put will be asymmetric. Maybe setting "x->type_offload" can > be done in get/put(). > Anyway it is not a major comment. ignore if this does not simplify. Thanks, let's wait for other comments. If I need respin the series, I'll change the functions to the proposed below. Thanks > > Thanks > -Bharat > > > > > Do you want something like that? > > int xfrm_get_type_offload(struct xfrm_state *x); > > void xfrm_put_type_offload(struct xfrm_state *x); > > > > Thansk > > > > > > > > Thanks > > > -Bharat > > > > > > > /* User explicitly requested packet offload mode and > > > > configured > > > > * policy in addition to the XFRM state. So be civil to > > > > users, > > > > * and return an error instead of taking fallback path. > > > > diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c > > > > index ad2202fa82f3..568fe8df7741 100644 > > > > --- a/net/xfrm/xfrm_state.c > > > > +++ b/net/xfrm/xfrm_state.c > > > > @@ -424,11 +424,12 @@ void xfrm_unregister_type_offload(const struct > > > > xfrm_type_offload *type, > > > > } > > > > EXPORT_SYMBOL(xfrm_unregister_type_offload); > > > > > > > > -static const struct xfrm_type_offload * > > > > -xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load) > > > > +const struct xfrm_type_offload *xfrm_get_type_offload(u8 proto, > > > > + unsigned short > > > > family) > > > > { > > > > const struct xfrm_type_offload *type = NULL; > > > > struct xfrm_state_afinfo *afinfo; > > > > + bool try_load = true; > > > > > > > > retry: > > > > afinfo = xfrm_state_get_afinfo(family); > > > > @@ -456,11 +457,7 @@ xfrm_get_type_offload(u8 proto, unsigned short > > > > family, bool try_load) > > > > > > > > return type; > > > > } > > > > - > > > > -static void xfrm_put_type_offload(const struct xfrm_type_offload *type) > > > > -{ > > > > - module_put(type->owner); > > > > -} > > > > +EXPORT_SYMBOL(xfrm_get_type_offload); > > > > > > > > static const struct xfrm_mode xfrm4_mode_map[XFRM_MODE_MAX] = { > > > > [XFRM_MODE_BEET] = { > > > > @@ -609,8 +606,6 @@ static void ___xfrm_state_destroy(struct xfrm_state > > > > *x) > > > > kfree(x->coaddr); > > > > kfree(x->replay_esn); > > > > kfree(x->preplay_esn); > > > > - if (x->type_offload) > > > > - xfrm_put_type_offload(x->type_offload); > > > > if (x->type) { > > > > x->type->destructor(x); > > > > xfrm_put_type(x->type); > > > > @@ -784,6 +779,9 @@ void xfrm_dev_state_free(struct xfrm_state *x) > > > > struct xfrm_dev_offload *xso = &x->xso; > > > > struct net_device *dev = READ_ONCE(xso->dev); > > > > > > > > + xfrm_put_type_offload(x->type_offload); > > > > + x->type_offload = NULL; > > > > + > > > > if (dev && dev->xfrmdev_ops) { > > > > spin_lock_bh(&xfrm_state_dev_gc_lock); > > > > if (!hlist_unhashed(&x->dev_gclist)) > > > > @@ -3122,7 +3120,7 @@ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu) > > > > } > > > > EXPORT_SYMBOL_GPL(xfrm_state_mtu); > > > > > > > > -int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool > > > > offload, > > > > +int __xfrm_init_state(struct xfrm_state *x, bool init_replay, > > > >
Re: [Intel-wired-lan] [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification
On Thu, Feb 06, 2025 at 10:40:11PM +0800, Abdul Rahim, Faizal wrote: > > Hi Vladimir, > > Thanks for the quick review, appreciate your help. > > On 6/2/2025 1:12 am, Vladimir Oltean wrote: > > On Wed, Feb 05, 2025 at 05:05:20AM -0500, Faizal Rahim wrote: > > > This patch implements the "ethtool --set-mm" callback to trigger the > > > frame preemption verification handshake. > > > > > > Uses the MAC Merge Software Verification (mmsv) mechanism in ethtool > > > to perform the verification handshake for igc. > > > The structure fpe.mmsv is set by mmsv in ethtool and should remain > > > read-only for the driver. > > > > > > igc does not use two mmsv callbacks: > > > a) configure_tx() > > > - igc lacks registers to configure FPE in the transmit direction. > > > > Yes, maybe, but it's still important to handle this. It tells you when > > the preemptible traffic classes should be sent as preemptible on the wire > > (i.e. when the verification is either disabled, or it succeeded). > > > > There is a selftest called manual_failed_verification() which supposedly > > tests this exact condition: if verification fails, then packets sent to > > TC0 are supposed to bump the eMAC's TX counters, even though TC0 is > > configured as preemptible. Otherwise stated: even if the tc program says > > that a certain traffic class is preemptible, you don't want to actually > > send preemptible packets if you haven't verified the link partner can > > handle them, since it will likely drop them on RX otherwise. > > Even though fpe in tx direction isn't set in igc, it still checks > ethtool_mmsv_is_tx_active() before setting a queue as preemptible. > > This is done in : > igc_tsn_enable_offload(struct igc_adapter *adapter) { > { > > if (ethtool_mmsv_is_tx_active(&adapter->fpe.mmsv) && > ring->preemptible) > txqctl |= IGC_TXQCTL_PREEMPTIBLE; > > > Wouldn't this handle the situation mentioned ? > Sorry if I miss something here. And what if tx_active becomes true after you had already configured the queues with tc (and the above check caused IGC_TXQCTL_PREEMPTIBLE to not be set)? Shouldn't you set IGC_TXQCTL_PREEMPTIBLE now? Isn't ethtool_mmsv_configure_tx() exactly the function that notifies you of changes to tx_active, and hence, aren't you interested in setting up a callback for it? Or is igc_tsn_reset() -> igc_tsn_enable_offload() called through some other path, after verification succeeds, that I'm not seeing? I don't think so. Maybe coincidentally, but not guaranteed. > I briefly checked the driver code and the i226 SW User Manual. > > The code calls igc_reset_task() → igc_reset() → igc_reinit_locked() → > igc_down() → igc_reset() → igc_power_down_phy_copper_base(). > > I suspect igc_power_down_phy_copper_base() contributes to the link loss, but > there are likely other factors as well. > > The SW User Manual states that a software reset (CTRL.DEV_RST) affects > several components, including "Port Configuration Autoload from NVM, Data > Path, On-die Memories, MAC, PCS, Auto Negotiation and other Port-related > Logic, Function Queue Enable, Function Interrupt and Statistics registers, > Wake-up Management Registers, and Memory Configuration Registers." > > Given this, right now, I’m unsure about the feasibility of making this > change, though I see the benefits mentioned. > Validating this would require significant exploration—i.e., investigating > the code, running experiments, reviewing commit history, confirming hardware > expectations from the SW manual, and consulting other hardware SMEs. > > Resetting the adapter is a common mechanism in igc that relies on shared > functions, which can be triggered in various scenarios. Modifying this > behavior (if feasible) could introduce some risks and would likely require > additional testing across those scenarios. Focusing on this right now might > delay the current series, which is primarily aimed at enabling Qbu on > i225/6. > > Would it be alright if I explore this separately from this Qbu series? Ok - as I said, it's not as if I couldn't eventually tolerate the workaround you chose. We'd be putting a dependency of this feature on some unrelated thing with a high degree of uncertainty. For example, I asked if this is fiber or a twisted pair PHY, because while for the copper PHY the issue might be more tractable (clause 28 autoneg on the media side is completely independent of what the host side NIC is doing - it may reset, it may do whatever - unless the MAC provides a clock that is important for PHY operation), I do wonder if anything at all could be done to avoid the link partner seeing a link loss over fiber, because there, the connection is direct PCS-to-PCS. If you have to reset, you have to reset, and then pick up the pieces, I understand. It's good you're committing to look into this in more depth, though.
[Intel-wired-lan] [PATCH iwl-net] ixgbe: fix media cage present detection for E610 device
The commit 23c0e5a16bcc ("ixgbe: Add link management support for E610 device") introduced incorrect checking of media cage presence for E610 device. Fix it. Fixes: 23c0e5a16bcc ("ixgbe: Add link management support for E610 device") Reported-by: Dan Carpenter Closes: https://lore.kernel.org/all/e7d73b32-f12a-49d1-8b60-1ef83359ec13@stanley.mountain/ Reviewed-by: Michal Swiatkowski Reviewed-by: Przemek Kitszel Signed-off-by: Piotr Kwapulinski --- drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c index 683c668..cb07ecd 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c @@ -1122,7 +1122,7 @@ static bool ixgbe_is_media_cage_present(struct ixgbe_hw *hw) * returns error (ENOENT), then no cage present. If no cage present then * connection type is backplane or BASE-T. */ - return ixgbe_aci_get_netlist_node(hw, cmd, NULL, NULL); + return !ixgbe_aci_get_netlist_node(hw, cmd, NULL, NULL); } /** -- 2.43.0