> -----Original Message----- > From: Marcin Szycik <marcin.szy...@linux.intel.com> > Sent: Wednesday, April 3, 2024 11:21 AM > To: Loktionov, Aleksandr <aleksandr.loktio...@intel.com>; intel- > wired-...@lists.osuosl.org > Cc: Drewek, Wojciech <wojciech.dre...@intel.com>; Wang, Liang-min > <liang-min.w...@intel.com>; net...@vger.kernel.org; Chmielewski, > Pawel <pawel.chmielew...@intel.com>; Nguyen, Anthony L > <anthony.l.ngu...@intel.com>; ho...@kernel.org; Kitszel, Przemyslaw > <przemyslaw.kits...@intel.com> > Subject: Re: [Intel-wired-lan] [PATCH iwl-next v4] ice: Reset VF on > Tx MDD event > > > > On 03.04.2024 08:37, Loktionov, Aleksandr wrote: > > > > > >> -----Original Message----- > >> From: Intel-wired-lan <intel-wired-lan-boun...@osuosl.org> On > Behalf > >> Of Marcin Szycik > >> Sent: Tuesday, April 2, 2024 6:52 PM > >> To: intel-wired-...@lists.osuosl.org > >> Cc: Drewek, Wojciech <wojciech.dre...@intel.com>; Wang, Liang- > min > >> <liang-min.w...@intel.com>; net...@vger.kernel.org; Chmielewski, > >> Pawel <pawel.chmielew...@intel.com>; Marcin Szycik > >> <marcin.szy...@linux.intel.com>; Nguyen, Anthony L > >> <anthony.l.ngu...@intel.com>; ho...@kernel.org; Kitszel, > Przemyslaw > >> <przemyslaw.kits...@intel.com> > >> Subject: [Intel-wired-lan] [PATCH iwl-next v4] ice: Reset VF on > Tx > >> MDD event > >> > > Please state in the title explicitly the purpose of the patch: > > Do you fix a bug? say fix > > Do you add functionality? say add > > Do you refactor? say refactor > > IMO it's neither of those. > > It's not a bugfix because the driver itself doesn't do anything > wrong; the patch provides the user with a kind of optional > workaround for an userspace app that's behaving incorrectly. > > It's not really new functionality, since the mdd-auto-reset-vf flag > was already implemented and auto reset was already happening for Rx > events. I'm just extending this to Tx as well. For me it looks like adding new feature, but I'm not sure yet.
> > And it's not strictly a refactor because I'm changing behaviour. Please describe explicitly what behavior do you change, and the motivation, especially if it's not a bugfix. > Please also note that I've picked up a patch not originally made by > me [2], and kept the original title and most of the description. > I'm open to suggestions to title though. > > >> In cases when VF sends malformed packets that are classified as > >> malicious, sometimes it causes Tx queue to freeze. This frozen > queue > >> can be stuck for several minutes being unusable. This behavior > can be > >> reproduced with a faulty userspace app running on VF. > >> > >> When any Malicious Driver Detection event occurs and the mdd- > auto- > >> reset-vf private flag is set, perform a graceful VF reset to > quickly > >> bring VF back to operational state. Add a log message to notify > about > >> the cause of the reset. Add a helper for this to be reused for > both > >> TX and RX events. > > Please describe your changes explicitly: > > Do you just add a helper function without adding a new > functionality? > > Do you add functionality? > > Do you fix? > > Please explain what are the changes in driver behavior and the > changes in the driver sources you make. > > Thank you > > I'll try to make the description more explicit. Thank you > > Thanks, > Marcin > > >> Reviewed-by: Wojciech Drewek <wojciech.dre...@intel.com> > >> Reviewed-by: Przemek Kitszel <przemyslaw.kits...@intel.com> > >> Co-developed-by: Liang-Min Wang <liang-min.w...@intel.com> > >> Signed-off-by: Liang-Min Wang <liang-min.w...@intel.com> > >> Signed-off-by: Marcin Szycik <marcin.szy...@linux.intel.com> > >> --- > >> v4: Only perform auto-reset once per VF > >> v3 [1]: Only auto reset VF if the mdd-auto-reset-vf flag is set > >> v2 [2]: Revert an unneeded formatting change, fix commit > message, fix > >> a log > >> message with a correct event name > >> > >> [1] https://lore.kernel.org/intel-wired- > lan/20240326164455.735739- > >> 1-marcin.szy...@linux.intel.com > >> [2] https://lore.kernel.org/netdev/20231102155149.2574209-1- > >> pawel.chmielew...@intel.com > >> --- > >> drivers/net/ethernet/intel/ice/ice_main.c | 57 > +++++++++++++++++- > >> ---- drivers/net/ethernet/intel/ice/ice_sriov.c | 25 +++++++--- > >> drivers/net/ethernet/intel/ice/ice_sriov.h | 2 + > >> 3 files changed, 67 insertions(+), 17 deletions(-) > >> > >> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c > >> b/drivers/net/ethernet/intel/ice/ice_main.c > >> index 185c9b13efcf..80bc83f6e1ab 100644 > >> --- a/drivers/net/ethernet/intel/ice/ice_main.c > >> +++ b/drivers/net/ethernet/intel/ice/ice_main.c > >> @@ -1745,6 +1745,39 @@ static void ice_service_timer(struct > >> timer_list *t) > >> ice_service_task_schedule(pf); > >> } > >> > >> +/** > >> + * ice_mdd_maybe_reset_vf - reset VF after MDD event > >> + * @pf: pointer to the PF structure > >> + * @vf: pointer to the VF structure > >> + * @reset_vf_tx: whether Tx MDD has occurred > >> + * @reset_vf_rx: whether Rx MDD has occurred > >> + * > >> + * Since the queue can get stuck on VF MDD events, the PF can > be > >> +configured to > >> + * automatically reset the VF by enabling the private ethtool > flag > >> + * mdd-auto-reset-vf. > >> + */ > >> +static void ice_mdd_maybe_reset_vf(struct ice_pf *pf, struct > >> ice_vf *vf, > >> + bool reset_vf_tx, bool reset_vf_rx) { > >> + struct device *dev = ice_pf_to_dev(pf); > >> + > >> + if (!test_bit(ICE_FLAG_MDD_AUTO_RESET_VF, pf->flags)) > >> + return; > >> + > >> + /* VF MDD event counters will be cleared by reset, so print > >> the event > >> + * prior to reset. > >> + */ > >> + if (reset_vf_tx) > >> + ice_print_vf_tx_mdd_event(vf); > >> + > >> + if (reset_vf_rx) > >> + ice_print_vf_rx_mdd_event(vf); > >> + > >> + dev_info(dev, "PF-to-VF reset on PF %d VF %d due to MDD > >> event\n", > >> + pf->hw.pf_id, vf->vf_id); > >> + ice_reset_vf(vf, ICE_VF_RESET_NOTIFY | ICE_VF_RESET_LOCK); } > >> + > >> /** > >> * ice_handle_mdd_event - handle malicious driver detect event > >> * @pf: pointer to the PF structure > >> @@ -1838,6 +1871,8 @@ static void ice_handle_mdd_event(struct > ice_pf > >> *pf) > >> */ > >> mutex_lock(&pf->vfs.table_lock); > >> ice_for_each_vf(pf, bkt, vf) { > >> + bool reset_vf_tx = false, reset_vf_rx = false; > >> + > >> reg = rd32(hw, VP_MDET_TX_PQM(vf->vf_id)); > >> if (reg & VP_MDET_TX_PQM_VALID_M) { > >> wr32(hw, VP_MDET_TX_PQM(vf->vf_id), 0xFFFF); @@ - > >> 1846,6 +1881,8 @@ static void ice_handle_mdd_event(struct ice_pf > >> *pf) > >> if (netif_msg_tx_err(pf)) > >> dev_info(dev, "Malicious Driver Detection > event TX_PQM detected > >> on VF %d\n", > >> vf->vf_id); > >> + > >> + reset_vf_tx = true; > >> } > >> > >> reg = rd32(hw, VP_MDET_TX_TCLAN(vf->vf_id)); @@ -1856,6 > >> +1893,8 @@ static void ice_handle_mdd_event(struct ice_pf *pf) > >> if (netif_msg_tx_err(pf)) > >> dev_info(dev, "Malicious Driver Detection > event TX_TCLAN > >> detected on VF %d\n", > >> vf->vf_id); > >> + > >> + reset_vf_tx = true; > >> } > >> > >> reg = rd32(hw, VP_MDET_TX_TDPU(vf->vf_id)); @@ -1866,6 > >> +1905,8 @@ static void ice_handle_mdd_event(struct ice_pf *pf) > >> if (netif_msg_tx_err(pf)) > >> dev_info(dev, "Malicious Driver Detection > event TX_TDPU detected > >> on VF %d\n", > >> vf->vf_id); > >> + > >> + reset_vf_tx = true; > >> } > >> > >> reg = rd32(hw, VP_MDET_RX(vf->vf_id)); @@ -1877,18 > >> +1918,12 @@ static void ice_handle_mdd_event(struct ice_pf *pf) > >> dev_info(dev, "Malicious Driver Detection > event RX detected on > >> VF %d\n", > >> vf->vf_id); > >> > >> - /* Since the queue is disabled on VF Rx MDD > >> events, the > >> - * PF can be configured to reset the VF through > >> ethtool > >> - * private flag mdd-auto-reset-vf. > >> - */ > >> - if (test_bit(ICE_FLAG_MDD_AUTO_RESET_VF, pf- > >>> flags)) { > >> - /* VF MDD event counters will be cleared by > >> - * reset, so print the event prior to > >> reset. > >> - */ > >> - ice_print_vf_rx_mdd_event(vf); > >> - ice_reset_vf(vf, ICE_VF_RESET_LOCK); > >> - } > >> + reset_vf_rx = true; > >> } > >> + > >> + if (reset_vf_tx || reset_vf_rx) > >> + ice_mdd_maybe_reset_vf(pf, vf, reset_vf_tx, > >> + reset_vf_rx); > >> } > >> mutex_unlock(&pf->vfs.table_lock); > >> > >> diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c > >> b/drivers/net/ethernet/intel/ice/ice_sriov.c > >> index fb2e96db647e..a60dacf8942a 100644 > >> --- a/drivers/net/ethernet/intel/ice/ice_sriov.c > >> +++ b/drivers/net/ethernet/intel/ice/ice_sriov.c > >> @@ -1861,6 +1861,24 @@ void ice_print_vf_rx_mdd_event(struct > ice_vf > >> *vf) > >> ? "on" : "off"); > >> } > >> > >> +/** > >> + * ice_print_vf_tx_mdd_event - print VF Tx malicious driver > detect > >> +event > >> + * @vf: pointer to the VF structure > >> + */ > >> +void ice_print_vf_tx_mdd_event(struct ice_vf *vf) { > >> + struct ice_pf *pf = vf->pf; > >> + struct device *dev; > >> + > >> + dev = ice_pf_to_dev(pf); > >> + > >> + dev_info(dev, "%d Tx Malicious Driver Detection events > >> detected on PF %d VF %d MAC %pM. mdd-auto-reset-vfs=%s\n", > >> + vf->mdd_tx_events.count, pf->hw.pf_id, vf->vf_id, > >> + vf->dev_lan_addr, > >> + test_bit(ICE_FLAG_MDD_AUTO_RESET_VF, pf->flags) > >> + ? "on" : "off"); > >> +} > >> + > >> /** > >> * ice_print_vfs_mdd_events - print VFs malicious driver detect > >> event > >> * @pf: pointer to the PF structure > >> @@ -1869,8 +1887,6 @@ void ice_print_vf_rx_mdd_event(struct > ice_vf > >> *vf) > >> */ > >> void ice_print_vfs_mdd_events(struct ice_pf *pf) { > >> - struct device *dev = ice_pf_to_dev(pf); > >> - struct ice_hw *hw = &pf->hw; > >> struct ice_vf *vf; > >> unsigned int bkt; > >> > >> @@ -1897,10 +1913,7 @@ void ice_print_vfs_mdd_events(struct > ice_pf > >> *pf) > >> if (vf->mdd_tx_events.count != vf- > >>> mdd_tx_events.last_printed) { > >> vf->mdd_tx_events.last_printed = > >> vf->mdd_tx_events.count; > >> - > >> - dev_info(dev, "%d Tx Malicious Driver Detection > >> events detected on PF %d VF %d MAC %pM.\n", > >> - vf->mdd_tx_events.count, hw->pf_id, vf- > >>> vf_id, > >> - vf->dev_lan_addr); > >> + ice_print_vf_tx_mdd_event(vf); > >> } > >> } > >> mutex_unlock(&pf->vfs.table_lock); > >> diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.h > >> b/drivers/net/ethernet/intel/ice/ice_sriov.h > >> index 4ba8fb53aea1..8f22313474d6 100644 > >> --- a/drivers/net/ethernet/intel/ice/ice_sriov.h > >> +++ b/drivers/net/ethernet/intel/ice/ice_sriov.h > >> @@ -58,6 +58,7 @@ void > >> ice_vf_lan_overflow_event(struct ice_pf *pf, struct > >> ice_rq_event_info *event); void ice_print_vfs_mdd_events(struct > >> ice_pf *pf); void ice_print_vf_rx_mdd_event(struct ice_vf *vf); > >> +void ice_print_vf_tx_mdd_event(struct ice_vf *vf); > >> bool > >> ice_vc_validate_pattern(struct ice_vf *vf, struct > >> virtchnl_proto_hdrs *proto); > >> u32 ice_sriov_get_vf_total_msix(struct pci_dev *pdev); @@ -69,6 > >> +70,7 @@ static inline void ice_vf_lan_overflow_event(struct > >> ice_pf *pf, struct ice_rq_event_info *event) { } static inline > void > >> ice_print_vfs_mdd_events(struct ice_pf *pf) { } static inline > void > >> ice_print_vf_rx_mdd_event(struct ice_vf *vf) { } > >> +static inline void ice_print_vf_tx_mdd_event(struct ice_vf *vf) > { > >> } > >> static inline void ice_restore_all_vfs_msi_state(struct ice_pf > >> *pf) { } > >> > >> static inline int > >> -- > >> 2.41.0 > >