From: Manoj Vishwanathan <manojvi...@google.com> Date: Wed, 7 Aug 2024 06:58:59 -0700
> Thanks Przemek & Olek for your quick feedback and responses. > Hi Olek, > I can add more details about the issue we faced in the commit message. > The bug we had here was a virtchnl delay leading to the xn->salt > mismatch. This could be due to several factors including default CPU > bounded kworker workqueue for virtchnl message processing being That's why I always tell people to stop creating more and more private workqueues and just use the system ones, there's a whole collection for every need... > starved by aggressive userspace load causing the virtchnl to be > delayed. While debugging this issue, this locking order appeared like > a potential issue, hence the change was made. > But, this change is more a clean up we felt based on concurrent access > to the virtchnl transaction struct and does not fix the issue. This is > more of the patch to do the right thing before we access the "xn". > I wanted to start with a first patch to the community for acceptance > followed by a series of other patches that are general clean up or > improvements to IDPF in general. Will follow with with [PATCH v3] Maybe you'd prepare a full series then right away? I hope it won't conflict much with my tree (but you always can double-check[0]) (Chapter II is already posted here on IWL and netdev@) > > Thanks, > Manoj > > On Wed, Aug 7, 2024 at 4:05 AM Alexander Lobakin > <aleksander.loba...@intel.com> wrote: >> >> From: Manoj Vishwanathan <manojvi...@google.com> >> Date: Mon, 5 Aug 2024 18:21:59 +0000 >> >>> The transaction salt was being accessed before acquiring the >>> idpf_vc_xn_lock when idpf has to forward the virtchnl reply. >> >> You need to explain in details here what issue you have faced due to >> that, how to reproduce it and how this fix does help. >> Otherwise, it's impossible to suggest what is happening and how to test >> whether the fix is correct. >> >>> >>> Fixes: 34c21fa894a1a (“idpf: implement virtchnl transaction manager”) >>> Signed-off-by: Manoj Vishwanathan <manojvi...@google.com> >>> --- >>> drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c >>> b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c >>> index 70986e12da28..30eec674d594 100644 >>> --- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c >>> +++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c >>> @@ -612,14 +612,15 @@ idpf_vc_xn_forward_reply(struct idpf_adapter *adapter, >>> return -EINVAL; >>> } >>> xn = &adapter->vcxn_mngr->ring[xn_idx]; >>> + idpf_vc_xn_lock(xn); >>> salt = FIELD_GET(IDPF_VC_XN_SALT_M, msg_info); >> >> The lock can be taken here after the FIELD_GET(), not before, to reduce >> the critical/locked section execution time. >> >>> if (xn->salt != salt) { >>> dev_err_ratelimited(&adapter->pdev->dev, "Transaction salt >>> does not match (%02x != %02x)\n", >>> xn->salt, salt); >>> + idpf_vc_xn_unlock(xn); >>> return -EINVAL; >>> } >>> >>> - idpf_vc_xn_lock(xn); >>> switch (xn->state) { >>> case IDPF_VC_XN_WAITING: >>> /* success */ >> >> Thanks, >> Olek [0] https://github.com/alobakin/linux/commits/idpf-libie-new Thanks, Olek