Hi, Honnappa

I'm sorry for delay - I had to play with patch, to compile this one with 
assembly listings
and check what code is actually generated on x86/ARM.

On x86 there is no difference at all (with and w/o patch), so no objection from 
my side.
On ARM we have:
w/o patch: dmb oshld
with patch: dmb ishld

What is the purpose of the barrier - to not allow the CQE read access 
reordering.
On x86, "compiler barrier" is quite enough (due to strong load/store ordering).
On ARM load/store might be reordered, AFAIU.

CQE resides in the host memory and can be directly written by the NIC via PCIe.
(In my understanding it can cause core cache(s) invalidations). 
I have a question - should we consider this as outer sharable domain?
Is it safe to have a barrier for inner domain only in our case?

We have updated the cqe_check() routine, sorry for this. Could you, please,
update the patch and send v3?

With best regards,
Slava

> -----Original Message-----
> From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Sent: вторник, 15 ноября 2022 г. 03:46
> To: Slava Ovsiienko <viachesl...@nvidia.com>; dev@dpdk.org; Ruifeng Wang
> <ruifeng.w...@arm.com>; Matan Azrad <ma...@nvidia.com>; Shahaf Shuler
> <shah...@nvidia.com>
> Cc: nd <n...@arm.com>; Matan Azrad <ma...@nvidia.com>; sta...@dpdk.org;
> nd <n...@arm.com>
> Subject: RE: [PATCH v2] net/mlx5: use just sufficient barrier for Arm 
> platforms
> 
> <snip>
> 
> >
> > >
> > > Hi, Honnappa
> > Hi Slava, thanks for the feedback.
> >
> > >
> > > We discussed the barrier here:
> > > http://patches.dpdk.org/project/dpdk/patch/20210606164948.35997-1-
> > > honnappa.nagaraha...@arm.com/
> > Yes, I have changed the patch according to the discussion. i.e.
> > barrier is needed, but different (inner sharable domain) barrier is 
> > required.
> >
> > >
> > > (BTW, it is good practice to keep the reference to previous patch
> > > versions below Commit Message of the next ones).
> > >
> > > This barrier is not about compiler ordering, it is about external HW
> > > agent memory action completions.
> > > So, I'm not sure the rte_atomic_thread_fence() is safe for x86 -
> > > patch impacts
> > > x86 as well.
> > The earlier barrier 'rte_io_rmb()', resolves to a compiler barrier on x86 
> > [1].
> > The rte_atomic_thread_fence(__ATOMIC_ACQUIRE) on x86 also acts as a
> > compiler barrier. So, there is no change for x86.
> >
> >
> > [1]
> > https://github.com/DPDK/dpdk/blob/main/lib/eal/x86/include/rte_atomic.
> > h#
> > L80
> Hi Slava, any more comments on this?
> 
> >
> > >
> > > With best regards,
> > > Slava
> > >
> > > > -----Original Message-----
> > > > From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > > > Sent: Tuesday, August 30, 2022 23:01
> > > > To: dev@dpdk.org; honnappa.nagaraha...@arm.com;
> > > ruifeng.w...@arm.com;
> > > > Matan Azrad <ma...@nvidia.com>; Shahaf Shuler
> > <shah...@nvidia.com>;
> > > > Slava Ovsiienko <viachesl...@nvidia.com>
> > > > Cc: n...@arm.com; Matan Azrad <ma...@nvidia.com>; sta...@dpdk.org
> > > > Subject: [PATCH v2] net/mlx5: use just sufficient barrier for Arm
> > > > platforms
> > > >
> > > > cqe->op_own indicates if the CQE is owned by the NIC. The rest of
> > > > the fields in CQE should be read only after op_own is read. On Arm
> > > > platforms using "dmb ishld" is sufficient to enforce this.
> > > >
> > > > Fixes: 88c0733535d6 ("net/mlx5: extend Rx completion with error
> > > > handling")
> > > > Cc: ma...@mellanox.com
> > > > Cc: sta...@dpdk.org
> > > >
> > > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > > > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> > > > ---
> > > >  drivers/common/mlx5/mlx5_common.h | 6 +++++-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/common/mlx5/mlx5_common.h
> > > > b/drivers/common/mlx5/mlx5_common.h
> > > > index 5028a05b49..ac2e85b15f 100644
> > > > --- a/drivers/common/mlx5/mlx5_common.h
> > > > +++ b/drivers/common/mlx5/mlx5_common.h
> > > > @@ -195,7 +195,11 @@ check_cqe(volatile struct mlx5_cqe *cqe,
> > > > const uint16_t cqes_n,
> > > >
> > > >         if (unlikely((op_owner != (!!(idx))) || (op_code ==
> > > > MLX5_CQE_INVALID)))
> > > >                 return MLX5_CQE_STATUS_HW_OWN;
> > > > -       rte_io_rmb();
> > > > +       /* Prevent speculative reading of other fields in CQE until
> > > > +        * CQE is valid.
> > > > +        */
> > > > +       rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> > > > +
> > > >         if (unlikely(op_code == MLX5_CQE_RESP_ERR ||
> > > >                      op_code == MLX5_CQE_REQ_ERR))
> > > >                 return MLX5_CQE_STATUS_ERR;
> > > > --
> > > > 2.17.1

Reply via email to