On Wed, 2019-02-06 at 09:15 -0800, Cong Wang wrote:
> On Wed, Feb 6, 2019 at 8:55 AM Saeed Mahameed <sae...@mellanox.com> wrote:
> > Hi Cong,
> >
> > The patch is OK to me, but I really doubt that you can hit contention
> > on the latest upstream driver, since we already have a spinlock per EQ,
> > which means a spinlock per core; each EQ (core) msix handler can only
> > access one spinlock (its own). So I am surprised how you got the
> > contention. Maybe you are not running on the latest upstream driver?
>
> We are running the 4.14 stable release. Which commit changes the game
> here? We can consider backporting it unless it is complicated.
OK, so there is no issue upstream; you are just missing the following patch:

commit 02d92f7903647119e125b24f5470f96cee0d4b4b
Author: Saeed Mahameed <sae...@mellanox.com>
Date:   Fri Jan 19 16:13:01 2018 -0800

    net/mlx5: CQ Database per EQ

    Before this patch the driver had one CQ database protected via one
    spinlock. This spinlock is meant to synchronize between CQ
    adding/removing and CQ IRQ interrupt handling.

[...]

> Also, if you don't like this patch, we are happy to carry it on our
> own; sometimes it isn't worth the time to push it into upstream.

I do like it, and it is always worth it to push upstream; we all get to
learn cool new stuff.

> > what is the workload ?
>
> It's a memcached RPC performance test, that is all I can tell.

Cool, thanks. So the missing commit should fix your issue.

> (Apparently I have almost zero knowledge about memcached.)
>
> > > > In fact, radix_tree_lookup() is perfectly fine with RCU read lock,
> > > > we don't have to take a spinlock on this hot path. It is pretty much
> > > > similar to commit 291c566a2891
> > > > ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow paths
> > > > are still serialized with the spinlock, and with synchronize_irq()
> > > > it should be safe to just move the fast path to RCU read lock.
> > > >
> > > > This patch itself reduces the latency by about 50% with our
> > > > workload.
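[As an aside for readers following the thread: the per-EQ table change the commit above describes can be sketched roughly in userspace C. Everything below is a simplified stand-in, not the real mlx5 code: a mutex replaces the kernel spinlock and a plain array replaces the radix tree keyed by cqn. The point is only the data layout: each EQ owns its own table and lock, so MSI-X handlers on different cores never touch the same lock.]

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_CQS 64

/* Stand-in for the per-EQ CQ database. */
struct cq_table {
	pthread_mutex_t lock;   /* serializes CQ add/remove vs. lookup */
	void *cqs[MAX_CQS];     /* stand-in for the radix tree keyed by cqn */
};

/* After the commit, each EQ carries its own table and lock. */
struct eq {
	struct cq_table cq_table;
};

static void eq_init(struct eq *eq)
{
	pthread_mutex_init(&eq->cq_table.lock, NULL);
	for (int i = 0; i < MAX_CQS; i++)
		eq->cq_table.cqs[i] = NULL;
}

static void eq_add_cq(struct eq *eq, int cqn, void *cq)
{
	pthread_mutex_lock(&eq->cq_table.lock);
	eq->cq_table.cqs[cqn] = cq;
	pthread_mutex_unlock(&eq->cq_table.lock);
}

static void *eq_get_cq(struct eq *eq, int cqn)
{
	pthread_mutex_lock(&eq->cq_table.lock);
	void *cq = eq->cq_table.cqs[cqn];
	pthread_mutex_unlock(&eq->cq_table.lock);
	return cq;
}
```

[Since each interrupt handler only ever looks up CQs in its own EQ's table, the lock it takes is effectively private to that core, which is why the cross-core contention disappears on kernels carrying this commit.]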
> > > >
> > > > Cc: Saeed Mahameed <sae...@mellanox.com>
> > > > Cc: Tariq Toukan <tar...@mellanox.com>
> > > > Signed-off-by: Cong Wang <xiyou.wangc...@gmail.com>
> > > > ---
> > > >  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 12 ++++++------
> > > >  1 file changed, 6 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > index ee04aab65a9f..7092457705a2 100644
> > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > > > @@ -114,11 +114,11 @@ static struct mlx5_core_cq *mlx5_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
> > > >  	struct mlx5_cq_table *table = &eq->cq_table;
> > > >  	struct mlx5_core_cq *cq = NULL;
> > > >
> > > > -	spin_lock(&table->lock);
> > > > +	rcu_read_lock();
> > > >  	cq = radix_tree_lookup(&table->tree, cqn);
> > > >  	if (likely(cq))
> > > >  		mlx5_cq_hold(cq);
> > > > -	spin_unlock(&table->lock);
> > > > +	rcu_read_unlock();
> > >
> > > Thanks for your patch.
> > >
> > > I think we can improve it further, by taking the if statement out
> > > of the critical section.
> >
> > No, mlx5_cq_hold must stay under RCU read, otherwise cq might get freed
> > before the irq gets a chance to increment ref count on it.
> >
> Agreed.

Cool, I will ack the patch.

Thanks.
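[Editor's sketch of the point being agreed on above: the reference count must be taken *inside* the read-side critical section, otherwise the CQ could be freed between the lookup and the hold. The userspace C below is a simplified stand-in, not the driver's code: a plain atomic counter mimics mlx5_cq_hold()/mlx5_cq_put(), an array mimics the radix tree, and the RCU calls appear only as comments since real RCU is kernel-side.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct mlx5_core_cq. */
struct cq {
	atomic_int refcount;
	bool freed;          /* set when the last reference is dropped */
};

static void cq_hold(struct cq *cq)
{
	atomic_fetch_add(&cq->refcount, 1);
}

static void cq_put(struct cq *cq)
{
	/* Dropping the last reference frees the object (mimicked here
	 * by a flag rather than an actual kfree). */
	if (atomic_fetch_sub(&cq->refcount, 1) == 1)
		cq->freed = true;
}

/* Fast-path lookup, as in the patched mlx5_eq_cq_get(). The hold
 * happens before the read-side critical section ends: once we hold a
 * reference, a concurrent destroy path can remove the CQ from the
 * table but cannot free it out from under the IRQ handler. */
static struct cq *cq_get(struct cq *table[], int cqn)
{
	/* rcu_read_lock(); */
	struct cq *cq = table[cqn];     /* radix_tree_lookup() stand-in */
	if (cq)
		cq_hold(cq);            /* must precede rcu_read_unlock() */
	/* rcu_read_unlock(); */
	return cq;
}
```

[Hoisting the `if` out of the critical section, as suggested and then withdrawn above, would open a window where the lookup result points at memory the destroy path has already freed.]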