On Wed, 2019-02-06 at 12:02 +0000, Tariq Toukan wrote:
> 
> On 2/6/2019 2:35 AM, Cong Wang wrote:
> > mlx5_eq_cq_get() is called in the IRQ handler, and the spinlock
> > inside it sees a lot of contention when we test a heavy workload
> > with 60 RX queues and 80 CPUs, as the flame graph clearly shows.
> > 


Hi Cong,

The patch looks OK to me, but I really doubt that you can hit contention
on the latest upstream driver, since we already have a spinlock per EQ,
which means a spinlock per core. Each EQ (core) MSI-X handler can only
access one spinlock (its own), so I am surprised that you saw contention.
Maybe you are not running the latest upstream driver?

What is the workload?

> > In fact, radix_tree_lookup() is perfectly fine with RCU read lock,
> > we don't have to take a spinlock on this hot path. It is pretty much
> > similar to commit 291c566a2891
> > ("net/mlx4_core: Fix racy CQ (Completion Queue) free"). Slow paths
> > are still serialized with the spinlock, and with synchronize_irq()
> > it should be safe to just move the fast path to RCU read lock.
> > 
> > This patch itself reduces the latency by about 50% with our
> > workload.
> > 
> > Cc: Saeed Mahameed <sae...@mellanox.com>
> > Cc: Tariq Toukan <tar...@mellanox.com>
> > Signed-off-by: Cong Wang <xiyou.wangc...@gmail.com>
> > ---
> >   drivers/net/ethernet/mellanox/mlx5/core/eq.c | 12 ++++++------
> >   1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > index ee04aab65a9f..7092457705a2 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> > @@ -114,11 +114,11 @@ static struct mlx5_core_cq
> > *mlx5_eq_cq_get(struct mlx5_eq *eq, u32 cqn)
> >     struct mlx5_cq_table *table = &eq->cq_table;
> >     struct mlx5_core_cq *cq = NULL;
> >   
> > -   spin_lock(&table->lock);
> > +   rcu_read_lock();
> >     cq = radix_tree_lookup(&table->tree, cqn);
> >     if (likely(cq))
> >             mlx5_cq_hold(cq);
> > -   spin_unlock(&table->lock);
> > +   rcu_read_unlock();
> 
> Thanks for your patch.
> 
> I think we can improve it further by taking the if statement out of
> the critical section.
> 

No, mlx5_cq_hold() must stay under the RCU read lock; otherwise the cq
might get freed before the IRQ handler gets a chance to increment the
refcount on it.
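
To make the ordering concrete, here is a minimal user-space sketch, not
driver code. All toy_* names are hypothetical, and a spin flag stands in
for the read-side section (real RCU readers do not exclude each other).
The point it shows is only that the lookup and the hold happen inside
the same section, so a removal cannot drop the last reference in between:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are mlx5 or kernel APIs. */
#define TOY_MAX_CQ 8

struct toy_cq {
	unsigned int cqn;
	atomic_int refcount;	/* the table owns one reference */
	int freed;		/* set instead of calling free(), so it can be observed */
};

struct toy_table {
	atomic_flag busy;	/* toy stand-in for the read-side section */
	struct toy_cq *slot[TOY_MAX_CQ];
};

static void toy_section_enter(struct toy_table *t)
{
	while (atomic_flag_test_and_set(&t->busy))
		;		/* spin until the section is ours */
}

static void toy_section_exit(struct toy_table *t)
{
	atomic_flag_clear(&t->busy);
}

/* Dropping the last reference "frees" the object. */
static void toy_cq_put(struct toy_cq *cq)
{
	if (atomic_fetch_sub(&cq->refcount, 1) == 1)
		cq->freed = 1;
}

/* Lookup and hold are done inside ONE section, as in the patch. */
static struct toy_cq *toy_cq_get(struct toy_table *t, unsigned int cqn)
{
	struct toy_cq *cq = NULL;

	toy_section_enter(t);
	if (cqn < TOY_MAX_CQ)
		cq = t->slot[cqn];
	if (cq)
		atomic_fetch_add(&cq->refcount, 1);
	toy_section_exit(t);

	return cq;
}

/* Removal unpublishes the entry, then drops the table's reference. */
static void toy_cq_del(struct toy_table *t, unsigned int cqn)
{
	struct toy_cq *cq = NULL;

	toy_section_enter(t);
	if (cqn < TOY_MAX_CQ) {
		cq = t->slot[cqn];
		t->slot[cqn] = NULL;
	}
	toy_section_exit(t);

	if (cq)
		toy_cq_put(cq);
}
```

In this model a cq obtained from toy_cq_get() stays valid across a
concurrent toy_cq_del() until the caller does its own toy_cq_put(). If
the hold were moved after toy_section_exit(), the removal could run in
the gap and the increment would touch an already freed object.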

Another way to do it is to not do any refcounting in the IRQ handler
and to fence cq removal via synchronize_irq(eq->irqn) in mlx5_eq_del_cq().
But let's keep one approach (refcounting); synchronize_irq/rcu can be
heavy at times, especially on RDMA workloads that create and destroy
many cqs in loops.
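
For comparison, the fencing alternative can be modeled roughly as
follows. Again this is only a user-space sketch under assumptions: the
demo_* names are hypothetical, and an in-flight counter with a busy-wait
stands in for what synchronize_irq() provides. The handler takes no
reference; removal unpublishes the entry and then waits for any running
handler to finish:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space model; not mlx5 or kernel code. */
#define DEMO_MAX_CQ 8

struct demo_cq {
	unsigned int cqn;
	int events;		/* completions handled so far */
};

struct demo_eq {
	_Atomic(struct demo_cq *) slot[DEMO_MAX_CQ];
	atomic_int in_handler;	/* nonzero while the modeled handler runs */
};

/* Models the IRQ handler: uses the cq without taking a reference. */
static int demo_eq_irq(struct demo_eq *eq, unsigned int cqn)
{
	struct demo_cq *cq = NULL;
	int handled = 0;

	atomic_fetch_add(&eq->in_handler, 1);
	if (cqn < DEMO_MAX_CQ)
		cq = atomic_load(&eq->slot[cqn]);
	if (cq) {
		cq->events++;
		handled = 1;
	}
	atomic_fetch_sub(&eq->in_handler, 1);

	return handled;
}

/* Toy stand-in for the fence: wait until no handler is in flight. */
static void demo_synchronize_irq(struct demo_eq *eq)
{
	while (atomic_load(&eq->in_handler) != 0)
		;		/* busy-wait in this toy model */
}

/*
 * Unpublish first, then fence. After this returns, no handler can still
 * be using the cq, so the caller may free it.
 */
static struct demo_cq *demo_eq_del_cq(struct demo_eq *eq, unsigned int cqn)
{
	struct demo_cq *cq;

	if (cqn >= DEMO_MAX_CQ)
		return NULL;

	cq = atomic_exchange(&eq->slot[cqn], NULL);
	demo_synchronize_irq(eq);

	return cq;
}
```

The cost is visible in demo_eq_del_cq(): every removal pays for the
wait, which is why this gets expensive when cqs are created and
destroyed in a tight loop.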

> Other than that, patch LGTM.
> 
> Regards,
> Tariq
> 
> >   
> >     return cq;
> >   }
> > @@ -371,9 +371,9 @@ int mlx5_eq_add_cq(struct mlx5_eq *eq, struct
> > mlx5_core_cq *cq)
> >     struct mlx5_cq_table *table = &eq->cq_table;
> >     int err;
> >   
> > -   spin_lock_irq(&table->lock);
> > +   spin_lock(&table->lock);
> >     err = radix_tree_insert(&table->tree, cq->cqn, cq);
> > -   spin_unlock_irq(&table->lock);
> > +   spin_unlock(&table->lock);
> >   
> >     return err;
> >   }
> > @@ -383,9 +383,9 @@ int mlx5_eq_del_cq(struct mlx5_eq *eq, struct
> > mlx5_core_cq *cq)
> >     struct mlx5_cq_table *table = &eq->cq_table;
> >     struct mlx5_core_cq *tmp;
> >   
> > -   spin_lock_irq(&table->lock);
> > +   spin_lock(&table->lock);
> >     tmp = radix_tree_delete(&table->tree, cq->cqn);
> > -   spin_unlock_irq(&table->lock);
> > +   spin_unlock(&table->lock);
> >   
> >     if (!tmp) {
> >             mlx5_core_warn(eq->dev, "cq 0x%x not found in eq 0x%x
> > tree\n", eq->eqn, cq->cqn);
> > 
