On Fri, Jul 24, 2015 at 10:18 AM, Jinpu Wang
<jinpu.w...@profitbricks.com> wrote:
> Hi all,
>
> I hit bug in OFED, I report to link below:
>
> http://marc.info/?l=linux-rdma&m=143634872328553&w=2
> I checked latest mainline Linux 4.2-rc3, it has similar bug.
> Here is the patch against Linux 4.2-rc3, compile test only.
>
> I add one copy as attachment in case mail client break the patch format.
>
> From a9fbc1ff0768acdb260e57e3324798fc0082d194 Mon Sep 17 00:00:00 2001
> From: Jack Wang <jinpu.w...@profitbricks.com>
> Date: Thu, 23 Jul 2015 18:58:08 +0200
> Subject: [PATCH] mlx4_core: fix possible use-after-free in cq_completion
>
> It's possible during mlx4_cq_free, there are new cq_completion come,
> and there is no spin_lock protection for cq_completion, also no
> refcount protection, it will lead to use after free. So add the
> spin_lock and refcount protection in cq_completion.
>
> Signed-off-by: Jack Wang <jinpu.w...@profitbricks.com>
> ---
>  drivers/net/ethernet/mellanox/mlx4/cq.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/cq.c b/drivers/net/ethernet/mellanox/mlx4/cq.c
> index 3348e64..8d7f405 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/cq.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
> @@ -99,10 +99,15 @@ static void mlx4_add_cq_to_tasklet(struct mlx4_cq *cq)
>
>  void mlx4_cq_completion(struct mlx4_dev *dev, u32 cqn)
>  {
> +	struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table;
>  	struct mlx4_cq *cq;
>
> -	cq = radix_tree_lookup(&mlx4_priv(dev)->cq_table.tree,
> -			       cqn & (dev->caps.num_cqs - 1));
> +	spin_lock(&cq_table->lock);
> +	cq = radix_tree_lookup(&cq_table->tree, cqn & (dev->caps.num_cqs - 1));
> +	if (cq)
> +		atomic_inc(&cq->refcount);
> +
> +	spin_unlock(&cq_table->lock);
>  	if (!cq) {
>  		mlx4_dbg(dev, "Completion event for bogus CQ %08x\n", cqn);
>  		return;
> @@ -111,6 +116,8 @@ void mlx4_cq_completion(struct mlx4_dev *dev, u32 cqn)
>  	++cq->arm_sn;
>
>  	cq->comp(cq);
> +	if (atomic_dec_and_test(&cq->refcount))
> +		complete(&cq->free);
>  }
>
>  void mlx4_cq_event(struct mlx4_dev *dev, u32 cqn, int event_type)
> --
> 1.9.1
>
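To make the intent of the quoted patch easier to review, here is a minimal
userspace sketch of the same "look up under the lock, take a reference, drop
it after the callback" pattern. It is only a model, not driver code: every
name in it (model_cq, model_table, model_completion, model_remove,
MODEL_NUM_CQS) is hypothetical, a fixed array stands in for the radix tree,
a C11 atomic_flag spinlock stands in for cq_table->lock, and a flag stands
in for complete(&cq->free). It also assumes the free path removes the entry
under the same lock and then drops the owner's reference.

```c
/* Userspace model of the lookup-and-pin pattern; NOT the mlx4 driver code.
 * All identifiers here are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NUM_CQS 8 /* power of two, so cqn & (N - 1) is a valid index */

struct model_cq {
	atomic_int refcount;         /* starts at 1: the owner's reference */
	atomic_int completions;      /* stands in for the work done by cq->comp() */
	atomic_bool freed_signalled; /* stands in for complete(&cq->free) */
};

struct model_table {
	atomic_flag lock;                     /* stands in for cq_table->lock */
	struct model_cq *slot[MODEL_NUM_CQS]; /* stands in for the radix tree */
};

static void model_lock(atomic_flag *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
}

static void model_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Drop one reference; whoever drops the last one signals the waiter. */
static void model_put(struct model_cq *cq)
{
	int old = atomic_fetch_sub(&cq->refcount, 1);

	assert(old > 0); /* catch refcount underflow in the model */
	if (old == 1)
		atomic_store(&cq->freed_signalled, true);
}

/* Completion path: pin the object while the table lock is still held. */
static bool model_completion(struct model_table *t, unsigned int cqn)
{
	struct model_cq *cq;

	model_lock(&t->lock);
	cq = t->slot[cqn & (MODEL_NUM_CQS - 1)];
	if (cq)
		atomic_fetch_add(&cq->refcount, 1);
	model_unlock(&t->lock);

	if (!cq)
		return false; /* bogus or already removed entry */

	atomic_fetch_add(&cq->completions, 1); /* the "callback" */
	model_put(cq);
	return true;
}

/* Free path (assumed shape): unpublish under the lock, then drop the
 * owner's reference. A real free path would additionally wait for the
 * signal before releasing the memory. */
static void model_remove(struct model_table *t, unsigned int cqn)
{
	struct model_cq *cq;

	model_lock(&t->lock);
	cq = t->slot[cqn & (MODEL_NUM_CQS - 1)];
	t->slot[cqn & (MODEL_NUM_CQS - 1)] = NULL;
	model_unlock(&t->lock);

	if (cq)
		model_put(cq);
}
```

The point of the model: once the lookup and the refcount increment happen
inside the same critical section as the removal, a completion either sees
no entry at all or holds a reference that delays the "free" signal until
its callback has finished.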
I found an almost identical patch to mine, posted 3 years ago :)

http://linux-rdma.vger.kernel.narkive.com/NSyWFRkW/patch-rfc-for-next-net-mlx4-core-fix-racy-flow-in-the-driver-cq-completion-handler

Could you consider applying the patch? It fixes a real panic.

Thanks,
Jack

-- 
Best Regards,
Jack Wang

Linux Kernel Developer Storage
ProfitBricks GmbH  The IaaS-Company.

ProfitBricks GmbH
Greifswalder Str. 207
D - 10405 Berlin
Tel: +49 30 5770083-42
Fax: +49 30 5770085-98
Email: jinpu.w...@profitbricks.com
URL: http://www.profitbricks.de

Registered office: Berlin.
Commercial register: Amtsgericht Charlottenburg, HRB 125506 B.
Managing directors: Andreas Gauger, Achim Weiss.