On Thu, Sep 13, 2018 at 11:36 PM, Jesper Dangaard Brouer <bro...@redhat.com> wrote:
> On Thu, 13 Sep 2018 15:55:29 -0700
> Alexei Starovoitov <alexei.starovoi...@gmail.com> wrote:
>
>> On Thu, Aug 30, 2018 at 1:35 AM, Tariq Toukan <tar...@mellanox.com> wrote:
>> >
>> >
>> > On 29/08/2018 6:05 PM, Jesper Dangaard Brouer wrote:
>> >>
>> >> Hi Saeed,
>> >>
>> >> I'm having issues loading mlx5 driver on v4.19 kernels (tested both
>> >> net-next and bpf-next), while kernel v4.18 seems to work. It happens
>> >> with a Mellanox ConnectX-5 NIC (and also a CX4-Lx but I removed that
>> >> from the system now).
>> >>
>> >
>> > Hi Jesper,
>> >
>> > Thanks for your report!
>> >
>> > We are working to analyze and debug the issue.
>>
>> looks like serious issue to me... while no news in 2 weeks.
>> any update?
>
> Mellanox took it offlist, and Sep 6th found that this is a regression
> introduced by commit 269d26f47f6f ("net/mlx5: Reduce command polling
> interval"), but only if CONFIG_PREEMPT is on.
>
> I can confirm that reverting this commit fixed the issue (and not the
> firmware upgrade I also did).
>
> I think Moshe (Cc) is responsible for this case, and I expect to soon
> see a revert or alternative solution to this!?
>
> Thanks for the kick Alexei :-)

Thank you, Alexei and Jesper, for following up. The fix is already being
tested [1] and will be submitted tomorrow. As Jesper pointed out, the issue
happens only with commit 269d26f47f6f ("net/mlx5: Reduce command polling
interval"), and only if CONFIG_PREEMPT is on. The only affected kernel is
4.19, which is not GA yet.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=net-mlx5

> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer