On Thu, 4 Apr 2024 at 20:17, Jaroslav Pulchart
<jaroslav.pulch...@gooddata.com> wrote:
>
> On Thu, 4 Apr 2024 at 15:37, Jakub Kicinski <k...@kernel.org> wrote:
> >
> > On Thu, 4 Apr 2024 07:42:45 +0200 Jaroslav Pulchart wrote:
> > > We do not have much progress
> >
> > Random thought - do you have KFENCE enabled?
> > It's sufficiently low overhead to run in production and maybe it could
> > help catch the bug? You also hit some inexplicable bug in the Intel
> > driver, IIRC, there may be something odd going on.. (it's not all
> > happening on a single machine, right?)
>
> We have KFENCE enabled.
>
> The issue was observed on multiple servers. It is easy to reproduce
> anywhere we deploy the Loki service. The trigger is: I click the
> "run query" (LogQL) button in the Grafana UI once or twice. Loki
> starts loading data from the minio cluster at ~2GB/s and
> almost immediately it crashes.
>
> I suspect the Intel ICE driver as well; it would not be the first
> time we have hit bugs there. Later I will try a test server
> that has a NIC from a different vendor.

I ran the setup on a server with a network card other than the E810: a
BCM57414 NetXtreme-E with the bnxt_en driver. The issue is not
reproducible there. So it looks to be connected with Intel's ice
driver for the E810 network card, and to have been introduced in 6.3.
