Hello

We are still facing the memory issue with Intel 810 NICs (even on the latest
6.15.y).
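
For anyone who wants to compare against their own hosts, per-node memory can be
sampled from sysfs. Below is a minimal sketch in Python, assuming the usual
/sys/devices/system/node/nodeN/meminfo layout ("Node 0 MemFree: 123 kB"). The
helper names are made up for illustration, and this is not the tooling that
produced the attached graph.

```python
import glob
import re

# Lines in /sys/devices/system/node/nodeN/meminfo look like:
#   "Node 0 MemFree:        123456 kB"
# Only simple "Name: <value> kB" fields are matched; entries such as
# "Active(anon)" or the unit-less HugePages counters are skipped.
_LINE = re.compile(r"^Node\s+(\d+)\s+(\w+):\s+(\d+)\s*kB", re.MULTILINE)


def parse_node_meminfo(text):
    """Return {node_id: {field: kB}} from node meminfo text (hypothetical helper)."""
    result = {}
    for node, field, kb in _LINE.findall(text):
        result.setdefault(int(node), {})[field] = int(kb)
    return result


def read_all_nodes(pattern="/sys/devices/system/node/node*/meminfo"):
    """Read every NUMA node's meminfo from sysfs (Linux only)."""
    merged = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as fh:
            merged.update(parse_node_meminfo(fh.read()))
    return merged


if __name__ == "__main__":
    for node, fields in sorted(read_all_nodes().items()):
        print(f"node{node}: MemFree={fields.get('MemFree', 0)} kB "
              f"MemTotal={fields.get('MemTotal', 0)} kB")
```

Running this periodically and plotting MemFree per node should show whether only
the NIC-local nodes are draining.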

Our current workaround is to move everything to new Intel-free servers and
get rid of the last traces of Intel there (after Intel's CPU vulnerability
fuckups, the NICs are the next step).

Any help is welcome,
Jaroslav P.



On Wed, 4 Jun 2025 at 10:42, Jaroslav Pulchart <[email protected]> wrote:

> >
> > On Thu, 17 Apr 2025 at 19:52, Keller, Jacob E
> > <[email protected]> wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jakub Kicinski <[email protected]>
> > > > Sent: Wednesday, April 16, 2025 5:13 PM
> > > > To: Keller, Jacob E <[email protected]>
> > > > Cc: Jaroslav Pulchart <[email protected]>; Kitszel, Przemyslaw
> > > > <[email protected]>; Damato, Joe <[email protected]>;
> > > > intel-wired-[email protected]; [email protected]; Nguyen, Anthony L
> > > > <[email protected]>; Igor Raits <[email protected]>; Daniel Secik
> > > > <[email protected]>; Zdenek Pesek <[email protected]>;
> > > > Dumazet, Eric <[email protected]>; Martin Karsten
> > > > <[email protected]>; Zaki, Ahmed <[email protected]>; Czapnik,
> > > > Lukasz <[email protected]>; Michal Swiatkowski
> > > > <[email protected]>
> > > > Subject: Re: [Intel-wired-lan] Increased memory usage on NUMA nodes with ICE
> > > > driver after upgrade to 6.13.y (regression in commit 492a044508ad)
> > > >
> > > > On Wed, 16 Apr 2025 22:57:10 +0000 Keller, Jacob E wrote:
> > > > > > > And you're reverting just and exactly 492a044508ad13 ?
> > > > > > > > The memory for persistent config is allocated in
> > > > > > > > alloc_netdev_mqs() unconditionally. I'm lost as to how this
> > > > > > > > commit could make any difference :(
> > > > > >
> > > > > > Yes, reverted the 492a044508ad13.
> > > > >
> > > > > Struct napi_config *is* 1056 bytes
> > > >
> > > > You're probably looking at 6.15-rcX kernels. Yes, the affinity mask
> > > > can be large depending on the kernel config. But report is for 6.13,
> > > > AFAIU. In 6.13 and 6.14 napi_config was tiny.
> > >
> > > > Regardless, it should still be ~64KB even in that case which is a far
> > > > cry from eating all available memory. Something else must be going on....
> > >
> > > Thanks,
> > > Jake
> >
> > Hello
> >
> > Some observations: this "problem" still exists with the latest 6.14.y,
> > and there must be multiple issues. Free memory slowly drops from 3GB
> > to 100MB over 10-20 days on the home NUMA nodes of the Intel X810
> > NICs (it looks like a memory leak related to networking).
> >
> > Without the revert, kswapdX activity shows up early, within 1-2 days;
> > with the mentioned commit reverted, kswapdX starts to consume
> > resources later, after ~10-20 days. It is almost impossible to use
> > servers with Intel X810 cards (ice driver) with recent Linux kernels.
> >
> > Were you able to reproduce the memory problems in your testbed?
> >
> > Best,
> > Jaroslav
>
> Hello
>
> I deployed Linux 6.15.0 to our servers 7 days ago and observed the
> memory utilization of the home NUMA nodes of the Intel X810 NICs:
> 1/ there is no need to revert the commit as before,
> 2/ the memory is continuously consumed (like a memory leak);
> see the attached "7d_memory_usage_per_numa_linux6.15.0.png" screenshot
> of 8 NUMA nodes (NUMA0 + NUMA1 are local to the X810 NICs). BTW: we do
> not see this memory utilization pattern on servers using Broadcom
> NetXtreme-E NICs.
>


-- 
Jaroslav Pulchart
Sr. Principal SW Engineer
GoodData
