> -----Original Message-----
> From: Intel-wired-lan <[email protected]> On Behalf Of Jaroslav Pulchart
> Sent: Wednesday, April 16, 2025 9:04 AM
> To: Jakub Kicinski <[email protected]>
> Cc: Kitszel, Przemyslaw <[email protected]>; Damato, Joe
> <[email protected]>; [email protected]; 
> [email protected];
> Nguyen, Anthony L <[email protected]>; Igor Raits
> <[email protected]>; Daniel Secik <[email protected]>; Zdenek Pesek
> <[email protected]>; Dumazet, Eric <[email protected]>; Martin
> Karsten <[email protected]>; Zaki, Ahmed <[email protected]>;
> Czapnik, Lukasz <[email protected]>; Michal Swiatkowski
> <[email protected]>
> Subject: Re: [Intel-wired-lan] Increased memory usage on NUMA nodes with ICE
> driver after upgrade to 6.13.y (regression in commit 492a044508ad)
> 
> >
> > On Wed, 16 Apr 2025 09:13:23 +0200 Jaroslav Pulchart wrote:
> > > By "traced" I mean running that kernel and checking the memory
> > > situation on the NUMA nodes with and without production load. NUMA
> > > nodes with an X810 NIC show considerably less available memory with
> > > the default queue count (the number of all CPUs). It has to be
> > > lowered to 1-2 on unused interfaces, and to at most the number of
> > > cores in the NUMA node on used interfaces, to keep memory
> > > allocation reasonable and avoid the server falling into "kswapd"...
> > >
> > > See "MemFree" on NUMA nodes 0 + 1 on a different/smaller but
> > > utilized host server (running VMs and using the network) with 8
> > > NUMA nodes (32GB RAM each, 28GB in Hugepages for VMs and 4GB for
> > > the host OS):
> >
> > FWIW you can also try the tools/net/ynl/samples/page-pool
> > application, not sure if Intel NICs init page pools appropriately
> > but this will show you exactly how much memory is sitting on Rx rings
> > of the driver (and in net socket buffers).
> 
> I'm not familiar with the page-pool tool; I tried to build and run it,
> but nothing is shown. Any hint/manual on how to use it?
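
FWIW, here is a rough sketch of how I would try it from a kernel source tree (untested on my side; the exact make target and spec path may differ between kernel versions, and the sample needs a kernel new enough to expose the netdev/page-pool netlink family):

```shell
# From the root of a recent kernel source tree, on the affected machine.
# Build the YNL sample (target name is an assumption; check the Makefile):
make -C tools/net/ynl/samples page-pool

# Run it as root; it should list page pools and the memory they hold:
sudo ./tools/net/ynl/samples/page-pool

# Alternatively, dump the raw netlink objects via the YNL CLI:
sudo ./tools/net/ynl/cli.py \
    --spec Documentation/netlink/specs/netdev.yaml \
    --dump page-pool-get
```

If nothing is printed, the driver may simply not register its page pools with the netlink API on that kernel version.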
> 
> >
> > > 6.13.y vanilla (lot of kswapd0 in background):
> > >     NUMA nodes:     0       1       2       3       4       5       6       7
> > >     HPTotalGiB:     28      28      28      28      28      28      28      28
> > >     HPFreeGiB:      0       0       0       0       0       0       0       0
> > >     MemTotal:       32220   32701   32701   32686   32701   32701   32701   32696
> > >     MemFree:        274     254     1327    1928    1949    2683    2624    2769
> > > 6.13.y + revert (no memory issues at all):
> > >     NUMA nodes:     0       1       2       3       4       5       6       7
> > >     HPTotalGiB:     28      28      28      28      28      28      28      28
> > >     HPFreeGiB:      0       0       0       0       0       0       0       0
> > >     MemTotal:       32220   32701   32701   32686   32701   32701   32701   32696
> > >     MemFree:        2213    2438    3402    3108    2846    2672    2592    3063
> > >
> > > We need to lower the queue on all X810 interfaces from default (64 in
> > > this case), to ensure we have memory available for host OS services.
> > >     ethtool -L em2 combined 1
> > >     ethtool -L p3p2 combined 1
> > >     ethtool -L em1 combined 6
> > >     ethtool -L p3p1 combined 6
> > > This trick "does not work" without the revert.
> >
> > And you're reverting just and exactly 492a044508ad13 ?
> > The memory for persistent config is allocated in alloc_netdev_mqs()
> > unconditionally. I'm lost as to how this commit could make any
> > difference :(
> 
> Yes, reverted the 492a044508ad13.

Struct napi_config *is* 1056 bytes, or about 1KB, and with this change we
allocate one per max queue, i.e. roughly 1KB per CPU. On a 64-CPU system that
is at most ~64KB, which seems unlikely to be the root cause of a memory
shortage like this if it really is just the napi_config structure...

Perhaps netif_napi_restore_config is somehow causing us to end up with more
allocated memory? Or some interaction with our ethtool callback that reduces
the number of rings is not working properly?
