On 7/11/2025 11:16 AM, Jaroslav Pulchart wrote:
>>
>>
>>
>> On 7/9/2025 2:04 PM, Jaroslav Pulchart wrote:
>>>>
>>>>
>>>> On 7/8/2025 5:50 PM, Jacob Keller wrote:
>>>>>
>>>>>
>>>>> On 7/7/2025 3:03 PM, Jacob Keller wrote:
>>>>>> Bad news: my hypothesis was incorrect.
>>>>>>
>>>>>> Good news: I can immediately see the problem if I set MTU to 9K and
>>>>>> start an iperf3 session and just watch the count of allocations from
>>>>>> ice_alloc_mapped_pages(). It goes up consistently, so I can quickly tell
>>>>>> if a change is helping.
>>>>>>
>>>>>> I ported the stats from i40e for tracking the page allocations, and I
>>>>>> can see that we're allocating new pages despite not actually performing
>>>>>> releases.
>>>>>>
>>>>>> I don't yet have a good understanding of what causes this, and the logic
>>>>>> in ice is pretty hard to track...
>>>>>>
>>>>>> I'm going to try the page pool patches myself to see if this test bed
>>>>>> triggers the same problems. Unfortunately I think I need someone else
>>>>>> with more experience with the hotpath code to help figure out whats
>>>>>> going wrong here...
>>>>>
>>>>> I believe I have isolated this and figured out the issue: With 9K MTU,
>>>>> sometimes the hardware posts a multi-buffer frame with an extra
>>>>> descriptor that has a size of 0 bytes with no data in it. When this
>>>>> happens, our logic for tracking buffers fails to free this buffer. We
>>>>> then later overwrite the page because we failed to either free or re-use
>>>>> the page, and our overwriting logic doesn't verify this.
>>>>>
>>>>> I will have a fix with a more detailed description posted tomorrow.
>>>>
>>>> @Jaroslav, I've posted a fix which I believe should resolve your issue:
>>>>
>>>> https://lore.kernel.org/intel-wired-lan/[email protected]/T/#u
>>>>
>>>> I am reasonably confident it should resolve the issue you reported. If
>>>> possible, it would be appreciated if you could test it and report back
>>>> to confirm.
>>>
>>> @Jacob that’s excellent news!
>>>
>>> I’ve built and installed 6.15.5 with your patch on one of our servers
>>> (strange that I had to disable CONFIG_MEM_ALLOC_PROFILING with this
>>> patch or the kernel wouldn’t boot) and started a VM running our
>>> production traffic. I’ll let it run for a day-two, observe the memory
>>> utilization per NUMA node and report back.
>>
>> Great! A bit odd you had to disable CONFIG_MEM_ALLOC_PROFILING. I didn't
>> have trouble on my kernel with it enabled.
>
> Status update after ~45h of uptime. So far so good, I do not see
> continuous memory consumption increase on home numa nodes like before.
> See attached "status_before_after_45h_uptime.png" comparison.

Great news! Would you like your "Tested-by" tag added to the commit message when we submit the fix to netdev?
