On 7/7/2025 3:03 PM, Jacob Keller wrote:
> Bad news: my hypothesis was incorrect.
> 
> Good news: I can immediately see the problem if I set MTU to 9K and
> start an iperf3 session and just watch the count of allocations from
> ice_alloc_mapped_pages(). It goes up consistently, so I can quickly tell
> if a change is helping.
> 
> I ported the stats from i40e for tracking the page allocations, and I
> can see that we're allocating new pages despite not actually performing
> releases.
> 
> I don't yet have a good understanding of what causes this, and the logic
> in ice is pretty hard to track...
> 
> I'm going to try the page pool patches myself to see if this test bed
> triggers the same problems. Unfortunately I think I need someone else
> with more experience with the hotpath code to help figure out what's
> going wrong here...

I believe I have isolated the issue. With 9K MTU, the hardware sometimes
posts a multi-buffer frame with an extra descriptor that has a size of
0 bytes and carries no data. When this happens, our buffer tracking
logic fails to free the buffer for that descriptor. Because the page was
neither freed nor re-used, we later overwrite it with a newly allocated
page, and the overwriting logic doesn't check whether the slot still
holds a page.
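
To make the failure mode concrete, here is a small userspace sketch of the
pattern described above. It is not the ice code: the structures, the
function names (refill, clean_frame, simulate) and the descriptor sizes are
all made up for illustration, and a page is modeled as a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_DESC 4

/* Hypothetical stand-ins, NOT the real ice descriptor/buffer structures. */
struct rx_desc { unsigned int size; };
struct rx_buf { bool has_page; };

struct rx_stats {
	unsigned int alloc;       /* pages handed to the ring */
	unsigned int released;    /* pages freed or re-used after cleaning */
	unsigned int overwritten; /* slots refilled while still holding a page */
};

/* Give every slot a fresh page without checking what it currently holds. */
static void refill(struct rx_buf *bufs, size_t n, struct rx_stats *st)
{
	for (size_t i = 0; i < n; i++) {
		if (bufs[i].has_page)
			st->overwritten++; /* old page is lost here */
		bufs[i].has_page = true;
		st->alloc++;
	}
}

/*
 * Walk the descriptors of one frame and release their buffers.
 * With skip_empty set, a 0-byte descriptor is ignored and its buffer is
 * never released, which models the tracking failure described above.
 */
static void clean_frame(const struct rx_desc *descs, struct rx_buf *bufs,
			size_t n, bool skip_empty, struct rx_stats *st)
{
	for (size_t i = 0; i < n; i++) {
		if (skip_empty && descs[i].size == 0)
			continue;
		bufs[i].has_page = false;
		st->released++;
	}
}

/* Run refill, clean, refill once; return how many pages were overwritten. */
static unsigned int simulate(bool skip_empty)
{
	/* Assumed layout: three data buffers plus a trailing empty one. */
	struct rx_desc descs[NUM_DESC] = { {4096}, {4096}, {808}, {0} };
	struct rx_buf bufs[NUM_DESC] = { {false} };
	struct rx_stats st = { 0 };

	refill(bufs, NUM_DESC, &st);
	clean_frame(descs, bufs, NUM_DESC, skip_empty, &st);
	assert(st.released <= st.alloc);
	refill(bufs, NUM_DESC, &st);

	return st.overwritten;
}
```

In this model, simulate(true) loses one page per frame that carries an
empty descriptor, while simulate(false) loses none. That matches the
symptom of the allocation count climbing without matching releases.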

I will have a fix with a more detailed description posted tomorrow.
