Hi Bjoern,

> What is the drm code in question?  ttm_pool_alloc -> ttm_pool_alloc_page()?
> As all other uses of __GFP_NORETRY in 6.1 (ignoring drm_printf.c) seem to be
> in i915.

Yes, this is indeed targeted at ttm_pool_alloc_page() which tries to allocate 
contiguous pages to fill up the pool (but not in 5.10).  TTM pools are used by 
the amdgpu driver to build its translation tables.

Calls to functions other than (linux_)alloc_pages*() are unaffected by the 
change, and if you dig through all the references of __GFP_NORETRY/GFP_RETRY 
(including those from files under 'selftests/'), you'll see the built GFP flags 
are never used with (linux_)alloc_pages*(), except for the only reference you 
mentioned.
 
> Are you sure?
> 
> i915_gem_object_get_pages_internal() in drm-6.1 at least seems to
> conditionally pass it down:
> 
>       17 #define QUIET (__GFP_NORETRY | __GFP_NOWARN)
>       ...
>       74                         page = alloc_pages(gfp | (order ? QUIET : MAYFAIL),
> 
> Seems it can deal with allocation failures, lowering order or returning
> -ENOMEM from the function so should be fine hopefully.

Yes, I was aware of this piece of code, but obviously it cannot cause any 
problem.

All calls to Linux's alloc_pages*() can fail *whatever* the passed GFP flags,
except with __GFP_NOFAIL (and that's the only exception).  Callers always have
to cope, and specifically when specifying __GFP_NORETRY it would be foolish not
to (and that wouldn't be allowed in Linus' tree anyway).

Even setting that aside, i915_gem_object_get_pages_internal() does the same
order lowering that ttm_pool_alloc_page() does anyway, as you noticed.
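
To make the fallback pattern concrete, here is a minimal userspace sketch of
"try a high order without retrying, then lower the order, then report
failure".  It is not the i915 or TTM code: mock_alloc_pages(),
MOCK_GFP_NORETRY, MOCK_PAGE_SIZE, mock_max_contig_order and
alloc_with_fallback() are all hypothetical names made up for illustration, and
the mock simply pretends that no contiguous block above a given order exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define MOCK_GFP_NORETRY 0x1u
#define MOCK_PAGE_SIZE   4096u

/* Largest order the mock will satisfy; simulates memory fragmentation. */
static int mock_max_contig_order = 2;

/*
 * Mock of an alloc_pages()-like call: returns NULL when no "contiguous"
 * block of 2^order pages is available.  The flags do not change the
 * outcome here: the point is that any call may fail.
 */
static void *mock_alloc_pages(unsigned int gfp, int order)
{
    assert(order >= 0);
    (void)gfp;
    if (order > mock_max_contig_order)
        return NULL;
    return malloc((size_t)MOCK_PAGE_SIZE << order);
}

/*
 * Try start_order first, then progressively lower orders.  On success,
 * store the order actually obtained in *got_order.  Return NULL if even
 * order 0 fails (a driver would return -ENOMEM at that point).
 */
static void *alloc_with_fallback(int start_order, int *got_order)
{
    for (int order = start_order; order >= 0; order--) {
        /* Only high-order attempts ask for "no retry", as in the quoted
         * (order ? QUIET : MAYFAIL) expression. */
        unsigned int gfp = order ? MOCK_GFP_NORETRY : 0;
        void *p = mock_alloc_pages(gfp, order);

        if (p != NULL) {
            *got_order = order;
            return p;
        }
    }
    return NULL;
}
```

With the mock limit at 2, a request starting at order 4 silently falls back to
order 2; with the limit set below 0, the function returns NULL and the caller
has to handle it, which is the behavior the callers discussed above already
implement.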

My sentence was indeed too strong, as I was still swapping in context for this
work, which was done months ago now.  I reviewed all callers not only for
__GFP_NORETRY but also for most other GFP flags (I have tweaked grep files for
all of them and over multiple Linux versions), as I started some work to
document what the Linux guarantees/behaviors really are and then some other
work to rationalize how we translate them in FreeBSD (there seem to be several
possible improvements here).  Unfortunately, that last work has been stalled
for weeks now, and will probably remain so for a significant while.

Given Linux's contract on __GFP_NORETRY, it is arguably not reasonable to spend
time compacting memory on such calls; doing so deviates from what drivers are
supposed to expect.

Oh, and the commit message doesn't mention that I also tested this change on
machines using the i915 driver, without observing any problem or change in
behavior.

Thanks and regards.

-- 
Olivier Certner
