On 10.04.25 19:20, Boris Brezillon wrote:
> [SNIP]
>>>> Faulting is only legal when you have something like HMM, SVM or
>>>> whatever you call it. And then you can just use a plain shmem
>>>> object to provide you with backing pages.
>>>>
>>>> I mean we could in theory allow faulting on GEM objects as well,
>>>> but we would need to take very strict precautions which we
>>>> currently don't have, as far as I know.
>>> We only use this mechanism for very specific allocations: tiler
>>> memory whose maximum size can't be guessed upfront, because tile
>>> binning is by nature unpredictable.
>>>
>>>> So could you explain how this works in the first place?
>>> I can explain how this works in Panthor, yes. You get an initial
>>> amount of memory that the tiler can use; when it runs out of
>>> memory, it will first ask the system for more. If the allocation
>>> fails, it falls back to what they call "incremental rendering",
>>> where the already-binned primitives are flushed to the FB in order
>>> to free memory, and the rendering starts over from there with the
>>> memory that has been freed.
>>>
>>> In Panthor, this on-demand allocation scheme allows us to speed up
>>> rendering when memory is available, but still make progress when it
>>> isn't, hence the failable allocation scheme I'm proposing here.
>> Puh, that sounds like it can potentially work, but it is also very,
>> very fragile.
>>
>> You must have a code comment explaining how and why the selected
>> GFP flags work.
> +	/* We want non-blocking allocations; if we're OOM, we just fail
> +	 * the job to unblock things.
> +	 */
> +	ret = drm_gem_shmem_sparse_populate_range(&bo->base, page_offset,
> +						  NUM_FAULT_PAGES, page_gfp,
> +						  __GFP_NORETRY | __GFP_NOWARN);
>
> That's what I have right now in the failable allocation path. The
> tiler chunk allocation uses the same flags, but doesn't have a
> comment explaining that a fallback exists when the allocation fails.
We agreed to use GFP_NOWAIT for such types of allocations, to at least
wake up kswapd on the low water mark.

IIRC even using __GFP_NORETRY here was illegal, but I need to
double-check the discussions again.

From the comment, this at minimum needs an explanation of what
influence it has on the submission, and that you can still guarantee
fence forward progress.

>>> In Panfrost and Lima, we don't have this concept of "incremental
>>> rendering", so when we fail the allocation, we just fail the GPU
>>> job with an unhandled GPU fault.
>> To be honest, I think that this is enough to mark those two drivers
>> as broken. It's documented that this approach is a no-go for
>> upstream drivers.
>>
>> How widely is that used?
> It exists in lima and panfrost, and I wouldn't be surprised if a
> similar mechanism was used in other drivers for tiler-based GPUs
> (etnaviv, freedreno, powervr, ...), because ultimately that's how
> tilers work: the amount of memory needed to store per-tile primitives
> (and metadata) depends on what the geometry pipeline feeds the tiler
> with, and that can't be predicted. If you over-provision, that's
> memory the system won't be able to use while rendering takes place,
> even though only a small portion might actually be used by the GPU.
> If your allocation is too small, it will either trigger a GPU fault
> (for HW not supporting an "incremental rendering" mode) or
> under-perform (because flushing primitives has a huge cost on tilers).
>
> Calling these drivers broken without having a plan to fix them is
> also not an option.

Yeah, agreed, we need to have some kind of alternative. It's just that
at the moment I can't think of any.

>>> And that's how it is today: the alloc-on-fault mechanism is being
>>> used in at least 3 drivers, and all I'm trying to do here is
>>> standardize it and document the constraints that come with this
>>> model, constraints that are currently being ignored.
>>> Like the fact that allocations in the fault handler path shouldn't
>>> block, so we're guaranteed to signal the job fence in finite time,
>>> and we don't risk a deadlock between the driver shrinker and the
>>> job triggering the fault.
>> Well, on one hand it's good that we document the pitfalls, but I
>> clearly think that we should *not* spread that into common code.
> Ack on not encouraging people to use that; but having a clear path
> describing how this should be done for HW that doesn't have other
> options, with helpers and extensive docs, is IMHO better than letting
> new drivers reproduce the mistakes that were made in the past.
> Because if you tell people "don't do that" but don't have an
> alternative to propose, they'll end up doing it anyway.

Just to be clear: we have already completely rejected code from going
upstream because of that! And we are not talking about anything small,
but rather a full-blown framework, developed by a major player.

In addition, both i915 and amdgpu had code which used this approach as
well, and we reverted it because it simply doesn't work reliably.

>> That would give the impression that this actually works, and to be
>> honest, I should start charging money for rejecting stuff like that.
>> It would probably make me rich.
>>
>>> I'm well aware of the implications of what I'm proposing here, but
>>> ignoring the fact that some drivers already violate the rules
>>> doesn't make them disappear. So I'd rather work with you and others
>>> to clearly state what the alloc-in-fault-path rules are, and
>>> enforce them in some helper functions, than pretend these ugly
>>> things don't exist. :D
>> Yeah, but this kind of says to others that it's OK to do this, which
>> it clearly isn't, as far as I can see.
> Not really.
> At least not if we properly review any driver that uses these APIs,
> and clearly state in the doc that this is provided as a last resort
> for HW that can't do without on-fault allocation, because it is
> designed to work this way.
>
>> What we should do instead is add this to the documentation as a
>> not-OK approach and move on.
> Well, that's what happened with panfrost, lima and panthor, and see
> where we are now: 3 drivers that you consider broken (probably
> rightfully), and more to come if we don't come up with a plan for HW
> that has the same requirements (as I said, I wouldn't be surprised if
> most tilers were relying on the same kind of on-demand allocation).

Well, we have HW features in major drivers which we simply don't use
because of that.

> As always, I appreciate your valuable input, and I would be happy to
> try anything you think might be better suited than what is proposed
> here, but saying "this is broken HW/driver, so let's ignore it" is
> not really an option, at least if you don't want the bad design
> pattern to spread.

Yeah, I'm not sure what to do either. We *know* from experience that
this approach will fail. We have already tried that.

Regards,
Christian.

>
> Regards,
>
> Boris