What about direct submit from the kernel, where the process still has write access to the GPU ring buffer but doesn't use it? I think that solves your preemption example, but it leaves a potential backdoor where a process could overwrite the signal commands. That shouldn't be a problem, though, since we are OK with timeouts.
Marek

On Mon, May 3, 2021 at 11:23 AM Jason Ekstrand <ja...@jlekstrand.net> wrote:
> On Mon, May 3, 2021 at 10:16 AM Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl> wrote:
> >
> > On Mon, May 3, 2021 at 5:00 PM Jason Ekstrand <ja...@jlekstrand.net> wrote:
> > >
> > > Sorry for the top-post but there's no good thing to reply to here...
> > >
> > > One of the things pointed out to me recently by Daniel Vetter that I didn't fully understand before is that dma_buf has a very subtle second requirement beyond finite-time completion: nothing required for signaling a dma-fence can allocate memory. Why? Because the act of allocating memory may wait on your dma-fence. This, as it turns out, is a massively more strict requirement than finite-time completion and, I think, throws out all of the proposals we have so far.
> > >
> > > Take, for instance, Marek's proposal for userspace involvement with dma-fence by asking the kernel for a next serial and the kernel trusting userspace to signal it. That doesn't work at all if allocating memory to trigger a dma-fence can blow up. There's simply no way for the kernel to trust userspace not to do ANYTHING which might allocate memory. I don't even think there's a way userspace can trust itself there. It also blows up my plan of moving the fences to transition boundaries.
> > >
> > > Not sure where that leaves us.
> >
> > Honestly, the more I look at things, the more I think userspace-signalable fences with a timeout are a valid solution for these issues, especially since (as has been mentioned countless times in this email thread) userspace already has a lot of ways to cause timeouts and/or GPU hangs through GPU work.
> >
> > Adding a timeout on the signaling side of a dma_fence would ensure:
> >
> > - The dma_fence signals in finite time.
> > - If the timeout case does not allocate memory, then memory allocation is not a blocker for signaling.
> >
> > Of course you lose the full dependency graph, and we need to make sure garbage collection of fences works correctly when we have cycles. However, the latter sounds very doable and the former sounds like it is to some extent inevitable.
> >
> > I feel like I'm missing some requirement here, given that we immediately went to much more complicated things, but I can't find it. Thoughts?
>
> Timeouts are sufficient to protect the kernel, but they make the fences unpredictable and unreliable from a userspace PoV. One of the big problems we face is that, once we expose a dma_fence to userspace, we've allowed for some pretty crazy potential dependencies that neither userspace nor the kernel can sort out. Say you have Marek's "next serial, please" proposal and a multi-threaded application. Between the time you ask the kernel for a serial and get a dma_fence and the time you submit the work to signal that serial, your process may get preempted, something else shoved in which allocates memory, and then we end up blocking on that dma_fence. There's no way userspace can predict and defend itself from that.
>
> So I think where that leaves us is that there is no safe place to create a dma_fence except for inside the ioctl which submits the work, and only after any necessary memory has been allocated. That's a pretty stiff requirement. We may still be able to interact with userspace a bit more explicitly, but I think it throws any notion of userspace direct submit out the window.
>
> --Jason
>
> > - Bas
> >
> > > --Jason
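[As an aside, to make Bas's proposal above a bit more concrete: below is a minimal user-space sketch of a "userspace-signalable fence with a timeout". It is not kernel code and not an existing API (the ufence type and its helpers are invented for illustration), but it demonstrates the two properties listed above: the fence always completes by its deadline, and the timeout path does nothing but plain stores, so it can never block on memory allocation.]

/* Toy user-space model of a "userspace-signalable fence with a timeout".
 * The names and the polling loop are made up for illustration; a kernel
 * implementation would hang this off a real dma_fence and a timer. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

struct ufence {
    atomic_bool     signaled;   /* set by the (untrusted) signaler         */
    atomic_bool     timed_out;  /* set when the deadline forces completion */
    struct timespec deadline;   /* absolute point of forced completion     */
};

/* Signaling is just a store: nothing here can allocate memory. */
static void ufence_signal(struct ufence *f)
{
    atomic_store(&f->signaled, true);
}

/* Waiting completes in finite time: either the signaler got there in time,
 * or the deadline passes and the fence is force-signaled.  The timeout path
 * also only does plain stores, so it never depends on memory allocation. */
static bool ufence_wait(struct ufence *f)
{
    struct timespec now;

    for (;;) {
        if (atomic_load(&f->signaled))
            return !atomic_load(&f->timed_out);  /* true = signaled normally */

        clock_gettime(CLOCK_MONOTONIC, &now);
        if (now.tv_sec > f->deadline.tv_sec ||
            (now.tv_sec == f->deadline.tv_sec &&
             now.tv_nsec >= f->deadline.tv_nsec)) {
            atomic_store(&f->timed_out, true);   /* force-signal on timeout */
            atomic_store(&f->signaled, true);
            return false;
        }
        nanosleep(&(struct timespec){ .tv_nsec = 100000 }, NULL);  /* crude poll */
    }
}

/* Models untrusted userspace (or a hung GPU job) that signals too late. */
static void *late_signaler(void *arg)
{
    sleep(2);
    ufence_signal(arg);   /* harmless: the fence already completed via timeout */
    return NULL;
}

int main(void)
{
    struct ufence f = { 0 };
    pthread_t t;

    clock_gettime(CLOCK_MONOTONIC, &f.deadline);
    f.deadline.tv_sec += 1;   /* 1 second timeout */

    pthread_create(&t, NULL, late_signaler, &f);
    printf("fence %s\n", ufence_wait(&f) ? "signaled by userspace"
                                         : "force-signaled by timeout");
    pthread_join(t, NULL);
    return 0;
}

[It also illustrates Jason's objection: the waiter cannot tell in advance whether it will get the normal or the forced outcome, which is what makes such fences unpredictable from a userspace point of view.]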
> > >
> > > On Mon, May 3, 2021 at 9:42 AM Alex Deucher <alexdeuc...@gmail.com> wrote:
> > > >
> > > > On Sat, May 1, 2021 at 6:27 PM Marek Olšák <mar...@gmail.com> wrote:
> > > > >
> > > > > On Wed, Apr 28, 2021 at 5:07 AM Michel Dänzer <mic...@daenzer.net> wrote:
> > > > >>
> > > > >> On 2021-04-28 8:59 a.m., Christian König wrote:
> > > > >> > Hi Dave,
> > > > >> >
> > > > >> > On 27.04.21 at 21:23, Marek Olšák wrote:
> > > > >> >> Supporting interop with any device is always possible. It depends on which drivers we need to interoperate with and update them. We've already found the path forward for amdgpu. We just need to find out how many other drivers need to be updated and evaluate the cost/benefit aspect.
> > > > >> >>
> > > > >> >> Marek
> > > > >> >>
> > > > >> >> On Tue, Apr 27, 2021 at 2:38 PM Dave Airlie <airl...@gmail.com> wrote:
> > > > >> >>
> > > > >> >> On Tue, 27 Apr 2021 at 22:06, Christian König <ckoenig.leichtzumer...@gmail.com> wrote:
> > > > >> >> >
> > > > >> >> > Correct, we wouldn't have synchronization between devices with and without user queues any more.
> > > > >> >> >
> > > > >> >> > That could only be a problem for A+I laptops.
> > > > >> >>
> > > > >> >> Since I think you mentioned you'd only be enabling this on newer chipsets, won't it be a problem for A+A where one A is a generation behind the other?
> > > > >> >
> > > > >> > Crap, that is a good point as well.
> > > > >> >
> > > > >> >> I'm not really liking where this is going, btw; it seems like an ill-thought-out concept. If AMD is really going down the road of designing hw that is currently Linux-incompatible, you are going to have to accept a big part of the burden of bringing this support into more than just amd drivers for upcoming generations of gpu.
> > > > >> >
> > > > >> > Well, we don't really like that either, but we have no other option as far as I can see.
> > > > >>
> > > > >> I don't really understand what "future hw may remove support for kernel queues" means exactly. While the per-context queues can be mapped to userspace directly, they don't *have* to be, do they? I.e. the kernel driver should be able to either intercept userspace access to the queues, or in the worst case do it all itself, and provide the existing synchronization semantics as needed?
> > > > >>
> > > > >> Surely there are resource limits for the per-context queues, so the kernel driver needs to do some kind of virtualization / multiplexing anyway, or we'll get sad user faces when there's no queue available for <current hot game>.
> > > > >>
> > > > >> I'm probably missing something though, awaiting enlightenment. :)
> > > > >
> > > > > The hw interface for userspace is that the ring buffer is mapped into the process address space alongside a doorbell aperture (4K page) that isn't real memory; when the CPU writes into it, it tells the hw scheduler that there are new GPU commands in the ring buffer. Userspace inserts all the wait, draw, and signal commands into the ring buffer and then "rings" the doorbell.
> > > > > It's my understanding that the ring buffer and the doorbell are always mapped in the same GPU address space as the process, which makes it very difficult to emulate the current protected ring buffers in the kernel. The VMID of the ring buffer is also not changeable.
> > > >
> > > > The doorbell does not have to be mapped into the process's GPU virtual address space. The CPU could write to it directly. Mapping it into the GPU's virtual address space would, however, allow you to have a device kick off work rather than the CPU. E.g., the GPU could kick off its own work, or multiple devices could kick off work without CPU involvement.
> > > >
> > > > Alex
> > > >
> > > > > The hw scheduler doesn't do any synchronization and it doesn't see any dependencies. It only chooses which queue to execute, so it's really just a simple queue manager handling the virtualization aspect and not much else.
> > > > >
> > > > > Marek
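[As a second aside, for readers unfamiliar with user-mode queues: below is a rough sketch of the submission model Marek and Alex describe above. The structure, packet layout, and doorbell protocol are invented for the example (this is not the real AMD user-queue ABI); the point is only that the ring and the doorbell are mapped straight into the process, so the wait/draw/signal commands and the submission itself never go through the kernel, which is exactly why the kernel cannot prove that a signal command will ever execute unless there is a timeout-based fallback.]

/* Illustrative sketch only: the struct, packet layout, and doorbell protocol
 * below are invented for this example and are not the real AMD user-queue
 * ABI.  It just shows the shape of the interface described above: commands
 * go into a ring buffer mapped into the process, and a doorbell write tells
 * the hardware scheduler that there is new work -- no ioctl involved. */
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

struct user_queue {
    uint32_t          *ring;         /* ring buffer mapped into the process   */
    uint32_t           ring_dwords;  /* ring size in dwords (power of two)    */
    uint32_t           wptr;         /* software copy of the write pointer    */
    volatile uint32_t *doorbell;     /* 4K doorbell aperture; not real memory */
};

/* Hypothetical packet headers for the three command types mentioned above. */
#define PKT_WAIT_FENCE   0x1u   /* wait until a fence value appears in memory  */
#define PKT_DRAW         0x2u   /* some amount of rendering work               */
#define PKT_SIGNAL_FENCE 0x3u   /* write a fence value when the work completes */

static void ring_emit(struct user_queue *q, const uint32_t *pkt, uint32_t ndw)
{
    for (uint32_t i = 0; i < ndw; i++)
        q->ring[(q->wptr + i) & (q->ring_dwords - 1)] = pkt[i];
    q->wptr += ndw;
}

static void user_submit(struct user_queue *q,
                        uint64_t wait_va, uint64_t wait_value,
                        uint64_t signal_va, uint64_t signal_value)
{
    uint32_t wait_pkt[]   = { PKT_WAIT_FENCE,
                              (uint32_t)wait_va, (uint32_t)(wait_va >> 32),
                              (uint32_t)wait_value, (uint32_t)(wait_value >> 32) };
    uint32_t draw_pkt[]   = { PKT_DRAW, 0 /* draw parameters elided */ };
    uint32_t signal_pkt[] = { PKT_SIGNAL_FENCE,
                              (uint32_t)signal_va, (uint32_t)(signal_va >> 32),
                              (uint32_t)signal_value, (uint32_t)(signal_value >> 32) };

    ring_emit(q, wait_pkt,   sizeof(wait_pkt)   / sizeof(uint32_t));
    ring_emit(q, draw_pkt,   sizeof(draw_pkt)   / sizeof(uint32_t));
    ring_emit(q, signal_pkt, sizeof(signal_pkt) / sizeof(uint32_t));

    /* "Ring" the doorbell: a plain CPU store into the doorbell aperture tells
     * the hw scheduler that the ring now contains commands up to wptr.  Note
     * that nothing on this path allocates memory or enters the kernel. */
    atomic_thread_fence(memory_order_release);   /* commands visible before wptr */
    *q->doorbell = q->wptr;
}

int main(void)
{
    /* Stand-ins for the mappings the kernel driver would normally hand out:
     * an ordinary array for the ring and a plain variable for the doorbell. */
    static uint32_t ring[1024];
    static volatile uint32_t fake_doorbell;

    struct user_queue q = {
        .ring = ring, .ring_dwords = 1024, .wptr = 0, .doorbell = &fake_doorbell,
    };

    user_submit(&q, 0x1000, 41, 0x2000, 42);   /* wait for 41, draw, signal 42 */
    printf("doorbell now holds wptr = %u\n", (unsigned)fake_doorbell);
    return 0;
}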