On 2021-05-20 4:18 p.m., Daniel Vetter wrote: > On Thu, May 20, 2021 at 10:13:38AM +0200, Michel Dänzer wrote: >> On 2021-05-20 9:55 a.m., Daniel Vetter wrote: >>> On Wed, May 19, 2021 at 5:48 PM Michel Dänzer <mic...@daenzer.net> wrote: >>>> >>>> On 2021-05-19 5:21 p.m., Jason Ekstrand wrote: >>>>> On Wed, May 19, 2021 at 5:52 AM Michel Dänzer <mic...@daenzer.net> wrote: >>>>>> >>>>>> On 2021-05-19 12:06 a.m., Jason Ekstrand wrote: >>>>>>> On Tue, May 18, 2021 at 4:17 PM Daniel Vetter <dan...@ffwll.ch> wrote: >>>>>>>> >>>>>>>> On Tue, May 18, 2021 at 7:40 PM Christian König >>>>>>>> <ckoenig.leichtzumer...@gmail.com> wrote: >>>>>>>>> >>>>>>>>> Am 18.05.21 um 18:48 schrieb Daniel Vetter: >>>>>>>>>> On Tue, May 18, 2021 at 2:49 PM Christian König >>>>>>>>>> <ckoenig.leichtzumer...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> And as long as we are all inside amdgpu we also don't have any >>>>>>>>>>> oversync, >>>>>>>>>>> the issue only happens when we share dma-bufs with i915 (radeon and >>>>>>>>>>> AFAIK nouveau does the right thing as well). >>>>>>>>>> Yeah because then you can't use the amdgpu dma_resv model anymore and >>>>>>>>>> have to use the one atomic helpers use. Which is also the one that >>>>>>>>>> e.g. Jason is threathening to bake in as uapi with his dma_buf ioctl, >>>>>>>>>> so as soon as that lands and someone starts using it, something has >>>>>>>>>> to >>>>>>>>>> adapt _anytime_ you have a dma-buf hanging around. Not just when it's >>>>>>>>>> shared with another device. >>>>>>>>> >>>>>>>>> Yeah, and that is exactly the reason why I will NAK this uAPI change. >>>>>>>>> >>>>>>>>> This doesn't works for amdgpu at all for the reasons outlined above. >>>>>>>> >>>>>>>> Uh that's really not how uapi works. "my driver is right, everyone >>>>>>>> else is wrong" is not how cross driver contracts are defined. If that >>>>>>>> means a perf impact until you've fixed your rules, that's on you. >>>>>>>> >>>>>>>> Also you're a few years too late with nacking this, it's already uapi >>>>>>>> in the form of the dma-buf poll() support. >>>>>>> >>>>>>> ^^ My fancy new ioctl doesn't expose anything that isn't already >>>>>>> there. It just lets you take a snap-shot of a wait instead of doing >>>>>>> an active wait which might end up with more fences added depending on >>>>>>> interrupts and retries. The dma-buf poll waits on all fences for >>>>>>> POLLOUT and only the exclusive fence for POLLIN. It's already uAPI. >>>>>> >>>>>> Note that the dma-buf poll support could be useful to Wayland >>>>>> compositors for the same purpose as Jason's new ioctl (only using client >>>>>> buffers which have finished drawing for an output frame, to avoid >>>>>> missing a refresh cycle due to client drawing), *if* it didn't work >>>>>> differently with amdgpu. >>>>>> >>>>>> Am I understanding correctly that Jason's new ioctl would also work >>>>>> differently with amdgpu as things stand currently? If so, that would be >>>>>> a real bummer and might hinder adoption of the ioctl by Wayland >>>>>> compositors. >>>>> >>>>> My new ioctl has identical semantics to poll(). It just lets you take >>>>> a snapshot in time to wait on later instead of waiting on whatever >>>>> happens to be set right now. IMO, having identical semantics to >>>>> poll() isn't something we want to change. >>>> >>>> Agreed. >>>> >>>> I'd argue then that making amdgpu poll semantics match those of other >>>> drivers is a pre-requisite for the new ioctl, otherwise it seems unlikely >>>> that the ioctl will be widely adopted. >>> >>> This seems backwards, because that means useful improvements in all >>> other drivers are stalled until amdgpu is fixed. >>> >>> I think we need agreement on what the rules are, reasonable plan to >>> get there, and then that should be enough to unblock work in the wider >>> community. Holding the community at large hostage because one driver >>> is different is really not great. >> >> I think we're in violent agreement. :) The point I was trying to make is >> that amdgpu really needs to be fixed to be consistent with other drivers >> ASAP. > > It's not that easy at all. I think best case we're looking at about a one > year plan to get this into shape, taking into account usual release/distro > update latencies. > > Best case. > > But also it's not a really big issue, since this shouldn't stop > compositors from using poll on dma-buf fd or the sync_file stuff from > Jason: The use-case for this in compositors is to avoid a single client > stalling the entire desktop. If a driver lies by not setting the exclusive > fence when expected, you simply don't get this stall avoidance benefit of > misbehaving clients.
That's a good point; I was coming to the same realization. > But also this needs a gpu scheduler and higher priority for the > compositor (or a lot of hw planes so you can composite > with them alone), so it's all fairly academic issue. I went ahead and implemented this for mutter: https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1880 Works as intended on my work laptop with Intel GPU, so it's not just academic. :) I hope this can serve as motivation for providing the same poll semantics (and a higher priority GFX queue exposed via EGL_IMG_context_priority) in amdgpu as well. -- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and X developer