On Thu, May 20, 2021 at 9:04 PM Jason Ekstrand <ja...@jlekstrand.net> wrote:
>
> On Thu, May 20, 2021 at 12:23 PM Jason Ekstrand <ja...@jlekstrand.net> wrote:
> >
> > On Thu, May 20, 2021 at 5:50 AM Christian König
> > <ckoenig.leichtzumer...@gmail.com> wrote:
> > >
> > > > On 20.05.21 at 09:55, Daniel Vetter wrote:
> > > > On Wed, May 19, 2021 at 5:48 PM Michel Dänzer <mic...@daenzer.net> 
> > > > wrote:
> > > >> On 2021-05-19 5:21 p.m., Jason Ekstrand wrote:
> > > >>> On Wed, May 19, 2021 at 5:52 AM Michel Dänzer <mic...@daenzer.net> 
> > > >>> wrote:
> > > >>>> On 2021-05-19 12:06 a.m., Jason Ekstrand wrote:
> > > >>>>> On Tue, May 18, 2021 at 4:17 PM Daniel Vetter <dan...@ffwll.ch> 
> > > >>>>> wrote:
> > > >>>>>> On Tue, May 18, 2021 at 7:40 PM Christian König
> > > >>>>>> <ckoenig.leichtzumer...@gmail.com> wrote:
> > > >>>>>>>> On 18.05.21 at 18:48, Daniel Vetter wrote:
> > > >>>>>>>> On Tue, May 18, 2021 at 2:49 PM Christian König
> > > >>>>>>>> <ckoenig.leichtzumer...@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> And as long as we are all inside amdgpu we also don't have any 
> > > >>>>>>>>> oversync,
> > > >>>>>>>>> the issue only happens when we share dma-bufs with i915 (radeon 
> > > >>>>>>>>> and
> > > >>>>>>>>> AFAIK nouveau does the right thing as well).
> > > >>>>>>>> Yeah because then you can't use the amdgpu dma_resv model 
> > > >>>>>>>> anymore and
> > > >>>>>>>> have to use the one atomic helpers use. Which is also the one 
> > > >>>>>>>> that
> > > >>>>>>>> e.g. Jason is threatening to bake in as uapi with his dma_buf 
> > > >>>>>>>> ioctl,
> > > >>>>>>>> so as soon as that lands and someone starts using it, something 
> > > >>>>>>>> has to
> > > >>>>>>>> adapt _anytime_ you have a dma-buf hanging around. Not just when 
> > > >>>>>>>> it's
> > > >>>>>>>> shared with another device.
> > > >>>>>>> Yeah, and that is exactly the reason why I will NAK this uAPI 
> > > >>>>>>> change.
>
> I just re-sent my dma-buf sync_file import/export series.  Assuming we
> can sort out what implicit sync looks like on the inside of dma-buf,
> would that alleviate some of your uAPI fears?  The idea would be that
> radeonsi and RADV would use amdgpu explicit sync primitives for
> everything and then, at the very end, fetch a sync_file and stuff it
> in the dma-buf's implicit sync container.  No nasty new uAPI for you.
> We still get implicit sync.  Everyone wins?
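>
> To make that concrete, here's a rough sketch of the userspace side,
> using the provisional names from my series (the ioctl numbers, struct
> layout, and flags may all still change before anything lands):
>
>     /* Provisional uAPI from the sync_file import/export series. */
>     #include <sys/ioctl.h>
>     #include <linux/dma-buf.h>  /* DMA_BUF_BASE, DMA_BUF_SYNC_* */
>
>     struct dma_buf_sync_file {
>             __u32 flags;  /* DMA_BUF_SYNC_READ and/or DMA_BUF_SYNC_WRITE */
>             __s32 fd;     /* sync_file fd: in for import, out for export */
>     };
>
>     #define DMA_BUF_IOCTL_IMPORT_SYNC_FILE \
>             _IOW(DMA_BUF_BASE, 3, struct dma_buf_sync_file)
>
>     /* At the end of the frame, take the fence we got back from the
>      * driver's explicit sync primitives and stuff it into the dma-buf
>      * as a write fence, so implicit-sync consumers (e.g. a compositor)
>      * will wait on it. */
>     static int attach_render_fence(int dmabuf_fd, int sync_file_fd)
>     {
>             struct dma_buf_sync_file arg = {
>                     .flags = DMA_BUF_SYNC_WRITE,
>                     .fd = sync_file_fd,
>             };
>             return ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, &arg);
>     }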

You still need the implicit fencing opt-out, which currently amdgpu
lacks completely.
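
For reference, i915's flavor of that opt-out is the per-object
EXEC_OBJECT_ASYNC flag on execbuf, roughly like this (bo_handle is just
a stand-in for the GEM handle in question):

    #include <drm/i915_drm.h>

    /* Opt this BO out of implicit sync for this submission; userspace
     * promises to order access with explicit fences instead. */
    struct drm_i915_gem_exec_object2 obj = {
            .handle = bo_handle,
            .flags  = EXEC_OBJECT_ASYNC,
    };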

But I also thought through the security implications of the patch set
(including the exclusive injection patch 4), and I think even with
current amdgpu that's perfectly fine. Not very useful, since the fences
you get out don't accurately reflect status, but that's not a
correctness/security issue. You'll simply hit stalls when you don't
expect them, because the kernel is allowed to throw random other
exclusive fences in whenever it feels like it.

> Of course, that still leaves the question of what read and write
> fences are, what they mean, and where they go in the dma_resv.  But
> I'm trying to separate problems here.

Yeah, I'll dump my patch set clarifying the status quo tomorrow for that.
-Daniel

>
> --Jason
>
>
> > > >>>>>>> This doesn't work for amdgpu at all for the reasons outlined 
> > > >>>>>>> above.
> > > >>>>>> Uh, that's really not how uapi works. "My driver is right, everyone
> > > >>>>>> else is wrong" is not how cross-driver contracts are defined. If 
> > > >>>>>> that
> > > >>>>>> means a perf impact until you've fixed your rules, that's on you.
> > > >>>>>>
> > > >>>>>> Also, you're a few years too late with nacking this; it's already 
> > > >>>>>> uapi
> > > >>>>>> in the form of the dma-buf poll() support.
> > > >>>>> ^^  My fancy new ioctl doesn't expose anything that isn't already
> > > >>>>> there.  It just lets you take a snapshot of a wait instead of doing
> > > >>>>> an active wait, which might end up with more fences added depending
> > > >>>>> on interrupts and retries.  The dma-buf poll waits on all fences for
> > > >>>>> POLLOUT and only the exclusive fence for POLLIN.  It's already uAPI.
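> > > >>>>>
> > > >>>>> In code, the existing uAPI is just poll(2) on the dma-buf fd; a
> > > >>>>> zero timeout makes it a non-blocking check:
> > > >>>>>
> > > >>>>>     #include <poll.h>
> > > >>>>>
> > > >>>>>     /* Returns > 0 if the buffer is idle for the given events,
> > > >>>>>      * 0 if fences are still pending.  POLLIN checks only the
> > > >>>>>      * exclusive fence, POLLOUT checks all fences. */
> > > >>>>>     static int dmabuf_is_idle(int dmabuf_fd, short events)
> > > >>>>>     {
> > > >>>>>             struct pollfd pfd = { .fd = dmabuf_fd, .events = events };
> > > >>>>>             return poll(&pfd, 1, 0);
> > > >>>>>     }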
> > > >>>> Note that the dma-buf poll support could be useful to Wayland 
> > > >>>> compositors for the same purpose as Jason's new ioctl (only using 
> > > >>>> client buffers which have finished drawing for an output frame, to 
> > > >>>> avoid missing a refresh cycle due to client drawing), *if* it didn't 
> > > >>>> work differently with amdgpu.
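> > > >>>>
> > > >>>> E.g. something like this at the frame deadline (use_previous_buffer
> > > >>>> is just a stand-in for whatever the compositor does instead):
> > > >>>>
> > > >>>>     #include <poll.h>
> > > >>>>
> > > >>>>     struct pollfd pfd = { .fd = dmabuf_fd, .events = POLLIN };
> > > >>>>     if (poll(&pfd, 1, 0) == 0)
> > > >>>>             use_previous_buffer();  /* client still drawing */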
> > > >>>>
> > > >>>> Am I understanding correctly that Jason's new ioctl would also work 
> > > >>>> differently with amdgpu as things stand currently? If so, that would 
> > > >>>> be a real bummer and might hinder adoption of the ioctl by Wayland 
> > > >>>> compositors.
> > > >>> My new ioctl has identical semantics to poll().  It just lets you take
> > > >>> a snapshot in time to wait on later instead of waiting on whatever
> > > >>> happens to be set right now.  IMO, having identical semantics to
> > > >>> poll() isn't something we want to change.
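> > > >>>
> > > >>> In terms of the series, the snapshot is just the export side; again,
> > > >>> provisional names, see the patches for the real thing, and
> > > >>> wait_later_on is a placeholder:
> > > >>>
> > > >>>     /* Snapshot the fences a reader would have to wait on right now.
> > > >>>      * The returned sync_file is immutable: no fences added to the
> > > >>>      * dma-buf later can sneak into it. */
> > > >>>     struct dma_buf_sync_file arg = { .flags = DMA_BUF_SYNC_READ };
> > > >>>
> > > >>>     if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &arg) == 0)
> > > >>>             wait_later_on(arg.fd);  /* e.g. poll() the sync_file fd */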
> > > >> Agreed.
> > > >>
> > > >> I'd argue then that making amdgpu poll semantics match those of other 
> > > >> drivers is a prerequisite for the new ioctl; otherwise it seems 
> > > >> unlikely that the ioctl will be widely adopted.
> > > > This seems backwards, because that means useful improvements in all
> > > > other drivers are stalled until amdgpu is fixed.
> > >
> > > Well, there is nothing to fix in amdgpu; what we need is to come up
> > > with a DMA-buf implicit syncing model which works for everyone.
> > >
> > > I pointed this problem out at FOSDEM roughly 6 years ago, before
> > > DMA-buf was even merged upstream and way before amdgpu even existed.
> > > And the response was: yeah, maybe we need to look at this as well.
> > >
> > > Over the years I've now mentioned at least 5 times that this isn't
> > > going to work in some situations, and I've come up with different
> > > approaches for how to fix it.
> > >
> > > And you still have the nerve to tell me that this isn't a problem and
> > > we should fix amdgpu instead? Sorry, but I'm really running out of
> > > ideas for how to explain why this isn't working for everybody.
> >
> > I'm trying really hard not to fuel a flame war here, but I tend to lean
> > in Daniel's direction on this.  Stepping back from the individual needs
> > of amdgpu and looking at things from the PoV of Linux as a whole, AMD
> > being a special snowflake here is bad.  I think we have two problems:
> > amdgpu doesn't play by the established rules, and the rules don't work
> > well for amdgpu.  We need to solve BOTH problems.  Does that mean we
> > need to smash something into amdgpu to force it into the dma-buf model
> > today?  Maybe not; stuff's working well enough, I guess.  But we can't
> > just rewrite all the rules and break everyone else either.
> >
> > > That amdgpu wants to be special is true, but it is a fundamental problem
> > > that we designed the implicit sync in DMA-buf only around the needs
> > > of the DRM drivers of that time, instead of taking a step back and
> > > asking: hey, what would be an approach which works for everyone?
> >
> > How else was it supposed to be designed?  Based on the needs of
> > non-existent future drivers?  That's just not fair.  We (Intel) are
> > being burned by various aspects of dma-buf these days too.  It does no
> > good to blame past developers or our past selves for not knowing the
> > future.  It sucks but it's what we have.  And, to move forward, we
> > need to fix it.  Let's do that.
> >
> > My concern with the flags approach, as I'm beginning to digest it, is
> > that it's a bit too much of an attempt to rewrite history for my
> > liking.  What do I mean by that?  I mean that any solution we come up
> > with needs to ensure that legacy drivers and modern drivers can play
> > nicely together.  Either that or we need to modernize all the users of
> > dma-buf implicit sync.  I really don't like the "as long as AMD+Intel
> > works, we're good" approach.
> >
> > > You just need to apply my example from FOSDEM, with multiple ring
> > > buffers in a single BO, to the DMA-buf implicit sync model and you'll
> > > immediately see how it falls apart.
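> > >
> > > Spelled out (with hypothetical helpers, just to illustrate the
> > > structure): two engines each get their own ring sub-allocated from
> > > one BO, the ranges never overlap, and per-BO implicit sync
> > > serializes them anyway:
> > >
> > >     /* one BO, two independent rings sub-allocated from it */
> > >     void *ring_a = bo_map(bo, 0,        size / 2);  /* engine A only */
> > >     void *ring_b = bo_map(bo, size / 2, size / 2);  /* engine B only */
> > >
> > >     submit(engine_a, ring_a);  /* fence attached to the whole BO */
> > >     submit(engine_b, ring_b);  /* forced to wait on engine A's fence,
> > >                                 * although the byte ranges are disjoint */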
> > >
> > > > I think we need agreement on what the rules are, a reasonable plan to
> > > > get there, and then that should be enough to unblock work in the wider
> > > > community. Holding the community at large hostage because one driver
> > > > is different is really not great.
> > >
> > > Well, forcing a driver into a synchronization model that isn't ideal
> > > for its hardware isn't great either.
> >
> > As I said above, we're feeling the pain too.
> >
> > --Jason



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
