Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal

Christian König Tue, 27 Apr 2021 05:06:09 -0700

Correct, we wouldn't have synchronization between devices with and without user queues any more.


That could only be a problem for A+I laptops.

Memory management will just work with preemption fences which pause the user queues of a process before evicting something. That will be a dma_fence, but also a well-known approach.
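
A rough sketch of that ordering, as a toy Python model rather than kernel code. All names here (`UserQueue`, `Process`, `PreemptFence`, `evict`, `resume`) are hypothetical and do not correspond to real amdgpu/amdkfd structures; the only point is the sequence: waiting on the fence pauses the process's queues, the fence signals once they are paused, and only then is the buffer moved.

```python
from dataclasses import dataclass, field


@dataclass
class UserQueue:
    """Hypothetical user-mode queue; 'running' means userspace can submit."""
    name: str
    running: bool = True


@dataclass
class Process:
    """Hypothetical per-process state: its user queues and resident buffers."""
    queues: list
    resident: set = field(default_factory=set)


class PreemptFence:
    """Toy preemption fence: waiting on it pauses the owner's queues first."""

    def __init__(self, owner):
        self.owner = owner
        self.signaled = False

    def wait(self):
        # Waiting is what triggers the preemption; the fence signals only
        # once every queue of the owning process is paused.
        for q in self.owner.queues:
            q.running = False
        self.signaled = True


def evict(proc, buf):
    """Evict 'buf' only after the process's queues have been paused."""
    fence = PreemptFence(proc)
    fence.wait()
    proc.resident.discard(buf)
    return fence


def resume(proc):
    """Restart the queues once memory management is done."""
    for q in proc.queues:
        q.running = True
```

Because the fence always signals in bounded time (the kernel controls the preemption), it stays a well-behaved dma_fence even though the work on the queues themselves is unbounded.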


Christian.

On 27.04.21 at 13:49, Marek Olšák wrote:

If we don't use future fences for DMA fences at all, e.g. we don't use them for memory management, it can work, right? Memory management can suspend user queues anytime. It doesn't need to use DMA fences. There might be something that I'm missing here.

What would we lose without DMA fences? Just inter-device synchronization? I think that might be acceptable.

The only case when the kernel will wait on a future fence is before a page flip. Everything today already depends on userspace not hanging the GPU, which makes everything a future fence.
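
To make the page-flip case concrete, here is a toy Python model, not kernel code. It assumes a future fence is just a value that userspace promises to advance, and that the waiter gives up after a timeout; `ValueFence`, `page_flip`, the `>=` comparison and the timeout value are all illustrative assumptions.

```python
import threading


class ValueFence:
    """Toy userspace fence: signalled once the stored value reaches a target."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def signal(self, value):
        # Userspace (or the GPU on its behalf) advances the value.
        with self._cond:
            self._value = max(self._value, value)
            self._cond.notify_all()

    def wait(self, target, timeout):
        # Returns True if the value reached 'target' within 'timeout' seconds.
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= target, timeout)


def page_flip(fence, target, timeout=0.05):
    """Flip only if the frame's fence signals in time; otherwise skip it."""
    if fence.wait(target, timeout):
        return "flipped"
    return "timed out"
```

Nothing in this model guarantees that `signal` is ever called, which is exactly what makes it a future fence: the waiter can only bound its own wait, not the signaller's behavior.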


Marek

On Tue., Apr. 27, 2021, 04:02 Daniel Vetter <[email protected]> wrote:


    On Mon, Apr 26, 2021 at 04:59:28PM -0400, Marek Olšák wrote:
    > Thanks everybody. The initial proposal is dead. Here are some thoughts on
    > how to do it differently.
    >
    > I think we can have direct command submission from userspace via
    > memory-mapped queues ("user queues") without changing window systems.
    >
    > The memory management doesn't have to use GPU page faults like HMM.
    > Instead, it can wait for user queues of a specific process to go idle and
    > then unmap the queues, so that userspace can't submit anything. Buffer
    > evictions, pinning, etc. can be executed when all queues are unmapped
    > (suspended). Thus, no BO fences and page faults are needed.
    >
    > Inter-process synchronization can use timeline semaphores. Userspace will
    > query the wait and signal value for a shared buffer from the kernel. The
    > kernel will keep a history of those queries to know which process is
    > responsible for signalling which buffer. There is only the wait-timeout
    > issue and how to identify the culprit. One of the solutions is to have the
    > GPU send all GPU signal commands and all timed out wait commands via an
    > interrupt to the kernel driver to monitor and validate userspace behavior.
    > With that, it can be identified whether the culprit is the waiting process
    > or the signalling process and which one. Invalid signal/wait parameters can
    > also be detected. The kernel can force-signal only the semaphores that time
    > out, and punish the processes which caused the timeout or used invalid
    > signal/wait parameters.
    >
    > The question is whether this synchronization solution is robust enough for
    > dma_fence and whatever the kernel and window systems need.
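
The bookkeeping proposed in the quoted text could look roughly like the following toy Python model. This is only a sketch under stated assumptions: `SyncMonitor` and its methods are hypothetical names, a timeline point is modelled as a `(buffer, value)` pair, and "punishing" a process is reduced to returning its pid.

```python
class SyncMonitor:
    """Toy kernel-side bookkeeping for timeline semaphore points."""

    def __init__(self):
        self.signallers = {}   # (buffer, value) -> pid that promised to signal
        self.signaled = set()  # (buffer, value) points already signalled

    def query_signal(self, pid, buffer, value):
        # Record which process took responsibility for this point.
        self.signallers[(buffer, value)] = pid

    def on_signal(self, pid, buffer, value):
        # Reject signals from a process that never queried this point.
        if self.signallers.get((buffer, value)) != pid:
            return False
        self.signaled.add((buffer, value))
        return True

    def on_wait_timeout(self, waiter_pid, buffer, value):
        # Force-signal only the point that timed out, then name the culprit.
        point = (buffer, value)
        if point in self.signaled:
            return None  # raced with a late signal; nobody to blame
        self.signaled.add(point)
        # No registered signaller means the wait itself was invalid.
        return self.signallers.get(point, waiter_pid)
```

The model only identifies whom to blame after a timeout has fired; it does nothing to prevent the wait from stalling in the first place.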

    The proper model here is the preempt-ctx dma_fence that amdkfd uses
    (without page faults). That means dma_fence for synchronization is DOA, at
    least as-is, and we're back to figuring out the winsys problem.

    "We'll solve it with timeouts" is very tempting, but doesn't work. It's
    akin to saying that we're solving deadlock issues in a locking design by
    doing a global s/mutex_lock/mutex_lock_timeout/ in the kernel. Sure it
    avoids having to reach the reset button, but that's about it.

    And the fundamental problem is that once you throw in userspace command
    submission (and syncing, at least within the userspace driver, otherwise
    there's kinda no point if you still need the kernel for cross-engine sync),
    you get deadlocks if you still use dma_fence for sync under perfectly legit
    use-cases. We've discussed that one ad nauseam last summer:

    
    https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_fence#indefinite-dma-fences

    See silly diagram at the bottom.

    Now I think all isn't lost, because imo the first step to getting to this
    brave new world is rebuilding the driver on top of userspace fences, and
    with the adjusted cmd submit model. You probably don't want to use amdkfd,
    but port that as a context flag or similar to render nodes for gl/vk. Of
    course that means you can only use this mode in headless, without
    glx/wayland winsys support, but it's a start.
    -Daniel

    >
    > Marek
    >
    > On Tue, Apr 20, 2021 at 4:34 PM Daniel Stone <[email protected]> wrote:
    >
    > > Hi,
    > >
    > > On Tue, 20 Apr 2021 at 20:30, Daniel Vetter <[email protected]> wrote:
    > >
    > >> The thing is, you can't do this in drm/scheduler. At least not without
    > >> splitting up the dma_fence in the kernel into separate memory fences
    > >> and sync fences
    > >
    > >
    > > I'm starting to think this thread needs its own glossary ...
    > >
    > > I propose we use 'residency fence' for execution fences which enact
    > > memory-residency operations, e.g. faulting in a page ultimately depending
    > > on GPU work retiring.
    > >
    > > And 'value fence' for the pure-userspace model suggested by timeline
    > > semaphores, i.e. fences being (*addr == val) rather than being able to look
    > > at ctx seqno.
    > >
    > > Cheers,
    > > Daniel
    > > _______________________________________________
    > > mesa-dev mailing list
    > > [email protected]
    > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
    > >

    --
    Daniel Vetter
    Software Engineer, Intel Corporation
    http://blog.ffwll.ch


_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
