On 06.12.2017 08:01, James Jones wrote:
On 12/01/2017 10:34 AM, Nicolai Hähnle wrote:
On 01.12.2017 18:09, Nicolai Hähnle wrote:
[snip]
As for the actual transition API, I accept that some metadata may be
required, and the metadata probably needs to depend on the memory layout,
which is often vendor-specific. But even linear layouts need some
transitions for caches. We probably need at least some generic
"off-device usage" bit.
I've started thinking of "cached" as a capability with a transition. I
think that helps. Maybe it needs to somehow be more specific (i.e. if
you have two devices, each with its own cache and no coherency
between the two).
As I wrote above, I'd prefer not to think of "cached" as a capability
at least for radeonsi.
From the desktop perspective, I would say let's ignore caches: the
drivers know which caches they need to flush to make data visible to
other devices on the system.
On the other hand, there are probably SoC cases where non-coherent
caches are shared between some but not all devices, and in that case
perhaps we do need to communicate this.
So perhaps we should have two kinds of "capabilities".
The first, like framebuffer compression, is a capability of the
allocated memory layout (because the compression requires a meta
surface), and devices that expose it may opportunistically use it.
The second, like caches, is a capability that the device/driver will
use whether or not you want it to, but that other devices/drivers
don't need to be aware of.
So then you could theoretically have a system that gives you:
GPU: FOO/tiled(layout-caps=FOO/cc, dev-caps=FOO/gpu-cache)
Display: FOO/tiled(layout-caps=FOO/cc)
Video: FOO/tiled(dev-caps=FOO/vid-cache)
Camera: FOO/tiled(dev-caps=FOO/vid-cache)
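
To make the split above concrete, here is a minimal sketch of how an
allocator could treat the two kinds of capabilities. All names
(DeviceCaps, shared_layout_caps, SYSTEM) are hypothetical and not part
of any existing allocator API. The assumption is that layout caps must
be agreed on by every device sharing a buffer, while dev caps stay
private to the device that reports them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceCaps:
    """Hypothetical per-device capability set for one base layout."""
    layout: str
    layout_caps: frozenset = frozenset()  # properties of the memory layout
    dev_caps: frozenset = frozenset()     # private to the device/driver


def shared_layout_caps(devices):
    """Return the layout caps every device supports.

    dev_caps are deliberately ignored: other devices do not need to
    know about them.
    """
    devices = list(devices)
    if not devices:
        return frozenset()
    if len({d.layout for d in devices}) != 1:
        raise ValueError("devices do not agree on a base layout")
    common = devices[0].layout_caps
    for dev in devices[1:]:
        common &= dev.layout_caps
    return common


# The example system from the text, written out as data.
SYSTEM = {
    "GPU": DeviceCaps("FOO/tiled", frozenset({"FOO/cc"}),
                      frozenset({"FOO/gpu-cache"})),
    "Display": DeviceCaps("FOO/tiled", frozenset({"FOO/cc"})),
    "Video": DeviceCaps("FOO/tiled", dev_caps=frozenset({"FOO/vid-cache"})),
    "Camera": DeviceCaps("FOO/tiled", dev_caps=frozenset({"FOO/vid-cache"})),
}
```

With this model, a buffer shared between GPU and Display could keep
FOO/cc, while one shared between GPU and Video would fall back to plain
FOO/tiled, and neither result depends on anyone's cache.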
[snip]
FWIW, I think all that stuff about different caches quite likely
over-complicates things. At the end of each "command submission" of
whichever type of engine, the buffer must be in a state where the
kernel is free to move it around for memory management purposes. This
already puts a big constraint on the kind of (non-coherent) caches
that can be supported anyway, so I wouldn't be surprised if we could
get away with a *much* simpler approach.
I'd rather not depend on this type of cleverness if possible. Other
kernels/OSes may not behave this way, and I'd like the allocator
mechanism to be something we can use across all or at least most of the
POSIX and POSIX-like OSes we support. Also, this particular example is
not true of our proprietary Linux driver, and I suspect it won't always
be the case for other drivers. If a particular driver or OS fits this
assumption, the driver is always free to return no-op transitions in
that case.
Agreed.
(What I wrote about memory management should be true for all systems,
but the kernel could use an engine that goes through the relevant caches
for memory management-related buffer moves. It just so happens that it
doesn't do that on our hardware, but that's by no means universal.)
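
As a rough illustration of the "no-op transitions" point, the sketch
below shows two hypothetical driver hooks behind the same interface.
The function names and the string-based operation list are assumptions
made up for this example and do not correspond to any real driver or
allocator API. One driver relies on its kernel already leaving buffers
in a movable state and returns nothing; the other has to flush and
invalidate explicitly.

```python
from typing import Callable, List

# A transition is modelled as an ordered list of operation names.
Transition = List[str]
DriverHook = Callable[[str, str], Transition]


def noop_driver(old_usage: str, new_usage: str) -> Transition:
    """Driver whose kernel/OS already makes data visible: nothing to do."""
    return []


def flushing_driver(old_usage: str, new_usage: str) -> Transition:
    """Driver with a non-coherent cache that must be handled explicitly."""
    if old_usage == new_usage:
        return []
    return [f"flush-cache:{old_usage}", f"invalidate-cache:{new_usage}"]


def plan_transition(driver: DriverHook, old_usage: str,
                    new_usage: str) -> Transition:
    """The caller treats every driver the same; an empty list is a no-op."""
    return driver(old_usage, new_usage)
```

The caller never needs to know which case it is in, which is what lets
the same mechanism work on systems where the assumption about command
submission does not hold.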
Cheers,
Nicolai
--
Learn how the world really is,
but never forget how it ought to be.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev