On June 15, 2018 01:14:24 Michel Dänzer <mic...@daenzer.net> wrote:
On 2018-06-15 07:31 AM, Jason Ekstrand wrote:
On Thu, Jun 14, 2018 at 10:55 AM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
On June 14, 2018 01:43:12 Michel Dänzer <mic...@daenzer.net> wrote:
On 2018-06-13 10:26 PM, Jason Ekstrand wrote:
The current BO cache puts BOs back into the recycle bucket the moment the
refcount hits zero. If the BO is busy, we just don't re-use it until it
isn't or we re-use it for a render target which we assume will be used
first for drawing. This patch series reworks the way the BO cache works a
bit so that we don't ever recycle a busy BO. On the down side, it means
that we don't get the "keep busy BOs busy" heuristic (which we have no
proof actually helps). On the up side, we can now easily use an MRU
heuristic instead of round-robin for all buffers and not just the busy
ones. Will this be an improvement, a regression or a wash? I don't know
but I doubt it will have a major effect one way or another.
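
For illustration only, the idle-only recycling with an MRU pick described above might look roughly like the Python sketch below. All names here (`BO`, `IdleOnlyBOCache`, `busy`, `pending`) are hypothetical and not taken from the i965 code, and the `busy` flag is just a stand-in for whatever busy query the kernel provides.

```python
from collections import deque


class BO:
    """Hypothetical buffer object; `busy` stands in for a kernel busy query."""

    def __init__(self, size):
        self.size = size
        self.busy = False


class IdleOnlyBOCache:
    """Sketch of a cache that never hands out a BO the GPU is still using."""

    def __init__(self):
        self.buckets = {}       # size -> idle BOs, most recently freed last
        self.pending = deque()  # BOs released while still busy

    def release(self, bo):
        # Busy BOs wait in `pending`; only idle ones enter a recycle bucket.
        if bo.busy:
            self.pending.append(bo)
        else:
            self.buckets.setdefault(bo.size, []).append(bo)

    def reap(self):
        # Move BOs that have gone idle since release into their buckets.
        still_busy = deque()
        while self.pending:
            bo = self.pending.popleft()
            if bo.busy:
                still_busy.append(bo)
            else:
                self.buckets.setdefault(bo.size, []).append(bo)
        self.pending = still_busy

    def alloc(self, size):
        self.reap()
        bucket = self.buckets.get(size)
        if bucket:
            return bucket.pop()  # MRU: take the most recently freed idle BO
        return BO(size)          # cache miss: stands in for a kernel allocation
```

With this shape, a workload that frees and immediately reallocates buffers of the same size (like the temporary FBOs mentioned later in the thread) misses the cache for as long as the freed BOs are still busy.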
FWIW, I suspect this could be a significant loss with overlapping copies
in glamor (e.g. x11perf -copywinwin500), because it won't be able to
reuse the busy BOs anymore (glamor creates a temporary FBO for each
overlapping copy).
That's rather horrific... That seems like something glamor could do
better.
As of xserver 1.20, glamor can use GL_MESA_tile_raster_order if
available.
How common are overlapping copies in practice? Are we talking a
couple per frame or hundreds?
X doesn't have a "frame" concept per se, but overlapping copies can be
quite common e.g. when scrolling, or moving windows without a
compositor.
I did some testing and x11perf -copywinwin500 is... exactly the same with
or without my patches. If anything they might improve it by just a hair.
Possible explanations I can think of:
1. Your glamor still has its own FBO cache. Which version of xserver are
you testing with?
1.19 I think
2. The i965 driver cache isn't hit even before these changes.
It's definitely getting hit in both cases, it just may require a slightly
larger cache if we aren't recycling BOs until they're idle.
3. Allocating BOs from the kernel is significantly cheaper with i915 vs
amdgpu.
(4. Your GPU is too slow for it to matter. What kind of numbers are you
getting?)
That's entirely possible.
FWIW, on a Radeon R9 285 I get
360000 trep @ 0.0257 msec ( 38900.0/sec): Copy 500x500 from window to window
with glamor's FBO cache and
240000 trep @ 0.0700 msec ( 14300.0/sec): Copy 500x500 from window to window
without (radeonsi's cache doesn't reclaim BOs either until they are
idle), i.e. almost a factor of 3.
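
As a quick check of that ratio, here is a small Python sketch using only the two x11perf figures quoted above (the helper name `rate_per_sec` is made up for this example):

```python
def rate_per_sec(msec_per_rep):
    """Convert x11perf's per-repetition time (msec) into repetitions per second."""
    return 1000.0 / msec_per_rep


with_fbo_cache = 38900.0     # reps/sec reported with glamor's FBO cache
without_fbo_cache = 14300.0  # reps/sec reported without it

speedup = with_fbo_cache / without_fbo_cache
print(f"{speedup:.2f}x")
```

The reported per-repetition times agree with the reported rates to within x11perf's rounding.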
I was getting about 6200 on the laptop I was testing with. That's about 1/5
of the speed you're seeing so maybe it just isn't mattering. Still, if
that's the kind of drop you're seeing, then maybe we should reconsider.
--Jason
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev