Hi,
On 05.05.2018 03:56, James Xiong wrote:
From: "Xiong, James" <james.xi...@intel.com>
With the current implementation, brw_bufmgr may round up a request
size to the next bucket size, resulting in 25% more memory allocated in
the worst-case scenario. For example:
Request size    Actual size
32KB+1Byte      40KB
.
8MB+1Byte       10MB
.
96MB+1Byte      112MB
This series aligns the buffer size up to the page size instead of a
bucket size to improve memory allocation efficiency.
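As a rough illustration of the difference between the two rounding strategies, here is a minimal sketch. It is not the brw_bufmgr code: the 4096-byte page size, the helper names, and the bucket scheme (each power of two plus quarter steps, never finer than one page) are assumptions chosen only because they reproduce the sizes in the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size; the real value comes from the platform. */
#define SKETCH_PAGE_SIZE ((uint64_t)4096)

/* Page alignment: round size up to the next multiple of the page size. */
static uint64_t page_align_up(uint64_t size)
{
    assert(size > 0);
    return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Hypothetical bucket scheme (an assumption, chosen because it reproduces
 * the table above): bucket sizes are every power of two plus quarter steps
 * up to the next power of two, with steps never smaller than one page.
 */
static uint64_t bucket_round_up(uint64_t size)
{
    uint64_t pow2 = SKETCH_PAGE_SIZE;
    uint64_t step;

    assert(size > 0);
    if (size <= SKETCH_PAGE_SIZE)
        return SKETCH_PAGE_SIZE;

    /* Find pow2 such that pow2 < size <= 2 * pow2. */
    while (pow2 * 2 < size)
        pow2 *= 2;

    step = pow2 / 4;
    if (step < SKETCH_PAGE_SIZE)
        step = SKETCH_PAGE_SIZE;

    /* Round the part above pow2 up to a whole number of steps. */
    return pow2 + (size - pow2 + step - 1) / step * step;
}
```

Under these assumptions, a 32KB+1Byte request becomes 40KB with bucket_round_up() but only 36KB with page_align_up().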
Performance and memory usage were measured on a gen9 platform using
Basemark ES3 and GfxBench 4 and 5, with each test case run 6 times.
Basemark ES3
        score                            peak memory size(KB)
before      after       diff             before   after    diff
max   avg   max   avg   max     avg
22    21    23    21    2.83%   1.21%    409928   395573   -14355
20    20    20    20    0.53%   0.41%
Thanks for the new data!
As the values below seem similar to what you sent earlier, I assume
the tests are listed here in the same order, i.e.:
GfxBench 4.0
>                       score                            peak memory size(KB)
>               before      after       diff             before   after    diff
>               max   avg   max   avg   max     avg
gl_4          > 584   577   586   583   0.45%   1.02%    566489   539699   -26791
manhattan     > 1604  1144  1650  1202  2.81%   4.86%    439220   411596   -27624
gl_trex       > 2711  2222  2718  2152  0.25%   -3.25%   126065   121398   -4667
gl_alu2       > 1218  1213  1212  1154  -0.53%  -5.10%   54153    53868    -285
driver2       > 106   104   106   103   0.85%   -1.66%   12730    12666    -64
gl_4_off      > 728   727   727   726   -0.03%  -0.16%   614730   586794   -27936
manhattan_off > 1732  1709  1740  1728  0.49%   1.11%    475716   447726   -27990
gl_trex_off   > 3051  2969  3066  3047  0.50%   2.55%    154169   148962   -5207
gl_alu2_off   > 2626  2607  2626  2625  0.00%   0.70%    84119    83150    -969
driver2_off   > 211   208   208   205   -1.26%  -1.21%   39924    39667    -257
GfxBench 5.0
>                       score                            peak memory size(KB)
>               before      after       diff             before   after    diff
>               max   avg   max   avg   max     avg
gl_5          > 260   258   259   256   -0.39%  -0.85%   1111037  1013520  -97517
gl_5_off      > 298   295   298   297   0.00%   0.45%    1143593  1040844  -102749
As expected, max gives more stable results than average.
There could be a performance improvement in Manhattan v3.0. At least it
had the largest peak memory usage saving in GfxBench v4, both absolutely &
relatively (6%).
The gl_alu2 onscreen average drop also seems suspiciously large, but as it's
not visible in the max value, in alu2 offscreen, or in your previous test,
I think it's just random variation.
In light of what I know of these tests' variance on TDP-limited devices,
I think the rest of your GfxBench v4 & v5 performance changes also fall
within random variance.
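
For reference, the relative saving quoted for Manhattan can be recomputed from the peak memory columns. A minimal sketch follows; the helper name is hypothetical and the inputs are simply the "before" and "after" values from the table.

```c
#include <stdint.h>

/* Peak-memory saving as a percentage of the "before" value. */
static double saving_percent(uint64_t before_kb, uint64_t after_kb)
{
    return 100.0 * ((double)before_kb - (double)after_kb) / (double)before_kb;
}
```

With the manhattan row (439220 KB before, 411596 KB after) this gives roughly 6.3%, and with the gl_4 row (566489 KB before, 539699 KB after) roughly 4.7%.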
- Eero
Xiong, James (4):
  i965/drm: Reorganize code for the next patch
  i965/drm: Round down buffer size and calculate the bucket index
  i965/drm: Searching for a cached buffer for reuse
  i965/drm: Purge the bucket when its cached buffer is evicted

 src/mesa/drivers/dri/i965/brw_bufmgr.c | 139 ++++++++++++++++++---------------
 src/util/list.h                        |   5 ++
 2 files changed, 79 insertions(+), 65 deletions(-)