On 05/03/2017 10:37 PM, Brian Paul wrote:
On 05/01/2017 10:03 AM, Brian Paul wrote:
On 05/01/2017 08:32 AM, Brian Paul wrote:
On 04/28/2017 05:12 PM, Marek Olšák wrote:
Hi,
This series shrinks various gallium structures and removes
set_index_buffer in order to decrease CPU overhead.
PART 1: Performance results
All testing below was done with radeonsi, and I used the drawoverhead
microbenchmark from mesa/demos ported to piglit and using GL 3.0
Compat and GL 3.2 Core (same GL states in both contexts).
1) Performance difference for the removal of set_index_buffer only:
Compat: DrawElements: 5.1 -> 5.3 million draws/second
Core: DrawElements: 5.1 -> 5.5 million draws/second
The result is better for the core profile where u_vbuf is disabled.
2) Performance difference with all 4 patches (Core profile only):
DrawArrays: 8.3 -> 8.5 million draws/second
DrawElements: 5.2 -> 5.8 million draws/second
3) Performance difference with threaded Gallium (Core profile only):
DrawElements: 5.9 -> 7.1 million draws/second
Threaded Gallium is still work in progress and might require
a non-trivial amount of driver work.
PART 2: Call for testing
These drivers have been tested:
- ddebug
- llvmpipe
- r300 (also with SWTCL)
- r600
- radeonsi
- softpipe
- trace
These drivers need testing:
- etnaviv
- freedreno
- nv30
- nv50
- nvc0
- svga
- swr
- vc4
- virgl
The following state trackers might need testing:
- nine
You can get the patches by fetching:
git://people.freedesktop.org/~mareko/mesa gallium-cleanup
I'd like to ask you to test the drivers that I couldn't test.
Please let me know when you're done testing and if things are good.
After that, I'll push everything assuming the code review goes well.
You can also ignore this if you don't mind fixing your driver in
the master branch later.
With our VMware driver there's a whole bunch of clipflat failures. I'll
try to see if it's something simple, otherwise, it may take a day or two
to look closer.
I think the attached patch fixes things (it should be merged with 3/4).
I need to do another full piglit run, at least...
I see two more regressions from master:
spec/nv_primitive_restart/primitive-restart-vbo_combined_vertex_and_index
and
spec/!opengl 3.1/primitive-restart-xfb flush: pass fail
I spent a little time debugging without success. I'm not sure when I'll
have time to resume. I guess you can push the changes anyway and I'll
try to patch things up later.
OK, I think I found the problem. The following patch fixes the two
regressions:
diff --git a/src/gallium/auxiliary/util/u_prim_restart.c b/src/gallium/auxiliary/util/u_prim_restart.c
index b7675fa..9ff93a7 100644
--- a/src/gallium/auxiliary/util/u_prim_restart.c
+++ b/src/gallium/auxiliary/util/u_prim_restart.c
@@ -227,7 +227,7 @@ util_draw_vbo_without_prim_restart(struct pipe_context *context,
} \
}
- start = info->start;
+ start = 0;
count = 0;
switch (info->index_size) {
case 1:
I believe this is a long-standing bug in this function which just
happened to get exposed by the index buffer change. Previously,
info->start was zero but the index buffer offset was non-zero. Now,
info->start is non-zero because the index buffer offset has gone away.
So on the first iteration of the scanning loop, we compute info->start +
start to get the first sub-primitive location. That wound up being
2*info->start which is clearly wrong.
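
To make the double-counting concrete, here is a minimal standalone sketch of the scan. It is not the actual u_prim_restart.c code: struct range, scan_ranges() and the initial_start parameter are hypothetical names made up for illustration, and I'm assuming the scan walks indices relative to info->start and adds info->start back when it records each sub-range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one direct draw range; not a Mesa type. */
struct range {
   unsigned start;
   unsigned count;
};

/*
 * Simplified model of the scanning loop (an assumption, not Mesa code).
 * 'indices' points at the first index to draw, i.e. element info_start of
 * the index buffer, so 'i' and the running 'start' are relative to it.
 * 'initial_start' is the value 'start' holds before the scan begins:
 * passing info_start models the old code, passing 0 models the patch.
 */
static size_t
scan_ranges(const unsigned short *indices, unsigned info_start,
            unsigned info_count, unsigned restart_index,
            unsigned initial_start, struct range *out, size_t max_ranges)
{
   unsigned start = initial_start;
   unsigned count = 0;
   size_t n = 0;

   for (unsigned i = 0; i <= info_count; i++) {
      if (i == info_count || indices[i] == restart_index) {
         /* Cut here: record the sub-primitive collected so far. */
         if (count > 0) {
            assert(n < max_ranges);
            out[n].start = info_start + start;
            out[n].count = count;
            n++;
         }
         start = i + 1;
         count = 0;
      } else {
         count++;
      }
   }
   return n;
}
```

With an 11-entry index buffer whose draw begins at element 4 (three indices, a restart index, three more indices), passing 0 for initial_start yields the ranges (4, 3) and (8, 3). Passing info_start instead makes the first range begin at 8, i.e. 2*info->start, while the later ranges are unaffected because start is reset to i + 1 after every cut.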
I'll do another full piglit run, but hopefully this should be the last
change needed.
-Brian
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev