Re: [Mesa-dev] r600g: status of the r600-sb branch

Christian König Fri, 19 Apr 2013 08:15:28 -0700

Hi Vadim,

From your description it seems to be a post-processing stage working on the bytecode of the shaders, and in addition to that it is quite separated from the rest of the driver.

If that's the case then I don't really see a reason why we shouldn't merge it, but at least at the beginning it should probably be disabled by default.

On the other hand, we should ask whether there are any optimizations in there that could be done at earlier stages, for example at the GLSL level.


Cheers,
Christian.

On 19.04.2013 16:48, Vadim Girlin wrote:

Hi,
In the previous status update I said that the r600-sb branch is not ready to be merged yet, but recently I've done some cleanups and reworks, and though I haven't finished everything that I planned initially, I think now it's in a better state and may be considered for merging.
I'm interested to know whether people think that merging the r600-sb branch makes sense at all. I'll try to explain here why it makes sense to me.
Although I understand that the development of the llvm backend is a primary goal for the r600g developers, it's a complicated process and may require quite some time to achieve good results regarding shader/compiler performance, and at the same time this branch already works and provides good results in many cases. That's why I think it makes sense to merge this branch as a non-default backend, at least as a temporary solution for shader performance problems. We can always get rid of it if it becomes too much of a maintenance burden or when the llvm backend catches up in terms of shader performance and compilation speed/overhead.
Regarding the support and maintenance of this code, I'll try to do my best to fix possible issues, and so far there are no known unfixed issues. I tested it with many apps on evergreen and fixed all issues with other chips that were reported to me on the list or privately after the last status announcement. There are no piglit regressions on evergreen when this branch is used with both the default and llvm backends.
This code was intentionally separated as much as possible from the other parts of the driver. Basically there are just two functions used from r600g, and the shader code is passed to/from r600-sb as hardware bytecode that is not going to change. I think it won't require any modifications at all to keep it in sync with most changes in r600g.
Some work might be required though if we want to add support for the new hw features that are currently unused, e.g. geometry shaders, new instruction types for compute shaders, etc., but I think I'll be able to catch up when it's implemented in the driver and the default or llvm backend. E.g. this branch already works for me on evergreen with some simple OpenCL kernels, including bfgminer, where it increases performance of the kernel compiled with the llvm backend by more than 20% for me.
Besides the performance benefits, I think that an alternative backend might also help with debugging of the default or llvm backend. In some cases it helped me by exposing bugs that are not very obvious otherwise. E.g. it may be hard to compare the dumps from the default and llvm backends to spot a regression because they are too different, but after processing both shaders with r600-sb the code is usually transformed to some more common form, and often this makes it easier to compare and find the differences in shader logic.
One additional feature that might help with llvm backend debugging is the disassembler that works on the hardware bytecode instead of the internal r600g bytecode structs. This results in more readable shader dumps for instructions passed in native hw encoding from the llvm backend. I think this can also help to catch more potential bugs related to bytecode building in r600g/llvm. Currently r600-sb uses its bytecode disassembler for all shader dumps, including the fetch shaders, even when optimization is not enabled. Basically it can replace r600_bytecode_disasm and related code completely.
Below are some quick benchmarks for shader performance and compilation time, to demonstrate that currently r600-sb might provide better performance for users, at least in some cases.
As an example of shaders with good optimization opportunities I used the application that computes and renders atmospheric scattering effects; it was mentioned in the previous thread:
http://lists.freedesktop.org/archives/mesa-dev/2013-February/034682.html
Here are current results for that app (Main.noprecompute, frames per second) with default backend, default backend + r600-sb, and llvm backend:
    def    def+sb    llvm
    240    590       248
Another quick benchmark is OpenCL kernel performance with bfgminer (megahash/s):
    llvm    llvm+sb
    68      87
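For reference, the relative gains implied by the two tables above can be worked out with a few lines of Python. This is only an illustrative sketch: the helper name `speedup_percent` is made up here, and the inputs are just the figures quoted above, not new measurements.

```python
def speedup_percent(baseline, optimized):
    """Return the relative gain of `optimized` over `baseline`, in percent."""
    return (optimized / baseline - 1.0) * 100.0

# Figures copied from the tables above (frames per second and megahash/s).
scattering_fps = {"def": 240, "def+sb": 590, "llvm": 248}
bfgminer_mhs = {"llvm": 68, "llvm+sb": 87}

# Gain of default backend + r600-sb over the default backend alone.
print(round(speedup_percent(scattering_fps["def"], scattering_fps["def+sb"]), 1))  # 145.8

# Gain of llvm backend + r600-sb over the llvm backend alone.
print(round(speedup_percent(bfgminer_mhs["llvm"], bfgminer_mhs["llvm+sb"]), 1))  # 27.9
```

The second figure, roughly 28%, is consistent with the "more than 20%" mentioned earlier for bfgminer.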
One more benchmark is for compilation speed/overhead. I used two piglit tests: the first compiles a lot of shaders (IIRC more than a thousand), the second compiles a few huge shaders. The result is the test run time in seconds; this includes more than just the compilation time, but it still shows the difference:
                    def     def+sb  llvm
tfb max-varyings    10      14      53
fp-long-alu         0.17    0.38    0.68
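To make the overhead easier to compare across the two tests, the run times above can be expressed relative to the default backend. The snippet below is just a sketch over the numbers in the table; the names `run_times` and `relative_to_default` are illustrative and not part of piglit or Mesa.

```python
# Test run times in seconds, copied from the table above.
run_times = {
    "tfb max-varyings": {"def": 10, "def+sb": 14, "llvm": 53},
    "fp-long-alu": {"def": 0.17, "def+sb": 0.38, "llvm": 0.68},
}

def relative_to_default(times):
    """Scale every backend's run time by the default backend's run time."""
    base = times["def"]
    return {backend: round(t / base, 2) for backend, t in times.items()}

for test, times in run_times.items():
    print(test, relative_to_default(times))
```

With these figures, default + r600-sb comes out at about 1.4x and 2.2x the default backend's run time, and the llvm backend at about 5.3x and 4x. Since the run times include more than compilation, these ratios are only a rough indication.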
This is especially important for GL apps, because longer compilation time results in more significant freezes in games etc. As for the quality of the compiled code in this test, of course the llvm backend is generally already able to produce better code in some cases, but e.g. for the longest shader from the fp-long-alu test both backends optimize it to two alu instructions.
Of course this branch won't magically make all applications faster. Many older apps are not really limited by shader performance at all, but I think it might improve performance for many relatively modern applications/engines, e.g. for the applications based on the Unigine and Source engines.
The branch itself can be found here:

http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-sb
You might prefer to browse the new files in a tree instead of reading a huge patch:
http://cgit.freedesktop.org/~vadimg/mesa/tree/src/gallium/drivers/r600/sb?h=r600-sb
If you'd like to test it: currently the optimization for GL shaders is enabled by default and can be disabled with R600_SB=0. Optimization for compute shaders is not enabled by default because it's still very limited and experimental; it can be enabled with R600_SB_CL=1. Disassembly of the optimized shaders is printed with R600_DUMP_SHADERS=2.
If you think that merging the branch makes sense, any comments/suggestions about what is required to prepare the branch for merging are welcome.
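As a rough illustration of how these switches might be combined when launching a test program, here is a small Python sketch. Only the three variable names and values (R600_SB=0, R600_SB_CL=1, R600_DUMP_SHADERS=2) come from the description above; the helper `sb_env`, its parameters, and the program name in the usage comment are hypothetical.

```python
import os

def sb_env(gl_opt=True, cl_opt=False, dump=False, base=None):
    """Build an environment dict with the r600-sb switches described above.

    Hypothetical helper: it only sets a variable when the default behaviour
    (GL optimization on, compute optimization off, no dump) is changed.
    """
    env = dict(os.environ if base is None else base)
    if not gl_opt:
        env["R600_SB"] = "0"            # disable optimization of GL shaders
    if cl_opt:
        env["R600_SB_CL"] = "1"         # enable experimental compute optimization
    if dump:
        env["R600_DUMP_SHADERS"] = "2"  # print disassembly of optimized shaders
    return env

# Example: switches for a run with GL optimization disabled and dumps enabled.
print(sb_env(gl_opt=False, dump=True, base={}))
```

The resulting dict could then be passed to something like `subprocess.run(["some-gl-app"], env=sb_env(gl_opt=False))`, where `some-gl-app` stands in for whatever program is being tested.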
Vadim
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

