Greetings, I hope that even if your work turns out to be short-lived, e.g. until the LLVM bytecode compiler takes off, the know-how will still be very useful.
On Thu, Feb 14, 2013 at 4:04 AM, Vadim Girlin <vadimgir...@gmail.com> wrote:
> Hi,
>
> Last month I finally found the time to work on the rewrite of my previous
> shader optimization branch, now it's mostly done in terms of the
> correctness of produced code and feature support (at least on evergreen),
> though it's still a work in progress in terms of the efficiency of
> generated shader code and the efficiency of the backend itself.
>
> I spent some time last year studying the LLVM infrastructure and R600 LLVM
> backend and trying to improve it, but after all I came to the conclusion
> that for me it might be easier to implement all that I wanted in the custom
> backend. This allows for more simple and efficient implementation - e.g. I
> don't have to deal with CFGs because in fact we have structured code, so
> it's possible to use more simple and efficient algorithms.
>
> Currently the branch has no regressions with piglit's quick-driver.tests
> on evergreen (it doesn't rely on the fallback to unoptimized code for the
> shaders with relative addressing and other cases unlike the previous
> branch), and so far I don't see any rendering issues with the apps that I
> used for testing - Lightsmark 2008, Unigine Heaven 3.0 and some others.
> There are also some performance improvements with the gpu-bound apps.
>
> I tried to keep in mind the differences between chip classes, so I hope it
> should only require minor fixes to make it work on non-evergreen chips, but
> I doubt that it will work out of the box - support for some non-evergreen
> hw-specific features is still missing, e.g. I'm sure that indirect
> addressing currently won't work on R6xx, though basic tests might work in
> theory. Fixing this shouldn't require a lot of work though.
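To illustrate the point above about structured code allowing simpler algorithms than a general CFG, here is a minimal sketch of liveness analysis done as one recursive backward walk over a nested tree of instructions, ifs and loops. The tuple-based IR, the node names and the `live_in` function are hypothetical and invented for this example; they are not taken from the r600-sb branch, and the loop model (a body that runs zero or more times, with no break) is a simplification.

```python
# Illustrative only: liveness on a structured (tree-shaped) IR without a CFG.
# Node shapes (all hypothetical):
#   ("op", dst, [srcs])            plain instruction
#   ("if", cond, then_block, else_block)
#   ("loop", body_block)           body runs zero or more times, no break


def live_in(block, live_out):
    """Return the set of names live on entry to `block`.

    `block` is a list of nodes, `live_out` the names live after the block.
    """
    live = set(live_out)
    for node in reversed(block):
        kind = node[0]
        if kind == "op":
            _, dst, srcs = node
            live.discard(dst)      # the definition kills the old value
            live.update(srcs)      # the uses make the sources live
        elif kind == "if":
            _, cond, then_block, else_block = node
            # Either branch may execute, so take the union of both.
            live = live_in(then_block, live) | live_in(else_block, live) | {cond}
        elif kind == "loop":
            _, body = node
            exit_live = set(live)
            head = set(exit_live)
            # Names live at the loop head are those live at the exit plus
            # those the body needs; iterate until the set stops growing.
            while True:
                new_head = exit_live | live_in(body, head)
                if new_head == head:
                    break
                head = new_head
            live = head
        else:
            raise ValueError("unknown node kind: %r" % (kind,))
    return live
```

The recursion follows the nesting of the shader directly, so no basic blocks, edges or dominator information are needed; only loops require a small local fixpoint.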
>
> The branch can be found in my freedesktop repo:
>
> http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-sb
>
> Regarding the differences from the previous branch - there are some
> additional optimizations, e.g. global value numbering with some basic
> support for constant folding (not all instructions are currently handled,
> but it's easy to extend), global code motion that can hoist invariant code
> out of the loops etc. Some optimizations that were implemented in the
> previous branch are not implemented in the new branch (yet), e.g.
> propagation of modifiers (I'm not even sure if it has any noticeable effect
> on performance).
>
> Unlike the previous branch, there is support for indirect addressing on
> registers - currently it uses my previously posted patch (that was not
> very welcome) for obtaining the information about addressable register
> ranges, but it's not required and can be dropped, I just used that patch
> for testing. Without that information opportunities for optimization are
> limited though, and perhaps it makes sense to not try to optimize the
> shaders with indirect gpr addressing at all and rely on the old backend
> until we'll have the proper solution to pass that information to the
> drivers.
>
> There is also initial support for ALU predication, but it's not complete
> and currently unused, I'm not sure if predication support will have
> significant effect on performance that will justify more complex and
> expensive algorithms for register allocator and scheduler, probably I'll
> look into it later, I consider this as a low priority. In the case of
> predicated source code (from LLVM backend) the predication is eliminated
> using speculative execution and conditional moves, same as with the simple
> if-conversion pass that is also implemented.
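For readers unfamiliar with value numbering and constant folding as mentioned above, a toy version can be sketched as follows. This is a simplified local pass over straight-line code, not the global pass described in the mail, and the tuple IR, the `FOLDABLE` and `COMMUTATIVE` tables and the `value_number` function are hypothetical names made up for this illustration.

```python
# Illustrative only: toy value numbering with constant folding over
# straight-line three-address code. Not taken from the r600-sb branch.
import operator

FOLDABLE = {"add": operator.add, "mul": operator.mul, "sub": operator.sub}
COMMUTATIVE = {"add", "mul"}


def value_number(instrs):
    """instrs: list of (dst, op, a, b); operands are names (str) or numbers.

    Assumes every dst is assigned exactly once (SSA-like input).
    Redundant computations become (dst, "mov", src) and fully constant
    expressions become (dst, "const", value).
    """
    consts = {}   # name -> known constant value
    leader = {}   # name -> earlier name holding the same value
    table = {}    # (op, a, b) -> name that first computed the expression
    out = []

    def resolve(x):
        if not isinstance(x, str):
            return x                   # literal constant
        x = leader.get(x, x)           # follow to the representative name
        return consts.get(x, x)        # substitute a known constant

    for dst, op, a, b in instrs:
        a, b = resolve(a), resolve(b)
        both_const = not isinstance(a, str) and not isinstance(b, str)
        if op in FOLDABLE and both_const:
            consts[dst] = FOLDABLE[op](a, b)
            out.append((dst, "const", consts[dst]))
            continue
        if op in COMMUTATIVE:
            # Canonical operand order so that x*5 and 5*x share one key.
            a, b = sorted((a, b), key=repr)
        key = (op, a, b)
        if key in table:
            leader[dst] = table[key]
            out.append((dst, "mov", table[key]))
        else:
            table[key] = dst
            out.append((dst, op, a, b))
    return out
```

Extending the set of handled instructions amounts to adding entries to the folding table, which matches the remark that the folding support is easy to extend.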
>
> The branch currently uses as source the bytecode built by the old backend
> (that may also come from LLVM backend) and some additional information
> (about inputs etc), final bytecode is built by the new builder in the
> branch. Building two versions of the bytecode doesn't look very efficient,
> but currently it simplifies debugging. I'm planning to implement
> translation from TGSI directly to my representation, it should simplify the
> translator and allow to get rid of unnecessary intermediate passes.
>
> Some old and new environment variables can be used to control the behavior
> of this backend:
>
> R600_SB - 0 - disable new backend completely, 1 - enable (default)
> R600_SB_USE_NEW_BYTECODE - 0 - disable use of the produced bytecode
> (useful if you only want to look at the dump of the optimized shader
> without passing it to hw), 1 - enable (default)
> R600_DUMP_SHADERS - will also dump the disassembly of the optimized shader
> after original bytecode (if backend is not disabled with R600_SB=0).
>
> Produced shader code is not ideal - e.g. you may notice not very necessary
> MOVs inserted before DOT4 instructions, it's a known issue and I'm going to
> look into it - this may require rework of the regalloc/scheduler. I had to
> sacrifice some features to make it work correctly with Heaven first, so
> that now I can try to improve it while being able to test for regressions.
>
> Also probably there are some issues with the cleanness of the code - I had
> to rework some parts a few times while fixing all problems, so there is
> possibly unused code and other remnants of the previous versions. Anyway, I
> still consider it as a work in progress and some things are going to be
> reworked.
>
> I'm not sure what will be the destiny of this branch, taking into account
> that we also have actively developed LLVM backend that is required for
> OpenCL anyway. Your opinions are welcome.
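For anyone who wants to compare the old and new backends using the environment variables listed above, a small helper along these lines might be convenient. It is only a sketch: the `sb_env` and `run_with` helpers are hypothetical, and setting `R600_DUMP_SHADERS` to `1` is an assumption, since the mail does not say which value enables the dump.

```python
# Illustrative only: build an environment for running a GL application with
# the r600-sb variables from the mail set explicitly.
import os
import subprocess


def sb_env(enable_sb=True, use_new_bytecode=True, dump_shaders=False, base=None):
    """Return a copy of `base` (default: os.environ) with the variables set."""
    env = dict(os.environ if base is None else base)
    env["R600_SB"] = "1" if enable_sb else "0"
    env["R600_SB_USE_NEW_BYTECODE"] = "1" if use_new_bytecode else "0"
    if dump_shaders:
        # Assumption: "1" turns the dump on; the mail does not give a value.
        env["R600_DUMP_SHADERS"] = "1"
    return env


def run_with(cmd, **options):
    """Run `cmd` (a list of arguments) under the chosen backend settings."""
    return subprocess.run(cmd, env=sb_env(**options), check=False)
```

For example, `run_with(["glxgears"], enable_sb=False)` would run a test program on the old backend, while `run_with(["glxgears"], use_new_bytecode=False, dump_shaders=True)` would dump the optimized shader without passing it to the hardware; `glxgears` here is just a placeholder for whatever application is being tested.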
>
> Vadim
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>