Ah, I didn't know you had any other env vars. It's preferable to have as many boolean flags as possible handled by a single env var, because it's easier to use (R600_DUMP_SHADERS counts as a pretty ugly list of boolean flags hidden behind a magic number). Feel free to have separate env vars for more complex parameters.
I skimmed through some of your code and the coding style looks good. I'm also okay with C++, it really seems like the right choice here. However I agree with the argument that one header file per cpp might not always be a good idea, especially if the header file is pretty small. Marek On Sat, Apr 20, 2013 at 11:02 AM, Vadim Girlin <vadimgir...@gmail.com> wrote: > On 04/20/2013 03:11 AM, Marek Olšák wrote: >> >> Please don't add any new environment variables and use R600_DEBUG >> instead. The other environment variables are deprecated. > > > I agree, those vars probably need some cleanup, they were added before > R600_DEBUG appeared. > > Though I'm afraid some of my options won't fit well into the R600_DEBUG > flags, unless we'll add support for the name/value pairs with optional > custom parsers. > > E.g. I have a group of env vars to define the range of included/excluded > shaders for optimization and mode (include/exclude/off), I thought about > doing this with a single var and custom parser to specify the range e.g. as > "10-20", but after all it's just a debug feature, not intended for everyday > use, and so far I failed to convince myself that it's worth the efforts. > > I can implement the support for custom parsers for R600_DEBUG, but do we > really need it? Maybe it would be enough to add e.g."sb" instead of R600_SB > var to the R600_DEBUG flags for enabling it (probably together with other > boolean options such as R600_SB_USE_NEW_BYTECODE) but leave more complicated > internal debug options as is? > > Vadim > > >> There is a table for R600_DEBUG in r600_pipe.c and it even comes with >> a help feature: R600_DEBUG=help >> >> Marek >> >> On Fri, Apr 19, 2013 at 4:48 PM, Vadim Girlin <vadimgir...@gmail.com> >> wrote: >>> >>> Hi, >>> >>> In the previous status update I said that the r600-sb branch is not ready >>> to >>> be merged yet, but recently I've done some cleanups and reworks, and >>> though >>> I haven't finished everything that I planned initially, I think now it's >>> in >>> a better state and may be considered for merging. >>> >>> I'm interested to know if the people think that merging of the r600-sb >>> branch makes sense at all. I'll try to explain here why it makes sense to >>> me. >>> >>> Although I understand that the development of llvm backend is a primary >>> goal >>> for the r600g developers, it's a complicated process and may require >>> quite >>> some time to achieve good results regarding the shader/compiler >>> performance, >>> and at the same time this branch already works and provides good results >>> in >>> many cases. That's why I think it makes sense to merge this branch as a >>> non-default backend at least as a temporary solution for shader >>> performance >>> problems. We can always get rid of it if it becomes too much a >>> maintenance >>> burden or when llvm backend catches up in terms of shader performance and >>> compilation speed/overhead. >>> >>> Regarding the support and maintenance of this code, I'll try to do my >>> best >>> to fix possible issues, and so far there are no known unfixed issues. I >>> tested it with many apps on evergreen and fixed all issues with other >>> chips >>> that were reported to me on the list or privately after the last status >>> announce. There are no piglit regressions on evergreen when this branch >>> is >>> used with both default and llvm backends. >>> >>> This code was intentionally separated as much as possible from the other >>> parts of the driver, basically there are just two functions used from >>> r600g, >>> and the shader code is passed to/from r600-sb as a hardware bytecode that >>> is >>> not going to change. I think it won't require any modifications at all to >>> keep it in sync with the most changes in r600g. >>> >>> Some work might be required though if we'll want to add support for the >>> new >>> hw features that are currently unused, e.g. geometry shaders, new >>> instruction types for compute shaders, etc, but I think I'll be able to >>> catch up when it's implemented in the driver and default or llvm backend. >>> E.g. this branch already works for me on evergreen with some simple >>> OpenCL >>> kernels, including bfgminer where it increases performance of the kernel >>> compiled with llvm backend by more than 20% for me. >>> >>> Besides the performance benefits, I think that alternative backend also >>> might help with debugging of the default or llvm backend, in some cases >>> it >>> helped me by exposing the bugs that are not very obvious otherwise, e.g. >>> it >>> may be hard to compare the dumps from default and llvm backend to spot >>> the >>> regression because they are too different, but after processing both >>> shaders >>> with r600-sb the code is usually transformed to some more common form, >>> and >>> often this makes it easier to compare and find the differences in shader >>> logic. >>> >>> One additional feature that might help with llvm backend debugging is the >>> disassembler that works on the hardware bytecode instead of the internal >>> r600g bytecode structs. This results in the more readable shader dumps >>> for >>> instructions passed in native hw encoding from llvm backend. I think this >>> also can help to catch more potential bugs related to bytecode building >>> in >>> r600g/llvm. Currently r600-sb uses its bytecode disassembler for all >>> shader >>> dumps, including the fetch shaders, even when optimization is not >>> enabled. >>> Basically it can replace r600_bytecode_disasm and related code >>> completely. >>> >>> Below are some quick benchmarks for shader performance and compilation >>> time, >>> to demonstrate that currently r600-sb might provide better performance >>> for >>> users, at least in some cases. >>> >>> As an example of the shaders with good optimization opportunities I used >>> the >>> application that computes and renders atmospheric scattering effects, it >>> was >>> mentioned in the previous thread: >>> http://lists.freedesktop.org/archives/mesa-dev/2013-February/034682.html >>> >>> Here are current results for that app (Main.noprecompute, frames per >>> second) >>> with default backend, default backend + r600-sb, and llvm backend: >>> def def+sb llvm >>> 240 590 248 >>> >>> Another quick benchmark is an OpenCL kernel performance with bfgminer >>> (megahash/s): >>> llvm llvm+sb >>> 68 87 >>> >>> One more benchmark is for compilation speed/overhead - I used two piglit >>> tests, first compiles a lot of shaders (IIRC more than thousand), second >>> compiles a few huge shaders. Result is a test run time in seconds, this >>> includes not only the compilation time but anyway shows the difference: >>> def def+sb llvm >>> tfb max-varyings 10 14 53 >>> fp-long-alu 0.17 0.38 0.68 >>> >>> This is especially important for GL apps, because longer compilation time >>> results in the more significant freezes in the games etc. As for the >>> quality >>> of the compiled code in this test, of course generally llvm backend is >>> already able to produce better code in some cases, but e.g. for the >>> longest >>> shader from the fp-long-alu test both backends optimize it to the two alu >>> instructions. >>> >>> Of course this branch won't magically make all applications faster, many >>> older apps are not really limited by the shader performance at all, but I >>> think it might improve performance for many relatively modern >>> applications/engines, e.g. for the applications based on the Unigine and >>> Source engines. >>> >>> The branch itself can be found here: >>> >>> http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-sb >>> >>> You might prefer to browse new files in a tree instead of reading a huge >>> patch: >>> >>> >>> http://cgit.freedesktop.org/~vadimg/mesa/tree/src/gallium/drivers/r600/sb?h=r600-sb >>> >>> If you'd like to test it, currently the optimization for GL shaders is >>> enabled by default, can be disabled with R600_SB=0. Optimization for >>> compute >>> shaders is not enabled by default because it's still very limited and >>> experimental, can be enabled with R600_SB_CL=1. Disassemble of the >>> optimized >>> shaders is printed with R600_DUMP_SHADERS=2. >>> >>> If you think that merging of the branch makes sense, any >>> comments/suggestions about what is required to prepare the branch for >>> merging are welcome. >>> >>> Vadim >>> _______________________________________________ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev