On Tue, 2016-08-30 at 16:14 -0400, Connor Abbott wrote:
> On Tue, Aug 30, 2016 at 10:06 AM, Marek Olšák <mar...@gmail.com> wrote:
> > On Tue, Aug 30, 2016 at 3:21 PM, Eero Tamminen
> > <eero.t.tammi...@intel.com> wrote:
> > > Hi,
> > >
> > > On 30.08.2016 12:51, Marek Olšák wrote:
> > > > Recently I discovered that our GLSL compiler spends a lot of time in
> > > > rzalloc_size, so I looked at possible options to optimize that. It's
> > > > worth noting that too many existing allocations slow down subsequent
> > > > malloc calls, which in turn slows down the GLSL compiler. When I kept
> > > > 5 instances of LLVMContext alive between compilations (I wanted to
> > > > reuse them), the GLSL compiler slowed down. That shows that the GLSL
> > > > compiler performance is too dependent on the size and complexity of
> > > > the heap.
> > > >
> > > > So I decided to write my own linear allocator and then compared it
> > > > with jemalloc preloaded by LD, and jemalloc linked statically and used
> > > > by ralloc only.
> > > >
> > > > The test was shader-db using AMD's shader collection. The command line
> > > > was:
> > > > time GALLIUM_NOOP=1 shader-db/run shaders
> > > > The noop driver ensures the compilation process ends with TGSI.
> > > >
> > > > Default Mesa:
> > > > real 0m58.343s
> > > > user 3m48.828s
> > > > sys 0m0.760s
> > > >
> > > > Mesa with LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1:
> > > > real 0m48.550s (17% less time)
> > > > user 3m9.544s
> > > > sys 0m1.700s
> > > >
> > > > Ralloc using _mesa_je_{calloc, realloc, free} and Mesa links against
> > > > my libmesa_jemalloc_pic.a:
> > > > real 0m49.580s (15% less time)
> > > > user 3m14.452s
> > > > sys 0m0.996s
> > > >
> > > > Ralloc using my own linear allocator that allocates out of 32KB
> > > > buffers for 512b and smaller allocations:
> > > > real 0m46.521s (20% less time)
> > > > user 3m1.304s
> > > > sys 0m1.740s
> > > >
> > > >
> > > > Now let's test complete compilation down to GCN bytecode:
> > > >
> > > > Default Mesa:
> > > > real 1m57.634s
> > > > user 7m41.692s
> > > > sys 0m1.824s
> > > >
> > > > Mesa with LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1:
> > > > real 1m42.604s (13% less time)
> > > > user 6m39.776s
> > > > sys 0m3.828s
> > > >
> > > > Ralloc using _mesa_je_{calloc, realloc, free} and Mesa links against
> > > > my libmesa_jemalloc_pic.a:
> > > > real 1m44.413s (11% less time)
> > > > user 6m48.808s
> > > > sys 0m2.480s
> > > >
> > > > Ralloc using my own linear allocator:
> > > > real 1m40.486s (14.6% less time)
> > > > user 6m34.456s
> > > > sys 0m2.224s
> > > >
> > > >
> > > > The linear allocator that I wrote has a very high memory usage due to
> > > > the inability to free 32KB blocks if those blocks have at least one
> > > > living allocation. The workaround would be to do realloc() when
> > > > changing a ralloc parent in order to "defragment" the memory, but
> > > > that's more involved.
> > > >
> > > > I don't know much about glibc, but it's hard to believe that glibc
> > > > people have been purposely ignoring jemalloc for so long.
> > > > There must be some anti-performance politics going on, but enough of
> > > > speculations.
> > >
> > > Different allocators have different trade-offs:
> > > * single-core speed
> > > * multi-core speed
> > > * memory usage
> > > * long time memory fragmentation
> > > * alloc debugging support & robustness
> > >
> > > And they can behave different with different allocation patterns and
> > > sizes. Jemalloc being better in one test than ptmalloc doesn't
> > > necessarily mean that it's better in another.
> > >
> > > Here's some discussion on the subject:
> > > https://lwn.net/Articles/273084/
> > >
> > > The used algorithms and some of the trade-offs are described in
> > > allocators' source codes.
> > >
> > > > If we don't care about memory usage, let's use my allocator.
> > >
> > > Modern games are most demanding use-case for compiler, use largest
> > > number of shaders, but almost all (>90%) Steam games are *still*
> > > 32-bit. Before compiler memory usage optimizations by Ian & Co,
> > > several of them crashed because they ran out of 32-bit address space.
> >
> > Did the games crash because i965 was using GLSL IR as its main
> > compiler IR? Or was the problem that GLSL IR hadn't been released at
> > link time, because the driver had to keep all of it for compiling
> > shader variants? The memory usage issue might have been i965-specific
> > and not relevant right now.
> >
> > Note that Gallium releases GLSL IR in glLinkProgram and other drivers
> > should do that too. If some drivers don't, they are going to have
> > memory usage issues either way.
>
> I believe that at the time, i965 had to keep GLSL IR around after
> linking to handle shader variants. Nowadays, we release the GLSL IR at
> link time and only hang onto the NIR for variants.
Are you sure? As far as I was aware no one ever finished this up for i965.

> NIR is inherently a lot more compact than GLSL IR since it uses a lot
> fewer variables and variable dereferences (they're mostly replaced by
> SSA values during optimization time). It's not as compact as TGSI,
> since it's designed to be mutated/optimized, but it could be made a
> lot smaller with a little tuning. Also, Ian did a lot of work to make
> GLSL's memory footprint smaller, which still helps during link time.
>
> > Marek

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
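
For readers following the thread, the linear allocator Marek describes (32KB buffers serving allocations of 512 bytes or less, where a buffer cannot be released while at least one of its allocations is alive) could look roughly like the sketch below. This is not Marek's patch and not Mesa's ralloc API: every name here (lin_ctx, lin_buffer, lin_header, lin_zalloc, lin_free) is hypothetical, and the calloc fallback for larger requests and the per-allocation header are assumptions made only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; this is not Mesa's ralloc API. */
#define LIN_BUFFER_SIZE (32 * 1024) /* 32KB buffers, as in the thread */
#define LIN_MAX_ALLOC   512         /* bigger requests fall back to calloc */

struct lin_buffer {
   size_t offset;        /* bump pointer: next unused byte in data */
   size_t live;          /* allocations in this buffer not yet freed */
   unsigned char *data;  /* LIN_BUFFER_SIZE zeroed bytes */
};

/* Header placed in front of every allocation; the union keeps the
 * payload that follows it suitably aligned for any type. */
typedef union {
   struct lin_buffer *buf;  /* owning buffer, NULL for the calloc fallback */
   max_align_t align;
} lin_header;

struct lin_ctx {
   struct lin_buffer *current;  /* buffer currently being filled */
   size_t buffers_alive;        /* 32KB buffers not yet returned to libc */
};

static void *lin_zalloc(struct lin_ctx *ctx, size_t size)
{
   lin_header *h;

   if (size > LIN_MAX_ALLOC) {
      h = calloc(1, sizeof(*h) + size);
      if (!h)
         return NULL;
      h->buf = NULL;
      return h + 1;
   }

   /* Round the payload up so the next header stays aligned. */
   size_t payload = (size + sizeof(*h) - 1) / sizeof(*h) * sizeof(*h);
   size_t need = sizeof(*h) + payload;
   struct lin_buffer *b = ctx->current;

   if (!b || b->offset + need > LIN_BUFFER_SIZE) {
      /* The full buffer is not freed here: it still holds live
       * allocations and is released by the last lin_free() on it. */
      b = calloc(1, sizeof(*b));
      if (!b)
         return NULL;
      b->data = calloc(1, LIN_BUFFER_SIZE);
      if (!b->data) {
         free(b);
         return NULL;
      }
      ctx->current = b;
      ctx->buffers_alive++;
   }

   h = (lin_header *)(b->data + b->offset);
   h->buf = b;
   b->offset += need;
   b->live++;
   return h + 1;
}

static void lin_free(struct lin_ctx *ctx, void *ptr)
{
   if (!ptr)
      return;

   lin_header *h = (lin_header *)ptr - 1;
   struct lin_buffer *b = h->buf;

   if (!b) {            /* came from the calloc fallback */
      free(h);
      return;
   }

   assert(b->live > 0);
   if (--b->live > 0)
      return;           /* other live allocations pin the whole 32KB */

   if (b == ctx->current) {
      /* Still being filled: zero the used part and start over. */
      memset(b->data, 0, b->offset);
      b->offset = 0;
   } else {
      free(b->data);
      free(b);
      ctx->buffers_alive--;
   }
}
```

The memory cost mentioned in the thread shows up in lin_free(): a single surviving 8-byte allocation keeps its entire 32KB buffer resident, which is why moving live data into a fresh buffer (the realloc() "defragment" idea) would be needed to bring usage down.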