On Wed, Aug 1, 2018 at 3:20 PM, Dave Airlie <airl...@gmail.com> wrote:
> Sounds like a major project for someone to fix llvm, doesn't AMD have
> compiler devs?

Yes, but they are from entirely different teams.

Marek

>
> Acked-by: Dave Airlie <airl...@gmail.com>
>
> Dave.
>
> On Thu., 2 Aug. 2018, 04:43 Marek Olšák, <mar...@gmail.com> wrote:
>>
>> On Mon, Jul 23, 2018 at 11:33 PM, Timothy Arceri <tarc...@itsqueeze.com>
>> wrote:
>> > On 24/07/18 11:15, Marek Olšák wrote:
>> >>
>> >> On Fri, Jul 20, 2018 at 12:53 AM, Dave Airlie <airl...@gmail.com>
>> >> wrote:
>> >>>
>> >>> On 20 July 2018 at 13:12, Marek Olšák <mar...@gmail.com> wrote:
>> >>>>
>> >>>> From: Marek Olšák <marek.ol...@amd.com>
>> >>>>
>> >>>> To make
>> >>>> dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
>> >>>> finish sooner on the older CPUs. (otherwise it gets killed and we
>> >>>> fail the test)
>> >>>
>> >>>
>> >>> I think this is possibly a bad idea, since it's clear LLVM has some
>> >>> pathological behaviour in the AMDGPU backend for this shader and we
>> >>> are just papering over it.
>> >>>
>> >>> A quick dig into LLVM shows horrible misuse of a SmallVector data
>> >>> structure for what ends up having 2000 entries in it.
>> >>>
>> >>> I'm not going to outright NAK this, but it would be nice to have it
>> >>> accompanied by a pointer to an llvm bug against the amdgpu backend
>> >>> for the pathological case.
>> >>
>> >>
>> >> Even if I comment out the push_back call in LLVM, it's still too slow.
>> >> (the dEQP test times out and fails) LLVMCodeGenLevelLess is faster,
>> >> but I don't know yet if it's enough for the test.
>> >
>> >
>> > I hard-coded the second buffer block to column_major rather than
>> > row_major, which reduced total run time from 15 -> 9 seconds on my
>> > machine. So it seems temps would definitely help. Proper packing
>> > support would also likely help a little more but not as much.
>>
>> 15 -> 9 is not enough. We need to decrease the compile time by 60% or
>> more.
>>
>> For Dave: Commenting out the "push_back" call in LLVM is also not enough.
>>
>> Only LLVMCodeGenLevelLess gives the desired improvement (~60%), though
>> the test is dangerously close to timing out and getting killed.
>> LLVMCodeGenLevelNone is fastest, but the bytecode is horrible (live
>> variables between blocks are always spilled).
>>
>> If there is no straightforward way to improve compile times (I think
>> there isn't), I'll have to push this.
>>
>> Marek
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
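
For reference, the percentages quoted in the thread can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and it assumes the 15 s and 9 s run times measured with the column_major change are dominated by shader compile time, which the thread does not state explicitly.

```python
def reduction_percent(before_s, after_s):
    """Percentage by which the run time dropped (hypothetical helper)."""
    return (before_s - after_s) / before_s * 100.0


def target_time(before_s, required_reduction_percent):
    """Run time that corresponds to a required percentage reduction."""
    return before_s * (1.0 - required_reduction_percent / 100.0)


# Figures taken from the thread: 15 s before, 9 s after the column_major
# change, and a stated requirement of a 60% (or larger) reduction.
observed = reduction_percent(15.0, 9.0)
needed = target_time(15.0, 60.0)

print(f"observed reduction: {observed:.0f}%")
print(f"time needed for a 60% reduction: {needed:.0f} s")
```

Under that assumption, 15 -> 9 seconds is a 40% reduction, while a 60% reduction would mean about 6 seconds, which is consistent with the reply that 15 -> 9 is not enough.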