On Mon, Nov 13, 2017 at 9:31 AM, Nicolai Hähnle <nhaeh...@gmail.com> wrote: > On 11.11.2017 04:09, Marek Olšák wrote: >> >> On Fri, Nov 10, 2017 at 9:58 PM, Nicolai Hähnle <nhaeh...@gmail.com> >> wrote: >>> >>> On 10.11.2017 19:24, Connor Abbott wrote: >>>> >>>> >>>> On Fri, Nov 10, 2017 at 1:19 PM, Marek Olšák <mar...@gmail.com> wrote: >>>>> >>>>> >>>>> On Fri, Nov 10, 2017 at 6:55 PM, Nicolai Hähnle <nhaeh...@gmail.com> >>>>> wrote: >>>>>> >>>>>> >>>>>> On 10.11.2017 18:43, Marek Olšák wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Fri, Nov 10, 2017 at 2:09 AM, Connor Abbott <cwabbo...@gmail.com> >>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Nov 9, 2017 at 7:17 PM, Marek Olšák <mar...@gmail.com> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Nov 10, 2017 at 12:40 AM, Matt Arsenault >>>>>>>>> <arse...@gmail.com> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On Nov 10, 2017, at 07:41, Marek Olšák <mar...@gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> This fixes the TCS gl_ClipDistance piglit failure that was >>>>>>>>>>> uncovered >>>>>>>>>>> by a recent LLVM change. The solution is to set volatile on loads >>>>>>>>>>> and stores to enforce proper ordering. >>>>>>>>>>> >>>>>>>>>>> Please review. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Every LDS access certainly should not be volatile. This kills all >>>>>>>>>> optimizations, like formation of ds_read2_b32. What ordering issue >>>>>>>>>> are you >>>>>>>>>> having? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> It might be caused by inttoptr(NULL) that we do to declare LDS. >>>>>>>>> There >>>>>>>>> is simply no ordering enforced, which is weird. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> As soon as you do inttoptr(NULL), you've generated a poison value >>>>>>>> (in >>>>>>>> LLVM legalese), so LLVM will assume that you never dereference it >>>>>>>> and >>>>>>>> optimize accordingly. I think a GEP instruction without the inbounds >>>>>>>> parameter set will get rid of the poison value, although I'm not >>>>>>>> sure >>>>>>>> about the case where the offset is known to be zero. At least, >>>>>>>> that's >>>>>>>> my reading of the langref text for the GEP instruction >>>>>>>> (https://llvm.org/docs/LangRef.html#id215). If zero is a valid >>>>>>>> address >>>>>>>> in LDS, then it sounds like LLVM needs to be fixed to disable this >>>>>>>> optimization for certain address spaces. On the other hand, if >>>>>>>> you're >>>>>>>> doing inttoptr(NULL) + offset, where "offset" is the result of a >>>>>>>> ptrtoint somewhere, you should be doing inttoptr(offset) instead, >>>>>>>> and >>>>>>>> then LLVM should never misbehave. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> I don't think that using inttoptr before every load and store would >>>>>>> be >>>>>>> better than volatile. The must be a better solution. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Can't we just allocate the required LDS memory explicitly like we did >>>>>> for >>>>>> the LDS-based derivative computations? >>>>>> >>>>>> It may require shuffling around a bit how/when we calculate the >>>>>> required >>>>>> sizes, but it doesn't seem impossible. >>>>> >>>>> >>>>> >>>>> We want to share the same declaration in TCS main and epilog parts. >>>>> >>>>> Does LLVM know that LDS declarations are pre-initialized? >>>>> Do sized LDS declarations affect SIMD-occupancy-based optimizations? >>>>> Because Mesa always declares 64kB of LDS and the real value is >>>>> determined at runtime. >>>> >>>> >>>> >>>> I don't know about the latter, but for the former, if you declare the >>>> LDS variable as having external linkage, LLVM should assume that it >>>> might be initialized beforehand -- exactly like a global non-static >>>> variable in C. >>> >>> >>> >>> Makes sense. >>> >>> I don't think LLVM is really looking at LDS size too closely for >>> anything, >>> since LDS is per-thread group. But it's been a while since I checked. >>> >>> So just declaring a 64/32 KB memory block and then potentially not using >>> all >>> of it is probably fine and is probably the best short-term solution (if >>> it >>> works). >>> >>> It's a good point though that "shuffling around the computation of the >>> required sizes" is potentially much more involved than I was thinking at >>> first. It looks like if we wanted to be perfectly honest with LLVM about >>> what's going on (and I believe we should, in the long run), we'd have to >>> teach it a notion of "per-thread LDS memory". That requires more thought. >> >> >> We can't tell LLVM the size of LDS, because we don't know it - it's >> computed from 2 independent shaders (LS and HS, or ES and GS). > > > That's true, but purely as a short-term solution, can't we tell LLVM the > size of LDS is 64KB anyway instead of the inttoptr(0), even if we're not > going to use it all? What goes wrong in that approach?
This: "Do sized LDS declarations affect SIMD-occupancy-based optimizations?" And shader parts should be allowed to (re)declare the same LDS. BTW, the max LDS size for non-pixel non-compute shaders is 32KB. Marek _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev