On Thu, Oct 30, 2014 at 12:38 PM, Ian Romanick <i...@freedesktop.org> wrote:
> On 10/29/2014 11:59 PM, Matt Turner wrote:
>> On Wed, Oct 29, 2014 at 6:11 PM, Thomas Helland
>> <thomashellan...@gmail.com> wrote:
>>> This series does some initial work to make expansion of
>>> the get_range function a lot cleaner.
>>> It also adds a couple simple initial ranges.
>>> These patches are by no means perfect, but I hope
>>> they will provide some feedback and ideas.
>>> I'm hoping to expand this to do the following:
>>> -Add get_range for most opcodes I can think of
>>> -Add more utility functions to the constant_util file.
>>> -Repurpose the file to optimize more than just min/max.
>>> -Eliminate if's that we know the result of
>>> -Whatever pops into my head
>>
>> Sounds good.
>>
>>> I have some questions about undefined behaviour regarding this.
>>> Do we have any way of signaling in our IR that
>>> the variable is the result of undefined behaviour?
>>>
>>> In compilers like llvm, if I recall, they have a flag for this
>>> so they can signal undefined behaviour and use whatever value
>>> gives the most efficient code for its uses (used in -ffast-math).
>>>
>>> A hypothetical situation:
>>> We find that we have sqrt(x) where x has upper bound < 0.
>>> The spec says the behavior is undefined for x < 0.
>>> The same applies for inverse sqrt, log, log2 and pow.
>>> How should this be handled?
>>> Should a warning be issued?
>>> Could we simplify this to a constant 0?
>>> That would allow more optimizations to occur.
>>
>> That's probably what I'd try first.
>>
>> I applied your series and ran our internal shader-db through it. The
>> good news is that it helps some programs!
>>
>> The bad news is that it hurts even more programs. I randomly selected
>> two, and the relevant diffs looked like this:
>>
>> -math.sat exp(8) g91<1>F g86<8,8,1>F null { align1 1Q compacted };
>> +math exp(8) g91<1>F g85<8,8,1>F null { align1 1Q compacted };
>> +sel.l(8) g92<1>F g91<8,8,1>F 1F { align1 1Q };
>>
>> So we're saying we know the result of exp() must be >= 0, so no need
>> to handle the lower bound. Instead just clamp the top. Except saturate
>> is free and just clamping the top is not.
>>
>> Disabling ir_unop_exp/ir_unop_exp2 from patch 6/6 shows some programs
>> actually do benefit from this optimization though. Before, they did
>> things like:
>>
>> math exp(8) g17<1>F g14<8,8,1>F null { align1 1Q compacted };
>> sel.ge(8) g124<1>F g17<8,8,1>F 0F { align1 1Q compacted };
>>
>> That tells me that there are gains to be had here. We just have to
>> figure out how.
>>
>> I'm not exactly sure what the best way to handle this is, but it seems
>> like we want to trim useless clamps *iff* they cannot be paired with
>> another to form a saturate.
>
> Would it be sufficient to do ir_unop_saturate generation before this
> pass? Or is the pass breaking the saturate up? Also... wasn't Eric
> saying his platform didn't have free saturates?

FWIW freedreno doesn't have a saturate instruction (or flag or whatever). It's implemented with min/max.
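
To make the trade-off concrete, here is a rough sketch of the decision being discussed. It is written in Python for brevity and is not Mesa code: `Range`, `lower_clamp01` and the `saturate_is_free` flag are hypothetical names made up for illustration. It assumes the pass knows a conservative [lo, hi] range for the operand of a clamp to [0, 1], and it ignores NaN handling entirely.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Conservative bounds on a value (hypothetical helper, not Mesa's)."""
    lo: float
    hi: float


def lower_clamp01(operand: Range, saturate_is_free: bool) -> str:
    """Pick how to emit clamp(x, 0, 1) given what is known about x.

    Returns one of: "none", "saturate", "max", "min", "min+max".
    """
    need_lower = operand.lo < 0.0   # max(x, 0) could still change x
    need_upper = operand.hi > 1.0   # min(x, 1) could still change x

    if not need_lower and not need_upper:
        # Both sides are provably redundant: drop the clamp entirely.
        return "none"

    if saturate_is_free:
        # A saturate modifier costs nothing, so trimming only one side
        # would trade a free modifier for a real instruction.
        return "saturate"

    # No free saturate: every side we can drop saves a min or a max.
    if need_lower and need_upper:
        return "min+max"
    return "max" if need_lower else "min"


# exp() is known to produce a value >= 0 (ignoring NaN).
exp_result = Range(0.0, math.inf)
print(lower_clamp01(exp_result, saturate_is_free=True))
print(lower_clamp01(exp_result, saturate_is_free=False))
```

With the exp() range, this keeps the saturate on hardware where it is free and emits only the upper-bound min on hardware that builds saturate out of min/max, which is the "trim *iff* it cannot be paired" rule from the quoted text expressed as a target-dependent choice.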