On Tue, Jan 12, 2016 at 12:54 AM, Matt Turner <matts...@gmail.com> wrote: > On Mon, Jan 11, 2016 at 6:58 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote: >> >> On Jan 11, 2016 2:47 PM, "Matt Turner" <matts...@gmail.com> wrote: >>> >>> The OpenGL specifications for bitfieldInsert() and bitfieldExtract() says: >>> >>> The result will be undefined if <offset> or <bits> is negative, or if >>> the sum of <offset> and <bits> is greater than the number of bits >>> used to store the operand. >>> >>> Therefore passing bits=32, offset=0 is legal and defined in GLSL. >>> >>> But the earlier DX11/SM5 bfi/ibfe/ubfe opcodes are specified to accept a >>> bitfield width ranging from 0-31. As such, Intel and AMD instructions >>> read only the low 5 bits of the width operand, making them not able to >>> implement the GLSL-specified behavior directly. >>> >>> This commit adds a pass that inserts code to implement the trivial cases >>> of <bits> = 32 as >>> >>> bitfieldInsert: >>> bits > 31 ? insert : bitfieldInsert(base, insert, offset, bits) >>> >>> bitfieldExtract: >>> bits > 31 ? value : bitfieldExtract(value, offset, bits) >> >> Why not just add this to nir_opt_algebraic with a flag in >> nir_compiler_options? That's the way we've done a bunch of other lowering. > > Note that these are replacing the expression with an expression > containing the original expression. > > I think there's a distinction to be made between lowering (decomposing > an operation into a number of simpler operations) and what I've called > a "fixup" (inserting some additional code needed for correctness or > other reasons). > > I claim that doing this fixup once before optimizations is the best > solution, with the alternatives suffering from problems: > > - do it in nir_opt_algebraic. Problem: how to avoid adding the fixup > code multiple times? The comparison could be optimized away... > - do it after optimizations. Problem: might have been able to > optimize the comparison away...
The problem with doing it this way is that now bitfield_insert and bitfield_extract now have two different semantics, which seems a little ugly. For example, it's impossible to know in an optimization pass whether it's legal to do the inverse transform: bits > 31 ? value : bitfield_extract(value, offset, bits) -> bitfield_extract, which for HW implementing the GL semantics, might be a useful thing. I think the cleanest way to do it would be: - introduce a new bitfield_extract opcode so that we have both bitfield_extract and bfe - clarify that bfe, bfi, and bfm have the D3D semantics, while bitfield_extract and bitfield_insert are equivalent to the GLSL builtins - add a flag to lower the GLSL opcodes into the D3D opcodes, inserting the fixup code along the way _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev