On Mon, Jan 4, 2016 at 12:52 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> On Mon, Jan 4, 2016 at 12:44 PM, Matt Turner <matts...@gmail.com> wrote:
>> On Wed, Dec 30, 2015 at 4:26 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>>> On Wed, Dec 30, 2015 at 3:26 PM, Matt Turner <matts...@gmail.com> wrote:
>>>> The OpenGL specifications for these functions say:
>>>>
>>>>     The result will be undefined if <offset> or <bits> is negative, or if
>>>>     the sum of <offset> and <bits> is greater than the number of bits
>>>>     used to store the operand.
>>>>
>>>> Therefore passing bits=32, offset=0 is legal and defined in GLSL.
>>>>
>>>> But the earlier DX11/SM5 bfi/ibfe/ubfe opcodes are specified to accept a
>>>> bitfield width ranging from 0-31. As such, Intel and AMD instructions
>>>> read only the low 5 bits of the width operand, making them not compliant
>>>> with the GLSL spec, so we have to special case the bits=32 case.
>>>>
>>>> Checking that offset=0 is not necessary, since for any other value,
>>>> <offset> + <bits> will be greater than 32, which is specified as
>>>> generating an undefined result.
>>>>
>>>> Fixes:
>>>>    ES31-CTS.shader_bitfield_operation.bitfieldInsert.uint_2
>>>>    ES31-CTS.shader_bitfield_operation.bitfieldInsert.uvec4_3
>>>>    ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0
>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595
>>>> ---
>>>> Yuck. Suggestions welcome.
>>>
>>> Can you make a piglit test? Want to see if nvidia has the same
>>> problem. According to
>>> http://docs.nvidia.com/cuda/parallel-thread-execution/#integer-arithmetic-instructions-bfe,
>>> offset/bits can actually be up to 255 (although I can't fully imagine
>>> why one might want that). However perhaps the HW differs.
>>
>> I just sent: [PATCH] arb_gpu_shader5: Test corner cases of
>> bitfieldInsert/bitfieldExtract.
>>
>> It's not totally tested (as in, I haven't fixed i965 to make it pass
>> because I found out that the bfi2 instruction is also broken...) but I
>> am curious to see what the proprietary NVIDIA driver does.
>
> I'm curious too. On nvc0 the new bitfieldExtract tests still pass, but
> bitfieldInsert now fails.

And on softpipe (which uses tgsi_exec), BFE fails while BFI passes. Great :)

http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/auxiliary/tgsi/tgsi_exec.c#n4073

These definitions might have come from the DX11 op pseudocode... but
the question is why does BFI pass? Shouldn't it also fail? Do I need
to & 31 something on nvc0 to make it all work?
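
For reference, here is a rough sketch of the bits=32 problem described in
the quoted patch. It is not the i965, nvc0, or tgsi_exec code. The names
ubfe_5bit and bitfield_extract_u32 are made up for illustration, and
ubfe_5bit is only a simplified model of an instruction that reads the low
5 bits of its width and offset operands, as the DX11/SM5-style opcodes are
described above.

```c
#include <stdint.h>

/*
 * Hypothetical model of a DX11/SM5-style unsigned bitfield extract:
 * only the low 5 bits of the width and offset operands are read.
 */
static uint32_t ubfe_5bit(uint32_t value, uint32_t offset, uint32_t bits)
{
    uint32_t width = bits & 31;   /* bits=32 wraps around to width=0 */
    uint32_t off = offset & 31;

    if (width == 0)
        return 0;

    /* width is 1..31 here, so the shift below cannot overflow */
    return (value >> off) & ((1u << width) - 1u);
}

/*
 * Hypothetical GLSL-style wrapper: bits=32 is only defined together with
 * offset=0, where the result is the whole operand. Any other offset makes
 * <offset> + <bits> exceed 32, which is undefined, so the offset does not
 * need to be checked.
 */
static uint32_t bitfield_extract_u32(uint32_t value, uint32_t offset,
                                     uint32_t bits)
{
    if (bits == 32)
        return value;

    return ubfe_5bit(value, offset, bits);
}
```

With this model, ubfe_5bit(0xDEADBEEF, 0, 32) gives 0 because the width
wraps to 0, while bitfield_extract_u32(0xDEADBEEF, 0, 32) gives 0xDEADBEEF,
which is what GLSL expects. Whether nvc0 or tgsi_exec needs the same
treatment is exactly the open question above.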