On Wed, May 7, 2014 at 10:55 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> On Wed, May 7, 2014 at 8:38 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>> So... this shader (from
>> generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-op-eq-mat2-mat2.shader_test):
>>
>> uniform mat2 arg0;
>> uniform mat2 arg1;
>>
>> void main()
>> {
>>     bool result = (arg0 == arg1);
>>     gl_FragColor = vec4(result, 0.0, 0.0, 0.0);
>> }
>>
>> Which becomes the following IR:
>>
>> (
>> (declare (shader_out ) vec4 gl_FragColor)
>> (declare (temporary ) vec4 gl_FragColor)
>> (declare (uniform ) mat2 arg0)
>> (declare (uniform ) mat2 arg1)
>> (function main
>> (signature void
>> (parameters
>> )
>> (
>> (declare (temporary ) vec4 vec_ctor)
>> (assign (yzw) (var_ref vec_ctor) (constant vec3 (0.0 0.0 0.0)) )
>> (declare (temporary ) bvec2 mat_cmp_bvec)
>> (assign (x) (var_ref mat_cmp_bvec) (expression bool any_nequal
>> (array_ref (var_ref arg1) (constant int (0)) ) (array_ref (var_ref
>> arg0) (constant int (0)) ) ) )
>> (assign (y) (var_ref mat_cmp_bvec) (expression bool any_nequal
>> (array_ref (var_ref arg1) (constant int (1)) ) (array_ref (var_ref
>> arg0) (constant int (1)) ) ) )
>> (assign (x) (var_ref vec_ctor) (expression float b2f
>> (expression bool ! (expression bool any (var_ref mat_cmp_bvec) ) ) ) )
>> (assign (xyzw) (var_ref gl_FragColor) (var_ref vec_ctor) )
>> (assign (xyzw) (var_ref gl_FragColor@4) (var_ref gl_FragColor) )
>> ))
>>
>> )
>>
>>
>> When converted to TGSI, this becomes:
>>
>> FRAG
>> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
>> DCL OUT[0], COLOR
>> DCL CONST[0..3]
>> DCL TEMP[0..2], LOCAL
>> IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000}
>> IMM[1] INT32 {0, 0, 0, 0}
>>   0: MOV TEMP[0].yzw, IMM[0].xxxx
>>   1: FSNE TEMP[1].xy, CONST[2].xyyy, CONST[0].xyyy
>>   2: OR TEMP[1].x, TEMP[1].xxxx, TEMP[1].yyyy
>>   3: FSNE TEMP[2].xy, CONST[3].xyyy, CONST[1].xyyy
>>   4: OR TEMP[2].x, TEMP[2].xxxx, TEMP[2].yyyy
>>   5: MOV TEMP[1].y, TEMP[2].xxxx
>>   6: DP2 TEMP[1].x, TEMP[1].xyyy, TEMP[1].xyyy
>>   7: USNE TEMP[1].x, TEMP[1].xxxx, IMM[1].xxxx
>>   8: NOT TEMP[1].x, TEMP[1].xxxx
>>   9: AND TEMP[0].x, TEMP[1].xxxx, IMM[0].yyyy
>>  10: MOV OUT[0], TEMP[0]
>>  11: END
>>
>> Note that FSNE/OR are used, implying that the integer version of these
>> is expected. However then it goes on to use DP2, which, as I
>> understand, does a floating point multiply + add. Now, this _happens_
>> to work out, since the integer representations of float 0 and int 0
>> are the same, and those are really the only possibilities we care
>> about.
>>
>> However this seems really dodgy... wouldn't it be clearer to use
>> either SNE + OR (which would still work!) + DP2, or alternatively AND
>> them all together instead of SNE/DP2? This seems to come in via
>> ir_unop_any_nequal. IMO the latter would be better since it keeps
>
> Erm, sorry -- the email subject and this sentence aren't _quite_
> accurate. That should be ir_unop_any. ir_binop_any_nequal is what
> generates the FSNE/OR combos. But everything else still holds :)
>
>> things in integer space, and presumably ANDs are cheaper than
>> fmul/fadd.
>>
>> I noticed this because nouveau's codegen logic isn't able to optimize
>> this intelligently and I was trying to figure out why.
>>
>> Thoughts?

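To make the "happens to work" point concrete, here is a rough sketch that emulates both reductions on raw 32-bit patterns. It is plain Python, not Mesa code, and the helper names (any_via_dp2, any_via_or) are made up for illustration. It assumes native-integer booleans are 0 for false and all-ones for true, as the AND with 1.0 in the dump above suggests, and it uses Python's IEEE-754 arithmetic, which may not match every GPU's NaN handling.

```python
import struct

TRUE = 0xFFFFFFFF   # assumed native-integer "true" (all bits set)
FALSE = 0x00000000  # native-integer "false"


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE-754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def float_to_bits(value):
    """Reinterpret a single-precision float as its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def any_via_dp2(x, y):
    """Emulate DP2 followed by USNE against integer 0.

    The boolean bit patterns are treated as floats, so TRUE becomes a
    NaN and FALSE becomes 0.0. The result is compared as raw bits.
    """
    fx, fy = bits_to_float(x), bits_to_float(y)
    dot = fx * fx + fy * fy
    return TRUE if float_to_bits(dot) != 0 else FALSE


def any_via_or(x, y):
    """All-integer alternative: OR the components together."""
    return (x | y) & 0xFFFFFFFF


if __name__ == "__main__":
    for x in (FALSE, TRUE):
        for y in (FALSE, TRUE):
            print(hex(x), hex(y),
                  hex(any_via_dp2(x, y)), hex(any_via_or(x, y)))
```

Under these assumptions the two paths agree on all four boolean inputs, but only because 0.0 and integer 0 share a bit pattern and anything else stays non-zero through the float math. The OR path never leaves integer space.
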
I sent a patch that implements a native-integer version of ir_unop_any:

http://patchwork.freedesktop.org/patch/25569/

From the overall symmetry of things, it seems like this was just
forgotten when native integer support was added. All the other
any_equal/etc. operations do: if (native_integers), OR + etc.;
else, DP2 + etc.

  -ilia
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev