Hi Jason, On mar, 2015-06-30 at 14:18 -0700, Jason Ekstrand wrote: > On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev <el...@igalia.com> wrote: > > From: Antia Puentes <apuen...@igalia.com> > > > > For operations that have a predefined operand size > 0, defined in > > glsl/nir/nir_opcodes.c, NIR returns a swizzle containing zeros in the > > components from outside the source vector. However, the driver > > expects those components to have a swizzle value equal to the swizzle > > of the component in (number of vector elements - 1). For example, > > > > for a vec2 operation with an identity swizzle (.xy): > > - (NIR -> XYXX, driver ->XYYYY) > > > > for a vec3 operation with swizzle (.zxy) > > - (NIR-> ZXYX, driver -> ZXYY) > > Why is this needed. Was there some bug you ran into regarding this or > are you just trying to make the generated code match? If the > component isn't used, then it seems to me like the swizzle shouldn't > matter. NIR may give you bogus swizzles outside of the enabled > channels but it may do that regardless. I'm also not seeing how > composing swizzles fixes anything.
There were several bugs in our NIR->vec4 code path related to NIR returning bogus swizzles outside the components of a vector. When doing an "any", "equal" or "notEqual" operation on vectors, the result of the instruction "cmp" is predicated using "all4h" or "any4h" depending on the operation. The results were incorrect for vectors with less than 4 elements because: (1) I was not setting the writemask of the "cmp" operation according to the vector size and (2) NIR returns bogus swizzles for components outside the vector. As a consequence, we had spurious component-wise comparisons that affected the result of the predication. As composing the operand swizzle with the identity swizzle-per-size means to repeat the value of the swizzle of the last vector-element on the components outside the vector, the error mentioned above does not happen because to repeat the last (inside-boundaries) component-wise comparison for the components outside the vector does not affect the result of the predication. Other way to fix the bugs above is to set the writemask of the "cmp" operation to the appropriate mask according to the vector size, thus forcing the "cmp" instruction to compare only the components inside the vector boundaries. As the vec4_visitor does not set the writemask of the "cmp" instruction because it relies on the operand having the correct swizzle outside the vector boundaries, for consistency with the existing behaviour, I decided to do the same and compose the swizzle (notice that the composing is done in the vec4_visitor when visiting an ir_swizzle, while in NIR we only set the swizzle for the elements inside the vector). I can remove this function if you prefer setting the writemask over composing the swizzles. I remember another regression related to bogus swizzles in unpackHalf2x16, for that function the operand is an uint, however in the vec4_visitor::emit_unpack_half_2x16 function we have: tmp_dst.writemask = WRITEMASK_Y; emit(SHR(tmp_dst, src0, src_reg(16u))); This was not buggy in the old vec4_visitor because the src0's swizzle was correct for the component Y (src0's swizzle would be either XXXX, the identity swizzle for size == 1, or the result of composing the swizzle with XXXX in case the parent IR node was an ir_swizzle). In the case of NIR, as the swizzle can be bogus for all the components except the first one (size == 1), we need to compose it. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev