On Fri, Jul 3, 2015 at 10:23 AM, Antía Puentes <apuen...@igalia.com> wrote: > Hi Jason, > > On mar, 2015-06-30 at 14:18 -0700, Jason Ekstrand wrote: >> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev <el...@igalia.com> wrote: >> > From: Antia Puentes <apuen...@igalia.com> >> > >> > For operations that have a predefined operand size > 0, defined in >> > glsl/nir/nir_opcodes.c, NIR returns a swizzle containing zeros in the >> > components from outside the source vector. However, the driver >> > expects those components to have a swizzle value equal to the swizzle >> > of the component in (number of vector elements - 1). For example, >> > >> > for a vec2 operation with an identity swizzle (.xy): >> > - (NIR -> XYXX, driver ->XYYYY) >> > >> > for a vec3 operation with swizzle (.zxy) >> > - (NIR-> ZXYX, driver -> ZXYY) >> >> Why is this needed. Was there some bug you ran into regarding this or >> are you just trying to make the generated code match? If the >> component isn't used, then it seems to me like the swizzle shouldn't >> matter. NIR may give you bogus swizzles outside of the enabled >> channels but it may do that regardless. I'm also not seeing how >> composing swizzles fixes anything. > > There were several bugs in our NIR->vec4 code path related to NIR > returning bogus swizzles outside the components of a vector. When doing > an "any", "equal" or "notEqual" operation on vectors, the result of the > instruction "cmp" is predicated using "all4h" or "any4h" depending on > the operation. The results were incorrect for vectors with less than 4 > elements because: (1) I was not setting the writemask of the "cmp" > operation according to the vector size and (2) NIR returns bogus > swizzles for components outside the vector. As a consequence, we had > spurious component-wise comparisons that affected the result of the > predication. > > As composing the operand swizzle with the identity swizzle-per-size > means to repeat the value of the swizzle of the last vector-element on > the components outside the vector, the error mentioned above does not > happen because to repeat the last (inside-boundaries) component-wise > comparison for the components outside the vector does not affect the > result of the predication. > > Other way to fix the bugs above is to set the writemask of the "cmp" > operation to the appropriate mask according to the vector size, thus > forcing the "cmp" instruction to compare only the components inside the > vector boundaries. As the vec4_visitor does not set the writemask of the > "cmp" instruction because it relies on the operand having the correct > swizzle outside the vector boundaries, for consistency with the existing > behaviour, I decided to do the same and compose the swizzle (notice that > the composing is done in the vec4_visitor when visiting an ir_swizzle, > while in NIR we only set the swizzle for the elements inside the > vector). I can remove this function if you prefer setting the writemask > over composing the swizzles.
I could go either way. I kind of like the idea of setting the writemask rather than the swizzle, but I'm fine with either, now that you've explained it. If we go with fixing the swizzle, I think I'd like to see a few changes. First, I think I'd rather see this as part of the loop at the top of nir_emit_alu that sets up the sources. Second, your implementation always uses src[0] as representative of all fixed-sized sources; this works now but is not guaranteed. Third, I think it needs a better comment. The current comment shows differences between what NIR gives you in some cases and what the old visitor gives but, as far as I can tell, doesn't actually demonstrate something that would be a bug without it. The explanation you gave above together with a better example would probably be sufficient. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev