On Tue, Nov 29, 2016 at 8:21 PM, Rowley, Timothy O
<timothy.o.row...@intel.com> wrote:
>
> On Nov 27, 2016, at 11:13 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>
> On Thu, Nov 24, 2016 at 6:11 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>
> Instead of (incorrectly) biasing the snorm value to make it look like a
> unorm, just use signed integer math.
>
> This fixes arb_color_buffer_float-render GL_RGBA8_SNORM
>
> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
> ---
>  src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
> b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
> index ad809c4..339ca52 100644
> --- a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
> +++ b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
> @@ -692,9 +692,13 @@ struct BlendJit : public Builder
>              dst[i] = BITCAST(dst[i], mSimdInt32Ty);
>              break;
>          case SWR_TYPE_SNORM:
> -            src[i] = FADD(src[i], VIMMED1(0.5f));
> -            dst[i] = FADD(dst[i], VIMMED1(0.5f));
> -            /* fallthrough */
> +            src[i] = FP_TO_SI(
> +                FMUL(src[i], VIMMED1(scale[i])),
> +                mSimdInt32Ty);
> +            dst[i] = FP_TO_SI(
> +                FMUL(dst[i], VIMMED1(scale[i])),
> +                mSimdInt32Ty);
> +            break;
>          case SWR_TYPE_UNORM:
>              src[i] = FP_TO_UI(
>                  FMUL(src[i], VIMMED1(scale[i])),
> @@ -728,11 +732,14 @@ struct BlendJit : public Builder
>              result[i] = BITCAST(result[i], mSimdFP32Ty);
>              break;
>          case SWR_TYPE_SNORM:
> +            result[i] = SHL(result[i], 32 - info.bpc[i]);
> +            result[i] = ASHR(result[i], 32 - info.bpc[i]);
>
>
> These two immediate arguments should probably have a C() around them.
> I've fixed that up in my tree. Hopefully these will emit as VPSLLD and
> VPSRAD. Not sure how to check that.
>
>
> With the version of the patch from your tree, I’m seeing this IR:
>
>   %24 = ashr exact <8 x i32> %23, i32 24
>   %25 = sitofp <8 x i32> %24 to <8 x float>
>   %26 = fmul <8 x float> %25, <float 0x3F80204080000000, float
> 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000,
> float 0x3F80204080000000, float 0x3F80204080000000, float
> 0x3F80204080000000, float 0x3F80204080000000>
>   store <8 x float> %26, <8 x float>* %result, align 32
>
> Turn into this x86 code:
>
>   9a: vpslld ymm1,ymm3,0x18
>   9f: vpsrad ymm1,ymm1,0x18
>   a4: vcvtdq2ps ymm1,ymm1
>   a8: vmulps ymm1,ymm1,ymm2
>   ac: vmovaps YMMWORD PTR [rax+0x20],ymm1
>
> So llvm does what you expected.
>
> Version of this patch from your tree Reviewed-by: Tim Rowley
> <timothy.o.row...@intel.com>
>
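For readers following along, the sign-extend-and-unscale sequence being
verified above boils down to the scalar C below. This is only an
illustrative sketch: the function name and standalone form are made up
here, and the driver actually does this 8-wide via the jitted IR shown
above.

#include <stdint.h>

/* Scalar sketch of the quoted SNORM result path: sign-extend the low
 * 'bpc' bits of the blended integer, then scale back to a float in
 * roughly [-1, 1].  The scale is 2^(bpc-1) - 1, i.e. 127 for 8 bits,
 * which is where the ~1/127 constant in the IR above comes from.
 * The left shift is done on the unsigned value to keep it well-defined;
 * the right shift of the signed value is implementation-defined in ISO C
 * but is an arithmetic shift (VPSRAD) on the compilers in question.
 * Clamping of the most negative code is ignored in this sketch. */
static float snorm_result_to_float(uint32_t packed, unsigned bpc)
{
    float scale = (float)((1u << (bpc - 1)) - 1u);
    int32_t wide = (int32_t)(packed << (32u - bpc)) >> (32 - bpc);
    return (float)wide * (1.0f / scale);
}
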
Thanks! I actually verified this myself as well by adding a DumpAsm call
to the blend jitter.

I think the pre-C() version was also fine: there's an explicit
CreateShl() overload (and friends) that takes a plain integer shift
amount, which is why I didn't notice a problem in the first place --
there wasn't one. However, using C() is more consistent with the rest of
the code, so I'll keep that.

[As an aside, we end up going down this path even with logicop == NOOP,
which is the biggest waste ever. But that's a fix for later.]

> +            result[i] = FMUL(SI_TO_FP(result[i], mSimdFP32Ty),
> +                             VIMMED1(1.0f / scale[i]));
> +            break;
>          case SWR_TYPE_UNORM:
>              result[i] = FMUL(UI_TO_FP(result[i], mSimdFP32Ty),
>                               VIMMED1(1.0f / scale[i]));
> -            if (info.type[i] == SWR_TYPE_SNORM)
> -                result[i] = FADD(result[i], VIMMED1(-0.5f));
>              break;
>          }
>
> --
> 2.7.3
>
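For completeness, the forward conversion from the first hunk (SNORM
float to signed integer before blending) looks roughly like the scalar
sketch below. Again the helper is purely illustrative, not driver code,
and clamping to [-1, 1] is assumed to have happened before this point.

#include <stdint.h>

/* Scalar sketch of the new SNORM source/dest path: scale the float by
 * 2^(bpc-1) - 1 and truncate toward zero, mirroring
 * FP_TO_SI(FMUL(x, scale)).  Truncation (not rounding) matches what
 * fptosi does. */
static int32_t snorm_float_to_int(float x, unsigned bpc)
{
    float scale = (float)((1u << (bpc - 1)) - 1u);
    return (int32_t)(x * scale);   /* C cast truncates toward zero, like fptosi */
}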