On Nov 27, 2016, at 11:13 PM, Ilia Mirkin <imir...@alum.mit.edu<mailto:imir...@alum.mit.edu>> wrote:
On Thu, Nov 24, 2016 at 6:11 PM, Ilia Mirkin <imir...@alum.mit.edu<mailto:imir...@alum.mit.edu>> wrote: Instead of (incorrectly) biasing the snorm value to make it look like a unorm, just use signed integer math. This fixes arb_color_buffer_float-render GL_RGBA8_SNORM Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu<mailto:imir...@alum.mit.edu>> --- src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp index ad809c4..339ca52 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp +++ b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp @@ -692,9 +692,13 @@ struct BlendJit : public Builder dst[i] = BITCAST(dst[i], mSimdInt32Ty); break; case SWR_TYPE_SNORM: - src[i] = FADD(src[i], VIMMED1(0.5f)); - dst[i] = FADD(dst[i], VIMMED1(0.5f)); - /* fallthrough */ + src[i] = FP_TO_SI( + FMUL(src[i], VIMMED1(scale[i])), + mSimdInt32Ty); + dst[i] = FP_TO_SI( + FMUL(dst[i], VIMMED1(scale[i])), + mSimdInt32Ty); + break; case SWR_TYPE_UNORM: src[i] = FP_TO_UI( FMUL(src[i], VIMMED1(scale[i])), @@ -728,11 +732,14 @@ struct BlendJit : public Builder result[i] = BITCAST(result[i], mSimdFP32Ty); break; case SWR_TYPE_SNORM: + result[i] = SHL(result[i], 32 - info.bpc[i]); + result[i] = ASHR(result[i], 32 - info.bpc[i]); These two immediate arguments should probably have a C() around them. I've fixed that up in my tree. Hopefully these will emit as VPSLLD and VPSRAD. Not sure how to check that. With the version of the patch from your tree, I’m seeing this IR: %24 = ashr exact <8 x i32> %23, i32 24 %25 = sitofp <8 x i32> %24 to <8 x float> %26 = fmul <8 x float> %25, <float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000, float 0x3F80204080000000> store <8 x float> %26, <8 x float>* %result, align 32 Turn into this x86 code: 9a: vpslld ymm1,ymm3,0x18 9f: vpsrad ymm1,ymm1,0x18 a4: vcvtdq2ps ymm1,ymm1 a8: vmulps ymm1,ymm1,ymm2 ac: vmovaps YMMWORD PTR [rax+0x20],ymm1 So llvm does what you expected. Version of this patch from your tree Reviewed-by: Tim Rowley <timothy.o.row...@intel.com<mailto:timothy.o.row...@intel.com>> + result[i] = FMUL(SI_TO_FP(result[i], mSimdFP32Ty), + VIMMED1(1.0f / scale[i])); + break; case SWR_TYPE_UNORM: result[i] = FMUL(UI_TO_FP(result[i], mSimdFP32Ty), VIMMED1(1.0f / scale[i])); - if (info.type[i] == SWR_TYPE_SNORM) - result[i] = FADD(result[i], VIMMED1(-0.5f)); break; } -- 2.7.3
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev