On 25 July 2015 at 01:15, Marek Olšák <mar...@gmail.com> wrote:
> On Wed, Jul 22, 2015 at 12:51 AM, Dave Airlie <airl...@gmail.com> wrote:
>> From: Dave Airlie <airl...@redhat.com>
>>
>> This is part of ARB_gpu_shader5, and this passes
>> all the piglit tests currently available.
>>
>> Signed-off-by: Dave Airlie <airl...@redhat.com>
>> ---
>>  docs/GL3.txt                             |   2 +-
>>  src/gallium/drivers/radeonsi/si_shader.c | 232 ++++++++++++++++++++++++++++++-
>>  2 files changed, 232 insertions(+), 2 deletions(-)
>>
>> diff --git a/docs/GL3.txt b/docs/GL3.txt
>> index 4f6c415..d74ae63 100644
>> --- a/docs/GL3.txt
>> +++ b/docs/GL3.txt
>> @@ -107,7 +107,7 @@ GL 4.0, GLSL 4.00:
>>    - Geometry shader instancing                   DONE (r600, radeonsi, llvmpipe, softpipe)
>>    - Geometry shader multiple streams             DONE ()
>>    - Enhanced per-sample shading                  DONE (r600, radeonsi)
>> -  - Interpolation functions                      DONE (r600)
>> +  - Interpolation functions                      DONE (r600, radeonsi)
>>    - New overload resolution rules                DONE
>>    GL_ARB_gpu_shader_fp64                         DONE (nvc0, radeonsi, llvmpipe, softpipe)
>>    GL_ARB_sample_shading                          DONE (i965, nv50, nvc0, r600, radeonsi)
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
>> index c5d80f0..0c01c90 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>> @@ -2263,6 +2263,225 @@ static void si_llvm_emit_ddxy(
>>         emit_data->output[0] = lp_build_gather_values(gallivm, result, 4);
>>  }
>>
>> +/* return 4 values - v2i32 DDX, v2i32 DDY */
>> +static LLVMValueRef si_llvm_emit_ddxy_interp(
>> +       struct lp_build_tgsi_context * bld_base,
>> +       LLVMValueRef interp_ij)
>
> Is there any chance we could simplify this by using the DDX/DDY
> instructions directly here? For example:
>
> result = lp_build_emit_llvm_unary(bld_base, TGSI_OPCODE_DDX, arg0);

Not really. I thought about trying to combine the functions a few times,
but nothing fell out that I thought was simpler. The interp version just
does a ddxy on an input i/j pair, not a TGSI src. It didn't seem useful
to convert things to a TGSI src just to pipe it back in.

I've also no idea what the overhead on LDS stores/loads is, but this
patch definitely tries to minimise how many times we hit it: using the
DDX/DDY interface would hit LDS stores 4 times, whereas this only hits
it twice.

Dave.
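
To make the store-count argument concrete, here is a toy C model of one
2x2 pixel quad. It is only a sketch under assumptions and is not the
radeonsi code: the lane layout, the lds[] array, the store counter and
every function name below are hypothetical, and the sign convention for
the derivatives is picked arbitrarily. It only illustrates why taking
DDX and DDY separately for i and j costs four stores, while one combined
pass per component costs two.

```c
#include <assert.h>

/* Toy model of one 2x2 pixel quad; NOT the radeonsi implementation.
 * Assumed lane layout: 0 = top-left, 1 = top-right,
 *                      2 = bottom-left, 3 = bottom-right. */
enum { QUAD_LANES = 4 };

static float lds[QUAD_LANES];  /* stand-in for the quad's LDS scratch slots */
static int lds_store_count;    /* number of store passes issued so far */

/* Every lane writes its own value; counted as one store pass. */
static void lds_store_quad(const float v[QUAD_LANES])
{
    for (int lane = 0; lane < QUAD_LANES; lane++)
        lds[lane] = v[lane];
    lds_store_count++;
}

static float lds_load(int lane)
{
    assert(lane >= 0 && lane < QUAD_LANES);
    return lds[lane];
}

/* Generic path: each derivative op does its own store. */
static float ddx_single(const float v[QUAD_LANES])
{
    lds_store_quad(v);
    return lds_load(1) - lds_load(0);   /* top-right minus top-left */
}

static float ddy_single(const float v[QUAD_LANES])
{
    lds_store_quad(v);
    return lds_load(2) - lds_load(0);   /* bottom-left minus top-left */
}

/* Combined path: one store, then both derivatives from the same data. */
static void ddxy_combined(const float v[QUAD_LANES], float *ddx, float *ddy)
{
    lds_store_quad(v);
    float tl = lds_load(0);
    *ddx = lds_load(1) - tl;
    *ddy = lds_load(2) - tl;
}

/* Fills out[] as { DDX(i), DDX(j), DDY(i), DDY(j) }; returns store count. */
int ddxy_ij_separate(const float i[QUAD_LANES], const float j[QUAD_LANES],
                     float out[4])
{
    lds_store_count = 0;
    out[0] = ddx_single(i);
    out[1] = ddx_single(j);
    out[2] = ddy_single(i);
    out[3] = ddy_single(j);
    return lds_store_count;
}

/* Same output layout, but one store per component; returns store count. */
int ddxy_ij_combined(const float i[QUAD_LANES], const float j[QUAD_LANES],
                     float out[4])
{
    lds_store_count = 0;
    ddxy_combined(i, &out[0], &out[2]);
    ddxy_combined(j, &out[1], &out[3]);
    return lds_store_count;
}
```

With i = {1, 3, 2, 4} and j = {0.5, 0.5, 1.5, 1.5} across the four
lanes, both paths produce DDX = (2, 0) and DDY = (1, 1); the separate
path reports 4 store passes and the combined path reports 2.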