On Wed, May 23, 2018 at 6:37 PM, Nicolai Hähnle <nhaeh...@gmail.com> wrote:
> On 23.05.2018 15:30, Bas Nieuwenhuizen wrote:
>>
>> WQM is pretty reliable now on LLVM 7, so let us just use
>> DPP + WQM.
>>
>> This gives approximately a 1.5% performance increase on the
>> vrcompositor built-in benchmark.
>>
>> v2: Use ac_build_quad_swizzle.
>> ---
>>  src/amd/common/ac_llvm_build.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/amd/common/ac_llvm_build.c
>> b/src/amd/common/ac_llvm_build.c
>> index 36c1d62637b..0c0228fe9c7 100644
>> --- a/src/amd/common/ac_llvm_build.c
>> +++ b/src/amd/common/ac_llvm_build.c
>> @@ -1170,7 +1170,21 @@ ac_build_ddxy(struct ac_llvm_context *ctx,
>>         LLVMValueRef tl, trbl, args[2];
>>         LLVMValueRef result;
>> -       if (ctx->chip_class >= VI) {
>> +       if (ctx->chip_class >= VI && HAVE_LLVM >= 0x0700) {
>
>
> Do you really need the chip_class check here? ac_build_quad_swizzle should
> just use ds_swizzle on the older chips, right?
>
> So all the code below can be removed once we drop support for LLVM < 7
> (which will of course be quite some time in the future, but hey!)

Fair enough, removed the check locally.

>
> Apart from that,
>
> Reviewed-by: Nicolai Hähnle <nicolai.haeh...@amd.com>
>
>> +               unsigned tl_lanes[4], trbl_lanes[4];
>> +
>> +               for (unsigned i = 0; i < 4; ++i) {
>> +                       tl_lanes[i] = i & mask;
>> +                       trbl_lanes[i] = (i & mask) + idx;
>> +               }
>> +
>> +               tl = ac_build_quad_swizzle(ctx, val,
>> +                                          tl_lanes[0], tl_lanes[1],
>> +                                          tl_lanes[2], tl_lanes[3]);
>> +               trbl = ac_build_quad_swizzle(ctx, val,
>> +                                            trbl_lanes[0], trbl_lanes[1],
>> +                                            trbl_lanes[2], trbl_lanes[3]);
>> +       } else if (ctx->chip_class >= VI) {
>>                 LLVMValueRef thread_id, tl_tid, trbl_tid;
>>                 thread_id = ac_get_thread_id(ctx);
>>
>
>
>
> --
> Learn how the world really is,
> but never forget how it should be.
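
For readers following the lane arithmetic in the quoted hunk, here is a minimal CPU-side sketch of what the tl_lanes/trbl_lanes loop computes for one 2x2 quad. It is not the Mesa code. The QUAD_MASK_* names and values, the quad_lanes() and quad_derivative() helpers, and the lane layout (0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right) are assumptions made for illustration. The final subtraction (trbl - tl) is also assumed, since it lies outside the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask values: clearing bit 0 snaps a lane to the left
 * column of its quad, clearing bit 1 snaps it to the top row. */
#define QUAD_MASK_LEFT     0xfffffffeu
#define QUAD_MASK_TOP      0xfffffffdu
#define QUAD_MASK_TOP_LEFT 0xfffffffcu

/* Same arithmetic as the loop in the patch: for each of the four lanes
 * in a quad, pick the "base" lane and the neighbour idx lanes away. */
static void quad_lanes(uint32_t mask, unsigned idx,
                       unsigned tl_lanes[4], unsigned trbl_lanes[4])
{
    for (unsigned i = 0; i < 4; ++i) {
        tl_lanes[i] = i & mask;
        trbl_lanes[i] = (i & mask) + idx;
        /* The mask/idx pair must keep the neighbour inside the quad. */
        assert(trbl_lanes[i] < 4);
    }
}

/* CPU model of two quad swizzles followed by a subtraction
 * (the subtraction is an assumption, see the note above). */
static void quad_derivative(const float val[4], uint32_t mask,
                            unsigned idx, float out[4])
{
    unsigned tl[4], trbl[4];

    quad_lanes(mask, idx, tl, trbl);
    for (unsigned i = 0; i < 4; ++i)
        out[i] = val[trbl[i]] - val[tl[i]];
}
```

Under these assumptions, mask = QUAD_MASK_LEFT with idx = 1 reads lanes {0, 0, 2, 2} and {1, 1, 3, 3}, which gives a per-row horizontal difference, while mask = QUAD_MASK_TOP_LEFT makes all four lanes share the top-left pixel's difference.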