On Thu, Nov 2, 2017 at 12:10 PM, Nicolai Hähnle <nhaeh...@gmail.com> wrote:
> On 31.10.2017 16:36, Connor Abbott wrote:
>>
>> On Tue, Oct 31, 2017 at 2:08 AM, Dave Airlie <airl...@gmail.com> wrote:
>>>>
>>>> +LLVMValueRef
>>>> +ac_build_subgroup_inclusive_scan(struct ac_llvm_context *ctx,
>>>> +                                 LLVMValueRef src,
>>>> +                                 ac_reduce_op reduce,
>>>> +                                 LLVMValueRef identity)
>>>> +{
>>>> +       /* See http://gpuopen.com/amd-gcn-assembly-cross-lane-operations/
>>>> +        *
>>>> +        * Note that each dpp/reduce pair is supposed to be compiled down to
>>>> +        * one instruction by LLVM, at least for 32-bit values.
>>>> +        *
>>>> +        * TODO: use @llvm.amdgcn.ds.swizzle on SI and CI
>>>> +        */
>>>> +       LLVMValueRef value = src;
>>>> +       value = reduce(ctx, value,
>>>> +                      ac_build_dpp(ctx, identity, src,
>>>> +                                   dpp_row_sr(1), 0xf, 0xf, false));
>>>> +       value = reduce(ctx, value,
>>>> +                      ac_build_dpp(ctx, identity, src,
>>>> +                                   dpp_row_sr(2), 0xf, 0xf, false));
>>>> +       value = reduce(ctx, value,
>>>> +                      ac_build_dpp(ctx, identity, src,
>>>> +                                   dpp_row_sr(3), 0xf, 0xf, false));
>>>> +       value = reduce(ctx, value,
>>>> +                      ac_build_dpp(ctx, identity, value,
>>>> +                                   dpp_row_sr(4), 0xf, 0xe, false));
>>>> +       value = reduce(ctx, value,
>>>> +                      ac_build_dpp(ctx, identity, value,
>>>> +                                   dpp_row_sr(8), 0xf, 0xc, false));
>>>> +       value = reduce(ctx, value,
>>>> +                      ac_build_dpp(ctx, identity, value,
>>>> +                                   dpp_row_bcast15, 0xa, 0xf, false));
>>>> +       value = reduce(ctx, value,
>>>> +                      ac_build_dpp(ctx, identity, value,
>>>> +                                   dpp_row_bcast31, 0xc, 0xf, false));
>>>
>>>
>>> btw I dumped some shaders from doom on pro,
>>>
>>> it looked like it ended up with
>>>
>>> 1, 0xf, 0xf,
>>> 2, 0xf, 0xf,
>>> 4, 0xf, 0xf
>>> 8, 0xf, 0xf
>>> bcast15 0xa, 0xf
>>> bcast31 0xc, 0xf
>>>
>>> It also seems to apply these direct to instructions like
>>> /*000000002b80*/ s_nop 0x0
>>> /*000000002b84*/ v_min_u32 v83, v83, v83 row_shr:1 bank_mask:15 row_mask:15
>>> /*000000002b8c*/ s_nop 0x1
>>> /*000000002b90*/ v_min_u32 v83, v83, v83 row_shr:2 bank_mask:15 row_mask:15
>>> /*000000002b98*/ s_nop 0x1
>>> /*000000002b9c*/ v_min_u32 v83, v83, v83 row_shr:4 bank_mask:15 row_mask:15
>>> /*000000002ba4*/ s_nop 0x1
>>> /*000000002ba8*/ v_min_u32 v83, v83, v83 row_shr:8 bank_mask:15 row_mask:15
>>> /*000000002bb0*/ s_nop 0x1
>>> /*000000002bb4*/ v_min_u32 v83, v83, v83 row_bcast15 bank_mask:15 row_mask:10
>>> /*000000002bbc*/ s_nop 0x1
>>> /*000000002bc0*/ v_min_u32 v83, v83, v83 row_bcast31 bank_mask:15 row_mask:12
>>>
>>> I think the instruction combining is probably an llvm job, but I
>>> wonder if the different row_shr
>>> etc is what we should use as well.
>>
>>
>> Yeah, LLVM should be combining the move and min -- hence the comment
>> here -- but it isn't yet. That shouldn't be too hard to do once we get
>> it working. Also, I've seen that way of doing it before, and IIRC it's
>> one instruction slower than the sequence in the blog post I cited,
>> since even though there's one less instruction, there's an extra
>> two-cycle stall between the first two instructions since v83 is the
>> destination of the first instruction and DPP source of the second
>> (hence the s_nop 0x1). So once we combine instructions this should be
>> better than what -pro does :)
>
>
> Agreed, though even more ideally, LLVM would be able to fill those gaps with
> other instructions ;)

Well, that isn't really possible when the sequence is in WWM and
everything else isn't. We could fill the slot with a scalar
instruction, but I think LLVM is currently overly conservative and
treats instructions writing EXEC as barriers even though it doesn't
need to.

>
> Anyway, the combining of instructions is really the important task.

Agreed. Although I think getting it working first is even more important :)

>
> Cheers,
> Nicolai
>
> --
> Learn how the world really is,
> but never forget how it ought to be.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
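
For readers following the thread, here is a rough CPU-side sketch of the two DPP sequences discussed above: the one from the patch (row_shr 1, 2, 3 on the original source, then 4, 8 and the two broadcasts on the running value) and the one seen in the -pro dump (row_shr 1, 2, 4, 8 on the running value). It is only a model under assumptions. The names dpp_mov, scan_step, scan_blog, scan_pro and scan_reference are made up for illustration and are not Mesa or LLVM functions. The DPP behaviour is modelled as: with bound_ctrl off, a lane whose row or bank is masked out, or whose source lane falls outside its row, receives the `old` value. Integer add with identity 0 stands in for the generic reduce op, and nothing about timing or the s_nop stalls is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define WAVE_SIZE 64
#define ROW_SIZE 16

/* Hypothetical names: these model DPP controls, they are not Mesa's enums. */
enum dpp_kind { DPP_ROW_SHR, DPP_ROW_BCAST15, DPP_ROW_BCAST31 };

/* Assumed model of a DPP move with bound_ctrl off: a lane gets `old` when its
 * row or bank is masked out or when its source lane is out of range. */
static void dpp_mov(uint32_t out[WAVE_SIZE], uint32_t old,
                    const uint32_t src[WAVE_SIZE], enum dpp_kind kind,
                    unsigned shift, unsigned row_mask, unsigned bank_mask)
{
    for (unsigned lane = 0; lane < WAVE_SIZE; lane++) {
        unsigned row = lane / ROW_SIZE;
        unsigned bank = (lane % ROW_SIZE) / 4;
        bool enabled = ((row_mask >> row) & 1) && ((bank_mask >> bank) & 1);
        bool valid = false;
        unsigned from = 0;

        switch (kind) {
        case DPP_ROW_SHR:      /* read from `shift` lanes lower, same row */
            valid = lane % ROW_SIZE >= shift;
            from = valid ? lane - shift : 0;
            break;
        case DPP_ROW_BCAST15:  /* last lane of the previous row */
            valid = row >= 1;
            from = valid ? row * ROW_SIZE - 1 : 0;
            break;
        case DPP_ROW_BCAST31:  /* lane 31, for the upper two rows */
            valid = row >= 2;
            from = 31;
            break;
        }
        out[lane] = (enabled && valid) ? src[from] : old;
    }
}

/* One dpp/reduce pair: value = value + dpp(identity = 0, src). */
static void scan_step(uint32_t value[WAVE_SIZE], const uint32_t src[WAVE_SIZE],
                      enum dpp_kind kind, unsigned shift,
                      unsigned row_mask, unsigned bank_mask)
{
    uint32_t moved[WAVE_SIZE];
    dpp_mov(moved, 0, src, kind, shift, row_mask, bank_mask);
    for (unsigned lane = 0; lane < WAVE_SIZE; lane++)
        value[lane] += moved[lane];
}

/* Sequence from the patch; `value` and `src` must not alias. */
static void scan_blog(uint32_t value[WAVE_SIZE], const uint32_t src[WAVE_SIZE])
{
    memcpy(value, src, WAVE_SIZE * sizeof(uint32_t));
    scan_step(value, src, DPP_ROW_SHR, 1, 0xf, 0xf);
    scan_step(value, src, DPP_ROW_SHR, 2, 0xf, 0xf);
    scan_step(value, src, DPP_ROW_SHR, 3, 0xf, 0xf);
    scan_step(value, value, DPP_ROW_SHR, 4, 0xf, 0xe);
    scan_step(value, value, DPP_ROW_SHR, 8, 0xf, 0xc);
    scan_step(value, value, DPP_ROW_BCAST15, 0, 0xa, 0xf);
    scan_step(value, value, DPP_ROW_BCAST31, 0, 0xc, 0xf);
}

/* Sequence from the -pro dump: every step reads the running value. */
static void scan_pro(uint32_t value[WAVE_SIZE], const uint32_t src[WAVE_SIZE])
{
    memcpy(value, src, WAVE_SIZE * sizeof(uint32_t));
    for (unsigned shift = 1; shift <= 8; shift *= 2)
        scan_step(value, value, DPP_ROW_SHR, shift, 0xf, 0xf);
    scan_step(value, value, DPP_ROW_BCAST15, 0, 0xa, 0xf);
    scan_step(value, value, DPP_ROW_BCAST31, 0, 0xc, 0xf);
}

/* Plain serial inclusive scan to compare against. */
static void scan_reference(uint32_t out[WAVE_SIZE], const uint32_t src[WAVE_SIZE])
{
    uint32_t sum = 0;
    for (unsigned lane = 0; lane < WAVE_SIZE; lane++)
        out[lane] = sum += src[lane];
}
```

Under this model both sequences give the same per-lane result as the serial scan. The difference discussed in the thread is in the dependency chain: the first three steps of the patch's sequence all read the unmodified source, while every step of the -pro sequence reads the result of the previous one.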