On Tue, Jun 6, 2017 at 1:48 PM, Connor Abbott <cwabbo...@gmail.com> wrote:
> On Tue, Jun 6, 2017 at 1:45 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>>
>>
>> On Mon, Jun 5, 2017 at 9:52 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>>>
>>> On Mon, Jun 5, 2017 at 6:37 PM, Connor Abbott <cwabbo...@gmail.com> wrote:
>>>>
>>>> I pushed a v2 at
>>>> https://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-divergence-v2.
>>>> I'm not sure if I like this version better, though. I'll have to think
>>>> about it. In the meantime, feel free to take a look.
>>>
>>>
>>> I've taken a skim through the branch and I agree that I'm not sure either.
>>> Here's a few thoughts in no particular order:
>>>
>>> 1) Other than the fact that it's a pile of churn, it doesn't seem to make
>>> too much difference whether dFdx and dFdy are ALU or intrinsics
>>>
>>> 2) Convergent instructions are, in a lot of ways, easier to deal with
>>> than plain cross-thread ones. Convergent ops can always be moved up the
>>> dominance tree or down into uniform control-flow. Regular cross-thread
>>> instructions can't be moved across any non-uniform control-flow.
>>>
>>> 3) dFdx and dFdy are weird because they're convergent so it's clear they
>>> are special but not clear they should be intrinsics instead of ALU
>>>
>>> 4) I like the nir_instr_is_convergent() and nir_instr_is_cross_thread()
>>> helpers
>>>
>>> 5) non-convergent cross-thread instructions should definitely be
>>> intrinsics.
>>>
>>> 6) I think the shader ballot stuff is all non-convergent cross-thread as
>>> are some of the more advanced subgroup operations (see HLSL shader model
>>> 6.0).
>>
>>
>> Having slept on things a bit, I think I've come to the conclusion that
>> leaving dFdx and dFdy as-is should be fine so long as we have the
>> nir_instr_is_convergent() and _is_cross_thread() helpers. We need to do
>> special casing in those for texture instructions anyway so adding in a quick
>> switch for ALU derivatives isn't bad.
>> For shader_ballot type instructions,
>> I think they're probably best done as intrinsics for now. That way the
>> compiler will leave them alone most of the time and only things that
>> actually know what they're doing will ever try to optimize them.
>>
>> --Jason
>
> Ok, that sounds good. I pushed a nir-divergence-v3 branch which does just that. I'll start using that as a base for my work on radv.
>
>>
>>>
>>> That's all for now,
>>>
>>> --Jason
>>>
>>>>
>>>> On Mon, Jun 5, 2017 at 2:43 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>>>> > On Mon, Jun 5, 2017 at 1:50 PM, Connor Abbott <cwabbo...@gmail.com> wrote:
>>>> >>
>>>> >> On Mon, Jun 5, 2017 at 1:37 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>>>> >> > I'm not sure how I feel about having these as ALU operations. ALU
>>>> >> > operations are generally pure functions (with the exception of
>>>> >> > derivatives) that can be re-ordered at will. I don't really like
>>>> >> > breaking that. In fact, I'd almost be inclined to make derivatives
>>>> >> > intrinsics and just special-case them in constant folding. Thoughts?
>>>> >>
>>>> >> I wasn't too sure about this either. It is a little weird to make
>>>> >> these ALU instructions. I followed the rule here that if something can
>>>> >> be constant-folded, it should be an ALU instruction, but I guess you
>>>> >> can argue that it's just a coincidence that these can be
>>>> >> constant-folded anyways.
>>>> >
>>>> >
>>>> > Yeah. As subgroup ops get more complicated, I think a lot of the
>>>> > subgroup operations can be constant-folded after a fashion but the
>>>> > rules get weird fast.
>>>> >
>>>> >>
>>>> >> I guess the main downside is that it would be
>>>> >> impossible to make nir_algebraic patterns with these, although I can't
>>>> >> think of too many simple pattern-matching type things you'd want to do
>>>> >> on these instructions anyways.
>>>> >
>>>> >
>>>> > Yeah. My gut also tells me that shaders which are "advanced" enough to
>>>> > use subgroup features probably don't need (or it can't be done) the
>>>> > massive reductions we do for D3D9-generated shaders.
>>>> >
>>>> >>
>>>> >> Maybe something like not(any(not(foo))) -> all(foo) and vice-versa?
>>>> >>
>>>> >> >
>>>> >> > On Mon, Jun 5, 2017 at 12:22 PM, Connor Abbott <cwabbo...@gmail.com> wrote:
>>>> >> >>
>>>> >> >> Signed-off-by: Connor Abbott <cwabbo...@gmail.com>
>>>> >> >> ---
>>>> >> >>  src/compiler/nir/nir_intrinsics.h | 14 ++++++++++++++
>>>> >> >>  src/compiler/nir/nir_opcodes.py   | 18 ++++++++++++++++--
>>>> >> >>  2 files changed, 30 insertions(+), 2 deletions(-)
>>>> >> >>
>>>> >> >> diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h
>>>> >> >> index 21e7d90..157df7f 100644
>>>> >> >> --- a/src/compiler/nir/nir_intrinsics.h
>>>> >> >> +++ b/src/compiler/nir/nir_intrinsics.h
>>>> >> >> @@ -330,6 +330,20 @@ SYSTEM_VALUE(channel_num, 1, 0, xx, xx, xx)
>>>> >> >>  SYSTEM_VALUE(alpha_ref_float, 1, 0, xx, xx, xx)
>>>> >> >>  SYSTEM_VALUE(layer_id, 1, 0, xx, xx, xx)
>>>> >> >>  SYSTEM_VALUE(view_index, 1, 0, xx, xx, xx)
>>>> >> >> +SYSTEM_VALUE(subgroup_invocation, 1, 0, xx, xx, xx)
>>>> >> >> +
>>>> >> >> +
>>>> >> >> +/* ARB_shader_ballot instructions */
>>>> >> >> +
>>>> >> >> +SYSTEM_VALUE(subgroup_eq_mask, 1, 0, xx, xx, xx)
>>>> >> >> +SYSTEM_VALUE(subgroup_ge_mask, 1, 0, xx, xx, xx)
>>>> >> >> +SYSTEM_VALUE(subgroup_gt_mask, 1, 0, xx, xx, xx)
>>>> >> >> +SYSTEM_VALUE(subgroup_le_mask, 1, 0, xx, xx, xx)
>>>> >> >> +SYSTEM_VALUE(subgroup_lt_mask, 1, 0, xx, xx, xx)
>>>> >> >> +
>>>> >> >> +INTRINSIC(ballot, 1, ARR(0), true, 0, 0, 0, xx, xx, xx,
>>>> >> >> +          NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER |
>>>> >> >> +          NIR_INTRINSIC_CROSS_THREAD)
>>>> >> >>
>>>> >> >>  /* Blend constant color values. Float values are clamped. */
>>>> >> >>  SYSTEM_VALUE(blend_const_color_r_float, 1, 0, xx, xx, xx)
>>>> >> >> diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
>>>> >> >> index be3ab6d..05a80b2 100644
>>>> >> >> --- a/src/compiler/nir/nir_opcodes.py
>>>> >> >> +++ b/src/compiler/nir/nir_opcodes.py
>>>> >> >> @@ -120,8 +120,10 @@ def opcode(name, output_size, output_type, input_sizes, input_types,
>>>> >> >>            input_types, convergent, cross_thread,
>>>> >> >>            algebraic_properties, const_expr)
>>>> >> >>
>>>> >> >> -def unop_convert(name, out_type, in_type, const_expr):
>>>> >> >> -   opcode(name, 0, out_type, [0], [in_type], "", const_expr)
>>>> >> >> +def unop_convert(name, out_type, in_type, const_expr, cross_thread=False,
>>>> >> >> +                 convergent=False):
>>>> >> >> +   opcode(name, 0, out_type, [0], [in_type], "", const_expr, convergent,
>>>> >> >> +          cross_thread)
>>>> >> >>
>>>> >> >>  def unop(name, ty, const_expr, convergent=False, cross_thread=False):
>>>> >> >>     opcode(name, 0, ty, [0], [ty], "", const_expr, convergent, cross_thread)
>>>> >> >> @@ -355,6 +357,18 @@ for i in xrange(1, 5):
>>>> >> >>     for j in xrange(1, 5):
>>>> >> >>        unop_horiz("fnoise{0}_{1}".format(i, j), i, tfloat, j, tfloat, "0.0f")
>>>> >> >>
>>>> >> >> +# ARB_shader_ballot instructions
>>>> >> >> +
>>>> >> >> +opcode("read_invocation", 0, tuint, [0, 1], [tuint, tuint32], "", "src0",
>>>> >> >> +       cross_thread=True)
>>>> >> >> +unop("read_first_invocation", tuint, "src0", cross_thread=True)
>>>> >> >> +
>>>> >> >> +# ARB_shader_group_vote instructions
>>>> >> >> +
>>>> >> >> +unop("any_invocations", tbool, "src0", cross_thread=True)
>>>> >> >> +unop("all_invocations", tbool, "src0", cross_thread=True)
>>>> >> >> +unop("all_invocations_equal", tbool, "true", cross_thread=True)
>>>> >> >> +
>>>> >> >>  def binop_convert(name, out_type, in_type, alg_props, const_expr):
>>>> >> >>     opcode(name, 0, out_type, [0, 0], [in_type, in_type], alg_props, const_expr)
>>>> >> >>
>>>> >> >> --
>>>> >> >> 2.9.3
>>>> >> >>
>>>> >> >> _______________________________________________
>>>> >> >> mesa-dev mailing list
>>>> >> >> mesa-dev@lists.freedesktop.org
>>>> >> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
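
For the not(any(not(foo))) -> all(foo) idea and the const_expr strings in the
patch ("src0" for any_invocations, all_invocations and read_first_invocation,
"true" for all_invocations_equal), a small standalone model may make the
reasoning easier to check. This is only a sketch in plain Python, not NIR code.
It assumes a subgroup can be modeled as a list with one value per active
invocation and that at least one invocation is active. The function names
mirror the opcodes in the patch, but their bodies and the two check_* helpers
are hypothetical and written just for this illustration.

```python
from itertools import product

# Toy model (assumption): a subgroup is a non-empty list holding one value
# per active invocation. None of this is Mesa/NIR API.

def any_invocations(values):
    """True if the condition holds in at least one active invocation."""
    return any(values)

def all_invocations(values):
    """True if the condition holds in every active invocation."""
    return all(values)

def all_invocations_equal(values):
    """True if every active invocation holds the same value."""
    return all(v == values[0] for v in values)

def read_first_invocation(values):
    """Value held by the first active invocation."""
    return values[0]

def check_vote_identities(subgroup_size=4):
    """Exhaustively check not(any(not(x))) == all(x) and the reverse form."""
    for combo in product([False, True], repeat=subgroup_size):
        values = list(combo)
        negated = [not v for v in values]
        if (not any_invocations(negated)) != all_invocations(values):
            return False
        if (not all_invocations(negated)) != any_invocations(values):
            return False
    return True

def check_constant_folding(subgroup_size=4):
    """A constant source is identical in every invocation, so each op
    collapses to the const_expr used in the patch."""
    for c in (False, True):
        values = [c] * subgroup_size
        if any_invocations(values) != c:            # const_expr "src0"
            return False
        if all_invocations(values) != c:            # const_expr "src0"
            return False
        if read_first_invocation(values) != c:      # const_expr "src0"
            return False
        if not all_invocations_equal(values):       # const_expr "true"
            return False
    return True
```

Both checks pass for any positive subgroup size in this model. The folding
only works because a constant is trivially uniform across the subgroup, which
fits the remark earlier in the thread that constant-foldability of these ops
may be something of a coincidence.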