Jason Ekstrand <ja...@jlekstrand.net> writes: > Overall this looks correct. I've got a few nits below and I'd like to take > a look at it with fresh eyes before giving an R-B as it's complicated > especially with all of the stuff to handle non-ssa. Not sure if it's > really worth doing non-ssa now that I see how much more complicated it > makes things. > --Jason > > On Wed, Jan 21, 2015 at 5:26 PM, Eric Anholt <e...@anholt.net> wrote: > >> This is the equivalent of brw_fs_channel_expressions.cpp, which I wanted >> for vc4.
>> +static void >> +reduce_op_replace(nir_alu_instr *instr, nir_ssa_def *def, void *mem_ctx) >> +{ >> + assert(instr->dest.write_mask == 1); >> + >> + if (instr->dest.dest.is_ssa) { >> + nir_src new_src; >> + new_src.is_ssa = true; >> + new_src.ssa = def; >> + nir_ssa_def_rewrite_uses(&instr->dest.dest.ssa, new_src, mem_ctx); >> + } else { >> + nir_alu_instr *mov = nir_alu_instr_create(mem_ctx, nir_op_imov); >> + mov->src[0].src.is_ssa = true; >> + mov->src[0].src.ssa = def; >> + >> + nir_alu_dest_copy(&mov->dest, &instr->dest, mem_ctx); >> + nir_instr_insert_after(&instr->instr, &mov->instr); > > > I usually do insert_before when possible. It will result in the same list > (since we remove the instruction we're putting this before) but, since this > happens as we're iterating over it, insert_before is a bit safer. Maybe > this is still safe; I'm not sure. Done. >> + } >> + >> + nir_instr_remove(&instr->instr); >> +} >> + >> +static void >> +lower_reduction(nir_alu_instr *instr, nir_op chan_op, nir_op merge_op, >> + void *mem_ctx) >> +{ >> + unsigned num_components = nir_op_infos[instr->op].input_sizes[0]; >> + >> + nir_ssa_def *last = NULL; >> + for (unsigned i = 0; i < num_components; i++) { >> + nir_alu_instr *chan = nir_alu_instr_create(mem_ctx, chan_op); >> + nir_alu_ssa_dest_init(chan, 1); >> + nir_alu_src_copy(&chan->src[0], &instr->src[0], mem_ctx); >> + chan->src[0].swizzle[0] = chan->src[0].swizzle[i]; >> + if (nir_op_infos[chan_op].num_inputs > 1) { >> > > assert num_inputs == 2 here? Done. >> + nir_alu_src_copy(&chan->src[1], &instr->src[1], mem_ctx); >> + chan->src[1].swizzle[0] = chan->src[1].swizzle[i]; >> + } >> + >> + nir_instr_insert_before(&instr->instr, &chan->instr); >> + >> + if (i == 0) { >> + last = &chan->dest.dest.ssa; >> + } else { >> + nir_alu_instr *merge = nir_alu_instr_create(mem_ctx, merge_op); >> + nir_alu_ssa_dest_init(merge, 1); >> + merge->dest.write_mask = 1; >> + merge->src[0].src.is_ssa = true; >> + merge->src[0].src.ssa = last; >> + merge->src[1].src.is_ssa = true; >> + merge->src[1].src.ssa = &chan->dest.dest.ssa; >> + nir_instr_insert_before(&instr->instr, &merge->instr); >> + last = &merge->dest.dest.ssa; >> > > It might be nice if the tree were better balanced but that looks like way > too much work, so meh. Yeah, ew. We'll probably want the general rebalancer to be ported to NIR, anyway. >> + if (num_components < 2) >> + unsupported(&instr->instr); >> + vec_instr = nir_alu_instr_create(mem_ctx, nir_op_map[num_components >> - 2]); >> + nir_alu_ssa_dest_init(vec_instr, num_components); >> + } >> + >> + /* Walk from the end of the channels, so our incremental inserts after >> the >> + * original instruction end up in a sensible xyzw order. >> + */ >> + for (chan = 0; chan < 4; chan++) { >> + if (!(lower_chans & (1 << chan))) { >> + if (instr->dest.dest.is_ssa && >> + chan < instr->dest.dest.ssa.num_components) { >> + unsupported(&instr->instr); >> + } >> + continue; >> + } >> + >> + nir_alu_instr *lower = nir_alu_instr_create(mem_ctx, instr->op); >> + for (i = 0; i < num_src; i++) { >> + /* bcsel and fcsel reuse the same src channel in src[0]. */ >> > > Stale comment. This isn't the case for bcsel or fcsel anymore Fixed. >> + unsigned src_chan = (nir_op_infos[instr->op].input_sizes[i] == 1 >> ? >> + 0 : chan); >> > > I think you want input_sizes[i] != 0 here. We could have an instruction > that takes a vec2 as one component but is otherwise vectorized. I don't even know what this path could reasonably do with a 2-component src with a >2 component dest, though. Leave the original 2-component swizzle in place? This path removes all other vector inputs from ALU ops, so it seems like a funny theoretical operation to be trying to handle here.
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev