Timothy Arceri <tarc...@itsqueeze.com> writes: > This is intended to be called before nir_lower_io() so that we > can do some linking optimisations with the results. It can also > be used with drivers that don't use nir_lower_io() at all such > as RADV.
> +static void > +lower_load_to_scalar_early(nir_builder *b, nir_intrinsic_instr *intr, > + nir_variable *var, struct hash_table > *split_inputs, > + struct hash_table *split_outputs) > +{ > + b->cursor = nir_before_instr(&intr->instr); > + > + assert(intr->dest.is_ssa); > + > + nir_ssa_def *loads[4]; > + > + nir_variable **chan_vars; > + if (var->data.mode == nir_var_shader_in) { > + chan_vars = get_channel_variables(split_inputs, var); > + } else { > + chan_vars = get_channel_variables(split_outputs, var); > + } > + > + for (unsigned i = 0; i < intr->num_components; i++) { > + nir_variable *chan_var = chan_vars[var->data.location_frac + i]; > + if (!chan_vars[var->data.location_frac + i]) { > + chan_var = nir_variable_clone(var, b->shader); > + chan_var->data.location_frac = var->data.location_frac + i; > + chan_var->type = glsl_channel_type(chan_var->type); > + > + chan_vars[var->data.location_frac + i] = chan_var; > + > + nir_shader_add_variable(b->shader, chan_var); > + } > + > + nir_intrinsic_instr *chan_intr = > + nir_intrinsic_instr_create(b->shader, intr->intrinsic); > + nir_ssa_dest_init(&chan_intr->instr, &chan_intr->dest, > + 1, intr->dest.ssa.bit_size, NULL); > + chan_intr->num_components = 1; > + chan_intr->variables[0] = nir_deref_var_create(chan_intr, chan_var); > + > + if (intr->variables[0]->deref.child) { > + chan_intr->variables[0]->deref.child = > + > &clone_deref_array(nir_deref_as_array(intr->variables[0]->deref.child), > + &chan_intr->variables[0]->deref)->deref; > + } Weird. I would have expected that we would turn arrays into separate vectors (if possible) before this pass, so there wouldn't be any deref chains. Has scalarizing arryas, specifically, been useful? > +/* > + * This function is intended to be called earlier than > nir_lower_io_to_scalar() > + * i.e. before nir_lower_io() is called. > + */ > +void > +nir_lower_io_to_scalar_early(nir_shader *shader, unsigned first, unsigned > last) > +{ > + nir_variable_mode mask = 0; > + if (shader->info.stage != first) > + mask |= nir_var_shader_in; > + > + if (shader->info.stage != last) > + mask |= nir_var_shader_out; Could we pass in a mode mask like nir_lower_io_to_scalar()? This first/last arg feels really weird to me. I actually kind of want this to be in the same pass as the other one -- the cost of creating the two hash tables and switching on a few more intrinsic types seems really small to me (One of my concerns about NIR passes is the lack of documentation, particularly around ordering requirements and driver-lowered IO). > + struct hash_table *split_inputs = > + _mesa_hash_table_create(NULL, _mesa_hash_pointer, > + _mesa_key_pointer_equal); > + struct hash_table *split_outputs = > + _mesa_hash_table_create(NULL, _mesa_hash_pointer, > + _mesa_key_pointer_equal); > + > + nir_foreach_function(function, shader) { > + if (function->impl) { > + nir_builder b; > + nir_builder_init(&b, function->impl); > + > + nir_foreach_block(block, function->impl) { > + nir_foreach_instr_safe(instr, block) { > + if (instr->type != nir_instr_type_intrinsic) > + continue; > + > + nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr); > + > + if (intr->num_components == 1) > + continue; > + > + if (intr->intrinsic != nir_intrinsic_load_var && > + intr->intrinsic != nir_intrinsic_store_var && > + intr->intrinsic != nir_intrinsic_interp_var_at_centroid && > + intr->intrinsic != nir_intrinsic_interp_var_at_sample && > + intr->intrinsic != nir_intrinsic_interp_var_at_offset) > + continue; > + > + nir_variable *var = intr->variables[0]->var; > + nir_variable_mode mode = var->data.mode; > + > + /* TODO: add patch support */ > + if (var->data.patch) > + continue; > + > + /* TODO: add doubles support */ > + if (glsl_type_is_64bit(glsl_without_array(var->type))) > + continue; > + > + if (var->data.location < VARYING_SLOT_VAR0 && > + var->data.location >= 0) > + continue; Could we do VARYING_SLOT_TEX* as well?
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev