On Thu, Oct 1, 2015 at 8:14 AM, Connor Abbott <cwabbo...@gmail.com> wrote: > On Wed, Sep 30, 2015 at 12:35 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote: >> On Wed, Sep 30, 2015 at 8:11 AM, Connor Abbott <cwabbo...@gmail.com> wrote: >>> This will replace direct usage of nir_instrs_equal() in the CSE pass, >>> which reduces an O(n^2) algorithm with an effectively O(n) one. It'll >>> also be useful for implementing GVN on top of GCM. >>> >>> v2: >>> - Add texture support. >>> - Add more comments. >>> - Rename instr_can_hash() to instr_can_rewrite() since it's really more >>> about whether its uses can be rewritten, and it's implicitly used by >>> nir_instrs_equal() as well. >>> - Rename nir_instr_set_add() to nir_instr_set_add_or_rewrite() (Jason). >>> - Make the HASH() macro less magical (Topi). >>> - Rewrite the commit message. >>> >>> Signed-off-by: Connor Abbott <cwabbo...@gmail.com> >>> --- >>> src/glsl/nir/nir_instr_set.c | 311 >>> +++++++++++++++++++++++++++++++++++++++++++ >>> src/glsl/nir/nir_instr_set.h | 35 +++++ >>> 2 files changed, 346 insertions(+) >>> >>> diff --git a/src/glsl/nir/nir_instr_set.c b/src/glsl/nir/nir_instr_set.c >>> index 72ab048..a6f2226 100644 >>> --- a/src/glsl/nir/nir_instr_set.c >>> +++ b/src/glsl/nir/nir_instr_set.c >>> @@ -23,6 +23,178 @@ >>> >>> #include "nir_instr_set.h" >>> >>> +#define HASH(hash, data) _mesa_fnv32_1a_accumulate((hash), (data)) >>> + >>> +static uint32_t >>> +hash_src(uint32_t hash, const nir_src *src) >>> +{ >>> + assert(src->is_ssa); >>> + hash = HASH(hash, src->ssa); >>> + return hash; >>> +} >>> + >>> +static uint32_t >>> +hash_alu_src(uint32_t hash, const nir_alu_src *src, unsigned >>> num_components) >>> +{ >>> + hash = HASH(hash, src->abs); >>> + hash = HASH(hash, src->negate); >>> + >>> + for (unsigned i = 0; i < num_components; i++) >>> + hash = HASH(hash, src->swizzle[i]); >>> + >>> + hash = hash_src(hash, &src->src); >>> + return hash; >>> +} >>> + >>> +static uint32_t >>> +hash_alu(uint32_t hash, const nir_alu_instr *instr) >>> +{ >>> + hash = HASH(hash, instr->op); >>> + hash = HASH(hash, instr->dest.dest.ssa.num_components); >> >> Why are you hasing the dest.ssa.num_components instead of the >> writemask? They should be the same in any case, but this seems a bit >> out-of-place. > > No particular reason. I can change it if you want. > >> >>> + >>> + if (nir_op_infos[instr->op].algebraic_properties & >>> NIR_OP_IS_COMMUTATIVE) { >>> + assert(nir_op_infos[instr->op].num_inputs == 2); >>> + uint32_t hash0 = hash_alu_src(hash, &instr->src[0], >>> + >>> nir_ssa_alu_instr_src_components(instr, 0)); >>> + uint32_t hash1 = hash_alu_src(hash, &instr->src[1], >>> + >>> nir_ssa_alu_instr_src_components(instr, 1)); >>> + /* For commutative operations, we need some commutative way of >>> + * combining the hashes. One option would be to XOR them but that >>> + * means that anything with two identical sources will hash to 0 and >>> + * that's common enough we probably don't want the guaranteed >>> + * collision. Either addition or multiplication will also work. >>> + */ >>> + hash = hash0 * hash1; >>> + } else { >>> + for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) { >>> + hash = hash_alu_src(hash, &instr->src[i], >>> + nir_ssa_alu_instr_src_components(instr, i)); >>> + } >>> + } >>> + >>> + return hash; >>> +} >>> + >>> +static uint32_t >>> +hash_load_const(uint32_t hash, const nir_load_const_instr *instr) >>> +{ >>> + hash = HASH(hash, instr->def.num_components); >>> + >>> + hash = _mesa_fnv32_1a_accumulate_block(hash, instr->value.f, >>> + instr->def.num_components >>> + * sizeof(instr->value.f[0])); >>> + >>> + return hash; >>> +} >>> + >>> +static int >>> +cmp_phi_src(const void *data1, const void *data2) >>> +{ >>> + const nir_phi_src *src1 = data1, *src2 = data2; >>> + return src1->pred->index - src2->pred->index; >>> +} >>> + >>> +static uint32_t >>> +hash_phi(uint32_t hash, const nir_phi_instr *instr) >>> +{ >>> + hash = HASH(hash, instr->instr.block); >>> + >>> + /* sort sources by predecessor index, since the order shouldn't matter >>> */ >>> + unsigned num_preds = instr->instr.block->predecessors->entries; >>> + nir_phi_src *srcs = malloc(num_preds * sizeof(nir_phi_src)); >> >> You could use NIR_VLA and put it on the stack. That'd probably be more >> efficient. Also, why are you copying the entire source and not just >> making a list of pointers? > > Good point. It does prevent a pointer dereference when comparing, but > that's probably less expensive than copying the entire thing when > swapping. > >> >>> + unsigned i = 0; >>> + nir_foreach_phi_src(instr, src) { >>> + srcs[i++] = *src; >>> + } >>> + qsort(srcs, num_preds, sizeof(nir_phi_src), cmp_phi_src); >> >> Sorting seems like as good a way as any. Another option would be to >> do like we do for commutative ALU ops and multiply. Might be more >> efficient but it probably doesn't matter too much. > > Perhaps, although sorting seems safer, since it's closer to how the > hash is supposed to be used. For example, one thing hashing to 0 will > make everything hash to 0, and I suspect there might be other problems > where things aren't getting mixed as well -- which is more of a issue > when combining more than 2 things, although it's not great for > combining sources either.
I think it's fairly unlikely that we'll actually have something hash to zero, especially given that we're hashing block indices in there as well. I think the added hash-table scratching resulting from such a collision is probably less expensive than all this copying, an external library call, and a quick sort. Then again, we have no data so we're just throwing darts at the wall and claiming to be closer to the unmarked bulls-eye than the other. :-) Which makes me wonder, why are we hashing the index and not the pointer? >> >>> + for (i = 0; i < num_preds; i++) { >>> + hash = hash_src(hash, &srcs[i].src); >>> + hash = HASH(hash, srcs[i].pred); >>> + } >>> + free(srcs); >>> + >>> + return hash; >>> +} >>> + >>> +static uint32_t >>> +hash_intrinsic(uint32_t hash, const nir_intrinsic_instr *instr) >>> +{ >>> + const nir_intrinsic_info *info = &nir_intrinsic_infos[instr->intrinsic]; >>> + hash = HASH(hash, instr->intrinsic); >>> + >>> + if (info->has_dest) >>> + hash = HASH(hash, instr->dest.ssa.num_components); >>> + >>> + assert(info->num_variables == 0); >>> + >>> + hash = _mesa_fnv32_1a_accumulate_block(hash, instr->const_index, >>> + info->num_indices >>> + * >>> sizeof(instr->const_index[0])); >>> + return hash; >>> +} >>> + >>> +static uint32_t >>> +hash_tex(uint32_t hash, const nir_tex_instr *instr) >>> +{ >>> + hash = HASH(hash, instr->op); >>> + hash = HASH(hash, instr->num_srcs); >>> + >>> + for (unsigned i = 0; i < instr->num_srcs; i++) { >>> + hash = HASH(hash, instr->src[i].src_type); >>> + hash = hash_src(hash, &instr->src[i].src); >>> + } >>> + >>> + hash = HASH(hash, instr->coord_components); >>> + hash = HASH(hash, instr->sampler_dim); >>> + hash = HASH(hash, instr->is_array); >>> + hash = HASH(hash, instr->is_shadow); >>> + hash = HASH(hash, instr->is_new_style_shadow); >>> + hash = HASH(hash, instr->const_offset); >>> + unsigned component = instr->component; >>> + hash = HASH(hash, component); >>> + hash = HASH(hash, instr->sampler_index); >>> + hash = HASH(hash, instr->sampler_array_size); >>> + >>> + assert(!instr->sampler); >>> + >>> + return hash; >>> +} >>> + >>> +/* Computes a hash of an instruction for use in a hash table. Note that >>> this >>> + * will only work for instructions where instr_can_rewrite() returns true, >>> and >>> + * it should return identical hashes for two instructions that are the same >>> + * according nir_instrs_equal(). >>> + */ >>> + >>> +static uint32_t >>> +hash_instr(const void *data) >>> +{ >>> + const nir_instr *instr = data; >>> + uint32_t hash = _mesa_fnv32_1a_offset_bias; >>> + >>> + switch (instr->type) { >>> + case nir_instr_type_alu: >>> + hash = hash_alu(hash, nir_instr_as_alu(instr)); >>> + break; >>> + case nir_instr_type_load_const: >>> + hash = hash_load_const(hash, nir_instr_as_load_const(instr)); >>> + break; >>> + case nir_instr_type_phi: >>> + hash = hash_phi(hash, nir_instr_as_phi(instr)); >>> + break; >>> + case nir_instr_type_intrinsic: >>> + hash = hash_intrinsic(hash, nir_instr_as_intrinsic(instr)); >>> + break; >>> + case nir_instr_type_tex: >>> + hash = hash_tex(hash, nir_instr_as_tex(instr)); >>> + break; >>> + default: >>> + unreachable("Invalid instruction type"); >>> + } >>> + >>> + return hash; >>> +} >>> + >>> bool >>> nir_srcs_equal(nir_src src1, nir_src src2) >>> { >>> @@ -66,6 +238,12 @@ nir_alu_srcs_equal(const nir_alu_instr *alu1, const >>> nir_alu_instr *alu2, >>> return nir_srcs_equal(alu1->src[src1].src, alu2->src[src2].src); >>> } >>> >>> +/* Returns "true" if two instructions are equal. Note that this will only >>> + * work for the subset of instructions defined by instr_can_rewrite(). >>> Also, >>> + * it should only return "true" for instructions that hash_instr() will >>> return >>> + * the same hash for (ignoring collisions, of course). >>> + */ >>> + >>> bool >>> nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2) >>> { >>> @@ -204,3 +382,136 @@ nir_instrs_equal(const nir_instr *instr1, const >>> nir_instr *instr2) >>> return false; >>> } >>> >>> +static bool >>> +src_is_ssa(nir_src *src, void *data) >>> +{ >>> + (void) data; >>> + return src->is_ssa; >>> +} >>> + >>> +static bool >>> +dest_is_ssa(nir_dest *dest, void *data) >>> +{ >>> + (void) data; >>> + return dest->is_ssa; >>> +} >>> + >>> +/* This function determines if uses of an instruction can safely be >>> rewritten >>> + * to use another identical instruction instead. Note that this function >>> must >>> + * be kept in sync with hash_instr() and nir_instrs_equal() -- only >>> + * instructions that pass this test will be handed on to those functions, >>> and >>> + * conversely they must handle everything that this function returns true >>> for. >>> + */ >>> + >>> +static bool >>> +instr_can_rewrite(nir_instr *instr) >>> +{ >>> + /* We only handle SSA. */ >>> + if (!nir_foreach_dest(instr, dest_is_ssa, NULL) || >>> + !nir_foreach_src(instr, src_is_ssa, NULL)) >>> + return false; >>> + >>> + switch (instr->type) { >>> + case nir_instr_type_alu: >>> + case nir_instr_type_load_const: >>> + case nir_instr_type_phi: >>> + return true; >>> + case nir_instr_type_tex: { >>> + nir_tex_instr *tex = nir_instr_as_tex(instr); >>> + >>> + /* Don't support un-lowered sampler derefs currently. */ >>> + if (tex->sampler) >>> + return false; >>> + >>> + return true; >>> + } >>> + case nir_instr_type_intrinsic: { >>> + const nir_intrinsic_info *info = >>> + &nir_intrinsic_infos[nir_instr_as_intrinsic(instr)->intrinsic]; >>> + return (info->flags & NIR_INTRINSIC_CAN_ELIMINATE) && >>> + (info->flags & NIR_INTRINSIC_CAN_REORDER) && >>> + info->num_variables == 0; /* not implemented yet */ >>> + } >>> + case nir_instr_type_call: >>> + case nir_instr_type_jump: >>> + case nir_instr_type_ssa_undef: >>> + return false; >>> + case nir_instr_type_parallel_copy: >>> + default: >>> + unreachable("Invalid instruction type"); >>> + } >>> + >>> + return false; >>> +} >>> + >>> +static nir_ssa_def * >>> +nir_instr_get_dest_ssa_def(nir_instr *instr) >>> +{ >>> + switch (instr->type) { >>> + case nir_instr_type_alu: >>> + assert(nir_instr_as_alu(instr)->dest.dest.is_ssa); >>> + return &nir_instr_as_alu(instr)->dest.dest.ssa; >>> + case nir_instr_type_load_const: >>> + return &nir_instr_as_load_const(instr)->def; >>> + case nir_instr_type_phi: >>> + assert(nir_instr_as_phi(instr)->dest.is_ssa); >>> + return &nir_instr_as_phi(instr)->dest.ssa; >>> + case nir_instr_type_intrinsic: >>> + assert(nir_instr_as_intrinsic(instr)->dest.is_ssa); >>> + return &nir_instr_as_intrinsic(instr)->dest.ssa; >>> + case nir_instr_type_tex: >>> + assert(nir_instr_as_tex(instr)->dest.is_ssa); >>> + return &nir_instr_as_tex(instr)->dest.ssa; >>> + default: >>> + unreachable("We never ask for any of these"); >>> + } >>> +} >>> + >>> +static bool >>> +cmp_func(const void *data1, const void *data2) >>> +{ >>> + return nir_instrs_equal(data1, data2); >>> +} >>> + >>> +struct set * >>> +nir_instr_set_create(void *mem_ctx) >>> +{ >>> + return _mesa_set_create(mem_ctx, hash_instr, cmp_func); >>> +} >>> + >>> +void >>> +nir_instr_set_destroy(struct set *instr_set) >>> +{ >>> + _mesa_set_destroy(instr_set, NULL); >>> +} >>> + >>> +bool >>> +nir_instr_set_add_or_rewrite(struct set *instr_set, nir_instr *instr) >>> +{ >>> + if (!instr_can_rewrite(instr)) >>> + return false; >>> + >>> + struct set_entry *entry = _mesa_set_search(instr_set, instr); >>> + if (entry) { >>> + nir_ssa_def *def = nir_instr_get_dest_ssa_def(instr); >>> + nir_ssa_def *new_def = >>> + nir_instr_get_dest_ssa_def((nir_instr *) entry->key); >>> + nir_ssa_def_rewrite_uses(def, nir_src_for_ssa(new_def)); >>> + return true; >>> + } >>> + >>> + _mesa_set_add(instr_set, instr); >>> + return false; >>> +} >>> + >>> +void >>> +nir_instr_set_remove(struct set *instr_set, nir_instr *instr) >>> +{ >>> + if (!instr_can_rewrite(instr)) >>> + return; >>> + >>> + struct set_entry *entry = _mesa_set_search(instr_set, instr); >>> + if (entry) >>> + _mesa_set_remove(instr_set, entry); >>> +} >>> + >>> diff --git a/src/glsl/nir/nir_instr_set.h b/src/glsl/nir/nir_instr_set.h >>> index f5baffa..a7f6c9d 100644 >>> --- a/src/glsl/nir/nir_instr_set.h >>> +++ b/src/glsl/nir/nir_instr_set.h >>> @@ -27,3 +27,38 @@ >>> >>> bool nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2); >>> >>> +/** >>> + * This file defines functions for creating, destroying, and manipulating >>> an >>> + * "instruction set," which is an abstraction for finding duplicate >>> + * instructions using a hash set. Note that the question of whether an >>> + * instruction is actually a duplicate (e.g. whether it has any side >>> effects) >>> + * is handled transparently. The user can pass any instruction to >>> + * nir_instr_set_add_or_rewrite() and nir_instr_set_remove(), and if the >>> + * instruction isn't safe to rewrite or isn't supported, it's silently >>> + * removed. >>> + */ >>> + >>> +/*@{*/ >>> + >>> +/** Creates an instruction set, using a given ralloc mem_ctx */ >>> +struct set *nir_instr_set_create(void *mem_ctx); >>> + >>> +/** Destroys an instruction set. */ >>> +void nir_instr_set_destroy(struct set *instr_set); >>> + >>> +/** >>> + * Adds an instruction to an instruction set if it doesn't exist, or if it >>> + * does already exist, rewrites all uses of it to point to the other >>> + * already-inserted instruction. Returns 'true' if the uses of the >>> instruction >>> + * were rewritten. >>> + */ >>> +bool nir_instr_set_add_or_rewrite(struct set *instr_set, nir_instr *instr); >>> + >>> +/** >>> + * Removes an instruction from an instruction set, so that other >>> instructions >>> + * won't be merged with it. >>> + */ >>> +void nir_instr_set_remove(struct set *instr_set, nir_instr *instr); >>> + >>> +/*@}*/ >>> + >>> -- >>> 2.1.0 >>> >>> _______________________________________________ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev