On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng <bin.ch...@arm.com> wrote: > Hi, > One issue revealed in tree ifcvt is the pass stores/tracks DRs against its > memory references in IR. This causes difficulty in identifying same memory > references appearing in different forms. Given below example: > > void foo (int a[], int b[]) > { > int i; > for (i = 0; i < 100; i++) > { > if (a[i] ==0) > a[i] = b[i]*4; > else > a[i] = b[i]*3; > } > } > > The gimple dump before tree ifcvt is as: > > <bb 2>: > > <bb 3>: > # i_24 = PHI <i_19(7), 0(2)> > # ivtmp_28 = PHI <ivtmp_23(7), 100(2)> > _5 = (long unsigned int) i_24; > _6 = _5 * 4; > _8 = a_7(D) + _6; > _9 = *_8; > if (_9 == 0) > goto <bb 4>; > else > goto <bb 5>; > > <bb 4>: > _11 = b_10(D) + _6; > _12 = *_11; > _13 = _12 * 4; > goto <bb 6>; > > <bb 5>: > _15 = b_10(D) + _6; > _16 = *_15; > _17 = _16 * 3; > > <bb 6>: > # cstore_1 = PHI <_13(4), _17(5)> > *_8 = cstore_1; > i_19 = i_24 + 1; > ivtmp_23 = ivtmp_28 - 1; > if (ivtmp_23 != 0) > goto <bb 7>; > else > goto <bb 8>; > > <bb 7>: > goto <bb 3>; > > <bb 8>: > return; > > The two memory references "*_11" and "*_15" are actually the same thing, but > ifcvt failed to discover this because they are recorded in different forms. > This patch fixes the issue by recording/tracking memory reference against its > innermost_loop_behavior if: the memory reference has innermost_loop_behavior > and it is not a compound reference. > BTW, PR56625 reported that this case couldn't be vectorized even tree if-conv > can handle it. Interesting thing is at current time, it's tree if-conv that > could not handle the case. Once it's if-converted with this patch, > vectorizer are able to handle it too. > > Bootstrap and test on x86_64 and AArch64. Is it OK, not sure if it's GCC 7?
Hmm. + equal_p = true; + if (e1->base_address && e2->base_address) + equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0); + if (e1->offset && e2->offset) + equal_p &= operand_equal_p (e1->offset, e2->offset, 0); surely better to return false early. I think we don't want this in tree-data-refs.h also because of ... @@ -615,15 +619,29 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a) data_reference_p *master_dr, *base_master_dr; tree ref = DR_REF (a); tree base_ref = DR_BASE_OBJECT (a); + innermost_loop_behavior *innermost = &DR_INNERMOST (a); tree ca = bb_predicate (gimple_bb (DR_STMT (a))); bool exist1, exist2; - while (TREE_CODE (ref) == COMPONENT_REF - || TREE_CODE (ref) == IMAGPART_EXPR - || TREE_CODE (ref) == REALPART_EXPR) - ref = TREE_OPERAND (ref, 0); + /* If reference in DR has innermost loop behavior and it is not + a compound memory reference, we store it to innermost_DR_map, + otherwise to ref_DR_map. */ + if (TREE_CODE (ref) == COMPONENT_REF + || TREE_CODE (ref) == IMAGPART_EXPR + || TREE_CODE (ref) == REALPART_EXPR + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a) + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a))) + { + while (TREE_CODE (ref) == COMPONENT_REF + || TREE_CODE (ref) == IMAGPART_EXPR + || TREE_CODE (ref) == REALPART_EXPR) + ref = TREE_OPERAND (ref, 0); + + master_dr = &ref_DR_map->get_or_insert (ref, &exist1); + } + else + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1); we don't want an extra hashmap but replace ref_DR_map entirely. So we'd need to strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART and REALPART) before creating the DR (or adjust the equality function and hashing to disregard them which means subtracting their offset from DR_INIT. To adjust the references we collect you'd maybe could use a callback to get_references_in_stmt to adjust them. OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as Index: tree-if-conv.c =================================================================== --- tree-if-conv.c (revision 234215) +++ tree-if-conv.c (working copy) @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo for (i = 0; refs->iterate (i, &dr); i++) { + tree *refp = &DR_REF (dr); + while ((TREE_CODE (*refp) == COMPONENT_REF + && TREE_OPERAND (*refp, 2) == NULL_TREE) + || TREE_CODE (*refp) == IMAGPART_EXPR + || TREE_CODE (*refp) == REALPART_EXPR) + refp = &TREE_OPERAND (*refp, 0); + if (refp != &DR_REF (dr)) + { + tree saved_base = *refp; + *refp = integer_zero_node; + + if (DR_INIT (dr)) + { + tree poffset; + int punsignedp, preversep, pvolatilep; + machine_mode pmode; + HOST_WIDE_INT pbitsize, pbitpos; + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &poffset, + &pmode, &punsignedp, &preversep, &pvolatilep, + false); + gcc_assert (poffset == NULL_TREE); + + DR_INIT (dr) + = wide_int_to_tree (ssizetype, + wi::sub (DR_INIT (dr), + pbitpos / BITS_PER_UNIT)); + } + + *refp = saved_base; + DR_REF (dr) = *refp; + } + dr->aux = XNEW (struct ifc_dr); DR_BASE_W_UNCONDITIONALLY (dr) = false; DR_RW_UNCONDITIONALLY (dr) = false; Can you add a testcase for the vectorization? Richard. > Thanks, > bin > > > 2016-03-11 Bin Cheng <bin.ch...@arm.com> > > PR tree-optimization/56625 > PR tree-optimization/69489 > * tree-data-ref.h (innermost_loop_behavior_hash): New class for > hashing struct innermost_loop_behavior. > (DR_INNERMOST): New macro. > * tree-if-conv.c (innermost_DR_map): New map. > (ref_DR_map, baseref_DR_map): Revise the comment. > (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR > to innermost_DR_map if it has innermost loop behavior and is not > a compound reference. > (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map if it has > innermost loop behavior and is not a compound reference. > (if_convertible_loop_p_1): Initialize innermost_DR_map. > (if_convertible_loop_p): Release innermost_DR_map. > > gcc/testsuite/ChangeLog > 2016-03-11 Bin Cheng <bin.ch...@arm.com> > > PR tree-optimization/56625 > PR tree-optimization/69489 > * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test. >