On Tue, Aug 22, 2017 at 4:19 PM, Richard Sandiford <richard.sandif...@linaro.org> wrote: > Richard Biener <richard.guent...@gmail.com> writes: >> On Fri, Aug 18, 2017 at 12:30 PM, Richard Biener >> <richard.guent...@gmail.com> wrote: >>> On Thu, Aug 17, 2017 at 2:24 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >>>> On Thu, Aug 17, 2017 at 12:35 PM, Richard Sandiford >>>> <richard.sandif...@linaro.org> wrote: >>>>> "Bin.Cheng" <amker.ch...@gmail.com> writes: >>>>>> On Wed, Aug 16, 2017 at 6:50 PM, Richard Sandiford >>>>>> <richard.sandif...@linaro.org> wrote: >>>>>>> "Bin.Cheng" <amker.ch...@gmail.com> writes: >>>>>>>> On Wed, Aug 16, 2017 at 5:00 PM, Richard Sandiford >>>>>>>> <richard.sandif...@linaro.org> wrote: >>>>>>>>> "Bin.Cheng" <amker.ch...@gmail.com> writes: >>>>>>>>>> On Wed, Aug 16, 2017 at 2:38 PM, Richard Sandiford >>>>>>>>>> <richard.sandif...@linaro.org> wrote: >>>>>>>>>>> The first loop in the testcase regressed after my recent changes to >>>>>>>>>>> dr_analyze_innermost. Previously we would treat "i" as an iv even >>>>>>>>>>> for bb analysis and end up with: >>>>>>>>>>> >>>>>>>>>>> DR_BASE_ADDRESS: p or q >>>>>>>>>>> DR_OFFSET: 0 >>>>>>>>>>> DR_INIT: 0 or 4 >>>>>>>>>>> DR_STEP: 16 >>>>>>>>>>> >>>>>>>>>>> We now always keep the step as 0 instead, so for an int "i" we'd >>>>>>>>>>> have: >>>>>>>>>>> >>>>>>>>>>> DR_BASE_ADDRESS: p or q >>>>>>>>>>> DR_OFFSET: (intptr_t) i >>>>>>>>>>> DR_INIT: 0 or 4 >>>>>>>>>>> DR_STEP: 0 >>>>>>>>>>> >>>>>>>>>>> This is also what we'd like to have for the unsigned "i", but the >>>>>>>>>>> problem is that strip_constant_offset thinks that the "i + 1" in >>>>>>>>>>> "(intptr_t) (i + 1)" could wrap and so doesn't peel off the "+ 1". >>>>>>>>>>> The [i + 1] accesses therefore have a DR_OFFSET equal to the SSA >>>>>>>>>>> name that holds "(intptr_t) (i + 1)", meaning that the accesses no >>>>>>>>>>> longer seem to be related to the [i] ones. 
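For concreteness, the access pattern under discussion can be sketched like this (a hypothetical reduction, not the exact first loop of the PR 81635 testcase): with int arrays and i += 4, the [i] accesses have a constant offset of 0, the [i + 1] accesses one of 4, and the old IV-based analysis gave both a step of 16 bytes.

```c
int p[1000], q[1000];

/* Hypothetical reduction of the regressed pattern: because "i" is
   unsigned, strip_constant_offset cannot peel the "+ 1" out of
   (intptr_t) (i + 1) without proving that i + 1 does not wrap, so the
   [i + 1] accesses end up with an opaque SSA-name DR_OFFSET and no
   longer pair up with the [i] accesses for SLP.  */
void
f1 (unsigned int n)
{
  for (unsigned int i = 0; i < n; i += 4)
    {
      q[i] = p[i];
      q[i + 1] = p[i + 1];
    }
}
```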
>>>>>>>>>> >>>>>>>>>> Didn't read the change in detail, so sorry if I misunderstood >>>>>>>>>> the issue. >>>>>>>>>> I made changes in scev to better fold type conversion by >>>>>>>>>> various sources >>>>>>>>>> of information, for example, vrp, niters, undefined overflow >>>>>>>>>> behavior etc. >>>>>>>>>> In theory this information should be available for other >>>>>>>>>> optimizers without >>>>>>>>>> querying scev. For the mentioned test, vrp should compute >>>>>>>>>> accurate range >>>>>>>>>> information for "i" so that we can fold (intptr_t) (i + 1) without >>>>>>>>>> worrying about >>>>>>>>>> overflow. Note we don't do it in generic folding because >>>>>>>>>> (intptr_t) (i) + 1 >>>>>>>>>> could be more expensive (especially in the case of (T)(i + j)), or >>>>>>>>>> because the >>>>>>>>>> CST part is in bigger precision after conversion. >>>>>>>>>> But such folding is wanted in several places, e.g., IVOPTs. To >>>>>>>>>> provide such >>>>>>>>>> an interface, we changed tree-affine and made it do aggressive >>>>>>>>>> folding. I am >>>>>>>>>> curious if it's possible to use aff_tree to implement >>>>>>>>>> strip_constant_offset >>>>>>>>>> here since aggressive folding is wanted. After all, using an >>>>>>>>>> additional chrec >>>>>>>>>> looks a little heavy wrt the simple test. >>>>>>>>> >>>>>>>>> Yeah, using aff_tree does work here when the bounds are constant. >>>>>>>>> It doesn't look like it works for things like: >>>>>>>>> >>>>>>>>> double p[1000]; >>>>>>>>> double q[1000]; >>>>>>>>> >>>>>>>>> void >>>>>>>>> f4 (unsigned int n) >>>>>>>>> { >>>>>>>>> for (unsigned int i = 0; i < n; i += 4) >>>>>>>>> { >>>>>>>>> double a = q[i] + p[i]; >>>>>>>>> double b = q[i + 1] + p[i + 1]; >>>>>>>>> q[i] = a; >>>>>>>>> q[i + 1] = b; >>>>>>>>> } >>>>>>>>> } >>>>>>>>> >>>>>>>>> though, where the bounds on the global arrays guarantee that [i + 1] >>>>>>>>> can't >>>>>>>>> overflow, even though "n" is unconstrained. The patch as posted >>>>>>>>> handles >>>>>>>>> this case too. 
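The difference between the two cases can be checked directly (an illustration, not code from the patch): with constant bounds, i + 1 provably cannot wrap, so peeling the constant out of the conversion is always valid; with only "i < n" for an unconstrained unsigned n, the wrap at i == UINT_MAX cannot be ruled out locally.

```c
#include <stdint.h>

/* With i known to lie in [0, 999] (e.g. from constant loop bounds or
   the bounds of a global array), i + 1 cannot wrap, so
   (intptr_t) (i + 1) always equals (intptr_t) i + 1 and the "+ 1" can
   be peeled off as a constant offset.  At i == UINT_MAX (excluded
   here) the two sides would differ on an LP64 target: the left wraps
   to 0, the right becomes 4294967296.  */
static int
peel_is_safe_in_range (void)
{
  for (unsigned int i = 0; i < 1000; i++)
    if ((intptr_t) (i + 1) != (intptr_t) i + 1)
      return 0;
  return 1;
}
```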
>>>>>>>> BTW is this a missed optimization in value range analysis? The range >>>>>>>> information for i should flow in a way like: array boundary -> niters >>>>>>>> -> scev/vrp. >>>>>>>> I think that's what niters/scev do in analysis. >>>>>>> >>>>>>> Yeah, maybe :-) It looks like the problem is that when SLP runs, >>>>>>> the previous VRP pass came before loop header copying, so the (single) >>>>>>> header has to cope with the n == 0 case. Thus we get: >>>>>> Ah, there are several passes in between vrp and pass_ch, not sure if >>>>>> any such pass depends on vrp intensively. I would suggest reordering >>>>>> the two passes, or a standalone VRP interface for updating information >>>>>> for the loop region after the header is copied? This is a non-trivial >>>>>> issue that needs to be fixed. The niters analyzer relies on >>>>>> simplify_using_initial_conditions heavily to get the same information, >>>>>> which in my opinion should be provided by VRP. Though this won't be >>>>>> able to obsolete simplify_using_initial_conditions because VRP is weak >>>>>> in symbolic ranges... 
>>>>>> >>>>>>> >>>>>>> Visiting statement: >>>>>>> i_15 = ASSERT_EXPR <i_6, i_6 < n_9(D)>; >>>>>>> Intersecting >>>>>>> [0, n_9(D) + 4294967295] EQUIVALENCES: { i_6 } (1 elements) >>>>>>> and >>>>>>> [0, 0] >>>>>>> to >>>>>>> [0, 0] EQUIVALENCES: { i_6 } (1 elements) >>>>>>> Intersecting >>>>>>> [0, 0] EQUIVALENCES: { i_6 } (1 elements) >>>>>>> and >>>>>>> [0, 1000] >>>>>>> to >>>>>>> [0, 0] EQUIVALENCES: { i_6 } (1 elements) >>>>>>> Found new range for i_15: [0, 0] >>>>>>> >>>>>>> Visiting statement: >>>>>>> _3 = i_15 + 1; >>>>>>> Match-and-simplified i_15 + 1 to 1 >>>>>>> Intersecting >>>>>>> [1, 1] >>>>>>> and >>>>>>> [0, +INF] >>>>>>> to >>>>>>> [1, 1] >>>>>>> Found new range for _3: [1, 1] >>>>>>> >>>>>>> (where _3 is the index we care about), followed by: >>>>>>> >>>>>>> Visiting statement: >>>>>>> i_15 = ASSERT_EXPR <i_6, i_6 < n_9(D)>; >>>>>>> Intersecting >>>>>>> [0, n_9(D) + 4294967295] EQUIVALENCES: { i_6 } (1 elements) >>>>>>> and >>>>>>> [0, 4] >>>>>>> to >>>>>>> [0, n_9(D) + 4294967295] EQUIVALENCES: { i_6 } (1 elements) >>>>>>> Intersecting >>>>>>> [0, n_9(D) + 4294967295] EQUIVALENCES: { i_6 } (1 elements) >>>>>>> and >>>>>>> [0, 1000] >>>>>>> to >>>>>>> [0, n_9(D) + 4294967295] EQUIVALENCES: { i_6 } (1 elements) >>>>>>> Found new range for i_15: [0, n_9(D) + 4294967295] >>>>>>> >>>>>>> Visiting statement: >>>>>>> _3 = i_15 + 1; >>>>>>> Intersecting >>>>>>> [0, +INF] >>>>>>> and >>>>>>> [0, +INF] >>>>>>> to >>>>>>> [0, +INF] >>>>>>> Found new range for _3: [0, +INF] >>>>>>> >>>>>>> I guess in this case it would be better to intersect the i_15 ranges >>>>>>> to [0, 1000] rather than [0, n_9(D) + 4294967295]. >>>>>>> >>>>>>> It does work if another VRP pass runs after CH. But even then, >>>>>>> is it a good idea to rely on the range info being kept up-to-date >>>>>>> all the way through to SLP? A lot happens in between. >>>>>> To some extent, yes. 
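The odd-looking upper bound in the dump has a simple modular reading (an illustration, not VRP code): for 32-bit unsigned n, "i < n" gives i <= n - 1, and n - 1 is computed modulo 2^32, i.e. n + 4294967295. As long as n == 0 is still possible (no copied header), that bound wraps all the way to 0xffffffff, so nothing useful is known about i + 1.

```c
#include <stdint.h>

/* VRP's [0, n_9(D) + 4294967295] is just [0, n - 1] in 32-bit modular
   arithmetic: adding 4294967295 is the same as subtracting 1 mod 2^32.
   With n == 0 possible, the "upper bound" wraps to UINT32_MAX.  */
static uint32_t
vrp_upper_bound (uint32_t n)
{
  return n + 4294967295u;   /* == n - 1 (mod 2^32) */
}
```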
Now I understand that SCEV uses niters >>>>>> information to prove no_overflow. Niters analysis does infer better >>>>>> information from array boundaries, while VRP fails to do that. I don't >>>>>> worry much about the gap between the vrp pass and slp, it's basically >>>>>> the same as niters. Both pieces of information are analyzed at one >>>>>> point and meant to be used by following passes. It's also not common >>>>>> for vrp information to become invalid given we are on SSA? >>>>> >>>>> I'm not worried so much about vrp information becoming invalid on >>>>> an SSA name that existed when VRP was run. It's more a question >>>>> of what happens with SSA names that get introduced after VRP, >>>>> e.g. by things like dom, reassoc, PRE, etc. >>>> For the induction variables in question, these passes shouldn't >>>> aggressively introduce new variables, I think. >>>>> >>>>>> Now that the data address is not analyzed against the loop, VRP would >>>>>> be the only information we can use unless we add back scev analysis. >>>>>> IMHO, the patch is doing so in another way than before. It requires >>>>>> an additional chrec data structure. I remember the previous patch >>>>>> enabled more slp vectorization, is it because of "step" information >>>>>> from scev? >>>>> >>>>> Do you mean that: >>>>> >>>>> 2017-07-03 Richard Sandiford <richard.sandif...@linaro.org> >>>>> >>>>> * tree-data-ref.c (dr_analyze_innermost): Replace the "nest" >>>>> parameter with a "loop" parameter and use it instead of the >>>>> loop containing DR_STMT. Don't check simple_iv when doing >>>>> BB analysis. Describe the two analysis modes in the comment. >>>>> >>>>> enabled more SLP vectorisation in bb-slp-pr65935.c? That was due >>>>> to us not doing IV analysis for BB vectorisation, and ensuring that >>>>> the step was always zero. >>>> Which means the vectorizer code can handle a non-IV-analyzed offset, but >>>> not the analyzed form? >>>>> >>>>>> In this patch, step information is simply discarded. 
I am >>>>>> wondering if it's possible to always analyze scev within the innermost >>>>>> loop for slp while discarding step information. >>>>> >>>>> Well, we don't calculate a step for bb analysis (i.e. it's not the case >>>>> that the patch calculates step information and then simply discards it). >>>>> I don't see how that would work. For bb analysis, the DR_OFFSET + DR_INIT >>>>> has to give the offset for every execution of the block, not just the >>>>> first iteration of the containing loop. So if we get back a nonzero >>>>> step, we have to do something with it. >>>> Yeah. >>>>> >>>>> But: >>>>> >>>>> (a) the old simple_iv analysis is more expensive than simply calling >>>>> analyze_scev, so I don't think this is a win in terms of complexity. >>>>> >>>>> (b) for bb analysis, there's nothing particularly special about the >>>>> innermost loop. It makes more sense to analyse it in the innermost >>>>> loop for which the offset is invariant, as shown by the second >>>>> testcase in the patch. >>>>> >>>>> (c) The patch helps with loop vectorisation too, since analysing the >>>>> starting DR_OFFSET in the context of the containing loop can help >>>>> in a similar way as analysing the full offset does for SLP. >>>> >>>> I have to admit I am not very much into this method. It complicates >>>> the structure as well as the code, mostly because dr_init is now split >>>> into two different fields and one of them is lazily computed. 
>>>> >>>> For example: >>>>> @@ -2974,12 +2974,12 @@ vect_vfa_segment_size (struct data_refer >>>>> vect_no_alias_p (struct data_reference *a, struct data_reference *b, >>>>> tree segment_length_a, tree segment_length_b) >>>>> { >>>>> - gcc_assert (TREE_CODE (DR_INIT (a)) == INTEGER_CST >>>>> - && TREE_CODE (DR_INIT (b)) == INTEGER_CST); >>>>> - if (tree_int_cst_equal (DR_INIT (a), DR_INIT (b))) >>>>> + gcc_assert (TREE_CODE (DR_CHREC_INIT (a)) == INTEGER_CST >>>>> + && TREE_CODE (DR_CHREC_INIT (b)) == INTEGER_CST); >>>>> + if (tree_int_cst_equal (DR_CHREC_INIT (a), DR_CHREC_INIT (b))) >>>>> return false; >>>>> >>>>> - tree seg_a_min = DR_INIT (a); >>>>> + tree seg_a_min = DR_CHREC_INIT (a); >>>>> tree seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_a_min), >>>>> seg_a_min, segment_length_a); >>>>> /* For negative step, we need to adjust address range by TYPE_SIZE_UNIT >>>>> @@ -2990,10 +2990,10 @@ vect_no_alias_p (struct data_reference * >>>>> tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (a))); >>>>> seg_a_min = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_a_max), >>>>> seg_a_max, unit_size); >>>>> - seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_INIT (a)), >>>>> - DR_INIT (a), unit_size); >>>>> + seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_CHREC_INIT (a)), >>>>> + DR_CHREC_INIT (a), unit_size); >>>>> } >>>>> - tree seg_b_min = DR_INIT (b); >>>>> + tree seg_b_min = DR_CHREC_INIT (b); >>>>> tree seg_b_max = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_b_min), >>>>> seg_b_min, segment_length_b); >>>>> if (tree_int_cst_compare (DR_STEP (b), size_zero_node) < 0) >>>> >>>> Use of DR_INIT is simply replaced by DR_CHREC_INIT. Is it safe to do >>>> so in the case of a non-zero >>>> DR_INIT? It worries me that I may need to think twice before >>>> referring to DR_INIT because it's >>>> not clear when DR_OFFSET is split and DR_CHREC_INIT becomes non-zero. >>>> It may simply be >>>> that I am too dumb to handle this. I will leave this to richi. 
>>> >>> I'm trying to understand this a bit (not liking it very much in its >>> current form). >>> >>> Can code currently using DR_OFFSET and DR_INIT simply use >>> DR_CHREC_INIT and CHREC_LEFT (DR_CHREC_OFFSET) (or DR_CHREC_OFFSET >>> if DR_CHREC_OFFSET is not a CHREC)? If yes, would there be any downside >>> in doing that? If not, then why? > > There's nothing particularly special about the CHREC_LEFT for users of > the drs. The chrec as a whole describes the variable part of the offset. > >>> That said, I'm all for making DR info more powerful. But I detest >>> having extra fields >>> and adding confusion as to when to use which. Thus if we can make >>> DR_CHREC_INIT >>> be DR_INIT and DR_OFFSET an inline function accessing CHREC_LEFT if >>> suitable plus exposing DR_OFFSET_CHREC that would make me more happy. >> >> And if we want to make it opt-in do a dr_analyze_me_harder () which will >> re-write DR_OFFSET/INIT into the new form. >> >> But DR_OFFSET and DR_INIT (read) accessors would maintain their >> semantics while DR_OFFSET_CHREC would expose the CHREC if >> available. > > After the changes, the only place that actually cared about the split > between the "old" DR_OFFSET and DR_INIT was tree-predcom.c: > ref_at_iteration. Everywhere else just wanted the sum of OFFSET and INIT, > and for them it would have been more convenient to have a combined field. > > So maybe one way of trying to avoid the confusion would be to keep > DR_OFFSET together as the full starting offset from the base, then > provide DR_VAR_OFFSET and DR_CONST_OFFSET as the split forms, with > DR_VAR_OFFSET being more like an abstract value number. See the comment > at the start of the patch below for more details. > > I'm not sure whether the main reason for splitting the offset was to > make life easier for the users of the drs, or whether it was to try > to avoid creating new trees. But then, the unconditional scev analysis > that we did previously already generated new trees.
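The intended payoff of the proposed DR_VAR_OFFSET / DR_CONST_OFFSET split can be sketched as follows (a hypothetical illustration of use 2 from the patch's comment, not compiler code): a[i] and a[i + 1] share the same variable part of the offset, so the distance between them is just the difference of the constant parts, independent of i.

```c
#include <stddef.h>

/* For a[i] the byte offset is i * sizeof (int) + 0; for a[i + 1] it is
   i * sizeof (int) + sizeof (int).  The variable parts are identical,
   so the distance between the two references is the difference of the
   constant parts -- sizeof (int) -- for every value of i.  */
static ptrdiff_t
const_distance (int *a, unsigned int i)
{
  return (char *) &a[i + 1] - (char *) &a[i];
}
```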
Make life easier and allow same base + variable offset but differing const offset to be detected easily -- a[i] vs. a[i+1]. @@ -787,14 +821,14 @@ canonicalize_base_object_address (tree a bool dr_analyze_innermost (innermost_loop_behavior *drb, tree ref, - struct loop *loop) + gimple *stmt, struct loop *loop) { please amend the function comment with what STMT is about (DR_STMT I suppose). @@ -893,14 +927,14 @@ dr_analyze_innermost (innermost_loop_beh init = size_binop (PLUS_EXPR, init, dinit); base_misalignment -= TREE_INT_CST_LOW (dinit); - split_constant_offset (offset_iv.base, &offset_iv.base, &dinit); - init = size_binop (PLUS_EXPR, init, dinit); - - step = size_binop (PLUS_EXPR, - fold_convert (ssizetype, base_iv.step), - fold_convert (ssizetype, offset_iv.step)); - base = canonicalize_base_object_address (base_iv.base); + tree offset = size_binop (PLUS_EXPR, + fold_convert (ssizetype, offset_iv.base), + init); so you remove the split_constant_offset handling on offset_iv.base. This may end up no longer walking and expanding def stmts of SSA names contained therein. I suppose this is fully intended so that re-computing the ref address using DR_BASE/DR_OFFSET doesn't end up expanding that redundant code? For analysis relying on this one now needs to resort to DR_VAR/CONST_OFFSET where SCEV analysis hopefully performs similar expansions? Just sth to watch at ... @@ -921,12 +955,12 @@ dr_analyze_innermost (innermost_loop_beh } drb->base_address = base; - drb->offset = fold_convert (ssizetype, offset_iv.base); - drb->init = init; + drb->offset = offset; drb->step = step; + split_constant_offset (scev, &drb->var_offset, &drb->const_offset); so is the full-fledged split_constant_offset (with its SSA name handling) still needed here? Sth to eventually address with a followup. 
@@ -1490,6 +1482,7 @@ ref_at_iteration (data_reference_p dr, i tree ref_op2 = NULL_TREE; tree new_offset; + split_constant_offset (DR_OFFSET (dr), &off, &coff); if (iter != 0) { new_offset = size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)); likewise here? Why do you think ref_at_iteration cares? Is that because of codegen quality? I'd have done with coff == size_zero_node plus simplifications that arise from that. Thanks, Richard. > Tested on aarch64-linux-gnu and x86_64-linux-gnu. > > Thanks, > Richard > > > 2017-08-22 Richard Sandiford <richard.sandif...@arm.com> > > gcc/ > PR tree-optimization/81635 > * tree-data-ref.h (innermost_loop_behavior): Remove init field. > Add var_offset and const_offset fields. Rename offset_alignment > to var_offset_alignment. > (DR_INIT): Delete. > (DR_CONST_OFFSET, DR_VAR_OFFSET): New macros. > (DR_OFFSET_ALIGNMENT): Replace with... > (DR_VAR_OFFSET_ALIGNMENT): ...this new macro. > (dr_analyze_innermost): Add a gimple * argument. > (dr_equal_offsets_p): Delete. > (dr_var_offsets_equal_p, dr_var_offsets_compare): Declare. > * tree-vectorizer.h (STMT_VINFO_DR_INIT): Delete. > (STMT_VINFO_DR_VAR_OFFSET, STMT_VINFO_DR_CONST_OFFSET): New macros. > (STMT_VINFO_DR_OFFSET_ALIGNMENT): Replace with... > (STMT_VINFO_DR_VAR_OFFSET_ALIGNMENT): ...this new macro. > * tree.c (tree_ctz): Handle POLYNOMIAL_CHREC. > * tree-data-ref.c: Include tree-ssa-loop-ivopts.h. > (split_constant_offset): Handle POLYNOMIAL_CHREC. > (analyze_offset_scev): New function. > (dr_analyze_innermost): Add a gimple * statement. Update after > changes to innermost_behavior. Initialize var_offset and > const_offset. > (create_data_ref): Update call to dr_analyze_innermost. > Update dump after changes to innermost_behavior. > (operator ==): Use dr_var_offsets_equal_p and compare the > DR_CONST_OFFSETs. > (prune_runtime_alias_test_list): Likewise. > (comp_dr_with_seg_len_pair): Use dr_var_offsets_compare and compare > the DR_CONST_OFFSETs. 
> (create_intersect_range_checks): Use DR_OFFSET without adding > DR_INIT. > (dr_equal_offsets_p1, dr_equal_offsets_p): Delete. > (dr_alignment): Use const_offset instead of init and > var_offset_alignment instead of offset_alignment. > * tree-if-conv.c (innermost_loop_behavior_hash::hash): Don't > test the init field. > (innermost_loop_behavior_hash::equal): Likewise. > (ifcvt_memrefs_wont_trap): Likewise. > (if_convertible_loop_p_1): Likewise. > * tree-loop-distribution.c (build_addr_arg_loc): Use DR_OFFSET > without adding DR_INIT. > (build_rdg_partition_for_vertex): Don't check DR_INIT. > (share_memory_accesses): Likewise. > (pg_add_dependence_edges): Likewise. > (compute_alias_check_pairs): Use dr_var_offsets_compare. > * tree-predcom.c (aff_combination_dr_offset): Use DR_OFFSET without > adding DR_INIT. > (determine_offset): Likewise. > (valid_initializer_p): Likewise. > (find_looparound_phi): Update call to dr_analyze_innermost. > (ref_at_iteration): Use split_constant_offset to split the offset. > * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use > const_offset instead of init and var_offset_alignment instead of > offset_alignment. > (vect_find_same_alignment_drs): Use dr_var_offsets_compare and > compare the DR_CONST_OFFSETs. > (dr_group_sort_cmp): Likewise. > (vect_analyze_group_access_1): Use DR_CONST_OFFSET instead of DR_INIT. > (vect_no_alias_p): Likewise. > (vect_analyze_data_ref_accesses): Use dr_var_offsets_equal_p and > compare the DR_CONST_OFFSETs. > (vect_prune_runtime_alias_test_list): Use dr_var_offsets_compare. > (vect_analyze_data_refs): Don't check DR_INIT and use DR_OFFSET > without adding DR_INIT. Use DR_VAR_OFFSET_ALIGNMENT instead of > DR_OFFSET_ALIGNMENT. Update call to dr_analyze_innermost, and > update subsequent dump. > (vect_create_addr_base_for_vector_ref): Use DR_OFFSET without > adding DR_INIT. > * tree-vect-stmts.c (vectorizable_store): Likewise. > (vectorizable_load): Likewise. 
Use DR_CONST_OFFSET instead > of DR_INIT. > > gcc/testsuite/ > PR tree-optimization/81635 > * gcc.dg/vect/bb-slp-pr81635.c: New test. > > Index: gcc/tree-data-ref.h > =================================================================== > --- gcc/tree-data-ref.h 2017-08-21 10:42:51.088530428 +0100 > +++ gcc/tree-data-ref.h 2017-08-22 14:54:48.630563940 +0100 > @@ -28,29 +28,44 @@ #define GCC_TREE_DATA_REF_H > innermost_loop_behavior describes the evolution of the address of the > memory > reference in the innermost enclosing loop. The address is expressed as > BASE + STEP * # of iteration, and base is further decomposed as the base > - pointer (BASE_ADDRESS), loop invariant offset (OFFSET) and > - constant offset (INIT). Examples, in loop nest > + pointer (BASE_ADDRESS) and the loop invariant offset (OFFSET). > + OFFSET is further expressed as the sum of a zero or non-constant term > + (VAR_OFFSET) and a constant term (CONST_OFFSET). VAR_OFFSET should be > + treated as an abstract representation; in particular, it may contain > + chrecs. CONST_OFFSET is always an INTEGER_CST. > > - for (i = 0; i < 100; i++) > - for (j = 3; j < 100; j++) > + Examples, in loop nest > > - Example 1 Example 2 > - data-ref a[j].b[i][j] *(p + x + 16B + 4B * j) > + 1: for (i = 0; i < 100; i++) > + 2: for (j = 3; j < 100; j++) > > + Example 1 Example 2 > + data-ref a[j].b[i][j] *(p + x + 16B + 4B * j) > > - innermost_loop_behavior > - base_address &a p > - offset i * D_i x > - init 3 * D_j + offsetof (b) 28 > - step D_j 4 > > - */ > + innermost_loop_behavior > + base_address &a p > + offset i * D_i + 3 * D_j + offsetof (b) x + 28 > + var_offset {0, +, D_i}_1 x (or an equiv. chrec) > + const_offset 3 * D_j + offsetof (b) 28 > + step D_j 4 > + > + The main two uses of VAR_OFFSET and CONST_OFFSET are: > + > + 1. to better analyze the alignment, since CONST_OFFSET can be treated as > + the misalignment wrt the alignment of VAR_OFFSET. > + > + 2. 
to find data references that are a constant number of bytes apart. > + If two data references have the same BASE_ADDRESS and VAR_OFFSET, > + the distance between them is given by the difference in their > + CONST_OFFSETs. */ > struct innermost_loop_behavior > { > tree base_address; > tree offset; > - tree init; > tree step; > + tree var_offset; > + tree const_offset; > > /* BASE_ADDRESS is known to be misaligned by BASE_MISALIGNMENT bytes > from an alignment boundary of BASE_ALIGNMENT bytes. For example, > @@ -91,7 +106,7 @@ struct innermost_loop_behavior > /* The largest power of two that divides OFFSET, capped to a suitably > high value if the offset is zero. This is a byte rather than a bit > quantity. */ > - unsigned int offset_alignment; > + unsigned int var_offset_alignment; > > /* Likewise for STEP. */ > unsigned int step_alignment; > @@ -186,12 +201,13 @@ #define DR_IS_WRITE(DR) (!DR_ > #define DR_IS_CONDITIONAL_IN_STMT(DR) (DR)->is_conditional_in_stmt > #define DR_BASE_ADDRESS(DR) (DR)->innermost.base_address > #define DR_OFFSET(DR) (DR)->innermost.offset > -#define DR_INIT(DR) (DR)->innermost.init > +#define DR_VAR_OFFSET(DR) (DR)->innermost.var_offset > +#define DR_CONST_OFFSET(DR) (DR)->innermost.const_offset > #define DR_STEP(DR) (DR)->innermost.step > #define DR_PTR_INFO(DR) (DR)->alias.ptr_info > #define DR_BASE_ALIGNMENT(DR) (DR)->innermost.base_alignment > #define DR_BASE_MISALIGNMENT(DR) (DR)->innermost.base_misalignment > -#define DR_OFFSET_ALIGNMENT(DR) (DR)->innermost.offset_alignment > +#define DR_VAR_OFFSET_ALIGNMENT(DR) (DR)->innermost.var_offset_alignment > #define DR_STEP_ALIGNMENT(DR) (DR)->innermost.step_alignment > #define DR_INNERMOST(DR) (DR)->innermost > > @@ -412,7 +428,8 @@ #define DDR_REVERSED_P(DDR) (DDR)->rever > #define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p > > > -bool dr_analyze_innermost (innermost_loop_behavior *, tree, struct loop *); > +bool dr_analyze_innermost (innermost_loop_behavior *, tree, > + 
gimple *, struct loop *); > extern bool compute_data_dependences_for_loop (struct loop *, bool, > vec<loop_p> *, > vec<data_reference_p> *, > @@ -466,8 +483,6 @@ dr_alignment (data_reference *dr) > > extern bool dr_may_alias_p (const struct data_reference *, > const struct data_reference *, bool); > -extern bool dr_equal_offsets_p (struct data_reference *, > - struct data_reference *); > > extern bool runtime_alias_check_p (ddr_p, struct loop *, bool); > extern int data_ref_compare_tree (tree, tree); > @@ -675,4 +690,23 @@ lambda_matrix_new (int m, int n, struct > return mat; > } > > +/* Check if DRA and DRB have equal DR_VAR_OFFSETs. */ > + > +inline bool > +dr_var_offsets_equal_p (struct data_reference *dra, > + struct data_reference *drb) > +{ > + return eq_evolutions_p (DR_VAR_OFFSET (dra), DR_VAR_OFFSET (drb)); > +} > + > +/* Compare the DR_VAR_OFFSETs of DRA and DRB for sorting purposes, > + returning a qsort-style result. */ > + > +inline int > +dr_var_offsets_compare (struct data_reference *dra, > + struct data_reference *drb) > +{ > + return data_ref_compare_tree (DR_VAR_OFFSET (dra), DR_VAR_OFFSET (drb)); > +} > + > #endif /* GCC_TREE_DATA_REF_H */ > Index: gcc/tree-vectorizer.h > =================================================================== > --- gcc/tree-vectorizer.h 2017-08-04 11:42:45.939105152 +0100 > +++ gcc/tree-vectorizer.h 2017-08-22 14:54:48.633563940 +0100 > @@ -740,14 +740,15 @@ #define STMT_VINFO_VEC_CONST_COND_REDUC_ > > #define STMT_VINFO_DR_WRT_VEC_LOOP(S) (S)->dr_wrt_vec_loop > #define STMT_VINFO_DR_BASE_ADDRESS(S) (S)->dr_wrt_vec_loop.base_address > -#define STMT_VINFO_DR_INIT(S) (S)->dr_wrt_vec_loop.init > #define STMT_VINFO_DR_OFFSET(S) (S)->dr_wrt_vec_loop.offset > +#define STMT_VINFO_DR_VAR_OFFSET(S) (S)->dr_wrt_vec_loop.var_offset > +#define STMT_VINFO_DR_CONST_OFFSET(S) (S)->dr_wrt_vec_loop.const_offset > #define STMT_VINFO_DR_STEP(S) (S)->dr_wrt_vec_loop.step > #define STMT_VINFO_DR_BASE_ALIGNMENT(S) > 
(S)->dr_wrt_vec_loop.base_alignment > #define STMT_VINFO_DR_BASE_MISALIGNMENT(S) \ > (S)->dr_wrt_vec_loop.base_misalignment > -#define STMT_VINFO_DR_OFFSET_ALIGNMENT(S) \ > - (S)->dr_wrt_vec_loop.offset_alignment > +#define STMT_VINFO_DR_VAR_OFFSET_ALIGNMENT(S) \ > + (S)->dr_wrt_vec_loop.var_offset_alignment > #define STMT_VINFO_DR_STEP_ALIGNMENT(S) \ > (S)->dr_wrt_vec_loop.step_alignment > > Index: gcc/tree.c > =================================================================== > --- gcc/tree.c 2017-08-21 12:14:47.159835474 +0100 > +++ gcc/tree.c 2017-08-22 14:54:48.634563941 +0100 > @@ -2601,6 +2601,12 @@ tree_ctz (const_tree expr) > return MIN (ret1, prec); > } > return 0; > + case POLYNOMIAL_CHREC: > + ret1 = tree_ctz (CHREC_LEFT (expr)); > + if (ret1 == 0) > + return ret1; > + ret2 = tree_ctz (CHREC_RIGHT (expr)); > + return MIN (ret1, ret2); > default: > return 0; > } > Index: gcc/tree-data-ref.c > =================================================================== > --- gcc/tree-data-ref.c 2017-08-21 10:42:51.088530428 +0100 > +++ gcc/tree-data-ref.c 2017-08-22 14:54:48.629563940 +0100 > @@ -86,6 +86,7 @@ Software Foundation; either version 3, o > #include "expr.h" > #include "gimple-iterator.h" > #include "tree-ssa-loop-niter.h" > +#include "tree-ssa-loop-ivopts.h" > #include "tree-ssa-loop.h" > #include "tree-ssa.h" > #include "cfgloop.h" > @@ -730,7 +731,19 @@ split_constant_offset (tree exp, tree *v > *off = ssize_int (0); > STRIP_NOPS (exp); > > - if (tree_is_chrec (exp) > + if (TREE_CODE (exp) == POLYNOMIAL_CHREC) > + { > + split_constant_offset (CHREC_LEFT (exp), &op0, &op1); > + if (op0 != CHREC_LEFT (exp)) > + { > + *var = build3 (POLYNOMIAL_CHREC, type, CHREC_VAR (exp), > + op0, CHREC_RIGHT (exp)); > + *off = op1; > + } > + return; > + } > + > + if (automatically_generated_chrec_p (exp) > || get_gimple_rhs_class (TREE_CODE (exp)) == GIMPLE_TERNARY_RHS) > return; > > @@ -765,7 +778,28 @@ canonicalize_base_object_address (tree a > return 
build_fold_addr_expr (TREE_OPERAND (addr, 0)); > } > > -/* Analyze the behavior of memory reference REF. There are two modes: > +/* Analyze the scalar evolution of OFFSET in the innermost parent of > + LOOP for which it isn't invariant. Return OFFSET itself if the > + value is invariant or if it's too complex to analyze. */ > + > +static tree > +analyze_offset_scev (struct loop *loop, tree offset) > +{ > + struct loop *inv_loop = outermost_invariant_loop_for_expr (loop, offset); > + if (inv_loop != NULL) > + { > + if (loop_depth (inv_loop) == 0) > + return offset; > + loop = loop_outer (inv_loop); > + } > + tree res = analyze_scalar_evolution (loop, offset); > + if (chrec_contains_undetermined (res)) > + return offset; > + return res; > +} > + > +/* Analyze the behavior of memory reference REF, which occurs in STMT. > + There are two modes: > > - BB analysis. In this case we simply split the address into base, > init and offset components, without reference to any containing loop. > @@ -787,14 +821,14 @@ canonicalize_base_object_address (tree a > > bool > dr_analyze_innermost (innermost_loop_behavior *drb, tree ref, > - struct loop *loop) > + gimple *stmt, struct loop *loop) > { > HOST_WIDE_INT pbitsize, pbitpos; > tree base, poffset; > machine_mode pmode; > int punsignedp, preversep, pvolatilep; > affine_iv base_iv, offset_iv; > - tree init, dinit, step; > + tree dinit; > bool in_loop = (loop && loop->num); > > if (dump_file && (dump_flags & TDF_DETAILS)) > @@ -885,7 +919,7 @@ dr_analyze_innermost (innermost_loop_beh > } > } > > - init = ssize_int (pbitpos / BITS_PER_UNIT); > + tree init = ssize_int (pbitpos / BITS_PER_UNIT); > > /* Subtract any constant component from the base and add it to INIT > instead. > Adjust the misalignment to reflect the amount we subtracted. 
*/ > @@ -893,14 +927,14 @@ dr_analyze_innermost (innermost_loop_beh > init = size_binop (PLUS_EXPR, init, dinit); > base_misalignment -= TREE_INT_CST_LOW (dinit); > > - split_constant_offset (offset_iv.base, &offset_iv.base, &dinit); > - init = size_binop (PLUS_EXPR, init, dinit); > - > - step = size_binop (PLUS_EXPR, > - fold_convert (ssizetype, base_iv.step), > - fold_convert (ssizetype, offset_iv.step)); > - > base = canonicalize_base_object_address (base_iv.base); > + tree offset = size_binop (PLUS_EXPR, > + fold_convert (ssizetype, offset_iv.base), > + init); > + tree step = size_binop (PLUS_EXPR, > + fold_convert (ssizetype, base_iv.step), > + fold_convert (ssizetype, offset_iv.step)); > + tree scev = analyze_offset_scev (loop_containing_stmt (stmt), offset); > > /* See if get_pointer_alignment can guarantee a higher alignment than > the one we calculated above. */ > @@ -921,12 +955,12 @@ dr_analyze_innermost (innermost_loop_beh > } > > drb->base_address = base; > - drb->offset = fold_convert (ssizetype, offset_iv.base); > - drb->init = init; > + drb->offset = offset; > drb->step = step; > + split_constant_offset (scev, &drb->var_offset, &drb->const_offset); > drb->base_alignment = base_alignment; > drb->base_misalignment = base_misalignment & (base_alignment - 1); > - drb->offset_alignment = highest_pow2_factor (offset_iv.base); > + drb->var_offset_alignment = highest_pow2_factor (drb->var_offset); > drb->step_alignment = highest_pow2_factor (step); > > if (dump_file && (dump_flags & TDF_DETAILS)) > @@ -1154,7 +1188,7 @@ create_data_ref (loop_p nest, loop_p loo > DR_IS_READ (dr) = is_read; > DR_IS_CONDITIONAL_IN_STMT (dr) = is_conditional_in_stmt; > > - dr_analyze_innermost (&DR_INNERMOST (dr), memref, > + dr_analyze_innermost (&DR_INNERMOST (dr), memref, stmt, > nest != NULL ? 
loop : NULL); > dr_analyze_indices (dr, nest, loop); > dr_analyze_alias (dr); > @@ -1166,15 +1200,17 @@ create_data_ref (loop_p nest, loop_p loo > print_generic_expr (dump_file, DR_BASE_ADDRESS (dr), TDF_SLIM); > fprintf (dump_file, "\n\toffset from base address: "); > print_generic_expr (dump_file, DR_OFFSET (dr), TDF_SLIM); > - fprintf (dump_file, "\n\tconstant offset from base address: "); > - print_generic_expr (dump_file, DR_INIT (dr), TDF_SLIM); > + fprintf (dump_file, "\n\tvariable part of offset: "); > + print_generic_expr (dump_file, DR_VAR_OFFSET (dr), TDF_SLIM); > + fprintf (dump_file, "\n\tconstant part of offset: "); > + print_generic_expr (dump_file, DR_CONST_OFFSET (dr), TDF_SLIM); > fprintf (dump_file, "\n\tstep: "); > print_generic_expr (dump_file, DR_STEP (dr), TDF_SLIM); > fprintf (dump_file, "\n\tbase alignment: %d", DR_BASE_ALIGNMENT (dr)); > fprintf (dump_file, "\n\tbase misalignment: %d", > DR_BASE_MISALIGNMENT (dr)); > - fprintf (dump_file, "\n\toffset alignment: %d", > - DR_OFFSET_ALIGNMENT (dr)); > + fprintf (dump_file, "\n\tvariable offset alignment: %d", > + DR_VAR_OFFSET_ALIGNMENT (dr)); > fprintf (dump_file, "\n\tstep alignment: %d", DR_STEP_ALIGNMENT (dr)); > fprintf (dump_file, "\n\tbase_object: "); > print_generic_expr (dump_file, DR_BASE_OBJECT (dr), TDF_SLIM); > @@ -1324,11 +1360,12 @@ runtime_alias_check_p (ddr_p ddr, struct > operator == (const dr_with_seg_len& d1, > const dr_with_seg_len& d2) > { > - return operand_equal_p (DR_BASE_ADDRESS (d1.dr), > + return (operand_equal_p (DR_BASE_ADDRESS (d1.dr), > DR_BASE_ADDRESS (d2.dr), 0) > - && data_ref_compare_tree (DR_OFFSET (d1.dr), DR_OFFSET (d2.dr)) == > 0 > - && data_ref_compare_tree (DR_INIT (d1.dr), DR_INIT (d2.dr)) == 0 > - && data_ref_compare_tree (d1.seg_len, d2.seg_len) == 0; > + && dr_var_offsets_equal_p (d1.dr, d2.dr) > + && data_ref_compare_tree (DR_CONST_OFFSET (d1.dr), > + DR_CONST_OFFSET (d2.dr)) == 0 > + && data_ref_compare_tree (d1.seg_len, d2.seg_len) == 0); > } > 
> /* Comparison function for sorting objects of dr_with_seg_len_pair_t > @@ -1360,17 +1397,15 @@ comp_dr_with_seg_len_pair (const void *p > if ((comp_res = data_ref_compare_tree (DR_STEP (a2.dr), > DR_STEP (b2.dr))) != 0) > return comp_res; > - if ((comp_res = data_ref_compare_tree (DR_OFFSET (a1.dr), > - DR_OFFSET (b1.dr))) != 0) > + if ((comp_res = dr_var_offsets_compare (a1.dr, b1.dr)) != 0) > return comp_res; > - if ((comp_res = data_ref_compare_tree (DR_INIT (a1.dr), > - DR_INIT (b1.dr))) != 0) > + if ((comp_res = data_ref_compare_tree (DR_CONST_OFFSET (a1.dr), > + DR_CONST_OFFSET (b1.dr))) != 0) > return comp_res; > - if ((comp_res = data_ref_compare_tree (DR_OFFSET (a2.dr), > - DR_OFFSET (b2.dr))) != 0) > + if ((comp_res = dr_var_offsets_compare (a2.dr, b2.dr)) != 0) > return comp_res; > - if ((comp_res = data_ref_compare_tree (DR_INIT (a2.dr), > - DR_INIT (b2.dr))) != 0) > + if ((comp_res = data_ref_compare_tree (DR_CONST_OFFSET (a2.dr), > + DR_CONST_OFFSET (b2.dr))) != 0) > return comp_res; > > return 0; > @@ -1455,10 +1490,9 @@ prune_runtime_alias_test_list (vec<dr_wi > > if (!operand_equal_p (DR_BASE_ADDRESS (dr_a1->dr), > DR_BASE_ADDRESS (dr_a2->dr), 0) > - || !operand_equal_p (DR_OFFSET (dr_a1->dr), > - DR_OFFSET (dr_a2->dr), 0) > - || !tree_fits_shwi_p (DR_INIT (dr_a1->dr)) > - || !tree_fits_shwi_p (DR_INIT (dr_a2->dr))) > + || !dr_var_offsets_equal_p (dr_a1->dr, dr_a2->dr) > + || !tree_fits_shwi_p (DR_CONST_OFFSET (dr_a1->dr)) > + || !tree_fits_shwi_p (DR_CONST_OFFSET (dr_a2->dr))) > continue; > > /* Only merge const step data references. */ > @@ -1484,11 +1518,13 @@ prune_runtime_alias_test_list (vec<dr_wi > continue; > > /* Make sure dr_a1 starts left of dr_a2. 
*/ > - if (tree_int_cst_lt (DR_INIT (dr_a2->dr), DR_INIT (dr_a1->dr))) > + if (tree_int_cst_lt (DR_CONST_OFFSET (dr_a2->dr), > + DR_CONST_OFFSET (dr_a1->dr))) > std::swap (*dr_a1, *dr_a2); > > bool do_remove = false; > - wide_int diff = wi::sub (DR_INIT (dr_a2->dr), DR_INIT (dr_a1->dr)); > + wide_int diff = wi::sub (DR_CONST_OFFSET (dr_a2->dr), > + DR_CONST_OFFSET (dr_a1->dr)); > wide_int min_seg_len_b; > tree new_seg_len; > > @@ -1756,10 +1792,6 @@ create_intersect_range_checks (struct lo > tree addr_base_b = DR_BASE_ADDRESS (dr_b.dr); > tree offset_a = DR_OFFSET (dr_a.dr), offset_b = DR_OFFSET (dr_b.dr); > > - offset_a = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_a), > - offset_a, DR_INIT (dr_a.dr)); > - offset_b = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_b), > - offset_b, DR_INIT (dr_b.dr)); > addr_base_a = fold_build_pointer_plus (addr_base_a, offset_a); > addr_base_b = fold_build_pointer_plus (addr_base_b, offset_b); > > @@ -1826,48 +1858,6 @@ create_runtime_alias_checks (struct loop > } > } > > -/* Check if OFFSET1 and OFFSET2 (DR_OFFSETs of some data-refs) are identical > - expressions. */ > -static bool > -dr_equal_offsets_p1 (tree offset1, tree offset2) > -{ > - bool res; > - > - STRIP_NOPS (offset1); > - STRIP_NOPS (offset2); > - > - if (offset1 == offset2) > - return true; > - > - if (TREE_CODE (offset1) != TREE_CODE (offset2) > - || (!BINARY_CLASS_P (offset1) && !UNARY_CLASS_P (offset1))) > - return false; > - > - res = dr_equal_offsets_p1 (TREE_OPERAND (offset1, 0), > - TREE_OPERAND (offset2, 0)); > - > - if (!res || !BINARY_CLASS_P (offset1)) > - return res; > - > - res = dr_equal_offsets_p1 (TREE_OPERAND (offset1, 1), > - TREE_OPERAND (offset2, 1)); > - > - return res; > -} > - > -/* Check if DRA and DRB have equal offsets. 
*/ > -bool > -dr_equal_offsets_p (struct data_reference *dra, > - struct data_reference *drb) > -{ > - tree offset1, offset2; > - > - offset1 = DR_OFFSET (dra); > - offset2 = DR_OFFSET (drb); > - > - return dr_equal_offsets_p1 (offset1, offset2); > -} > - > /* Returns true if FNA == FNB. */ > > static bool > @@ -5083,13 +5073,13 @@ dr_alignment (innermost_loop_behavior *d > /* Get the alignment of BASE_ADDRESS + INIT. */ > unsigned int alignment = drb->base_alignment; > unsigned int misalignment = (drb->base_misalignment > - + TREE_INT_CST_LOW (drb->init)); > + + TREE_INT_CST_LOW (drb->const_offset)); > if (misalignment != 0) > alignment = MIN (alignment, misalignment & -misalignment); > > /* Cap it to the alignment of OFFSET. */ > if (!integer_zerop (drb->offset)) > - alignment = MIN (alignment, drb->offset_alignment); > + alignment = MIN (alignment, drb->var_offset_alignment); > > /* Cap it to the alignment of STEP. */ > if (!integer_zerop (drb->step)) > Index: gcc/tree-if-conv.c > =================================================================== > --- gcc/tree-if-conv.c 2017-07-13 09:25:12.666266733 +0100 > +++ gcc/tree-if-conv.c 2017-08-22 14:54:48.630563940 +0100 > @@ -149,7 +149,6 @@ innermost_loop_behavior_hash::hash (cons > > hash = iterative_hash_expr (e->base_address, 0); > hash = iterative_hash_expr (e->offset, hash); > - hash = iterative_hash_expr (e->init, hash); > return iterative_hash_expr (e->step, hash); > } > > @@ -161,8 +160,6 @@ innermost_loop_behavior_hash::equal (con > || (!e1->base_address && e2->base_address) > || (!e1->offset && e2->offset) > || (e1->offset && !e2->offset) > - || (!e1->init && e2->init) > - || (e1->init && !e2->init) > || (!e1->step && e2->step) > || (e1->step && !e2->step)) > return false; > @@ -173,9 +170,6 @@ innermost_loop_behavior_hash::equal (con > if (e1->offset && e2->offset > && !operand_equal_p (e1->offset, e2->offset, 0)) > return false; > - if (e1->init && e2->init > - && !operand_equal_p (e1->init, e2->init, 
0)) > - return false; > if (e1->step && e2->step > && !operand_equal_p (e1->step, e2->step, 0)) > return false; > @@ -856,8 +850,7 @@ ifcvt_memrefs_wont_trap (gimple *stmt, v > innermost_loop_behavior *innermost = &DR_INNERMOST (a); > > gcc_assert (DR_STMT (a) == stmt); > - gcc_assert (DR_BASE_ADDRESS (a) || DR_OFFSET (a) > - || DR_INIT (a) || DR_STEP (a)); > + gcc_assert (DR_BASE_ADDRESS (a) || DR_OFFSET (a) || DR_STEP (a)); > > master_dr = innermost_DR_map->get (innermost); > gcc_assert (master_dr != NULL); > @@ -1433,8 +1426,7 @@ if_convertible_loop_p_1 (struct loop *lo > if (TREE_CODE (ref) == COMPONENT_REF > || TREE_CODE (ref) == IMAGPART_EXPR > || TREE_CODE (ref) == REALPART_EXPR > - || !(DR_BASE_ADDRESS (dr) || DR_OFFSET (dr) > - || DR_INIT (dr) || DR_STEP (dr))) > + || !(DR_BASE_ADDRESS (dr) || DR_OFFSET (dr) || DR_STEP (dr))) > { > while (TREE_CODE (ref) == COMPONENT_REF > || TREE_CODE (ref) == IMAGPART_EXPR > Index: gcc/tree-loop-distribution.c > =================================================================== > --- gcc/tree-loop-distribution.c 2017-08-21 10:42:51.088530428 +0100 > +++ gcc/tree-loop-distribution.c 2017-08-22 14:54:48.630563940 +0100 > @@ -903,8 +903,7 @@ build_addr_arg_loc (location_t loc, data > { > tree addr_base; > > - addr_base = size_binop_loc (loc, PLUS_EXPR, DR_OFFSET (dr), DR_INIT (dr)); > - addr_base = fold_convert_loc (loc, sizetype, addr_base); > + addr_base = fold_convert_loc (loc, sizetype, DR_OFFSET (dr)); > > /* Test for a negative stride, iterating over every element. */ > if (tree_int_cst_sgn (DR_STEP (dr)) == -1) > @@ -1289,8 +1288,7 @@ build_rdg_partition_for_vertex (struct g > > /* Partition can only be executed sequentially if there is any > unknown data reference. 
*/ > - if (!DR_BASE_ADDRESS (dr) || !DR_OFFSET (dr) > - || !DR_INIT (dr) || !DR_STEP (dr)) > + if (!DR_BASE_ADDRESS (dr) || !DR_OFFSET (dr) || !DR_STEP (dr)) > partition->type = PTYPE_SEQUENTIAL; > > bitmap_set_bit (partition->datarefs, idx); > @@ -1507,21 +1505,18 @@ share_memory_accesses (struct graph *rdg > { > dr1 = datarefs_vec[i]; > > - if (!DR_BASE_ADDRESS (dr1) > - || !DR_OFFSET (dr1) || !DR_INIT (dr1) || !DR_STEP (dr1)) > + if (!DR_BASE_ADDRESS (dr1) || !DR_OFFSET (dr1) || !DR_STEP (dr1)) > continue; > > EXECUTE_IF_SET_IN_BITMAP (partition2->datarefs, 0, j, bj) > { > dr2 = datarefs_vec[j]; > > - if (!DR_BASE_ADDRESS (dr2) > - || !DR_OFFSET (dr2) || !DR_INIT (dr2) || !DR_STEP (dr2)) > + if (!DR_BASE_ADDRESS (dr2) || !DR_OFFSET (dr2) || !DR_STEP (dr2)) > continue; > > if (operand_equal_p (DR_BASE_ADDRESS (dr1), DR_BASE_ADDRESS (dr2), > 0) > && operand_equal_p (DR_OFFSET (dr1), DR_OFFSET (dr2), 0) > - && operand_equal_p (DR_INIT (dr1), DR_INIT (dr2), 0) > && operand_equal_p (DR_STEP (dr1), DR_STEP (dr2), 0)) > return true; > } > @@ -1705,7 +1700,6 @@ pg_add_dependence_edges (struct graph *r > runtime alias check. 
*/ > if (!DR_BASE_ADDRESS (dr1) || !DR_BASE_ADDRESS (dr2) > || !DR_OFFSET (dr1) || !DR_OFFSET (dr2) > - || !DR_INIT (dr1) || !DR_INIT (dr2) > || !DR_STEP (dr1) || !tree_fits_uhwi_p (DR_STEP (dr1)) > || !DR_STEP (dr2) || !tree_fits_uhwi_p (DR_STEP (dr2)) > || res == 0) > @@ -2203,7 +2197,7 @@ compute_alias_check_pairs (struct loop * > DR_BASE_ADDRESS (dr_b)); > > if (comp_res == 0) > - comp_res = data_ref_compare_tree (DR_OFFSET (dr_a), DR_OFFSET (dr_b)); > + comp_res = dr_var_offsets_compare (dr_a, dr_b); > gcc_assert (comp_res != 0); > > if (latch_dominated_by_data_ref (loop, dr_a)) > Index: gcc/tree-predcom.c > =================================================================== > --- gcc/tree-predcom.c 2017-08-10 14:36:08.057471267 +0100 > +++ gcc/tree-predcom.c 2017-08-22 14:54:48.631563940 +0100 > @@ -666,18 +666,13 @@ suitable_reference_p (struct data_refere > return true; > } > > -/* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET. */ > +/* Stores DR_OFFSET (DR) to OFFSET. */ > > static void > aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset) > { > - tree type = TREE_TYPE (DR_OFFSET (dr)); > - aff_tree delta; > - > - tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset, > - &name_expansions); > - aff_combination_const (&delta, type, wi::to_widest (DR_INIT (dr))); > - aff_combination_add (offset, &delta); > + tree_to_aff_combination_expand (DR_OFFSET (dr), TREE_TYPE (DR_OFFSET (dr)), > + offset, &name_expansions); > } > > /* Determines number of iterations of the innermost enclosing loop before B > @@ -710,8 +705,7 @@ determine_offset (struct data_reference > /* If the references have loop invariant address, check that they > access > exactly the same location. 
*/ > *off = 0; > - return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0) > - && operand_equal_p (DR_INIT (a), DR_INIT (b), 0)); > + return operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0); > } > > /* Compare the offsets of the addresses, and check whether the difference > @@ -1171,8 +1165,7 @@ valid_initializer_p (struct data_referen > /* If the address of the reference is invariant, initializer must access > exactly the same location. */ > if (integer_zerop (DR_STEP (root))) > - return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0) > - && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0)); > + return operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0); > > /* Verify that this index of REF is equal to the root's index at > -DISTANCE-th iteration. */ > @@ -1247,7 +1240,7 @@ find_looparound_phi (struct loop *loop, > memset (&init_dr, 0, sizeof (struct data_reference)); > DR_REF (&init_dr) = init_ref; > DR_STMT (&init_dr) = phi; > - if (!dr_analyze_innermost (&DR_INNERMOST (&init_dr), init_ref, loop)) > + if (!dr_analyze_innermost (&DR_INNERMOST (&init_dr), init_ref, phi, loop)) > return NULL; > > if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref)) > @@ -1481,8 +1474,7 @@ replace_ref_with (gimple *stmt, tree new > ref_at_iteration (data_reference_p dr, int iter, > gimple_seq *stmts, tree niters = NULL_TREE) > { > - tree off = DR_OFFSET (dr); > - tree coff = DR_INIT (dr); > + tree off, coff; > tree ref = DR_REF (dr); > enum tree_code ref_code = ERROR_MARK; > tree ref_type = NULL_TREE; > @@ -1490,6 +1482,7 @@ ref_at_iteration (data_reference_p dr, i > tree ref_op2 = NULL_TREE; > tree new_offset; > > + split_constant_offset (DR_OFFSET (dr), &off, &coff); > if (iter != 0) > { > new_offset = size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)); > Index: gcc/tree-vect-data-refs.c > =================================================================== > --- gcc/tree-vect-data-refs.c 2017-08-21 10:42:51.088530428 +0100 > +++ 
gcc/tree-vect-data-refs.c 2017-08-22 14:54:48.632563940 +0100 > @@ -866,7 +866,7 @@ vect_compute_data_ref_alignment (struct > base_misalignment = (*entry)->base_misalignment; > } > > - if (drb->offset_alignment < vector_alignment > + if (drb->var_offset_alignment < vector_alignment > || !step_preserves_misalignment_p > /* We need to know whether the step wrt the vectorized loop is > negative when computing the starting misalignment below. */ > @@ -928,7 +928,7 @@ vect_compute_data_ref_alignment (struct > base_misalignment = 0; > } > unsigned int misalignment = (base_misalignment > - + TREE_INT_CST_LOW (drb->init)); > + + TREE_INT_CST_LOW (drb->const_offset)); > > /* If this is a backward running DR then first access in the larger > vectype actually is N-1 elements before the address in the DR. > @@ -2187,13 +2187,13 @@ vect_find_same_alignment_drs (struct dat > return; > > if (!operand_equal_p (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb), 0) > - || !operand_equal_p (DR_OFFSET (dra), DR_OFFSET (drb), 0) > + || !dr_var_offsets_equal_p (dra, drb) > || !operand_equal_p (DR_STEP (dra), DR_STEP (drb), 0)) > return; > > /* Two references with distance zero have the same alignment. */ > - offset_int diff = (wi::to_offset (DR_INIT (dra)) > - - wi::to_offset (DR_INIT (drb))); > + offset_int diff = (wi::to_offset (DR_CONST_OFFSET (dra)) > + - wi::to_offset (DR_CONST_OFFSET (drb))); > if (diff != 0) > { > /* Get the wider of the two alignments. 
*/ > @@ -2434,7 +2434,7 @@ vect_analyze_group_access_1 (struct data > gimple *next = GROUP_NEXT_ELEMENT (vinfo_for_stmt (stmt)); > struct data_reference *data_ref = dr; > unsigned int count = 1; > - tree prev_init = DR_INIT (data_ref); > + tree prev_init = DR_CONST_OFFSET (data_ref); > gimple *prev = stmt; > HOST_WIDE_INT diff, gaps = 0; > > @@ -2444,9 +2444,10 @@ vect_analyze_group_access_1 (struct data > data-ref (supported only for loads), we vectorize only the first > stmt, and the rest get their vectorized loads from the first > one. */ > - if (!tree_int_cst_compare (DR_INIT (data_ref), > - DR_INIT (STMT_VINFO_DATA_REF ( > - vinfo_for_stmt (next))))) > + data_reference *next_ref > + = STMT_VINFO_DATA_REF (vinfo_for_stmt (next)); > + if (!tree_int_cst_compare (DR_CONST_OFFSET (data_ref), > + DR_CONST_OFFSET (next_ref))) > { > if (DR_IS_WRITE (data_ref)) > { > @@ -2469,14 +2470,14 @@ vect_analyze_group_access_1 (struct data > } > > prev = next; > - data_ref = STMT_VINFO_DATA_REF (vinfo_for_stmt (next)); > + data_ref = next_ref; > > /* All group members have the same STEP by construction. */ > gcc_checking_assert (operand_equal_p (DR_STEP (data_ref), step, 0)); > > /* Check that the distance between two accesses is equal to the > type > size. Otherwise, we have gaps. */ > - diff = (TREE_INT_CST_LOW (DR_INIT (data_ref)) > + diff = (TREE_INT_CST_LOW (DR_CONST_OFFSET (data_ref)) > - TREE_INT_CST_LOW (prev_init)) / type_size; > if (diff != 1) > { > @@ -2499,7 +2500,7 @@ vect_analyze_group_access_1 (struct data > gap in the access, GROUP_GAP is always 1. */ > GROUP_GAP (vinfo_for_stmt (next)) = diff; > > - prev_init = DR_INIT (data_ref); > + prev_init = DR_CONST_OFFSET (data_ref); > next = GROUP_NEXT_ELEMENT (vinfo_for_stmt (next)); > /* Count the number of data-refs in the chain. */ > count++; > @@ -2715,13 +2716,10 @@ dr_group_sort_cmp (const void *dra_, con > return cmp; > } > > - /* And according to DR_OFFSET. 
*/ > - if (!dr_equal_offsets_p (dra, drb)) > - { > - cmp = data_ref_compare_tree (DR_OFFSET (dra), DR_OFFSET (drb)); > - if (cmp != 0) > - return cmp; > - } > + /* And according to DR_VAR_OFFSET. */ > + cmp = dr_var_offsets_compare (dra, drb); > + if (cmp != 0) > + return cmp; > > /* Put reads before writes. */ > if (DR_IS_READ (dra) != DR_IS_READ (drb)) > @@ -2745,8 +2743,9 @@ dr_group_sort_cmp (const void *dra_, con > return cmp; > } > > - /* Then sort after DR_INIT. In case of identical DRs sort after stmt UID. > */ > - cmp = tree_int_cst_compare (DR_INIT (dra), DR_INIT (drb)); > + /* Then sort after DR_CONST_OFFSET. In case of identical DRs sort after > + stmt UID. */ > + cmp = tree_int_cst_compare (DR_CONST_OFFSET (dra), DR_CONST_OFFSET (drb)); > if (cmp == 0) > return gimple_uid (DR_STMT (dra)) < gimple_uid (DR_STMT (drb)) ? -1 : 1; > return cmp; > @@ -2817,7 +2816,7 @@ vect_analyze_data_ref_accesses (vec_info > if (DR_IS_READ (dra) != DR_IS_READ (drb) > || !operand_equal_p (DR_BASE_ADDRESS (dra), > DR_BASE_ADDRESS (drb), 0) > - || !dr_equal_offsets_p (dra, drb) > + || !dr_var_offsets_equal_p (dra, drb) > || !gimple_assign_single_p (DR_STMT (dra)) > || !gimple_assign_single_p (DR_STMT (drb))) > break; > @@ -2835,7 +2834,8 @@ vect_analyze_data_ref_accesses (vec_info > break; > > /* Do not place the same access in the interleaving chain twice. */ > - if (tree_int_cst_compare (DR_INIT (dra), DR_INIT (drb)) == 0) > + if (tree_int_cst_compare (DR_CONST_OFFSET (dra), > + DR_CONST_OFFSET (drb)) == 0) > break; > > /* Check the types are compatible. > @@ -2844,9 +2844,10 @@ vect_analyze_data_ref_accesses (vec_info > TREE_TYPE (DR_REF (drb)))) > break; > > - /* Sorting has ensured that DR_INIT (dra) <= DR_INIT (drb). */ > - HOST_WIDE_INT init_a = TREE_INT_CST_LOW (DR_INIT (dra)); > - HOST_WIDE_INT init_b = TREE_INT_CST_LOW (DR_INIT (drb)); > + /* Sorting has ensured that > + DR_CONST_OFFSET (dra) <= DR_CONST_OFFSET (drb). 
*/ > + HOST_WIDE_INT init_a = TREE_INT_CST_LOW (DR_CONST_OFFSET (dra)); > + HOST_WIDE_INT init_b = TREE_INT_CST_LOW (DR_CONST_OFFSET (drb)); > gcc_assert (init_a <= init_b); > > /* If init_b == init_a + the size of the type * k, we have an > @@ -2859,10 +2860,10 @@ vect_analyze_data_ref_accesses (vec_info > /* If we have a store, the accesses are adjacent. This splits > groups into chunks we support (we don't support vectorization > of stores with gaps). */ > + HOST_WIDE_INT prev_init > + = TREE_INT_CST_LOW (DR_CONST_OFFSET (datarefs_copy[i - 1])); > if (!DR_IS_READ (dra) > - && (init_b - (HOST_WIDE_INT) TREE_INT_CST_LOW > - (DR_INIT (datarefs_copy[i-1])) > - != type_size_a)) > + && (init_b - prev_init) != type_size_a) > break; > > /* If the step (if not zero or non-constant) is greater than the > @@ -2974,12 +2975,12 @@ vect_vfa_segment_size (struct data_refer > vect_no_alias_p (struct data_reference *a, struct data_reference *b, > tree segment_length_a, tree segment_length_b) > { > - gcc_assert (TREE_CODE (DR_INIT (a)) == INTEGER_CST > - && TREE_CODE (DR_INIT (b)) == INTEGER_CST); > - if (tree_int_cst_equal (DR_INIT (a), DR_INIT (b))) > + gcc_assert (TREE_CODE (DR_CONST_OFFSET (a)) == INTEGER_CST > + && TREE_CODE (DR_CONST_OFFSET (b)) == INTEGER_CST); > + if (tree_int_cst_equal (DR_CONST_OFFSET (a), DR_CONST_OFFSET (b))) > return false; > > - tree seg_a_min = DR_INIT (a); > + tree seg_a_min = DR_CONST_OFFSET (a); > tree seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_a_min), > seg_a_min, segment_length_a); > /* For negative step, we need to adjust address range by TYPE_SIZE_UNIT > @@ -2990,10 +2991,10 @@ vect_no_alias_p (struct data_reference * > tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (a))); > seg_a_min = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_a_max), > seg_a_max, unit_size); > - seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_INIT (a)), > - DR_INIT (a), unit_size); > + seg_a_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_CONST_OFFSET (a)), > + 
DR_CONST_OFFSET (a), unit_size); > } > - tree seg_b_min = DR_INIT (b); > + tree seg_b_min = DR_CONST_OFFSET (b); > tree seg_b_max = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_b_min), > seg_b_min, segment_length_b); > if (tree_int_cst_compare (DR_STEP (b), size_zero_node) < 0) > @@ -3001,8 +3002,8 @@ vect_no_alias_p (struct data_reference * > tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (b))); > seg_b_min = fold_build2 (PLUS_EXPR, TREE_TYPE (seg_b_max), > seg_b_max, unit_size); > - seg_b_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_INIT (b)), > - DR_INIT (b), unit_size); > + seg_b_max = fold_build2 (PLUS_EXPR, TREE_TYPE (DR_CONST_OFFSET (b)), > + DR_CONST_OFFSET (b), unit_size); > } > > if (tree_int_cst_le (seg_a_max, seg_b_min) > @@ -3148,8 +3149,7 @@ vect_prune_runtime_alias_test_list (loop > comp_res = data_ref_compare_tree (DR_BASE_ADDRESS (dr_a), > DR_BASE_ADDRESS (dr_b)); > if (comp_res == 0) > - comp_res = data_ref_compare_tree (DR_OFFSET (dr_a), > - DR_OFFSET (dr_b)); > + comp_res = dr_var_offsets_compare (dr_a, dr_b); > > /* Alias is known at compilation time. */ > if (comp_res == 0 > @@ -3455,7 +3455,7 @@ vect_analyze_data_refs (vec_info *vinfo, > { > gimple *stmt; > stmt_vec_info stmt_info; > - tree base, offset, init; > + tree base, offset; > enum { SG_NONE, GATHER, SCATTER } gatherscatter = SG_NONE; > bool simd_lane_access = false; > int vf; > @@ -3488,8 +3488,7 @@ vect_analyze_data_refs (vec_info *vinfo, > } > > /* Check that analysis of the data-ref succeeded. 
*/ > - if (!DR_BASE_ADDRESS (dr) || !DR_OFFSET (dr) || !DR_INIT (dr) > - || !DR_STEP (dr)) > + if (!DR_BASE_ADDRESS (dr) || !DR_OFFSET (dr) || !DR_STEP (dr)) > { > bool maybe_gather > = DR_IS_READ (dr) > @@ -3515,7 +3514,6 @@ vect_analyze_data_refs (vec_info *vinfo, > gcc_assert (newdr != NULL && DR_REF (newdr)); > if (DR_BASE_ADDRESS (newdr) > && DR_OFFSET (newdr) > - && DR_INIT (newdr) > && DR_STEP (newdr) > && integer_zerop (DR_STEP (newdr))) > { > @@ -3523,8 +3521,7 @@ vect_analyze_data_refs (vec_info *vinfo, > { > tree off = DR_OFFSET (newdr); > STRIP_NOPS (off); > - if (TREE_CODE (DR_INIT (newdr)) == INTEGER_CST > - && TREE_CODE (off) == MULT_EXPR > + if (TREE_CODE (off) == MULT_EXPR > && tree_fits_uhwi_p (TREE_OPERAND (off, 1))) > { > tree step = TREE_OPERAND (off, 1); > @@ -3555,7 +3552,7 @@ vect_analyze_data_refs (vec_info *vinfo, > { > DR_OFFSET (newdr) = ssize_int (0); > DR_STEP (newdr) = step; > - DR_OFFSET_ALIGNMENT (newdr) > + DR_VAR_OFFSET_ALIGNMENT (newdr) > = BIGGEST_ALIGNMENT; > DR_STEP_ALIGNMENT (newdr) > = highest_pow2_factor (step); > @@ -3665,7 +3662,6 @@ vect_analyze_data_refs (vec_info *vinfo, > > base = unshare_expr (DR_BASE_ADDRESS (dr)); > offset = unshare_expr (DR_OFFSET (dr)); > - init = unshare_expr (DR_INIT (dr)); > > if (is_gimple_call (stmt) > && (!gimple_call_internal_p (stmt) > @@ -3701,9 +3697,7 @@ vect_analyze_data_refs (vec_info *vinfo, > inner loop: *(BASE + INIT + OFFSET). By construction, > this address must be invariant in the inner loop, so we > can consider it as being used in the outer loop. 
*/ > - tree init_offset = fold_build2 (PLUS_EXPR, TREE_TYPE (offset), > - init, offset); > - tree init_addr = fold_build_pointer_plus (base, init_offset); > + tree init_addr = fold_build_pointer_plus (base, offset); > tree init_ref = build_fold_indirect_ref (init_addr); > > if (dump_enabled_p ()) > @@ -3715,7 +3709,7 @@ vect_analyze_data_refs (vec_info *vinfo, > } > > if (!dr_analyze_innermost (&STMT_VINFO_DR_WRT_VEC_LOOP (stmt_info), > - init_ref, loop)) > + init_ref, stmt, loop)) > /* dr_analyze_innermost already explained the failure. */ > return false; > > @@ -3728,10 +3722,14 @@ vect_analyze_data_refs (vec_info *vinfo, > dump_printf (MSG_NOTE, "\n\touter offset from base address: "); > dump_generic_expr (MSG_NOTE, TDF_SLIM, > STMT_VINFO_DR_OFFSET (stmt_info)); > - dump_printf (MSG_NOTE, > - "\n\touter constant offset from base address: "); > + dump_printf (MSG_NOTE, "\n\tvariable part of outer offset" > + " from base address: "); > + dump_generic_expr (MSG_NOTE, TDF_SLIM, > + STMT_VINFO_DR_VAR_OFFSET (stmt_info)); > + dump_printf (MSG_NOTE, "\n\tconstant part of outer offset" > + " from base address: "); > dump_generic_expr (MSG_NOTE, TDF_SLIM, > - STMT_VINFO_DR_INIT (stmt_info)); > + STMT_VINFO_DR_CONST_OFFSET (stmt_info)); > dump_printf (MSG_NOTE, "\n\touter step: "); > dump_generic_expr (MSG_NOTE, TDF_SLIM, > STMT_VINFO_DR_STEP (stmt_info)); > @@ -3739,8 +3737,9 @@ vect_analyze_data_refs (vec_info *vinfo, > STMT_VINFO_DR_BASE_ALIGNMENT (stmt_info)); > dump_printf (MSG_NOTE, "\n\touter base misalignment: %d\n", > STMT_VINFO_DR_BASE_MISALIGNMENT (stmt_info)); > - dump_printf (MSG_NOTE, "\n\touter offset alignment: %d\n", > - STMT_VINFO_DR_OFFSET_ALIGNMENT (stmt_info)); > + dump_printf (MSG_NOTE, > + "\n\touter variable offset alignment: %d\n", > + STMT_VINFO_DR_VAR_OFFSET_ALIGNMENT (stmt_info)); > dump_printf (MSG_NOTE, "\n\touter step alignment: %d\n", > STMT_VINFO_DR_STEP_ALIGNMENT (stmt_info)); > } > @@ -4055,23 +4054,18 @@ 
vect_create_addr_base_for_vector_ref (gi > innermost_loop_behavior *drb = vect_dr_behavior (dr); > > tree data_ref_base = unshare_expr (drb->base_address); > - tree base_offset = unshare_expr (drb->offset); > - tree init = unshare_expr (drb->init); > - > + tree base_offset; > if (loop_vinfo) > - base_name = get_name (data_ref_base); > + { > + base_name = get_name (data_ref_base); > + base_offset = fold_convert (sizetype, unshare_expr (drb->offset)); > + } > else > { > - base_offset = ssize_int (0); > - init = ssize_int (0); > + base_offset = size_int (0); > base_name = get_name (DR_REF (dr)); > } > > - /* Create base_offset */ > - base_offset = size_binop (PLUS_EXPR, > - fold_convert (sizetype, base_offset), > - fold_convert (sizetype, init)); > - > if (offset) > { > offset = fold_build2 (MULT_EXPR, sizetype, > Index: gcc/tree-vect-stmts.c > =================================================================== > --- gcc/tree-vect-stmts.c 2017-08-21 15:50:48.664709938 +0100 > +++ gcc/tree-vect-stmts.c 2017-08-22 14:54:48.633563940 +0100 > @@ -5970,9 +5970,7 @@ vectorizable_store (gimple *stmt, gimple > stride_base > = fold_build_pointer_plus > (unshare_expr (DR_BASE_ADDRESS (first_dr)), > - size_binop (PLUS_EXPR, > - convert_to_ptrofftype (unshare_expr (DR_OFFSET > (first_dr))), > - convert_to_ptrofftype (DR_INIT (first_dr)))); > + convert_to_ptrofftype (unshare_expr (DR_OFFSET (first_dr)))); > stride_step = fold_convert (sizetype, unshare_expr (DR_STEP > (first_dr))); > > /* For a store with loop-invariant (but other than power-of-2) > @@ -6299,7 +6297,6 @@ vectorizable_store (gimple *stmt, gimple > && TREE_CODE (DR_BASE_ADDRESS (first_dr)) == ADDR_EXPR > && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr), 0)) > && integer_zerop (DR_OFFSET (first_dr)) > - && integer_zerop (DR_INIT (first_dr)) > && alias_sets_conflict_p (get_alias_set (aggr_type), > get_alias_set (TREE_TYPE (ref_type)))) > { > @@ -6993,9 +6990,7 @@ vectorizable_load (gimple *stmt, gimple_ > 
stride_base > = fold_build_pointer_plus > (DR_BASE_ADDRESS (first_dr), > - size_binop (PLUS_EXPR, > - convert_to_ptrofftype (DR_OFFSET (first_dr)), > - convert_to_ptrofftype (DR_INIT (first_dr)))); > + convert_to_ptrofftype (DR_OFFSET (first_dr))); > stride_step = fold_convert (sizetype, DR_STEP (first_dr)); > > /* For a load with loop-invariant (but other than power-of-2) > @@ -7394,7 +7389,6 @@ vectorizable_load (gimple *stmt, gimple_ > && TREE_CODE (DR_BASE_ADDRESS (first_dr)) == ADDR_EXPR > && VAR_P (TREE_OPERAND (DR_BASE_ADDRESS (first_dr), 0)) > && integer_zerop (DR_OFFSET (first_dr)) > - && integer_zerop (DR_INIT (first_dr)) > && alias_sets_conflict_p (get_alias_set (aggr_type), > get_alias_set (TREE_TYPE (ref_type))) > && (alignment_support_scheme == dr_aligned > @@ -7417,8 +7411,8 @@ vectorizable_load (gimple *stmt, gimple_ > = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt_for_drptr)); > tree diff = fold_convert (sizetype, > size_binop (MINUS_EXPR, > - DR_INIT (first_dr), > - DR_INIT (ptrdr))); > + DR_CONST_OFFSET > (first_dr), > + DR_CONST_OFFSET (ptrdr))); > dataref_ptr = bump_vector_ptr (dataref_ptr, ptr_incr, gsi, > stmt, diff); > } > Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr81635.c > =================================================================== > --- /dev/null 2017-08-21 17:11:06.647307681 +0100 > +++ gcc/testsuite/gcc.dg/vect/bb-slp-pr81635.c 2017-08-22 14:54:48.628563940 > +0100 > @@ -0,0 +1,56 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target vect_unpack } */ > + > +double p[1000]; > +double q[1000]; > + > +void > +f1 (void) > +{ > + for (unsigned int i = 0; i < 1000; i += 4) > + { > + double a = q[i] + p[i]; > + double b = q[i + 1] + p[i + 1]; > + q[i] = a; > + q[i + 1] = b; > + } > +} > + > +void > +f2 (void) > +{ > + for (unsigned int i = 0; i < 500; i += 6) > + for (unsigned int j = 0; j < 500; j += 4) > + { > + double a = q[j] + p[i]; > + double b = q[j + 1] + p[i + 1]; > + q[i] = a; > + q[i + 1] = b; > + } > +} > + > 
+void > +f3 (void) > +{ > + for (unsigned int i = 2; i < 1000; i += 4) > + { > + double a = q[i - 2] + p[i - 2]; > + double b = q[i - 1] + p[i - 1]; > + q[i - 2] = a; > + q[i - 1] = b; > + } > +} > + > +void > +f4 (unsigned int n) > +{ > + for (unsigned int i = 0; i < n; i += 4) > + { > + double a = q[i] + p[i]; > + double b = q[i + 1] + p[i + 1]; > + q[i] = a; > + q[i + 1] = b; > + } > +} > + > +/* { dg-final { scan-tree-dump-times "basic block vectorized" 4 "slp1" } } */