Richard Guenther wrote:
> On Thu, 23 Feb 2012, Ulrich Weigand wrote:
> > The assert in question looks like:
> >
> >   if (nested_in_vect_loop
> >       && (TREE_INT_CST_LOW (STMT_VINFO_DR_STEP (stmt_info))
> >           % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
> >     {
> >       gcc_assert (alignment_support_scheme !=
> >                   dr_explicit_realign_optimized);
> >       compute_in_loop = true;
> >     }
> >
> > where your patch changed DR_STEP to STMT_VINFO_DR_STEP (reverting just this
> > one change makes the ICEs go away).
> >
> > However, at the place where the decision to use the
> > dr_explicit_realign_optimized strategy is made
> > (tree-vect-data-refs.c:vect_supportable_dr_alignment), we still have:
> >
> >   if ((nested_in_vect_loop
> >        && (TREE_INT_CST_LOW (DR_STEP (dr))
> >            != GET_MODE_SIZE (TYPE_MODE (vectype))))
> >       || !loop_vinfo)
> >     return dr_explicit_realign;
> >   else
> >     return dr_explicit_realign_optimized;
> >
> > Should this now also use STMT_VINFO_DR_STEP?
>
> Yes, I think so.

Hmmm.  Reading the comment in vect_supportable_dr_alignment:

     However, in the case of outer-loop vectorization, when vectorizing a
     memory access in the inner-loop nested within the LOOP that is now being
     vectorized, while it is guaranteed that the misalignment of the
     vectorized memory access will remain the same in different outer-loop
     iterations, it is *not* guaranteed that is will remain the same throughout
     the execution of the inner-loop.  This is because the inner-loop advances
     with the original scalar step (and not in steps of VS).  If the inner-loop
     step happens to be a multiple of VS, then the misalignment remains fixed
     and we can use the optimized realignment scheme.

it would appear that in this case, checking the inner-loop step is deliberate.
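
To make the property in that comment concrete, here is a small standalone
sketch.  It is not GCC code: the helper name and its parameters are made up
for illustration, and addresses are treated as plain byte offsets.  It walks
the inner-loop accesses and checks whether the misalignment relative to VS
ever changes, which should be the case exactly when the step is not a
multiple of VS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC: walk NITERS inner-loop iterations
   starting at byte offset BASE, advancing by STEP bytes each time, and
   report whether the misalignment relative to the vector size VS stays
   the same on every iteration.  */
static bool
misalignment_is_invariant (size_t base, size_t step, size_t vs, size_t niters)
{
  assert (vs > 0);
  size_t first = base % vs;

  for (size_t i = 1; i < niters; i++)
    if ((base + i * step) % vs != first)
      return false;

  return true;
}
```

With VS = 16 and a base offset of 4, a step of 16 or 32 keeps the
misalignment at 4 throughout, while a step of 4 changes it already on the
second iteration.
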

Given the comment in vectorizable_load:

      /* If the misalignment remains the same throughout the execution of the
         loop, we can create the init_addr and permutation mask at the loop
         preheader.  Otherwise, it needs to be created inside the loop.
         This can only occur when vectorizing memory accesses in the inner-loop
         nested within an outer-loop that is being vectorized.  */

it looks to me that, since the check is intended to verify that
"misalignment remains the same throughout the execution of the loop",
we actually want to check the inner-loop step here as well, i.e. revert
this chunk of your patch ...
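
As a rough model of why the two places need to agree on the step, consider
the sketch below.  It is a simplification under stated assumptions, not the
real code: the enum and both function names are invented, the steps and VS
are plain integers, and everything else the vectorizer looks at is ignored.
The first function mirrors the decision quoted from
vect_supportable_dr_alignment, the second reports when the gcc_assert quoted
from vectorizable_load would fire.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the two schemes discussed; not the GCC enum.  */
enum scheme { EXPLICIT_REALIGN, EXPLICIT_REALIGN_OPTIMIZED };

/* Models the decision quoted from vect_supportable_dr_alignment.  */
static enum scheme
choose_scheme (bool nested, long step, long vs, bool have_loop_vinfo)
{
  if ((nested && step != vs) || !have_loop_vinfo)
    return EXPLICIT_REALIGN;
  return EXPLICIT_REALIGN_OPTIMIZED;
}

/* Models the check quoted from vectorizable_load: returns true when the
   gcc_assert would fire for the given scheme and step.  */
static bool
load_assert_fires (enum scheme s, bool nested, long step, long vs)
{
  assert (vs > 0);
  return nested && step % vs != 0 && s == EXPLICIT_REALIGN_OPTIMIZED;
}
```

If both functions are handed the same step, the optimized scheme is only
chosen when step == VS, so step % VS is 0 and the assert cannot fire.  If
the decision sees one step (say 16 with VS = 16) and the load check sees a
different one (say 4), the assert does fire in this model.
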
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
[email protected]