On Mon, Sep 26, 2022 at 4:21 PM Robin Dapp <rd...@linux.ibm.com> wrote:
>
> Hi,
>
> I'm locally testing a branch that enables vll/vstl for partial vector
> usage, i.e. len_load and len_store, on s390.  I see a FAIL in
> testsuite/gfortran.dg/power_3.f90.
> Since r13-1777-gbd9837bc3ca134 we also perform VN for masked/len stores
> and things go wrong there.  The problem seems to be that we evaluate a
> vector constant {-1, 1, -1, 1} loaded with length 11 + 1(bias) = 12 as
> {1, -1, 1} instead of {-1, 1, -1}.
>
> I found it a bit difficult to navigate through the logic due to several
> sizes, offsets, lengths and "amounts" :)  From what I can tell the
> culprit code is (guarded by BYTES_BIG_ENDIAN)
>
>    if (TREE_CODE (pd.rhs) != CONSTRUCTOR)
>      {
>          q = (this_buffer + len
>               - (ROUND_UP (size - amnt, BITS_PER_UNIT)
>                  / BITS_PER_UNIT));
>      }
>
> where, with pd.rhs = { 255, 255, 255, 255, 0, 0, 0, 1, 255, 255, 255,
> 255, 0, 0, 0, 1 }, len = 16 bytes, size = 96 bits, we read after the
> first 32 bits.  What is supposed to happen here?  It looks like going
> backwards (when size grows), but actually size shrinks for my example
> with each successive element via pd.offset 0, -32 and -64.
>
> When skipping the block with && TREE_CODE (pd.rhs) != VECTOR_CST the
> test and various others succeed but I didn't pursue testing further and
> figured I'd rather ask here for more insight.

The error is probably in vn_reference_lookup_3, which assumes that
'len' applies to the vector elements in element order.  See the part
of the code that checks internal_store_fn_p.  If 'len' is instead
relative to memory, so that endianness has to be taken into
account, then the IFN_LEN_STORE case

              else if (fn == IFN_LEN_STORE)
                {
                  pd.rhs_off = 0;
                  pd.offset = offset2i;
                  pd.size = (tree_to_uhwi (len)
                             + -tree_to_shwi (bias)) * BITS_PER_UNIT;
                  if (ranges_known_overlap_p (offset, maxsize,
                                              pd.offset, pd.size))
                    return data->push_partial_def (pd, set, set,
                                                   offseti, maxsizei);

likely needs to adjust rhs_off from zero for big endian?
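
To make the arithmetic in the quoted BYTES_BIG_ENDIAN snippet concrete,
here is a small standalone sketch (not GCC code).  It assumes amnt = 0,
which the report does not state, and the names rhs_bytes, tail_offset,
read_be32 and window_elt are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_UNIT 8
/* Round X up to a multiple of A (same meaning as GCC's ROUND_UP).  */
#define ROUND_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/* The 16 bytes of pd.rhs from the report: {-1, 1, -1, 1} stored as
   big-endian 32-bit elements.  */
static const unsigned char rhs_bytes[16] = {
  255, 255, 255, 255, 0, 0, 0, 1, 255, 255, 255, 255, 0, 0, 0, 1
};

/* Byte offset into the buffer picked by the quoted snippet,
   i.e. q - this_buffer.  AMNT = 0 is an assumption of this sketch.  */
unsigned
tail_offset (unsigned len, unsigned size, unsigned amnt)
{
  return len - ROUND_UP (size - amnt, BITS_PER_UNIT) / BITS_PER_UNIT;
}

/* Decode one big-endian two's complement 32-bit element.  */
int32_t
read_be32 (const unsigned char *p)
{
  uint32_t v = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
               | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
  return (v & 0x80000000u) ? -(int32_t) ~v - 1 : (int32_t) v;
}

/* Decode element N of the window that starts at byte OFF.  */
int32_t
window_elt (unsigned off, unsigned n)
{
  assert (off + 4 * (n + 1) <= sizeof rhs_bytes);
  return read_be32 (rhs_bytes + off + 4 * n);
}
```

With len = 16 and size = 96, tail_offset gives 4 bytes, so the three
elements decoded from there are {1, -1, 1}.  Decoding from byte 0
instead gives {-1, 1, -1}, the value the report expects.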

>
> Regards
>  Robin
