On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin <li...@linux.ibm.com> wrote:
>
> This patch series follows Richi's suggestion at the link [1],
> which suggests structuring vectorizable_load to make costing
> next to the transform, in order to make it easier to keep
> costing and the transform in sync.  For now, it's a known
> issue that what we cost can be inconsistent with what we
> transform, as the case in PR82255 and some other associated
> test cases in the patches of this series show.
>
> Basically this patch series makes costing not call function
> vect_model_load_cost any more.  To make the review and
> bisection easy, I organized the changes according to the
> memory access types of vector load.  For each memory access
> type, the patch first follows the handling in the function
> vect_model_load_cost to avoid missing anything, then refines
> it further by referring to the transform code.  I also checked
> the changes against some typical test cases.  I hope the
> subjects of the patches are clear enough.
>
> The whole series can be bootstrapped and regtested
> incrementally on:
>   - x86_64-redhat-linux
>   - aarch64-linux-gnu
>   - powerpc64-linux-gnu P7, P8 and P9
>   - powerpc64le-linux-gnu P8, P9 and P10
>
> Since the current vector test buckets are mainly tested
> without the cost model, I also verified that the whole patch
> series is neutral for SPEC2017 int/fp on Power9 at -O2,
> -O3 and -Ofast separately.

I went through the series now and I like it overall (well, I suggested
the change).
Looking at the changes, I think we want some follow-up to reduce the
mess in the final loop nest.  We already have some VMAT_* cases handled
separately; maybe we can split out some more cases.  Maybe we should
bite the bullet and duplicate that loop nest for the different VMAT_* cases.
Maybe we can merge some of the if (!costing_p) checks by clever
re-ordering.  So what this series doesn't improve is the overall
readability of the code (indentation and our 80 char line limit).
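
To make the shape under discussion concrete, here is a minimal, self-contained
sketch.  It is not the code in tree-vect-stmts.cc: access_kind, load_result,
process_load and the unit costs are all made up for illustration.  It only
shows the idea of one walk that either costs or emits depending on a
costing_p flag, with each access type's costing sitting right next to its
transform.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the VMAT_* access types; not GCC's enum.
enum class access_kind { contiguous, contiguous_reverse, elementwise };

struct load_result
{
  int cost = 0;                    // accumulated only when costing_p is true
  std::vector<std::string> stmts;  // filled only when costing_p is false
};

// One walk over the copies serves both analysis (costing_p) and transform.
// Every emitted statement is assumed to cost one unit, so the cost of a
// case can be checked directly against what that case emits.
load_result
process_load (access_kind kind, int ncopies, int nelts, bool costing_p)
{
  assert (ncopies > 0 && nelts > 0);
  load_result r;
  for (int i = 0; i < ncopies; ++i)
    switch (kind)
      {
      case access_kind::contiguous:
        if (costing_p)
          r.cost += 1;  // one vector load
        else
          r.stmts.push_back ("vector_load");
        break;

      case access_kind::contiguous_reverse:
        if (costing_p)
          r.cost += 2;  // vector load plus the reversing permute
        else
          {
            r.stmts.push_back ("vector_load");
            r.stmts.push_back ("permute");
          }
        break;

      case access_kind::elementwise:
        if (costing_p)
          r.cost += nelts + 1;  // nelts scalar loads plus one construct
        else
          {
            for (int j = 0; j < nelts; ++j)
              r.stmts.push_back ("scalar_load");
            r.stmts.push_back ("vec_construct");
          }
        break;
      }
  return r;
}
```

Calling process_load once with costing_p set and once without, for the same
access kind, should give a cost equal to the number of statements emitted.
Splitting the switch into one loop per access kind is the "duplicate the loop
nest" option; it trades some repetition for less nesting.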

The change also makes it more difficult(?) to separate analysis and transform,
though in the end I hope that analysis will actually "code generate" to a (SLP)
data structure so the target will have a chance to see the actual flow of insns.

That said, I'd like to hear from Richard whether he thinks this is a step
in the right direction.

Are you willing to follow up by doing the same restructuring for
vectorizable_store?

OK from my side with the few comments addressed.  The patch likely needs a
refresh after the RVV changes in this area?

Thanks,
Richard.

> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563624.html
>
> Kewen Lin (9):
>   vect: Move vect_model_load_cost next to the transform in vectorizable_load
>   vect: Adjust vectorizable_load costing on VMAT_GATHER_SCATTER && gs_info.decl
>   vect: Adjust vectorizable_load costing on VMAT_INVARIANT
>   vect: Adjust vectorizable_load costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
>   vect: Adjust vectorizable_load costing on VMAT_GATHER_SCATTER
>   vect: Adjust vectorizable_load costing on VMAT_LOAD_STORE_LANES
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_REVERSE
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_PERMUTE
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS
>
>  .../vect/costmodel/ppc/costmodel-pr82255.c    |  31 +
>  .../costmodel/ppc/costmodel-vect-reversed.c   |  22 +
>  gcc/testsuite/gcc.target/i386/pr70021.c       |   2 +-
>  gcc/tree-vect-stmts.cc                        | 651 ++++++++++--------
>  4 files changed, 432 insertions(+), 274 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-reversed.c
>
> --
> 2.31.1
>
