https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98855

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rgue...@gcc.gnu.org>:

https://gcc.gnu.org/g:63538886d1f7fc7cbf066b4c2d6d7fd4da537259

commit r11-7123-g63538886d1f7fc7cbf066b4c2d6d7fd4da537259
Author: Richard Biener <rguent...@suse.de>
Date:   Fri Feb 5 09:54:00 2021 +0100

    tree-optimization/98855 - redo BB vectorization costing

    The following attempts to account for the fact that BB vectorization
    regions can now span multiple loop levels and that an unprofitable
    inner loop vectorization shouldn't be offset by a profitable
    outer loop vectorization to make it overall profitable.

    For now I've implemented a heuristic based on the premise that
    vectorization should be profitable even if loops may not be entered
    or if they iterate any number of times.  The first assumption in
    particular requires that stmts directly belonging to loop A
    be costed separately from stmts belonging to another loop,
    which also simplifies the implementation.

    On x86 the added testcase has in the outer loop

    t.c:38:20: note: Cost model analysis for part in loop 1:
      Vector cost: 56
      Scalar cost: 192

    and the inner loop

    t.c:38:20: note: Cost model analysis for part in loop 2:
      Vector cost: 132
      Scalar cost: 48

    and thus the vectorization is considered not profitable
    (note the same would happen in case the 2nd cost were for
    a loop outer to the 1st costing).

    Future enhancements may consider static knowledge of whether
    a loop is always entered, which would allow some inefficiency
    in the vectorization of its loop header.  Likewise, stmts only
    reachable from a loop exit can be treated this way.

    2021-02-05  Richard Biener  <rguent...@suse.de>

            PR tree-optimization/98855
            * tree-vectorizer.h (add_stmt_cost): New overload.
            * tree-vect-slp.c (li_cost_vec_cmp): New.
            (vect_bb_slp_scalar_cost): Cost individual loop regions
            separately.  Account for the scalar instance root stmt.

            * g++.dg/vect/slp-pr98855.cc: New testcase.
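
As an illustration of the per-loop costing described in the commit message,
here is a minimal C++ sketch. It is not the code from tree-vect-slp.c: the
PartCost type and both function names are hypothetical, and the exact
comparison used (vector cost greater than scalar cost means unprofitable) is
an assumption. It contrasts summing all costs over the region with requiring
each loop's part to be profitable on its own.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical record: costs attributed to the loop a stmt directly belongs to.
struct PartCost {
    int loop_num;          // loop number, as printed in the dump notes
    unsigned vector_cost;  // cost of the vectorized form
    unsigned scalar_cost;  // cost of the scalar form it replaces
};

// Region-wide comparison: a cheap part can hide an expensive one.
bool profitable_summed(const std::vector<PartCost>& parts) {
    unsigned vec = 0, scalar = 0;
    for (const PartCost& p : parts) {
        vec += p.vector_cost;
        scalar += p.scalar_cost;
    }
    return vec <= scalar;
}

// Per-loop comparison: every loop's part must be profitable by itself,
// since a loop may not be entered or may iterate any number of times.
bool profitable_per_loop(const std::vector<PartCost>& parts) {
    std::map<int, std::pair<unsigned, unsigned>> by_loop;  // loop -> (vector, scalar)
    for (const PartCost& p : parts) {
        auto& sums = by_loop[p.loop_num];
        sums.first += p.vector_cost;
        sums.second += p.scalar_cost;
    }
    for (const auto& entry : by_loop)
        if (entry.second.first > entry.second.second)
            return false;  // one unprofitable loop rejects the whole region
    return true;
}
```

With the figures quoted above, {1, 56, 192} and {2, 132, 48}, the summed
check sees 188 against 240 and accepts, while the per-loop check rejects
because loop 2 has a vector cost of 132 against a scalar cost of 48.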
