https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98235

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rgue...@gcc.gnu.org>:

https://gcc.gnu.org/g:fc7b4248172561a9ee310e2d43d8a485a5c9e108

commit r11-5928-gfc7b4248172561a9ee310e2d43d8a485a5c9e108
Author: Richard Biener <rguent...@suse.de>
Date:   Fri Dec 11 10:52:58 2020 +0100

    tree-optimization/98235 - limit SLP discovery

    Now that SLP discovery follows backedges and the discovery cache is
    not permute aware, we have to put some discovery limits in place
    again.  That's also the opportunity to ditch the separate limit on
    the number of permutes we try: the patch limits the overall work
    done (counted as vect_build_slp_tree cache misses) to max_tree_size,
    which is based on the number of scalar stmts in the vectorized
    region.

    Note the limit is global and there's no attempt to divide the
    allowed work evenly amongst opportunities, so one degenerate
    opportunity can eat it all up.  That's probably only relevant for
    BB vectorization, where the limit can be based on up to the size
    of the whole function.

    2020-12-11  Richard Biener  <rguent...@suse.de>

            PR tree-optimization/98235
            * tree-vect-slp.c (vect_build_slp_tree): Exchange npermutes
            for limit.  Decrement that for each cache miss and fail
            discovery when it reaches zero.
            (vect_build_slp_tree_2): Remove npermutes handling and
            simply pass down limit.
            (vect_build_slp_instance): Pass down limit.
            (vect_analyze_slp_instance): Likewise.
            (vect_analyze_slp): Base the SLP discovery limit on
            max_tree_size and pass it down.

            * gcc.dg/torture/pr98235.c: New testcase.
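
The work limit described in the commit message can be illustrated with a
small, self-contained sketch.  This is not the code from tree-vect-slp.c:
StmtGroup, Discovery and build are hypothetical names, the halving
recursion is a toy stand-in for real SLP discovery, and the choice of
what to cache on failure is this sketch's own.  It only shows the idea of
one shared budget that is decremented on each cache miss and makes
discovery fail once it reaches zero.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for a group of scalar statements (not GCC's type).
using StmtGroup = std::vector<int>;

struct Discovery {
  std::map<StmtGroup, bool> cache;  // group -> did discovery succeed?
  unsigned misses = 0;              // number of cache misses so far

  // Try to "build a tree" for GROUP.  LIMIT is one budget shared by the
  // whole analysis; every cache miss consumes one unit of it.
  bool build(const StmtGroup &group, unsigned &limit) {
    assert(!group.empty());

    auto it = cache.find(group);
    if (it != cache.end())
      return it->second;            // cache hit: costs nothing

    if (limit == 0)
      return false;                 // budget exhausted: fail discovery

    --limit;                        // a cache miss is one unit of work
    ++misses;

    bool ok = true;
    if (group.size() > 1) {
      // Toy recursion: split the group in halves and discover each half.
      std::size_t mid = group.size() / 2;
      StmtGroup lo(group.begin(), group.begin() + mid);
      StmtGroup hi(group.begin() + mid, group.end());
      ok = build(lo, limit) && build(hi, limit);
    }

    cache[group] = ok;              // remember the outcome for later lookups
    return ok;
  }
};
```

Because the budget is passed by reference and never reset between calls,
a single expensive group can use it up and leave nothing for later
groups, which is the "global limit" caveat noted in the commit message.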
