https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98235
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rgue...@gcc.gnu.org>:

https://gcc.gnu.org/g:fc7b4248172561a9ee310e2d43d8a485a5c9e108

commit r11-5928-gfc7b4248172561a9ee310e2d43d8a485a5c9e108
Author: Richard Biener <rguent...@suse.de>
Date:   Fri Dec 11 10:52:58 2020 +0100

    tree-optimization/98235 - limit SLP discovery

    With following backedges and the SLP discovery cache not being
    permute aware we have to put some discovery limits in place again.
    That's also the opportunity to ditch the separate limit on the
    number of permutes we try, so the patch limits the overall work
    done (as in vect_build_slp_tree cache misses) to what we compute
    as max_tree_size which is based on the number of scalar stmts in
    the vectorized region.

    Note the limit is global and there's no attempt to divide the
    allowed work evenly amongst opportunities, so one degenerate
    can eat it all up.  That's probably only relevant for BB
    vectorization where the limit is based on up to the size of the
    whole function.

    2020-12-11  Richard Biener  <rguent...@suse.de>

            PR tree-optimization/98235
            * tree-vect-slp.c (vect_build_slp_tree): Exchange npermutes
            for limit.  Decrement that for each cache miss and fail
            discovery when it reaches zero.
            (vect_build_slp_tree_2): Remove npermutes handling and
            simply pass down limit.
            (vect_build_slp_instance): Use pass down limit.
            (vect_analyze_slp_instance): Likewise.
            (vect_analyze_slp): Base the SLP discovery limit on
            max_tree_size and pass it down.

            * gcc.dg/torture/pr98235.c: New testcase.
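
For readers unfamiliar with the scheme, the following is a minimal standalone sketch of a shared work budget that is charged only on cache misses, as the commit message describes. It is not the code from tree-vect-slp.c: the names toy_node, toy_ctx, toy_state and toy_discover are hypothetical, the "cache" is a plain array indexed by node number, and the operand graph is assumed to be acyclic (the real discovery also has to cope with backedges).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_NODES 64

/* Hypothetical operand graph: each node names up to two child nodes
   by index, or -1 for "no operand".  Assumed acyclic in this sketch.  */
struct toy_node {
    int left;
    int right;
};

enum toy_state { TOY_UNKNOWN = 0, TOY_OK, TOY_FAIL };

/* Hypothetical stand-in for the discovery cache: one slot per node.  */
struct toy_ctx {
    const struct toy_node *nodes;
    enum toy_state cache[TOY_MAX_NODES];
};

/* Try to "discover" the subtree rooted at ID.  *LIMIT is a budget shared
   by all calls; it is decremented once per cache miss, and discovery
   fails as soon as a miss occurs with the budget already at zero.  */
bool toy_discover(struct toy_ctx *ctx, int id, unsigned *limit)
{
    assert(ctx != NULL && limit != NULL);

    if (id < 0)
        return true;                    /* no operand, nothing to do */
    assert(id < TOY_MAX_NODES);

    /* Cache hit: reuse the earlier result without charging the budget.  */
    if (ctx->cache[id] != TOY_UNKNOWN)
        return ctx->cache[id] == TOY_OK;

    /* Cache miss: fail if the budget is exhausted, otherwise pay for it.  */
    if (*limit == 0)
        return false;
    --*limit;

    /* The same limit pointer is simply passed down to the children.  */
    bool ok = toy_discover(ctx, ctx->nodes[id].left, limit)
              && toy_discover(ctx, ctx->nodes[id].right, limit);

    ctx->cache[id] = ok ? TOY_OK : TOY_FAIL;
    return ok;
}
```

Because the caller owns the single counter and hands the same pointer to every root it tries, an expensive first root can leave nothing for later ones, which is the "one degenerate can eat it all up" behaviour noted in the commit message. In this sketch a chain of four nodes costs four units, while a node whose two operands are the same leaf costs two, since the second visit to the leaf is a cache hit.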