The rtx iterators start out using a stack worklist but fall back to a heap worklist if the stack one turns out to be too small (which is rare). PR 66076 showed up a bug in the code that makes the switch from the stack worklist to the heap worklist: the heap one might already have been allocated if the worklist structure was shared with a previous FOR_EACH_SUBRTX and if that FOR_EACH_SUBRTX also blew the stack worklist. Fixed by making the vec_safe_grow conditional on a vec_safe_length check. (Note that vec_safe_grow starts with a call to vec_safe_length, so this is no less efficient than a check for a null pointer.)
If one FOR_EACH_SUBRTX needs to use the heap worklist, it might be slightly more efficient for future FOR_EACH_SUBRTXes on the same worklist structure to start out using the heap worklist, in order to avoid any future stack-to-heap memcpy. The problem is that that would require extra set-up code for all FOR_EACH_SUBRTXes, and the idea is to optimise for the common case where the stack worklist is enough. The situation handled by the patch should be very rare -- backed up by the fact that the ICE would always trigger in that case, yet it didn't show up in pre-release testing. Maybe it would be worth bumping up the size of the stack worklist to cope with those long AVX512 constants though. Tested on x86_64-linux-gnu. Also tested that the testcase passes on aarch64-elf. OK to install? Thanks, Richard gcc/ PR rtl-optimization/66076 * rtlanal.c (generic_subrtx_iterator <T>::add_single_to_queue): Don't grow the heap array if it is already big enough from a previous iteration. gcc/testsuite/ PR rtl-optimization/66076 * gcc.dg/torture/pr66076.c: New test. Index: gcc/rtlanal.c =================================================================== --- gcc/rtlanal.c 2015-05-10 10:44:24.091059530 +0100 +++ gcc/rtlanal.c 2015-05-10 20:49:56.618230132 +0100 @@ -104,7 +104,10 @@ generic_subrtx_iterator <T>::add_single_ return base; } gcc_checking_assert (i == LOCAL_ELEMS); - vec_safe_grow (array.heap, i + 1); + /* A previous iteration might also have moved from the stack to the + heap, in which case the heap array will already be big enough. */ + if (vec_safe_length (array.heap) <= i) + vec_safe_grow (array.heap, i + 1); base = array.heap->address (); memcpy (base, array.stack, sizeof (array.stack)); base[LOCAL_ELEMS] = x; Index: gcc/testsuite/gcc.dg/torture/pr66076.c =================================================================== --- /dev/null 2015-04-20 08:05:53.260830006 +0100 +++ gcc/testsuite/gcc.dg/torture/pr66076.c 2015-05-10 20:49:56.626230040 +0100 @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "" } */ +/* { dg-options "-mno-prefer-avx128 -march=bdver4" { target i?86-*-* x86_64-*-* } } */ + +void +f0a (char *result, char *arg1, char *arg4, char temp_6) +{ + int idx = 0; + for (idx = 0; idx < 416; idx += 1) + result[idx] = (arg1[idx] + arg4[idx]) * temp_6; +}