https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102591
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
          Component|target                      |tree-optimization
             Blocks|                            |53947

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Gabriel Ravier from comment #2)
> memcpy can fail on unaligned memory ??? I used it specifically to avoid this
> problem !
>
> (also, LLVM's code, I am pretty sure, does not have any issue with
> alignment, as it uses either AVX instructions which care not for it, or
> specifically does a movdqu (i.e. unaligned load) of the memory)

Ah, sorry - I was reading the loop as

  for (int at = 0; at < 16; at++)
    if (tpl[at] == 0)
      {
        found = 1;
        break;
      }

thus as if the suggested transform would eventually access storage that is
not accessed originally...

Btw, we vectorize

bool match8(char *tpl)
{
  char found = 0;
  for (int at = 0; at < 16; at++)
    if (tpl[at] == 0)
      found = 1;
  return found;
}

but use

  vector(16) char vect_found_4.8;

  vect__3.7_29 = MEM <vector(16) char> [(char *)tpl_10(D)];
  _32 = vect__3.7_29 != { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  vect_found_4.8_33 = VEC_COND_EXPR <_32,
      { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
      { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }>;
  _35 = .REDUC_MAX (vect_found_4.8_33);
  _8 = (bool) _35;
  return _8;

where we fail to apply "magic" to the .REDUC_MAX as we know the values are
all 0 or 1.  The conditional reduction support doesn't support producing
'int' from char compares and we fail to narrow the reduction vector.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
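
For readers following comment #3, here is a rough scalar model in plain C of the points it makes. It is a sketch, not GCC output. The names match_break, match_full, match_reduc_max, match_reduc_or and match_all_agree are hypothetical and do not appear in the bug. The OR reduction is only one guess at what the "magic" could be; the comment does not say which cheaper form is intended. The model shows that the early-exit loop may stop reading before byte 15 while the full loop always reads all 16 bytes, and that a max reduction over lanes known to be 0 or 1 gives the same answer as an any-lane-set test.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Early-exit form: stops at the first zero byte, so later bytes are
   never read.  */
static bool match_break(const char *tpl)
{
  for (int at = 0; at < 16; at++)
    if (tpl[at] == 0)
      return true;
  return false;
}

/* Form quoted in comment #3: always reads all 16 bytes.  */
static bool match_full(const char *tpl)
{
  char found = 0;
  for (int at = 0; at < 16; at++)
    if (tpl[at] == 0)
      found = 1;
  return found;
}

/* Scalar model of the vectorized GIMPLE: select 0 or 1 per lane
   (VEC_COND_EXPR), then take the maximum over the lanes (.REDUC_MAX).  */
static bool match_reduc_max(const char *tpl)
{
  char lanes[16];
  for (int i = 0; i < 16; i++)
    lanes[i] = (tpl[i] != 0) ? 0 : 1;

  char max = 0;
  for (int i = 0; i < 16; i++)
    if (lanes[i] > max)
      max = lanes[i];
  return (bool) max;
}

/* Assumption: because every lane is 0 or 1, the max reduction can be
   replaced by an "any lane set" test, modeled here as a bitwise OR.  */
static bool match_reduc_or(const char *tpl)
{
  char any = 0;
  for (int i = 0; i < 16; i++)
    any |= (tpl[i] == 0);
  return any != 0;
}

/* Asserts that all four variants agree and returns the common result.  */
static bool match_all_agree(const char *tpl)
{
  bool r = match_full(tpl);
  assert(match_break(tpl) == r);
  assert(match_reduc_max(tpl) == r);
  assert(match_reduc_or(tpl) == r);
  return r;
}
```

Calling match_all_agree on a 16-byte buffer checks that the four forms return the same value for that input. They differ only in which bytes they read and in how the per-lane results are combined.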