https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679
--- Comment #2 from Tejas Belagod <belagod at gcc dot gnu.org> ---
foo.c.optimized:

5.0:

;; prev block 0, next block 1, flags: (NEW, REACHABLE)
;; pred: ENTRY [100.0%] (FALLTHRU,EXECUTABLE)
  # .MEM_4 = VDEF <.MEM_3(D)>
  aD.1380 = *.LC0D.1387;
  # VUSE <.MEM_4>
  vect__6.6_13 = MEM[(intD.7 *)&aD.1380];
  # VUSE <.MEM_4>
  vect__6.6_10 = MEM[(intD.7 *)&aD.1380 + 16B];
  _27 = BIT_FIELD_REF <vect__6.6_13, 32, 0>;
  _16 = BIT_FIELD_REF <vect__6.6_10, 32, 0>;
  _15 = _16 + _27;
  _18 = BIT_FIELD_REF <vect__6.6_13, 32, 32>;
  _14 = BIT_FIELD_REF <vect__6.6_10, 32, 32>;
  _5 = _14 + _18;
  _12 = BIT_FIELD_REF <vect__6.6_13, 32, 64>;
  _2 = BIT_FIELD_REF <vect__6.6_10, 32, 64>;
  _29 = _2 + _12;
  _30 = BIT_FIELD_REF <vect__6.6_13, 32, 96>;
  _31 = BIT_FIELD_REF <vect__6.6_10, 32, 96>;
  _32 = _30 + _31;
  vect_sum_7.7_17 = {_15, _5, _29, _32};
  stmp_sum_7.8_19 = _15;
  stmp_sum_7.8_20 = _5;
  stmp_sum_7.8_21 = stmp_sum_7.8_19 + stmp_sum_7.8_20;
  stmp_sum_7.8_22 = _29;
  stmp_sum_7.8_23 = stmp_sum_7.8_21 + _29;
  stmp_sum_7.8_24 = _32;
  stmp_sum_7.8_25 = stmp_sum_7.8_23 + _32;
  vect_sum_7.9_26 = stmp_sum_7.8_25;
  # .MEM_9 = VDEF <.MEM_4>
  aD.1380 ={v} {CLOBBER};
  # VUSE <.MEM_9>
  return vect_sum_7.9_26;
;; succ: EXIT [100.0%]

Very strange that the vectorizer seems to be kicking in with -mgeneral-regs-only.

4.9.2:

;; basic block 2, loop depth 0, count 0, freq 1111, maybe hot
;; prev block 0, next block 1, flags: (NEW, REACHABLE)
;; pred: ENTRY [100.0%] (FALLTHRU,EXECUTABLE)
  # .MEM_4 = VDEF <.MEM_3(D)>
  aD.1374[0] = 0;
  # .MEM_5 = VDEF <.MEM_4>
  aD.1374[1] = 1;
  # .MEM_6 = VDEF <.MEM_5>
  aD.1374[2] = 2;
  # .MEM_7 = VDEF <.MEM_6>
  aD.1374[3] = 3;
  # .MEM_8 = VDEF <.MEM_7>
  aD.1374[4] = 4;
  # .MEM_9 = VDEF <.MEM_8>
  aD.1374[5] = 5;
  # .MEM_10 = VDEF <.MEM_9>
  aD.1374[6] = 6;
  # VUSE <.MEM_10>
  _20 = aD.1374[0];
  # VUSE <.MEM_10>
  _29 = aD.1374[1];
  sum_30 = _20 + _29;
  # VUSE <.MEM_10>
  _36 = aD.1374[2];
  sum_37 = sum_30 + _36;
  # VUSE <.MEM_10>
  _43 = aD.1374[3];
  sum_44 = sum_37 + _43;
  # VUSE <.MEM_10>
  _50 = aD.1374[4];
  sum_51 = sum_44 + _50;
  # VUSE <.MEM_10>
  _57 = aD.1374[5];
  sum_58 = sum_51 + _57;
  # VUSE <.MEM_10>
  _64 = aD.1374[6];
  sum_65 = sum_58 + _64;
  sum_14 = sum_65 + 7;
  # .MEM_17 = VDEF <.MEM_10>
  aD.1374 ={v} {CLOBBER};
  # VUSE <.MEM_17>
  return sum_14;
;; succ: EXIT [100.0%]

4.9's much saner.
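
For readers comparing the two dumps, here is a rough C sketch of what each one appears to compute. The comment does not show foo.c, so `foo` below is only a guess inferred from the dumps: an 8-element int array holding 0..7, summed in a loop. The helper names `sum_like_5_0`, `sum_like_4_9` and `check_models_agree` are hypothetical. The 5.0 model also assumes that bit offset 0 in BIT_FIELD_REF corresponds to the lowest-addressed element.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: hypothetical reconstruction of foo.c, inferred from the dumps.
   The original source is not shown in the comment. */
int foo(void)
{
    const int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
    int sum = 0;

    for (size_t i = 0; i < sizeof(a) / sizeof(*a); i++)
        sum += a[i];
    return sum;
}

/* Scalar model of the 5.0 dump: two 4-int loads, lane-wise adds on the
   values extracted with BIT_FIELD_REF, then a left-to-right sum of lanes. */
int sum_like_5_0(void)
{
    const int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
    const int *lo = &a[0];   /* models vect__6.6_13 = MEM[&a]       */
    const int *hi = &a[4];   /* models vect__6.6_10 = MEM[&a + 16B] */
    int lane[4];

    for (int k = 0; k < 4; k++)
        lane[k] = hi[k] + lo[k];   /* models _15, _5, _29, _32 */

    /* models stmp_sum_7.8_21, _23, _25 */
    return ((lane[0] + lane[1]) + lane[2]) + lane[3];
}

/* Scalar model of the 4.9.2 dump: seven element stores, a fully unrolled
   chain of adds, and the last element folded to the constant 7. */
int sum_like_4_9(void)
{
    int a[8];
    int sum;

    for (int k = 0; k < 7; k++)
        a[k] = k;              /* models a[0] = 0 ... a[6] = 6 */

    sum = a[0] + a[1];         /* models sum_30 */
    for (int k = 2; k < 7; k++)
        sum += a[k];           /* models sum_37 ... sum_65 */
    return sum + 7;            /* models sum_14 = sum_65 + 7 */
}

/* All three shapes should agree on the result. */
void check_models_agree(void)
{
    assert(foo() == sum_like_5_0());
    assert(foo() == sum_like_4_9());
}
```

Under these assumptions both shapes return the same value. The difference the comment points at is only in how the sum is formed: 5.0 goes through vector-typed loads and a vector constructor, while 4.9.2 stays entirely scalar.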