https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81719
Bug ID: 81719
Summary: Range-based for loop on short fixed size array generates long unrolled loop
Product: gcc
Version: 7.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: jzwinck at gmail dot com
Target Milestone: ---

C++11 range-based for loops over arrays of size known at compile time result in bloated, branchy, and unreachable code with -O3 optimization. For example:

    typedef int Items[2];

    struct ItemArray {
        Items items;
        int sum_x2() const;
    };

    int ItemArray::sum_x2() const {
        int total = 0;
        for (int item : items) {
            total += item;
        }
        return total;
    }

Clang compiles the above to [mov, add, ret]. GCC with -O2 compiles it to a few more than that, and with -O3, a whopping 81 instructions. Add -march=haswell and behold about 130 instructions to add two ints.

GCC (all versions, 4 to 7) generates code to handle a variable-sized array up to about 6 to 14 elements, depending on -march. The number of elements is known at compile time to be 2 (other small values also elicit the bug).

GCC should generate three instructions in both -O2 and -O3. It actually does, if sum_x2() is a free function instead of a member function. The problem also goes away if you use a C-style loop.

There are lots of permutations of this, including using a range-based for loop to assign a common value to every element of an array whose size is known at compile time (120 instructions to assign a single int: https://godbolt.org/g/BGYggD).

Discussion on Stack Overflow:
https://stackoverflow.com/questions/45496987/gcc-optimizes-fixed-range-based-for-loop-as-if-it-had-longer-variable-length
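For comparison, the three variants mentioned above can be put side by side in one translation unit. This is only a sketch: the names sum_range_for, sum_c_style, sum_free and check_variants are hypothetical and do not appear in the original test case, and the signature of the free function is an assumption, since the report does not say how the free-function version receives the array. All three compute the same sum; only the generated code is expected to differ.

```cpp
#include <cassert>
#include <cstddef>

typedef int Items[2];

struct ItemArray {
    Items items;
    int sum_range_for() const;  // member function + range-based for (the form reported as bloated)
    int sum_c_style() const;    // member function + C-style loop (reported to compile cleanly)
};

int ItemArray::sum_range_for() const {
    int total = 0;
    for (int item : items) {
        total += item;
    }
    return total;
}

int ItemArray::sum_c_style() const {
    int total = 0;
    // The element count is a compile-time constant derived from the array type.
    for (std::size_t i = 0; i < sizeof(items) / sizeof(items[0]); ++i) {
        total += items[i];
    }
    return total;
}

// Free-function form. Taking the array by reference is an assumption made
// for this sketch; the report only says "a free function".
int sum_free(const Items& items) {
    int total = 0;
    for (int item : items) {
        total += item;
    }
    return total;
}

// Hypothetical helper: confirms the three variants agree on the result.
void check_variants() {
    ItemArray a = {{3, 4}};
    assert(a.sum_range_for() == a.sum_c_style());
    assert(a.sum_c_style() == sum_free(a.items));
}
```

Compiling this with -O3 (optionally adding -march=haswell) and comparing the assembly for each function should show whether a given GCC version is affected; no instruction counts are claimed here beyond those stated above.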