As you can see, the compiler keeps the address of data in r9 and uses
it for data[0], but it also loads data+8 into r7 instead of addressing
it through r9 directly. If I remove the loop, it does not do this.

This optimization is currently done only by CSE, which is why it cannot look through loops.
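
For reference, a minimal sketch of the kind of code being discussed. The original test case is not shown in this message, so the array name, element type, and function names below are hypothetical; with 4-byte elements, data[2] is the access at data+8. Whether a given compiler version and target really materialize the second address separately has to be checked in the generated assembly.

```c
#include <stdint.h>

/* Hypothetical global array; 4-byte elements, so data[2] lives at data+8. */
int32_t data[4] = {1, 2, 3, 4};

/* Both accesses sit inside a loop. This is the shape where the base
   address of data and the address data+8 may end up in two separate
   registers instead of one base plus an offset. */
int32_t sum_with_loop(int n)
{
    int32_t total = 0;
    for (int i = 0; i < n; i++) {
        total += data[0];   /* access at the base address */
        total += data[2];   /* access at base + 8 bytes */
    }
    return total;
}

/* Same two accesses in straight-line code, where a block-local pass
   such as CSE sees both address computations together. */
int32_t sum_without_loop(void)
{
    return data[0] + data[2];
}
```

Compiling both functions with -O2 -S and comparing how the two addresses are formed should show the difference, if the target behaves as described above.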

Paolo
