https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92455
Bug ID: 92455
Summary: Unnecessary memory read in a loop
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: antoshkka at gmail dot com
Target Milestone: ---

Consider the example:

    typedef struct {
        int* ptr_;
    } int_ptr;

    int_ptr f1(int_ptr* x) {
        int_ptr* max = x;
        for (int i = 0; i < 5; ++i) {
            ++x;
            if (*max->ptr_ < *x->ptr_) {
                max = x;
            }
        }
        return *max;
    }

GCC with -O2 generates the following assembly:

    f1(int_ptr*):
            lea     rsi, [rdi+40]
            mov     rax, rdi
    .L3:
            mov     rcx, QWORD PTR [rax]    ; <== This could be removed from the loop
            mov     rdx, QWORD PTR [rdi+8]
            add     rdi, 8
            mov     edx, DWORD PTR [rdx]
            cmp     DWORD PTR [rcx], edx
            cmovl   rax, rdi
            cmp     rsi, rdi
            jne     .L3
            mov     rax, QWORD PTR [rax]
            ret

If we rewrite the example to avoid int_ptr:

    int* f2(int** x) {
        int** max = x;
        for (int i = 0; i < 5; ++i) {
            ++x;
            if (**max < **x) {
                max = x;
            }
        }
        return *max;
    }

Then there are fewer memory accesses in the loop:

    f2(int**):
            mov     rax, QWORD PTR [rdi]    ; <=== Not in a loop any more
            lea     rcx, [rdi+40]
    .L8:
            mov     rdx, QWORD PTR [rdi+8]
            add     rdi, 8
            mov     esi, DWORD PTR [rdx]
            cmp     DWORD PTR [rax], esi
            cmovl   rax, rdx
            cmp     rcx, rdi
            jne     .L8
            ret

Please improve the memory accesses for the first case.

Godbolt playground: https://godbolt.org/z/CaGbT2
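
The difference between the two cases can also be approximated at the source level. The following is a minimal sketch, not part of the original report and not a fix for GCC itself: `f1_cached` is a hypothetical variant of `f1` that keeps the current maximum's `ptr_` in a local variable, mirroring what the `f2` assembly does by holding that pointer in `rax`. It assumes nothing modifies the array while the loop runs. Whether it actually removes the extra load depends on the compiler version and flags, and this has not been verified against GCC 10.

```c
typedef struct {
    int* ptr_;
} int_ptr;

/* Original function from the report, kept for comparison. */
int_ptr f1(int_ptr* x) {
    int_ptr* max = x;
    for (int i = 0; i < 5; ++i) {
        ++x;
        if (*max->ptr_ < *x->ptr_) {
            max = x;
        }
    }
    return *max;
}

/* Hypothetical workaround (name and approach are assumptions):
 * track the int* of the current maximum directly, so the loop
 * never has to re-read max->ptr_ from memory. */
int_ptr f1_cached(int_ptr* x) {
    int* max_ptr = x->ptr_;          /* read once, before the loop */
    for (int i = 0; i < 5; ++i) {
        ++x;
        int* cur = x->ptr_;          /* one pointer load per iteration */
        if (*max_ptr < *cur) {
            max_ptr = cur;
        }
    }
    int_ptr result = { max_ptr };    /* rebuild the struct to return */
    return result;
}
```

For an array of six `int_ptr` elements whose pointed-to values are 3, 9, 1, 9, 4, 7, both functions return the pointer to the first 9, since the comparison is strict.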