On Thu, Jun 26, 2014 at 10:44 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> Hi,
> I ran into PR60947, in which GCC understands the return value of
> memset is the first argument passed in, according to standard, then
> does optimization like below:
>         mov     ip, sp
>         stmfd   sp!, {r4, r5, r6, r7, r8, r9, r10, fp, ip, lr, pc}
>         sub     fp, ip, #4
>         sub     sp, sp, #20
>         ldr     r8, [r0, #112]
>         add     r3, r8, #232
>         add     r4, r8, #328
> .L1064:
>         mov     r0, r3
>         mov     r1, #255
>         mov     r2, #8
>         bl      memset
>         add     r3, r0, #32    <----X
>         cmp     r3, r4
>         bne     .L1064
>
> For X insn, GCC takes advantage of standard by using the returned r0 directly.
>
> My question is, is it always safe for GCC to do such optimization?  Do

Yes, I think so.

> we have an option to disable such standard dependent optimization?

-fno-builtin / -ffreestanding, but that won't help for the
memset/memcpy/memmove calls that GCC generates itself.

> BTW, the sample should be further optimized into below (with Y redundant now):
>
>         mov     ip, sp
>         stmfd   sp!, {r4, r5, r6, r7, r8, r9, r10, fp, ip, lr, pc}
>         sub     fp, ip, #4
>         sub     sp, sp, #20
>         ldr     r8, [r0, #112]
>         add     r0, r8, #232
>         add     r4, r8, #328
> .L1064:
>         mov     r0, r0    <------Y
>         mov     r1, #255
>         mov     r2, #8
>         bl      memset
>         add     r0, r0, #32
>         cmp     r0, r4
>         bne     .L1064

Probably because the feature was not integrated into the register
allocator (telling it that r3 can be materialized from r0) but was
implemented in some other way.

Richard.

>
> Thanks,
> bin
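
For reference, here is a rough C sketch of the kind of loop that could
produce the code at .L1064. It is not the source from the PR: the helper
name fill_records and the 32-byte record stride are assumptions read off
the assembly (add ..., #32, with 8 bytes set to 255 per call). It only
illustrates the guarantee being relied on, namely that ISO C specifies
memset to return the value of its first argument, so the next pointer can
be computed from the return value (r0) instead of a saved copy (r3).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from PR60947: set the first 8 bytes of
   every 32-byte record in [first, last) to 0xff and return the number of
   memset calls made.  The caller must ensure (last - first) is a
   multiple of 32, otherwise the loop never terminates. */
size_t fill_records(unsigned char *first, unsigned char *last)
{
    size_t calls = 0;
    unsigned char *p = first;

    while (p != last) {
        /* ISO C: memset returns the value of its first argument. */
        unsigned char *ret = memset(p, 255, 8);
        assert(ret == p);

        /* Stepping from the returned pointer is therefore equivalent to
           stepping from p; this is what insn X does with r0. */
        p = ret + 32;
        calls++;
    }
    return calls;
}
```

In the quoted assembly the bounds are r8 + 232 and r8 + 328, i.e. 96
bytes apart, so a loop of this shape would call memset three times.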