Thanks for elaborating.

On Thu, Jun 26, 2014 at 5:18 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Thu, Jun 26, 2014 at 10:44 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> Hi,
>> I ran into PR60947, in which GCC understands the return value of
>> memset is the first argument passed in, according to standard, then
>> does optimization like below:
>>         mov     ip, sp
>>         stmfd   sp!, {r4, r5, r6, r7, r8, r9, r10, fp, ip, lr, pc}
>>         sub     fp, ip, #4
>>         sub     sp, sp, #20
>>         ldr     r8, [r0, #112]
>>         add     r3, r8, #232
>>         add     r4, r8, #328
>> .L1064:
>>         mov     r0, r3
>>         mov     r1, #255
>>         mov     r2, #8
>>         bl      memset
>>         add     r3, r0, #32    <----X
>>         cmp     r3, r4
>>         bne     .L1064
>>
>> For X insn, GCC takes advantage of standard by using the returned r0
>> directly.
>>
>> My question is, is it always safe for GCC to do such optimization?  Do
>
> Yes, I think so.
>
>> we have an option to disable such standard dependent optimization?
>
> -fno-builtin / -ffreestanding, but for memset/memcpy/memmove that are also
> generated by GCC that won't help.
>
>> BTW, the sample should be further optimized into below (with Y redundant
>> now):
>>
>>         mov     ip, sp
>>         stmfd   sp!, {r4, r5, r6, r7, r8, r9, r10, fp, ip, lr, pc}
>>         sub     fp, ip, #4
>>         sub     sp, sp, #20
>>         ldr     r8, [r0, #112]
>>         add     r0, r8, #232
>>         add     r4, r8, #328
>> .L1064:
>>         mov     r0, r0         <------Y
>>         mov     r1, #255
>>         mov     r2, #8
>>         bl      memset
>>         add     r0, r0, #32
>>         cmp     r0, r4
>>         bne     .L1064
>
> Probably because the feature was not integrated into the register allocator
> (telling it that r3 can be materialized from r0) but in some other way.

I agree, it's IRA that still doesn't understand this and thus allocates the
pseudo register to r3 rather than r0.
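
For reference, here is a minimal C sketch of what the X insn relies on. It is
only my reading of the assembly above, not the actual source from PR60947: the
function names are made up, and the 0xff / 8 / 32 constants are taken from
"mov r1, #255", "mov r2, #8" and "add ..., #32". The second function derives
the next pointer from memset's return value, which is valid because the C
standard says memset returns its first argument.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names; constants assumed from the quoted assembly. */

/* Plain form: the pointer lives in its own variable across the call. */
static void fill_plain(unsigned char *p, unsigned char *end)
{
    for (; p != end; p += 32)
        memset(p, 0xff, 8);
}

/* Form matching the X insn: the next pointer is computed from the
   value memset returns (r0 on ARM) instead of the saved copy. */
static void fill_via_return(unsigned char *p, unsigned char *end)
{
    while (p != end) {
        unsigned char *ret = (unsigned char *)memset(p, 0xff, 8);
        assert(ret == p);   /* memset returns its first argument */
        p = ret + 32;
    }
}
```

Both functions write the same bytes for any range whose length is a multiple
of 32, e.g. the 96 bytes between offsets 232 and 328 in the sample.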

Thanks,
bin
>
> Richard.
>
>>
>> Thanks,
>> bin