On Thu, 2013-06-20 at 03:16 -0700, Stephen Clarke wrote:
> > why (and where) did ivopts decide to move the post-increments above the
> > usages in the first loop?
>
> It looks rather like the transformation described next to
> tree-ssa-loop-ivopts.c/adjust_iv_update_pos.

Yes, that looks like the place. Unfortunately, at that point everything
is controlled by the cost of the various IV choices and I think that is
what I need to change.

Just to explain what is happening, the MIPS ProAptiv chip has what is
known as 'memory bonding' where two sequential 4 byte loads can be
handled as one 8 byte load (if the alignment is right). So I want to
keep all the loads in a sequence and with no intervening instructions
in order to improve the chances of memory bonding happening.

With a memcpy style loop and loop unrolling the IV optimization is
changing my ideal loop:

.L4:
	addiu	$8,$8,8
	lw	$14,0($7)
	lw	$13,4($7)
	lw	$12,8($7)
	lw	$11,12($7)
	lw	$15,16($7)
	lw	$24,20($7)
	lw	$5,24($7)
	lw	$25,28($7)
	addiu	$7,$7,32
	sw	$14,0($3)
	sw	$13,4($3)
	sw	$12,8($3)
	sw	$11,12($3)
	sw	$15,16($3)
	sw	$24,20($3)
	sw	$5,24($3)
	sw	$25,28($3)
	bne	$8,$6,.L4
	addiu	$3,$3,32

into this:

.L4:
	addiu	$8,$8,8
	lw	$13,0($7)
	lw	$12,4($7)
	lw	$14,8($7)
	lw	$15,12($7)
	lw	$24,16($7)
	lw	$5,20($7)
	lw	$25,24($7)
	addiu	$7,$7,32
	sw	$13,0($3)
	sw	$12,4($3)
	sw	$14,8($3)
	sw	$15,12($3)
	sw	$24,16($3)
	sw	$5,20($3)
	sw	$25,24($3)
	addiu	$3,$3,32
	lw	$4,-4($7)
	bne	$8,$6,.L4
	sw	$4,-4($3)

And so I lose one of my bonding opportunities because of the lw/sw that
are done after incrementing registers $7 and $3. I think what I need
to fix this is a target specific way to modify the cost calculations
before the IV variables are chosen.

Steve Ellcey
sell...@mips.com
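
For reference, here is a minimal sketch of the kind of memcpy style loop that could lead to code like the listings above once the compiler unrolls it by 8. It is not the actual test case: the function name copy_words is hypothetical, and the restrict qualifiers and the use of 32-bit words are assumptions made so that the loads can be grouped ahead of the stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memcpy-style loop (illustration only, not the original
 * test case).  Each iteration is one 4-byte load (lw) followed by one
 * 4-byte store (sw); unrolling by 8 would give the 8 lw / 8 sw groups
 * shown in the listings.  'restrict' is an assumption: it tells the
 * compiler the buffers do not overlap, so the loads may be scheduled
 * together before the stores. */
void copy_words(uint32_t *restrict dst, const uint32_t *restrict src,
                size_t n)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```

Whether the loads stay adjacent then depends on the unrolling and IV decisions discussed above, so the generated code has to be inspected for each compiler version and option set.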