On 03/10/2010 10:48 PM, fanqifei wrote:
> For below piece of code, the instruction "clr.w a15" obviously doesn't
> belong to the inner loop.
> 6: bd f4 clr.w a15; #clear to zero
> 8: 80 af 00 std.w a10 0x0 a15;
Some information is missing here. Did you compile with optimization
enabled? What does the RTL look like before and after the loop
optimization passes?
I'd guess that your movsi pattern is defined wrong. You probably have
predicates that allow either registers or constants in the set source,
which is normal, and constraints that only allow registers when the dest
is a mem. But constraints are not enforced until the reload pass, so a
store-zero-to-mem RTL insn will be generated early, and then fixed up
late, during reload, by loading the zero into a register first. So the
loop opt did not move the clear insn out of the loop because there was
no separate clear insn at that time.
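As a rough illustration of that guess, a pattern of the following shape
would behave this way. This is only a sketch: the predicates, the
constraint alternatives and the elided output template are assumptions,
not taken from the actual port.

```
;; Hypothetical sketch of the suspected pattern, not the real port.
(define_insn "movsi"
  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,m")
        (match_operand:SI 1 "general_operand"       "r,mi,r"))]
  ""       ; no insn condition: (set (mem) (const_int 0)) is recognized
  "...")   ; target-specific output templates elided
```

The predicates accept a constant source with a mem dest, but no
constraint alternative does, so reload has to insert the register load
next to the store, long after the loop passes have run.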
The way to fix this is to add a condition to the movsi pattern that
excludes this case. For instance, something like this:

  "(register_operand (operands[0], SImode)
    || register_operand (operands[1], SImode))"

This will prevent a store-zero-to-mem RTL insn from being accepted. In
order to make this work, you need to make movsi an expander that accepts
anything, and then forces the source into a register if you have a
store of a constant to memory. See for instance the sparc_expand_move
function or the mips_legitimize_move function.
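
A minimal sketch of that arrangement might look like the following. The
insn name "*movsi_insn", the predicates, the constraint alternatives and
the elided output template are all assumptions for illustration. The
can_create_pseudo_p () guard assumes a GCC version that provides it;
older sources test reload_in_progress and reload_completed instead.

```
;; Hypothetical sketch; names, constraints and templates are assumptions.
(define_expand "movsi"
  [(set (match_operand:SI 0 "general_operand" "")
        (match_operand:SI 1 "general_operand" ""))]
  ""
{
  /* Before reload, if neither operand is a register (for example a
     store of constant zero to memory), copy the source into a new
     pseudo register so the load is a separate insn.  */
  if (can_create_pseudo_p ()
      && !register_operand (operands[0], SImode)
      && !register_operand (operands[1], SImode))
    operands[1] = force_reg (SImode, operands[1]);
})

(define_insn "*movsi_insn"
  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,m")
        (match_operand:SI 1 "general_operand"       "r,mi,r"))]
  "register_operand (operands[0], SImode)
   || register_operand (operands[1], SImode)"
  "...")   ; target-specific output templates elided
```

With something like this, the load of zero exists as its own insn in the
initial RTL, so the loop passes get a chance to see it as invariant and
move it out of the loop.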
Use -da (older GCC versions) or -fdump-rtl-all (newer versions) to get
the RTL dumps and see what is going on.
Jim