Hi,
I have a MIPS-like architecture that has prefetch instructions. I'm writing an optimization pass that inserts prefetch instructions for all array reads. The catch is that I'm trying to do this even if the reads are not in a loop.
I have two questions:

1. Is there any work out there that has tried to do this before? All I found in the latest gcc-svn was tree-ssa-loop-prefetch.c, but since my references are not in a loop, a lot of the things done in there will not apply to me.

2. Right now I am inserting a __builtin_prefetch(...) call immediately before the actual read, getting something like:
 D.1117_12 = &A[D.1101_14];
 __builtin_prefetch (D.1117_12, 0, 1);
 D.1102_16 = A[D.1101_14];
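
For reference, a source-level C equivalent of that GIMPLE would look roughly like the sketch below. The function name read_with_prefetch and its parameters are made up for illustration and are not part of my pass.

```c
#include <stddef.h>

/* Hypothetical helper: prefetch A[i] immediately before reading it,
   mirroring the three GIMPLE statements above. */
int read_with_prefetch(const int *A, size_t i)
{
    const int *addr = &A[i];          /* D.1117_12 = &A[D.1101_14];          */
    __builtin_prefetch(addr, 0, 1);   /* rw = 0 (read), locality = 1         */
    return A[i];                      /* D.1102_16 = A[D.1101_14];           */
}
```

The prefetch has no effect on the value returned; it is only a hint, which is presumably why nothing ties it to the load.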

However, if I enable the instruction scheduler pass, it doesn't realize there's a dependency between the prefetch and the load, and it actually moves the prefetch after the load, rendering it useless. How can I inform the scheduler of this dependence?

My thinking is to also specify a latency for prefetch, so that the scheduler will hopefully place the prefetch somewhere earlier in the code to partially hide this latency. Do you see anything wrong with this approach?
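
To show the placement I am hoping for, here is a hand-written C sketch. It only illustrates the intended ordering and is not output from my pass; the function name read_after_work and the "independent work" in the middle are invented for the example.

```c
#include <stddef.h>

/* Hypothetical illustration: the prefetch is issued as soon as the
   address is known, and independent work sits between it and the load
   so that part of the prefetch latency is hidden. */
int read_after_work(const int *A, size_t i, int x)
{
    __builtin_prefetch(&A[i], 0, 1);  /* issue the prefetch early          */
    int y = x * x + 3 * x + 1;        /* independent work, does not use A  */
    return A[i] + y;                  /* the load the prefetch was for     */
}
```

Whether the scheduler can actually be made to produce this ordering from a latency attribute alone is exactly what I am unsure about.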

The prefetch instruction in the .md file is defined as:
(define_insn "prefetch"
 [(prefetch (match_operand:QI 0 "address_operand" "p")
        (match_operand 1 "const_int_operand" "n")
        (match_operand 2 "const_int_operand" "n"))]
 ""
{
 operands[1] = mips_prefetch_cookie (operands[1], operands[2]);
 return "pref\t%1,%a0";
}
 [(set_attr "type" "prefetch")])

Thanks,
George
