Hi,
I have a MIPS-like architecture which has prefetch instructions. I'm
writing an optimization pass that inserts prefetch instructions for all
array reads. The catch is that I'm trying to do this even if the reads
are not in a loop.
I have two questions:
1. Is there any existing work that has tried this before? All I
found in the latest gcc-svn was tree-ssa-loop-prefetch.c, but since my
references are not in a loop, much of what is done there does not
apply to me.
2. Right now I am inserting a __builtin_prefetch(...) call immediately
before the actual read, getting something like:
    D.1117_12 = &A[D.1101_14];
    __builtin_prefetch (D.1117_12, 0, 1);
    D.1102_16 = A[D.1101_14];
However, if I enable the instruction scheduler pass, it doesn't see any
dependency between the prefetch and the load, and it actually moves the
prefetch after the load, rendering it useless. How can I tell the
scheduler about this dependence?
My thinking is to also specify a latency for prefetch, so that the
scheduler will hopefully place the prefetch somewhere earlier in the
code to partially hide this latency. Do you see anything wrong with this
approach?
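To make the intended ordering concrete, here is a minimal C sketch of
the effect at the source level. It is not the pass itself: the function
name read_with_early_prefetch and the filler arithmetic are hypothetical,
and only the __builtin_prefetch call with arguments (addr, 0, 1) is taken
from the dump above. It shows the prefetch issued first, independent
work next, and the load last.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: prefetch a[i] for reading,
   do unrelated work, then perform the actual load.  */
int
read_with_early_prefetch (const int *a, size_t i, int x)
{
  /* rw = 0 (read), locality = 1: the same arguments as in the dump.  */
  __builtin_prefetch (&a[i], 0, 1);

  /* Independent work that does not touch a[i].  It stands in for
     whatever instructions the scheduler could place between the
     prefetch and the load to hide part of the memory latency.  */
  int t = x * x + 3 * x + 1;

  /* The actual array read that the prefetch was issued for.  */
  return a[i] + t;
}
```

With no independent work available between the two, the prefetch would
likely give little or no benefit, which is why the ordering matters.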
The prefetch instruction in the .md file is defined as:
    (define_insn "prefetch"
      [(prefetch (match_operand:QI 0 "address_operand" "p")
                 (match_operand 1 "const_int_operand" "n")
                 (match_operand 2 "const_int_operand" "n"))]
      ""
    {
      operands[1] = mips_prefetch_cookie (operands[1], operands[2]);
      return "pref\t%1,%a0";
    }
      [(set_attr "type" "prefetch")])
Thanks,
George