> On Dec 13, 2014, at 5:22 AM, Ajit Kumar Agarwal 
> <ajit.kumar.agar...@xilinx.com> wrote:
> 
> Hello All:
> 
> Since prefetch instructions have no direct consumers in the code stream,
> they give the instruction scheduler considerable freedom. They are
> typically assigned lower priorities than most of the other instructions
> in the code stream. This tends to cause all the prefetch instructions to
> be placed together in the final schedule. Placing them in clumps, rather
> than spreading them evenly, degrades performance.
> 
> Spreading the prefetch instructions evenly gives better speedup ratios
> for dirty misses than placing them in clumps.

I can believe that’s true for some processors; is it true for all of them?  I 
have the impression that some MIPS processors don’t mind clumped prefetches, so 
long as you don’t exceed the limit on the total number of concurrently 
pending memory accesses.
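
To make the two placements concrete, here is a rough source-level sketch in C 
using the GCC/Clang builtin `__builtin_prefetch`. It only illustrates what 
"clumped" versus "evenly spread" means; the real decision is made by the 
instruction scheduler, which is free to move these instructions again. The 
names `LINE`, `AHEAD`, `prefetch_line`, `sum_clumped` and `sum_spread` are 
hypothetical, and the values (16 ints per cache line, 8 lines of lookahead) 
are assumptions, not measured numbers for any particular processor.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only: 16 ints per 64-byte cache line,
   and a prefetch distance of 8 lines ahead of the current position. */
enum { LINE = 16, AHEAD = 8 };

/* Issue a read prefetch for a[idx], skipping indices past the array end
   so that the address expression itself stays valid. */
static inline void prefetch_line(const int *a, size_t n, size_t idx)
{
    if (idx < n)
        __builtin_prefetch(&a[idx], 0, 3);
}

/* Clumped: four prefetches issued back to back, then four lines of work. */
long sum_clumped(const int *a, size_t n)
{
    assert(n % (4 * LINE) == 0);
    long s = 0;
    for (size_t i = 0; i < n; i += 4 * LINE) {
        for (size_t k = 0; k < 4; k++)
            prefetch_line(a, n, i + (AHEAD + k) * LINE);
        for (size_t j = 0; j < 4 * LINE; j++)
            s += a[i + j];
    }
    return s;
}

/* Spread: one prefetch, then one line of work, repeated four times. */
long sum_spread(const int *a, size_t n)
{
    assert(n % (4 * LINE) == 0);
    long s = 0;
    for (size_t i = 0; i < n; i += 4 * LINE) {
        for (size_t k = 0; k < 4; k++) {
            prefetch_line(a, n, i + (AHEAD + k) * LINE);
            for (size_t j = 0; j < LINE; j++)
                s += a[i + k * LINE + j];
        }
    }
    return s;
}
```

Both functions compute the same sum; only the position of the prefetches 
relative to the loads differs. Which one runs faster would have to be measured 
per target, which is the question I am asking above.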

        paul
