On Fri, Oct 21, 2022 at 11:17:39AM +0100, Richard Earnshaw wrote:
> On 20/10/2022 18:37, Andrew Pinski via Gcc-patches wrote:
> >On aarch64 (armv8), it is actually the same instruction: PRFM. It
> >might be the only one which is that way though.
> >It even allows specifying the level for the instruction prefetch too
> >(which is actually useful for say OcteonTX2 which has an interesting
> >cache hierarchy).
> 
> Just because the encodings are similar doesn't mean that the 
> instructions are the same, although it's true that once you reach 
> unification in the cache hierarchy the end behaviour /might/ be 
> indistinguishable.

"Might", yes: for good results the hardware has to use very different
heuristics.  And of course it interacts with the hardware prefetchers
anyway (which are very different for code and data, and for that matter
almost always work a lot better than software prefetch).

> Really, Segher's point seems to be 'why overload the existing builtin 
> for this'?  It's not like the new parameter is something that users 
> would really need to pass in as a run-time choice; and that wouldn't 
> work anyway because in the end we do need distinct instructions.

Right.  The builtin as well as the RTL expressions.  But having nasty
builtin definitions hurts our users, and nasty RTL only ourselves ;-)
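
For reference, a minimal sketch of how the existing data prefetch builtin
is used today.  The function name and the look-ahead distance of 8 are
made up for illustration; only __builtin_prefetch itself is the real GCC
builtin.  Note that the rw and locality arguments have to be compile-time
constants, so an extra "is this code or data" argument would be a
compile-time choice as well, not something users select at run time.

```c
#include <stddef.h>

/* Sum an array while hinting upcoming elements to the data cache.
   __builtin_prefetch (addr, rw, locality) is the existing GCC builtin:
   rw is 0 for read or 1 for write, locality is 0..3, and both must be
   compile-time constants.  The look-ahead of 8 elements is an arbitrary
   assumption, not a tuned value.  */
long
sum_with_prefetch (const long *a, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* Hint only; the result is the same whether or not the hardware
         acts on the prefetch.  */
      if (i + 8 < n)
        __builtin_prefetch (&a[i + 8], 0, 3);
      total += a[i];
    }
  return total;
}
```

An instruction prefetch would take a code address instead and end up as a
different instruction, which is why a separate builtin reads cleaner than
another constant argument bolted onto this one.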


Segher
