On 20/10/2022 18:37, Andrew Pinski via Gcc-patches wrote:
On Thu, Oct 20, 2022 at 10:28 AM Segher Boessenkool
<seg...@kernel.crashing.org> wrote:

On Thu, Oct 20, 2022 at 01:44:15AM +0000, Jiang, Haochen wrote:
Maybe the testcase change cause some misunderstanding and concern.

Actually, the patch did not disrupt the previous builtins, as the 
builtin_prefetch
uses vargs. I set the default value of the new parameter as data prefetch, which
means that if we are not using the fourth parameter, just like how we use
prefetch previously, it is still what it is.

I still think it is a mistake to have one builtin do two very distinct
operations, only very superficially related.  Instruction fetch and data
demand loads are almosty entirely unrelated, and so is the prefetch
machinery for them, on all machines I am familiar with.

On aarch64 (armv8), it is actually the same instruction: PRFM. It
might be the only one which is that way though.
It even allows to specify the level for the instruction prefetch too
(which is actually useful for say OcteonTX2 which has an interesting
cache hierarchy).


Just because the encodings are similar doesn't mean that the instructions are the same, although it's true that once you reach unification in the cache hierarchy the end behaviour /might/ be indistinguishable.

Really, Segher's point seems to be 'why overload the existing builtin for this'? It's not like the new parameter is something that users would really need to pass in as a run-time choice; and that wouldn't work anyway because in the end we do need distinct instructions.

R.

Though I agree it is a mistake to have one builtin which handles both
data and instruction prefetch.

Thanks,
Andrew


Which makes
sense anyway, since instruction prefetch and data prefetch have
completely different performance characteristics and considerations.
Maybe if you start with the mistake of having unified L1 caches it
seems natural, but thankfully most machines do not do that.


Segher

Reply via email to