> On Sep 10, 2015, at 6:29 PM, Richard Smith <rich...@metafoo.co.uk> wrote:
> 
> On Thu, Sep 10, 2015 at 5:57 PM, Michael Zolotukhin <mzolotuk...@apple.com 
> <mailto:mzolotuk...@apple.com>> wrote:
> mzolotukhin added inline comments.
> 
> ================
> Comment at: docs/LanguageExtensions.rst:1802-1807
> @@ +1801,8 @@
> +
> +For example, on AArch64 in the following code::
> +
> +  LDR X1, [X2]
> +  LDNP X3, X4, [X1]
> +
> +the ``LDNP`` might be executed before the ``LDR``. In this case the load 
> would
> +be performed from a wrong address (see 6.3.8 in `Programmer's Guide for 
> ARMv8-A
> ----------------
> rsmith wrote:
> > This seems to make the feature essentially useless, since you cannot 
> > guarantee that the address register is set up sufficiently far before the 
> > non-temporal load. Should the compiler not be required to insert the 
> > necessary barrier itself in this case?
> Yes, we can require targets to only use corresponding NT instructions when 
> it's safe, and then remove this remark from the documentation. For ARM64 that 
> would mean either not to emit LDNP at all, or conservatively emit barriers 
> before each LDNP (which probably removes all performance benefits of using 
> it) - that is, yes, non-temporal loads would be useless on this target.
> 
> I think this should already be the case -- according to the definition of 
> !nontemporal in the LangRef 
> (http://llvm.org/docs/LangRef.html#load-instruction 
> <http://llvm.org/docs/LangRef.html#load-instruction>), using an LDNP without 
> an accompanying barrier would not be correct on AArch64, as it does not have 
> the right semantics.
I removed the paragraph in updated patch.
>  
> But I think we want to keep the builtin for NT-load, as it's a generic 
> feature, not ARM64 specific. It can be used on other targets - e.g. we can 
> use this in x86 stream builtins, and hopefully simplify their current 
> implementation. I don't know about non-temporal operations on other targets, 
> but if there are others, they can use it too right out of the box.
> 
> Yes, I'm not arguing for removing the builtin, just that the AArch64 backend 
> needs to be very careful when mapping it to LDNP, because that will 
> frequently not be correct.
Yes, that's true, and that was the reason why we only implemented STNP for now.

Michael

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to