> On Sep 10, 2015, at 6:29 PM, Richard Smith <rich...@metafoo.co.uk> wrote: > > On Thu, Sep 10, 2015 at 5:57 PM, Michael Zolotukhin <mzolotuk...@apple.com > <mailto:mzolotuk...@apple.com>> wrote: > mzolotukhin added inline comments. > > ================ > Comment at: docs/LanguageExtensions.rst:1802-1807 > @@ +1801,8 @@ > + > +For example, on AArch64 in the following code:: > + > + LDR X1, [X2] > + LDNP X3, X4, [X1] > + > +the ``LDNP`` might be executed before the ``LDR``. In this case the load > would > +be performed from a wrong address (see 6.3.8 in `Programmer's Guide for > ARMv8-A > ---------------- > rsmith wrote: > > This seems to make the feature essentially useless, since you cannot > > guarantee that the address register is set up sufficiently far before the > > non-temporal load. Should the compiler not be required to insert the > > necessary barrier itself in this case? > Yes, we can require targets to only use corresponding NT instructions when > it's safe, and then remove this remark from the documentation. For ARM64 that > would mean either not to emit LDNP at all, or conservatively emit barriers > before each LDNP (which probably removes all performance benefits of using > it) - that is, yes, non-temporal loads would be useless on this target. > > I think this should already be the case -- according to the definition of > !nontemporal in the LangRef > (http://llvm.org/docs/LangRef.html#load-instruction > <http://llvm.org/docs/LangRef.html#load-instruction>), using an LDNP without > an accompanying barrier would not be correct on AArch64, as it does not have > the right semantics. I removed the paragraph in updated patch. > > But I think we want to keep the builtin for NT-load, as it's a generic > feature, not ARM64 specific. It can be used on other targets - e.g. we can > use this in x86 stream builtins, and hopefully simplify their current > implementation. I don't know about non-temporal operations on other targets, > but if there are others, they can use it too right out of the box. > > Yes, I'm not arguing for removing the builtin, just that the AArch64 backend > needs to be very careful when mapping it to LDNP, because that will > frequently not be correct. Yes, that's true, and that was the reason why we only implemented STNP for now.
Michael
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits