Anastasia added a comment.

In https://reviews.llvm.org/D30810#702443, @bruno wrote:

> > As a result, I think it would be good for clang to have both features, 
> > and I would like to stick with the option "-fpreserve-vec3" so that the 
> > behavior can be changed easily.
>
> The motivation doesn't seem solid to me. Who else is going to benefit from 
> this flag?


There are some out-of-tree implementations that would benefit. But in the 
case of AMD GPUs, 3 loads/stores will be produced instead of 4, which sounds 
like a good optimization to me. As I said in my previous comment, I think it 
should have been the default behavior from the beginning, but since a 
different implementation landed first, we can integrate this one now behind 
an additional option.


https://reviews.llvm.org/D30810



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
