Anastasia added a comment. In https://reviews.llvm.org/D30810#710069, @jaykang10 wrote:
> > I believe the argument lacks numbers (or at least you have them but didn't > > mention). I didn't hear about performance results, or validation that this > > was actually tested for correctness. Small test cases prove a point, but > > can't be considered general. > > > > OTOH, it seems like this is exactly why you want the flag, to hide any > > potential issues and play with it. I'm not opposed to adding the flag if > > there's a commitment to actually get the result and change the default to > > whatever seems to be better, does that seems reasonable? > > To be honest, I did not start this discussion to get better performance for > GPU. But the goal has moved while we discussing it. I think I have showed you > the gain from AMD GPU target on previous comments. On current version of > clang/llvm, it generates one less load/store to preserve vec3 type instead of > vec4 type on GPU target. But it could cause more load/store on another > targets which have vector register because they usually expect aligned vector > load/store. It means that if we want to change default to vec3 on clang, we > need to change llvm's codegen's default legalization or backend targets need > to be modified in order to get aligned vector load/store with vec3 type. I > think It is too big change at this moment. In the future, we could move this > discussion to llvm-dev. Up to that time, as you mentioned, I would like to > hide the potential issues with vec3 type and play with it. Just to summarize. There seems to be no issue in adding this feature by a special flag considering that changing default behaviors requires changing the backend that we might consider at some later point of time. We can start a separate discussion on that. The change can be used to generate more generic and optimal IR resulting in 1/4 less instructions generated in the final binary per each load and store of vec2 types in some architectures (e.g. AMD) . I am not sure what kind of further numbers are needed and if there is any measuring infrastructure LLVM project provides for that. Considering all the above I don't feel there is any reason this can't be committed. https://reviews.llvm.org/D30810 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits