carlo.bertolli added a comment.

Hi
If I understand the problem correctly, I would like to add something on top of 
Samuel's comment.
My understanding is that Alexey is suggesting that we pass a reference type to 
kernels for every pointer input parameter, regardless of the actual type and 
data sharing attribute of the related variable. Ignore the remainder of this 
comment if this is not the case.

In my view, this violates the basic rationale for which target firstprivate 
was introduced in the OpenMP specification: to avoid having to pass to GPU 
kernels a reference to every pointer variable. In fact, having to pass a 
reference results in ptxas generating an additional register for each input 
parameter. We saw significant performance differences in this respect, and 
reported on this in a paper at SC15. This has been reported by various 
members of the OpenMP committee multiple times as the reason why the target 
firstprivate logic is defined the way it is.
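
To make the two calling conventions concrete, here is a rough sketch in plain 
C++. The two functions are hypothetical, hand-written stand-ins for the 
outlined kernel entry points; their names and signatures are my own 
assumptions for illustration and are not what Clang actually emits. The first 
takes the pointer by value, as the firstprivate rule allows; the second takes 
a reference to the pointer, so the kernel has to load the pointer through one 
more level of indirection before it can touch the data.

```cpp
#include <cstddef>

// Hypothetical stand-in for a kernel whose pointer argument is passed by
// value (the firstprivate-style interface): the kernel receives the address
// stored in the pointer directly.
void kernel_by_value(int *data, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    data[i] += 1;
}

// Hypothetical stand-in for a kernel whose pointer argument is passed by
// reference: the kernel receives the location of the pointer variable and
// must read the pointer from it before accessing the data.
void kernel_by_reference(int *&data, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    data[i] += 1;
}
```

On the host both versions compute the same result; the point of the sketch is 
only the shape of the parameter list, which is where the extra register per 
input parameter mentioned above comes from.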

In conclusion, I do not see this patch as an optimization, but as a way of 
correctly implementing what the OpenMP specification clearly states. Any other 
implementation is wrong, not a simpler, non-optimized version.


http://reviews.llvm.org/D18110


