On 02/10/2016 03:39 PM, Thomas Schwinge wrote:
Yes, we need a hammer that big: we have to ensure consistency between data regions on the device and code offloading to the device, as otherwise we'll very easily run into inconsistencies, because of the non-shared memory. In the general case, it's "all or nothing": you either have to offload all kernels or none of them.
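To make the inconsistency concrete, here is a minimal sketch in plain C that models non-shared memory as two separate copies of one array. It does not use OpenACC or any GCC internals: `struct region`, `enter_data`, `exit_data`, `add_one`, `times_two` and `run_region` are all hypothetical names invented for illustration, and the copy-in/copy-out behaviour is an assumed simplification of a data region.

```c
#include <string.h>

#define N 4

/* Hypothetical model of non-shared memory: the host and the device
   each hold their own copy of the array.  */
struct region
{
  int host[N];
  int device[N];
};

/* Entering the data region: copy the host data to the device.  */
static void
enter_data (struct region *r)
{
  memcpy (r->device, r->host, sizeof r->host);
}

/* Leaving the data region: copy the device data back, overwriting
   whatever the host copy holds at that point.  */
static void
exit_data (struct region *r)
{
  memcpy (r->host, r->device, sizeof r->device);
}

/* First kernel body, applied to whichever copy it is handed.  */
static void
add_one (int *a)
{
  for (int i = 0; i < N; i++)
    a[i] += 1;
}

/* Second kernel body, applied to whichever copy it is handed.  */
static void
times_two (int *a)
{
  for (int i = 0; i < N; i++)
    a[i] *= 2;
}

/* Run two kernels inside one data region.  If SECOND_ON_HOST is
   nonzero, the second kernel skips offloading and runs on the host
   copy instead.  Returns element 0 of the host array afterwards.  */
int
run_region (int second_on_host)
{
  struct region r = { .host = { 1, 1, 1, 1 } };

  enter_data (&r);
  add_one (r.device);		/* Kernel 1: offloaded.  */
  if (second_on_host)
    times_two (r.host);		/* Kernel 2 on host: sees stale data.  */
  else
    times_two (r.device);	/* Kernel 2: offloaded.  */
  exit_data (&r);		/* Host copy is overwritten here.  */

  return r.host[0];
}
```

In this model, `run_region (0)` returns 4, that is (1 + 1) * 2, while `run_region (1)` returns 2: the host-side kernel never sees the result of the first kernel, and its own result is discarded when the region ends.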
That's unfortunately not the impression I got from the earlier discussion. It also seems to imply that one unprofitable kernel would disable offloading for all the others, and IMO that is not acceptable. There need to be more compiler smarts to figure out whether offloading can safely be skipped for a given kernel.
Bernd