This patch optimizes GOMP_MAP_TO_PSET in libgomp by installing the remapped pointer to the array data directly in the PSET, instead of uploading it separately with GOMP_MAP_POINTER. Effectively this eliminates the GOMP_MAP_POINTER that is associated with the PSET, thereby eliminating an additional host2dev data transfer.
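To make the idea concrete, here is a rough sketch in plain C. It is not the libgomp code: `struct pset`, `host2dev`, `device_mem` and both `map_pset_*` functions are made-up names, and the "device" is just a host buffer, so the only thing it shows is the difference in the number of transfers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an array descriptor (the "pointer set").  */
struct pset
{
  void *data;     /* Pointer to the array data.  */
  size_t nelems;  /* Placeholder for the remaining descriptor fields.  */
};

/* Simulated device memory plus a counter of host-to-device transfers.
   Both are assumptions made for this sketch only.  */
static unsigned char device_mem[256];
static int host2dev_count;

static void
host2dev (size_t dev_off, const void *src, size_t n)
{
  assert (dev_off + n <= sizeof device_mem);
  memcpy (device_mem + dev_off, src, n);
  host2dev_count++;
}

/* Baseline: upload the PSET as-is, then fix up its data pointer on the
   device with a second, pointer-sized transfer.  */
static void
map_pset_two_transfers (struct pset *p, size_t pset_off, void *dev_data)
{
  host2dev (pset_off, p, sizeof *p);
  host2dev (pset_off + offsetof (struct pset, data),
            &dev_data, sizeof dev_data);
}

/* Optimized: install the remapped pointer in the host copy, upload the
   PSET once, then restore the host copy.  Between the first and last
   statement the host PSET holds a device address.  */
static void
map_pset_one_transfer (struct pset *p, size_t pset_off, void *dev_data)
{
  void *saved = p->data;

  p->data = dev_data;
  host2dev (pset_off, p, sizeof *p);
  p->data = saved;
}
```

Calling either function with the same descriptor leaves the same bytes in `device_mem`; the second one just gets there with one call to `host2dev` instead of two.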
While it does work, it's not quite as effective as I had hoped it would be. I'm only observing an improvement of about 0.05s, if that, in CloverLeaf, and arguably that's statistical noise. This is probably because CloverLeaf makes use of ACC DATA regions in the critical sections, so all of those PSETs and POINTERs are already present on the accelerator.

One thing I don't like about this patch is that I'm updating the host's copy of the PSET prior to uploading it. The host's PSET does get restored prior to returning from gomp_map_vars; however, another host thread that reads the PSET in the meantime would see the device pointer, so this might cause problems in multi-threaded applications.

Maybe I'll drop this patch from gomp4 since it's not very effective.

Cesar