https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121760
--- Comment #16 from Benjamin Schulz <schulz.benjamin at googlemail dot com> ---
Sorry for so many attachments. If I have not forgotten anything, you should now be able to reproduce this strange message:

libgomp: Trying to map into device [0x7ffdd1ebaac0..0x7ffdd1ebaaf0) object when [0x7ffdd1ebaaa0..0x7ffdd1ebaad0) is already mapped

by building with the CMakeLists.txt and then executing arraytest.

Given that libgomp has a reference counter, I don't really understand this bug. I also honestly don't understand why the start and end points of the addresses differ when the same objects are mapped twice...

There is something else of interest. In gpu_mathfunctions.h, on line 338, I have the following micro kernel between two large teams distribute parallel for loops:

    #pragma omp target map(to:tmp) map(from:temp4) device(dev)
    {
        temp4 = A(c, c) - tmp;
        temp4 = sqrt(temp4);
        L(c, c) = temp4;
    }

This works, but I thought: why start an entire kernel for this? So I tried to get the values on the host with omp_target_memcpy instead. Removing the micro kernel and uncommenting the following snippet above line 338, which should mimic what operator() does, also fails, even if you set "#pragma omp requires unified_address" in the header, which should enable pointer arithmetic with device pointers...

    // size_t offset_A = c * A.dpstrides[0] + c * A.dpstrides[1]; // host-side
    // size_t offset_L = c * L.dpstrides[0] + c * L.dpstrides[1];
    // T* Adevptr = (T*)omp_get_mapped_ptr(A.dpdata, dev);
    // T* Ldevptr = (T*)omp_get_mapped_ptr(L.dpdata, dev);
    // T Acc = 0;
    // omp_target_memcpy(&Acc, Adevptr, sizeof(T), 0, offset_A, omp_get_initial_device(), dev);
    //
    // temp4 = Acc - tmp;
    // temp4 = sqrt(temp4);
    //
    // omp_target_memcpy(Ldevptr, &temp4, sizeof(T), offset_L, 0, dev, omp_get_initial_device());
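
One thing that may be worth checking in the host-side variant, offered as an assumption rather than a diagnosis: omp_target_memcpy takes dst_offset and src_offset in bytes, while c * dpstrides[0] + c * dpstrides[1] looks like an element index. If dpstrides counts elements (not confirmed here), the offsets would need to be scaled by sizeof(T). A minimal sketch of that scaling follows. The helper names byte_offset, host_device and fetch_element are hypothetical and not part of the attached sources, and the non-OpenMP fallback exists only so the sketch also compiles without -fopenmp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: byte offset of element (row, col), ASSUMING the
// strides are given in elements, as dpstrides appears to be.
template <typename T>
std::size_t byte_offset(std::size_t row, std::size_t col,
                        std::size_t stride0, std::size_t stride1) {
    return (row * stride0 + col * stride1) * sizeof(T);
}

// Hypothetical helper: device number of the host.
inline int host_device() {
#ifdef _OPENMP
    return omp_get_initial_device();
#else
    return 0;  // placeholder when built without OpenMP
#endif
}

// Hypothetical helper: copy one T from src_base + src_byte_offset
// (located on src_dev) into the host variable *dst.
// Returns 0 on success, like omp_target_memcpy.
template <typename T>
int fetch_element(T* dst, T* src_base, std::size_t src_byte_offset,
                  int src_dev) {
    assert(dst != nullptr && src_base != nullptr);
#ifdef _OPENMP
    // Offsets are passed in bytes; the destination offset is 0 because
    // dst already points at the target variable.
    return omp_target_memcpy(dst, src_base, sizeof(T),
                             0, src_byte_offset,
                             host_device(), src_dev);
#else
    // Host-only fallback so the sketch builds without -fopenmp.
    (void)src_dev;
    std::memcpy(dst,
                reinterpret_cast<const char*>(src_base) + src_byte_offset,
                sizeof(T));
    return 0;
#endif
}
```

With these helpers, the read in the snippet would become something like fetch_element(&Acc, Adevptr, byte_offset<T>(c, c, A.dpstrides[0], A.dpstrides[1]), dev), and the write back to L would scale offset_L the same way.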