Hi,

We're working on the design of offloading support in GCC (part of OpenMP 4), and I have a question regarding the libgomp part.

Suppose we expand '#pragma omp target' the same way we expand '#pragma omp parallel', i.e. the compiler transforms the following code:

    #pragma omp target
    {
      body;
    }

into this:

    void subfunction (void *data)
    {
      use data;
      body;
    }

    setup data;
    function_name = "subfunction";
    GOMP_offload (subfunction, &data, function_name);

GOMP_offload is a call into libgomp, which would be implemented roughly like this:

    void GOMP_offload (void (*fn)(void*), void *data, const char *fname)
    {
      if (gomp_offload_available ())
        {
          handler = gomp_upload_data (data);
          gomp_offload_call (fname, handler);
          gomp_download_data (&data, handler);
        }
      else
        {
          fn (data);
        }
    }

Routines such as gomp_upload_data and gomp_offload_call could, for example, use COI functions (see http://download-software.intel.com/sites/default/files/article/334766/intel-xeon-phi-systemssoftwaredevelopersguide_0.pdf) to perform the actual data marshalling and to call routines on the target side. Does this general scheme sound OK to you?

We'd probably want to be able to use the same compiler for different offload targets, so it's important to decide how we would invoke different implementations of these routines with the same compiler. One way to do it is to use the dlopen routines: we try to load, say, "libtargetiface.so". If that fails, we use default (dummy) implementations; otherwise we use the versions from the library. With this approach, along with libgomp.so we'll need a libtargetiface.so for each target we want to offload to.

Is this approach viable, or should it be done in some other way?

--
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.
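
P.S. To make the dlopen idea a bit more concrete, here is a minimal sketch of how the lookup with a dummy fallback might work. It is only an illustration under assumptions: the struct target_iface layout, the load_target_iface helper, the host_body example function and the choice of exported symbol names are all hypothetical, and nothing here is existing libgomp code. Only dlopen, dlsym and dlclose are real (POSIX) calls.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical table of target routines; the layout is an assumption.  */
struct target_iface
{
  int (*offload_available) (void);
  void *(*upload_data) (void *data);
  void (*offload_call) (const char *fname, void *handler);
  void (*download_data) (void *data, void *handler);
};

/* Dummy default: report that no offload target is present.  */
static int dummy_available (void) { return 0; }

/* The other entries are never called while offload_available returns 0.  */
static struct target_iface iface = { dummy_available, NULL, NULL, NULL };

/* Try to load a target library; keep the dummy defaults on any failure.
   Returns 1 if the library and all four symbols were found, else 0.
   The symbol names looked up here are assumed, not an agreed ABI.  */
static int load_target_iface (const char *libname)
{
  void *lib = dlopen (libname, RTLD_NOW | RTLD_LOCAL);
  if (lib == NULL)
    return 0;

  struct target_iface tmp;
  tmp.offload_available = (int (*) (void)) dlsym (lib, "gomp_offload_available");
  tmp.upload_data = (void *(*) (void *)) dlsym (lib, "gomp_upload_data");
  tmp.offload_call = (void (*) (const char *, void *)) dlsym (lib, "gomp_offload_call");
  tmp.download_data = (void (*) (void *, void *)) dlsym (lib, "gomp_download_data");

  if (!tmp.offload_available || !tmp.upload_data
      || !tmp.offload_call || !tmp.download_data)
    {
      dlclose (lib);   /* Incomplete library: fall back to the dummies.  */
      return 0;
    }

  iface = tmp;
  return 1;
}

/* Same shape as the GOMP_offload sketch above, going through the table.  */
void GOMP_offload (void (*fn) (void *), void *data, const char *fname)
{
  assert (fn != NULL);
  if (iface.offload_available ())
    {
      void *handler = iface.upload_data (data);
      iface.offload_call (fname, handler);
      iface.download_data (data, handler);
    }
  else
    fn (data);   /* Host fallback: run the outlined body locally.  */
}

/* Example host body used to exercise the fallback path.  */
static void host_body (void *data) { *(int *) data += 1; }
```

If load_target_iface ("libtargetiface.so") fails, a later GOMP_offload (host_body, &x, "subfunction") simply runs host_body on the host. On older glibc versions this needs to be linked with -ldl.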