Hi Jakub, thanks for the prompt review. On 2020/10/23 8:13 PM, Jakub Jelinek wrote:
In general, upon encountering a construct, we can't statically determine and insert alloc/release maps for each element of a structure variable, since we don't really know which region of the structure is currently mapped or not, hence this probably can't be properly implemented in the compiler.Instead this patch tries to do the equivalent in the runtime: I've modified the handling of the (GOMP_MAP_STRUCT, <field-map1>, <field-map2>, ...) sequence to: (1) Create just a single splay_tree_key to represent the entire structure's mapped-region (all element target_var_desc's now reference this same key instead of creating their own), andI'm not sure that is what we want. If we create just a single splay_tree_key spanning the whole structure mapped region, then we can't diagnose various mapping errors. E.g. if I have: void bar (struct S *); struct S { int a, b, c, d, e; }; void foo (struct S s) { #pragma omp target data map(tofrom: s.b, s.d) #pragma omp target map (s.b, s.c) bar (&s); } then target data maps the &s.b to &s.d + 1 region of the struct, but s.c wasn't mapped and so the target region's mapping should fail, even when it is in the middle of the mapped region.
Are you really sure this is what we want? I don't quite see anything harmful about implicitly mapping "middle fields" like s.c, in fact the corresponding memory is actually "mapped" anyways.
The structure mapping wording was written in a way to give implementations a choice, either map the whole struct (space inefficient), or the region from the first to last element in the struct the needs mapping (what GCC implements, also space inefficient, but less so), or only map the fields individually and somehow remap all uses of the struct in the region (I think that is only theoretically possible if one can analyze the whole target region and rewrite anything that could refer to it in there).
That seems to imply that rejecting "middle fields" are not really required behavior.
So, I'd think instead of having just one splay_tree_key, we need multiple (we could merge adjacent ones though if we want) but we need some way to tie them together (e.g. represent them as one master entry (perhaps the first one) and slaves entries and use the refcount of the master entry for all of them.
I did think of that route before, but it's just too complex and unwieldly compared to an elegant solution like using one single splay_tree_key. I can try to think more about a "composite-key" like design, but please reconsider the current patch. It's already very close to the 5.0 spec, with what you mention not really "that" large an issue. Thanks, Chung-Lin
