Hi Kwok, hi Jakub, hi all, some first comments based on both playing around and reading the patch - and some generic comments to any patch reader.
In general, the patch looks good. I just observe: * There is an issue with [[omp::decl(...)]]' * <clause>(<boolean expression>) - there is a C/C++ inconsistency in what is expected; it possibly affects more such conditions * Missed optimization for the host? * Bunch of minor comments On 08.10.23 15:13, Kwok Cheung Yeung wrote:
This patch adds support for the 'indirect' clause in the 'declare target' directive in C/C++ (Fortran to follow) and adds the necessary infrastructure to support indirect calls in target regions. This allows one to pass in pointers to functions that have been declared as indirect from the host to the target, then invoked via the passed-in pointer on the target device. [...] The C++ support is currently limited to normal indirect calls - virtual calls on objects do not currently work. I believe the main issue is that the vtables are not currently copied across to the target. I have added some handling for OBJ_TYPE_REF to prevent the compiler from ICEing when it encounters a virtual call, but without the vtable this cannot work properly.
Side remark: Fortran polymorphic variables are similar. For them also a vtable needs to be copied. (For vtables, see also comment to 'libgomp.texi' far below.) * * * C++11 (and C23) attribute do not seem to be properly handled: [[omp::decl (declare target,indirect(1))]] int foo(void) { return 5; } [[omp::decl (declare target indirect)]] int bar(void) { return 8; } [[omp::directive (begin declare target,indirect)]]; int baz(void) { return 11; } [[omp::directive (end declare target)]]; While I get for the last one ("baz"): __attribute__((omp declare target, omp declare target block, omp declare target indirect)) the first two (foo and bar) do not have any attribute; if I remove the "indirect", I do get "__attribute__((omp declare target))". Hence, the omp::decl support seems to partially work. NOTE: C23 omp:: attribute support is still WIP and not yet in mainline. Recent draft: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633007.html The following works - but there is not a testcase for either syntax: int bar(void) { return 8; } [[omp::directive(declare target to(bar) , indirect(1))]]; int baz(void) { return 11; } [[omp::directive ( declare target indirect enter(baz))]]; int bar(void) { return 8; } #pragma omp declare target to(bar) , indirect(1) int baz(void) { return 11; } #pragma omp declare target indirect enter(baz) (There is one for #pragma + 'to' in gomp/declare-target-indirect-2.c, however.) Side remark: OpenMP 5.2 renamed 'to' to 'enter' (with deprecated alias 'to); hence, I also use 'enter' above. The current testcases for indiredt use 'enter'. (Not that it should make a difference as the to/enter do work.) The following seems to work fine, but I think we do not have a testcase for it ('bar' has no indirect, foo and baz have it): #pragma omp begin declare target indirect(1) int foo(void) { return 5; } #pragma omp begin declare target indirect(0) int bar(void) { return 8; } int baz(void) { return 11; } #pragma omp declare target indirect enter(baz) #pragma omp end declare target #pragma omp end declare target * * * Possibly affecting other logical flags as well, but I do notice that gcc but not g++ accepts the following: #pragma omp begin declare target indirect("abs") #pragma omp begin declare target indirect(5.5) g++ shows: error: expected constant integer expression OpenMP requires 'constant boolean' expr (OpenMP 5.1) or 'expression of logical type','constant' (OpenMP 5.2), where for the latter it has: "The OpenMP *logical type* supports logical variables and expressions in any base language. "[C / C++] Any OpenMP logical expression is a scalar expression. This document uses true as a generic term for a non-zero integer value and false as a generic term for an integer value of zero." I am not quite sure what to expect here; in terms of C++, conv.bool surely permits those for those pvalues "Boolean conversions". For C, I don't find the wording in the standard but 'if("abc")' and 'if (5.5)' is accepted. * * * I notice that the {__builtin_,}GOMP_target_map_indirect_ptr call is inserted quite late, i.e. in omp-offload.cc. A dump and also looking at the *.s files shows that the __builtin_GOMP_target_map_indirect_ptr / call GOMP_target_map_indirect_ptr do not only show up for the device but also for the host-fallback code. I think the latter is not required as a host pointer can be directly executed on the host - and device -> host pointer like in omp target device(ancestor:1) do not need to be supported. Namely the current glossary (here git version but OpenMP 5.2 is very similar); note the "other than the host device": "indirect device invocation - An indirect call to the device version of a procedure on a device other than the host device, through a function pointer (C/C++), a pointer to a member function (C++) or a procedure pointer (Fortran) that refers to the host version of the procedure. Can't we use #ifdef ACCEL_COMPILER to optimize the host fallback? That way, we can also avoid generating the splay-tree on the host cf. LIBGOMP_OFFLOADED_ONLY. * * * #pragma omp begin declare target indirect(1) device_type(host) is accepted but it violates: OpenMP 5.1: "Restrictions to the declare target directive are as follows:" "If an indirect clause is present and invoked-by-fptr evaluates to true then the only permitted device_type clause is device_type(any)" [215:1-2] In OpenMP 5.2 that's in "7.8.3 indirect Clause" itself. * * * OpenMP permits pointers to member functions. Can you also a test for those? I bet it simply works but we should still test those. (For vtables, see also comment below.) class Foo { public: int f(int x); }; typedef int (Foo::*FooFptr)(int x); ... int my_call(Foo &foo) { FooFptr fn_ptr = &Foo::f; ... return std::invoke(fn_ptr, foo, 42); } * * * Side remarks for patch readers: Besides the existing offload functions/variables tables, a new indirect-functions table is created; functions there do aren't added to the function table (in the compiler-generated code). While the code obviously affects the performance, it does not hamper optimizations such as (modified libgomp.c-c++-common/declare-target-indirect-1.c): #pragma omp target map (from: x) { int (*foo_ptr) (void) = &foo; int (*bar_ptr) (void) = &bar; int (*baz_ptr) (void) = &baz; x = (*foo_ptr) () + (*bar_ptr) () + (*baz_ptr) (); } which still gives with -O1: _8 = *.omp_data_i_7(D).x; *_8 = 24; There was some discussion at OpenMP spec level to avoid the overhead via new assumptions which tell that 'indirect' can or cannot be encountered. That was a more resent side discussion of the generic questions in my Issue #3540. I think in general there aren't that many function pointers around and if the extra function call happens late enough (which seems to be the case, see above), it shouldn't be a real problem in most code, also because the indirect-reverse-lookup table is short and hopefully there won't be too many repeated lookups with either function pointers or C++ classes/Fortran polymorphic calls. * * *
--- a/gcc/omp-offload.cc +++ b/gcc/omp-offload.cc } + if (omp_redirect_indirect_calls + && gimple_call_fndecl (stmt) == NULL_TREE) + { + gcall *orig_call = dyn_cast <gcall *> (stmt); + tree call_fn = gimple_call_fn (stmt); + tree map_ptr_fn + = builtin_decl_explicit (BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR);
This line is too long. Maybe use a 'enum built_in_function' temporary?
--- a/libgomp/libgomp.texi +++ b/libgomp/libgomp.texi @@ -304,7 +304,7 @@ The OpenMP 4.5 specification is fully supported. @item Iterators in @code{target update} motion clauses and @code{map} clauses @tab N @tab @item Indirect calls to the device version of a procedure or function in - @code{target} regions @tab N @tab + @code{target} regions @tab P @tab Only C and C++
I think we need a new entry to handle the virtual part. However, it looks as if that's a new OpenMP 5.2 feature. Can you add an entry under "Other new OpenMP 5.2 features2? At least I cannot find any existing entry and I only see in OpenMP 5.2: "Invoking a virtual member function of an object on a device other than the device on which the object was constructed results in unspecified behavior, unless the object is accessible and was constructed on the host device." [OpenMP 5.2, 287:10-12] in "Restrictions to the target construct".
--- a/libgomp/target.c +++ b/libgomp/target.c @@ -2256,11 +2256,14 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, void **host_funcs_end = ((void ***) host_table)[1]; void **host_var_table = ((void ***) host_table)[2]; void **host_vars_end = ((void ***) host_table)[3]; + void **host_ind_func_table = ((void ***) host_table)[4]; + void **host_ind_funcs_end = ((void ***) host_table)[5];
This code assumes that all calls have now 6 arguments. But that's not true for old code. It seems as if you have to bump the version number and only access those values when the version number is sufficiently large. Thanks, Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955