Hi!

I have tried a few things, and got things somewhat working, but I'm not satisfied with my results so far, so I'd like to ask for help. OpenACC specifics are not relevant to my question, which I'm thus formulating in a very generic way. (An illustrative example is at the end of this email.)

If attached to a declaration of function X (basically, as a function attribute), the OpenACC bind clause specifies that, when compiling for an offloading target, all calls to function X should be diverted to function Y, and the body of function X discarded. X remains the call target when compiling for the host. Y may differ per offloading target. In the generic case, Y is identified by an assembler name.

These requirements mandate an implementation in the LTO front end (which is the entry point for every offloading compiler), or later. Is the LTO front end the right place to do this? After read_cgraph_and_symbols, or somewhere else?

As the original X is not going to be used in the offloaded code (it's unreachable), my first thought was: for all decls (X) that have a bind (Y) clause attached, set the decl X's assembler name to Y's (using symtab->change_decl_assembler_name -- or gcc/varasm.c:set_user_assembler_name?). That somewhat works, but Y will then be compiled to X's name, and I saw problems if not only X's declaration but also its definition was available: we'd then get two function definitions with X's (assembler) name, and I didn't manage to discard only the original (unreachable) X definition while keeping its decl alive (with assembler name Y), which is still used at all call sites.

Maybe that's the wrong approach after all... I'm able to look up cgraph_node::get_for_asmname([Y]), and I tried experimenting with cgraph_node::create_alias and resolve_alias (in the LTO front end), but that hasn't been completely successful either: it worked when compiling with optimizations (Y even got inlined at the call site of X, good!), but it didn't work with -O0.

I found the redirect_callee and redirect_call_stmt_to_callee functions of cgraph_edge -- is that something I should be using? (Still in the LTO front end?) Or should I do this redirection after the LTO front end, in an early pass (execute_oacc_device_lower?)?
That is, for every current_function_decl, locate all calls to all functions tagged with a bind clause, and then rewrite the call sites to Y instead of X?

An illustrative example:

    #pragma acc routine
    int Y()
    {
      return 2;
    }

    #pragma acc routine bind(Y)
    int X()
    {
      return 1;
    }

    int main()
    {
      int ret;
    #pragma acc parallel copyout(ret)
      ret = X();
      return ret;
    }

If running with ACC_DEVICE_TYPE=host, this should return 1, and if running with ACC_DEVICE_TYPE=not_host, it should return 2.


Grüße
 Thomas