Hi, For the following test (adapted from pr96390.c): __attribute__((noipa)) int foo () { return 42; } int bar () __attribute__((alias ("foo"))); int baz () __attribute__((alias ("bar")));
int main () { int n; #pragma omp target map(from:n) n = baz (); return n; } Compiling with -fopenmp -foffload=nvptx-none -foffload=-malias -foffload=-mptx=6.3 results in: ptxas fatal : Internal error: alias to unknown symbol nvptx-as: ptxas returned 255 exit status nvptx mkoffload: fatal error: ../../install/bin/aarch64-unknown-linux-gnu-accel-nvptx-none-gcc returned 1 exit status compilation terminated. lto-wrapper: fatal error: /home/prathameshk/gnu-toolchain/gcc/grcogcc-38/install/libexec/gcc/aarch64-unknown-linux-gnu/15.0.0//accel/nvptx-none/mkoffload returned 1 exit status compilation terminated. This happens because ptx code-gen shows: // BEGIN GLOBAL FUNCTION DEF: foo .visible .func (.param.u32 %value_out) foo { .reg.u32 %value; mov.u32 %value, 42; st.param.u32 [%value_out], %value; ret; } .visible .func (.param.u32 %value_out) bar; .alias bar,foo; .visible .func (.param.u32 %value_out) baz; .alias baz,bar; .alias baz, bar is invalid since PTX requires aliasee to be a defined function: https://sw-docs-dgx-station.nvidia.com/cuda-latest/parallel-thread-execution/latest-internal/#kernel-and-function-directives-alias The patch uses cgraph_node::get(name)->ultimate_alias_target () instead of the provided value in nvptx_asm_output_def_from_decls. For the above case, it now generates the following ptx: .alias baz,foo; instead of: .alias baz,bar; which fixes the issue. Does the patch look in the right direction ? Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com> Thanks, Prathamesh
[nvptx] Fix code-gen for alias attribute. For the following test (adapted from pr96390.c): __attribute__((noipa)) int foo () { return 42; } int bar () __attribute__((alias ("foo"))); int baz () __attribute__((alias ("bar"))); int main () { int n; #pragma omp target map(from:n) n = baz (); return n; } gcc emits following ptx for baz: .visible .func (.param.u32 %value_out) bar; .alias bar,foo; .visible .func (.param.u32 %value_out) baz; .alias baz,bar; which is incorrect since PTX requires aliasee to be a defined function. The patch instead uses cgraph_node::get(name)->ultimate_alias_target, which generates the following PTX: .visible .func (.param.u32 %value_out) baz; .alias baz,foo; gcc/ChangeLog: * config/nvptx/nvptx.cc (nvptx_asm_output_def_from_decls): Use cgraph_node::get(name)->ultimate_alias_target instead of value. Signed-off-by: Prathamesh Kulkarni <prathame...@nvidia.com> diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc index 2a8f713c680..9688b0e6f2d 100644 --- a/gcc/config/nvptx/nvptx.cc +++ b/gcc/config/nvptx/nvptx.cc @@ -7583,7 +7583,8 @@ nvptx_mem_local_p (rtx mem) while (0) void -nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value) +nvptx_asm_output_def_from_decls (FILE *stream, tree name, + tree value ATTRIBUTE_UNUSED) { if (nvptx_alias == 0 || !TARGET_PTX_6_3) { @@ -7618,7 +7619,8 @@ nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value) return; } - if (!cgraph_node::get (name)->referred_to_p ()) + cgraph_node *cnode = cgraph_node::get (name); + if (!cnode->referred_to_p ()) /* Prevent "Internal error: reference to deleted section". */ return; @@ -7627,8 +7629,10 @@ nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value) fputs (s.str ().c_str (), stream); tree id = DECL_ASSEMBLER_NAME (name); + symtab_node *alias_target_node = cnode->ultimate_alias_target (); + tree alias_target_id = DECL_ASSEMBLER_NAME (alias_target_node->decl); NVPTX_ASM_OUTPUT_DEF (stream, IDENTIFIER_POINTER (id), - IDENTIFIER_POINTER (value)); + IDENTIFIER_POINTER (alias_target_id)); } #undef NVPTX_ASM_OUTPUT_DEF