On 9/6/24 11:30 AM, Jin Ma wrote:
When we use -flto, the rvv function list is generated twice, once
in the cc1 phase and once in the lto phase. However, because the two
phases generate it by different methods, the two lists differ.
For example, when there is no zvfh or zvfhmin in arch, the cc1 phase
generates the list by calling the function "riscv_pragma_intrinsic".
Since TARGET_VECTOR_ELEN_FP_16 is enabled before rvv function
generation, a list of rvv functions related to float16 is generated. In
the lto phase, the rvv function list is generated only by calling
the function "riscv_init_builtins", but the TARGET_VECTOR_ELEN_FP_16
is disabled, so that the float16-related rvv function list cannot
be generated as in cc1. This causes confusion: the inconsistent
fcode makes the lto phase match the wrong function, eventually
leading to an ICE.
So I think the two generated lists should be kept consistent, which
is exactly what this patch does.
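
The mismatch described above can be illustrated with a toy model. This is
only a sketch: it assumes a function code behaves like a plain index into
the registration list, and every function name in it is hypothetical rather
than a real rvv intrinsic. GCC's actual fcode encoding is more involved.

```python
# Toy model (not GCC code): a function code is treated as an index
# into the list of registered functions. All names are hypothetical.
BASE = ["vadd_i32", "vsub_i32"]
FP16 = ["vfadd_f16", "vfsub_f16"]      # only registered when fp16 is enabled
TAIL = ["vle32", "vse32", "vmv"]


def register(enable_fp16):
    """Build the function list the way one compilation phase would."""
    names = list(BASE)
    if enable_fp16:
        names += FP16
    names += TAIL
    return names


cc1_list = register(enable_fp16=True)    # cc1 phase: fp16 entries present
lto_list = register(enable_fp16=False)   # lto phase: fp16 entries missing

# The code recorded in cc1 for "vle32" ...
fcode = cc1_list.index("vle32")
# ... points at a different function once the lto list is shorter.
resolved = lto_list[fcode]
print(fcode, "->", resolved)
```

In this model, every function registered after the float16 group is looked
up at the wrong position in the lto phase, which is why the two lists need
to be built the same way.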
But there is still a problem here. If we use "-fchecking", we still
get an ICE. This is because in the lto phase, after the rvv function
list is generated and before expand_builtin, ggc_grow is called to
clean up memory, so "(* registered_functions)[code]->decl" is
cleaned up to "<ggc_freed 0x7ffff6830c00>", which finally leads to
an ICE.
I think this is wrong and needs to be fixed. Maybe we shouldn't
use "ggc_alloc<registered_function> ()", or is there a better
way to implement it?
In general allocating things with the collector API is safe.
But it's ultimately a garbage collector, so if the object is not
reachable via the registered GC roots, then it'll get collected. This
is the most common issue that folks run into.
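
To make the reachability point concrete, here is a minimal mark-and-sweep
sketch. It is a toy model, not GCC's ggc implementation: the ToyHeap class
and the object names are hypothetical, and in GCC itself roots are declared
through GTY markers rather than an explicit list.

```python
# Toy mark-and-sweep collector (not GCC's ggc): an object survives a
# collection only if it is reachable from a registered root.
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references the collector follows
        self.freed = False


class ToyHeap:
    def __init__(self):
        self.objects = []     # everything allocated through the heap
        self.roots = []       # registered roots

    def alloc(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def collect(self):
        # Mark: walk everything reachable from the roots.
        marked = set()
        stack = list(self.roots)
        while stack:
            obj = stack.pop()
            if id(obj) in marked:
                continue
            marked.add(id(obj))
            stack.extend(obj.refs)
        # Sweep: anything left unmarked is freed.
        for obj in self.objects:
            if id(obj) not in marked:
                obj.freed = True
        self.objects = [o for o in self.objects if not o.freed]


heap = ToyHeap()
table = heap.alloc("function_table")
heap.roots.append(table)              # the table is a registered root

rooted = heap.alloc("rfn_reachable")
table.refs.append(rooted)             # reachable through the root

stray = heap.alloc("rfn_unrooted")    # held only by a variable the heap cannot see

heap.collect()
print(rooted.freed, stray.freed)
```

The stray object is still referenced by ordinary code, but because the
collector cannot reach it from a root it is freed anyway, which matches the
"ggc_freed" symptom in the report.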
jeff