On Mon, Mar 8, 2021 at 4:46 PM Alex Coplan <alex.cop...@arm.com> wrote: > > On 08/03/2021 16:21, Richard Biener wrote: > > On Mon, Mar 8, 2021 at 4:14 PM Alex Coplan <alex.cop...@arm.com> wrote: > > > > > > On 08/03/2021 14:57, Richard Biener wrote: > > > > On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches > > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > > > > > Hi all, > > > > > > > > > > As discussed in the PR, we currently have two different numbering > > > > > schemes for SVE builtins: one for C, and one for C++. This is > > > > > problematic for LTO, where we end up getting confused about which > > > > > intrinsic we're talking about. This patch inserts placeholders into > > > > > the > > > > > registered_functions vector to ensure that there is a consistent > > > > > numbering scheme for both C and C++. > > > > > > > > > > Initially I tried using a null decl for these placeholders, but this > > > > > trips up some validation when the builtin trees are streamed into > > > > > lto1, > > > > > > > > Care to elaborate? Note we stream builtins by their function code > > > > and thus expect to be able to materialize them later - materializing > > > > to NULL obviously fails but then the issue is the proper decl isn't > > > > there > > > > in that case. Materializing a "dummy" decl just will delay the > > > > inevitable > > > > breakage. > > > > > > So the validation I was referring to is this part of > > > tree-streamer-in.c:unpack_ts_function_decl_value_fields: > > > > > > else if (cl == BUILT_IN_MD) > > > { > > > tree result = targetm.builtin_decl (fcode, true); > > > if (!result || result == error_mark_node) > > > fatal_error (input_location, "target specific builtin not > > > available"); > > > } > > > > > > Because we stream the original decl and reconstruct it, I was assuming > > > that we would be using this (real) node everywhere and we would never > > > need to map from function code to tree node. Indeed, grepping for > > > "targetm.builtin_decl", the only results are this use, plus a few uses > > > in the avr backend, so it seems that right now this doesn't matter for > > > AArch64 (unless I'm missing some other path). > > > > Hmm, I thought we were streaming those by reference - indeed > > this probably got changed when we stopped doing that for > > the middle-end builtins which we did because those are often > > "misrecognized". > > > > > However, it seems that returning a subtly incorrect result for this > > > taget hook is asking for trouble, and we should return the real node > > > here. > > > > Yeah, the intent of this target hook is exactly that. There's now of course > > the option to simply remove it if it just exists for the checking done above > > (need to check the avr backend, of course). I suppose the cost in the > > LTO IR stream isn't so big thus we can safely ditch the optimization > > forever? > > The option of just removing the target hook seems appealing, if that > would be acceptable. It looks like avr could easily be changed to just > call its internal implementation (avr_builtin_decl) instead of the > target hook.
OK, let's not rush things but do this for next stage1 if desired. Given you know how the target hook works you can also return integer_zero_node to make the code happy ... (just building fake decls feels somewhat odd) > > > > > Richard, do you have a feeling for how we should do this? Perhaps > > > instantiate real nodes (instead of placeholders) if in_lto_p? Or > > > unconditionally? > > > > > > Thanks, > > > Alex > > > > > > > > > > > > so I ended up building dummy FUNCTION_DECLs as placeholders. > > > > > > > > > > To get better test coverage here, I'm wondering if we could make some > > > > > of > > > > > the existing ACLE tests also run with LTO? Perhaps as a follow-on > > > > > patch? > > > > > For now I've just added the specific testcase from the PR. > > > > > > > > > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk? > > > > > > > > > > Thanks, > > > > > Alex > > > > > > > > > > --- > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > PR target/99216 > > > > > * config/aarch64/aarch64-sve-builtins.cc > > > > > (function_builder::add_function): Add placeholder_p argument, > > > > > build > > > > > placeholder decls if this is set. > > > > > (function_builder::add_unique_function): Instead of > > > > > conditionally adding > > > > > direct overloads, unconditionally add either a direct > > > > > overload or a > > > > > placeholder. > > > > > (function_builder::add_overloaded_function): Set > > > > > placeholder_p if we're > > > > > using C++ overloads. > > > > > (function_builder::add_overloaded_functions): Don't return > > > > > early for > > > > > m_direct_overloads: we need to add placeholders. > > > > > * config/aarch64/aarch64-sve-builtins.h > > > > > (function_builder::add_function): Add placeholder_p argument. > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > PR target/99216 > > > > > * g++.target/aarch64/sve/pr99216.C: New test.