On Tue, 22 Aug 2017, Richard Biener wrote:

> On August 22, 2017 6:38:55 PM GMT+02:00, "Richard Earnshaw (lists)" 
> <richard.earns...@arm.com> wrote:
> >On 22/08/17 13:55, Kyrill Tkachov wrote:
> >> Hi Richard,
> >> [roping in more aarch64 maintainers]
> >> 
> >> On 22/08/17 13:27, Richard Biener wrote:
> >>> On Tue, 22 Aug 2017, Uros Bizjak wrote:
> >>>
> >>>> On Tue, Aug 22, 2017 at 12:15 PM, Richard Biener
> ><rguent...@suse.de>
> >>>> wrote:
> >>>>> The following patch fixes PR81921 (and LTO build of libgo) which I
> >ran
> >>>>> into when trying to enable free-lang-data for non-LTO compiles.
> >>>>>
> >>>>> free-lang-data forces a DECL_FUNCTION_SPECIFIC_TARGET for all
> >functions
> >>>>> so we have them ending up with target_option_default_node
> >eventually
> >>>>> which is something ix86_can_inline_p doesn't expect (I tried
> >forcing
> >>>>> a compare of the actual options but that fails as well as we get
> >>>>> spurious differences in use-fpmath, the default node with -m32
> >>>>> -march=x86_64 doesn't have it while non-default nodes have it).
> >>>>>
> >>>>> The patch is what I consider safe for branches, we might want to
> >work
> >>>>> on sth better (actually comparing always and fixing the fpmath
> >issue)
> >>>>> on trunk as followup.
> >>>>>
> >>>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu, ok for
> >trunk
> >>>>> and active branches?
> >>>>>
> >>>>> Note the change to the ref = false conditional isn't strictly
> >necessary
> >>>>> but it makes -flto and non-flto behave consistently.
> >>>>>
> >>>>> Thanks,
> >>>>> Richard.
> >>>>>
> >>>>> 2017-08-22  Richard Biener  <rguent...@suse.de>
> >>>>>
> >>>>>          PR target/81921
> >>>>>          * config/i386/i386.c (ix86_can_inline_p): Treat
> >>>>>          target_option_default_node as non-existent.
> >>>> LGTM.
> >>>>
> >>>> Please give the patch some soaking time in the mainline before
> >>>> backporting it to release branches.
> >>> Thanks.  Of course this was copied by other targets (and the x86
> >>> one maybe from the default).  So the following is an extended patch.
> >>>
> >>> Ok for the rs6000 and aarch64 bits?
> >>>
> >>> Thanks,
> >>> Richard.
> >>>
> >>> 2017-08-22  Richard Biener  <rguent...@suse.de>
> >>>
> >>>     PR target/81921
> >>>     * config/i386/i386.c (ix86_can_inline_p): Treat
> >>>     target_option_default_node as non-existent.
> >>>     * targhooks.c (default_target_can_inline_p): Likewise.
> >>>     * config/aarch64/aarch64.c (aarch64_can_inline_p): Likewise.
> >>>     * config/powerpcspe/powerpcspe.c (rs6000_can_inline_p):
> >Likewise.
> >>>     * config/rs6000/rs6000.c (rs6000_can_inline_p): Likewise.
> >>>
> >>> Index: gcc/config/i386/i386.c
> >>> ===================================================================
> >>> --- gcc/config/i386/i386.c    (revision 251266)
> >>> +++ gcc/config/i386/i386.c    (working copy)
> >>> @@ -7507,12 +7507,12 @@ ix86_can_inline_p (tree caller, tree cal
> >>>     tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
> >>>       /* If callee has no option attributes, then it is ok to
> >inline.  */
> >>> -  if (!callee_tree)
> >>> +  if (!callee_tree || callee_tree == target_option_default_node)
> >>>       ret = true;
> >>>       /* If caller has no option attributes, but callee does then it
> >>> is not ok to
> >>>        inline.  */
> >>> -  else if (!caller_tree)
> >>> +  else if (!caller_tree || caller_tree ==
> >target_option_default_node)
> >>>       ret = false;
> >>>       else
> >>> Index: gcc/testsuite/gcc.target/i386/pr81921.c
> >>> ===================================================================
> >>> --- gcc/testsuite/gcc.target/i386/pr81921.c    (nonexistent)
> >>> +++ gcc/testsuite/gcc.target/i386/pr81921.c    (working copy)
> >>> @@ -0,0 +1,14 @@
> >>> +/* { dg-do compile } */
> >>> +/* { dg-require-effective-target lto } */
> >>> +/* { dg-options "-flto -march=x86-64" } */
> >>> +
> >>> +extern __inline int  __attribute__((__gnu_inline__,
> >>> __always_inline__, __artificial__, target("sse2")))
> >>> +_mm_loadu_si128 (int const *__P)
> >>> +{
> >>> +    return *__P;
> >>> +}
> >>> +
> >>> +void __attribute__((target("ssse3"))) foo (void *p)
> >>> +{
> >>> +  volatile int x = _mm_loadu_si128 (p);
> >>> +}
> >>> Index: gcc/targhooks.c
> >>> ===================================================================
> >>> --- gcc/targhooks.c    (revision 251274)
> >>> +++ gcc/targhooks.c    (working copy)
> >>> @@ -1447,12 +1447,12 @@ default_target_can_inline_p (tree caller
> >>>     tree caller_opts = DECL_FUNCTION_SPECIFIC_TARGET (caller);
> >>>       /* If callee has no option attributes, then it is ok to inline
> >*/
> >>> -  if (!callee_opts)
> >>> +  if (!callee_opts || callee_tree == target_option_default_node)
> >>>       ret = true;
> >>>       /* If caller has no option attributes, but callee does then it
> >>> is not ok to
> >>>        inline */
> >>> -  else if (!caller_opts)
> >>> +  else if (!caller_opts || caller_tree ==
> >target_option_default_node)
> >>>       ret = false;
> >>>       /* If both caller and callee have attributes, assume that if
> >the
> >>> Index: gcc/config/aarch64/aarch64.c
> >>> ===================================================================
> >>> --- gcc/config/aarch64/aarch64.c    (revision 251274)
> >>> +++ gcc/config/aarch64/aarch64.c    (working copy)
> >>> @@ -10074,7 +10074,7 @@ aarch64_can_inline_p (tree caller, tree
> >>>     tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
> >>>       /* If callee has no option attributes, then it is ok to
> >inline.  */
> >>> -  if (!callee_tree)
> >>> +  if (!callee_tree || callee_tree == target_option_default_node)
> >>>       return true;
> >>>   
> >> 
> >> The aarch64 changes looks ok to me FWIW (since I wrote that function)
> >> 
> >
> >I *think* this is probably OK for AArch64, but I don't think it's safe
> >in general.
> >
> >Consider, for example, the arm port where we can have functions built
> >for ARM and functions built for Thumb.  You cannot, in general, inline
> >either into the other, (think inline asm, or isa-specific intrinsic
> >operations).
> >
> >I think this approach only works at all because it's generally
> >meaningless to have some ISA that is the default and then construct
> >other functions later in the compilation that deliberately target a
> >smaller subset of the ISA.  But such an assumption doesn't apply when
> >you construct functions in completely separate ISAs which are
> >call-compatible but otherwise distinct.
> 
> Note the patch doesn't change anything for NON-LTO, where you never see 
> target_option_default_node. Just LTO forces those in place of NULL to 
> make cross-TU behavior more like you suggest.

So thinking about this some more we lack the ability to distinguish
between no explicit target attribute and an explicit target attribute
that matches the global defaults with LTO.  Without LTO it is the
target that decides whether to put target_option_default_node into
DECL_FUNCTION_SPECIFIC_TARGET but LTO treats the global options
as explicit target attribute for all functions of a TU.

That said, the

  /* If callee has no option attributes, then it is ok to inline */
  if (!callee_opts)
    ret = true;

tries to allow inlining of functions with no explicit attribute
into any other function which IMHO is reasonable?

That argues for sth like DECL_FUNCTION_SPECIFIC_TARGET_USER_P and
DECL_FUNCTION_SPECIFIC_OPTIMIZATION_USER_P.

Does that make any sense?  The question is whether, in LTO context,
differing global options are USER_P or not ...

OTOH as you say usually function specific opts are a superset of
global opts which means if the target properly checks two option
sets against each other the effect should be similar.

Thanks,
Richard.

Reply via email to