On 10/14/19 2:57 PM, Segher Boessenkool wrote:
> On Mon, Oct 14, 2019 at 06:35:06PM +0200, Richard Biener wrote:
>> The general case should be that if the caller ISA supports the callee one
>> then inlining is OK. If this is not wanted in some cases then there are
>> options like using a noinline attribute. 
> 
> I agree, and that is what we already do afaik.

I agree on the making sure the caller's ISA supports the callee's ISA
before allowing inlining...and I think that's what our code it "trying"
to do.  It just has some bugs.


> But in this case, the callee explicitly disables something (-mno-vsx),
> while the caller has it enabled (all modern cpus have VSX).  If it ends
> up being inlined, it will get VSX insns generated for it.  Which is what
> Jiu Fu's patch aims to prevent.
> 
> So you are saying the GCC policy is that you should use noinline on the
> callee in such cases?

I don't think sprinkling noinline's around will work given LTO can
inline random functions from one object file into a function in another
object file and we have no idea what the options were used for both files.
I think that would just force us to end up putting nolines on all fuctions
that might be compiled with LTO.

I think we just need to fix the bug in the current logic when checking
whether the caller's ISA flags supports the callee's ISA flags. ...and
for that, I think we just need to add a test that enforces that the
caller's ISA flags match exactly the callee's flags, for those flags
that were explicitly set in the callee.  The patch below seems to fix
the issue (regtesting now).  Does this look like what we want?

Peter


gcc/
        * config/rs6000/rs6000.c (rs6000_can_inline_p): Handle explicit
        options.

gcc.testsuite/
        * gcc.target/powerpc/pr70010.c: New test.


Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 276975)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -23976,13 +23976,18 @@ rs6000_can_inline_p (tree caller, tree c
   else
     {
       struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
+      HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags;
       struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
+      HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
+      HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
 
-      /* Callee's options should a subset of the caller's, i.e. a vsx function
-        can inline an altivec function but a non-vsx function can't inline a
-        vsx function.  */
-      if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags)
-         == callee_opts->x_rs6000_isa_flags)
+      /* The callee's options must be a subset of the caller's options, i.e.
+        a vsx function may inline an altivec function, but a non-vsx function
+        must not inline a vsx function.  However, for those options that the
+        callee has explicitly set, then we must enforce that the callee's
+        and caller's options match exactly; see PR70010.  */
+      if (((caller_isa & callee_isa) == callee_isa)
+         && (caller_isa & explicit_isa) == (callee_isa & explicit_isa))
        ret = true;
     }
 
Index: gcc/testsuite/gcc.target/powerpc/pr70010.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr70010.c  (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/pr70010.c  (working copy)
@@ -0,0 +1,21 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -finline" } */
+/* { dg-final { scan-assembler "bl vadd_no_vsx" } } */
+
+typedef int vec_t __attribute__((vector_size(16)));
+
+static vec_t
+__attribute__((__target__("no-vsx")))
+vadd_no_vsx (vec_t a, vec_t b)
+{
+  return a + b;
+}
+
+vec_t
+__attribute__((__target__("vsx")))
+call_vadd_no_vsx (vec_t x, vec_t y, vec_t z)
+{
+  return vadd_no_vsx (x, y) - z;
+}

 

bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.i 
typedef int vec_t __attribute__((vector_size(16)));

static vec_t
__attribute__((__target__("no-vsx")))
vadd_no_vsx (vec_t a, vec_t b)
{
  return a + b;
}

vec_t
__attribute__((__target__("vsx")))
call_vadd_no_vsx (vec_t x, vec_t y, vec_t z)
{
  return vadd_no_vsx (x, y) - z;
}

bergner@pike:~/gcc/BUGS/CPU$ old-gcc -O2 -finline -S pr70010.i
bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.s 
call_vadd_no_vsx:
        vadduwm 2,2,3
        vsubuwm 2,2,4
        blr

bergner@pike:~/gcc/BUGS/CPU$ 
/home/bergner/gcc/build/gcc-fsf-mainline-cpu-flags-debug/gcc/xgcc 
-B/home/bergner/gcc/build/gcc-fsf-mainline-cpu-flags-debug/gcc -O2 -finline -S 
pr70010.i 
bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.s 
vadd_no_vsx:
        vadduwm 2,2,3
        blr
        .long 0

call_vadd_no_vsx:
0:      addis 2,12,.TOC.-.LCF1@ha
        addi 2,2,.TOC.-.LCF1@l
        mflr 0
        std 0,16(1)
        stdu 1,-48(1)
        li 0,32
        stvx 31,1,0
        xxlor 63,36,36
        bl vadd_no_vsx
        addi 1,1,48
        li 0,-16
        vsubuwm 2,2,31
        lvx 31,1,0
        ld 0,16(1)
        mtlr 0
        blr

Reply via email to