Thanks for all your reviews and comments, very helpful!
Peter Bergner <berg...@linux.ibm.com> writes: > I think we just need to fix the bug in the current logic when checking > whether the caller's ISA flags supports the callee's ISA flags. ...and > for that, I think we just need to add a test that enforces that the > caller's ISA flags match exactly the callee's flags, for those flags > that were explicitly set in the callee. The patch below seems to fix > the issue (regtesting now). Does this look like what we want? Great sugguestions, thanks again! I was trying to figure out how to check 'explicitly disabled feature', after checking your sugguestion deeperly. I notice that, rs6000_isa_flags_explicit is set no matter a feature is "explicitly disabled" or "explicitly enabled". And to check a feature (e.g. VSX) is disabled explicitly, below code could help: (explicit_isa & VSX) && (callee_isa & explicit_isa & VSX == 0) This patch is indeed what I want. > > Peter > > > gcc/ > * config/rs6000/rs6000.c (rs6000_can_inline_p): Handle explicit > options. > > gcc.testsuite/ > * gcc.target/powerpc/pr70010.c: New test. > > > Index: gcc/config/rs6000/rs6000.c > =================================================================== > --- gcc/config/rs6000/rs6000.c (revision 276975) > +++ gcc/config/rs6000/rs6000.c (working copy) > @@ -23976,13 +23976,18 @@ rs6000_can_inline_p (tree caller, tree c > else > { > struct cl_target_option *caller_opts = TREE_TARGET_OPTION > (caller_tree); > + HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags; > struct cl_target_option *callee_opts = TREE_TARGET_OPTION > (callee_tree); > + HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags; > + HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit; > > - /* Callee's options should a subset of the caller's, i.e. a vsx > function > - can inline an altivec function but a non-vsx function can't inline a > - vsx function. */ > - if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags) > - == callee_opts->x_rs6000_isa_flags) > + /* The callee's options must be a subset of the caller's options, i.e. > + a vsx function may inline an altivec function, but a non-vsx function > + must not inline a vsx function. However, for those options that the > + callee has explicitly set, then we must enforce that the callee's > + and caller's options match exactly; see PR70010. */ > + if (((caller_isa & callee_isa) == callee_isa) > + && (caller_isa & explicit_isa) == (callee_isa & explicit_isa)) > ret = true; > } > > Index: gcc/testsuite/gcc.target/powerpc/pr70010.c > =================================================================== > --- gcc/testsuite/gcc.target/powerpc/pr70010.c (nonexistent) > +++ gcc/testsuite/gcc.target/powerpc/pr70010.c (working copy) > @@ -0,0 +1,21 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ > +/* { dg-options "-O2 -finline" } */ > +/* { dg-final { scan-assembler "bl vadd_no_vsx" } } */ > + > +typedef int vec_t __attribute__((vector_size(16))); > + > +static vec_t > +__attribute__((__target__("no-vsx"))) > +vadd_no_vsx (vec_t a, vec_t b) > +{ > + return a + b; > +} > + > +vec_t > +__attribute__((__target__("vsx"))) > +call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) > +{ > + return vadd_no_vsx (x, y) - z; > +} > > > > bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.i > typedef int vec_t __attribute__((vector_size(16))); > > static vec_t > __attribute__((__target__("no-vsx"))) > vadd_no_vsx (vec_t a, vec_t b) > { > return a + b; > } > > vec_t > __attribute__((__target__("vsx"))) > call_vadd_no_vsx (vec_t x, vec_t y, vec_t z) > { > return vadd_no_vsx (x, y) - z; > } > > bergner@pike:~/gcc/BUGS/CPU$ old-gcc -O2 -finline -S pr70010.i > bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.s > call_vadd_no_vsx: > vadduwm 2,2,3 > vsubuwm 2,2,4 > blr > > bergner@pike:~/gcc/BUGS/CPU$ > /home/bergner/gcc/build/gcc-fsf-mainline-cpu-flags-debug/gcc/xgcc > -B/home/bergner/gcc/build/gcc-fsf-mainline-cpu-flags-debug/gcc -O2 -finline > -S pr70010.i > bergner@pike:~/gcc/BUGS/CPU$ cat pr70010.s > vadd_no_vsx: > vadduwm 2,2,3 > blr > .long 0 > > call_vadd_no_vsx: > 0: addis 2,12,.TOC.-.LCF1@ha > addi 2,2,.TOC.-.LCF1@l > mflr 0 > std 0,16(1) > stdu 1,-48(1) > li 0,32 > stvx 31,1,0 > xxlor 63,36,36 > bl vadd_no_vsx > addi 1,1,48 > li 0,-16 > vsubuwm 2,2,31 > lvx 31,1,0 > ld 0,16(1) > mtlr 0 > blr