Re: 1.76% performance loss in VRP due to inlining

Jakub Jelinek via Gcc Tue, 30 Apr 2024 12:23:04 -0700

On Tue, Apr 30, 2024 at 03:09:51PM -0400, Jason Merrill via Gcc wrote:
> On Fri, Apr 26, 2024 at 5:44 AM Aldy Hernandez via Gcc <gcc@gcc.gnu.org> 
> wrote:
> >
> > In implementing prange (pointer ranges), I have found a 1.74% slowdown
> > in VRP, even without any code path actually using the code.  I have
> > tracked this down to irange::get_bitmask() being compiled differently
> > with and without the bare bones patch.  With the patch,
> > irange::get_bitmask() has a lot of code inlined into it, particularly
> > get_bitmask_from_range() and consequently the wide_int_storage code.
> ...
> > +static irange_bitmask
> > +get_bitmask_from_range (tree type,
> > +                     const wide_int &min, const wide_int &max)
> ...
> > -irange_bitmask
> > -irange::get_bitmask_from_range () const
> 
> My guess is that this is the relevant change: the old function has
> external linkage, and is therefore interposable, which inhibits
> inlining.  The new function has internal linkage, which allows
> inlining.


Even when a function is exported, if the code is not compiled with
-fpic/-fPIC and we know the function is defined in the current TU, it
can't be interposed.  Try
int
foo (int x)
{
  return x + 1;
}

int
bar (int x, int y)
{
  return foo (x) + foo (y);
}
with -O2 -fpic -fno-semantic-interposition vs. -O2 -fpic vs. -O2 -fpie vs.
-O2.
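
As a rough sketch of the same point at the level of a single function, the
variation below puts three copies of foo side by side.  The *_default,
*_hidden and *_static names are made up for illustration, and the
visibility attribute is the GNU extension, so this assumes GCC or a
compatible compiler on an ELF target.  With -O2 -fpic, only the calls in
bar_default should stay as real calls; the other two callers should be
free to inline their callee.

```c
/* Hypothetical variation on the test case above; all names are
   illustrative.  Compare e.g. `gcc -O2 -fpic -S` against `gcc -O2 -S`.  */

/* External linkage, default visibility: interposable under -fpic.  */
int
foo_default (int x)
{
  return x + 1;
}

/* Hidden visibility: not exported from the shared object, so it
   cannot be interposed even under -fpic.  */
__attribute__ ((visibility ("hidden"))) int
foo_hidden (int x)
{
  return x + 1;
}

/* Internal linkage, like the new get_bitmask_from_range.  */
static int
foo_static (int x)
{
  return x + 1;
}

int
bar_default (int x, int y)
{
  return foo_default (x) + foo_default (y);
}

int
bar_hidden (int x, int y)
{
  return foo_hidden (x) + foo_hidden (y);
}

int
bar_static (int x, int y)
{
  return foo_static (x) + foo_static (y);
}
```

All three bar_* functions compute the same value; only the generated code
is expected to differ.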

> Relatedly, I wonder if we want to build GCC with -fno-semantic-interposition?

It could be useful only for libgccjit.  And I'm not sure whether libgccjit
users might want to interpose something.

        Jakub
