On Thu, 14 Mar 2024, Jan Hubicka wrote:

> > > We have wrong code with LTO, too.
> > 
> > I know.
> > 
> > > The problem is that IPA passes (and
> > > not only that, loop analysis too) does analysis at compile time (with
> > > value numbers in) and streams the info separately.
> > 
> > And that is desirable, because otherwise it simply couldn't derive any
> > ranges.
> > 
> > >  Removal of value ranges
> > > (either by LTO or by your patch) happens between computing these
> > > summaries and using them, so this can be used to trigger wrong code,
> > > sadly.
> > 
> > Yes.  But with LTO, I don't see how the IPA ICF comparison whether
> > two functions are the same or not could be done with the
> > SSA_NAME_{RANGE,PTR}_INFO in, otherwise it could only ICF merge functions
> > from the same TUs.  So the comparison IMHO (and the assert checks in my
> > patch prove that) is done when the SSA_NAME_{RANGE,PTR}_INFO aren't in
> > anymore.  So, one just needs to compare and punt or union whatever
> > is or could be influenced in the IPA streamed data from the ranges etc.
> > And because one has to do it for LTO, doing it for non-LTO should be
> > sufficient too.
> 
> Sorry, this was bit of a misunderstanding: I tought you still considered
> the original patch to be full fix, while I tought I should look into it
> more and dig out more issues.  This is bit of can of worms.  Overall I
> think the plan is:
> 
> This stage4
> 1) fix VR divergences by either comparing or droping them
> 2) fix jump function differences by comparing them
>    (I just constructed a testcase showing that jump functions can
>     diverge for other reasons than just VR, so this is deeper problem,
>     too)
> 3) try to construct aditional wrong code testcases (finite_p
>    divergences, trapping)
> Next stage1
> 4) implement VR and PTR info streaming
> 5) make ICF to compare VRs and punt otherwise
> 6) implement optimize_size feature to ICF that will not punt on
>    diverging TBAA or value ranges and do merging instead.
>    We need some extra infrastructure for that, being able to keep the
>    maps between basic blocks and SSA names from comparsion stage to
>    merging stage
> 7) measure how effective ICF becomes in optimize_size mode and implement
>    heuristics on how much metadata merging we want to do with -O2/-O3 as
>    well.
>    Based on older data Martin collected for his thesis, ICF used to be
>    must more effective on libreoffice then it is today, so hopefully we
>    can recover 10-15% binary size improvmeents here.
> 8) General ICF TLC.  There are many half finished things for a while in
>    this pass (such as merging with different BB or stmt orders).
> 
> I am attaching the compare patch which also fixes the original wrong
> code. If you preffer your version, just go ahead to commit it. Otherwise
> I will add your testcase for this patch and commit this one.
> Statistically we almost never merge functions with different value
> ranges (three in testsuite, 0 during bootstrap, 1 during LTO bootstrap
> and probably few in LLVM build - there are 15 cases reported, but some
> are false positives caused by hash function conflicts).
> 
> Honza
> 
> gcc/ChangeLog:
> 
>       * ipa-icf-gimple.cc (func_checker::compare_ssa_name): Compare value 
> ranges.
> 
> diff --git a/gcc/ipa-icf-gimple.cc b/gcc/ipa-icf-gimple.cc
> index 8c2df7a354e..37c416d777d 100644
> --- a/gcc/ipa-icf-gimple.cc
> +++ b/gcc/ipa-icf-gimple.cc
> @@ -39,9 +39,11 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cfgloop.h"
>  #include "attribs.h"
>  #include "gimple-walk.h"
> +#include "value-query.h"
> +#include "value-range-storage.h"
>  
>  #include "tree-ssa-alias-compare.h"
>  #include "ipa-icf-gimple.h"
>  
>  namespace ipa_icf_gimple {
>  
> @@ -109,6 +116,35 @@ func_checker::compare_ssa_name (const_tree t1, 
> const_tree t2)
>    else if (m_target_ssa_names[i2] != (int) i1)
>      return false;
>  
> +  if (POINTER_TYPE_P (TREE_TYPE (t1)))
> +    {
> +      if (SSA_NAME_PTR_INFO (t1))
> +     {
> +       if (!SSA_NAME_PTR_INFO (t2)
> +           || SSA_NAME_PTR_INFO (t1)->align != SSA_NAME_PTR_INFO (t2)->align
> +           || SSA_NAME_PTR_INFO (t1)->misalign != SSA_NAME_PTR_INFO 
> (t2)->misalign)

You want to compare SSA_NAME_PTR_INFO ()->pt.zero as well I think since
we store pointer non-null-ness from VRP there.

You are not comparing the actual points-to info - but of course I'd
expect that to differ as soon as local decls are involved.  Still looks
like a hole to me.

> +         return false;
> +     }
> +      else if (SSA_NAME_PTR_INFO (t2))
> +     return false;
> +    }
> +  else
> +    {
> +      if (SSA_NAME_RANGE_INFO (t1))
> +     {
> +       if (!SSA_NAME_RANGE_INFO (t2))
> +         return false;
> +       Value_Range r1 (TREE_TYPE (t1));
> +       Value_Range r2 (TREE_TYPE (t2));
> +       SSA_NAME_RANGE_INFO (t1)->get_vrange (r1, TREE_TYPE (t1));
> +       SSA_NAME_RANGE_INFO (t2)->get_vrange (r2, TREE_TYPE (t2));
> +       if (r1 != r2)
> +         return false;
> +     }
> +      else if (SSA_NAME_RANGE_INFO (t2))
> +     return false;
> +    }
> +
>    if (SSA_NAME_IS_DEFAULT_DEF (t1))
>      {
>        tree b1 = SSA_NAME_VAR (t1);
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to