On Fri, 4 Dec 2015, Jan Hubicka wrote:

> Hi,
> this is the last patch of the series.  It makes operand_equal_p to compare
> alias sets even in !flag_strict_aliasing before inlining so inlining 
> !flag_strict_aliasing to flag_strict_aliasing is possible when callee is
> merged comdat.  I tried to explain it in greater detail in the comment
> in ipa-inline-tranform.
> 
> While working on the code I noticed that I managed to overload merged with
> two meanings. One is that the function had bodies defined in multiple units
> (and thus its inlining should not be considered cross-modulo) and other is
> that it used to be comdat.  This is usually the same, but not always - one
> can manually define weak functions where the bypass for OPTIMIZAITON_NODE
> checks can not apply.
> 
> Since the first only affects heuristics and I do not think I need to care
> about weaks much, I dropped it and renamed the flag to merged_comdat to make
> it more obvious what it means.

I wonder if you can split out the re-naming at this stage.  Further
comments below.

> Bootstrapped/regtested x86_64-linux, OK?
> 
> I will work on some testcases for the ICF and fold-const that would lead
> to wrong code if alias sets was ignored early.

Would be nice to have a wrong-code testcase go with the commit.

> Honza
>       * fold-const.c (operand_equal_p): Before inlining do not permit
>       transformations that would break with strict aliasing.
>       * ipa-inline.c (can_inline_edge_p) Use merged_comdat.
>       * ipa-inline-transform.c (inline_call): When inlining merged comdat do
>       not drop strict_aliasing flag of caller.
>       * cgraphclones.c (cgraph_node::create_clone): Use merged_comdat.
>       * cgraph.c (cgraph_node::dump): Dump merged_comdat.
>       * ipa-icf.c (sem_function::merge): Drop merged_comdat when merging
>       comdat and non-comdat.
>       * cgraph.h (cgraph_node): Rename merged to merged_comdat.
>       * ipa-inline-analysis.c (simple_edge_hints): Check both merged_comdat
>       and icf_merged.
> 
>       * lto-symtab.c (lto_cgraph_replace_node): Update code computing
>       merged_comdat.
> Index: fold-const.c
> ===================================================================
> --- fold-const.c      (revision 231239)
> +++ fold-const.c      (working copy)
> @@ -2987,7 +2987,7 @@ operand_equal_p (const_tree arg0, const_
>                                          flags)))
>               return 0;
>             /* Verify that accesses are TBAA compatible.  */
> -           if (flag_strict_aliasing
> +           if ((flag_strict_aliasing || !cfun->after_inlining)
>                 && (!alias_ptr_types_compatible_p
>                       (TREE_TYPE (TREE_OPERAND (arg0, 1)),
>                        TREE_TYPE (TREE_OPERAND (arg1, 1)))

Sooo....  first of all the code is broken anyway as it guards
the restrict checking (MR_DEPENDENCE_*) stuff with flag_strict_aliasing
(ick).  Second, I wouldn't mind if we drop the flag_strict_aliasing
check alltogether, a cfun->after_inlining checks makes me just too
nervous.

> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c      (revision 231239)
> +++ ipa-inline.c      (working copy)
> @@ -466,7 +466,7 @@ can_inline_edge_p (struct cgraph_edge *e
>           optimized with the optimization flags of module they are used in.
>        Also do not care about mixing up size/speed optimization when
>        DECL_DISREGARD_INLINE_LIMITS is set.  */
> -      else if ((callee->merged
> +      else if ((callee->merged_comdat
>               && !lookup_attribute ("optimize",
>                                     DECL_ATTRIBUTES (caller->decl)))
>              || DECL_DISREGARD_INLINE_LIMITS (callee->decl))
> Index: ipa-inline-transform.c
> ===================================================================
> --- ipa-inline-transform.c    (revision 231239)
> +++ ipa-inline-transform.c    (working copy)
> @@ -322,11 +322,26 @@ inline_call (struct cgraph_edge *e, bool
>    if (DECL_FUNCTION_PERSONALITY (callee->decl))
>      DECL_FUNCTION_PERSONALITY (to->decl)
>        = DECL_FUNCTION_PERSONALITY (callee->decl);
> +
> +  /* merged_comdat indicate that function was originally COMDAT and merged
> +     from multiple units.  Because every unit using COMDAT must also define 
> it,
> +     we know that the function is safe to build with each of the optimization
> +     flags used used to compile them.
> +
> +     If one unit is compiled with -fstrict-aliasing and
> +     other with -fno-strict-aliasing we may bypass dropping the
> +     flag_strict_aliasing because we know it would be valid to inline
> +     -fstrict-aliaisng variant of the calee, too.  Unless optimization
> +     attribute was used, the caller and COMDAT callee must have been
> +     compiled with the same flags.  */

So your logic relies on the fact that the -fno-strict-aliasing was
not necessary on copy A if copy B was compiled without that flag
because otherwise copy B would invoke undefined behavior?

This menans it's a language semantics thing but you simply look at
whether it's "comdat"?  Shouldn't this use some ODR thing instead?

Also as undefined behavior only applies at runtime consider copy A
(with -fno-strict-aliasing) is used in contexts where undefined
behavior would occur while copy B not.  Say,

int foo (int *p, short *q)
{
  *p = 1;
  return *q;
}

and the copy A use is foo (&x, &x) while the copy B use foo (&x, &y).

Yes, the case is lame here as we'd miscompile this in copy B and
comdat makes us eventually use that copy for A.  But if we don't
manage to miscompile this without inlining there isn't any undefined
behavior (at runtime) you can rely on.

Just want to know whether you thought about the above cases, I would
declare them invalid but I am not sure the C++ standard agrees here.
Do other FEs end up emitting comdats?  Can the user produce them for C?

>    if (!opt_for_fn (callee->decl, flag_strict_aliasing)
> -      && opt_for_fn (to->decl, flag_strict_aliasing))
> +      && opt_for_fn (to->decl, flag_strict_aliasing)
> +      && (!callee->merged_comdat
> +       || lookup_attribute ("optimization",
> +                            DECL_ATTRIBUTES (e->caller->decl))
> +       || lookup_attribute ("optimization", DECL_ATTRIBUTES (callee->decl))))
>      {
>        struct gcc_options opts = global_options;
> -
>        cl_optimization_restore (&opts,
>        TREE_OPTIMIZATION (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl)));
>        opts.x_flag_strict_aliasing = false;
> Index: cgraphclones.c
> ===================================================================
> --- cgraphclones.c    (revision 231239)
> +++ cgraphclones.c    (working copy)
> @@ -433,7 +433,7 @@ cgraph_node::create_clone (tree new_decl
>    new_node->tp_first_run = tp_first_run;
>    new_node->tm_clone = tm_clone;
>    new_node->icf_merged = icf_merged;
> -  new_node->merged = merged;
> +  new_node->merged_comdat = merged_comdat;
>  
>    new_node->clone.tree_map = NULL;
>    new_node->clone.args_to_skip = args_to_skip;
> Index: lto/lto-symtab.c
> ===================================================================
> --- lto/lto-symtab.c  (revision 231239)
> +++ lto/lto-symtab.c  (working copy)
> @@ -63,8 +63,9 @@ lto_cgraph_replace_node (struct cgraph_n
>        gcc_assert (!prevailing_node->global.inlined_to);
>        prevailing_node->mark_address_taken ();
>      }
> -  if (node->definition && prevailing_node->definition)
> -    prevailing_node->merged = true;
> +  if (node->definition && prevailing_node->definition
> +      && DECL_COMDAT (node->decl) && DECL_COMDAT (prevailing_node->decl))
> +    prevailing_node->merged_comdat = true;
>  
>    /* Redirect all incoming edges.  */
>    compatible_p
> Index: cgraph.c
> ===================================================================
> --- cgraph.c  (revision 231239)
> +++ cgraph.c  (working copy)
> @@ -1994,6 +1994,8 @@ cgraph_node::dump (FILE *f)
>      fprintf (f, " tm_clone");
>    if (icf_merged)
>      fprintf (f, " icf_merged");
> +  if (merged_comdat)
> +    fprintf (f, " merged_comdat");
>    if (nonfreeing_fn)
>      fprintf (f, " nonfreeing_fn");
>    if (DECL_STATIC_CONSTRUCTOR (decl))
> Index: ipa-icf.c
> ===================================================================
> --- ipa-icf.c (revision 231239)
> +++ ipa-icf.c (working copy)
> @@ -1352,10 +1352,15 @@ sem_function::merge (sem_item *alias_ite
>    gcc_assert (alias->icf_merged || remove || redirect_callers);
>    original->icf_merged = true;
>  
> -  /* Inform the inliner about cross-module merging.  */
> -  if ((original->lto_file_data || alias->lto_file_data)
> -      && original->lto_file_data != alias->lto_file_data)
> -    local_original->merged = original->merged = true;
> +  /* We use merged flag to track cases where COMDAT function is known to be
> +     compatible its callers.  If we merged in non-COMDAT, we need to give up
> +     on this optimization.  */
> +  if (original->merged_comdat && !alias->merged_comdat)
> +    {
> +      if (dump_file)
> +     fprintf (dump_file, "Dropping merged_comdat flag.\n\n");
> +      local_original->merged_comdat = original->merged_comdat = false;
> +    }
>  
>    if (remove)
>      {
> Index: cgraph.h
> ===================================================================
> --- cgraph.h  (revision 231239)
> +++ cgraph.h  (working copy)
> @@ -1333,7 +1333,7 @@ public:
>       accesses trapping.  */
>    unsigned nonfreeing_fn : 1;
>    /* True if there was multiple COMDAT bodies merged by lto-symtab.  */
> -  unsigned merged : 1;
> +  unsigned merged_comdat : 1;
>    /* True if function was created to be executed in parallel.  */
>    unsigned parallelized_function : 1;
>    /* True if function is part split out by ipa-split.  */
> Index: ipa-inline-analysis.c
> ===================================================================
> --- ipa-inline-analysis.c     (revision 231239)
> +++ ipa-inline-analysis.c     (working copy)
> @@ -3708,7 +3708,7 @@ simple_edge_hints (struct cgraph_edge *e
>  
>    if (callee->lto_file_data && edge->caller->lto_file_data
>        && edge->caller->lto_file_data != callee->lto_file_data
> -      && !callee->merged)
> +      && !callee->merged_comdat && !callee->icf_merged)
>      hints |= INLINE_HINT_cross_module;
>  
>    return hints;
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Reply via email to