While working on UPC, we ran into an interesting problem:
if -O1 is enabled and -funit-at-a-time is disabled
(which is not the default configuration), a static variable
declaration was not emitted into the assembler output.  I haven't
quite worked out why this is the case, but while reading the
code I did notice some awkwardness in how "used" variables
are detected and handled by the call graph (cgraph) pass(es).

The gist of the issue we ran into was the handling of
this UPC construct:
  {static shared strict int x; x = x; }
In UPC, "strict" is similar to volatile.  The assignment
of the dummy variable to itself above doesn't do anything
very useful, but it does enforce a memory fence that
ensures that remote reads and writes to UPC shared
space can't flow past the assignment above.

The UPC compiler runs a gimplify pass which finds
all UPC-isms and rewrites them into C-isms, which
then flow through the backend.  The assignment
above is loosely translated into:
  upc_put([0, 0, &x], upc_get([0, 0, &x], sizeof(x)));
where [0, 0, &x] is an aggregate constructor that
builds the representation of a shared pointer having
a thread number of 0, a phase of 0, and a virtual
address of &x.  All UPC shared variables are located
in a special linkage section.  In this way, &x points
to a location in the global shared address space,
and the linker lays out each thread's contribution to
the global shared address.
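
As a rough illustration only, the [0, 0, &x] constructor can be
pictured in plain C as a compound literal of a three-field struct.
The struct name, field names, field types, and field order below
are my assumptions for the sketch; the actual representation used
by the UPC compiler and runtime may differ.

```c
/* Hypothetical layout of a UPC shared pointer; not the real
   GCC UPC definition.  */
struct upc_shared_ptr_sketch
{
  unsigned long thread;   /* thread number */
  unsigned long phase;    /* phase within the block */
  void *vaddr;            /* virtual address in the shared section */
};

/* Stand-in for the shadow of 'x'.  In real UPC this object would
   be placed in the special shared linkage section.  */
static int shadow_x;

/* Plays the role of the aggregate constructor [0, 0, &x].  */
struct upc_shared_ptr_sketch
address_of_shadow_x (void)
{
  return (struct upc_shared_ptr_sketch) { 0, 0, &shadow_x };
}
```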

The difficulty arises because the runtime calls above
refer to &x through a shadow variable that we create
(by necessity, to prevent infinite recursion in the
gimplify pass).  The shadow variable has the same external
name as 'x', with the shared qualifier removed.

What happens is that cgraph has already been run and
determined that 'x' isn't needed and therefore
it doesn't emit the declaration of 'x' into the generated
assembler code.  We tried asserting TREE_USED()
on 'x' when it was declared, but it turns out
that instead of referring to TREE_USED() or
even DECL_PRESERVE_P(), cgraph instead refers
directly to the "used" attribute. 

Because of this, if __attribute__ ((used)) is added to the
declaration above, all is well.  That is because the front-end
checks directly for the "used" attribute in various places
but seems not to check various tree flags.
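
For readers who haven't used it, here is a minimal plain C sketch
of the attribute in question.  The variable and function names are
made up for the example, and this is ordinary C rather than UPC; it
only shows the documented effect of "used" on a static variable
that nothing in the file refers to.

```c
/* Hypothetical stand-in for the dummy variable 'x'.  Nothing in
   this file refers to it by name; the "used" attribute asks GCC
   to emit it anyway.  */
static int fence_dummy __attribute__ ((used)) = 7;

/* An ordinary static, emitted because record_hit refers to it.  */
static int hit_count;

int
record_hit (void)
{
  return ++hit_count;
}
```

Compiling this with something like "gcc -O1 -S" and searching the
assembler output for fence_dummy is one way to check whether the
variable was kept.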

Here are the relevant references (in the HEAD branch):

c-decl.c-    }
c-decl.c-
c-decl.c-  /* If this was marked 'used', be sure it will be output.  */
c-decl.c:  if (!flag_unit_at_a_time && lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
c-decl.c-    mark_decl_referenced (decl);
c-decl.c-
c-decl.c-  if (TREE_CODE (decl) == TYPE_DECL)
--
cgraphunit.c-  if (node->local.externally_visible)
cgraphunit.c-    return true;
cgraphunit.c-
cgraphunit.c:  if (!flag_unit_at_a_time && lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
cgraphunit.c-    return true;
cgraphunit.c-
cgraphunit.c-  /* ??? If the assembler name is set by hand, it is possible to assemble
--
cgraphunit.c-  for (node = cgraph_nodes; node != first; node = node->next)
cgraphunit.c-    {
cgraphunit.c-      tree decl = node->decl;
cgraphunit.c:      if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
cgraphunit.c-   {
cgraphunit.c-     mark_decl_referenced (decl);
cgraphunit.c-     if (node->local.finalized)
--
cgraphunit.c-  for (vnode = varpool_nodes; vnode != first_var; vnode = vnode->next)
cgraphunit.c-    {
cgraphunit.c-      tree decl = vnode->decl;
cgraphunit.c:      if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
cgraphunit.c-   {
cgraphunit.c-     mark_decl_referenced (decl);
cgraphunit.c-     if (vnode->finalized)
--
ipa-pure-const.c-{
ipa-pure-const.c-  /* If the variable has the "used" attribute, treat it as if it had a
ipa-pure-const.c-     been touched by the devil.  */
ipa-pure-const.c:  if (lookup_attribute ("used", DECL_ATTRIBUTES (t)))
ipa-pure-const.c-    {
ipa-pure-const.c-      local->pure_const_state = IPA_NEITHER;
ipa-pure-const.c-      return;
--
ipa-reference.c-{
ipa-reference.c-  /* If the variable has the "used" attribute, treat it as if it had a
ipa-reference.c-     been touched by the devil.  */
ipa-reference.c:  if (lookup_attribute ("used", DECL_ATTRIBUTES (t)))
ipa-reference.c-    return false;
ipa-reference.c-
ipa-reference.c-  /* Do not want to do anything with volatile except mark any
--
ipa-type-escape.c-  tree type = get_canon_type (TREE_TYPE (t), false, false);
ipa-type-escape.c-  if (!type) return;
ipa-type-escape.c-
ipa-type-escape.c:  if (lookup_attribute ("used", DECL_ATTRIBUTES (t)))
ipa-type-escape.c-    {
ipa-type-escape.c-      mark_interesting_type (type, FULL_ESCAPE);
ipa-type-escape.c-      return;
--
varpool.c-  if (node->externally_visible || node->force_output)
varpool.c-    return true;
varpool.c-  if (!flag_unit_at_a_time
varpool.c:      && lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
varpool.c-    return true;
varpool.c-
varpool.c-  /* ??? If the assembler name is set by hand, it is possible to assemble

Given that the processing of the "used" attribute is as follows
(in handle_used_attribute()), I question whether any other
code should be referring to the "used" attribute directly:

  if (TREE_CODE (node) == FUNCTION_DECL
      || (TREE_CODE (node) == VAR_DECL && TREE_STATIC (node)))
    {
      TREE_USED (node) = 1;
      DECL_PRESERVE_P (node) = 1;
    }
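
To make the concern concrete, here is a toy model in plain C.  It
does not use GCC's tree structures; the struct and every toy_*
function are hypothetical stand-ins invented for this sketch.  It
contrasts a query on the attribute name with a query on a flag, for
a declaration that a front end marks directly without ever
attaching the attribute.

```c
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for a declaration; NOT GCC's real tree node.  */
struct toy_decl
{
  const char *attribute;  /* single attribute name, or NULL */
  bool tree_used;         /* models TREE_USED */
  bool preserve_p;        /* models DECL_PRESERVE_P */
};

/* Models the flag updates made by handle_used_attribute().  */
void
toy_handle_used_attribute (struct toy_decl *d)
{
  d->tree_used = true;
  d->preserve_p = true;
}

/* Models a front end that sets the flags itself, with no
   attribute attached (an assumption made for this sketch).  */
void
toy_front_end_preserve (struct toy_decl *d)
{
  d->tree_used = true;
  d->preserve_p = true;
}

/* Query in the style of the lookup_attribute ("used", ...) hits.  */
bool
toy_needed_by_attribute (const struct toy_decl *d)
{
  return d->attribute != NULL && strcmp (d->attribute, "used") == 0;
}

/* Query on the flag instead of the attribute name.  */
bool
toy_needed_by_flag (const struct toy_decl *d)
{
  return d->preserve_p;
}
```

In this model a declaration carrying the attribute satisfies both
queries, while one marked only through toy_front_end_preserve
satisfies only the flag query.  Whether the real passes could rely
on a flag in the same way is exactly the open question here.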

I'm also not certain why DECL_PRESERVE_P() was introduced
when TREE_USED() seems to imply the same thing (that the
variable/function is used and shouldn't be eliminated).

I was surprised that the call graph code to check for
variable usage isn't better isolated and modularized.
It seems out of place in c-decl.c for example.

Also, note how often flag_unit_at_a_time and the
"used" attribute are checked together in various
combinations, in various files.  This makes it difficult
to understand exactly when, where, and why flag_unit_at_a_time
is checked and how that interacts with the "used" processing.

Things seem to work when both -O and -funit-at-a-time
are asserted because this code is executed
(in build_cgraph_edges):

  /* Look for initializers of constant variables and private statics.  */
  for (step = cfun->unexpanded_var_list;
       step;
       step = TREE_CHAIN (step))
    {
      tree decl = TREE_VALUE (step);
      if (TREE_CODE (decl) == VAR_DECL
          && (TREE_STATIC (decl) && !DECL_EXTERNAL (decl))
          && flag_unit_at_a_time)
        varpool_finalize_decl (decl);
      else if (TREE_CODE (decl) == VAR_DECL && DECL_INITIAL (decl))
        walk_tree (&DECL_INITIAL (decl), record_reference, node, visited_nodes);
    }

Since we have a static variable declaration, and flag_unit_at_a_time
is asserted, varpool_finalize_decl (decl) is called.  But if
flag_unit_at_a_time isn't true, the static variable declaration
is silently ignored.  I haven't quite figured out why that's the
case, but I think it is because, even though we set TREE_USED(),
the code checks for the explicit "used" attribute.  And since
optimization is asserted, some call graph passes are run, but they
behave differently because flag_unit_at_a_time is false, which is
not the typical case.

All of this is somewhat specific to UPC's use of the gimplify
pass to implement language semantics and the particular nature
of the code it generates.  However, I think the general
observation of the way that the "used" attribute is queried
directly, and how it interacts with the presence/absence
of flag_unit_at_a_time might be worth some level of review
and possible rework.
