While working on UPC, we ran into an interesting problem: if -O1 is enabled and -funit-at-a-time is disabled (which is not the default configuration), a static variable declaration is not emitted into the generated assembler code. I haven't quite worked out why this is the case, but while reading the code I did notice some awkwardness in how "used" variables are detected and handled by the call graph (cgraph) pass(es).
The gist of the issue we ran into was the handling of this UPC construct:

    { static shared strict int x; x = x; }

In UPC, "strict" is similar to volatile. Assigning the dummy variable to itself doesn't do anything very useful on its own, but it does enforce a memory fence that ensures that remote reads and writes to UPC shared space can't flow past the assignment.

The UPC compiler runs a gimplify pass which finds all UPC-isms and rewrites them into C-isms, which then flow through the backend. The assignment above is loosely translated into:

    upc_put([0, 0, &x], upc_get([0, 0, &x], sizeof(x)));

where [0, 0, &x] is an aggregate constructor that builds the representation of a shared pointer having a thread number of 0, a phase of 0, and a virtual address of &x. All UPC shared variables are located in a special linkage section. In this way, &x points to a location in the global shared address space, and the linker lays out each thread's contribution to the global shared address space.

The difficulty arises when we generate the runtime calls above. They refer to &x through a shadow variable that we create (by necessity, to prevent infinite recursion in the gimplify pass); the shadow has the same external name as 'x', with the shared qualifier removed. By that point cgraph has already run and determined that 'x' isn't needed, so it doesn't emit the declaration of 'x' into the generated assembler code.

We tried asserting TREE_USED() on 'x' when it was declared, but it turns out that instead of referring to TREE_USED() or even DECL_PRESERVE_P(), cgraph refers directly to the "used" attribute. Because of this, if __attribute__ ((used)) is added to the declaration above, all is well. That is because the front-end checks directly for the "used" attribute in various places but seems not to check the corresponding tree flags.
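To make the shape of that translation concrete, here is a minimal sketch in plain C. It is only a model under stated assumptions: struct shared_ptr, model_get, model_put, x_shadow and the call counters are hypothetical names invented for illustration, not the UPC runtime's actual types or entry points. The sketch also shows the __attribute__ ((used)) workaround on the stand-in for the shadow variable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a UPC shared pointer: (thread, phase, vaddr).  */
struct shared_ptr
{
  unsigned thread;
  unsigned phase;
  void *vaddr;
};

/* Stand-in for the shadow of 'x'.  The "used" attribute is the workaround
   described above: it asks the compiler to keep the variable even when it
   sees no reference to it.  */
static int x_shadow __attribute__ ((used)) = 7;

/* Counters so a caller can observe that both runtime calls happened.  */
static int get_calls, put_calls;

/* Hypothetical stand-in for the runtime read; NOT the real upc_get.  */
int
model_get (struct shared_ptr p, size_t size)
{
  int value = 0;
  assert (size <= sizeof value);
  get_calls++;
  memcpy (&value, p.vaddr, size);
  return value;
}

/* Hypothetical stand-in for the runtime write; NOT the real upc_put.  */
void
model_put (struct shared_ptr p, int value)
{
  put_calls++;
  memcpy (p.vaddr, &value, sizeof value);
}

/* Models the rewrite of "x = x;": one read followed by one write through
   the pointer [0, 0, &x].  */
void
strict_self_assign (void)
{
  struct shared_ptr p = { 0, 0, &x_shadow };
  model_put (p, model_get (p, sizeof x_shadow));
}
```

Calling strict_self_assign() leaves the value unchanged but performs exactly one modelled read and one modelled write, which is the property the fence relies on.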
Here are the relevant references (in the HEAD branch):

c-decl.c-    }
c-decl.c-
c-decl.c-  /* If this was marked 'used', be sure it will be output.  */
c-decl.c:  if (!flag_unit_at_a_time && lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
c-decl.c-    mark_decl_referenced (decl);
c-decl.c-
c-decl.c-  if (TREE_CODE (decl) == TYPE_DECL)
--
cgraphunit.c-  if (node->local.externally_visible)
cgraphunit.c-    return true;
cgraphunit.c-
cgraphunit.c:  if (!flag_unit_at_a_time && lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
cgraphunit.c-    return true;
cgraphunit.c-
cgraphunit.c-  /* ??? If the assembler name is set by hand, it is possible to assemble
--
cgraphunit.c-  for (node = cgraph_nodes; node != first; node = node->next)
cgraphunit.c-    {
cgraphunit.c-      tree decl = node->decl;
cgraphunit.c:      if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
cgraphunit.c-        {
cgraphunit.c-          mark_decl_referenced (decl);
cgraphunit.c-          if (node->local.finalized)
--
cgraphunit.c-  for (vnode = varpool_nodes; vnode != first_var; vnode = vnode->next)
cgraphunit.c-    {
cgraphunit.c-      tree decl = vnode->decl;
cgraphunit.c:      if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
cgraphunit.c-        {
cgraphunit.c-          mark_decl_referenced (decl);
cgraphunit.c-          if (vnode->finalized)
--
ipa-pure-const.c-{
ipa-pure-const.c-  /* If the variable has the "used" attribute, treat it as if it had a
ipa-pure-const.c-     been touched by the devil.  */
ipa-pure-const.c:  if (lookup_attribute ("used", DECL_ATTRIBUTES (t)))
ipa-pure-const.c-    {
ipa-pure-const.c-      local->pure_const_state = IPA_NEITHER;
ipa-pure-const.c-      return;
--
ipa-reference.c-{
ipa-reference.c-  /* If the variable has the "used" attribute, treat it as if it had a
ipa-reference.c-     been touched by the devil.  */
ipa-reference.c:  if (lookup_attribute ("used", DECL_ATTRIBUTES (t)))
ipa-reference.c-    return false;
ipa-reference.c-
ipa-reference.c-  /* Do not want to do anything with volatile except mark any
--
ipa-type-escape.c-  tree type = get_canon_type (TREE_TYPE (t), false, false);
ipa-type-escape.c-  if (!type) return;
ipa-type-escape.c-
ipa-type-escape.c:  if (lookup_attribute ("used", DECL_ATTRIBUTES (t)))
ipa-type-escape.c-    {
ipa-type-escape.c-      mark_interesting_type (type, FULL_ESCAPE);
ipa-type-escape.c-      return;
--
varpool.c-  if (node->externally_visible || node->force_output)
varpool.c-    return true;
varpool.c-  if (!flag_unit_at_a_time
varpool.c:      && lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
varpool.c-    return true;
varpool.c-
varpool.c-  /* ??? If the assembler name is set by hand, it is possible to assemble

Given that the processing of the "used" attribute is as follows (in handle_used_attribute()), I question whether any other code should be referring to the "used" attribute directly:

    if (TREE_CODE (node) == FUNCTION_DECL
        || (TREE_CODE (node) == VAR_DECL && TREE_STATIC (node)))
      {
        TREE_USED (node) = 1;
        DECL_PRESERVE_P (node) = 1;
      }

I'm also not certain why DECL_PRESERVE_P() was introduced when TREE_USED() seems to imply the same thing (that the variable/function is used and shouldn't be eliminated).

I was surprised that the call graph code that checks for variable usage isn't better isolated and modularized. It seems out of place in c-decl.c, for example. Also, note how often flag_unit_at_a_time and the "used" attribute are checked together in various combinations, in various files. This makes it difficult to understand exactly when, where, and why flag_unit_at_a_time is checked and how that interacts with the "used" processing.

Things seem to work when -O and -funit-at-a-time are both asserted because this code is executed (in build_cgraph_edges):

    /* Look for initializers of constant variables and private statics.  */
    for (step = cfun->unexpanded_var_list;
         step;
         step = TREE_CHAIN (step))
      {
        tree decl = TREE_VALUE (step);
        if (TREE_CODE (decl) == VAR_DECL
            && (TREE_STATIC (decl) && !DECL_EXTERNAL (decl))
            && flag_unit_at_a_time)
          varpool_finalize_decl (decl);
        else if (TREE_CODE (decl) == VAR_DECL && DECL_INITIAL (decl))
          walk_tree (&DECL_INITIAL (decl), record_reference, node, visited_nodes);
      }

Since we have a static variable declaration, and flag_unit_at_a_time is asserted, varpool_finalize_decl (decl) is called. But if flag_unit_at_a_time isn't true, the static variable declaration is silently ignored. I haven't quite figured out why that's the case, but I think it is somehow because, even though we set TREE_USED(), the code is checking for the explicit "used" attribute. And since optimization is asserted, some call graph passes are run, but they behave differently because flag_unit_at_a_time is false, and this is not the typical case.

All of this is somewhat specific to UPC's use of the gimplify pass to implement language semantics and the particular nature of the code it generates. However, I think the general observation about the way the "used" attribute is queried directly, and how that interacts with the presence/absence of flag_unit_at_a_time, might be worth some level of review and possible rework.
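To illustrate the distinction being drawn, here is a small self-contained toy model in C. It is a sketch only, not GCC code: struct toy_decl and every toy_* and needed_by_* name are hypothetical, and the two predicates merely mimic the shape of the checks quoted above. It assumes a front end that sets the preserve flag programmatically without attaching a "used" attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ATTRS 4

/* Toy declaration; fields loosely mirror TREE_USED, DECL_PRESERVE_P and
   DECL_ATTRIBUTES.  All names here are hypothetical.  */
struct toy_decl
{
  bool tree_used;
  bool preserve_p;
  const char *attributes[TOY_MAX_ATTRS];  /* unused slots are NULL */
};

static bool
toy_lookup_attribute (const char *name, const struct toy_decl *d)
{
  for (size_t i = 0; i < TOY_MAX_ATTRS && d->attributes[i]; i++)
    if (strcmp (d->attributes[i], name) == 0)
      return true;
  return false;
}

/* Models what handle_used_attribute() does for a static variable:
   record the attribute and set both flags.  */
void
toy_apply_used_attribute (struct toy_decl *d)
{
  assert (d != NULL);
  for (size_t i = 0; i < TOY_MAX_ATTRS; i++)
    if (!d->attributes[i])
      {
        d->attributes[i] = "used";
        break;
      }
  d->tree_used = true;
  d->preserve_p = true;
}

/* Attribute-based check, shaped like the !flag_unit_at_a_time tests.  */
bool
needed_by_attribute (const struct toy_decl *d, bool unit_at_a_time)
{
  return !unit_at_a_time && toy_lookup_attribute ("used", d);
}

/* Flag-based alternative: a single predicate that consults only the flag.  */
bool
needed_by_flag (const struct toy_decl *d)
{
  return d->preserve_p;
}
```

In this model, a declaration whose flags were set directly is kept by needed_by_flag() but missed by needed_by_attribute(), while a declaration that went through toy_apply_used_attribute() is kept by both when unit-at-a-time is off. Whether the real passes could be reduced to one flag-based predicate is exactly the open question raised above.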