On Fri, 23 Sep 2016, Jakub Jelinek wrote:

> On Fri, Sep 23, 2016 at 08:55:02AM +0200, Richard Biener wrote:
> > While I agree with the goal to reduce the number of static constructors
> > esp. the vNULL cases I disagree with.  This is just introducing
> > undefined behavior (uninitialized object use), and in case we end up
> > making vNULL not all-zeros it's bound to break.  So IMHO your patch
> > is very much premature optimization here.
> 
> No, there is no undefined behavior, and we rely on the zero initialization
> in > 100 cases, see
> grep '\(^\|static \)vec \?<.*;' *.c *.cc */*.c ada/*/*.c | grep -v '\*'
> or anywhere where we have vec part of a struct that we allocate cleared or
> clear after allocation.
> The thing is, vec is (intentionally) a POD, thus doesn't have a ctor,
> and the = vNULL "construction" is
>   template <typename T, typename A, typename L>
>   operator vec<T, A, L> () { return vec<T, A, L>(); }
> actually clearing of the vector.  If we ever wanted to initialize vec to
> something not-all-zeros, we'd actually need to turn it into non-POD and add
> ctor.
> 
> I'm attaching 3 further patches (not bootstrapped/regtested yet), which:
> 1) adds constexpr keyword for C++11 + to the vNULL operator, this seems to
> be enough to get rid of the runtime initialization, both in the
> tree-ssa-sccvn.o case with _ZGV var and e.g. in passes.c get rid of
> _Z41__static_initialization_and_destruction_0ii ctor

Really?  We have

extern vnull vNULL;

thus all = vNULL initializers are really copies of an instance in vec.c.

This patch is nevertheless fine.

> 2) another patch to remove the useless = vNULL initializers from file scope
> vecs, we don't usually = 0 static vars either (though, apparently we have
> hundreds of those)

Ok.

> 3) a patch that attempts to clarify when = vNULL is needed and when it is
> not

Ok as well.

> > Maybe we can change vNULL to sth that avoids the constructor, if only
> > if C++14 is available (thus in stage2+)?
> > 
> > For cases like this I hope we could make GCC optimize away the static
> > construction, maybe you can reduce it to a simple testcase and file
> > a bugreport?  It will require making static constructors "first class"
> > in the cgraph so we know at some point the function body initializing
> > a specific global and some later cgraph phase builds up the global
> > constructor calling all remaining relevant individual constructor
> > functions (which can then still be inlined into that fn).
> 
> A short testcase is e.g.
> struct S { int a; };
> void baz (S *);
> #if __cpp_constexpr >= 200704
> constexpr
> #endif
> inline S foo () { return (S) { 2 }; }
> 
> S s = foo ();
> 
> void
> bar ()
> {
>   static S t = foo ();
>   baz (&t);
> }
> Here for -std=c++98 (or even -std=c++1z without the constexpr keyword)
> at -O0 there is both _Z41__static_initialization_and_destruction_0ii and
> the __cxa_guard*+_ZGV stuff, at -O2 we manage to inline
> _Z41__static_initialization_and_destruction_0ii but still end up with
> _GLOBAL__sub_I_s:
>         movl    $2, s(%rip)
>         ret
> registered in .ctors.

Yeah, I think this is where we could improve.

>  And the guard stuff is left too.
> Optimizing that away would be nice, but non-fun, we'd have to recognize
> something like:
> 
>   static struct S t;
>   unsigned char _1;
>   int _2;
> 
>   <bb 2>:
>   _1 = __atomic_load_1 (&_ZGVZ3barvE1t, 2);
>   if (_1 == 0)
>     goto <bb 4>;
>   else
>     goto <bb 3>;
> 
>   <bb 3>:
>   goto <bb 6>;
> 
>   <bb 4>:
>   _2 = __cxa_guard_acquire (&_ZGVZ3barvE1t);
>   if (_2 != 0)
>     goto <bb 5>;
>   else
>     goto <bb 3>;
> 
>   <bb 5>:
>   MEM[(struct S *)&t] = 2;
>   __cxa_guard_release (&_ZGVZ3barvE1t);
> 
>   <bb 6>:

Ick - I totally forgot about the local static initializer woes
with guards.  _Maybe_ we can represent this by putting the guard
into the initializer function (remember, I'd like to see all
static initializers be represented as functions, refered to
from their initializing varpool node as, say, ->initializer).
Thus we'd have the FE generate

bar ()
{
  __static_init_for_t ();
...
}

__static_init_for_t ()
{
  all the guard stuff plus the init
}

the initializers would be always-inline (at IPA time when optimizing
to be able to do the promotion to a static ctor).

> and turn those stores into the static var into a static initializer
> and throw away the _ZGV* var, plus if this is in an inline function
> (i.e. the static var and its _ZGV* var are comdat) it might be an
> ABI thing too (unfortunately the _ZGV* var and the actual var it is guarding
> aren't in the same comdat, so even a trick like keeping the runtime
> initialization in, but initialize the var to the expected values and
> preinitialize the _ZGV* guard into the "already initialized" state wouldn't
> work reliably, if the linker picks the _ZGV* var from one TU and the guarded
> var from another one, it could fail).  So, I'm afraid if we want to do this
> optimization, we'd need to limit it to the cases where the _ZGV* var isn't
> visible outside of the TU.
> 
> Ok for the 3 patches (or some subset of them) for trunk if they pass
> bootstrap/regtest?

Ok.

Richard.

Reply via email to