On Fri, Sep 27, 2024 at 08:16:43AM +0200, Richard Biener wrote: > > __attribute__((noinline)) > > struct ref_proxy f () > > { > > struct ref_proxy ptr; > > struct ref_proxy D.10036; > > struct ref_proxy type; > > struct ref_proxy type; > > struct qual_option D.10031; > > struct ref_proxy D.10030; > > struct qual_option inner; > > struct variant t; > > struct variant D.10026; > > struct variant D.10024; > > struct inplace_ref D.10023; > > struct inplace_ref ptr; > > struct ref_proxy D.9898; > > > > <bb 2> [local count: 1073741824]: > > MEM <char[40]> [(struct variant *)&D.10024] = {}; > > Without actually checking it might be that SRA chokes on the above. > The IL is basically a huge chain of aggregate copies interspersed > with clobbers and occasional scalar inits and I fear that we really > only have SRA dealing with this.
True. > Is there any reason to use the char[40] init instead of a aggregate > {} init of type variant? It is dse1 which introduces that: - D.10137 = {}; + MEM <char[40]> [(struct variant *)&D.10137] = {}; in particular maybe_trim_constructor_store. So, if SRA chokes on it, it better be fixed to deal with that, DSE can create that any time. Though, not sure how to differentiate that from the actual C++ zero initialization where it is supposed to clear also padding bits if any. I think a CONSTRUCTOR flag for that would be best, though e.g. in GIMPLE representing such clears including padding bits with MEM <char[sizeof (struct whatever)]> [(struct whatever *)&D.whatever] = {}; might be an option too. But then how to represent the DSE constructor trimming such that it is clear that it still doesn't initialize the padding bits? Anyway, even if padding bits are zero initialized, if SRA could see that nothing really inspects those padding bits, it would be nice to still optimize it. That said, it is a union of a struct with 5 pointers (i.e. 40 bytes) and an empty struct (1 byte, with padding) followed by size_t which_ field (the = 2 store). And, I believe when not constexpr evaluating this, there actually is no clearing before the = 2; store, void eggs::variants::detail::_storage<eggs::variants::detail::pack<eggs::variants::detail::empty, option_1, option_2>, true, true>::_storage<2, option_2> (struct _storage * const thi s, struct index which, struct option_2 & args#0) { struct index D.10676; *this = {CLOBBER(bob)}; { _1 = &this->D.9542; eggs::variants::detail::_union<eggs::variants::detail::pack<eggs::variants::detail::empty, option_1, option_2>, true>::_union<2, option_2> (_1, D.10676, args#0); this->_which = 2; } } and the call in there is just 3 nested inline calls which do some clobbers too, take address of something and call another inline and in the end it is just a clobber and nothing else. So, another thing to look at is whether the CONSTRUCTOR is CONSTRUCTOR_NO_CLEARING or not and if that is ok, and/or whether the gimplification is correct in that case (say if it would be struct S { union U { struct T { void *a, *b, *c, *d, *e; } t; struct V {} v; } u; unsigned long w; }; void bar (S *); void foo () { S s = { .u = { .v = {} }, .w = 2 }; bar (&s); } why do we expand it as s = {}; s.w = 2; when just s.w = 2; or maybe s.u.v = {}; s.w = 2; would be enough. Because when the large union has just a small member (in this case empty struct) active, clearing the whole union is really a waste of time. > I would suggest to open a bugreport. Yes. Jakub