On Fri, 27 Sep 2024, Jakub Jelinek wrote: > On Fri, Sep 27, 2024 at 08:16:43AM +0200, Richard Biener wrote: > > > __attribute__((noinline)) > > > struct ref_proxy f () > > > { > > > struct ref_proxy ptr; > > > struct ref_proxy D.10036; > > > struct ref_proxy type; > > > struct ref_proxy type; > > > struct qual_option D.10031; > > > struct ref_proxy D.10030; > > > struct qual_option inner; > > > struct variant t; > > > struct variant D.10026; > > > struct variant D.10024; > > > struct inplace_ref D.10023; > > > struct inplace_ref ptr; > > > struct ref_proxy D.9898; > > > > > > <bb 2> [local count: 1073741824]: > > > MEM <char[40]> [(struct variant *)&D.10024] = {}; > > > > Without actually checking it might be that SRA chokes on the above. > > The IL is basically a huge chain of aggregate copies interspersed > > with clobbers and occasional scalar inits and I fear that we really > > only have SRA dealing with this. > > True. > > > Is there any reason to use the char[40] init instead of a aggregate > > {} init of type variant? > > It is dse1 which introduces that: > - D.10137 = {}; > + MEM <char[40]> [(struct variant *)&D.10137] = {}; > in particular maybe_trim_constructor_store. > > So, if SRA chokes on it, it better be fixed to deal with that, > DSE can create that any time.
Agreed, it might be non-trivial though. > Though, not sure how to differentiate that from the actual C++ zero > initialization where it is supposed to clear also padding bits if any. > I think a CONSTRUCTOR flag for that would be best, though e.g. in GIMPLE > representing such clears including padding bits with > MEM <char[sizeof (struct whatever)]> [(struct whatever *)&D.whatever] = {}; > might be an option too. But then how to represent the DSE constructor > trimming such that it is clear that it still doesn't initialize the padding > bits? > Anyway, even if padding bits are zero initialized, if SRA could see that > nothing really inspects those padding bits, it would be nice to still optimize > it. > > That said, it is > a union of a struct with 5 pointers (i.e. 40 bytes) and an empty struct > (1 byte, with padding) followed by size_t which_ field (the = 2 store). > > And, I believe when not constexpr evaluating this, there actually is no > clearing before the = 2; store, > void > eggs::variants::detail::_storage<eggs::variants::detail::pack<eggs::variants::detail::empty, > option_1, option_2>, true, true>::_storage<2, option_2> (struct _storage * > const thi > s, struct index which, struct option_2 & args#0) > { > struct index D.10676; > > *this = {CLOBBER(bob)}; > { > _1 = &this->D.9542; > > eggs::variants::detail::_union<eggs::variants::detail::pack<eggs::variants::detail::empty, > option_1, option_2>, true>::_union<2, option_2> (_1, D.10676, args#0); > this->_which = 2; > } > } > and the call in there is just 3 nested inline calls which do some clobbers > too, take address of something and call another inline and in the end > it is just a clobber and nothing else. > > So, another thing to look at is whether the CONSTRUCTOR is > CONSTRUCTOR_NO_CLEARING or not and if that is ok, and/or whether > the gimplification is correct in that case (say if it would be > struct S { union U { struct T { void *a, *b, *c, *d, *e; } t; struct V {} v; > } u; unsigned long w; }; > void bar (S *); > > void > foo () > { > S s = { .u = { .v = {} }, .w = 2 }; > bar (&s); > } > why do we expand it as > s = {}; > s.w = 2; > when just > s.w = 2; > or maybe > s.u.v = {}; > s.w = 2; > would be enough. Because when the large union has just a small member > (in this case empty struct) active, clearing the whole union is really a > waste of time. > > > I would suggest to open a bugreport. > > Yes. I can investigate a bit when there's a testcase showing the issue. Richard.