On Fri, 27 Sep 2024, Jakub Jelinek wrote:

> On Fri, Sep 27, 2024 at 08:16:43AM +0200, Richard Biener wrote:
> > > __attribute__((noinline))
> > > struct ref_proxy f ()
> > > {
> > >    struct ref_proxy ptr;
> > >    struct ref_proxy D.10036;
> > >    struct ref_proxy type;
> > >    struct ref_proxy type;
> > >    struct qual_option D.10031;
> > >    struct ref_proxy D.10030;
> > >    struct qual_option inner;
> > >    struct variant t;
> > >    struct variant D.10026;
> > >    struct variant D.10024;
> > >    struct inplace_ref D.10023;
> > >    struct inplace_ref ptr;
> > >    struct ref_proxy D.9898;
> > > 
> > >    <bb 2> [local count: 1073741824]:
> > >    MEM <char[40]> [(struct variant *)&D.10024] = {};
> > 
> > Without actually checking it might be that SRA chokes on the above.
> > The IL is basically a huge chain of aggregate copies interspersed
> > with clobbers and occasional scalar inits and I fear that we really
> > only have SRA dealing with this.
> 
> True.
> 
> > Is there any reason to use the char[40] init instead of a aggregate
> > {} init of type variant?
> 
> It is dse1 which introduces that:
> -  D.10137 = {};
> +  MEM <char[40]> [(struct variant *)&D.10137] = {};
> in particular maybe_trim_constructor_store.
> 
> So, if SRA chokes on it, it better be fixed to deal with that,
> DSE can create that any time.

Agreed, it might be non-trivial though.

> Though, not sure how to differentiate that from the actual C++ zero
> initialization where it is supposed to clear also padding bits if any.
> I think a CONSTRUCTOR flag for that would be best, though e.g. in GIMPLE
> representing such clears including padding bits with
> MEM <char[sizeof (struct whatever)]> [(struct whatever *)&D.whatever] = {};
> might be an option too.  But then how to represent the DSE constructor
> trimming such that it is clear that it still doesn't initialize the padding
> bits?
> Anyway, even if padding bits are zero initialized, if SRA could see that
> nothing really inspects those padding bits, it would be nice to still optimize
> it.
>
> That said, it is
> a union of a struct with 5 pointers (i.e. 40 bytes) and an empty struct
> (1 byte, with padding) followed by size_t which_ field (the = 2 store).
> 
> And, I believe when not constexpr evaluating this, there actually is no
> clearing before the = 2; store,
> void 
> eggs::variants::detail::_storage<eggs::variants::detail::pack<eggs::variants::detail::empty,
>  option_1, option_2>, true, true>::_storage<2, option_2> (struct _storage * 
> const thi
> s, struct index which, struct option_2 & args#0)
> {
>   struct index D.10676;
> 
>   *this = {CLOBBER(bob)};
>   {
>     _1 = &this->D.9542;
>     
> eggs::variants::detail::_union<eggs::variants::detail::pack<eggs::variants::detail::empty,
>  option_1, option_2>, true>::_union<2, option_2> (_1, D.10676, args#0);
>     this->_which = 2;
>   }
> }
> and the call in there is just 3 nested inline calls which do some clobbers
> too, take address of something and call another inline and in the end
> it is just a clobber and nothing else.
> 
> So, another thing to look at is whether the CONSTRUCTOR is
> CONSTRUCTOR_NO_CLEARING or not and if that is ok, and/or whether
> the gimplification is correct in that case (say if it would be
> struct S { union U { struct T { void *a, *b, *c, *d, *e; } t; struct V {} v; 
> } u; unsigned long w; };
> void bar (S *);
> 
> void
> foo ()
> {
>   S s = { .u = { .v = {} }, .w = 2 };
>   bar (&s);
> }
> why do we expand it as
>       s = {};
>       s.w = 2;
> when just
>       s.w = 2;
> or maybe
>       s.u.v = {};
>       s.w = 2;
> would be enough.  Because when the large union has just a small member
> (in this case empty struct) active, clearing the whole union is really a
> waste of time.
> 
> > I would suggest to open a bugreport.
> 
> Yes.

I can investigate a bit when there's a testcase showing the issue.

Richard.

Reply via email to