https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598

--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Sat, 9 Jan 2021, jiangning.liu at amperecomputing dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598
> 
> --- Comment #7 from Jiangning Liu <jiangning.liu at amperecomputing dot com> ---
> (In reply to rguent...@suse.de from comment #6)
> > On January 9, 2021 4:17:17 AM GMT+01:00, "jiangning.liu at amperecomputing
> > dot com" <gcc-bugzi...@gcc.gnu.org> wrote:
> > >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598
> > >
> > >--- Comment #5 from Jiangning Liu <jiangning.liu at amperecomputing dot
> > >com> ---
> > >> It has to be done with care of course, cost modeling is difficult
> > >> (we need to have a good estimate of n and m or need to version
> > >> the whole nest).  That said, usually we attempt the reverse
> > >> transform.
> > >
> > >Before tuning the cost model good enough, we may implement this
> > >optimization by adding a new optimization command line option.
> > >This won't hurt gcc, right?
> > 
> > New options not enabled by default tend to bitrot, be broken from the start
> > and won't be used by the lazy user. So I see no point in doing that. 
> > 
> 
> Understand. I mean we can enable it by default eventually, but we need to
> implement and tune it step by step. It is unrealistic to work out the best
> cost model at the very beginning.

Sure.  The "easiest" thing is to rely on a profile from PGO; we have had
some transforms that are enabled by default only with -fprofile-use.
That is, the cost model needs to be conservative, especially if you
introduce dynamic allocation for this.  In the end I guess only a variant
that versions the nest on the size of the temporary will be good enough
to avoid triggering OOM for large sizes or excessive overhead for small
sizes anyway.
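
To make the "version the nest on the size of the temporary" idea concrete,
here is a hand-written C sketch of what such versioned code might look
like.  Everything in it is an assumption for illustration: the node_t
type, the function names and the two thresholds are hypothetical and are
not taken from GCC or from the testcase in this PR.  It also assumes the
loaded values are invariant across outer iterations, which the compiler
would have to prove.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical thresholds; a real cost model would have to derive them. */
#define TMP_MIN_ELEMS 16u          /* below this, the copy is not worth it */
#define TMP_MAX_ELEMS (1u << 20)   /* above this, the temporary is too big */

/* Hypothetical element type with a dependent (pointer-chasing) load. */
typedef struct { int *p; } node_t;

/* Original nest: every inner iteration loads a[j], then ->p, then *p. */
long sum_plain(node_t **a, size_t n, size_t m)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            s += *a[j]->p;
    return s;
}

/* Versioned nest: use a temporary only when its size is in a safe range. */
long sum_versioned(node_t **a, size_t n, size_t m)
{
    if (n > 1 && m >= TMP_MIN_ELEMS && m <= TMP_MAX_ELEMS) {
        int *tmp = malloc(m * sizeof *tmp);
        if (tmp != NULL) {
            /* Do the dependent loads once and cache the results. */
            for (size_t j = 0; j < m; j++)
                tmp[j] = *a[j]->p;

            long s = 0;
            for (size_t i = 0; i < n; i++)
                for (size_t j = 0; j < m; j++)
                    s += tmp[j];   /* contiguous loads, no pointer chasing */

            free(tmp);
            return s;
        }
    }
    /* Too small, too large, or allocation failed: keep the original nest. */
    return sum_plain(a, n, m);
}
```

The runtime check on m bounds both failure modes: small nests skip the
allocation and copy overhead, and large ones never request a temporary
that could exhaust memory.  Falling back when malloc fails keeps the
transformed code no less robust than the original.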
