On Tue, 12 Jan 2021, Richard Biener wrote:

> >>> The point of the match.pd changes is to canonicalize GIMPLE on some form
> >>> when there are several from GIMPLE POV equivalent or better forms of
> >>> writing
> >>> the same thing.  The advantage of having one canonical way is that ICF,
> >>> SCCVN etc. optimizations can then understand the different forms are
> >>> equivalent.
> >> Fair enough, though in cases like this I think it is unclear which of the
> >> two forms is going to be ultimately better, especially as it may depend on
> >> the exact form of the operands used, e.g. values of any immediates, so I
> >> think a way to make the reverse transformation (whether to undo one made
> >> here or genuinely) needs to be available at a later compilation stage.
> >> One size doesn't fit all.
> >>
> >>  With this in mind...
> > So in this case the number of operations are the same before/after and
> > parallelism is the same before/after, register lifetimes, etc.   I doubt
> > either form is particularly better suited for CSE or gives better VRP
> > data, etc.   The fact that we can't always do ~(X +C) -> ~X + -C
> > probably argues against that form ever so slightly.

 FWIW I agree with Jakub here, that having one canonical form for the 
middle end to operate on is advantageous.  It is just that when we 
eventually get to the backend we may want to do the reverse transformation 
in some cases, which may be specific immediate operand values or whatever 
the backend may see fit.

> >>> If another form is then better for a particular machine, it should be done
> >>> either during expansion (being able to produce both RTLs and computing
> >>> their
> >>> costs), or during combine with either combine splitters or
> >>> define_insn_and_split in the backend, or, if it can't be done in RTL,
> >>> during
> >>> the isel pass.
> >> Hmm, maybe it has been discussed before, so please forgive me if I write
> >> something silly, but it seems to me like this should be done in a generic
> >> way like match.pd so that all the targets do not have to track the changes
> >> made there and then perhaps repeat the same or similar code each.  So I
> >> think it would make sense to make a change like this include that reverse
> >> transformation as well, so that ultimately both forms are tried with RTL,
> >> as there is no clear advantage to either here.
> > The idea we've kicked around in the past was to use the same syntax as
> > match.pd, but have it be target dependent to reform expressions in ways
> > that are beneficial to the target and have it run at the end of the
> > gimple/ssa pipeline.  Nobody's implemented this though.

 Hmm, but why does it have to be target dependent?  For match.pd we do 
things unconditionally, to have a uniform intermediate representation, 
however here we wouldn't have to, as we can check the costs respectively 
of the original and the transformed expression and choose the cheaper of 
the two.  Would that be so we don't waste cycles with targets we know 
beforehand a given transformation won't buy anything?

 In that case however no code quality regression would happen anyway, so I 
think it would be more productive if we still had all transformations 
defined in a generic manner and then possibly the hopeless ones excluded 
by hand for targets listed.  This way if anything is omitted by chance, 
i.e. not excluded for a given target, then good code will still be 
produced and only some compilation performance lost.

 While if we require all port maintainers to qualify individual 
transformations by hand as they are added by someone to their pet target, 
we'll end up with a lot of duplicate effort and missed bits.  Of course 
some very exotic transformations that match unique target machine 
instructions may indeed best be added to a single target only.

> Yes.  And while a gimple-to-gimple transform is indeed quite simple
> to make eventually a match.pd-like gimple-to-RTL would be more
> useful in the end.  Note RTL can eventually be emulated close enough
> via the use of internal functions mapping to optabs.  But still
> complex combined instructions will need expander help unless we
> expose all named instruction RTL patterns as target specific
> internal functions to use from that .pd file.

 Hmm, why aren't the standard named patterns we already have going to be 
sufficient?

  Maciej

Reply via email to