On Thu, 14 Jan 2021, Maciej W. Rozycki wrote: > On Tue, 12 Jan 2021, Richard Biener wrote: > > > >>> The point of the match.pd changes is to canonicalize GIMPLE on some form > > >>> when there are several from GIMPLE POV equivalent or better forms of > > >>> writing > > >>> the same thing. The advantage of having one canonical way is that ICF, > > >>> SCCVN etc. optimizations can then understand the different forms are > > >>> equivalent. > > >> Fair enough, though in cases like this I think it is unclear which of the > > >> two forms is going to be ultimately better, especially as it may depend > > >> on > > >> the exact form of the operands used, e.g. values of any immediates, so I > > >> think a way to make the reverse transformation (whether to undo one made > > >> here or genuinely) needs to be available at a later compilation stage. > > >> One size doesn't fit all. > > >> > > >> With this in mind... > > > So in this case the number of operations are the same before/after and > > > parallelism is the same before/after, register lifetimes, etc. I doubt > > > either form is particularly better suited for CSE or gives better VRP > > > data, etc. The fact that we can't always do ~(X +C) -> ~X + -C > > > probably argues against that form ever so slightly. > > FWIW I agree with Jakub here, that having one canonical form for the > middle end to operate on is advantageous. It is just that when we > eventually get to the backend we may want to do the reverse transformation > in some cases, which may be specific immediate operand values or whatever > the backend may see fit. > > > >>> If another form is then better for a particular machine, it should be > > >>> done > > >>> either during expansion (being able to produce both RTLs and computing > > >>> their > > >>> costs), or during combine with either combine splitters or > > >>> define_insn_and_split in the backend, or, if it can't be done in RTL, > > >>> during > > >>> the isel pass. > > >> Hmm, maybe it has been discussed before, so please forgive me if I write > > >> something silly, but it seems to me like this should be done in a generic > > >> way like match.pd so that all the targets do not have to track the > > >> changes > > >> made there and then perhaps repeat the same or similar code each. So I > > >> think it would make sense to make a change like this include that reverse > > >> transformation as well, so that ultimately both forms are tried with RTL, > > >> as there is no clear advantage to either here. > > > The idea we've kicked around in the past was to use the same syntax as > > > match.pd, but have it be target dependent to reform expressions in ways > > > that are beneficial to the target and have it run at the end of the > > > gimple/ssa pipeline. Nobody's implemented this though. > > Hmm, but why does it have to be target dependent? For match.pd we do > things unconditionally, to have a uniform intermediate representation, > however here we wouldn't have to, as we can check the costs respectively > of the original and the transformed expression and choose the cheaper of > the two. Would that be so we don't waste cycles with targets we know > beforehand a given transformation won't buy anything? > > In that case however no code quality regression would happen anyway, so I > think it would be more productive if we still had all transformations > defined in a generic manner and then possibly the hopeless ones excluded > by hand for targets listed. This way if anything is omitted by chance, > i.e. not excluded for a given target, then good code will still be > produced and only some compilation performance lost. > > While if we require all port maintainers to qualify individual > transformations by hand as they are added by someone to their pet target, > we'll end up with a lot of duplicate effort and missed bits. Of course > some very exotic transformations that match unique target machine > instructions may indeed best be added to a single target only. > > > Yes. And while a gimple-to-gimple transform is indeed quite simple > > to make eventually a match.pd-like gimple-to-RTL would be more > > useful in the end. Note RTL can eventually be emulated close enough > > via the use of internal functions mapping to optabs. But still > > complex combined instructions will need expander help unless we > > expose all named instruction RTL patterns as target specific > > internal functions to use from that .pd file. > > Hmm, why aren't the standard named patterns we already have going to be > sufficient?
Because two standard name pattern expansions can later be combined to a non-standard define-insn which may have lower cost and we'd of course like to see such instruction selection opportunities here. Richard.