On Tue, Nov 04, 2025 at 02:54:44PM +0000, Wilco Dijkstra wrote:
> > There are 2 options.
> > Lower the type-generic builtin into a CAS loop and pattern recognize it at
> > some late time (e.g. the widening_mul pass, certainly after IPA) into an IFN
> > if the corresponding optab is supported.
> > Or lower the type-generic builtin into IFN (ifns can have the min vs. max
> > argument and derive size and sign from the DESIRED argument) and at some
> > perhaps early (before IPA) point - forwprop? - pattern match a CAS loop into
> > the IFN too and then ideally shortly after IPA lower the IFN back into a CAS
> > loop if optab doesn't exist.
> > The reason for the pre vs. post-IPA is OpenMP/OpenACC, before IPA you don't
> > always know what the backend will be.
>
> Expanding builtins early and then pattern matching them again feels like
> adding complexity without good reason... Why not ask the backend whether it
> supports the builtin/IFN before using a generic mid-end expansion?
Because before IPA, with OpenMP/OpenACC, the code could end up purely in
host code (in which case checking the optab is fine), purely in offload
target code, or in both. In the latter two cases querying optabs before
IPA is wrong: you would be asking whether e.g. aarch64 can handle the
operation, while the code may end up being compiled for nvptx or amdgcn,
and the answer depends on that.
So optabs are better queried after IPA.
So either you keep the code as an IFN (for the lowered builtin) or as a
CAS loop (when the user writes it that way) until after IPA, and then,
based on the optab, either pattern match the CAS loop into the IFN or
lower the IFN to a CAS loop; or you make the IL from gimplification until
after IPA always use the CAS loop. In any case you need to be able to
pattern match the CAS loop at some point, and to lower either the builtin
or the IFN to a CAS loop.
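
For reference, the CAS loop form being talked about would look roughly like
the following for a signed int fetch-min. This is only a sketch in plain
C11: the name fetch_min_cas is made up for illustration and is not a
proposed API, and the seq_cst ordering is just an assumption to keep the
example short.

```c
#include <stdatomic.h>

/* Hypothetical helper (illustration only): atomic fetch-min written as a
   compare-and-swap loop.  Returns the value *PTR held before the update.  */
static int
fetch_min_cas (_Atomic int *ptr, int val)
{
  int old = atomic_load (ptr);
  int desired;

  do
    /* Compute the minimum of the last value seen and VAL.  */
    desired = old < val ? old : val;
  /* On failure the CAS stores the current value of *PTR into OLD, so the
     next iteration recomputes DESIRED from fresh data.  */
  while (!atomic_compare_exchange_weak (ptr, &old, desired));

  return old;
}
```

A loop of this shape is what would have to be recognized and turned into
the IFN when the optab exists, and it is also what the IFN or builtin
would be lowered to when it doesn't.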
Jakub