Hi Jakub/Matthew,

>> 1) Should we still implement the libatomic functions?  And still with the
>> unsigned/signed distinction and all sizes?
>> - I'd expect so, mostly for the `-fno-inline-atomics` flag.
>
> Do you really need that?  Can't you just emit a CAS loop for the non-inline
> atomics?  Because the function explosion will be there again on the
> libatomic side.  Or at least don't add such symbols on arches which are
> never going to benefit from those right now (i.e. all the implementations
> would be CAS loops anyway).  Some function explosion can be limited by only
> having the __atomic_fetch_{min,max}_{s,u}{1,2,4,8,16} variants and not
> the {min,max}_fetch ones, because that case can be implemented on the caller
> side by just returning the DESIRED argument separately.

Yes, I don't see the benefit of adding more functions to libatomic.

Expanding new operations (integer or FP) in the mid-end as CAS loops seems
best when the target doesn't define an expander for the operation. Calling
libatomic gains nothing: it just adds complexity and overhead that is best
avoided. My goal is to always inline atomic operations on AArch64 and never
call libatomic.
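
For illustration, here is roughly what such a CAS-loop expansion could look
like for a signed 4-byte fetch_min, written at the C level with the existing
__atomic builtins. This is only a sketch: the names fetch_min_s4 and
min_fetch_s4 are hypothetical (not existing builtins or libatomic symbols),
and the memory order is fixed to seq_cst for brevity. It also shows how the
min_fetch form can be derived on the caller side from the fetch_min result.

```c
#include <stdint.h>

/* Hypothetical helper: atomically store min (*ptr, desired) and return the
   value *ptr held beforehand.  Memory order fixed to seq_cst (assumption).  */
static inline int32_t
fetch_min_s4 (int32_t *ptr, int32_t desired)
{
  int32_t old = __atomic_load_n (ptr, __ATOMIC_RELAXED);
  int32_t new_val;

  do
    /* Recompute on every iteration: a failed CAS refreshes 'old'.  */
    new_val = desired < old ? desired : old;
  while (!__atomic_compare_exchange_n (ptr, &old, new_val, /*weak=*/1,
                                       __ATOMIC_SEQ_CST, __ATOMIC_RELAXED));

  return old;
}

/* Hypothetical helper: the min_fetch variant needs no separate loop, since
   the new value is just min (old, desired) computed by the caller.  */
static inline int32_t
min_fetch_s4 (int32_t *ptr, int32_t desired)
{
  int32_t old = fetch_min_s4 (ptr, desired);
  return desired < old ? desired : old;
}
```

A target with a native atomic min/max instruction would define an expander
and skip the loop entirely; everything else gets the loop inline, with no
call into libatomic.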

>> 2) Earlier you floated the idea of using an internal function to encode the
>> operation through the rest of the compiler (outside of the frontend).  Does
>> that approach still seem good to you?
>
> There are 2 options.
> Lower the type-generic builtin into a CAS loop and pattern recognize it at
> some late time (e.g. the widening_mul pass, certainly after IPA) into an IFN
> if the corresponding optab is supported.
> Or lower the type-generic builtin into IFN (ifns can have the min vs. max
> argument and derive size and sign from the DESIRED argument) and at some
> perhaps early (before IPA) point - forwprop? - pattern match a CAS loop into
> the IFN too and then ideally shortly after IPA lower the IFN back into a CAS
> loop if optab doesn't exist.
> The reason for the pre vs. post-IPA is OpenMP/OpenACC, before IPA you don't
> always know what the backend will be.

Expanding builtins early and then pattern matching them again feels like adding
complexity without good reason... Why not ask the backend whether it supports
the builtin/IFN before using a generic mid-end expansion?

Cheers,
Wilco
