On Wed, Mar 22, 2023 at 4:57 PM Andrew Stubbs <a...@codesourcery.com> wrote:
>
> On 22/03/2023 13:56, Richard Biener wrote:
> >> Basically, the -ffast-math instructions will always be the fastest way,
> >> but the goal is that the default optimization shouldn't just disable
> >> vectorization entirely for any loop that has a divide in it.
> >
> > We try to express division as multiplication, but yes, I think there's
> > currently no way to tell the vectorizer that vectorized division is
> > available as libcall (nor for any other arithmetic operator that is not
> > a call in the first place).
>
> I have considered creating a new builtin code, similar to the libm
> functions, that would be enabled by a backend hook, or maybe just if
> TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION doesn't return NULL. The
> vectorizer would then use that, somehow. To treat it just like any other
> builtin it would have to be set before the vectorizer pass encounters
> it, which is probably not ideal for all the other passes that want to
> handle divide operators. Alternatively, the vectorizable_operation
> function could detect and introduce the builtin where appropriate.
>
> Would this be acceptable, or am I wasting my time planning something
> that would get rejected?

So why not make it possible for the target to specify that there's a
libcall for a specific optab, so the vectorizer would simply use
vectorized {TRUNC_DIV,RDIV}_EXPR but the RTL expander would emit a
libcall (in libgcc ways, thus divv2df3 or so)?  It feels wrong to add
some secondary machinery here (like, for example, having .RDIV internal
function calls instead of a / operator).

I think that for standard unops and binops that would be the default
behavior already.  The only piece missing is on the vectorizer side: it
looks for a CODE_FOR_* optab handler, and there's currently no way to
say "yes, I have a libcall fallback" or "no, no libcall fallback
available", nor for a target to specify those (maybe add a
(define_libcall ...) alongside (define_expand ...)?).

A shortcut would be a new target hook that specifies libcall
availability, provided the libcall emission works.  There's the
remaining question of whether the libcall emission code works well
enough for vector types, in case the ABI for libcalls doesn't match the
ABI for regular calls.

Richard.

> > Thanks
> >
> > Andrew
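
To make the intended end result concrete, here is a rough hand-written
sketch of what "vectorize the divide, then expand it to a libcall"
could amount to for a simple loop.  It is only an illustration under
assumptions: hypothetical_divv2df3 is a made-up stand-in named after
the "divv2df3 or so" guess above (no such libgcc routine is implied to
exist), the v2df type uses the GCC vector extension, and the real
transformation would happen inside the vectorizer and RTL expander, not
in user code.

```c
#include <assert.h>
#include <stddef.h>

/* Two packed doubles, via the GCC/Clang vector extension.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* HYPOTHETICAL libcall stand-in.  A real routine would be
   target-specific code; here it is plain per-lane division.  */
v2df
hypothetical_divv2df3 (v2df a, v2df b)
{
  v2df r;
  r[0] = a[0] / b[0];
  r[1] = a[1] / b[1];
  return r;
}

/* The loop as written by the user: one scalar divide per element.  */
void
div_scalar (double *out, const double *a, const double *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] / b[i];
}

/* Roughly the shape the loop would take if the divide stayed a vector
   RDIV_EXPR and was expanded as a call: two elements per iteration go
   through the (hypothetical) libcall, the rest through a scalar
   epilogue.  */
void
div_via_libcall (double *out, const double *a, const double *b, size_t n)
{
  assert (out != NULL && a != NULL && b != NULL);

  size_t i = 0;
  for (; i + 2 <= n; i += 2)
    {
      v2df va = { a[i], a[i + 1] };
      v2df vb = { b[i], b[i + 1] };
      v2df vr = hypothetical_divv2df3 (va, vb);
      out[i] = vr[0];
      out[i + 1] = vr[1];
    }

  /* Scalar epilogue for a trailing odd element.  */
  for (; i < n; i++)
    out[i] = a[i] / b[i];
}
```

Note that the sketch passes v2df by value through the normal calling
convention; whether a libcall emitted by the expander would use the
same ABI for vector arguments is exactly the open question raised
above.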