http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59835

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kyukhin at gcc dot gnu.org,
                   |                            |rth at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Regardless of whether there is an LRA bug or not, I'd say the *k<logic><mode>,
maybe kortestzhi, kortestchi, and definitely kunpckhi patterns look problematic
to me.
As shown by the testcase, kunpckhi can very well be matched by the combiner,
but the pattern doesn't have any GPR constraints.  If, as in this testcase,
the result isn't used as the mask of any AVX512F vector insn (nor is, say, the
input loaded from memory and the result immediately stored into memory), I'd
say reloading it into mask registers and back can't be cheap.  Can't the
kunpckhi constraints be "=Yk,Q", "Yk,Q", "Yk,0", with the pattern just emitting
"mov{b}\t{%1, %h0|%h0, %1}" for the Q alternative (this could of course still
be limited to TARGET_AVX512F, as it is now)?
As for kortest[cz]hi, dunno whether the combiner can actually match them.  For
*k<logic><mode>, my issue is that the pattern doesn't have a (clobber CC)
and in theory could be matched pre-RA by something, which would then force RA
to choose the mask registers over something perhaps cheaper.
I wonder whether the pattern could be limited to reload_completed, with perhaps
a splitter that splits, post-reload, the any_logic SWI12 operation with
(clobber CC) into the non-(clobber CC) variant if the operands are Yk.
