http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59835

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kyukhin at gcc dot gnu.org,
                   |                            |rth at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Regardless of whether there is an LRA bug or not, I'd say the *k<logic><mode>, maybe kortestzhi and kortestchi, and definitely kunpckhi patterns look problematic to me.

As shown by the testcase, kunpckhi can very well be matched by the combiner, but the pattern doesn't have any GPR constraints. If, as in this testcase, the result isn't used as the mask of any AVX512F vector insn, and it is not the case that the input is loaded from memory and the result immediately stored into memory, I'd say reloading it into mask registers and back can't be cheap. Can't the kunpckhi constraints be "=Yk,Q", "Yk,Q", "Yk,0", with the GPR alternative just emitting "mov{b}\t{%1, %h0|%h0, %1}"? (This could of course still be limited to TARGET_AVX512F, as it is now.)

As for kortest[cz]hi, I don't know whether the combiner can actually match them.

And for *k<logic><mode>, my issue with that pattern is that it doesn't have (clobber CC), so in theory it could be matched pre-RA by something and would then force RA to choose the mask registers over something perhaps cheaper. I wonder if the pattern can't be limited to reload_completed, and perhaps there could be a splitter that splits, post-reload, the any_logic SWI12 operation with (clobber CC) into the non-(clobber CC) variant if the operands are Yk.
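
For illustration only (this is not the testcase attached to the bug), here is a hypothetical C function of the general shape under discussion. It assumes that kunpckhi describes "shift one operand left by 8 and OR in a zero-extended byte", which is what the suggested mov{b} into %h0 implies. The name pack_bytes is made up for this sketch. Whether the combiner actually picks the mask-register pattern for code like this depends on the GCC revision and flags.

```c
/* Hypothetical example: purely scalar code whose result is never used
   as an AVX512F mask.  The expression has the form
   (hi << 8) | zero_extend (lo), i.e. the shape the kunpckhi pattern
   is assumed to describe.  */
unsigned short
pack_bytes (unsigned short hi, unsigned char lo)
{
  /* The cast truncates the promoted int result back to 16 bits.  */
  return (unsigned short) ((hi << 8) | lo);
}
```

With a GPR alternative in the pattern, code like this could stay in general registers as a single byte move into the high byte of the destination instead of round-tripping through a mask register.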