Hello! Attached patch tries to fix the mess with operations where register allocator is free to use either mask or general registers. There are plenty of problems with this approach: a) missing operations wth general registers - kxnor operation with general register does not exist - kandn operation with general registers exists for TARGET_BMI only
b) register allocation problems - DImode operations with 64bit general registers do not exist on 32bit target - register allocator is free to allocate operation with either register set, resulting in costly moves between general and mask register sets Mask operations can be generated in a consistent way using corresponding built-ins. To prevent RA problems, we have to separate mask ops from general ops - this way we always select operation with well defined register set. There is special problem with 64bit mask ops on 32bit targets: defining these operations as a named pattern, where only mask registers are available, will result in the situation where mask operation will be used for 64bit values living in a pair of general registers. RA will obviously generate many inter-regset moves in this situation. Another problem in point a) above is dependence of general-reg operations on various TARGET_AVX* flags. This can be seen with kxnor pattern, which has to be artificially enabled during register allocation just to split non-masked operation back to sequence of general operations. Similar situation arises with kandn on non-BMI targets. IMO the mentioned interferences do not warrant mixing of mask and general operations. I suspect here will be no end of "the value is moved between integer and mask registers unnecessary" type of bugs (as was the situation on 32bit targets in the past, where DImode values were moved through MMX registers). Users can and should use relevant builtins when operating with mask registers (n.b.: generic logic operations can still be used; RA will move values between regsets - but in a *consistent way*). FWIW, the few folding opportunities with mask builtins can be implemented in a target-folding function. The attached patch also paves the way for implementation of new builtins in patch [1] that are otherwise nearly to impossible to implement. 2016-11-25 Uros Bizjak <ubiz...@gmail.com> * config/i386/i386.md (UNSPEC_KMASKOP): New. (UNSPEC_KMOV): Remove. (kmovw): Expand to plain HImode move. (k<any_logic:code><mode>): Rename from *k<logic><mode>. Use register_operand predicates. Tag pattern with UNSPEC_KMASKOP. Remove corresponding clobber-removing splitter. (*anddi_1): Remove mask register alternatives. (*andsi_1): Ditto. (*andhi_1): Ditto. (*andqi_1): Ditto. (*<any_or:code><mode>_1): Ditto. (*<any_or:code>qi_1): Ditto. (kandn<mode>): Use SWI1248_AVX512BW mode iterator. Remove general register alternatives. Tag pattern with UNSPEC_KMASKOP. Remove corresponding splitter to operation with general registers. (*andn<SWI38:mode>): Rename from *bmi_andn_<mode>. (*andn<SWI12:mode>): New pattern. (*kxnor<mode>): Remove general register alternatives. Tag pattern with UNSPEC_KMASKOP. Remove corresponding splitter to operation with general registers. (knot<mode>): New insn pattern. (*one_cmpl<mode>2_1): Remove mask register alternatives. (one_cmplqi2_1): Ditto. (*k<any_lshift:code><mode>): Rename from *k<mshift><mode>3. Tag pattern with UNSPEC_KMASKOP. Add mode attribute. * config/i386/predicates.md (mask_reg_operand): Remove predicate. * config/i386/sse.md (vec_unpacks_hi_hi): Update pattern to generate kmaskop shift. (vec_unpacks_hi_<mode>): Ditto. * config/i386/i386-builtin.def (__builtin_ia32_kandhi): Use CODE_FOR_kandhi. (__builtin_ia32_knothi): Use CODE_FOR_knothi. (__builtin_ia32_korhi): Use CODE_FOR_kiorhi. (__builtin_ia32_kxorhi): Use CODE_FOR_kxorhi. Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. There are a couple fo trivial scan-asm errors due to a rename and a missing kmov insn, where RA choose much more optimal movw <imm>,<mem> insn instead. [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01088.html Uros.
Index: config/i386/i386-builtin.def =================================================================== --- config/i386/i386-builtin.def (revision 242841) +++ config/i386/i386-builtin.def (working copy) @@ -1436,15 +1436,15 @@ BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_avx512f_roundpd_vec_pack_sfix512, "__builtin_ia32_ceilpd_vec_pack_sfix512", IX86_BUILTIN_CEILPD_VEC_PACK_SFIX512, (enum rtx_code) ROUND_CEIL, (int) V16SI_FTYPE_V8DF_V8DF_ROUND) /* Mask arithmetic operations */ -BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_andhi3, "__builtin_ia32_kandhi", IX86_BUILTIN_KAND16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) +BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kandhi, "__builtin_ia32_kandhi", IX86_BUILTIN_KAND16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kandnhi, "__builtin_ia32_kandnhi", IX86_BUILTIN_KANDN16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) -BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_one_cmplhi2, "__builtin_ia32_knothi", IX86_BUILTIN_KNOT16, UNKNOWN, (int) UHI_FTYPE_UHI) -BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_iorhi3, "__builtin_ia32_korhi", IX86_BUILTIN_KOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) +BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_knothi, "__builtin_ia32_knothi", IX86_BUILTIN_KNOT16, UNKNOWN, (int) UHI_FTYPE_UHI) +BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kiorhi, "__builtin_ia32_korhi", IX86_BUILTIN_KOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kortestchi, "__builtin_ia32_kortestchi", IX86_BUILTIN_KORTESTC16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kortestzhi, "__builtin_ia32_kortestzhi", IX86_BUILTIN_KORTESTZ16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kunpckhi, "__builtin_ia32_kunpckhi", IX86_BUILTIN_KUNPCKBW, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kxnorhi, "__builtin_ia32_kxnorhi", IX86_BUILTIN_KXNOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) -BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_xorhi3, "__builtin_ia32_kxorhi", IX86_BUILTIN_KXOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) +BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kxorhi, "__builtin_ia32_kxorhi", IX86_BUILTIN_KXOR16, UNKNOWN, (int) UHI_FTYPE_UHI_UHI) BDESC (OPTION_MASK_ISA_AVX512F, CODE_FOR_kmovw, "__builtin_ia32_kmov16", IX86_BUILTIN_KMOV16, UNKNOWN, (int) UHI_FTYPE_UHI) /* SHA */ Index: config/i386/i386.md =================================================================== --- config/i386/i386.md (revision 242852) +++ config/i386/i386.md (working copy) @@ -187,7 +187,7 @@ UNSPEC_PEXT ;; For AVX512F support - UNSPEC_KMOV + UNSPEC_KMASKOP UNSPEC_BNDMK UNSPEC_BNDMK_ADDR @@ -2489,18 +2489,10 @@ ] (const_string "SI")))]) -(define_insn "kmovw" - [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k") - (unspec:HI - [(match_operand:HI 1 "nonimmediate_operand" "r,km")] - UNSPEC_KMOV))] - "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))" - "@ - kmovw\t{%k1, %0|%0, %k1} - kmovw\t{%1, %0|%0, %1}"; - [(set_attr "type" "mskmov") - (set_attr "prefix" "vex") - (set_attr "mode" "HI")]) +(define_expand "kmovw" + [(set (match_operand:HI 0 "nonimmediate_operand") + (match_operand:HI 1 "nonimmediate_operand"))] + "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))") (define_insn "*movhi_internal" [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r ,r ,m ,k,k ,r,m") @@ -8069,28 +8061,19 @@ operands[3] = gen_lowpart (QImode, operands[3]); }) -(define_split - [(set (match_operand:SWI1248x 0 "mask_reg_operand") - (any_logic:SWI1248x (match_operand:SWI1248x 1 "mask_reg_operand") - (match_operand:SWI1248x 2 "mask_reg_operand"))) - (clobber (reg:CC FLAGS_REG))] - "TARGET_AVX512F && reload_completed" - [(set (match_dup 0) - (any_logic:SWI1248x (match_dup 1) - (match_dup 2)))]) - -(define_insn "*k<logic><mode>" - [(set (match_operand:SWI1248_AVX512BW 0 "mask_reg_operand" "=k") +(define_insn "k<code><mode>" + [(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k") (any_logic:SWI1248_AVX512BW - (match_operand:SWI1248_AVX512BW 1 "mask_reg_operand" "k") - (match_operand:SWI1248_AVX512BW 2 "mask_reg_operand" "k")))] + (match_operand:SWI1248_AVX512BW 1 "register_operand" "k") + (match_operand:SWI1248_AVX512BW 2 "register_operand" "k"))) + (unspec [(const_int 0)] UNSPEC_KMASKOP)] "TARGET_AVX512F" - { - if (get_attr_mode (insn) == MODE_HI) - return "k<logic>w\t{%2, %1, %0|%0, %1, %2}"; - else - return "k<logic><mskmodesuffix>\t{%2, %1, %0|%0, %1, %2}"; - } +{ + if (get_attr_mode (insn) == MODE_HI) + return "k<logic>w\t{%2, %1, %0|%0, %1, %2}"; + else + return "k<logic><mskmodesuffix>\t{%2, %1, %0|%0, %1, %2}"; +} [(set_attr "type" "msklog") (set_attr "prefix" "vex") (set (attr "mode") @@ -8183,10 +8166,10 @@ }) (define_insn "*anddi_1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r,!k") + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r") (and:DI - (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,qm,k") - (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,rm,L,k"))) + (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,qm") + (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,rm,L"))) (clobber (reg:CC FLAGS_REG))] "TARGET_64BIT && ix86_binary_operator_ok (AND, DImode, operands)" { @@ -8195,9 +8178,6 @@ case TYPE_IMOVX: return "#"; - case TYPE_MSKLOG: - return "kandq\t{%2, %1, %0|%0, %1, %2}"; - default: gcc_assert (rtx_equal_p (operands[0], operands[1])); if (get_attr_mode (insn) == MODE_SI) @@ -8206,8 +8186,8 @@ return "and{q}\t{%2, %0|%0, %2}"; } } - [(set_attr "type" "alu,alu,alu,imovx,msklog") - (set_attr "length_immediate" "*,*,*,0,0") + [(set_attr "type" "alu,alu,alu,imovx") + (set_attr "length_immediate" "*,*,*,0") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -8215,7 +8195,7 @@ (match_operand 1 "ext_QIreg_operand"))) (const_string "1") (const_string "*"))) - (set_attr "mode" "SI,DI,DI,SI,DI")]) + (set_attr "mode" "SI,DI,DI,SI")]) ;; Turn *anddi_1 into *andsi_1_zext if possible. (define_split @@ -8242,9 +8222,9 @@ (set_attr "mode" "SI")]) (define_insn "*andsi_1" - [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,r,Ya,!k") - (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,qm,k") - (match_operand:SI 2 "x86_64_general_operand" "re,rm,L,k"))) + [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,r,Ya") + (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,qm") + (match_operand:SI 2 "x86_64_general_operand" "re,rm,L"))) (clobber (reg:CC FLAGS_REG))] "ix86_binary_operator_ok (AND, SImode, operands)" { @@ -8253,15 +8233,12 @@ case TYPE_IMOVX: return "#"; - case TYPE_MSKLOG: - return "kandd\t{%2, %1, %0|%0, %1, %2}"; - default: gcc_assert (rtx_equal_p (operands[0], operands[1])); return "and{l}\t{%2, %0|%0, %2}"; } } - [(set_attr "type" "alu,alu,imovx,msklog") + [(set_attr "type" "alu,alu,imovx") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -8269,13 +8246,13 @@ (match_operand 1 "ext_QIreg_operand"))) (const_string "1") (const_string "*"))) - (set_attr "length_immediate" "*,*,0,0") + (set_attr "length_immediate" "*,*,0") (set_attr "mode" "SI")]) (define_insn "*andhi_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r,Ya,!k") - (and:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,qm,k") - (match_operand:HI 2 "general_operand" "rn,rm,L,k"))) + [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r,Ya") + (and:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,qm") + (match_operand:HI 2 "general_operand" "rn,rm,L"))) (clobber (reg:CC FLAGS_REG))] "ix86_binary_operator_ok (AND, HImode, operands)" { @@ -8284,16 +8261,13 @@ case TYPE_IMOVX: return "#"; - case TYPE_MSKLOG: - return "kandw\t{%2, %1, %0|%0, %1, %2}"; - default: gcc_assert (rtx_equal_p (operands[0], operands[1])); return "and{w}\t{%2, %0|%0, %2}"; } } - [(set_attr "type" "alu,alu,imovx,msklog") - (set_attr "length_immediate" "*,*,0,*") + [(set_attr "type" "alu,alu,imovx") + (set_attr "length_immediate" "*,*,0") (set (attr "prefix_rex") (if_then_else (and (eq_attr "type" "imovx") @@ -8300,12 +8274,12 @@ (match_operand 1 "ext_QIreg_operand")) (const_string "1") (const_string "*"))) - (set_attr "mode" "HI,HI,SI,HI")]) + (set_attr "mode" "HI,HI,SI")]) (define_insn "*andqi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r,!k") - (and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qn,qmn,rn,k"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r") + (and:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0") + (match_operand:QI 2 "general_operand" "qn,qmn,rn"))) (clobber (reg:CC FLAGS_REG))] "ix86_binary_operator_ok (AND, QImode, operands)" { @@ -8316,23 +8290,12 @@ return "and{b}\t{%2, %0|%0, %2}"; case 2: return "and{l}\t{%k2, %k0|%k0, %k2}"; - case 3: - if (get_attr_mode (insn) == MODE_HI) - return "kandw\t{%2, %1, %0|%0, %1, %2}"; - else - return "kandb\t{%2, %1, %0|%0, %1, %2}"; default: gcc_unreachable (); } } - [(set_attr "type" "alu,alu,alu,msklog") - (set (attr "mode") - (cond [(eq_attr "alternative" "2") - (const_string "SI") - (and (eq_attr "alternative" "3") - (not (match_test "TARGET_AVX512DQ"))) - (const_string "HI")] - (const_string "QI"))) + [(set_attr "type" "alu") + (set_attr "mode" "QI,QI,SI") ;; Potential partial reg stall on alternative 2. (set (attr "preferred_for_speed") (cond [(eq_attr "alternative" "2") @@ -8665,56 +8628,28 @@ }) (define_insn "kandn<mode>" - [(set (match_operand:SWI12 0 "register_operand" "=r,&r,!k") - (and:SWI12 - (not:SWI12 - (match_operand:SWI12 1 "register_operand" "r,0,k")) - (match_operand:SWI12 2 "register_operand" "r,r,k"))) - (clobber (reg:CC FLAGS_REG))] + [(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k") + (and:SWI1248_AVX512BW + (not:SWI1248_AVX512BW + (match_operand:SWI1248_AVX512BW 1 "register_operand" "k")) + (match_operand:SWI1248_AVX512BW 2 "register_operand" "k"))) + (unspec [(const_int 0)] UNSPEC_KMASKOP)] "TARGET_AVX512F" { - switch (which_alternative) - { - case 0: - return "andn\t{%k2, %k1, %k0|%k0, %k1, %k2}"; - case 1: - return "#"; - case 2: - if (get_attr_mode (insn) == MODE_HI) - return "kandnw\t{%2, %1, %0|%0, %1, %2}"; - else - return "kandn<mskmodesuffix>\t{%2, %1, %0|%0, %1, %2}"; - default: - gcc_unreachable (); - } + if (get_attr_mode (insn) == MODE_HI) + return "kandnw\t{%2, %1, %0|%0, %1, %2}"; + else + return "kandn<mskmodesuffix>\t{%2, %1, %0|%0, %1, %2}"; } - [(set_attr "isa" "bmi,*,avx512f") - (set_attr "type" "bitmanip,*,msklog") - (set_attr "prefix" "*,*,vex") - (set_attr "btver2_decode" "direct,*,*") + [(set_attr "type" "msklog") + (set_attr "prefix" "vex") (set (attr "mode") - (cond [(and (eq_attr "alternative" "2") - (and (match_test "<MODE>mode == QImode") - (not (match_test "TARGET_AVX512DQ")))) - (const_string "HI") + (cond [(and (match_test "<MODE>mode == QImode") + (not (match_test "TARGET_AVX512DQ"))) + (const_string "HI") ] (const_string "<MODE>")))]) -(define_split - [(set (match_operand:SWI12 0 "general_reg_operand") - (and:SWI12 - (not:SWI12 - (match_dup 0)) - (match_operand:SWI12 1 "general_reg_operand"))) - (clobber (reg:CC FLAGS_REG))] - "TARGET_AVX512F && !TARGET_BMI && reload_completed" - [(set (match_dup 0) - (not:SWI12 (match_dup 0))) - (parallel [(set (match_dup 0) - (and:SWI12 (match_dup 0) - (match_dup 1))) - (clobber (reg:CC FLAGS_REG))])]) - (define_insn_and_split "*andndi3_doubleword" [(set (match_operand:DI 0 "register_operand" "=r") (and:DI @@ -8732,7 +8667,7 @@ (clobber (reg:CC FLAGS_REG))])] "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);") -(define_insn "*bmi_andn_<mode>" +(define_insn "*andn<mode>_1" [(set (match_operand:SWI48 0 "register_operand" "=r,r") (and:SWI48 (not:SWI48 (match_operand:SWI48 1 "register_operand" "r,r")) @@ -8744,6 +8679,18 @@ (set_attr "btver2_decode" "direct, double") (set_attr "mode" "<MODE>")]) +(define_insn "*andn<mode>_1" + [(set (match_operand:SWI12 0 "register_operand" "=r") + (and:SWI12 + (not:SWI12 (match_operand:SWI12 1 "register_operand" "r")) + (match_operand:SWI12 2 "register_operand" "r"))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_BMI" + "andn\t{%k2, %k1, %k0|%k0, %k1, %k2}" + [(set_attr "type" "bitmanip") + (set_attr "btver2_decode" "direct") + (set_attr "mode" "SI")]) + (define_insn "*bmi_andn_<mode>_ccno" [(set (reg FLAGS_REG) (compare @@ -8813,17 +8760,14 @@ }) (define_insn "*<code><mode>_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=r,rm,k") - (any_or:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,k") - (match_operand:SWI48 2 "<general_operand>" "<g>,r<i>,k"))) + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=r,rm") + (any_or:SWI248 + (match_operand:SWI248 1 "nonimmediate_operand" "%0,0") + (match_operand:SWI248 2 "<general_operand>" "<g>,r<i>"))) (clobber (reg:CC FLAGS_REG))] "ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)" - "@ - <logic>{<imodesuffix>}\t{%2, %0|%0, %2} - <logic>{<imodesuffix>}\t{%2, %0|%0, %2} - k<logic><mskmodesuffix>\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "type" "alu,alu,msklog") + "<logic>{<imodesuffix>}\t{%2, %0|%0, %2}" + [(set_attr "type" "alu") (set_attr "mode" "<MODE>")]) ;; See comment for addsi_1_zext why we do use nonimmediate_operand @@ -8849,33 +8793,18 @@ [(set_attr "type" "alu") (set_attr "mode" "SI")]) -(define_insn "*<code>hi_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=r,rm,!k") - (any_or:HI - (match_operand:HI 1 "nonimmediate_operand" "%0,0,k") - (match_operand:HI 2 "general_operand" "rmn,rn,k"))) - (clobber (reg:CC FLAGS_REG))] - "ix86_binary_operator_ok (<CODE>, HImode, operands)" - "@ - <logic>{w}\t{%2, %0|%0, %2} - <logic>{w}\t{%2, %0|%0, %2} - k<logic>w\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "type" "alu,alu,msklog") - (set_attr "mode" "HI")]) - (define_insn "*<code>qi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=q,m,r,!k") - (any_or:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0,k") - (match_operand:QI 2 "general_operand" "qmn,qn,rn,k"))) + [(set (match_operand:QI 0 "nonimmediate_operand" "=q,m,r") + (any_or:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0") + (match_operand:QI 2 "general_operand" "qmn,qn,rn"))) (clobber (reg:CC FLAGS_REG))] "ix86_binary_operator_ok (<CODE>, QImode, operands)" "@ <logic>{b}\t{%2, %0|%0, %2} <logic>{b}\t{%2, %0|%0, %2} - <logic>{l}\t{%k2, %k0|%k0, %k2} - k<logic>w\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "type" "alu,alu,alu,msklog") - (set_attr "mode" "QI,QI,SI,HI") + <logic>{l}\t{%k2, %k0|%k0, %k2}" + [(set_attr "type" "alu") + (set_attr "mode" "QI,QI,SI") ;; Potential partial reg stall on alternative 2. (set (attr "preferred_for_speed") (cond [(eq_attr "alternative" "2") @@ -9111,47 +9040,28 @@ (set_attr "mode" "QI")]) (define_insn "kxnor<mode>" - [(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=r,!k") + [(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k") (not:SWI1248_AVX512BW (xor:SWI1248_AVX512BW - (match_operand:SWI1248_AVX512BW 1 "register_operand" "0,k") - (match_operand:SWI1248_AVX512BW 2 "register_operand" "r,k")))) - (clobber (reg:CC FLAGS_REG))] + (match_operand:SWI1248_AVX512BW 1 "register_operand" "k") + (match_operand:SWI1248_AVX512BW 2 "register_operand" "k")))) + (unspec [(const_int 0)] UNSPEC_KMASKOP)] "TARGET_AVX512F" { - if (which_alternative == 0) - return "#"; - if (get_attr_mode (insn) == MODE_HI) return "kxnorw\t{%2, %1, %0|%0, %1, %2}"; else return "kxnor<mskmodesuffix>\t{%2, %1, %0|%0, %1, %2}"; } - [(set_attr "type" "*,msklog") - (set_attr "prefix" "*,vex") + [(set_attr "type" "msklog") + (set_attr "prefix" "vex") (set (attr "mode") - (cond [(and (eq_attr "alternative" "1") - (and (match_test "<MODE>mode == QImode") - (not (match_test "TARGET_AVX512DQ")))) + (cond [(and (match_test "<MODE>mode == QImode") + (not (match_test "TARGET_AVX512DQ"))) (const_string "HI") ] (const_string "<MODE>")))]) -(define_split - [(set (match_operand:SWI1248x 0 "general_reg_operand") - (not:SWI1248x - (xor:SWI1248x - (match_dup 0) - (match_operand:SWI1248x 1 "general_reg_operand")))) - (clobber (reg:CC FLAGS_REG))] - "TARGET_AVX512F && reload_completed" - [(parallel [(set (match_dup 0) - (xor:SWI1248x (match_dup 0) - (match_dup 1))) - (clobber (reg:CC FLAGS_REG))]) - (set (match_dup 0) - (not:SWI1248x (match_dup 0)))]) - ;;There are kortrest[bdq] but no intrinsics for them. ;;We probably don't need to implement them. (define_insn "kortestzhi" @@ -9604,6 +9514,27 @@ ;; One complement instructions +(define_insn "knot<mode>" + [(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k") + (not:SWI1248_AVX512BW + (match_operand:SWI1248_AVX512BW 1 "register_operand" "k"))) + (unspec [(const_int 0)] UNSPEC_KMASKOP)] + "TARGET_AVX512F" +{ + if (get_attr_mode (insn) == MODE_HI) + return "knotw\t{%1, %0|%0, %1}"; + else + return "knot<mskmodesuffix>\t{%1, %0|%0, %1}"; +} + [(set_attr "type" "msklog") + (set_attr "prefix" "vex") + (set (attr "mode") + (cond [(and (match_test "<MODE>mode == QImode") + (not (match_test "TARGET_AVX512DQ"))) + (const_string "HI") + ] + (const_string "<MODE>")))]) + (define_expand "one_cmpl<mode>2" [(set (match_operand:SWIM 0 "nonimmediate_operand") (not:SWIM (match_operand:SWIM 1 "nonimmediate_operand")))] @@ -9611,15 +9542,11 @@ "ix86_expand_unary_operator (NOT, <MODE>mode, operands); DONE;") (define_insn "*one_cmpl<mode>2_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,k") - (not:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "0,k")))] + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm") + (not:SWI248 (match_operand:SWI248 1 "nonimmediate_operand" "0")))] "ix86_unary_operator_ok (NOT, <MODE>mode, operands)" - "@ - not{<imodesuffix>}\t%0 - knot<mskmodesuffix>\t{%1, %0|%0, %1}" - [(set_attr "isa" "*,avx512bw") - (set_attr "type" "negnot,msklog") - (set_attr "prefix" "*,vex") + "not{<imodesuffix>}\t%0" + [(set_attr "type" "negnot") (set_attr "mode" "<MODE>")]) ;; ??? Currently never generated - xor is used instead. @@ -9632,48 +9559,15 @@ [(set_attr "type" "negnot") (set_attr "mode" "SI")]) -(define_insn "*one_cmplhi2_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,!k") - (not:HI (match_operand:HI 1 "nonimmediate_operand" "0,k")))] - "ix86_unary_operator_ok (NOT, HImode, operands)" - "@ - not{w}\t%0 - knotw\t{%1, %0|%0, %1}" - [(set_attr "isa" "*,avx512f") - (set_attr "type" "negnot,msklog") - (set_attr "prefix" "*,vex") - (set_attr "mode" "HI")]) - (define_insn "*one_cmplqi2_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r,!k") - (not:QI (match_operand:QI 1 "nonimmediate_operand" "0,0,k")))] + [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,r") + (not:QI (match_operand:QI 1 "nonimmediate_operand" "0,0")))] "ix86_unary_operator_ok (NOT, QImode, operands)" -{ - switch (which_alternative) - { - case 0: - return "not{b}\t%0"; - case 1: - return "not{l}\t%k0"; - case 2: - if (get_attr_mode (insn) == MODE_HI) - return "knotw\t{%1, %0|%0, %1}"; - else - return "knotb\t{%1, %0|%0, %1}"; - default: - gcc_unreachable (); - } -} - [(set_attr "isa" "*,*,avx512f") - (set_attr "type" "negnot,negnot,msklog") - (set_attr "prefix" "*,*,vex") - (set (attr "mode") - (cond [(eq_attr "alternative" "1") - (const_string "SI") - (and (eq_attr "alternative" "2") - (not (match_test "TARGET_AVX512DQ"))) - (const_string "HI")] - (const_string "QI"))) + "@ + not{b}\t%0 + not{l}\t%k0" + [(set_attr "type" "negnot") + (set_attr "mode" "QI,SI") ;; Potential partial reg stall on alternative 1. (set (attr "preferred_for_speed") (cond [(eq_attr "alternative" "1") @@ -9757,14 +9651,17 @@ ;; shift pair, instead using moves and sign extension for counts greater ;; than 31. -(define_insn "*<mshift><mode>3" +(define_insn "*k<code><mode>" [(set (match_operand:SWI1248_AVX512BWDQ 0 "register_operand" "=k") - (any_lshift:SWI1248_AVX512BWDQ (match_operand:SWI1248_AVX512BWDQ 1 "register_operand" "k") - (match_operand:QI 2 "immediate_operand" "i")))] + (any_lshift:SWI1248_AVX512BWDQ + (match_operand:SWI1248_AVX512BWDQ 1 "register_operand" "k") + (match_operand:QI 2 "immediate_operand" "n"))) + (unspec [(const_int 0)] UNSPEC_KMASKOP)] "TARGET_AVX512F" "k<mshift><mskmodesuffix>\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "msklog") - (set_attr "prefix" "vex")]) + (set_attr "prefix" "vex") + (set_attr "mode" "<MODE>")]) (define_expand "ashl<mode>3" [(set (match_operand:SDWIM 0 "<shift_operand>") Index: config/i386/predicates.md =================================================================== --- config/i386/predicates.md (revision 242841) +++ config/i386/predicates.md (working copy) @@ -53,11 +53,6 @@ (and (match_code "reg") (match_test "EXT_REX_SSE_REGNO_P (REGNO (op))"))) -;; True if the operand is an AVX-512 mask register. -(define_predicate "mask_reg_operand" - (and (match_code "reg") - (match_test "MASK_REGNO_P (REGNO (op))"))) - ;; Return true if op is a QImode register. (define_predicate "any_QIreg_operand" (and (match_code "reg") Index: config/i386/sse.md =================================================================== --- config/i386/sse.md (revision 242841) +++ config/i386/sse.md (working copy) @@ -13712,19 +13712,22 @@ "ix86_expand_sse_unpack (operands[0], operands[1], true, true); DONE;") (define_expand "vec_unpacks_hi_hi" - [(set (subreg:HI (match_operand:QI 0 "register_operand") 0) - (lshiftrt:HI (match_operand:HI 1 "register_operand") - (const_int 8)))] + [(parallel + [(set (subreg:HI (match_operand:QI 0 "register_operand") 0) + (lshiftrt:HI (match_operand:HI 1 "register_operand") + (const_int 8))) + (unspec [(const_int 0)] UNSPEC_KMASKOP)])] "TARGET_AVX512F") (define_expand "vec_unpacks_hi_<mode>" - [(set (subreg:SWI48x (match_operand:<HALFMASKMODE> 0 "register_operand") 0) - (lshiftrt:SWI48x (match_operand:SWI48x 1 "register_operand") - (match_dup 2)))] + [(parallel + [(set (subreg:SWI48x + (match_operand:<HALFMASKMODE> 0 "register_operand") 0) + (lshiftrt:SWI48x (match_operand:SWI48x 1 "register_operand") + (match_dup 2))) + (unspec [(const_int 0)] UNSPEC_KMASKOP)])] "TARGET_AVX512BW" -{ - operands[2] = GEN_INT (GET_MODE_BITSIZE (<HALFMASKMODE>mode)); -}) + "operands[2] = GEN_INT (GET_MODE_BITSIZE (<HALFMASKMODE>mode));") ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;