Will and Segher:

This is the rest of the second patch, which adds instruction and builtin support for 128-bit integer divide, modulo, shift, and compare operations.
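As a rough illustration only (not part of the patch), the scalar C model below sketches what a few of the new 128-bit operations are expected to compute. It uses GCC's `__int128` scalar types on a 64-bit target rather than the Power10 vector builtins, so it can be built without Power10 hardware. All `model_*` names are hypothetical. Using only the low 7 bits of the shift/rotate count is an assumption based on the "bits[57:63]" comments in the patch, and the all-ones result for a true compare is assumed to follow the existing vector compare convention.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names; none of these exist in GCC or in the patch.  */
typedef unsigned __int128 u128;
typedef __int128 s128;

/* Shift left using only the low 7 bits of the count (assumed to mirror
   the 7-bit shift field mentioned in the patch comments).  */
static u128
model_shift_left (u128 value, u128 count)
{
  unsigned int sh = (unsigned int) (count & 127);
  return value << sh;
}

/* Rotate left by the low 7 bits of the count.  */
static u128
model_rotate_left (u128 value, u128 count)
{
  unsigned int sh = (unsigned int) (count & 127);
  if (sh == 0)
    return value;  /* Avoid an undefined shift by 128.  */
  return (value << sh) | (value >> (128 - sh));
}

/* Widening unsigned multiply: 64 x 64 -> 128 bits.  */
static u128
model_widen_mul_u (uint64_t a, uint64_t b)
{
  return (u128) a * b;
}

/* Signed greater-than: all ones when true, zero otherwise (assumed).  */
static u128
model_cmpgt_s (s128 a, s128 b)
{
  return a > b ? ~(u128) 0 : (u128) 0;
}

/* Small self-check that exercises each helper once.  */
void
model_self_check (void)
{
  assert (model_shift_left (1, 127) == (u128) 1 << 127);
  assert (model_rotate_left ((u128) 1 << 127, 1) == 1);
  assert ((uint64_t) model_widen_mul_u (UINT64_MAX, UINT64_MAX) == 1);
  assert (model_cmpgt_s (1, -1) == ~(u128) 0);
}
```

The model is deliberately scalar: it says nothing about register layout or endianness, which the define_expands in the patch handle separately.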
In the last round of changes, the flag for the 128-bit operations was removed. Per Will's comments, the BU_P10_128BIT_* builtin definitions can therefore be removed as well; we can just use P10V_BUILTIN instead. The same applies to the BU_P10_P builtin definition. The commit log was updated to reflect the change: a few change log entries for the 128-bit operations flag needed removing, and the other fixes noted by Will were made. All of the changes are name changes, not functional changes.

No regression failures were found when run on a P9.

Please let me know if this is ready for mainline.

                      Carl

--------------------------------------------------------

gcc/ChangeLog

2020-10-05  Carl Love  <c...@us.ibm.com>

	* config/rs6000/altivec.h (vec_signextq, vec_dive, vec_mod): Add
	defines for new builtins.
	* config/rs6000/altivec.md (UNSPEC_VMULEUD, UNSPEC_VMULESD,
	UNSPEC_VMULOUD, UNSPEC_VMULOSD): New unspecs.
	(altivec_eqv1ti, altivec_gtv1ti, altivec_gtuv1ti, altivec_vmuleud,
	altivec_vmuloud, altivec_vmulesd, altivec_vmulosd, altivec_vrlq,
	altivec_vrlqmi, altivec_vrlqmi_inst, altivec_vrlqnm,
	altivec_vrlqnm_inst, altivec_vslq, altivec_vsrq, altivec_vsraq,
	altivec_vcmpequt_p, altivec_vcmpgtst_p, altivec_vcmpgtut_p): New
	define_insns.
	(vec_widen_umult_even_v2di, vec_widen_smult_even_v2di,
	vec_widen_umult_odd_v2di, vec_widen_smult_odd_v2di, altivec_vrlqmi,
	altivec_vrlqnm): New define_expands.
	* config/rs6000/rs6000-builtin.def (VCMPEQUT_P, VCMPGTST_P,
	VCMPGTUT_P): Add macro expansions.
	(VCMPGTUT, VCMPGTST, VCMPEQUT, CMPNET, CMPGE_1TI, CMPGE_U1TI,
	CMPLE_1TI, CMPLE_U1TI, VNOR_V1TI_UNS, VNOR_V1TI, VCMPNET_P,
	VCMPAET_P, VSIGNEXTSD2Q, VMULEUD, VMULESD, VMULOUD, VMULOSD, VRLQ,
	VSLQ, VSRQ, VSRAQ, VRLQNM, DIV_V1TI, UDIV_V1TI, DIVES_V1TI,
	DIVEU_V1TI, MODS_V1TI, MODU_V1TI, VRLQMI): New macro expansions.
	(VRLQ, VSLQ, VSRQ, VSRAQ, DIVE, MOD, SIGNEXT): New overload
	expansions.
	* config/rs6000/rs6000-call.c (P10_BUILTIN_VCMPEQUT,
	P10V_BUILTIN_CMPGE_1TI, P10V_BUILTIN_CMPGE_U1TI,
	P10V_BUILTIN_VCMPGTUT, P10V_BUILTIN_VCMPGTST,
	P10V_BUILTIN_CMPLE_1TI, P10V_BUILTIN_VCMPLE_U1TI,
	P10V_BUILTIN_128BIT_DIV_V1TI, P10V_BUILTIN_128BIT_UDIV_V1TI,
	P10V_BUILTIN_128BIT_VMULESD, P10V_BUILTIN_128BIT_VMULEUD,
	P10V_BUILTIN_128BIT_VMULOSD, P10V_BUILTIN_128BIT_VMULOUD,
	P10V_BUILTIN_VNOR_V1TI, P10V_BUILTIN_VNOR_V1TI_UNS,
	P10V_BUILTIN_128BIT_VRLQ, P10V_BUILTIN_128BIT_VRLQMI,
	P10V_BUILTIN_128BIT_VRLQNM, P10V_BUILTIN_128BIT_VSLQ,
	P10V_BUILTIN_128BIT_VSRQ, P10V_BUILTIN_128BIT_VSRAQ,
	P10V_BUILTIN_VCMPGTUT_P, P10V_BUILTIN_VCMPGTST_P,
	P10V_BUILTIN_VCMPEQUT_P, P10V_BUILTIN_VCMPGTUT_P,
	P10V_BUILTIN_VCMPGTST_P, P10V_BUILTIN_CMPNET,
	P10V_BUILTIN_VCMPNET_P, P10V_BUILTIN_VCMPAET_P,
	P10V_BUILTIN_128BIT_VSIGNEXTSD2Q, P10V_BUILTIN_128BIT_DIVES_V1TI,
	P10V_BUILTIN_128BIT_MODS_V1TI, P10V_BUILTIN_128BIT_MODU_V1TI): New
	overloaded definitions.
	(rs6000_gimple_fold_builtin) [P10V_BUILTIN_VCMPEQUT,
	P10_BUILTIN_CMPNET, P10_BUILTIN_CMPGE_1TI, P10_BUILTIN_CMPGE_U1TI,
	P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPGTST, P10_BUILTIN_CMPLE_1TI,
	P10_BUILTIN_CMPLE_U1TI]: New case statements.
	(rs6000_init_builtins) [bool_V1TI_type_node,
	int_ftype_int_v1ti_v1ti]: New assignments.
	(altivec_init_builtins): New E_V1TImode case statement.
	(builtin_function_type) [P10_BUILTIN_128BIT_VMULEUD,
	P10_BUILTIN_128BIT_VMULOUD, P10_BUILTIN_128BIT_DIVEU_V1TI,
	P10_BUILTIN_128BIT_MODU_V1TI, P10_BUILTIN_CMPGE_U1TI,
	P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPEQUT]: New case statements.
	* config/rs6000/rs6000.c (rs6000_handle_altivec_attribute)
	[E_TImode, E_V1TImode]: New case statements.
	* config/rs6000/rs6000.h (RS6000_BTM_TI_VECTOR_OPS): New defines.
	(rs6000_builtin_type_index): New enum value RS6000_BTI_bool_V1TI.
	* config/rs6000/vector.md (vector_gtv1ti, vector_nltv1ti,
	vector_gtuv1ti, vector_nltuv1ti, vector_ngtv1ti, vector_ngtuv1ti,
	vector_eq_v1ti_p, vector_ne_v1ti_p, vector_ae_v1ti_p,
	vector_gt_v1ti_p, vector_gtu_v1ti_p, vrotlv1ti3, vashlv1ti3,
	vlshrv1ti3, vashrv1ti3): New define_expands.
	* config/rs6000/vsx.md (UNSPEC_VSX_DIVSQ, UNSPEC_VSX_DIVUQ,
	UNSPEC_VSX_DIVESQ, UNSPEC_VSX_DIVEUQ, UNSPEC_VSX_MODSQ,
	UNSPEC_VSX_MODUQ, UNSPEC_XXSWAPD_V1TI): New unspecs.
	(vsx_div_v1ti, vsx_udiv_v1ti, vsx_dives_v1ti, vsx_diveu_v1ti,
	vsx_mods_v1ti, vsx_modu_v1ti, xxswapd_v1ti,
	vsx_sign_extend_v2di_v1ti): New define_insns.
	(vcmpnet): New define_expand.
	* doc/extend.texi: Add documentation for the new builtins vec_rl,
	vec_rlmi, vec_rlnm, vec_sl, vec_sr, vec_sra, vec_mule, vec_mulo,
	vec_div, vec_dive, vec_mod, vec_cmpeq, vec_cmpne, vec_cmpgt,
	vec_cmplt, vec_cmpge, vec_cmple, vec_all_eq, vec_all_ne,
	vec_all_gt, vec_all_lt, vec_all_ge, vec_all_le, vec_any_eq,
	vec_any_ne, vec_any_gt, vec_any_lt, vec_any_ge, vec_any_le.

gcc/testsuite/ChangeLog

2020-10-05  Carl Love  <c...@us.ibm.com>

	* gcc.target/powerpc/int_128bit-runnable.c: New test file.
---
 gcc/config/rs6000/altivec.h                   |    4 +
 gcc/config/rs6000/altivec.md                  |  240 ++
 gcc/config/rs6000/rs6000-builtin.def          |   46 +-
 gcc/config/rs6000/rs6000-call.c               |  138 +-
 gcc/config/rs6000/rs6000.c                    |    1 +
 gcc/config/rs6000/rs6000.h                    |    4 +-
 gcc/config/rs6000/vector.md                   |  191 ++
 gcc/config/rs6000/vsx.md                      |   97 +
 gcc/doc/extend.texi                           |  174 ++
 .../gcc.target/powerpc/int_128bit-runnable.c  | 2254 +++++++++++++++++
 10 files changed, 3145 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index cfa5eda4cd5..fc67073f79c 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -690,6 +690,10 @@ __altivec_scalar_pred(vec_any_nle,
 #define vec_step(x) __builtin_vec_step (* (__typeof__ (x) *) 0)
 
 #ifdef _ARCH_PWR10
+#define vec_signextq __builtin_vec_vsignextq
+#define vec_dive __builtin_vec_dive
+#define vec_mod __builtin_vec_mod
+
 /* May modify these macro definitions if future capabilities overload
    with support for different vector argument and result types.
*/ #define vec_cntlzm(a, b) __builtin_altivec_vclzdm (a, b) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 0a2e634d6b0..34a4731342a 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -39,12 +39,16 @@ UNSPEC_VMULESH UNSPEC_VMULEUW UNSPEC_VMULESW + UNSPEC_VMULEUD + UNSPEC_VMULESD UNSPEC_VMULOUB UNSPEC_VMULOSB UNSPEC_VMULOUH UNSPEC_VMULOSH UNSPEC_VMULOUW UNSPEC_VMULOSW + UNSPEC_VMULOUD + UNSPEC_VMULOSD UNSPEC_VPKPX UNSPEC_VPACK_SIGN_SIGN_SAT UNSPEC_VPACK_SIGN_UNS_SAT @@ -628,6 +632,14 @@ "vcmpequ<VI_char> %0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "altivec_eqv1ti" + [(set (match_operand:V1TI 0 "altivec_register_operand" "=v") + (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v") + (match_operand:V1TI 2 "altivec_register_operand" "v")))] + "TARGET_POWER10" + "vcmpequq %0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_gt<mode>" [(set (match_operand:VI2 0 "altivec_register_operand" "=v") (gt:VI2 (match_operand:VI2 1 "altivec_register_operand" "v") @@ -636,6 +648,14 @@ "vcmpgts<VI_char> %0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "*altivec_gtv1ti" + [(set (match_operand:V1TI 0 "altivec_register_operand" "=v") + (gt:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v") + (match_operand:V1TI 2 "altivec_register_operand" "v")))] + "TARGET_POWER10" + "vcmpgtsq %0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_gtu<mode>" [(set (match_operand:VI2 0 "altivec_register_operand" "=v") (gtu:VI2 (match_operand:VI2 1 "altivec_register_operand" "v") @@ -644,6 +664,14 @@ "vcmpgtu<VI_char> %0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "*altivec_gtuv1ti" + [(set (match_operand:V1TI 0 "altivec_register_operand" "=v") + (gtu:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v") + (match_operand:V1TI 2 "altivec_register_operand" "v")))] + "TARGET_POWER10" + "vcmpgtuq %0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn 
"*altivec_eqv4sf" [(set (match_operand:V4SF 0 "altivec_register_operand" "=v") (eq:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v") @@ -1687,6 +1715,19 @@ DONE; }) +(define_expand "vec_widen_umult_even_v2di" + [(use (match_operand:V1TI 0 "register_operand")) + (use (match_operand:V2DI 1 "register_operand")) + (use (match_operand:V2DI 2 "register_operand"))] + "TARGET_POWER10" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_altivec_vmuleud (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmuloud (operands[0], operands[1], operands[2])); + DONE; +}) + (define_expand "vec_widen_smult_even_v4si" [(use (match_operand:V2DI 0 "register_operand")) (use (match_operand:V4SI 1 "register_operand")) @@ -1700,6 +1741,19 @@ DONE; }) +(define_expand "vec_widen_smult_even_v2di" + [(use (match_operand:V1TI 0 "register_operand")) + (use (match_operand:V2DI 1 "register_operand")) + (use (match_operand:V2DI 2 "register_operand"))] + "TARGET_POWER10" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_altivec_vmulesd (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmulosd (operands[0], operands[1], operands[2])); + DONE; +}) + (define_expand "vec_widen_umult_odd_v16qi" [(use (match_operand:V8HI 0 "register_operand")) (use (match_operand:V16QI 1 "register_operand")) @@ -1765,6 +1819,19 @@ DONE; }) +(define_expand "vec_widen_umult_odd_v2di" + [(use (match_operand:V1TI 0 "register_operand")) + (use (match_operand:V2DI 1 "register_operand")) + (use (match_operand:V2DI 2 "register_operand"))] + "TARGET_POWER10" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_altivec_vmuloud (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmuleud (operands[0], operands[1], operands[2])); + DONE; +}) + (define_expand "vec_widen_smult_odd_v4si" [(use (match_operand:V2DI 0 "register_operand")) (use (match_operand:V4SI 1 "register_operand")) @@ -1778,6 +1845,19 @@ DONE; }) +(define_expand "vec_widen_smult_odd_v2di" + [(use (match_operand:V1TI 
0 "register_operand")) + (use (match_operand:V2DI 1 "register_operand")) + (use (match_operand:V2DI 2 "register_operand"))] + "TARGET_POWER10" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_altivec_vmulosd (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmulesd (operands[0], operands[1], operands[2])); + DONE; +}) + (define_insn "altivec_vmuleub" [(set (match_operand:V8HI 0 "register_operand" "=v") (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v") @@ -1859,6 +1939,15 @@ "vmuleuw %0,%1,%2" [(set_attr "type" "veccomplex")]) +(define_insn "altivec_vmuleud" + [(set (match_operand:V1TI 0 "register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v") + (match_operand:V2DI 2 "register_operand" "v")] + UNSPEC_VMULEUD))] + "TARGET_POWER10" + "vmuleud %0,%1,%2" + [(set_attr "type" "veccomplex")]) + (define_insn "altivec_vmulouw" [(set (match_operand:V2DI 0 "register_operand" "=v") (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v") @@ -1868,6 +1957,15 @@ "vmulouw %0,%1,%2" [(set_attr "type" "veccomplex")]) +(define_insn "altivec_vmuloud" + [(set (match_operand:V1TI 0 "register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v") + (match_operand:V2DI 2 "register_operand" "v")] + UNSPEC_VMULOUD))] + "TARGET_POWER10" + "vmuloud %0,%1,%2" + [(set_attr "type" "veccomplex")]) + (define_insn "altivec_vmulesw" [(set (match_operand:V2DI 0 "register_operand" "=v") (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v") @@ -1877,6 +1975,15 @@ "vmulesw %0,%1,%2" [(set_attr "type" "veccomplex")]) +(define_insn "altivec_vmulesd" + [(set (match_operand:V1TI 0 "register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v") + (match_operand:V2DI 2 "register_operand" "v")] + UNSPEC_VMULESD))] + "TARGET_POWER10" + "vmulesd %0,%1,%2" + [(set_attr "type" "veccomplex")]) + (define_insn "altivec_vmulosw" [(set (match_operand:V2DI 0 "register_operand" "=v") (unspec:V2DI 
[(match_operand:V4SI 1 "register_operand" "v") @@ -1886,6 +1993,15 @@ "vmulosw %0,%1,%2" [(set_attr "type" "veccomplex")]) +(define_insn "altivec_vmulosd" + [(set (match_operand:V1TI 0 "register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v") + (match_operand:V2DI 2 "register_operand" "v")] + UNSPEC_VMULOSD))] + "TARGET_POWER10" + "vmulosd %0,%1,%2" + [(set_attr "type" "veccomplex")]) + ;; Vector pack/unpack (define_insn "altivec_vpkpx" [(set (match_operand:V8HI 0 "register_operand" "=v") @@ -1979,6 +2095,15 @@ "vrl<VI_char> %0,%1,%2" [(set_attr "type" "vecsimple")]) +(define_insn "altivec_vrlq" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (rotate:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" +;; rotate amount in needs to be in bits[57:63] of operand2. + "vrlq %0,%1,%2" + [(set_attr "type" "vecsimple")]) + (define_insn "altivec_vrl<VI_char>mi" [(set (match_operand:VIlong 0 "register_operand" "=v") (unspec:VIlong [(match_operand:VIlong 1 "register_operand" "0") @@ -1989,6 +2114,33 @@ "vrl<VI_char>mi %0,%2,%3" [(set_attr "type" "veclogical")]) +(define_expand "altivec_vrlqmi" + [(set (match_operand:V1TI 0 "vsx_register_operand") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand") + (match_operand:V1TI 2 "vsx_register_operand") + (match_operand:V1TI 3 "vsx_register_operand")] + UNSPEC_VRLMI))] + "TARGET_POWER10" +{ + /* Mask bit begin, end fields need to be in bits [41:55] of 128-bit operand2. */ + /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. 
*/ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn(gen_xxswapd_v1ti (tmp, operands[3])); + emit_insn(gen_altivec_vrlqmi_inst (operands[0], operands[1], operands[2], tmp)); + DONE; +}) + +(define_insn "altivec_vrlqmi_inst" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "0") + (match_operand:V1TI 3 "vsx_register_operand" "v")] + UNSPEC_VRLMI))] + "TARGET_POWER10" + "vrlqmi %0,%1,%3" + [(set_attr "type" "veclogical")]) + (define_insn "altivec_vrl<VI_char>nm" [(set (match_operand:VIlong 0 "register_operand" "=v") (unspec:VIlong [(match_operand:VIlong 1 "register_operand" "v") @@ -1998,6 +2150,31 @@ "vrl<VI_char>nm %0,%1,%2" [(set_attr "type" "veclogical")]) +(define_expand "altivec_vrlqnm" + [(set (match_operand:V1TI 0 "vsx_register_operand") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand") + (match_operand:V1TI 2 "vsx_register_operand")] + UNSPEC_VRLNM))] + "TARGET_POWER10" +{ + /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn(gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn(gen_altivec_vrlqnm_inst (operands[0], operands[1], tmp)); + DONE; +}) + +(define_insn "altivec_vrlqnm_inst" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VRLNM))] + "TARGET_POWER10" + ;; rotate and mask bits need to be in upper 64-bits of operand2. 
+ "vrlqnm %0,%1,%2" + [(set_attr "type" "veclogical")]) + (define_insn "altivec_vsl" [(set (match_operand:V4SI 0 "register_operand" "=v") (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") @@ -2042,6 +2219,15 @@ "vsl<VI_char> %0,%1,%2" [(set_attr "type" "vecsimple")]) +(define_insn "altivec_vslq" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" + /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */ + "vslq %0,%1,%2" + [(set_attr "type" "vecsimple")]) + (define_insn "*altivec_vsr<VI_char>" [(set (match_operand:VI2 0 "register_operand" "=v") (lshiftrt:VI2 (match_operand:VI2 1 "register_operand" "v") @@ -2050,6 +2236,15 @@ "vsr<VI_char> %0,%1,%2" [(set_attr "type" "vecsimple")]) +(define_insn "altivec_vsrq" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" + /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */ + "vsrq %0,%1,%2" + [(set_attr "type" "vecsimple")]) + (define_insn "*altivec_vsra<VI_char>" [(set (match_operand:VI2 0 "register_operand" "=v") (ashiftrt:VI2 (match_operand:VI2 1 "register_operand" "v") @@ -2058,6 +2253,15 @@ "vsra<VI_char> %0,%1,%2" [(set_attr "type" "vecsimple")]) +(define_insn "altivec_vsraq" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (ashiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" + /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */ + "vsraq %0,%1,%2" + [(set_attr "type" "vecsimple")]) + (define_insn "altivec_vsr" [(set (match_operand:V4SI 0 "register_operand" "=v") (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v") @@ -2618,6 +2822,18 @@ "vcmpequ<VI_char>. 
%0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "altivec_vcmpequt_p" + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand" "v") + (match_operand:V1TI 2 "altivec_register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "altivec_register_operand" "=v") + (eq:V1TI (match_dup 1) + (match_dup 2)))] + "TARGET_POWER10" + "vcmpequq. %0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_vcmpgts<VI_char>_p" [(set (reg:CC CR6_REGNO) (unspec:CC [(gt:CC (match_operand:VI2 1 "register_operand" "v") @@ -2630,6 +2846,18 @@ "vcmpgts<VI_char>. %0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "*altivec_vcmpgtst_p" + [(set (reg:CC CR6_REGNO) + (unspec:CC [(gt:CC (match_operand:V1TI 1 "register_operand" "v") + (match_operand:V1TI 2 "register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "register_operand" "=v") + (gt:V1TI (match_dup 1) + (match_dup 2)))] + "TARGET_POWER10" + "vcmpgtsq. %0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_vcmpgtu<VI_char>_p" [(set (reg:CC CR6_REGNO) (unspec:CC [(gtu:CC (match_operand:VI2 1 "register_operand" "v") @@ -2642,6 +2870,18 @@ "vcmpgtu<VI_char>. %0,%1,%2" [(set_attr "type" "veccmpfx")]) +(define_insn "*altivec_vcmpgtut_p" + [(set (reg:CC CR6_REGNO) + (unspec:CC [(gtu:CC (match_operand:V1TI 1 "register_operand" "v") + (match_operand:V1TI 2 "register_operand" "v"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "register_operand" "=v") + (gtu:V1TI (match_dup 1) + (match_dup 2)))] + "TARGET_POWER10" + "vcmpgtuq. 
%0,%1,%2" + [(set_attr "type" "veccmpfx")]) + (define_insn "*altivec_vcmpeqfp_p" [(set (reg:CC CR6_REGNO) (unspec:CC [(eq:CC (match_operand:V4SF 1 "register_operand" "v") diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index 4c2e9460949..14345f9ea9d 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1178,7 +1178,6 @@ (RS6000_BTC_ ## ATTR /* ATTR */ \ | RS6000_BTC_TERNARY), \ CODE_FOR_ ## ICODE) /* ICODE */ - /* Insure 0 is not a legitimate index. */ BU_SPECIAL_X (RS6000_BUILTIN_NONE, NULL, 0, RS6000_BTC_MISC) @@ -2736,6 +2735,10 @@ BU_P9V_AV_1 (VSIGNEXTSH2D, "vsignextsh2d", CONST, vsx_sign_extend_hi_v2di) BU_P9V_AV_1 (VSIGNEXTSW2D, "vsignextsw2d", CONST, vsx_sign_extend_si_v2di) /* Builtins for scalar instructions added in ISA 3.1 (power10). */ +BU_P10V_AV_2 (VCMPEQUT_P, "vcmpequt_p", CONST, vector_eq_v1ti_p) +BU_P10V_AV_2 (VCMPGTST_P, "vcmpgtst_p", CONST, vector_gt_v1ti_p) +BU_P10V_AV_2 (VCMPGTUT_P, "vcmpgtut_p", CONST, vector_gtu_v1ti_p) + BU_P10_MISC_2 (CFUGED, "cfuged", CONST, cfuged) BU_P10_MISC_2 (CNTLZDM, "cntlzdm", CONST, cntlzdm) BU_P10_MISC_2 (CNTTZDM, "cnttzdm", CONST, cnttzdm) @@ -2756,7 +2759,38 @@ BU_P10V_VSX_2 (XXGENPCVM_V16QI, "xxgenpcvm_v16qi", CONST, xxgenpcvm_v16qi) BU_P10V_VSX_2 (XXGENPCVM_V8HI, "xxgenpcvm_v8hi", CONST, xxgenpcvm_v8hi) BU_P10V_VSX_2 (XXGENPCVM_V4SI, "xxgenpcvm_v4si", CONST, xxgenpcvm_v4si) BU_P10V_VSX_2 (XXGENPCVM_V2DI, "xxgenpcvm_v2di", CONST, xxgenpcvm_v2di) - +BU_P10V_AV_2 (VCMPGTUT, "vcmpgtut", CONST, vector_gtuv1ti) +BU_P10V_AV_2 (VCMPGTST, "vcmpgtst", CONST, vector_gtv1ti) +BU_P10V_AV_2 (VCMPEQUT, "vcmpequt", CONST, eqvv1ti3) +BU_P10V_AV_2 (CMPNET, "vcmpnet", CONST, vcmpnet) +BU_P10V_AV_2 (CMPGE_1TI, "cmpge_1ti", CONST, vector_nltv1ti) +BU_P10V_AV_2 (CMPGE_U1TI, "cmpge_u1ti", CONST, vector_nltuv1ti) +BU_P10V_AV_2 (CMPLE_1TI, "cmple_1ti", CONST, vector_ngtv1ti) +BU_P10V_AV_2 (CMPLE_U1TI, "cmple_u1ti", CONST, vector_ngtuv1ti) +BU_P10V_AV_2 
(VNOR_V1TI_UNS, "vnor_v1ti_uns",CONST, norv1ti3) +BU_P10V_AV_2 (VNOR_V1TI, "vnor_v1ti", CONST, norv1ti3) +BU_P10V_AV_2 (VCMPNET_P, "vcmpnet_p", CONST, vector_ne_v1ti_p) +BU_P10V_AV_2 (VCMPAET_P, "vcmpaet_p", CONST, vector_ae_v1ti_p) + +BU_P10V_AV_1 (VSIGNEXTSD2Q, "vsignext", CONST, vsx_sign_extend_v2di_v1ti) + +BU_P10V_AV_2 (VMULEUD, "vmuleud", CONST, vec_widen_umult_even_v2di) +BU_P10V_AV_2 (VMULESD, "vmulesd", CONST, vec_widen_smult_even_v2di) +BU_P10V_AV_2 (VMULOUD, "vmuloud", CONST, vec_widen_umult_odd_v2di) +BU_P10V_AV_2 (VMULOSD, "vmulosd", CONST, vec_widen_smult_odd_v2di) +BU_P10V_AV_2 (VRLQ, "vrlq", CONST, vrotlv1ti3) +BU_P10V_AV_2 (VSLQ, "vslq", CONST, vashlv1ti3) +BU_P10V_AV_2 (VSRQ, "vsrq", CONST, vlshrv1ti3) +BU_P10V_AV_2 (VSRAQ, "vsraq", CONST, vashrv1ti3) +BU_P10V_AV_2 (VRLQNM, "vrlqnm", CONST, altivec_vrlqnm) +BU_P10V_AV_2 (DIV_V1TI, "div_1ti", CONST, vsx_div_v1ti) +BU_P10V_AV_2 (UDIV_V1TI, "udiv_1ti", CONST, vsx_udiv_v1ti) +BU_P10V_AV_2 (DIVES_V1TI, "dives", CONST, vsx_dives_v1ti) +BU_P10V_AV_2 (DIVEU_V1TI, "diveu", CONST, vsx_diveu_v1ti) +BU_P10V_AV_2 (MODS_V1TI, "mods", CONST, vsx_mods_v1ti) +BU_P10V_AV_2 (MODU_V1TI, "modu", CONST, vsx_modu_v1ti) + +BU_P10V_AV_3 (VRLQMI, "vrlqmi", CONST, altivec_vrlqmi) BU_P10V_AV_3 (VEXTRACTBL, "vextdubvlx", CONST, vextractlv16qi) BU_P10V_AV_3 (VEXTRACTHL, "vextduhvlx", CONST, vextractlv8hi) BU_P10V_AV_3 (VEXTRACTWL, "vextduwvlx", CONST, vextractlv4si) @@ -2863,6 +2897,12 @@ BU_P10_OVERLOAD_2 (CLRR, "clrr") BU_P10_OVERLOAD_2 (GNB, "gnb") BU_P10_OVERLOAD_4 (XXEVAL, "xxeval") BU_P10_OVERLOAD_2 (XXGENPCVM, "xxgenpcvm") +BU_P10_OVERLOAD_2 (VRLQ, "vrlq") +BU_P10_OVERLOAD_2 (VSLQ, "vslq") +BU_P10_OVERLOAD_2 (VSRQ, "vsrq") +BU_P10_OVERLOAD_2 (VSRAQ, "vsraq") +BU_P10_OVERLOAD_2 (DIVE, "dive") +BU_P10_OVERLOAD_2 (MOD, "mod") BU_P10_OVERLOAD_3 (EXTRACTL, "extractl") BU_P10_OVERLOAD_3 (EXTRACTH, "extracth") @@ -2882,6 +2922,8 @@ BU_P10_OVERLOAD_1 (VSTRIL_P, "stril_p") BU_P10_OVERLOAD_1 (XVTLSBB_ZEROS, "xvtlsbb_all_zeros") 
BU_P10_OVERLOAD_1 (XVTLSBB_ONES, "xvtlsbb_all_ones") +BU_P10_OVERLOAD_1 (SIGNEXT, "vsignextq") + BU_P10_OVERLOAD_1 (MTVSRBM, "mtvsrbm") BU_P10_OVERLOAD_1 (MTVSRHM, "mtvsrhm") diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index 9e514a01012..87fff5c1c80 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -839,6 +839,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPEQ, P8V_BUILTIN_VCMPEQUD, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPEQ, P10V_BUILTIN_VCMPEQUT, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPEQ, P10V_BUILTIN_VCMPEQUT, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_CMPEQ, VSX_BUILTIN_XVCMPEQDP, @@ -885,6 +889,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { ALTIVEC_BUILTIN_VEC_CMPGE, VSX_BUILTIN_CMPGE_U2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0}, + + { ALTIVEC_BUILTIN_VEC_CMPGE, P10V_BUILTIN_CMPGE_1TI, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0}, + { ALTIVEC_BUILTIN_VEC_CMPGE, P10V_BUILTIN_CMPGE_U1TI, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0}, { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTUB, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTSB, @@ -899,8 +909,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, P8V_BUILTIN_VCMPGTUD, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, 
RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPGT, P10V_BUILTIN_VCMPGTUT, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, P8V_BUILTIN_VCMPGTSD, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPGT, P10V_BUILTIN_VCMPGTST, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTFP, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { ALTIVEC_BUILTIN_VEC_CMPGT, VSX_BUILTIN_XVCMPGTDP, @@ -943,6 +957,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { ALTIVEC_BUILTIN_VEC_CMPLE, VSX_BUILTIN_CMPLE_U2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0}, + { ALTIVEC_BUILTIN_VEC_CMPLE, P10V_BUILTIN_CMPLE_1TI, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0}, + { ALTIVEC_BUILTIN_VEC_CMPLE, P10V_BUILTIN_CMPLE_U1TI, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0}, { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTUB, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTSB, @@ -995,6 +1014,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, { VSX_BUILTIN_VEC_DIV, VSX_BUILTIN_UDIV_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { VSX_BUILTIN_VEC_DIV, P10V_BUILTIN_DIV_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { VSX_BUILTIN_VEC_DIV, P10V_BUILTIN_UDIV_V1TI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, + { VSX_BUILTIN_VEC_DOUBLE, VSX_BUILTIN_XVCVSXDDP, RS6000_BTI_V2DF, RS6000_BTI_V2DI, 0, 0 }, { VSX_BUILTIN_VEC_DOUBLE, VSX_BUILTIN_XVCVUXDDP, @@ -1789,6 +1814,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { 
ALTIVEC_BUILTIN_VEC_MULE, P8V_BUILTIN_VMULEUW, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_MULE, P10V_BUILTIN_VMULESD, + RS6000_BTI_V1TI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_MULE, P10V_BUILTIN_VMULEUD, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULEUB, ALTIVEC_BUILTIN_VMULEUB, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULESB, ALTIVEC_BUILTIN_VMULESB, @@ -1812,6 +1842,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { ALTIVEC_BUILTIN_VEC_MULO, P8V_BUILTIN_VMULOUW, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_MULO, P10V_BUILTIN_VMULOSD, + RS6000_BTI_V1TI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_MULO, P10V_BUILTIN_VMULOUD, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSH, RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULOSH, ALTIVEC_BUILTIN_VMULOSH, @@ -1854,6 +1889,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V2DI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, P10V_BUILTIN_VNOR_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_bool_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, P10V_BUILTIN_VNOR_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, P10V_BUILTIN_VNOR_V1TI_UNS, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_NOR, P10V_BUILTIN_VNOR_V1TI_UNS, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_bool_V1TI, 0 }, + { 
ALTIVEC_BUILTIN_VEC_NOR, P10V_BUILTIN_VNOR_V1TI_UNS, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V2DI_UNS, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V2DI_UNS, @@ -2115,6 +2160,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_RL, P8V_BUILTIN_VRLD, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_RL, P10V_BUILTIN_VRLQ, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_RL, P10V_BUILTIN_VRLQ, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_VRLW, ALTIVEC_BUILTIN_VRLW, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VRLW, ALTIVEC_BUILTIN_VRLW, @@ -2133,12 +2183,23 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P9V_BUILTIN_VEC_RLMI, P9V_BUILTIN_VRLDMI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI }, + { P9V_BUILTIN_VEC_RLMI, P10V_BUILTIN_VRLQMI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI }, + { P9V_BUILTIN_VEC_RLMI, P10V_BUILTIN_VRLQMI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, { P9V_BUILTIN_VEC_RLNM, P9V_BUILTIN_VRLWNM, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { P9V_BUILTIN_VEC_RLNM, P9V_BUILTIN_VRLDNM, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { P9V_BUILTIN_VEC_RLNM, P10V_BUILTIN_VRLQNM, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, + { P9V_BUILTIN_VEC_RLNM, P10V_BUILTIN_VRLQNM, + 
RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_SL, ALTIVEC_BUILTIN_VSLB, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_SL, ALTIVEC_BUILTIN_VSLB, @@ -2155,6 +2216,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_SL, P8V_BUILTIN_VSLD, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_SL, P10V_BUILTIN_VSLQ, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_SL, P10V_BUILTIN_VSLQ, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_SQRT, VSX_BUILTIN_XVSQRTDP, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 }, { ALTIVEC_BUILTIN_VEC_SQRT, VSX_BUILTIN_XVSQRTSP, @@ -2351,6 +2417,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_SR, P8V_BUILTIN_VSRD, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_SR, P10V_BUILTIN_VSRQ, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_SR, P10V_BUILTIN_VSRQ, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRW, ALTIVEC_BUILTIN_VSRW, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRW, ALTIVEC_BUILTIN_VSRW, @@ -2379,6 +2450,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_SRA, P8V_BUILTIN_VSRAD, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_SRA, P10V_BUILTIN_VSRAQ, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 
}, + { ALTIVEC_BUILTIN_VEC_SRA, P10V_BUILTIN_VSRAQ, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRAW, ALTIVEC_BUILTIN_VSRAW, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRAW, ALTIVEC_BUILTIN_VSRAW, @@ -3996,12 +4072,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTUD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P10V_BUILTIN_VCMPGTUT_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P10V_BUILTIN_VCMPGTST_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, ALTIVEC_BUILTIN_VCMPGTFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, VSX_BUILTIN_XVCMPGTDP_P, @@ -4066,6 +4146,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P8V_BUILTIN_VCMPEQUD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P10V_BUILTIN_VCMPEQUT_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI }, + { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P10V_BUILTIN_VCMPEQUT_P, + 
RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, ALTIVEC_BUILTIN_VCMPEQFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, VSX_BUILTIN_XVCMPEQDP_P, @@ -4117,12 +4201,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTUD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P10V_BUILTIN_VCMPGTUT_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P10V_BUILTIN_VCMPGTST_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, ALTIVEC_BUILTIN_VCMPGEFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, VSX_BUILTIN_XVCMPGEDP_P, @@ -4771,6 +4859,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNEW, RS6000_BTI_bool_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPNE, P10V_BUILTIN_CMPNET, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, + RS6000_BTI_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPNE, P10V_BUILTIN_CMPNET, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, /* The following 2 entries have been deprecated. 
*/ { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNEB_P, @@ -4871,6 +4965,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNED_P, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 }, + { P9V_BUILTIN_VEC_VCMPNE_P, P10V_BUILTIN_VCMPNET_P, + RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNEFP_P, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, @@ -4976,7 +5072,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAED_P, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 }, - + { P9V_BUILTIN_VEC_VCMPAE_P, P10V_BUILTIN_VCMPAET_P, + RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEFP_P, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEDP_P, @@ -5903,6 +6000,21 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P10_BUILTIN_VEC_XVTLSBB_ONES, P10V_BUILTIN_XVTLSBB_ONES, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V16QI, 0, 0 }, + { P10_BUILTIN_VEC_SIGNEXT, P10V_BUILTIN_VSIGNEXTSD2Q, + RS6000_BTI_V1TI, RS6000_BTI_V2DI, 0, 0 }, + + { P10_BUILTIN_VEC_DIVE, P10V_BUILTIN_DIVES_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { P10_BUILTIN_VEC_DIVE, P10V_BUILTIN_DIVEU_V1TI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, + + { P10_BUILTIN_VEC_MOD, P10V_BUILTIN_MODS_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { P10_BUILTIN_VEC_MOD, P10V_BUILTIN_MODU_V1TI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, + { RS6000_BUILTIN_NONE, RS6000_BUILTIN_NONE, 0, 0, 0, 0 } }; @@ -12256,12 +12368,14 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case ALTIVEC_BUILTIN_VCMPEQUH: case ALTIVEC_BUILTIN_VCMPEQUW: case P8V_BUILTIN_VCMPEQUD: 
+ case P10V_BUILTIN_VCMPEQUT: fold_compare_helper (gsi, EQ_EXPR, stmt); return true; case P9V_BUILTIN_CMPNEB: case P9V_BUILTIN_CMPNEH: case P9V_BUILTIN_CMPNEW: + case P10V_BUILTIN_CMPNET: fold_compare_helper (gsi, NE_EXPR, stmt); return true; @@ -12273,6 +12387,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case VSX_BUILTIN_CMPGE_U4SI: case VSX_BUILTIN_CMPGE_2DI: case VSX_BUILTIN_CMPGE_U2DI: + case P10V_BUILTIN_CMPGE_1TI: + case P10V_BUILTIN_CMPGE_U1TI: fold_compare_helper (gsi, GE_EXPR, stmt); return true; @@ -12284,6 +12400,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case ALTIVEC_BUILTIN_VCMPGTUW: case P8V_BUILTIN_VCMPGTUD: case P8V_BUILTIN_VCMPGTSD: + case P10V_BUILTIN_VCMPGTUT: + case P10V_BUILTIN_VCMPGTST: fold_compare_helper (gsi, GT_EXPR, stmt); return true; @@ -12295,6 +12413,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case VSX_BUILTIN_CMPLE_U4SI: case VSX_BUILTIN_CMPLE_2DI: case VSX_BUILTIN_CMPLE_U2DI: + case P10V_BUILTIN_CMPLE_1TI: + case P10V_BUILTIN_CMPLE_U1TI: fold_compare_helper (gsi, LE_EXPR, stmt); return true; @@ -13000,6 +13120,8 @@ rs6000_init_builtins (void) ? 
"__vector __bool long" : "__vector __bool long long", bool_long_long_type_node, 2); + bool_V1TI_type_node = rs6000_vector_type ("__vector __bool __int128", + intTI_type_node, 1); pixel_V8HI_type_node = rs6000_vector_type ("__vector __pixel", pixel_type_node, 8); @@ -13185,6 +13307,10 @@ altivec_init_builtins (void) = build_function_type_list (integer_type_node, integer_type_node, V2DI_type_node, V2DI_type_node, NULL_TREE); + tree int_ftype_int_v1ti_v1ti + = build_function_type_list (integer_type_node, + integer_type_node, V1TI_type_node, + V1TI_type_node, NULL_TREE); tree void_ftype_v4si = build_function_type_list (void_type_node, V4SI_type_node, NULL_TREE); tree v8hi_ftype_void @@ -13537,6 +13663,9 @@ altivec_init_builtins (void) case E_VOIDmode: type = int_ftype_int_opaque_opaque; break; + case E_V1TImode: + type = int_ftype_int_v1ti_v1ti; + break; case E_V2DImode: type = int_ftype_int_v2di_v2di; break; @@ -14136,6 +14265,10 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case P10V_BUILTIN_XXGENPCVM_V8HI: case P10V_BUILTIN_XXGENPCVM_V4SI: case P10V_BUILTIN_XXGENPCVM_V2DI: + case P10V_BUILTIN_VMULEUD: + case P10V_BUILTIN_VMULOUD: + case P10V_BUILTIN_DIVEU_V1TI: + case P10V_BUILTIN_MODU_V1TI: h.uns_p[0] = 1; h.uns_p[1] = 1; h.uns_p[2] = 1; @@ -14235,10 +14368,13 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case VSX_BUILTIN_CMPGE_U8HI: case VSX_BUILTIN_CMPGE_U4SI: case VSX_BUILTIN_CMPGE_U2DI: + case P10V_BUILTIN_CMPGE_U1TI: case ALTIVEC_BUILTIN_VCMPGTUB: case ALTIVEC_BUILTIN_VCMPGTUH: case ALTIVEC_BUILTIN_VCMPGTUW: case P8V_BUILTIN_VCMPGTUD: + case P10V_BUILTIN_VCMPGTUT: + case P10V_BUILTIN_VCMPEQUT: h.uns_p[1] = 1; h.uns_p[2] = 1; break; diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 6f204ca202a..cc629f2d938 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -19546,6 +19546,7 @@ rs6000_handle_altivec_attribute (tree *node, case 'b': switch (mode) { + case 
E_TImode: case E_V1TImode: result = bool_V1TI_type_node; break; case E_DImode: case E_V2DImode: result = bool_V2DI_type_node; break; case E_SImode: case E_V4SImode: result = bool_V4SI_type_node; break; case E_HImode: case E_V8HImode: result = bool_V8HI_type_node; break; diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index bbd8060e143..32ed95cc813 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -2322,7 +2322,7 @@ extern int frame_pointer_needed; #define RS6000_BTM_FLOAT128_HW MASK_FLOAT128_HW /* IEEE 128-bit float h/w. */ #define RS6000_BTM_MMA MASK_MMA /* ISA 3.1 MMA. */ #define RS6000_BTM_P10 MASK_POWER10 - +#define RS6000_BTM_TI_VECTOR_OPS MASK_TI_VECTOR_OPS /* 128-bit integer support */ #define RS6000_BTM_COMMON (RS6000_BTM_ALTIVEC \ | RS6000_BTM_VSX \ @@ -2436,6 +2436,7 @@ enum rs6000_builtin_type_index RS6000_BTI_bool_V8HI, /* __vector __bool short */ RS6000_BTI_bool_V4SI, /* __vector __bool int */ RS6000_BTI_bool_V2DI, /* __vector __bool long */ + RS6000_BTI_bool_V1TI, /* __vector __bool 128-bit */ RS6000_BTI_pixel_V8HI, /* __vector __pixel */ RS6000_BTI_long, /* long_integer_type_node */ RS6000_BTI_unsigned_long, /* long_unsigned_type_node */ @@ -2489,6 +2490,7 @@ enum rs6000_builtin_type_index #define bool_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V8HI]) #define bool_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V4SI]) #define bool_V2DI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V2DI]) +#define bool_V1TI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V1TI]) #define pixel_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_pixel_V8HI]) #define long_long_integer_type_internal_node (rs6000_builtin_types[RS6000_BTI_long_long]) diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md index 796345c80d3..0cca4232619 100644 --- a/gcc/config/rs6000/vector.md +++ b/gcc/config/rs6000/vector.md @@ -685,6 +685,13 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" "") +(define_expand 
"vector_gtv1ti" + [(set (match_operand:V1TI 0 "vlogical_operand") + (gt:V1TI (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand")))] + "TARGET_POWER10" + "") + ; >= for integer vectors: swap operands and apply not-greater-than (define_expand "vector_nlt<mode>" [(set (match_operand:VEC_I 3 "vlogical_operand") @@ -697,6 +704,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_nltv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gt:V1TI (match_operand:V1TI 2 "vlogical_operand") + (match_operand:V1TI 1 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + (define_expand "vector_gtu<mode>" [(set (match_operand:VEC_I 0 "vint_operand") (gtu:VEC_I (match_operand:VEC_I 1 "vint_operand") @@ -704,6 +722,13 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" "") +(define_expand "vector_gtuv1ti" + [(set (match_operand:V1TI 0 "altivec_register_operand") + (gtu:V1TI (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand")))] + "TARGET_POWER10" + "") + ; >= for integer vectors: swap operands and apply not-greater-than (define_expand "vector_nltu<mode>" [(set (match_operand:VEC_I 3 "vlogical_operand") @@ -716,6 +741,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_nltuv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gtu:V1TI (match_operand:V1TI 2 "vlogical_operand") + (match_operand:V1TI 1 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + (define_expand "vector_geu<mode>" [(set (match_operand:VEC_I 0 "vint_operand") (geu:VEC_I (match_operand:VEC_I 1 "vint_operand") @@ -735,6 +771,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_ngtv1ti" + 
[(set (match_operand:V1TI 3 "vlogical_operand") + (gt:V1TI (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + (define_expand "vector_ngtu<mode>" [(set (match_operand:VEC_I 3 "vlogical_operand") (gtu:VEC_I (match_operand:VEC_I 1 "vlogical_operand") @@ -746,6 +793,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_ngtuv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gtu:V1TI (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + ; There are 14 possible vector FP comparison operators, gt and eq of them have ; been expanded above, so just support 12 remaining operators here. @@ -894,6 +952,18 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" "") +(define_expand "vector_eq_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "vlogical_operand") + (eq:V1TI (match_dup 1) + (match_dup 2)))])] + "TARGET_POWER10" + "") + ;; This expansion handles the V16QI, V8HI, and V4SI modes in the ;; implementation of the vec_all_ne built-in functions on Power9. 
(define_expand "vector_ne_<mode>_p" @@ -976,6 +1046,23 @@ operands[3] = gen_reg_rtx (V2DImode); }) +(define_expand "vector_ne_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_dup 3) + (eq:V1TI (match_dup 1) + (match_dup 2)))]) + (set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (reg:CC CR6_REGNO) + (const_int 0)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx (V1TImode); +}) + ;; This expansion handles the V2DI mode in the implementation of the ;; vec_any_eq built-in function on Power9. ;; @@ -1002,6 +1089,26 @@ operands[3] = gen_reg_rtx (V2DImode); }) +(define_expand "vector_ae_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_dup 3) + (eq:V1TI (match_dup 1) + (match_dup 2)))]) + (set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (reg:CC CR6_REGNO) + (const_int 0))) + (set (match_dup 0) + (xor:SI (match_dup 0) + (const_int 1)))] + "TARGET_POWER10" +{ + operands[3] = gen_reg_rtx (V1TImode); +}) + ;; This expansion handles the V4SF and V2DF modes in the Power9 ;; implementation of the vec_all_ne built-in functions. 
Note that the ;; expansions for this pattern with these modes makes no use of power9- @@ -1061,6 +1168,18 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" "") +(define_expand "vector_gt_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(gt:CC (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "vlogical_operand") + (gt:V1TI (match_dup 1) + (match_dup 2)))])] + "TARGET_POWER10" + "") + (define_expand "vector_ge_<mode>_p" [(parallel [(set (reg:CC CR6_REGNO) @@ -1085,6 +1204,18 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" "") +(define_expand "vector_gtu_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(gtu:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "altivec_register_operand") + (gtu:V1TI (match_dup 1) + (match_dup 2)))])] + "TARGET_POWER10" + "") + ;; AltiVec/VSX predicates. ;; This expansion is triggered during expansion of predicate built-in @@ -1460,6 +1591,20 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" "") +(define_expand "vrotlv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (rotate:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + /* Shift amount needs to be put in bits[57:63] of 128-bit operand2. 
*/ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vrlq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Expanders for rotatert to make use of vrotl (define_expand "vrotr<mode>3" [(set (match_operand:VEC_I 0 "vint_operand") @@ -1481,6 +1626,21 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" "") +;; No immediate version of this 128-bit instruction +(define_expand "vashlv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + /* Shift amount needs to be put in bits[57:63] of 128-bit operand2. */ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vslq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Expanders for logical shift right on each vector element (define_expand "vlshr<mode>3" [(set (match_operand:VEC_I 0 "vint_operand") @@ -1489,6 +1649,21 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" "") +;; No immediate version of this 128-bit instruction +(define_expand "vlshrv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + /* Shift amount needs to be put into bits[57:63] of 128-bit operand2. 
*/ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vsrq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Expanders for arithmetic shift right on each vector element (define_expand "vashr<mode>3" [(set (match_operand:VEC_I 0 "vint_operand") @@ -1496,6 +1671,22 @@ (match_operand:VEC_I 2 "vint_operand")))] "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)" "") + +;; No immediate version of this 128-bit instruction +(define_expand "vashrv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (ashiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_POWER10" +{ + /* Shift amount needs to be put into bits[57:63] of 128-bit operand2. */ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vsraq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Vector reduction expanders for VSX ; The (VEC_reduc:... 
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 31fcffe8f33..5b6a0bd728a 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -298,6 +298,12 @@ UNSPEC_VSX_XXSPLTD UNSPEC_VSX_DIVSD UNSPEC_VSX_DIVUD + UNSPEC_VSX_DIVSQ + UNSPEC_VSX_DIVUQ + UNSPEC_VSX_DIVESQ + UNSPEC_VSX_DIVEUQ + UNSPEC_VSX_MODSQ + UNSPEC_VSX_MODUQ UNSPEC_VSX_MULSD UNSPEC_VSX_SIGN_EXTEND UNSPEC_VSX_XVCVBF16SPN @@ -1732,6 +1738,60 @@ } [(set_attr "type" "div")]) +(define_insn "vsx_div_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVSQ))] + "TARGET_POWER10" + "vdivsq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_udiv_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVUQ))] + "TARGET_POWER10" + "vdivuq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_dives_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVESQ))] + "TARGET_POWER10" + "vdivesq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_diveu_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVEUQ))] + "TARGET_POWER10" + "vdiveuq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_mods_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_MODSQ))] + "TARGET_POWER10" + "vmodsq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn 
"vsx_modu_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_MODUQ))] + "TARGET_POWER10" + "vmoduq %0,%1,%2" + [(set_attr "type" "div")]) + ;; *tdiv* instruction returning the FG flag (define_expand "vsx_tdiv<mode>3_fg" [(set (match_dup 3) @@ -3083,6 +3143,21 @@ "xxpermdi %x0,%x1,%x1,2" [(set_attr "type" "vecperm")]) +;; Swap upper/lower 64-bit values in a 128-bit vector +(define_insn "xxswapd_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (subreg:V1TI + (vec_select:V2DI + (subreg:V2DI + (match_operand:V1TI 1 "vsx_register_operand" "v") 0 ) + (parallel [(const_int 1)(const_int 0)])) + 0))] + "TARGET_POWER10" +;; AIX does not support extended mnemonic xxswapd. Use the basic +;; mnemonic xxpermdi instead. + "xxpermdi %x0,%x1,%x1,2" + [(set_attr "type" "vecperm")]) + (define_insn "xxgenpcvm_<mode>_internal" [(set (match_operand:VSX_EXTRACT_I4 0 "altivec_register_operand" "=wa") (unspec:VSX_EXTRACT_I4 @@ -4767,6 +4842,15 @@ (set_attr "type" "vecload")]) +;; ISA 3.1 vector extend sign support +(define_insn "vsx_sign_extend_v2di_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "vsx_register_operand" "v")] + UNSPEC_VSX_SIGN_EXTEND))] + "TARGET_POWER10" + "vextsd2q %0,%1" + [(set_attr "type" "vecexts")]) + ;; ISA 3.0 vector extend sign support (define_insn "vsx_sign_extend_qi_<mode>" @@ -5451,6 +5535,19 @@ "vcmpneb %0,%1,%2" [(set_attr "type" "vecsimple")]) +;; Vector Compare Not Equal v1ti (specified/not+eq:) +(define_expand "vcmpnet" + [(set (match_operand:V1TI 0 "altivec_register_operand") + (not:V1TI + (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))))] + "TARGET_POWER10" +{ + emit_insn (gen_eqvv1ti3 (operands[0], operands[1], operands[2])); + emit_insn (gen_one_cmplv1ti2 (operands[0], 
operands[0])); + DONE; +}) + ;; Vector Compare Not Equal or Zero Byte (define_insn "vcmpnezb" [(set (match_operand:V16QI 0 "altivec_register_operand" "=v") diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index c1c2c9f9bf7..99cc053acbe 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -21316,6 +21316,180 @@ Generate PCV from specified Mask size, as if implemented by the immediate value is either 0, 1, 2 or 3. @findex vec_genpcvm +@smallexample +@exdent vector unsigned __int128 vec_rl (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_rl (vector signed __int128, + vector unsigned __int128); +@end smallexample + +Returns the result of rotating the first input left by the number of bits +specified in the most significant quad word of the second input truncated to +7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_rlmi (vector unsigned __int128, + vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_rlmi (vector signed __int128, + vector signed __int128, + vector unsigned __int128); +@end smallexample + +Returns the result of rotating the first input and inserting it under mask +into the second input. The first bit in the mask and the last bit in the mask are +obtained from the two 7-bit fields bits [108:115] and bits [117:123] +respectively of the second input. The shift is obtained from the third input +in the 7-bit field [125:131] where all bits are counted from zero at the left. + +@smallexample +@exdent vector unsigned __int128 vec_rlnm (vector unsigned __int128, + vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_rlnm (vector signed __int128, + vector unsigned __int128, + vector unsigned __int128); +@end smallexample + +Returns the result of rotating the first input and ANDing it with a mask. 
The +first bit in the mask and the last bit in the mask are obtained from the two +7-bit fields bits [117:123] and bits [125:131] respectively of the second +input. The shift is obtained from the third input in the 7-bit field bits +[125:131] where all bits are counted from zero at the left. + +@smallexample +@exdent vector unsigned __int128 vec_sl (vector unsigned __int128, vector unsigned __int128); +@exdent vector signed __int128 vec_sl (vector signed __int128, vector unsigned __int128); +@end smallexample + +Returns the result of shifting the first input left by the number of bits +specified in the most significant bits of the second input truncated to +7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_sr (vector unsigned __int128, vector unsigned __int128); +@exdent vector signed __int128 vec_sr (vector signed __int128, vector unsigned __int128); +@end smallexample + +Returns the result of performing a logical right shift of the first argument +by the number of bits specified in the most significant double word of the +second input truncated to 7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_sra (vector unsigned __int128, vector unsigned __int128); +@exdent vector signed __int128 vec_sra (vector signed __int128, vector unsigned __int128); +@end smallexample + +Returns the result of performing an arithmetic right shift of the first argument +by the number of bits specified in the most significant bits of the +second input truncated to 7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_mule (vector unsigned long long, + vector unsigned long long); +@exdent vector signed __int128 vec_mule (vector signed long long, + vector signed long long); +@end smallexample + +Returns a vector containing a 128-bit integer result of multiplying the even +doubleword elements of the two inputs. 
+ +@smallexample +@exdent vector unsigned __int128 vec_mulo (vector unsigned long long, + vector unsigned long long); +@exdent vector signed __int128 vec_mulo (vector signed long long, + vector signed long long); +@end smallexample + +Returns a vector containing a 128-bit integer result of multiplying the odd +doubleword elements of the two inputs. + +@smallexample +@exdent vector unsigned __int128 vec_div (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_div (vector signed __int128, + vector signed __int128); +@end smallexample + +Returns the result of dividing the first operand by the second operand. An +attempt to divide any value by zero or to divide the most negative signed +128-bit integer by negative one results in an undefined value. + +@smallexample +@exdent vector unsigned __int128 vec_dive (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_dive (vector signed __int128, + vector signed __int128); +@end smallexample + +The result is produced by shifting the first input left by 128 bits and +dividing by the second. If an attempt is made to divide by zero or the result +is larger than 128 bits, the result is undefined. + +@smallexample +@exdent vector unsigned __int128 vec_mod (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_mod (vector signed __int128, + vector signed __int128); +@end smallexample + +The result is the remainder from dividing the first input by the second +input. + +The following builtins perform 128-bit vector comparisons. The +@code{vec_all_xx}, @code{vec_any_xx}, and @code{vec_cmpxx} functions, where @code{xx} is +one of the operations @code{eq, ne, gt, lt, ge, le}, perform pairwise +comparisons between the elements at the same positions within their two vector +arguments. The @code{vec_all_xx} function returns a non-zero value if and only +if all pairwise comparisons are true. 
The @code{vec_any_xx} function returns +a non-zero value if and only if at least one pairwise comparison is true. The +@code{vec_cmpxx} function returns a vector of the same type as its two +arguments, within which each element consists of all ones to denote that the +specified logical comparison of the corresponding elements was true. +Otherwise, the element of the returned vector contains all zeros. + +@smallexample +vector bool __int128 vec_cmpeq (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpeq (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmpne (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpne (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmpgt (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpgt (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmplt (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmplt (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmpge (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpge (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmple (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmple (vector unsigned __int128, vector unsigned __int128); + +int vec_all_eq (vector signed __int128, vector signed __int128); +int vec_all_eq (vector unsigned __int128, vector unsigned __int128); +int vec_all_ne (vector signed __int128, vector signed __int128); +int vec_all_ne (vector unsigned __int128, vector unsigned __int128); +int vec_all_gt (vector signed __int128, vector signed __int128); +int vec_all_gt (vector unsigned __int128, vector unsigned __int128); +int vec_all_lt (vector signed __int128, vector signed __int128); +int vec_all_lt (vector unsigned __int128, vector unsigned __int128); +int vec_all_ge (vector 
signed __int128, vector signed __int128); +int vec_all_ge (vector unsigned __int128, vector unsigned __int128); +int vec_all_le (vector signed __int128, vector signed __int128); +int vec_all_le (vector unsigned __int128, vector unsigned __int128); + +int vec_any_eq (vector signed __int128, vector signed __int128); +int vec_any_eq (vector unsigned __int128, vector unsigned __int128); +int vec_any_ne (vector signed __int128, vector signed __int128); +int vec_any_ne (vector unsigned __int128, vector unsigned __int128); +int vec_any_gt (vector signed __int128, vector signed __int128); +int vec_any_gt (vector unsigned __int128, vector unsigned __int128); +int vec_any_lt (vector signed __int128, vector signed __int128); +int vec_any_lt (vector unsigned __int128, vector unsigned __int128); +int vec_any_ge (vector signed __int128, vector signed __int128); +int vec_any_ge (vector unsigned __int128, vector unsigned __int128); +int vec_any_le (vector signed __int128, vector signed __int128); +int vec_any_le (vector unsigned __int128, vector unsigned __int128); +@end smallexample + + @node PowerPC Hardware Transactional Memory Built-in Functions @subsection PowerPC Hardware Transactional Memory Built-in Functions GCC provides two interfaces for accessing the Hardware Transactional diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c new file mode 100644 index 00000000000..85ad544e22b --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c @@ -0,0 +1,2254 @@ +/* { dg-do run } */ +/* { dg-options "-mcpu=power10 -O2 -save-temps" } */ +/* { dg-require-effective-target power10_hw } */ +/* { dg-require-effective-target ppc_native_128bit } */ + +/* Check that the expected 128-bit instructions are generated if the processor + supports the 128-bit integer instructions. 
*/ +/* { dg-final { scan-assembler-times {\mvextsd2q\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvslq\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvsrq\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvsraq\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvrlq\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvrlqnm\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvrlqmi\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mvcmpuq\M} 0 } } */ +/* { dg-final { scan-assembler-times {\mvcmpsq\M} 0 } } */ +/* { dg-final { scan-assembler-times {\mvcmpequq\M} 0 } } */ +/* { dg-final { scan-assembler-times {\mvcmpequq.\M} 16 } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 0 } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtsq.\M} 16 } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 0 } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtuq.\M} 16 } } */ +/* { dg-final { scan-assembler-times {\mvmuleud\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmuloud\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmulesd\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmulosd\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvdivsq\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvdivuq\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvdivesq\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvdiveuq\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmodsq\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mvmoduq\M} 1 } } */ + +#include <altivec.h> + +#define DEBUG 0 + +#if DEBUG +#include <stdio.h> +#include <stdlib.h> + + +void print_i128(__int128_t val) +{ + printf(" %lld %llu (0x%llx %llx)", + (signed long long)(val >> 64), + (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF), + (unsigned long long)(val >> 64), + (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF)); +} +#endif + +void abort (void); + +int main () +{ + int i, result_int; + + __int128_t arg1, result; + __uint128_t uarg2; + + vector 
signed long long int vec_arg1_di, vec_arg2_di; + vector unsigned long long int vec_uarg1_di, vec_uarg2_di, vec_uarg3_di; + vector unsigned long long int vec_uresult_di; + vector unsigned long long int vec_uexpected_result_di; + + __int128_t expected_result; + __uint128_t uexpected_result; + + vector __int128_t vec_arg1, vec_arg2, vec_result; + vector __uint128_t vec_uarg1, vec_uarg2, vec_uarg3, vec_uresult; + vector bool __int128 vec_result_bool; + + /* sign extend doubleword to 128-bit integer */ + vec_arg1_di[0] = 1000; + vec_arg1_di[1] = -123456; + + expected_result = 1000; + + vec_result = vec_signextq (vec_arg1_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_signextq ((long long) %lld) = ", vec_arg1_di[0]); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1_di[0] = -123456; + vec_arg1_di[1] = 1000; + + expected_result = -123456; + + vec_result = vec_signextq (vec_arg1_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_signextq ((long long) %lld) = ", vec_arg1_di[0]); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* test shift 128-bit integers. + Note, the shift count is given by the lower 7 bits of the shift amount argument. 
*/ + vec_arg1[0] = 3; + vec_uarg2[0] = 2; + expected_result = vec_arg1[0]*4; + + vec_result = vec_sl (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_sl(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" << %lld", (long long) (vec_uarg2[0] & 0xFF)); + printf(" = "); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + arg1 = 3; + uarg2 = 4; + expected_result = arg1*16; + + result = arg1 << uarg2; + + if (result != expected_result) { +#if DEBUG + printf("ERROR: int128 << uint128: "); + print_i128(arg1); + printf(" << %lld", (long long) (uarg2 & 0xFF)); + printf(" = "); + print_i128(result); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 3; + vec_uarg2[0] = 2; + uexpected_result = vec_uarg1[0]*4; + + vec_uresult = vec_sl (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_sl(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" << %lld", (long long) (vec_uarg2[0] & 0xFF)); + printf(" = "); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12; + vec_uarg2[0] = 2; + expected_result = vec_arg1[0]/4; + + vec_result = vec_sr (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_sr(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" >> %lld", (long long) (vec_uarg2[0] & 0xFF)); + printf(" = "); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 48; + vec_uarg2[0] = 2; + uexpected_result = vec_uarg1[0]/4; + + vec_uresult = vec_sr (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if 
DEBUG + printf("ERROR: vec_sr(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" >> %lld", (long long) (vec_uarg2[0] & 0xFF)); + printf(" = "); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + arg1 = 48; + uarg2 = 4; + expected_result = arg1/16; + + result = arg1 >> uarg2; + + if (result != expected_result) { +#if DEBUG + printf("ERROR: int128 >> uint128: "); + print_i128(arg1); + printf(" >> %lld", (long long) (uarg2 & 0xFF)); + printf(" = "); + print_i128(result); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = (vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_uarg2[0] = 32; + expected_result = 0x0000000012345678ULL; + expected_result = (expected_result << 64) | 0x90ABCDEFAABBCCDDULL; + + vec_result = vec_sra (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_sra(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" >> %lld = \n", (long long) (vec_uarg2[0] & 0xFF)); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 48; + uexpected_result = 0xFFFFFFFFFFFFAABBULL; + uexpected_result = (uexpected_result << 64) | 0xCCDDEEFF11221234ULL; + + vec_uresult = vec_sra (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_sra(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" >> %lld = \n", (long long) (vec_uarg2[0] & 0xFF)); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = 
(vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_uarg2[0] = 32; + expected_result = 0x90ABCDEFAABBCCDDULL; + expected_result = (expected_result << 64) | 0xEEFF112212345678ULL; + + vec_result = vec_rl (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_rl(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" rotated left by %lld = \n", (long long) (vec_uarg2[0] & 0xFF)); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 48; + uexpected_result = 0x11221234567890ABULL; + uexpected_result = (uexpected_result << 64) | 0xCDEFAABBCCDDEEFFULL; + + vec_uresult = vec_rl (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_rl(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" rotated left by %lld = \n", (long long) (vec_uarg2[0] & 0xFF)); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = (vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_uarg2[0] = (32 << 8) | 95; + vec_uarg3[0] = 32; + expected_result = 0xaabbccddULL; + expected_result = (expected_result << 64) | 0xeeff112200000000ULL; + + vec_result = vec_rlnm (vec_arg1, vec_uarg2, vec_uarg3); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_rlnm(int128, uint128, uint128): "); + print_i128(vec_arg1[0]); + printf(" << %lld = \n", (long long) (vec_uarg3[0] & 0xFF)); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = (8 << 8) | 119; + vec_uarg3[0] = 48; 
+ + uexpected_result = 0x00221234567890ABULL; + uexpected_result = (uexpected_result << 64) | 0xCDEFAABBCCDDEE00ULL; + + vec_uresult = vec_rlnm (vec_uarg1, vec_uarg2, vec_uarg3); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_rlnm(uint128, uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" << %lld = \n", (long long) (vec_uarg3[0] & 0xFF)); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = (vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_arg2[0] = 0x000000000000DEADULL; + vec_arg2[0] = (vec_arg2[0] << 64) | 0x0000BEEF00000000ULL; + vec_uarg3[0] = 96 << 16 | 127 << 8 | 32; + expected_result = 0x000000000000DEADULL; + expected_result = (expected_result << 64) | 0x0000BEEF12345678ULL; + + vec_result = vec_rlmi (vec_arg1, vec_arg2, vec_uarg3); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_rlmi(int128, int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" << %lld = \n", (long long) (vec_uarg3[0] & 0xFF)); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 0xDEAD000000000000ULL; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 0x000000000000BEEFULL; + vec_uarg3[0] = 16 << 16 | 111 << 8 | 48; + uexpected_result = 0xDEAD1234567890ABULL; + uexpected_result = (uexpected_result << 64) | 0xCDEFAABBCCDDBEEFULL; + + vec_uresult = vec_rlmi (vec_uarg1, vec_uarg2, vec_uarg3); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_rlmi(uint128, uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" << %lld = \n", (long long) (vec_uarg3[0] & 0xFF)); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + 
print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* 128-bit compare tests, result is all 1's if true */ + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1[0] = 2468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + uexpected_result = 0xFFFFFFFFFFFFFFFFULL; + uexpected_result = (uexpected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpgt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != uexpected_result) { +#if DEBUG + printf("ERROR: unsigned vec_cmpgt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpgt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed vec_cmpgt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpeq (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR:not equal signed vec_cmpeq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + 
print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpeq (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed equal vec_cmpeq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpeq (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned not equal vec_cmpeq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpeq (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: equal unsigned vec_cmpeq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] 
<< 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpne (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned not equal vec_cmpne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpne (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: equal unsigned vec_cmpne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpne (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: not equal signed vec_cmpne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0x0ULL; + + vec_result_bool = 
vec_cmpne (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed equal vec_cmpne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmplt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 > arg2 vec_cmplt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 1234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 12468; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmplt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 < arg2 vec_cmplt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmplt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 = arg2 vec_cmplt ( "); + 
print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmplt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 > arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -1234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 12468; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmplt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 < arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmplt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); 
+ print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmple (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 > arg2 vec_cmple ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 1234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 12468; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 < arg2 vec_cmple ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 = arg2 vec_cmple ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); 
+#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmple (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 > arg2 vec_cmple ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -1234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 12468; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 < arg2 vec_cmple ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_cmple ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 
1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 > arg2 vec_cmpge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 1234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 12468; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmpge (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 < arg2 vec_cmpge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 = arg2 vec_cmpge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 
0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 > arg2 vec_cmpge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -1234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 12468; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmpge (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 < arg2 vec_cmpge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_cmpge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_eq (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_eq ( "); + print_i128(vec_arg1[0]); + 
printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_eq (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_eq (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_eq (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_ne (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_ne (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + 
print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_ne (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_ne (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_lt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_lt (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_lt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") 
failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_lt (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); 
+#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_ge (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + 
vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_ge (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_ge (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_ge (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_eq (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_eq (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = 
(vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_eq (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_eq (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_ne (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_ne (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_ne (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + 
vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_ne (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_lt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_lt (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_lt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_lt (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = 
vec_any_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_le 
(vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_ge (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_ge (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_ge (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if 
DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_ge (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + /* Vector multiply Even and Odd tests */ + vec_arg1_di[0] = 200; + vec_arg1_di[1] = 400; + vec_arg2_di[0] = 1234; + vec_arg2_di[1] = 4567; + expected_result = vec_arg1_di[0] * vec_arg2_di[0]; + + vec_result = vec_mule (vec_arg1_di, vec_arg2_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_mule (signed, signed) failed.\n"); + printf(" vec_arg1_di[0] = %lld\n", vec_arg1_di[0]); + printf(" vec_arg2_di[0] = %lld\n", vec_arg2_di[0]); + printf("Result = "); + print_i128(vec_result[0]); + printf("\nExpected Result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1_di[0] = -200; + vec_arg1_di[1] = -400; + vec_arg2_di[0] = 1234; + vec_arg2_di[1] = 4567; + expected_result = vec_arg1_di[1] * vec_arg2_di[1]; + + vec_result = vec_mulo (vec_arg1_di, vec_arg2_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_mulo (signed, signed) failed.\n"); + printf(" vec_arg1_di[1] = %lld\n", vec_arg1_di[1]); + printf(" vec_arg2_di[1] = %lld\n", vec_arg2_di[1]); + printf("Result = "); + print_i128(vec_result[0]); + printf("\nExpected Result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1_di[0] = 200; + vec_uarg1_di[1] = 400; + vec_uarg2_di[0] = 1234; + vec_uarg2_di[1] = 4567; + uexpected_result = vec_uarg1_di[0] * vec_uarg2_di[0]; +
vec_uresult = vec_mule (vec_uarg1_di, vec_uarg2_di); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_mule (unsigned, unsigned) failed.\n"); + printf(" vec_uarg1_di[0] = %llu\n", vec_uarg1_di[0]); + printf(" vec_uarg2_di[0] = %llu\n", vec_uarg2_di[0]); + printf("Result = "); + print_i128(vec_uresult[0]); + printf("\nExpected Result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1_di[0] = 200; + vec_uarg1_di[1] = 400; + vec_uarg2_di[0] = 1234; + vec_uarg2_di[1] = 4567; + uexpected_result = vec_uarg1_di[1] * vec_uarg2_di[1]; + + vec_uresult = vec_mulo (vec_uarg1_di, vec_uarg2_di); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_mulo (unsigned, unsigned) failed.\n"); + printf(" vec_uarg1_di[1] = %llu\n", vec_uarg1_di[1]); + printf(" vec_uarg2_di[1] = %llu\n", vec_uarg2_di[1]); + printf("Result = "); + print_i128(vec_uresult[0]); + printf("\nExpected Result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* Vector Divide Quadword */ + vec_arg1[0] = -12345678; + vec_arg2[0] = 2; + expected_result = -6172839; + + vec_result = vec_div (vec_arg1, vec_arg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_div (signed, signed) failed.\n"); + printf("vec_arg1[0] = "); + print_i128(vec_arg1[0]); + printf("\nvec_arg2[0] = "); + print_i128(vec_arg2[0]); + printf("\nResult = "); + print_i128(vec_result[0]); + printf("\nExpected result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 24680; + vec_uarg2[0] = 4; + uexpected_result = 6170; + + vec_uresult = vec_div (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_div (unsigned, unsigned) failed.\n"); + printf("vec_uarg1[0] = "); + print_i128(vec_uarg1[0]); + printf("\nvec_uarg2[0] = "); + print_i128(vec_uarg2[0]); + printf("\nResult = "); +
print_i128(vec_uresult[0]); + printf("\nExpected result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* Vector Divide Extended Quadword */ + vec_arg1[0] = -20; /* the dividend is this value with 128 zero bits appended */ + vec_arg2[0] = 0x2000000000000000; + vec_arg2[0] = vec_arg2[0] << 64; + expected_result = -160; + + vec_result = vec_dive (vec_arg1, vec_arg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_dive (signed, signed) failed.\n"); + printf("vec_arg1[0] = "); + print_i128(vec_arg1[0]); + printf("\nvec_arg2[0] = "); + print_i128(vec_arg2[0]); + printf("\nResult = "); + print_i128(vec_result[0]); + printf("\nExpected result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 20; /* the dividend is this value with 128 zero bits appended */ + vec_uarg2[0] = 0x4000000000000000; + vec_uarg2[0] = vec_uarg2[0] << 64; + uexpected_result = 80; + + vec_uresult = vec_dive (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_dive (unsigned, unsigned) failed.\n"); + printf("vec_uarg1[0] = "); + print_i128(vec_uarg1[0]); + printf("\nvec_uarg2[0] = "); + print_i128(vec_uarg2[0]); + printf("\nResult = "); + print_i128(vec_uresult[0]); + printf("\nExpected result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* Vector modulo quadword */ + vec_arg1[0] = -12345675; + vec_arg2[0] = 2; + expected_result = -1; + + vec_result = vec_mod (vec_arg1, vec_arg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_mod (signed, signed) failed.\n"); + printf("vec_arg1[0] = "); + print_i128(vec_arg1[0]); + printf("\nvec_arg2[0] = "); + print_i128(vec_arg2[0]); + printf("\nResult = "); + print_i128(vec_result[0]); + printf("\nExpected result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 24685; + vec_uarg2[0] = 4; +
uexpected_result = 1; + + vec_uresult = vec_mod (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_mod (unsigned, unsigned) failed.\n"); + printf("vec_uarg1[0] = "); + print_i128(vec_uarg1[0]); + printf("\nvec_uarg2[0] = "); + print_i128(vec_uarg2[0]); + printf("\nResult = "); + print_i128(vec_uresult[0]); + printf("\nExpected result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + return 0; +} -- 2.17.1
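
For anyone who wants to sanity-check the expected values in the test above without Power10 hardware, here is a rough scalar sketch using plain __int128 arithmetic. It is not part of the patch. The helper names (hi_lo, dive_pow2_model, expected_values_hold) are made up for illustration, and it assumes a GCC or Clang 64-bit target where __int128 is available. The divide-extended model only covers power-of-two divisors, which is what the test uses: (a * 2^128) / 2^k is a * 2^(128 - k) whenever the quotient fits in 128 bits.

```c
#include <assert.h>
#include <stdint.h>

typedef __int128 i128;            /* assumes GCC/Clang on a 64-bit target */
typedef unsigned __int128 u128;

/* Hypothetical helper: build a signed 128-bit value from a high and a low
   doubleword, avoiding a left shift of a negative value.  The conversion
   back to signed relies on GCC's modulo-2^128 behaviour.  */
static i128 hi_lo (int64_t hi, uint64_t lo)
{
  return (i128) (((u128) (uint64_t) hi << 64) | lo);
}

/* Hypothetical scalar model of divide extended for a divisor of 2^k:
   (a * 2^128) / 2^k == a * 2^(128 - k).  Only meaningful while the
   quotient fits in 128 bits.  */
static i128 dive_pow2_model (i128 a, unsigned k)
{
  assert (k > 1 && k < 128);
  return a * ((i128) 1 << (128 - k));
}

/* Returns 1 if the constants used by the test are mutually consistent.  */
static int expected_values_hold (void)
{
  i128 neg = hi_lo (-234, 4567);
  i128 pos = hi_lo (1234, 4567);
  u128 usmall = ((u128) 234 << 64) | 4567;
  u128 ubig = ((u128) 1234 << 64) | 4567;

  return neg < pos                               /* signed lt/le cases */
         && neg != pos                           /* signed ne cases */
         && usmall < ubig                        /* unsigned lt/le cases */
         && (i128) -12345678 / 2 == -6172839     /* vec_div, signed */
         && (u128) 24680 / 4 == 6170             /* vec_div, unsigned */
         && (i128) -12345675 % 2 == -1           /* vec_mod, signed */
         && (u128) 24685 % 4 == 1                /* vec_mod, unsigned */
         /* 0x2000000000000000 << 64 is 2^125.  */
         && dive_pow2_model (-20, 125) == -160   /* vec_dive, signed */
         /* 0x4000000000000000 << 64 is 2^126.  */
         && dive_pow2_model (20, 126) == 80;     /* vec_dive, unsigned */
}
```

Calling expected_values_hold () from a small main and checking that it returns 1 is enough to confirm the arithmetic. It says nothing about what the hardware instructions do for divisors that are not powers of two, or for quotients that overflow.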