Hi Carl,

on 2023/5/3 23:30, Carl Love via Gcc-patches wrote:
> GCC maintainers:
> 
> The following patch cleans up the definition for the
> __builtin_altivec_vcmpnet.  The current implementation implies that the
> builtin is only supported on Power 9 since it is defined under the
> Power 9 stanza.  However the builtin has no ISA restrictions as stated
> in the Power Vector Intrinsic Programming Reference document.  The
> current built-in works because the builtin gets replaced during GIMPLE
> folding by a simple not-equal operator so it doesn't get expanded and
> checked for Power 9 code generation.
> 
> This patch moves the definition to the Altivec stanza in the builtin
> definition file.  The builtin then generates code for Power 8 and
> earlier processors or Power 9 and later processors.
> 
> The patch has been tested on Power 8 and Power 9 with no regressions.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
> Carl
> 
> -----------------------------------------
> rs6000: vec_cmpne confusing implementation
A better subject seems to be "Make __builtin_altivec_vcmpne{b,h,w,t}
supported under altivec" to better match what this patch does?

> 
> __builtin_altivec_vcmpnet does not have any ISA restrictions.

The current Power Vector Intrinsic Programming Reference doesn't define
functions __builtin_altivec_vcmpne* but only vec_cmpne, which isn't
restricted to any ISA either.  So I think what you proposed is actually to
extend the current built-in functions __builtin_altivec_vcmpne{b,h,w,t} so
that they are no longer restricted to Power9 and later only.  This patch
needs some test cases for the corresponding changes, but I'm curious why
people would want to use these built-ins instead of just using vec_cmpne?
The latter looks more general and better.

> built-in definitions for vcmpneb, vcmpneh, vcmpnew, vcmpnet are defined
> under the Power 9 section.  This implies they are only supported on Power
> 9 and above when in fact they are defined and work on Power 8.
> 
> This patch moves the definitions to the Altivec stanza and maps the
> builtin dispatches to different code generation for Power 8 and earlier
> or for Power 9 and later.
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000-builtins.def (vcmpneb, vcmpneh, vcmpnew.
>   vcmpnet): Move definitions to Altivec stanza.
>   * config/rs6000/vsx.md (cmpneb, vcmpneh, vcmpnew, vcmpnet): New
>   define_expand.
>   (cmpneb, vcmpneh, vcmpnew, vcmpnet): Rename define_insn.
> 
> Patch has been tested on Power 8 and Power 9 with no regressions.
> ---
>  gcc/config/rs6000/rs6000-builtins.def | 24 ++++++------
>  gcc/config/rs6000/vsx.md              | 54 +++++++++++++++++++++++++--
>  2 files changed, 63 insertions(+), 15 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def
> b/gcc/config/rs6000/rs6000-builtins.def
> index 638d0bc72ca..adb4122be29 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -641,6 +641,18 @@
>    const int __builtin_altivec_vcmpgtuw_p (int, vsi, vsi);
>      VCMPGTUW_P vector_gtu_v4si_p {pred}
> 
> +  const vsc __builtin_altivec_vcmpneb (vsc, vsc);
> +    VCMPNEB vcmpneb {}
> +
> +  const vss __builtin_altivec_vcmpneh (vss, vss);
> +    VCMPNEH vcmpneh {}
> +
> +  const vsi __builtin_altivec_vcmpnew (vsi, vsi);
> +    VCMPNEW vcmpnew {}
> +

The expanders for these three can just start with a separate eq and not;
later, where vcmpne{b,h,w} are supported (Power9 and up), the two can be
combined further into a single insn.

> +  const vbq __builtin_altivec_vcmpnet (vsq, vsq);
> +    VCMPNET vcmpnet {}

The current define_expand for vcmpnet is guarded with TARGET_POWER10; once
you have test cases to cover these changes, you will find that the current
expander isn't enough for what you want.
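Btw, for the coverage I have in mind something like the rough sketch below
(the file name, options and scan pattern are only my assumptions for
illustration and would need adjusting to whatever the final expanders
actually generate), e.g. a hypothetical gcc.target/powerpc/vcmpne-altivec.c:

/* { dg-do compile } */
/* { dg-require-effective-target powerpc_altivec_ok } */
/* { dg-options "-O2 -mdejagnu-cpu=power8" } */

#include <altivec.h>

/* Sketch only: before the change this call is rejected at power8; after
   it, the expected code is vcmpequb plus a vector complement (xxlnor or
   vnor) on power8, and a single vcmpneb on power9.  The exact mnemonics
   to scan for need verifying against the final expanders.  */
vector signed char
test_vcmpneb (vector signed char a, vector signed char b)
{
  return __builtin_altivec_vcmpneb (a, b);
}

/* { dg-final { scan-assembler {\mvcmpequb\M} } } */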
> +
>    const vsi __builtin_altivec_vctsxs (vf, const int<5>);
>      VCTSXS altivec_vctsxs {}
> 
> @@ -2599,9 +2611,6 @@
>    const signed int __builtin_altivec_vcmpaew_p (vsi, vsi);
>      VCMPAEW_P vector_ae_v4si_p {}
> 
> -  const vsc __builtin_altivec_vcmpneb (vsc, vsc);
> -    VCMPNEB vcmpneb {}
> -
>    const signed int __builtin_altivec_vcmpneb_p (vsc, vsc);
>      VCMPNEB_P vector_ne_v16qi_p {}
> 
> @@ -2614,15 +2623,9 @@
>    const signed int __builtin_altivec_vcmpnefp_p (vf, vf);
>      VCMPNEFP_P vector_ne_v4sf_p {}
> 
> -  const vss __builtin_altivec_vcmpneh (vss, vss);
> -    VCMPNEH vcmpneh {}
> -
>    const signed int __builtin_altivec_vcmpneh_p (vss, vss);
>      VCMPNEH_P vector_ne_v8hi_p {}
> 
> -  const vsi __builtin_altivec_vcmpnew (vsi, vsi);
> -    VCMPNEW vcmpnew {}
> -
>    const signed int __builtin_altivec_vcmpnew_p (vsi, vsi);
>      VCMPNEW_P vector_ne_v4si_p {}
> 
> @@ -3203,9 +3206,6 @@
>    const signed int __builtin_altivec_vcmpgtut_p (signed int, vuq, vuq);
>      VCMPGTUT_P vector_gtu_v1ti_p {pred}
> 
> -  const vbq __builtin_altivec_vcmpnet (vsq, vsq);
> -    VCMPNET vcmpnet {}
> -
>    const signed int __builtin_altivec_vcmpnet_p (vsq, vsq);
>      VCMPNET_P vector_ne_v1ti_p {}
> 
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index 7d845df5c2d..3f05e3e6d00 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -5652,8 +5652,56 @@
>      DONE;
>  })
> 
> +;; Expand for builtin vcmpneb
> +(define_expand "vcmpneb"
> +  [(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
> +     (not:V16QI
> +       (eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
> +                 (match_operand:V16QI 2 "altivec_register_operand" "v"))))]
> +  ""
> +  {
> +  if (TARGET_P9_VECTOR)
> +    emit_insn (gen_vcmpneb_p9 (operands[0], operands[1], operands[2]));
> +  else
> +    emit_insn (gen_altivec_vcmpequb_p (operands[0], operands[1],
> +                                       operands[2]));
> +  DONE;
> +  })
> +

It's better to have an explicit condition TARGET_ALTIVEC to match the
stanza.  As mentioned above, you can just expand it with eq and not, with
something like:

(define_expand "vcmpneb"
  [(set (match_operand:V16QI 3 "altivec_register_operand" "=v")
        (eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
                  (match_operand:V16QI 2 "altivec_register_operand" "v")))
   (set (match_operand:V16QI 0 "altivec_register_operand" "=v")
        (not:V16QI (match_dup 3)))]
  "TARGET_ALTIVEC"
{
  operands[3] = gen_reg_rtx (V16QImode);
})

And please move these expands to altivec.md since they are not VSX
specific.

> +;; Expand for builtin vcmpneh
> +(define_expand "vcmpneh"
> +  [(set (match_operand:V8HI 0 "altivec_register_operand" "=v")
> +     (not:V8HI
> +       (eq:V8HI (match_operand:V8HI 1 "altivec_register_operand" "v")
> +                (match_operand:V8HI 2 "altivec_register_operand" "v"))))]
> +  ""
> +  {
> +  if (TARGET_P9_VECTOR)
> +    emit_insn (gen_vcmpneh_p9 (operands[0], operands[1], operands[2]));
> +  else
> +    emit_insn (gen_altivec_vcmpequh_p (operands[0], operands[1],
> +                                       operands[2]));
> +  DONE;
> +  })
> +

Ditto.

> +;; Expand for builtin vcmpnew
> +(define_expand "vcmpnew"
> +  [(set (match_operand:V4SI 0 "altivec_register_operand" "=v")
> +     (not:V4SI
> +       (eq:V4SI (match_operand:V4SI 1 "altivec_register_operand" "v")
> +                (match_operand:V4SI 2 "altivec_register_operand" "v"))))]
> +  ""
> +  {
> +  if (TARGET_P9_VECTOR)
> +    emit_insn (gen_vcmpnew_p9 (operands[0], operands[1], operands[2]));
> +  else
> +    emit_insn (gen_altivec_vcmpequw_p (operands[0], operands[1],
> +                                       operands[2]));
> +  DONE;
> +  })

Ditto.  Also, these can be merged further with a mode iterator.
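To be concrete on the iterator idea, a rough sketch is below; it assumes
the existing VI mode iterator (V4SI/V8HI/V16QI) and the VI_char mode
attribute in altivec.md, so please double-check the exact names:

;; Sketch only: one expander covering vcmpneb/vcmpneh/vcmpnew.  It emits a
;; separate eq and not so that pre-Power9 targets work; on Power9 and
;; later, combine can still fuse the two insns into a single vcmpne*.
(define_expand "vcmpne<VI_char>"
  [(set (match_operand:VI 3 "altivec_register_operand" "=v")
        (eq:VI (match_operand:VI 1 "altivec_register_operand" "v")
               (match_operand:VI 2 "altivec_register_operand" "v")))
   (set (match_operand:VI 0 "altivec_register_operand" "=v")
        (not:VI (match_dup 3)))]
  "TARGET_ALTIVEC"
{
  operands[3] = gen_reg_rtx (<MODE>mode);
})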
> +
>  ;; Vector Compare Not Equal Byte (specified/not+eq:)
> -(define_insn "vcmpneb"
> +(define_insn "vcmpneb_p9"
>    [(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
>       (not:V16QI
>         (eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
> @@ -5703,7 +5751,7 @@
>    [(set_attr "type" "vecsimple")])
> 
>  ;; Vector Compare Not Equal Half Word (specified/not+eq:)
> -(define_insn "vcmpneh"
> +(define_insn "vcmpneh_p9"
>    [(set (match_operand:V8HI 0 "altivec_register_operand" "=v")
>       (not:V8HI
>         (eq:V8HI (match_operand:V8HI 1 "altivec_register_operand" "v")
> @@ -5723,7 +5771,7 @@
>    [(set_attr "type" "vecsimple")])
> 
>  ;; Vector Compare Not Equal Word (specified/not+eq:)
> -(define_insn "vcmpnew"
> +(define_insn "vcmpnew_p9"
>    [(set (match_operand:V4SI 0 "altivec_register_operand" "=v")
>       (not:V4SI
>         (eq:V4SI (match_operand:V4SI 1 "altivec_register_operand" "v")

These are not needed with the above changes.

BR,
Kewen