> > "Kong, Lingling" <lingling.k...@intel.com> writes: > > > Hi, > > > > > > This version has added a new optab named 'cfmovcc'. The new optab is > > > used in the middle end to expand to cfcmov. And simplified my patch > > > by trying to generate the conditional faulting movcc in > > > noce_try_cmove_arith > > function. > > > > > > All the changes passed bootstrap & regtest x86-64-pc-linux-gnu. > > > We also tested spec with SDE and passed the runtime test. > > > > > > Ok for trunk? > > > > > > > > > APX CFCMOV[1] feature implements conditionally faulting which means > > > If the comparison is false, all memory faults are suppressed when > > > load or store a memory operand. Now we could load or store a memory > > > operand may trap or fault for conditional move. > > > > > > In middle-end, now we don't support a conditional move if we knew > > > that a load from A or B could trap or fault. To enable CFCMOV, we > > > added a new optab named cfmovcc. > > > > > > Conditional move suppress fault for condition mem store would not > > > move any arithmetic calculations. For condition mem load now just > > > support a conditional move one trap mem and one no trap and no mem cases. > > > > Sorry if this is going over old ground (I haven't read the earlier versions > > yet), but: > > instead of adding a new optab, could we treat CFCMOV as a scalar > > instance of maskload_optab? Robin is working on adding an "else" > > value for when the condition/mask is false. After that, it would seem > > to be a pretty close match to CFCMOV. > > > > One reason for preferring maskload is that it makes the load an > > explicit part of the interface. We could then potentially use it in gimple > > too, not > just expand. > > > > Yes, for conditional load is like a scalar instance of maskload_optab with > else > operand. > I could try to use maskload_optab to generate cfcmov in rtl ifcvt pass. But > it still > after expand. > Now we don't have if-covert pass for scalar in gimple, do we have plan to do > that ? >
Hi,

I have tried to use maskload/maskstore to generate CFCMOV in the ifcvt
pass.  Unlike mov<mode>cc, maskload/maskstore are not allowed to FAIL,
but I need restrictions for CFCMOV in the backend expander.  Since
expanding maskload/maskstore cannot fail, I can only enforce the
restrictions in ifcvt and in emit_conditional_move (in optabs.cc).
I'm not sure if this is the right approach; do you have any
suggestions?
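For reference, the two source-level shapes this targets look roughly
like the following (a minimal illustration only; the functions are made
up and are not part of the patch or testsuite):

    /* Load form: *a may fault, so ifcvt cannot normally speculate the
       load; a conditional faulting move performs it only when TEST is
       true and suppresses the fault otherwise.  */
    int
    cf_load (int test, int *a, int b)
    {
      return test ? *a : b;
    }

    /* Store form: *x may fault, so store speculation must normally be
       avoided; a conditional faulting store is safe because the access
       is suppressed when TEST is false.  */
    void
    cf_store (int test, int *x, int a)
    {
      if (test)
        *x = a;
    }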
Thanks,
Lingling

> > Thanks,
> > Richard
> >
> > >
> > > [1].https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html
> > >
> > > gcc/ChangeLog:
> > >
> > >     * doc/md.texi: Add cfmovcc insn pattern explanation.
> > >     * ifcvt.cc (can_use_cmove_load_mem_notrap): New function for
> > >     conditional faulting movcc for load.
> > >     (can_use_cmove_store_mem_notrap): New function for conditional
> > >     faulting movcc for store.
> > >     (can_use_cfmovcc): New function for conditional faulting.
> > >     (noce_try_cmove_arith): Try to convert to conditional faulting
> > >     movcc.
> > >     (noce_process_if_block): Ditto.
> > >     * optabs.cc (emit_conditional_move): Handle cfmovcc.
> > >     (emit_conditional_move_1): Ditto.
> > >     * optabs.def (OPTAB_D): New optab.
> > > ---
> > >  gcc/doc/md.texi |  10 ++++
> > >  gcc/ifcvt.cc    | 119 ++++++++++++++++++++++++++++++++++++++++++++----
> > >  gcc/optabs.cc   |  14 +++++-
> > >  gcc/optabs.def  |   1 +
> > >  4 files changed, 132 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > > index a9259112251..5f563787c49 100644
> > > --- a/gcc/doc/md.texi
> > > +++ b/gcc/doc/md.texi
> > > @@ -8591,6 +8591,16 @@ Return 1 if operand 1 is a normal floating point number and 0 otherwise.
> > >  @var{m} is a scalar floating point mode.  Operand 0 has mode
> > >  @code{SImode}, and operand 1 has mode @var{m}.
> > >
> > > +@cindex @code{cfmov@var{mode}cc} instruction pattern
> > > +@item @samp{cfmov@var{mode}cc}
> > > +Similar to @samp{mov@var{mode}cc} but for conditional faulting:
> > > +if the comparison is false, all memory faults are suppressed when
> > > +loading or storing a memory operand.
> > > +
> > > +Conditionally move operand 2 or operand 3 into operand 0 according
> > > +to the comparison in operand 1.  If the comparison is true, operand
> > > +2 is moved into operand 0, otherwise operand 3 is moved.
> > > +
> > >  @end table
> > >
> > >  @end ifset
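A note on the semantics documented above: C's conditional operator
likewise evaluates only the selected arm, so the behaviour can be
modelled in plain C (an illustration only; the function name is made
up):

    /* Model of cfmov<mode>cc: operand 0 = operand 1 ? operand 2 : operand 3.
       A MEM on the not-selected side is never accessed, so it cannot
       fault; an unconditionally speculated load of *op2_mem could.  */
    long
    cfmov_model (int op1, long *op2_mem, long op3_reg)
    {
      return op1 ? *op2_mem : op3_reg;
    }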
> > > diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> > > index 6487574c514..59845390607 100644
> > > --- a/gcc/ifcvt.cc
> > > +++ b/gcc/ifcvt.cc
> > > @@ -778,6 +778,9 @@ static bool noce_try_store_flag_mask (struct noce_if_info *);
> > >  static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx,
> > >                              rtx, rtx, rtx, rtx = NULL, rtx = NULL);
> > >  static bool noce_try_cmove (struct noce_if_info *);
> > > +static bool can_use_cmove_load_mem_notrap (rtx, rtx);
> > > +static bool can_use_cmove_store_mem_notrap (rtx, rtx, rtx, bool);
> > > +static bool can_use_cfmovcc (struct noce_if_info *);
> > >  static bool noce_try_cmove_arith (struct noce_if_info *);
> > >  static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **);
> > >  static bool noce_try_minmax (struct noce_if_info *);
> > > @@ -2132,6 +2135,69 @@ noce_emit_bb (rtx last_insn, basic_block bb, bool simple)
> > >    return true;
> > >  }
> > >
> > > +/* Return TRUE if we could convert "if (test) x = *a; else x = b;"
> > > +   or "if (test) x = a; else x = *b;" to a conditional faulting
> > > +   movcc, i.e. x86 cfcmov, especially when a load from a or b may
> > > +   cause memory faults.  */
> > > +
> > > +static bool
> > > +can_use_cmove_load_mem_notrap (rtx a, rtx b)
> > > +{
> > > +  /* Just handle a conditional move with one trapping MEM and one
> > > +     non-trapping, non-MEM operand.  */
> > > +  if (!(MEM_P (a) ^ MEM_P (b)))
> > > +    return false;
> > > +  bool a_trap = may_trap_or_fault_p (a);
> > > +  bool b_trap = may_trap_or_fault_p (b);
> > > +
> > > +  if (!(a_trap ^ b_trap))
> > > +    return false;
> > > +  if (a_trap && !MEM_P (a))
> > > +    return false;
> > > +  if (b_trap && !MEM_P (b))
> > > +    return false;
> > > +
> > > +  return true;
> > > +}
> > > +
> > > +/* Return TRUE if we could convert "if (test) *x = a; else skip" to
> > > +   a conditional faulting movcc, i.e. x86 cfcmov, especially when
> > > +   the store to x may cause memory faults and in else_bb x == b.  */
> > > +
> > > +static bool
> > > +can_use_cmove_store_mem_notrap (rtx x, rtx a, rtx b, bool a_simple)
> > > +{
> > > +  gcc_assert (MEM_P (x));
> > > +
> > > +  machine_mode x_mode = GET_MODE (x);
> > > +
> > > +  if (!rtx_equal_p (x, b) || !may_trap_or_fault_p (x))
> > > +    return false;
> > > +  if (!a_simple || !register_operand (a, x_mode))
> > > +    return false;
> > > +
> > > +  return true;
> > > +}
> > > +
> > > +/* Return TRUE if the backend supports cfmovcc_optab, which
> > > +   suppresses memory faults when loading or storing a memory operand
> > > +   if the condition code evaluates to false.  */
> > > +
> > > +static bool
> > > +can_use_cfmovcc (struct noce_if_info *if_info)
> > > +{
> > > +  rtx a = if_info->a;
> > > +  rtx b = if_info->b;
> > > +  rtx x = if_info->x;
> > > +
> > > +  if (optab_handler (cfmovcc_optab, GET_MODE (x)) == CODE_FOR_nothing)
> > > +    return false;
> > > +
> > > +  if (MEM_P (x))
> > > +    return can_use_cmove_store_mem_notrap (x, a, b, if_info->then_simple);
> > > +  else
> > > +    return can_use_cmove_load_mem_notrap (a, b);
> > > +}
> > > +
> > >  /* Try more complex cases involving conditional_move.  */
> > >
> > >  static bool
> > > @@ -2171,7 +2237,15 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
> > >    /* ??? We could handle this if we knew that a load from A or B could
> > >       not trap or fault.  This is also true if we've already loaded
> > >       from the address along the path from ENTRY.  */
> > > -  else if (may_trap_or_fault_p (a) || may_trap_or_fault_p (b))
> > > +  else if (may_trap_or_fault_p (a) && may_trap_or_fault_p (b))
> > > +    return false;
> > > +  /* cfmovcc_optab implements conditional faulting, which means that
> > > +     if the condition code evaluates to false, all memory faults are
> > > +     suppressed when loading or storing a memory operand.  So we may
> > > +     now load or store a memory operand that could trap or fault in
> > > +     a conditional move.  */
> > > +  else if ((may_trap_or_fault_p (a) || may_trap_or_fault_p (b))
> > > +           && !can_use_cfmovcc (if_info))
> > >      return false;
> > >
> > >    /* if (test) x = a + b; else x = c - d;
> > > @@ -2247,9 +2321,14 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
> > >    /* If either operand is complex, load it into a register first.
> > >       The best way to do this is to copy the original insn.  In this
> > >       way we preserve any clobbers etc that the insn may have had.
> > > -     This is of course not possible in the IS_MEM case.  */
> > > +     This is of course not possible in the IS_MEM case.
> > > +     If loading or storing an operand may trap or fault, we must not
> > > +     hoist the load or store; otherwise we cannot suppress the memory
> > > +     fault and would emit a normal arithmetic insn instead of a
> > > +     conditional faulting movcc.  */
> > >
> > > -  if (! general_operand (a, GET_MODE (a)) || tmp_a)
> > > +  if (! may_trap_or_fault_p (a)
> > > +      && (! general_operand (a, GET_MODE (a)) || tmp_a))
> > >      {
> > >        if (is_mem)
> > > @@ -2278,7 +2357,8 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
> > >        }
> > >      }
> > >
> > > -  if (! general_operand (b, GET_MODE (b)) || tmp_b)
> > > +  if (! may_trap_or_fault_p (b)
> > > +      && (! general_operand (b, GET_MODE (b)) || tmp_b))
> > >      {
> > >        if (is_mem)
> > >        {
> > > @@ -4210,12 +4290,31 @@ noce_process_if_block (struct noce_if_info *if_info)
> > >      }
> > >
> > >    if (!set_b && MEM_P (orig_x))
> > > -    /* We want to avoid store speculation to avoid cases like
> > > -         if (pthread_mutex_trylock(mutex))
> > > -           ++global_variable;
> > > -       Rather than go to much effort here, we rely on the SSA optimizers,
> > > -       which do a good enough job these days.  */
> > > -    return false;
> > > +    {
> > > +      /* When the target supports conditional faulting movcc, i.e.
> > > +         x86 cfcmov, we can do a conditional mem store for "if (...)
> > > +         *x = a; else skip" via cfmovcc_optab, where x may trap or fault.  */
> > > +      if ((optab_handler (cfmovcc_optab, GET_MODE (orig_x))
> > > +           != CODE_FOR_nothing)
> > > +          && HAVE_conditional_move
> > > +          && may_trap_or_fault_p (orig_x)
> > > +          && register_operand (a, GET_MODE (orig_x)))
> > > +        {
> > > +          x = orig_x;
> > > +          if_info->x = x;
> > > +          if (noce_try_cmove_arith (if_info))
> > > +            goto success;
> > > +          else
> > > +            return false;
> > > +        }
> > > +      /* We want to avoid store speculation to avoid cases like
> > > +           if (pthread_mutex_trylock(mutex))
> > > +             ++global_variable;
> > > +         Rather than go to much effort here, we rely on the SSA optimizers,
> > > +         which do a good enough job these days.  */
> > > +      else
> > > +        return false;
> > > +    }
> > >
> > >    if (noce_try_move (if_info))
> > >      goto success;
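To make the new predicates concrete, here are source shapes they accept
or reject (my own illustration, not from the patch's testsuite):

    int
    load_ok (int test, int *p, int r)
    {
      return test ? *p : r;     /* one trapping MEM + one non-MEM: accepted */
    }

    int
    load_rejected (int test, int *p, int *q)
    {
      return test ? *p : *q;    /* both operands are MEMs: rejected */
    }

    void
    store_ok (int test, int *p, int a)
    {
      if (test)
        *p = a;                 /* *p may fault, a fits a register: accepted */
    }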
> > > diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> > > index 2bcb3f7b47a..181edf64f03 100644
> > > --- a/gcc/optabs.cc
> > > +++ b/gcc/optabs.cc
> > > @@ -5034,6 +5034,7 @@ emit_conditional_move (rtx target, struct rtx_comparison comp,
> > >    rtx_insn *last;
> > >    enum insn_code icode;
> > >    enum rtx_code reversed;
> > > +  optab op = movcc_optab;
> > >
> > >    /* If the two source operands are identical, that's just a move.  */
> > > @@ -5082,7 +5083,11 @@ emit_conditional_move (rtx target, struct rtx_comparison comp,
> > >    if (mode == VOIDmode)
> > >      mode = GET_MODE (op2);
> > >
> > > -  icode = direct_optab_handler (movcc_optab, mode);
> > > +  if ((may_trap_or_fault_p (op2) || may_trap_or_fault_p (op3))
> > > +      && optab_handler (cfmovcc_optab, mode) != CODE_FOR_nothing)
> > > +    op = cfmovcc_optab;
> > > +
> > > +  icode = direct_optab_handler (op, mode);
> > >
> > >    if (icode == CODE_FOR_nothing)
> > >      return NULL_RTX;
> > > @@ -5194,6 +5199,7 @@ emit_conditional_move_1 (rtx target, rtx comparison,
> > >                          rtx op2, rtx op3, machine_mode mode)
> > >  {
> > >    enum insn_code icode;
> > > +  optab op = movcc_optab;
> > >
> > >    if (comparison == NULL_RTX || !COMPARISON_P (comparison))
> > >      return NULL_RTX;
> > > @@ -5214,7 +5220,11 @@ emit_conditional_move_1 (rtx target, rtx comparison,
> > >    if (mode == VOIDmode)
> > >      mode = GET_MODE (op2);
> > >
> > > -  icode = direct_optab_handler (movcc_optab, mode);
> > > +  if ((may_trap_or_fault_p (op2) || may_trap_or_fault_p (op3))
> > > +      && optab_handler (cfmovcc_optab, mode) != CODE_FOR_nothing)
> > > +    op = cfmovcc_optab;
> > > +
> > > +  icode = direct_optab_handler (op, mode);
> > >
> > >    if (icode == CODE_FOR_nothing)
> > >      return NULL_RTX;
> > > diff --git a/gcc/optabs.def b/gcc/optabs.def
> > > index 58a939442bd..893e5c1c6c2 100644
> > > --- a/gcc/optabs.def
> > > +++ b/gcc/optabs.def
> > > @@ -551,3 +551,4 @@ OPTAB_D (len_store_optab, "len_store_$a")
> > >  OPTAB_D (select_vl_optab, "select_vl$a")
> > >  OPTAB_D (andn_optab, "andn$a3")
> > >  OPTAB_D (iorn_optab, "iorn$a3")
> > > +OPTAB_D (cfmovcc_optab, "cfmov$acc")
> > > --
> > > 2.31.1