On Tue, 6 Oct 2020 at 11:42, Christophe Lyon <christophe.l...@linaro.org> wrote:
>
> Hi Jakub,
>
> On Tue, 6 Oct 2020 at 10:13, Richard Biener <rguent...@suse.de> wrote:
> >
> > On Tue, 6 Oct 2020, Jakub Jelinek wrote:
> >
> > > Hi!
> > >
> > > As written in the comment, tree-ssa-math-opts.c wouldn't create a DIVMOD
> > > ifn call for division + modulo by constant for the fear that during
> > > expansion we could generate better code for those cases.
> > > > If the divisor is a power of two, that is certainly the case always,
> > > but otherwise expand_divmod can punt in many cases, e.g. if the division
> > > type's precision is above HOST_BITS_PER_WIDE_INT, we don't even call
> > > choose_multiplier, because it works on HOST_WIDE_INTs (true, something
> > > we should fix eventually now that we have wide_ints), or if pre/post shift
> > > is larger than BITS_PER_WORD.
> > >
> > > So, the following patch recognizes DIVMOD with constant last argument even
> > > when it is unclear if expand_divmod will be able to optimize it, and then
> > > > during DIVMOD expansion if the divisor is constant attempts to expand it as
> > > > division + modulo and if they actually don't contain any libcalls or
> > > > division/modulo, they are kept as is, otherwise that sequence is thrown away
> > > > and divmod optab or libcall is used.
> > >
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > OK.
> >
> > Richard.
> >
> > > 2020-10-06  Jakub Jelinek  <ja...@redhat.com>
> > >
> > >       PR rtl-optimization/97282
> > >       * tree-ssa-math-opts.c (divmod_candidate_p): Don't return false for
> > >       constant op2 if it is not a power of two and the type has precision
> > >       larger than HOST_BITS_PER_WIDE_INT or BITS_PER_WORD.
> > >       * internal-fn.c (contains_call_div_mod): New function.
> > >       (expand_DIVMOD): If last argument is a constant, try to expand it as
> > >       TRUNC_DIV_EXPR followed by TRUNC_MOD_EXPR, but if the sequence
> > >       contains any calls or {,U}{DIV,MOD} rtxes, throw it away and use
> > >       divmod optab or divmod libfunc.
> > >
>
> This patch causes ICEs on arm while building newlib or glibc
>
> For instance with newlib when compiling vfwprintf.o:
> during RTL pass: expand
> In file included from /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/newlib/newlib/libc/stdio/vfprintf.c:153:
> /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/newlib/newlib/libc/include/stdio.h: In function '_vfprintf_r':
> /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/newlib/newlib/libc/include/stdio.h:503:9: internal compiler error: in int_mode_for_mode, at stor-layout.c:404
>   503 | int     _vfprintf_r (struct _reent *, FILE *__restrict, const char *__restrict, __VALIST)
>       |         ^~~~~~~~~~~
> 0xaed4e3 int_mode_for_mode(machine_mode)
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/stor-layout.c:404
> 0x7ff73d emit_move_via_integer
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:3425
> 0x808f2d emit_move_insn_1(rtx_def*, rtx_def*)
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:3793
> 0x8092d7 emit_move_insn(rtx_def*, rtx_def*)
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:3935
> 0x6e703f emit_library_call_value_1(int, rtx_def*, rtx_def*, libcall_type, machine_mode, int, std::pair<rtx_def*, machine_mode>*)
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/calls.c:5601
> 0xdff642 emit_library_call_value
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/rtl.h:4258
> 0xdff642 arm_expand_divmod_libfunc
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:33256
> 0x8c69af expand_DIVMOD
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/internal-fn.c:3084
> 0x7021b7 expand_call_stmt
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:2612
> 0x7021b7 expand_gimple_stmt_1
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3686
> 0x7021b7 expand_gimple_stmt
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3851
> 0x702cfd expand_gimple_basic_block
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5892
> 0x70533e execute
>         /tmp/1435347_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:6576
>

I have just filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97322
to track this.
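
For context, here is a minimal sketch of the kind of source that now
reaches expand_DIVMOD with a constant divisor. It is adapted from the
pr97282.c test in the quoted patch, with the type fixed to a 64-bit
integer. It is not the newlib reproducer, and the name digit_sum is only
illustrative. The assumption is that on a 32-bit target such as arm the
64-bit precision exceeds BITS_PER_WORD, so divmod_candidate_p no longer
rejects the constant divisor 10.

```c
#include <stdint.h>

/* Sum of the decimal digits of X.  The loop computes both x % 10 and
   x / 10 on the same operands, which is the division + modulo pair
   that can be combined into a single DIVMOD call.  */
unsigned long
digit_sum (uint64_t x)
{
  unsigned long ret = 0;
  while (x > 0)
    {
      ret += (unsigned long) (x % 10);
      x /= 10;
    }
  return ret;
}
```

Whether this exact function triggers the ICE on arm has not been
checked here; the bug report above is the place to look for a
confirmed test case.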

Thanks


> Christophe
>
>
>
> > >       * gcc.target/i386/pr97282.c: New test.
> > >
> > > --- gcc/tree-ssa-math-opts.c.jj       2020-10-01 10:40:10.104755999 +0200
> > > +++ gcc/tree-ssa-math-opts.c  2020-10-05 13:51:54.476628287 +0200
> > > @@ -3567,9 +3567,24 @@ divmod_candidate_p (gassign *stmt)
> > >
> > > >    /* Disable the transform if either is a constant, since division-by-constant
> > >       may have specialized expansion.  */
> > > -  if (CONSTANT_CLASS_P (op1) || CONSTANT_CLASS_P (op2))
> > > +  if (CONSTANT_CLASS_P (op1))
> > >      return false;
> > >
> > > +  if (CONSTANT_CLASS_P (op2))
> > > +    {
> > > +      if (integer_pow2p (op2))
> > > +     return false;
> > > +
> > > +      if (TYPE_PRECISION (type) <= HOST_BITS_PER_WIDE_INT
> > > +       && TYPE_PRECISION (type) <= BITS_PER_WORD)
> > > +     return false;
> > > +
> > > +      /* If the divisor is not power of 2 and the precision wider than
> > > +      HWI, expand_divmod punts on that, so in that case it is better
> > > +      to use divmod optab or libfunc.  Similarly if choose_multiplier
> > > +      might need pre/post shifts of BITS_PER_WORD or more.  */
> > > +    }
> > > +
> > >    /* Exclude the case where TYPE_OVERFLOW_TRAPS (type) as that should
> > >       expand using the [su]divv optabs.  */
> > >    if (TYPE_OVERFLOW_TRAPS (type))
> > > --- gcc/internal-fn.c.jj      2020-10-02 10:36:43.272290992 +0200
> > > +++ gcc/internal-fn.c 2020-10-05 15:15:12.498349327 +0200
> > > @@ -50,6 +50,7 @@ along with GCC; see the file COPYING3.
> > >  #include "tree-phinodes.h"
> > >  #include "ssa-iterators.h"
> > >  #include "explow.h"
> > > +#include "rtl-iter.h"
> > >
> > >  /* The names of each internal function, indexed by function number.  */
> > >  const char *const internal_fn_name_array[] = {
> > > @@ -2985,6 +2986,32 @@ expand_gather_load_optab_fn (internal_fn
> > >      emit_move_insn (lhs_rtx, ops[0].value);
> > >  }
> > >
> > > +/* Helper for expand_DIVMOD.  Return true if the sequence starting with
> > > +   INSN contains any call insns or insns with {,U}{DIV,MOD} rtxes.  */
> > > +
> > > +static bool
> > > +contains_call_div_mod (rtx_insn *insn)
> > > +{
> > > +  subrtx_iterator::array_type array;
> > > +  for (; insn; insn = NEXT_INSN (insn))
> > > +    if (CALL_P (insn))
> > > +      return true;
> > > +    else if (INSN_P (insn))
> > > +      FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST)
> > > +     switch (GET_CODE (*iter))
> > > +       {
> > > +       case CALL:
> > > +       case DIV:
> > > +       case UDIV:
> > > +       case MOD:
> > > +       case UMOD:
> > > +         return true;
> > > +       default:
> > > +         break;
> > > +       }
> > > +  return false;
> > > + }
> > > +
> > >  /* Expand DIVMOD() using:
> > >   a) optab handler for udivmod/sdivmod if it is available.
> > >   b) If optab_handler doesn't exist, generate call to
> > > @@ -3007,10 +3034,44 @@ expand_DIVMOD (internal_fn, gcall *call_
> > >    rtx op1 = expand_normal (arg1);
> > >    rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> > >
> > > -  rtx quotient, remainder, libfunc;
> > > +  rtx quotient = NULL_RTX, remainder = NULL_RTX;
> > > +  rtx_insn *insns = NULL;
> > > +
> > > +  if (TREE_CODE (arg1) == INTEGER_CST)
> > > +    {
> > > +      /* For DIVMOD by integral constants, there could be efficient code
> > > +      expanded inline e.g. using shifts and plus/minus.  Try to expand
> > > +      the division and modulo and if it emits any library calls or any
> > > +      {,U}{DIV,MOD} rtxes throw it away and use a divmod optab or
> > > +      divmod libcall.  */
> > > +      struct separate_ops ops;
> > > +      ops.code = TRUNC_DIV_EXPR;
> > > +      ops.type = type;
> > > +      ops.op0 = make_tree (ops.type, op0);
> > > +      ops.op1 = arg1;
> > > +      ops.op2 = NULL_TREE;
> > > +      ops.location = gimple_location (call_stmt);
> > > +      start_sequence ();
> > > > +      quotient = expand_expr_real_2 (&ops, NULL_RTX, mode, EXPAND_NORMAL);
> > > +      if (contains_call_div_mod (get_insns ()))
> > > +     quotient = NULL_RTX;
> > > +      else
> > > +     {
> > > +       ops.code = TRUNC_MOD_EXPR;
> > > > +       remainder = expand_expr_real_2 (&ops, NULL_RTX, mode, EXPAND_NORMAL);
> > > +       if (contains_call_div_mod (get_insns ()))
> > > +         remainder = NULL_RTX;
> > > +     }
> > > +      if (remainder)
> > > +     insns = get_insns ();
> > > +      end_sequence ();
> > > +    }
> > > +
> > > +  if (remainder)
> > > +    emit_insn (insns);
> > >
> > >    /* Check if optab_handler exists for divmod_optab for given mode.  */
> > > -  if (optab_handler (tab, mode) != CODE_FOR_nothing)
> > > +  else if (optab_handler (tab, mode) != CODE_FOR_nothing)
> > >      {
> > >        quotient = gen_reg_rtx (mode);
> > >        remainder = gen_reg_rtx (mode);
> > > @@ -3018,7 +3079,7 @@ expand_DIVMOD (internal_fn, gcall *call_
> > >      }
> > >
> > >    /* Generate call to divmod libfunc if it exists.  */
> > > -  else if ((libfunc = optab_libfunc (tab, mode)) != NULL_RTX)
> > > +  else if (rtx libfunc = optab_libfunc (tab, mode))
> > >      targetm.expand_divmod_libfunc (libfunc, mode, op0, op1,
> > >                                  &quotient, &remainder);
> > >
> > > > --- gcc/testsuite/gcc.target/i386/pr97282.c.jj        2020-10-05 15:31:19.500363710 +0200
> > > > +++ gcc/testsuite/gcc.target/i386/pr97282.c   2020-10-05 15:28:51.864499619 +0200
> > > @@ -0,0 +1,25 @@
> > > +/* PR rtl-optimization/97282 */
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O2" } */
> > > +/* { dg-final { scan-assembler "call\[^\n\r]*__udivmod\[dt]i4" } } */
> > > +
> > > +#ifdef __SIZEOF_INT128__
> > > +typedef __uint128_t T;
> > > +#else
> > > +typedef unsigned long long T;
> > > +#endif
> > > +
> > > +unsigned long
> > > +foo (T x)
> > > +{
> > > +  if (x == 0)
> > > +    return 0;
> > > +
> > > +  unsigned long ret = 0;
> > > +  while (x > 0)
> > > +    {
> > > +      ret = ret + x % 10;
> > > +      x = x / 10;
> > > +    }
> > > +  return ret;
> > > +}
> > >
> > >       Jakub
> > >
> > >
> >
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imend
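
As background on the "specialized expansion" for division by a constant
that the patch description mentions, here is a minimal sketch of the
general multiply-and-shift technique, assuming a 32-bit unsigned operand
and the divisor 10. The names divmod10 and divmod10_u32 are hypothetical,
and this is an illustration of the idea, not the exact sequence
expand_divmod emits.

```c
#include <stdint.h>

/* Quotient and remainder of a 32-bit unsigned value divided by 10.  */
struct divmod10
{
  uint32_t quot;
  uint32_t rem;
};

/* Compute X / 10 and X % 10 without a divide instruction.
   0xCCCCCCCD is ceil (2^35 / 10); the product is formed in 64 bits
   and shifted right by 35 to recover the quotient.  */
struct divmod10
divmod10_u32 (uint32_t x)
{
  struct divmod10 r;
  r.quot = (uint32_t) (((uint64_t) x * 0xCCCCCCCDu) >> 35);
  /* The remainder follows from the quotient with a multiply and a
     subtract, so no second division is needed.  */
  r.rem = x - r.quot * 10u;
  return r;
}
```

The sketch needs a product twice as wide as the operand, which is one
reason the inline form gets harder once the type is wider than a word
or than HOST_WIDE_INT, the cases the patch hands over to the divmod
optab or libcall.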
