https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 18 Jan 2022, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860
> 
> Jakub Jelinek <jakub at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |jakub at gcc dot gnu.org
> 
> --- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> Short testcase:
> function foo(a)
>   integer(kind=4) :: a(1024)
>   a(:) = modulo (a(:), 39)
> end function
> -O2 -mcpu=power10.
> vect_recog_divmod_pattern only handles TRUNC_{DIV,MOD}_EXPR and EXACT_DIV_EXPR
> (and isn't guaranteed to succeed anyway), but optab_for_tree_code returns the
> same smod_optab or sdiv_optab (if signed; FLOOR_* for unsigned is mapped to
> TRUNC_*).
> I guess the quickest way would be to punt on {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR
> in the vectorizer and tree-vect-generic.cc

True.

> Further gradual improvements can be:
> 1) match.pd has:
> /* For unsigned integral types, FLOOR_DIV_EXPR is the same as
>    TRUNC_DIV_EXPR.  Rewrite into the latter in this case.  */
> (simplify
>  (floor_div @0 @1)
>  (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
>       && TYPE_UNSIGNED (type))
>   (trunc_div @0 @1)))
> but expmed.cc has:
>   /* Promote floor rounding to trunc rounding for unsigned operations.  */
>   if (unsignedp)
>     {
>       if (code == FLOOR_DIV_EXPR)
>         code = TRUNC_DIV_EXPR;
>       if (code == FLOOR_MOD_EXPR)
>         code = TRUNC_MOD_EXPR;
>       if (code == EXACT_DIV_EXPR && op1_is_pow2)
>         code = TRUNC_DIV_EXPR;
>     }
> Shouldn't we make it
> (for floor_divmod (floor_div floor_mod)
>      trunc_divmod (trunc_div trunc_mod)
>  (simplify
>   (floor_divmod @0 @1)
>   (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
>        && TYPE_UNSIGNED (type))
>    (trunc_divmod @0 @1))))
> ?

Yeah, if the simplification is incomplete we should amend it.

> 2) as the RTL optabs really do just trunc div/mod, perhaps
> tree-vect-patterns.cc could be changed to replace some or all of those
> operations with the trunc operation followed by some arith and cond_exprs
> so that the vectorizer knows actual cost of those operations.
> E.g. it seems expmed.cc expands
> r = x %[fl] y;
> as
> r = x % y; if (r && (x ^ y) < 0) r += y;
> and
> d = x /[fl] y;
> would be
> r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
> Looking at wide-int.h,
> r = x %[cl] y;
> as
> r = x % y; if (r && (x ^ y) >= 0) r -= y;
> and
> d = x /[cl] y;
> as
> r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
> All of the above for signed, as I said earlier, unsigned [fl] is the same as
> trunc and unsigned [cl] should replace (x ^ y) >= 0 with 1.
> [rd] is even more complex.

That sounds reasonable as well.  I think we can do 0) (the punt in the
vectorizer and tree-vect-generic.cc) and 1) now and defer 2) to the next
stage1, maybe tracking it with an enhancement bug report.
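
For reference, the signed expansions quoted in 2) can be written out as
plain C.  This is only a sketch for illustration, not the expmed.cc or
wide-int.h code: the function names are made up, 32-bit operands are
assumed to match the integer(kind=4) testcase, and y == 0 as well as
INT32_MIN / -1 are left undefined just like the plain C operators.

```c
#include <stdint.h>

/* Hypothetical helpers; names are not taken from GCC.  Each one uses
   only the truncating / and % plus a sign test, as in the comment.  */

/* x %[fl] y: the result takes the sign of the divisor.  */
static int32_t floor_mod(int32_t x, int32_t y)
{
    int32_t r = x % y;
    /* Operands of opposite sign with a nonzero remainder: adjust.  */
    if (r != 0 && (x ^ y) < 0)
        r += y;
    return r;
}

/* x /[fl] y: round the quotient towards negative infinity.  */
static int32_t floor_div(int32_t x, int32_t y)
{
    int32_t r = x % y;
    int32_t d = x / y;
    if (r != 0 && (x ^ y) < 0)
        --d;
    return d;
}

/* x %[cl] y: remainder matching the ceiling quotient.  */
static int32_t ceil_mod(int32_t x, int32_t y)
{
    int32_t r = x % y;
    /* Operands of the same sign with a nonzero remainder: adjust.  */
    if (r != 0 && (x ^ y) >= 0)
        r -= y;
    return r;
}

/* x /[cl] y: round the quotient towards positive infinity.  */
static int32_t ceil_div(int32_t x, int32_t y)
{
    int32_t r = x % y;
    int32_t d = x / y;
    if (r != 0 && (x ^ y) >= 0)
        ++d;
    return d;
}
```

In each case d * y + r == x still holds, which is a cheap sanity check
for a pattern that rewrites these codes into trunc div/mod plus a
conditional fixup.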
