On Tue, 5 May 2020, Jakub Jelinek wrote:

> Hi!
> 
> On x86 (the only target with umulv4_optab) one can use mull; seto to check
> for overflow instead of performing wider multiplication and performing
> comparison on the high bits.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Hmm.  The preceding pattern doesn't check for umulv4 availability, so
why do it here?  I suppose an alternative simplification would be
to use a highpart multiply?  Do you intentionally not consider
((type)A * CST) >> prec?
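
For illustration only (this is not part of the patch), here is a minimal C
sketch of the source-level forms under discussion.  The function names are
hypothetical, and the constant 100000 is an arbitrary stand-in for CST.  It
assumes a compiler that provides __builtin_mul_overflow (GCC or Clang).

```c
#include <stdbool.h>
#include <stdint.h>

/* Form matched by the proposed pattern: widen both unsigned operands,
   multiply, and test whether any bit of the high half is set.  */
static bool
overflows_widening (uint32_t a, uint32_t b)
{
  return (((uint64_t) a * b) >> 32) != 0;
}

/* Form the pattern rewrites to: the overflow builtin, with the
   truncated product ignored.  */
static bool
overflows_builtin (uint32_t a, uint32_t b)
{
  uint32_t unused;
  return __builtin_mul_overflow (a, b, &unused);
}

/* The ((type)A * CST) >> prec variant: one operand is a constant, so
   there is no second conversion for the pattern to match.  */
static bool
overflows_by_const (uint32_t a)
{
  return (((uint64_t) a * 100000u) >> 32) != 0;
}
```

For 32-bit unsigned operands the first two functions should agree on every
input, since the product overflows 32 bits exactly when its high half is
nonzero.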

Thanks,
Richard.

> 2020-05-05  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/94914
>       * match.pd ((((type)A * B) >> prec) != 0 to .MUL_OVERFLOW(A, B) != 0):
>       New simplification.
> 
>       * gcc.target/i386/pr94914.c: New test.
> 
> --- gcc/match.pd.jj   2020-05-04 11:02:14.288865592 +0200
> +++ gcc/match.pd      2020-05-04 12:15:33.220799388 +0200
> @@ -4776,6 +4776,27 @@ (define_operator_list COND_TERNARY
>     (with { tree t = TREE_TYPE (@0), cpx = build_complex_type (t); }
>      (out (imagpart (IFN_MUL_OVERFLOW:cpx @0 @1)) { build_zero_cst (t); })))))
>  
> +/* Similarly, for unsigned operands, (((type) A * B) >> prec) != 0 where type
> +   is at least twice as wide as type of A and B, simplify to
> +   __builtin_mul_overflow (A, B, <unused>).  */
> +(for cmp (eq ne)
> + (simplify
> +  (cmp (rshift (mult:s (convert@3 @0) (convert @1)) INTEGER_CST@2)
> +       integer_zerop)
> +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +       && INTEGRAL_TYPE_P (TREE_TYPE (@3))
> +       && TYPE_UNSIGNED (TREE_TYPE (@0))
> +       && (TYPE_PRECISION (TREE_TYPE (@3))
> +        >= 2 * TYPE_PRECISION (TREE_TYPE (@0)))
> +       && tree_fits_uhwi_p (@2)
> +       && tree_to_uhwi (@2) == TYPE_PRECISION (TREE_TYPE (@0))
> +       && types_match (@0, @1)
> +       && type_has_mode_precision_p (TREE_TYPE (@0))
> +       && (optab_handler (umulv4_optab, TYPE_MODE (TREE_TYPE (@0)))
> +        != CODE_FOR_nothing))
> +   (with { tree t = TREE_TYPE (@0), cpx = build_complex_type (t); }
> +    (cmp (imagpart (IFN_MUL_OVERFLOW:cpx @0 @1)) { build_zero_cst (t); })))))
> +
>  /* Simplification of math builtins.  These rules must all be optimizations
>     as well as IL simplifications.  If there is a possibility that the new
>     form could be a pessimization, the rule should go in the canonicalization
> --- gcc/testsuite/gcc.target/i386/pr94914.c.jj        2020-05-04 12:58:04.435775670 +0200
> +++ gcc/testsuite/gcc.target/i386/pr94914.c   2020-05-04 12:57:39.307152707 +0200
> @@ -0,0 +1,17 @@
> +/* PR tree-optimization/94914 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler "\tseto\t" } } */
> +/* { dg-final { scan-assembler "\tsetno\t" } } */
> +
> +int
> +foo (unsigned int x, unsigned int y)
> +{
> +  return (((unsigned long long)x * y) >> 32) != 0;
> +}
> +
> +int
> +bar (unsigned int x, unsigned int y)
> +{
> +  return (((unsigned long long)x * y) >> 32) == 0;
> +}
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
