On Tue, 5 May 2020, Jakub Jelinek wrote:

> Hi!
>
> On x86 (the only target with umulv4_optab) one can use mull; seto to check
> for overflow instead of performing wider multiplication and performing
> comparison on the high bits.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

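For reference, the equivalence the quoted patch relies on can be sketched in plain C. This is only a minimal illustration and not part of the patch: the function names overflows_wide and overflows_builtin are hypothetical, and it assumes a GCC- or Clang-compatible compiler that provides __builtin_mul_overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Widening form: multiply in 64 bits, then test the high 32 bits.
   (Hypothetical helper name, for illustration only.)  */
static bool
overflows_wide (uint32_t x, uint32_t y)
{
  return (((uint64_t) x * y) >> 32) != 0;
}

/* Overflow-flag form: ask the compiler whether the 32-bit product
   overflows; the truncated product itself is discarded.
   (Hypothetical helper name, for illustration only.)  */
static bool
overflows_builtin (uint32_t x, uint32_t y)
{
  uint32_t unused;
  return __builtin_mul_overflow (x, y, &unused);
}
```

The two functions should agree for every pair of 32-bit inputs, because an unsigned 32-bit product overflows exactly when the upper half of the 64-bit product is nonzero.
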
Hmm.  The preceding pattern doesn't check for umulv4 availability, so
why do it here?  I suppose an alternative simplification would be to
use a highpart multiply?

Do you intentionally not consider ((type)A * CST) >> prec?

Thanks,
Richard.

> 2020-05-05  Jakub Jelinek  <ja...@redhat.com>
>
>     PR tree-optimization/94914
>     * match.pd ((((type)A * B) >> prec) != 0 to .MUL_OVERFLOW(A, B) != 0):
>     New simplification.
>
>     * gcc.target/i386/pr94914.c: New test.
>
> --- gcc/match.pd.jj 2020-05-04 11:02:14.288865592 +0200
> +++ gcc/match.pd 2020-05-04 12:15:33.220799388 +0200
> @@ -4776,6 +4776,27 @@ (define_operator_list COND_TERNARY
>     (with { tree t = TREE_TYPE (@0), cpx = build_complex_type (t); }
>      (out (imagpart (IFN_MUL_OVERFLOW:cpx @0 @1)) { build_zero_cst (t); })))))
>
> +/* Similarly, for unsigned operands, (((type) A * B) >> prec) != 0 where type
> +   is at least twice as wide as type of A and B, simplify to
> +   __builtin_mul_overflow (A, B, <unused>).  */
> +(for cmp (eq ne)
> + (simplify
> +  (cmp (rshift (mult:s (convert@3 @0) (convert @1)) INTEGER_CST@2)
> +       integer_zerop)
> +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +       && INTEGRAL_TYPE_P (TREE_TYPE (@3))
> +       && TYPE_UNSIGNED (TREE_TYPE (@0))
> +       && (TYPE_PRECISION (TREE_TYPE (@3))
> +           >= 2 * TYPE_PRECISION (TREE_TYPE (@0)))
> +       && tree_fits_uhwi_p (@2)
> +       && tree_to_uhwi (@2) == TYPE_PRECISION (TREE_TYPE (@0))
> +       && types_match (@0, @1)
> +       && type_has_mode_precision_p (TREE_TYPE (@0))
> +       && (optab_handler (umulv4_optab, TYPE_MODE (TREE_TYPE (@0)))
> +           != CODE_FOR_nothing))
> +   (with { tree t = TREE_TYPE (@0), cpx = build_complex_type (t); }
> +    (cmp (imagpart (IFN_MUL_OVERFLOW:cpx @0 @1)) { build_zero_cst (t); })))))
> +
>  /* Simplification of math builtins.  These rules must all be optimizations
>     as well as IL simplifications.  If there is a possibility that the new
>     form could be a pessimization, the rule should go in the canonicalization
> --- gcc/testsuite/gcc.target/i386/pr94914.c.jj 2020-05-04 12:58:04.435775670 +0200
> +++ gcc/testsuite/gcc.target/i386/pr94914.c 2020-05-04 12:57:39.307152707 +0200
> @@ -0,0 +1,17 @@
> +/* PR tree-optimization/94914 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler "\tseto\t" } } */
> +/* { dg-final { scan-assembler "\tsetno\t" } } */
> +
> +int
> +foo (unsigned int x, unsigned int y)
> +{
> +  return (((unsigned long long)x * y) >> 32) != 0;
> +}
> +
> +int
> +bar (unsigned int x, unsigned int y)
> +{
> +  return (((unsigned long long)x * y) >> 32) == 0;
> +}
>
>     Jakub
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)