Hi Jakub, > -----Original Message----- > From: Jakub Jelinek <ja...@redhat.com> > Sent: Thursday, November 25, 2021 8:19 AM > To: Richard Biener <rguent...@suse.de> > Cc: Tamar Christina <tamar.christ...@arm.com>; gcc-patches@gcc.gnu.org > Subject: [PATCH] match.pd: Fix up the recent bitmask_inv_cst_vector_p > simplification [PR103417] > > Hi! > > The following testcase is miscompiled since the r12-5489-g0888d6bbe97e10 > changes. > The simplification triggers on > (x & 4294967040U) >= 0U > and turns it into: > x <= 255U > which is incorrect, it should fold to 1 because unsigned >= 0U is always true > and normally the > /* Non-equality compare simplifications from fold_binary */ > (if (wi::to_wide (cst) == min) > (if (cmp == GE_EXPR) > { constant_boolean_node (true, type); }) simplification folds that, > but > this simplification was done earlier. > > The simplification correctly doesn't include lt which has the same reason why > it shouldn't be handled, we'll fold it to 0 elsewhere.
Yes this was a bug, sorry I'm not sure why I didn't catch it... > > But, IMNSHO while it isn't incorrect to handle le and gt there, it is > unnecessary. Because (x & cst) <= 0U and (x & cst) > 0U should never appear, > again in > /* Non-equality compare simplifications from fold_binary */ we have a > simplification for it: > (if (cmp == LE_EXPR) > (eq @2 @1)) > (if (cmp == GT_EXPR) > (ne @2 @1)))) > This is done for > (cmp (convert?@2 @0) uniform_integer_cst_p@1) and so should be done > for both integers and vectors. > As the bitmask_inv_cst_vector_p simplification only handles eq and ne for > signed types, I think it can be simplified to just following patch. As I mentioned on the PR I don't think LE and GT should be removed, the patch Is attempting to simplify the bitmask used because most vector ISAs can create the simpler mask much easier than the complex mask. It. 0xFFFFFF00 is harder to create than 0xFF. So while for scalar it doesn't matter as much, it does for vector code. Regards, Tamar > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > I wonder if (for cst (VECTOR_CST INTEGER_CST) is good for the best size of *- > match.c, wouldn't accepting just CONSTANT_CLASS_P@1 and then just say in > bitmask_inv_cst_vector_p return NULL_TREE if it isn't INTEGER_CST or > VECTOR_CST? > > Also, without/with this patch I see on i686-linux (can be reproduced with > RUNTESTFLAGS="--target_board=unix/-m32/-mno-sse dg.exp='bic-bitmask* > signbit-2*'" > too): > FAIL: gcc.dg/bic-bitmask-10.c scan-tree-dump dce7 "<=\\\\s*.+{ 255,.+}" > FAIL: gcc.dg/bic-bitmask-11.c scan-tree-dump dce7 ">\\\\s*.+{ 255,.+}" > FAIL: gcc.dg/bic-bitmask-12.c scan-tree-dump dce7 "<=\\\\s*.+{ 255,.+}" > FAIL: gcc.dg/bic-bitmask-2.c scan-tree-dump-times dce7 > "<=\\\\s*.+{ 255,.+}" 1 > FAIL: gcc.dg/bic-bitmask-23.c (test for excess errors) > FAIL: gcc.dg/bic-bitmask-23.c scan-tree-dump dce7 "<=\\\\s*.+{ 255, 15, 1, > 65535 }" > FAIL: gcc.dg/bic-bitmask-3.c scan-tree-dump-times dce7 > "<=\\\\s*.+{ 255,.+}" 1 > FAIL: gcc.dg/bic-bitmask-4.c scan-tree-dump-times dce7 "=\\\\s*.+{ 1,.+}" 1 > FAIL: gcc.dg/bic-bitmask-5.c scan-tree-dump-times dce7 ">\\\\s*.+{ 255,.+}" > 1 > FAIL: gcc.dg/bic-bitmask-6.c scan-tree-dump-times dce7 > "<=\\\\s*.+{ 255,.+}" 1 > FAIL: gcc.dg/bic-bitmask-8.c scan-tree-dump-times dce7 ">\\\\s*.+{ 1,.+}" 1 > FAIL: gcc.dg/bic-bitmask-9.c scan-tree-dump dce7 > "&\\\\s*.+{ 4294967290,.+}" > FAIL: gcc.dg/signbit-2.c scan-tree-dump optimized "\\\\s+>\\\\s+{ 0(, 0)+ }" > Those tests use vect_int effective target, but AFAIK that can be used only in > *.dg/vect/ because it relies on vect.exp enabling options to support > vectorization on the particular target (e.g. for i686-linux that -msse2). > I think there isn't other way to get the DEFAULT_VECTCFLAGS into dg- > options other than having the test driven by vect.exp. > > And, finally, I've noticed incorrect formatting in the new > bitmask_inv_cst_vector_p routine: > do { > if (idx > 0) > cst = vector_cst_elt (t, idx); > ... > builder.quick_push (newcst); > } while (++idx < nelts); > It should be > do > { > if (idx > 0) > cst = vector_cst_elt (t, idx); > ... > builder.quick_push (newcst); > } > while (++idx < nelts); > > 2021-11-25 Jakub Jelinek <ja...@redhat.com> > > PR tree-optimization/103417 > * match.pd ((X & Y) CMP 0): Only handle eq and ne. Commonalize > common tests. > > * gcc.c-torture/execute/pr103417.c: New test. > > --- gcc/match.pd.jj 2021-11-24 11:46:03.191918052 +0100 > +++ gcc/match.pd 2021-11-24 22:33:43.852575772 +0100 > @@ -5214,20 +5214,16 @@ (define_operator_list SYNC_FETCH_AND_AND > /* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z > where ~Y + 1 == pow2 and Z = ~Y. */ (for cst (VECTOR_CST INTEGER_CST) > - (for cmp (le eq ne ge gt) > - icmp (le le gt le gt) > - (simplify > - (cmp (bit_and:c@2 @0 cst@1) integer_zerop) > - (with { tree csts = bitmask_inv_cst_vector_p (@1); } > - (switch > - (if (csts && TYPE_UNSIGNED (TREE_TYPE (@1)) > - && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2))) > - (icmp @0 { csts; })) > - (if (csts && !TYPE_UNSIGNED (TREE_TYPE (@1)) > - && (cmp == EQ_EXPR || cmp == NE_EXPR) > - && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2))) > + (for cmp (eq ne) > + icmp (le gt) > + (simplify > + (cmp (bit_and:c@2 @0 cst@1) integer_zerop) > + (with { tree csts = bitmask_inv_cst_vector_p (@1); } > + (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2))) > + (if (TYPE_UNSIGNED (TREE_TYPE (@1))) > + (icmp @0 { csts; }) > (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); } > - (icmp (convert:utype @0) { csts; })))))))) > + (icmp (convert:utype @0) { csts; })))))))) > > /* -A CMP -B -> B CMP A. */ > (for cmp (tcc_comparison) > --- gcc/testsuite/gcc.c-torture/execute/pr103417.c.jj 2021-11-24 > 22:36:00.732626424 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr103417.c 2021-11-24 > 22:35:43.964865218 +0100 > @@ -0,0 +1,11 @@ > +/* PR tree-optimization/103417 */ > + > +struct { int a : 8; int b : 24; } c = { 0, 1 }; > + > +int > +main () > +{ > + if (c.b && !c.b) > + __builtin_abort (); > + return 0; > +} > > Jakub