On Thu, Oct 20, 2022 at 9:22 PM Thomas Schwinge <tho...@codesourcery.com> wrote: > > Hi! > > On 2022-10-20T14:23:33+0200, Aldy Hernandez <al...@redhat.com> wrote: > >> I understand 'r & 3' to be logically equivalent to '(r & 2) && (r & 1)', > >> right? > > > > For r == 2, r & 3 == 2, whereas (r & 2) && (r & 1) == 0, so no? > > Thanks, and now please let me crawl back under my stone, embarassing... > That'd rather be '(r & 2) || (r & 1)'.
No worries. If there was a tally of how many times a GCC hacker has to crawl under a stone, I'd have the record ;-). > > Well, with that now clarified, how about the again updated > "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" attached? I see 7 different tests in this patch. Did the 6 that pass, fail before my patch for PR107195 and are now working? Cause unless that's the case, they shouldn't be in a test named pr107195-3.c, but somewhere else. I see there's one XFAILed test in your patch, and this certainly doesn't look like something that has anything to do with the patch I submitted. Perhaps you could open a PR with an enhancement request for this one? That being said... /* { dg-additional-options -O1 } */ extern int __attribute__((const)) foo4b (int); int f4b (unsigned int r) { if (foo4b (r)) r *= 8U; if ((r / 2U) & 2U) r += foo4b (r); return r; } /* { dg-final { scan-tree-dump-times {gimple_call <foo4b,} 1 dom3 { xfail *-*-* } } } */ At -O2, this is something PRE is doing, so GCC already handles this. However, you are suggesting this isn't handled at -O1 and should be?? None of the VRPs run at -O1 so ranger-vrp won't even get a chance. However, DOM runs at -O1 and it uses ranger to do simple copy propagation and some jump threading...so technically we could do something... DOM should be able to thread from the r *= 8U to the return because the nonzero mask (known zeros) after the multiplication is 0xfffffff8, which it could use to solve the second conditional as false. This would leave us with: if (foo4b (r)) { r *= 8U; return r; } else { if ((r / 2U) & 2U) r += foo4b (r); } ...which exposes the fact that the second call to foo4b() has the same "r" as the first one, so it could be folded. I don't know whose job it is to notice that two const calls have the same arguments, but ISTM that if we thread the above correctly, someone should be able to clean this up. No clue whether this happens at -O1. However... we're not threading this. It looks like we're not keeping track of nonzero bits (known zeros) through the division. The multiplication gives us 0xfffffff8 and we should be able to divide that by 2 and get 0x7ffffffc which solves the second conditional to 0. So...maybe DOM+ranger could set things up for another pass to clean this up? Either way, you could open an enhancement request, if anything to keep the nonzero mask up to date through the division. Aldy