On Mon, Oct 25, 2021 at 6:58 PM Andrew MacLeod <amacl...@redhat.com> wrote:
>
> On 10/20/21 6:28 AM, Aldy Hernandez wrote:
> > Sometimes we can solve a candidate path without having to recurse
> > further back.  This can mostly happen in fully resolving mode, because
> > we can ask the ranger what the range on entry to the path is, but
> > there's no reason this can't always apply.  This one-liner removes
> > the fully-resolving restriction.
> >
> > I'm tickled pink to see how many things we now get quite early
> > in the compilation.  I actually had to disable jump threading entirely
> > for a few tests because the early threader was catching things
> > disturbingly early.  Also, as Richi predicted, I saw a lot of pre-VRP
> > cleanups happening.
> >
> > I was going to commit this as obvious, but I think the test changes
> > merit discussion.
> >
> > We've been playing games with gcc.dg/tree-ssa/ssa-thread-11.c for quite
> > some time.  Every time a threading pass gets smarter, we push the
> > check further down the pipeline.  We've officially run out of dumb
> > threading passes to disable ;-).  In the last year we've gone up from a
> > handful of threads, to 34 threads with the current combination of
> > options.  I doubt this is testing anything useful any more, so I've
> > removed it.
> >
> > Similarly for gcc.dg/tree-ssa/ssa-dom-thread-4.c.  We used to thread 3
> > jump threads, but they were disallowed because of loop rotation.  Then
> > we started catching more jump threads in VRP2 threading so we tested
> > there.  With this patch though, we triple the number of threads found
> > from 11 to 31.  I believe this test has outlived its usefulness, and
> > I've removed it.  Note that even though we have these outrageous
> > possibilities for this test, the block copier ultimately chops them
> > down (23 survive though).
>
> I'm running into an issue with ssa-dom-thread-4.c when trying to run
> ranger for the VRP2 pass.
> It reduces the number of threads to 2, and
> upon closer inspection as to why, I see:
>
> unsigned char
> bitmap_ior_and_compl (bitmap dst, const_bitmap a, const_bitmap b,
>                       const_bitmap kill)
> {
>   unsigned char changed = 0;
>
>   bitmap_element *dst_elt;
>   const bitmap_element *a_elt, *b_elt, *kill_elt, *dst_prev;
>
>   while (a_elt || b_elt)
>     {
>
> Ranger determines that the uses of a_elt and b_elt in the guard are used
> before defined, so it assumes UNDEFINED and removes a condition check.
>
> So it seems like this entire test case is predicated on undefined
> behaviour?  FWIW, if I initialize them, I get 0 threads...

Hah.  That makes sense.

As the threaders have gotten smarter, the scan-tree-dump-times has moved
later on in the pipeline to less capable threaders.  Currently it's
testing for 3 threads in the first VRP threader pass.  With my proposed
patch I bet we see an UNDEFINED somewhere in the calculation, and bail
on the entire thread as unreachable.

Aldy
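
To illustrate the used-before-defined guard discussed above, here is a
minimal sketch.  It is not taken from the testcase: struct elt,
walk_uninit and walk_init are hypothetical names, and the loop body is
invented purely for illustration.  The first function (compiled out)
has the same shape as the quoted guard; the second initializes both
pointers so the condition has a defined value on loop entry.

```c
#include <stddef.h>

/* Hypothetical stand-in for the testcase's bitmap_element.  */
struct elt
{
  struct elt *next;
  unsigned bits;
};

#if 0
/* Shape of the guard in the quoted snippet: a_elt and b_elt are read
   before anything is stored to them, so their values are indeterminate
   and an optimizer may treat the condition as UNDEFINED.  Compiled out
   on purpose; do not run this.  */
unsigned
walk_uninit (void)
{
  const struct elt *a_elt, *b_elt;
  unsigned n = 0;

  while (a_elt || b_elt)
    n++;
  return n;
}
#endif

/* Same guard, but both pointers are initialized from the arguments.
   Returns the number of loop iterations, i.e. the length of the longer
   of the two lists.  */
unsigned
walk_init (const struct elt *a, const struct elt *b)
{
  const struct elt *a_elt = a, *b_elt = b;
  unsigned n = 0;

  while (a_elt || b_elt)
    {
      if (a_elt)
        a_elt = a_elt->next;
      if (b_elt)
        b_elt = b_elt->next;
      n++;
    }
  return n;
}
```

With the initialized form, the guard depends on real inputs, so any
threading a pass finds there reflects actual control flow rather than an
assumption about an indeterminate value.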