On Mon, Jun 19, 2023 at 12:15 PM Jan Hubicka <hubi...@ucw.cz> wrote: > > > On Mon, Jun 19, 2023 at 9:52 AM Jan Hubicka via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > Hi, > > > this was suggested earlier somewhere, but I can not find the thread. > > > C++ has assume attribute that expands int > > > if (conditional) > > > __builtin_unreachable () > > > We do not want to account the conditional in inline heuristics since > > > we know that it is going to be optimized out. > > > > > > Bootstrapped/regtested x86_64-linux, will commit it later today if > > > thre are no complains. > > > > I think we also had the request to not account the condition feeding > > stmts (if they only feed it and have no side-effects). libstdc++ has > > complex range comparisons here. Also ... > > I was thinking of this: it depends on how smart do we want to get. > We also have dead conditionals guarding clobbers, predicts and other > stuff. In general we can use mark phase of cd-dce telling it to ignore > those statements and then use its resut in the analysis.
Hmm, possible but a bit heavy-handed. There's simple_dce_from_worklist which might be a way to do this (of course we cannot use that 1:1). Also then consider a = a + 1; if (a > 10) __builtin_unreachable (); if (a < 5) __builtin_unreachable (); and a has more than one use but both are going away. So indeed a more global analysis would be needed to get the full benefit. > Also question is how much we can rely on middle-end optimizing out > unreachables. For example: > int a; > int b[3]; > test() > { > if (a > 0) > { > for (int i = 0; i < 3; i++) > b[i] = 0; > __builtin_unreachable (); > } > } > > IMO can be optimized to empty function. I believe we used to have code > in tree-cfgcleanup to remove statements just before > __builtin_unreachable which can not terminate execution, but perhaps it > existed only in my local tree? I think we rely on DCE/DSE here and explicit unreachable () pruning after VRP picked up things (I think it simply gets the secondary effect optimizing the condition it created the range for in the first pass). DSE is appearantly not able to kill the stores, I will fix that. I think DCE can, but only for non-aliased stores. > We could also perhaps declare unreachable NOVOPs which would make DSE to > remove the stores. But only because of a bug in DSE ... it also removes them if that __builtin_unreachable () is GIMPLE_RESX. > > > > ... we do code generate BUILT_IN_UNREACHABLE_TRAP, no? > > You are right. I tested it with -funreachable-traps but it does not do > what I expected, I need -fsanitize=unreachable -fsanitize-trap=unreachable > > Also if I try to call it by hand I get: > > jan@localhost:/tmp> gcc t.c -S -O2 -funreachable-traps > -fdump-tree-all-details -fsanitize=unreachable -fsanitize-trap=unreachable > t.c: In function ‘test’: > t.c:9:13: warning: implicit declaration of function > ‘__builtin_unreachable_trap’; did you mean ‘__builtin_unreachable trap’? > [-Wimplicit-function-declaration] > 9 | __builtin_unreachable_trap (); > | ^~~~~~~~~~~~~~~~~~~~~~~~~~ > | __builtin_unreachable trap > > Which is not as helpful as it is trying to be. > > > > > + ret = true; > > > +done: > > > + for (basic_block vbb:visited_bbs) > > > + cache[vbb->index] = (unsigned char)ret + 1; > > > + return ret; > > > +} > > > + > > > /* Analyze function body for NODE. > > > EARLY indicates run from early optimization pipeline. */ > > > > > > @@ -2743,6 +2791,8 @@ analyze_function_body (struct cgraph_node *node, > > > bool early) > > > const profile_count entry_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count; > > > order = XNEWVEC (int, n_basic_blocks_for_fn (cfun)); > > > nblocks = pre_and_rev_post_order_compute (NULL, order, false); > > > + auto_vec<unsigned char, 10> cache; > > > + cache.safe_grow_cleared (last_basic_block_for_fn (cfun)); > > > > A sbitmap with two bits per entry would be more space efficient here. > > bitmap > > has bitmap_set_aligned_chunk and bitmap_get_aligned_chunk for convenience, > > adding the corresponding to sbitmap.h would likely ease use there as well. > > I did not know about the chunk API which is certainly nice :) > sbitmap will always allocate, while here we stay on stack for small > functions and I am not sure how much extra bit operations would not > offset extra memset, but overall I think it is all in noise. Ah, yeah. > Honza