On 9/24/20 2:44 PM, Richard Biener wrote:
> On Thu, 24 Sep 2020, Tom de Vries wrote:
>
>> On 9/24/20 1:42 PM, Richard Biener wrote:
>>> On Wed, 23 Sep 2020, Tom de Vries wrote:
>>>
>>>> On 9/23/20 9:28 AM, Richard Biener wrote:
>>>>> On Tue, 22 Sep 2020, Tom de Vries wrote:
>>>>>
>>>>>> [ was: Re: [Patch] [middle-end & nvptx] gcc/tracer.c: Don't split BB
>>>>>> with SIMT LANE [PR95654] ]
>>>>>>
>>>>>> On 9/16/20 8:20 PM, Alexander Monakov wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Wed, 16 Sep 2020, Tom de Vries wrote:
>>>>>>>
>>>>>>>> [ cc-ing author omp support for nvptx. ]
>>>>>>>
>>>>>>> The issue looks familiar. I recognized it back in 2017 (and LLVM people
>>>>>>> recognized it too for their GPU targets). In an attempt to get agreement
>>>>>>> to fix the issue "properly" for GCC I found a similar issue that affects
>>>>>>> all targets, not just offloading, and filed it as PR 80053.
>>>>>>>
>>>>>>> (yes, there are no addressable labels involved in offloading, but
>>>>>>> nevertheless
>>>>>>> the nature of the middle-end issue is related)
>>>>>>
>>>>>> Hi Alexander,
>>>>>>
>>>>>> thanks for looking into this.
>>>>>>
>>>>>> Seeing that the attempt to fix things properly is stalled, for now I'm
>>>>>> proposing a point-fix, similar to the original patch proposed by Tobias.
>>>>>>
>>>>>> Richi, Jakub, OK for trunk?
>>>>>
>>>>> I notice that we call ignore_bb_p many times in tracer.c but one call
>>>>> is conveniently early in tail_duplicate (void):
>>>>>
>>>>>       int n = count_insns (bb);
>>>>>       if (!ignore_bb_p (bb))
>>>>>         blocks[bb->index] = heap.insert (-bb->count.to_frequency (cfun),
>>>>>                                          bb);
>>>>>
>>>>> where count_insns already walks all stmts in the block.  It would be
>>>>> nice to avoid repeatedly walking all stmts, maybe adjusting the above
>>>>> call is enough and/or count_insns can compute this and/or the ignore_bb_p
>>>>> result can be cached (optimize_bb_for_size_p might change though,
>>>>> but maybe all other ignore_bb_p calls effectively just are that,
>>>>> checks for blocks that became optimize_bb_for_size_p).
>>>>>
>>>>
>>>> This untested follow-up patch tries something in that direction.
>>>>
>>>> Is this what you meant?
>>>
>>> Yeah, sort of.
>>>
>>> +static bool
>>> +cached_can_duplicate_bb_p (const_basic_block bb)
>>> +{
>>> +  if (can_duplicate_bb)
>>>
>>> is there any path where can_duplicate_bb would be NULL?
>>>
>>
>> Yes, ignore_bb_p is called from gimple-ssa-split-paths.c.
>
> Oh, that was probably done because of the very same OMP issue ...
>
>>> +    {
>>> +      unsigned int size = SBITMAP_SIZE (can_duplicate_bb);
>>> +      /* Assume added bb's should be ignored.  */
>>> +      if ((unsigned int)bb->index < size
>>> +          && bitmap_bit_p (can_duplicate_bb_computed, bb->index))
>>> +        return !bitmap_bit_p (can_duplicate_bb, bb->index);
>>>
>>> yes, newly added bbs should be ignored so,
>>>
>>>     }
>>>
>>> -  return false;
>>> +  bool val = compute_can_duplicate_bb_p (bb);
>>> +  if (can_duplicate_bb)
>>> +    cache_can_duplicate_bb_p (bb, val);
>>>
>>> no need to compute & cache for them, just return true (because
>>> we did duplicate them)?
>>>
>>
>> Also the case for gimple-ssa-split-paths.c.?
>
> If it had the bitmap then yes ... since it doesn't the early
> out should be in the conditional above only.
>
Ack, updated the patch accordingly, and split it into two parts, one that
does the refactoring and one that adds the actual caching:
- [ftracer] Factor out can_duplicate_bb_p
- [ftracer] Add caching of can_duplicate_bb_p

I'll post these in reply to this email.

Thanks,
- Tom
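
For reference, a minimal sketch of the cached variant as I read the comments
above (the posted patches are authoritative).  It assumes the can_duplicate_bb
and can_duplicate_bb_computed sbitmaps from the quoted draft, but with a set
bit in can_duplicate_bb meaning "the block can be duplicated" (the opposite
polarity of the draft), returns true without computing or caching for blocks
added after the cache was allocated, and falls back to an uncached
compute_can_duplicate_bb_p when the bitmap is absent (the
gimple-ssa-split-paths.c case):

  static bool
  cached_can_duplicate_bb_p (const_basic_block bb)
  {
    if (can_duplicate_bb)
      {
        unsigned int size = SBITMAP_SIZE (can_duplicate_bb);
        if ((unsigned int) bb->index >= size)
          /* A bb added after the cache was allocated was created by
             duplication, so it can be duplicated again; don't compute or
             cache anything for it.  */
          return true;

        if (bitmap_bit_p (can_duplicate_bb_computed, bb->index))
          return bitmap_bit_p (can_duplicate_bb, bb->index);
      }

    bool val = compute_can_duplicate_bb_p (bb);
    if (can_duplicate_bb)
      cache_can_duplicate_bb_p (bb, val);
    return val;
  }

The two-bitmap scheme keeps "has the answer been computed" separate from the
answer itself, so the cache needs no eager initialization per block.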