On 9/24/20 2:44 PM, Richard Biener wrote:
> On Thu, 24 Sep 2020, Tom de Vries wrote:
> 
>> On 9/24/20 1:42 PM, Richard Biener wrote:
>>> On Wed, 23 Sep 2020, Tom de Vries wrote:
>>>
>>>> On 9/23/20 9:28 AM, Richard Biener wrote:
>>>>> On Tue, 22 Sep 2020, Tom de Vries wrote:
>>>>>
>>>>>> [ was: Re: [Patch] [middle-end & nvptx] gcc/tracer.c: Don't split BB
>>>>>> with SIMT LANE [PR95654] ]
>>>>>>
>>>>>> On 9/16/20 8:20 PM, Alexander Monakov wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Wed, 16 Sep 2020, Tom de Vries wrote:
>>>>>>>
>>>>>>>> [ cc-ing author omp support for nvptx. ]
>>>>>>>
>>>>>>> The issue looks familiar. I recognized it back in 2017 (and LLVM people
>>>>>>> recognized it too for their GPU targets). In an attempt to get agreement
>>>>>>> to fix the issue "properly" for GCC I found a similar issue that affects
>>>>>>> all targets, not just offloading, and filed it as PR 80053.
>>>>>>>
>>>>>>> (yes, there are no addressable labels involved in offloading, but 
>>>>>>> nevertheless
>>>>>>> the nature of the middle-end issue is related)
>>>>>>
>>>>>> Hi Alexander,
>>>>>>
>>>>>> thanks for looking into this.
>>>>>>
>>>>>> Seeing that the attempt to fix things properly is stalled, for now I'm
>>>>>> proposing a point-fix, similar to the original patch proposed by Tobias.
>>>>>>
>>>>>> Richi, Jakub, OK for trunk?
>>>>>
>>>>> I notice that we call ignore_bb_p many times in tracer.c but one call
>>>>> is conveniently early in tail_duplicate (void):
>>>>>
>>>>>       int n = count_insns (bb);
>>>>>       if (!ignore_bb_p (bb))
>>>>>         blocks[bb->index] = heap.insert (-bb->count.to_frequency (cfun), 
>>>>> bb);
>>>>>
>>>>> where count_insns already walks all stmts in the block.  It would be
>>>>> nice to avoid repeatedly walking all stmts, maybe adjusting the above
>>>>> call is enough and/or count_insns can compute this and/or the ignore_bb_p
>>>>> result can be cached (optimize_bb_for_size_p might change though,
>>>>> but maybe all other ignore_bb_p calls effectively just are that,
>>>>> checks for blocks that became optimize_bb_for_size_p).
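(To illustrate the "count_insns can compute this" option: the statement
walk that count_insns already does could also record the predicate,
roughly like below.  Just a sketch, names made up.)

  /* Walk BB once, both counting insns and noting whether any statement
     would block duplicating BB (here only the SIMT entry/exit calls from
     the point-fix are checked).  */
  static int
  analyze_bb (basic_block bb, bool *can_duplicate)
  {
    int n = 0;
    *can_duplicate = true;

    for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
         !gsi_end_p (gsi); gsi_next (&gsi))
      {
        gimple *stmt = gsi_stmt (gsi);
        n += estimate_num_insns (stmt, &eni_size_weights);
        if (gimple_call_internal_p (stmt, IFN_GOMP_SIMT_ENTER_ALLOC)
            || gimple_call_internal_p (stmt, IFN_GOMP_SIMT_EXIT))
          *can_duplicate = false;
      }

    return n;
  }
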
>>>>>
>>>>
>>>> This untested follow-up patch tries something in that direction.
>>>>
>>>> Is this what you meant?
>>>
>>> Yeah, sort of.
>>>
>>> +static bool
>>> +cached_can_duplicate_bb_p (const_basic_block bb)
>>> +{
>>> +  if (can_duplicate_bb)
>>>
>>> is there any path where can_duplicate_bb would be NULL?
>>>
>>
>> Yes, ignore_bb_p is called from gimple-ssa-split-paths.c.
> 
> Oh, that was probably done because of the very same OMP issue ...
> 
>>> +    {
>>> +      unsigned int size = SBITMAP_SIZE (can_duplicate_bb);
>>> +      /* Assume added bb's should be ignored.  */
>>> +      if ((unsigned int)bb->index < size
>>> +         && bitmap_bit_p (can_duplicate_bb_computed, bb->index))
>>> +       return !bitmap_bit_p (can_duplicate_bb, bb->index);
>>>
>>> yes, newly added bbs should be ignored so,
>>>
>>>      }
>>>  
>>> -  return false;
>>> +  bool val = compute_can_duplicate_bb_p (bb);
>>> +  if (can_duplicate_bb)
>>> +    cache_can_duplicate_bb_p (bb, val);
>>>
>>> no need to compute & cache for them, just return true (because
>>> we did duplicate them)?
>>>
>>
>> Also the case for gimple-ssa-split-paths.c.?
> 
> If it had the bitmap then yes ... since it doesn't the early
> out should be in the conditional above only.
> 
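
So, IIUC, the lookup should end up roughly like this (just a sketch to
check my understanding, reusing the names from the draft; the updated
patches are what counts):

  static bool
  cached_can_duplicate_bb_p (const_basic_block bb)
  {
    if (can_duplicate_bb)
      {
        unsigned int size = SBITMAP_SIZE (can_duplicate_bb);
        /* Blocks added after the cache was allocated are duplicates we
           just made; don't compute or cache anything for them, they end
           up ignored.  */
        if ((unsigned int) bb->index >= size)
          return false;
        if (bitmap_bit_p (can_duplicate_bb_computed, bb->index))
          return bitmap_bit_p (can_duplicate_bb, bb->index);
      }

    bool val = compute_can_duplicate_bb_p (bb);
    if (can_duplicate_bb)
      cache_can_duplicate_bb_p (bb, val);
    return val;
  }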

Ack, updated the patch accordingly, and split it up into two parts: one
that does the refactoring, and one that adds the actual caching:
- [ftracer] Factor out can_duplicate_bb_p
- [ftracer] Add caching of can_duplicate_bb_p
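
To give an idea of the factoring bit: it moves the statement-based checks
out of ignore_bb_p into the new can_duplicate_bb_p predicate, roughly like
this (sketch only, not the actual patch text):

  /* Return true if BB can be duplicated.  */
  static bool
  can_duplicate_bb_p (const_basic_block bb)
  {
    /* A transaction is a single entry multiple exit region.  It must be
       duplicated in its entirety or not at all.  */
    gimple *g = last_stmt (CONST_CAST_BB (bb));
    if (g != NULL && gimple_code (g) == GIMPLE_TRANSACTION)
      return false;

    /* Blocks with the SIMT entry/exit calls from the point-fix must not
       be duplicated either.  */
    for (gimple_stmt_iterator gsi = gsi_start_bb (CONST_CAST_BB (bb));
         !gsi_end_p (gsi); gsi_next (&gsi))
      {
        gimple *stmt = gsi_stmt (gsi);
        if (gimple_call_internal_p (stmt, IFN_GOMP_SIMT_ENTER_ALLOC)
            || gimple_call_internal_p (stmt, IFN_GOMP_SIMT_EXIT))
          return false;
      }

    return true;
  }

  static bool
  ignore_bb_p (const_basic_block bb)
  {
    if (bb->index < NUM_FIXED_BLOCKS)
      return true;
    if (optimize_bb_for_size_p (bb))
      return true;

    return !can_duplicate_bb_p (bb);
  }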

I'll post these in reply to this email.

Thanks,
- Tom
