Honza,

Thanks a lot for your comments. 

> On May 9, 2023, at 6:22 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
> 
>>>>> 
>>>>> From my understanding, -fprofile-partial-training is one important option 
>>>>> for PGO performance.
>>>> 
>>>> I don't think so, speed benefit would be rather small I guess.
>>> I saw some articles online introducing this option for GCC 10,
>>> https://documentation.suse.com/sbp/all/html/SBP-GCC-10/index.html#sec-gcc10-pgo
>> 
>> Hi.
>> 
>> Ah, I see.
>> 
>>> Also, based on my previous experience with the Studio compiler, I guess that 
>>> this option might have
>>> some good performance impact on PGO.  Is there any old performance data on 
>>> this option? (I cannot find any online.)
>> 
>> Maybe Honza can chime in here? Or Martin who is the author of the white 
>> paper.
> 
> The main motivation for this was profiling programs that contain specific
> code paths for different CPUs (such as the graphics library in Firefox or the
> Linux kernel). When the training machine differs from the machine the
> program is later run on, we end up optimizing for size all code paths
> except the ones taken by the specific CPU.  This patch essentially tells gcc
> to consider every non-trained function as built without profile
> feedback.
Makes sense.
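
To make the scenario concrete, here is a minimal sketch of the kind of CPU-specific dispatch described above. Everything in it is hypothetical: the function names, the has_wide_simd parameter (a stand-in for a real CPU feature check), and the file name in the build comment are made up for illustration and are not taken from Firefox or the kernel. If the training run only ever calls sum_pixels() with has_wide_simd set, sum_pixels_generic() is never executed during training, which is the case the option is meant to handle.

```c
/* Hypothetical example; all names are illustrative.
 *
 * Assumed build sequence (options as documented for GCC 10 and later;
 * blend.c is a made-up file name):
 *   gcc -O2 -fprofile-generate blend.c -o blend    (instrumented build)
 *   ./blend                                        (training run)
 *   gcc -O2 -fprofile-use -fprofile-partial-training blend.c -o blend
 */
#include <stddef.h>
#include <stdint.h>

/* Baseline path: what a machine without the wide-SIMD feature would run. */
uint32_t sum_pixels_generic(const uint8_t *px, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += px[i];
    return total;
}

/* Path meant for CPUs with wider vectors. It is written as a plain
 * 4-way unrolled loop so that the sketch stays portable. */
uint32_t sum_pixels_wide(const uint8_t *px, size_t n)
{
    uint32_t a = 0, b = 0, c = 0, d = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a += px[i];
        b += px[i + 1];
        c += px[i + 2];
        d += px[i + 3];
    }
    for (; i < n; i++)      /* leftover elements */
        a += px[i];
    return a + b + c + d;
}

/* Runtime dispatch: only one branch is exercised on any given machine,
 * so the other one has no execution counts after training. */
uint32_t sum_pixels(const uint8_t *px, size_t n, int has_wide_simd)
{
    return has_wide_simd ? sum_pixels_wide(px, n)
                         : sum_pixels_generic(px, n);
}
```

As I understand the explanation, with plain -fprofile-use the untrained branch would be optimized for size, while with -fprofile-partial-training it would be compiled as if no profile feedback were available.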
> 
> For Firefox it had an important impact on graphics rendering tests back
> then, since the build machine had AVX while the benchmarking machine did not.
> Some benchmarks improved severalfold, which is not a surprise if you
> consider a tight graphics rendering loop optimized for size versus
> a vectorized one.

That’s a lot of improvement. So, without -fprofile-partial-training, PGO 
hurt performance in those cases? 

> The patch has a bad effect on code size, which in turn
> impacts performance too, so I think it makes sense to use
> -fprofile-partial-training with a bit of care (i.e. only on code where
> such scenarios are likely).

Right. 
> 
> As for backporting, I do not have a checkout of GCC 8 right now. It
> depends on profile infrastructure that was added in 2017 (so stage1 of
> GCC 8), so the patch may backport quite easily.  I am not 100% sure
> what shape the infrastructure was in in the first version, but I am quite
> convinced it had the necessary bits - it was able to tell the difference
> between a 0 profile count and missing profile feedback.

This is good to know. I will try to backport it to GCC 8 and let them test it 
to see whether it has any good impact.
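
To check my understanding of the distinction between a 0 profile count and missing profile feedback, here is a small model of it. This is only a sketch under my own assumptions and not GCC's actual profile representation: the type exec_count, the enum values, and the partial_training parameter are all hypothetical names, and the real decision is made per function rather than per count.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not GCC's actual data structures. */
enum count_state {
    COUNT_MISSING,   /* no profile feedback is available for this code */
    COUNT_MEASURED   /* the training run produced a count (possibly 0) */
};

struct exec_count {
    enum count_state state;
    uint64_t value;  /* meaningful only when state == COUNT_MEASURED */
};

/* With the hypothetical partial_training flag set, a measured zero is
 * downgraded to "missing", so the code is treated as if it had been
 * built without profile feedback. */
struct exec_count effective_count(struct exec_count c, bool partial_training)
{
    if (partial_training && c.state == COUNT_MEASURED && c.value == 0)
        c.state = COUNT_MISSING;
    return c;
}

/* Only a measured zero justifies optimizing for size in this model. */
bool optimize_for_size(struct exec_count c)
{
    return c.state == COUNT_MEASURED && c.value == 0;
}
```

If the GCC 8 infrastructure can already represent both states, as suggested above, the backport would mainly need the downgrade step.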

Qing
> 
> Honza
>> 
