On Wed, 13 Jan 2021, Qing Zhao wrote:

> 
> 
> > On Jan 13, 2021, at 1:39 AM, Richard Biener <rguent...@suse.de> wrote:
> > 
> > On Tue, 12 Jan 2021, Qing Zhao wrote:
> > 
> >> Hi, 
> >> 
> >> Just checking in to see whether you have any comments or suggestions on this:
> >> 
> >> FYI, I have been continuing with the Approach D implementation since last week:
> >> 
> >> D. Adding calls to .DEFERRED_INIT during gimplification, expanding 
> >> .DEFERRED_INIT to real initialization during expand, and adjusting 
> >> the uninitialized pass to handle the new references to 
> >> “.DEFERRED_INIT”.
> >> 
> >> For the remaining work of Approach D:
> >> 
> >> ** complete the implementation of -ftrivial-auto-var-init=pattern;
> >> ** complete the implementation of uninitialized warnings maintenance work 
> >> for D. 
> >> 
> >> I have completed the uninitialized warnings maintenance work for D
> >> and finished part of the -ftrivial-auto-var-init=pattern 
> >> implementation. 
> >> 
> >> The following work remains for Approach D:
> >> 
> >>   ** -ftrivial-auto-var-init=pattern for VLAs;
> >>   ** add a new variable attribute, __attribute__((uninitialized)), 
> >>      which marks a variable as intentionally left uninitialized for 
> >>      performance reasons;
> >>   ** add complete test cases.
> >> 
> >> 
> >> Please let me know if you have any objection to my current decision to 
> >> implement approach D. 
> > 
> > Did you do any analysis on how stack usage and code size are changed 
> > with approach D?
> 
> I did the code size comparison (I will provide the data in another 
> email). According to this data, D works better than A in general. (This 
> is actually a surprise to me.)
> 
> But I have not measured the stack usage. I am not sure how to collect 
> the stack usage data; do you have any suggestions?

There is -fstack-usage, which you could use, and then of course you
can watch the stack segment at runtime.  I'm mostly concerned about
stack-limited "processes" such as the linux kernel which I think
is a primary target of your work.
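
To compare the -fstack-usage output of two builds (say, without the
option and with approach D), something like the following could work.
It is only a minimal sketch: it assumes each line of a .su file ends
with a byte count followed by a qualifier field (static, dynamic,
bounded), and the helper names parse_su and compare are made up for
illustration.

```python
def parse_su(text):
    """Parse the text of one .su file into {function: (bytes, qualifiers)}.

    Assumption: each non-empty line ends with a byte count and a
    qualifier field; everything before them is the function name.
    """
    usage = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so names that contain spaces stay intact.
        name, size, qualifiers = line.rsplit(None, 2)
        usage[name] = (int(size), qualifiers)
    return usage


def compare(base, other):
    """Return (function, base_bytes, other_bytes) for functions present
    in both builds whose stack usage differs, largest growth first."""
    rows = []
    for name in base.keys() & other.keys():
        base_bytes, other_bytes = base[name][0], other[name][0]
        if base_bytes != other_bytes:
            rows.append((name, base_bytes, other_bytes))
    # Sort by growth (other - base) descending, then by name for stability.
    rows.sort(key=lambda r: (r[1] - r[2], r[0]))
    return rows
```

Feeding it the .su files of the same translation unit from both
builds lists only the functions whose frame size changed.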

Richard.

> 
> > How does compile-time behave (we could gobble up
> > lots of .DEFERRED_INIT calls I guess)?
> I can collect this data too and report it later.
> 
> Thanks.
> 
> Qing
> > 
> > Richard.
> > 
> >> Thanks a lot for your help.
> >> 
> >> Qing
> >> 
> >> 
> >>> On Jan 5, 2021, at 1:05 PM, Qing Zhao via Gcc-patches 
> >>> <gcc-patches@gcc.gnu.org> wrote:
> >>> 
> >>> Hi,
> >>> 
> >>> This is an update for our previous discussion. 
> >>> 
> >>> 1. I implemented the following two different implementations in the 
> >>> latest upstream gcc:
> >>> 
> >>> A. Adding real initialization during gimplification, without maintaining 
> >>> the uninitialized warnings.
> >>> 
> >>> D. Adding calls to .DEFERRED_INIT during gimplification, expanding 
> >>> .DEFERRED_INIT to real initialization during expand, and adjusting 
> >>> the uninitialized pass to handle the new references to 
> >>> “.DEFERRED_INIT”.
> >>> 
> >>> Note that in this initial implementation:
> >>>   ** I ONLY implemented -ftrivial-auto-var-init=zero; the implementation 
> >>>      of -ftrivial-auto-var-init=pattern is not done yet. Therefore, the 
> >>>      performance data is only for -ftrivial-auto-var-init=zero. 
> >>> 
> >>>   ** I added a temporary option, -fauto-var-init-approach=A|B|C|D, to 
> >>>      choose implementation A or D for the runtime performance study.
> >>>   ** I didn’t finish the uninitialized warnings maintenance work for D. 
> >>>      (That might take more time than I expected.) 
> >>> 
> >>> 2. I collected runtime data for CPU2017 on an x86 machine with this new 
> >>> gcc for the following 3 cases:
> >>> 
> >>> no: default. (-g -O2 -march=native )
> >>> A:  default +  -ftrivial-auto-var-init=zero -fauto-var-init-approach=A 
> >>> D:  default +  -ftrivial-auto-var-init=zero -fauto-var-init-approach=D 
> >>> 
> >>> I then computed the slowdown for both A and D as follows:
> >>> 
> >>> benchmark          A / no   D / no
> >>> 
> >>> 500.perlbench_r     1.25%    1.25%
> >>> 502.gcc_r           0.68%    1.80%
> >>> 505.mcf_r           0.68%    0.14%
> >>> 520.omnetpp_r       4.83%    4.68%
> >>> 523.xalancbmk_r     0.18%    1.96%
> >>> 525.x264_r          1.55%    2.07%
> >>> 531.deepsjeng_r    11.57%   11.85%
> >>> 541.leela_r         0.64%    0.80%
> >>> 557.xz_r           -0.41%   -0.41%
> >>> 
> >>> 507.cactuBSSN_r     0.44%    0.44%
> >>> 508.namd_r          0.34%    0.34%
> >>> 510.parest_r        0.17%    0.25%
> >>> 511.povray_r       56.57%   57.27%
> >>> 519.lbm_r           0.00%    0.00%
> >>> 521.wrf_r          -0.28%   -0.37%
> >>> 526.blender_r      16.96%   17.71%
> >>> 527.cam4_r          0.70%    0.53%
> >>> 538.imagick_r       2.40%    2.40%
> >>> 544.nab_r           0.00%   -0.65%
> >>> 
> >>> avg                 5.17%    5.37%
> >>> 
> >>> From the above data, we can see that, in general, the runtime 
> >>> slowdowns of implementations A and D are similar for individual 
> >>> benchmarks.
> >>> 
> >>> Several benchmarks show a significant slowdown from the newly added 
> >>> initialization with both A and D, for example 511.povray_r, 
> >>> 526.blender_r, and 531.deepsjeng_r. I will study in more detail 
> >>> which new initializations introduced such a slowdown. 
> >>> 
> >>> From the study so far, I think that approach D should be good 
> >>> enough for our final implementation. 
> >>> So, I will try to finish approach D with the following remaining work:
> >>> 
> >>>     ** complete the implementation of -ftrivial-auto-var-init=pattern;
> >>>     ** complete the implementation of uninitialized warnings maintenance 
> >>> work for D. 
> >>> 
> >>> 
> >>> Let me know if you have any comments and suggestions on my current and 
> >>> future work.
> >>> 
> >>> Thanks a lot for your help.
> >>> 
> >>> Qing
> >>> 
> >>>> On Dec 9, 2020, at 10:18 AM, Qing Zhao via Gcc-patches 
> >>>> <gcc-patches@gcc.gnu.org> wrote:
> >>>> 
> >>>> The following are the approaches I will implement and compare:
> >>>> 
> >>>> Our final goal is to keep the uninitialized warning and minimize the 
> >>>> run-time performance cost.
> >>>> 
> >>>> A. Adding real initialization during gimplification, without 
> >>>> maintaining the uninitialized warnings.
> >>>> B. Adding real initialization during gimplification, marking it with 
> >>>> “artificial_init”. 
> >>>>   Adjusting the uninitialized pass, maintaining the annotation, and 
> >>>>   making sure the real init is not deleted from the fake init. 
> >>>> C. Marking the DECL for an uninitialized auto variable as 
> >>>> “no_explicit_init” during gimplification,
> >>>>    maintaining this “no_explicit_init” bit until after 
> >>>> pass_late_warn_uninitialized, or until pass_expand, 
> >>>>    then adding real initialization for all DECLs that are marked with 
> >>>> “no_explicit_init”.
> >>>> D. Adding .DEFERRED_INIT during gimplification, expanding 
> >>>> .DEFERRED_INIT to real initialization during expand, and adjusting 
> >>>> the uninitialized pass to handle the new references to 
> >>>> “.DEFERRED_INIT”.
> >>>> 
> >>>> 
> >>>> Of the above, approach A will be the one that has the minimum run-time 
> >>>> cost, and will be the baseline for the performance comparison. 
> >>>> 
> >>>> I will implement approach D next. It is expected to have the most 
> >>>> run-time overhead among the above list, but its implementation should 
> >>>> be the cleanest among B, C, and D. Let’s see how much more performance 
> >>>> overhead this approach has. If the data is good, maybe we can avoid 
> >>>> the effort of implementing B and C. 
> >>>> 
> >>>> If the performance of D is not good, I will implement B or C at that 
> >>>> time.
> >>>> 
> >>>> Let me know if you have any comments or suggestions.
> >>>> 
> >>>> Thanks.
> >>>> 
> >>>> Qing
> >>> 
> >> 
> >> 
> > 
> > -- 
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
