On Wed, 2024-10-02 at 14:14 -0700, Andi Kleen wrote:
> From: Andi Kleen <a...@gcc.gnu.org>
> 
> Time vars normally use times(2) to get the user/sys/wall time, which
> is always a
> system call. I don't think the system time is very useful because
> most overhead
> is in user time. If we only use the wall (or monotonic) time modern
> OS have an
> optimized path to get it directly from a CPU instruction like RDTSC
> without system call, which is much faster.
> 
> Comparing the overhead with tramp3d:
> 
>   ./gcc/cc1plus -quiet  ../tsrc/tramp3d-v4.i ran
>     1.03 ± 0.00 times faster than ./gcc/cc1plus -quiet -ftime-report-
> wall ../tsrc/tramp3d-v4.i
>     1.18 ± 0.00 times faster than ./gcc/cc1plus -quiet -ftime-report
> ../tsrc/tramp3d-v4.i
> 
> -ftime-report costs 18% (excluding the output), while -ftime-report-
> wall
> only costs 3%, so is nearly free. So it would be feasible for some
> build
> system to always enable it and break down the build time into passes.
> 
> The drawback is that if there is context switching with other
> programs
> the time will be overestimated, however for the common case that the
> system is not oversubscribed it is more accurate because each
> measurement has less overhead.
> 
> Add a -ftime-report-wall option. It actually uses the POSIX monotonic
> time,
> so strictly it's not wall clock, but it's still a reasonable name.
> 
> Bootstrapped on x86_64-linux with full test suite run.

Note that if the user requests SARIF output e.g. with
  -fdiagnostics-format=sarif-stderr
then any timevar data from -ftime-report is written in JSON form as
part of the SARIF, rather than in text form to stderr (see
75d623946d4b6ea80a777b789b116d4b4a2298dc).

I see that the proposed patch leaves the user and sys stats as zero,
and conditionalizes what's printed for text output as part of
timer::print.  Should it also do something similar in
make_json_for_timevar_time_def for the json output, and not add the
properties for "user" and "sys" if the data hasn't been gathered?

Hope I'm reading the patch correctly.

Thanks
Dave

> 
> gcc/ChangeLog:
> 
>       * common.opt (ftime-report-wall): Add.
>       * common.opt.urls: Regenerate.
>       * doc/invoke.texi: (ftime-report-wall): Document
>       * gcc.cc (try_generate_repro): Check for -ftime-report-wall.
>       * timevar.cc (get_time): Use clock_gettime if enabled.
>       (timer::print): Print only wall time for time_report_wall.
>       * toplev.cc (toplev::start_timevars): Check for
> time_report_wall.
> 
> gcc/testsuite/ChangeLog:
> 
>       * g++.dg/ext/timevar3.C: New test.
> ---
>  gcc/common.opt                      |  4 ++++
>  gcc/common.opt.urls                 |  3 +++
>  gcc/doc/invoke.texi                 |  7 +++++++
>  gcc/gcc.cc                          |  3 ++-
>  gcc/testsuite/g++.dg/ext/timevar3.C | 14 +++++++++++++
>  gcc/timevar.cc                      | 31 +++++++++++++++++++++++----
> --
>  gcc/toplev.cc                       |  3 ++-
>  7 files changed, 57 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ext/timevar3.C
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index d270e524ff45..e9fb15e28d80 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3010,6 +3010,10 @@ ftime-report
>  Common Var(time_report)
>  Report the time taken by each compiler pass.
>  
> +ftime-report-wall
> +Common Var(time_report_wall)
> +Report the wall time taken by each compiler.
> +
>  ftime-report-details
>  Common Var(time_report_details)
>  Record times taken by sub-phases separately.
> diff --git a/gcc/common.opt.urls b/gcc/common.opt.urls
> index e31736cd9945..6e79a8f9390b 100644
> --- a/gcc/common.opt.urls
> +++ b/gcc/common.opt.urls
> @@ -1378,6 +1378,9 @@ UrlSuffix(gcc/Optimize-Options.html#index-
> fthread-jumps)
>  ftime-report
>  UrlSuffix(gcc/Developer-Options.html#index-ftime-report)
>  
> +ftime-report-wall
> +UrlSuffix(gcc/Developer-Options.html#index-ftime-report-wall)
> +
>  ftime-report-details
>  UrlSuffix(gcc/Developer-Options.html#index-ftime-report-details)
>  
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e199522f62c7..80cb355f5d79 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -784,6 +784,7 @@ Objective-C and Objective-C++ Dialects}.
>  -frandom-seed=@var{string}  -fsched-verbose=@var{n}
>  -fsel-sched-verbose  -fsel-sched-dump-cfg  -fsel-sched-pipelining-
> verbose
>  -fstats  -fstack-usage  -ftime-report  -ftime-report-details
> +-ftime-report-wall
>  -fvar-tracking-assignments-toggle  -gtoggle
>  -print-file-name=@var{library}  -print-libgcc-file-name
>  -print-multi-directory  -print-multi-lib  -print-multi-os-directory
> @@ -21026,6 +21027,12 @@ slightly different place within the
> compiler.
>  @item -ftime-report-details
>  Record the time consumed by infrastructure parts separately for each
> pass.
>  
> +@opindex ftime-report-wall
> +@item -ftime-report-wall
> +Report statistics about compiler pass time consumpion, but only
> using wall
> +time.  This is faster than @option{-ftime-report}, but can be more
> +influenced by background jobs.
> +
>  @opindex fira-verbose
>  @item -fira-verbose=@var{n}
>  Control the verbosity of the dump file for the integrated register
> allocator.
> diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> index 16fed46fb35f..8d3046eb7874 100644
> --- a/gcc/gcc.cc
> +++ b/gcc/gcc.cc
> @@ -7964,7 +7964,8 @@ try_generate_repro (const char **argv)
>         it might varry between invocations.  */
>      else if (! strcmp (argv[nargs], "-quiet"))
>        quiet = 1;
> -    else if (! strcmp (argv[nargs], "-ftime-report"))
> +    else if (! strcmp (argv[nargs], "-ftime-report")
> +                || ! strcmp (argv[nargs], "-ftime-report-wall"))
>        return;
>  
>    if (out_arg == -1 || !quiet)
> diff --git a/gcc/testsuite/g++.dg/ext/timevar3.C
> b/gcc/testsuite/g++.dg/ext/timevar3.C
> new file mode 100644
> index 000000000000..b003f37f9654
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/timevar3.C
> @@ -0,0 +1,14 @@
> +// PR c++/52248
> +// { dg-options "-ftime-report-wall" }
> +// { dg-allow-blank-lines-in-output 1 }
> +// { dg-prune-output "Time variable" }
> +// { dg-prune-output "k" }
> +// { dg-prune-output " 0 " }
> +// { dg-prune-output "checks" }
> +// { dg-prune-output "\[0-9\]+%" }
> +
> +void
> +foo ()
> +{
> +  goto lab;                  // { dg-error "not defined" }
> +}
> diff --git a/gcc/timevar.cc b/gcc/timevar.cc
> index 68bcf44864f9..bf788f735621 100644
> --- a/gcc/timevar.cc
> +++ b/gcc/timevar.cc
> @@ -243,6 +243,15 @@ get_time (struct timevar_time_def *now)
>    now->ggc_mem = timevar_ggc_mem_total;
>  
>    {
> +#if _POSIX_C_SOURCE >= 199309L
> +    if (time_report_wall)
> +      {
> +     struct timespec ts;
> +     clock_gettime (CLOCK_MONOTONIC, &ts);
> +     now->wall = ts.tv_sec * 1000000000 + ts.tv_nsec;
> +     return;
> +      }
> +#endif
>  #ifdef USE_TIMES
>      struct tms tms;
>      now->wall = times (&tms)  * ticks_to_nanosec;
> @@ -682,14 +691,16 @@ timer::print_row (FILE *fp,
>  
>  #ifdef HAVE_USER_TIME
>    /* Print user-mode time for this process.  */
> -  fprintf (fp, "%7.2f (%3.0f%%)",
> +  if (!time_report_wall)
> +    fprintf (fp, "%7.2f (%3.0f%%)",
>          nanosec_to_floating_sec (elapsed.user),
>          percent_of (total->user, elapsed.user));
>  #endif /* HAVE_USER_TIME */
>  
>  #ifdef HAVE_SYS_TIME
> +  if (!time_report_wall)
>    /* Print system-mode time for this process.  */
> -  fprintf (fp, "%7.2f (%3.0f%%)",
> +    fprintf (fp, "%7.2f (%3.0f%%)",
>          nanosec_to_floating_sec (elapsed.sys),
>          percent_of (total->sys, elapsed.sys));
>  #endif /* HAVE_SYS_TIME */
> @@ -754,7 +765,10 @@ timer::print (FILE *fp)
>       TIMEVAR.  */
>    m_start_time = now;
>  
> -  fprintf (fp, "\n%-35s%16s%14s%14s%14s\n", "Time variable", "usr",
> "sys",
> +  if (time_report_wall)
> +    fprintf (fp, "\n%-35s%16s%14s\n", "Time variable", "wall",
> "GGC");
> +  else
> +    fprintf (fp, "\n%-35s%16s%14s%14s%14s\n", "Time variable",
> "usr", "sys",
>          "wall", "GGC");
>    if (m_jit_client_items)
>      fputs ("GCC items:\n", fp);
> @@ -810,13 +824,18 @@ timer::print (FILE *fp)
>    /* Print total time.  */
>    fprintf (fp, " %-35s:", "TOTAL");
>  #ifdef HAVE_USER_TIME
> -  fprintf (fp, "%7.2f      ", nanosec_to_floating_sec (total-
> >user));
> +  if (!time_report_wall)
> +    fprintf (fp, "%7.2f      ", nanosec_to_floating_sec (total-
> >user));
>  #endif
>  #ifdef HAVE_SYS_TIME
> -  fprintf (fp, "%8.2f      ", nanosec_to_floating_sec (total->sys));
> +  if (!time_report_wall)
> +    fprintf (fp, "%8.2f      ", nanosec_to_floating_sec (total-
> >sys));
>  #endif
>  #ifdef HAVE_WALL_TIME
> -  fprintf (fp, "%8.2f      ", nanosec_to_floating_sec (total-
> >wall));
> +  if (!time_report_wall)
> +    fprintf (fp, "%8.2f      ", nanosec_to_floating_sec (total-
> >wall));
> +  else
> +    fprintf (fp, "%7.2f      ", nanosec_to_floating_sec (total-
> >wall));
>  #endif
>    fprintf (fp, PRsa (7) "\n", SIZE_AMOUNT (total->ggc_mem));
>  
> diff --git a/gcc/toplev.cc b/gcc/toplev.cc
> index 5df59b79c803..87fa3f4c8833 100644
> --- a/gcc/toplev.cc
> +++ b/gcc/toplev.cc
> @@ -2247,7 +2247,8 @@ toplev::~toplev ()
>  void
>  toplev::start_timevars ()
>  {
> -  if (time_report || !quiet_flag  || flag_detailed_statistics)
> +  if (time_report || time_report_wall || !quiet_flag
> +               || flag_detailed_statistics)
>      timevar_init ();
>  
>    timevar_start (TV_TOTAL);

Reply via email to