On Thu, 2018-06-28 at 13:29 +0200, Richard Biener wrote:
> On Tue, Jun 26, 2018 at 3:54 PM David Malcolm <dmalc...@redhat.com>
> wrote:
> > 
> > On Mon, 2018-06-25 at 15:34 +0200, Richard Biener wrote:
> > > On Wed, Jun 20, 2018 at 6:34 PM David Malcolm <dmalc...@redhat.co
> > > m>
> > > wrote:
> > > > 
> > > > Here's v3 of the patch (one big patch this time, rather than a
> > > > kit).
> > > > 
> > > > Like the v2 patch kit, this patch reuses the existing dump API,
> > > > rather than inventing its own.
> > > > 
> > > > Specifically, it uses the dump_* functions in dumpfile.h that
> > > > don't
> > > > take a FILE *, the ones that implicitly write to dump_file
> > > > and/or
> > > > alt_dump_file.  I needed a name for them, so I've taken to
> > > > calling
> > > > them the "structured dump API" (better name ideas welcome).
> > > > 
> > > > v3 eliminates v2's optinfo_guard class, instead using
> > > > "dump_*_loc"
> > > > calls as delimiters when consolidating "dump_*" calls.  There's
> > > > a
> > > > new dump_context class which has responsibility for
> > > > consolidating
> > > > them into optimization records.
> > > > 
> > > > The dump_*_loc calls now capture more than just a location_t:
> > > > they
> > > > capture the profile_count and the location in GCC's own sources
> > > > where
> > > > the dump is being emitted from.
> > > > 
> > > > This works by introducing a new "dump_location_t" class as the
> > > > argument of those dump_*_loc calls.  The dump_location_t can
> > > > be constructed from a gimple * or from an rtx_insn *, so that
> > > > rather than writing:
> > > > 
> > > >   dump_printf_loc (MSG_NOTE, gimple_location (stmt),
> > > >                    "some message: %i", 42);
> > > > 
> > > > you can write:
> > > > 
> > > >   dump_printf_loc (MSG_NOTE, stmt,
> > > >                    "some message: %i", 42);
> > > > 
> > > > and the dump_location_t constructor will grab the location_t
> > > > and
> > > > profile_count of stmt, and the location of the
> > > > "dump_printf_loc"
> > > > callsite (and gracefully handle "stmt" being NULL).
> > > > 
> > > > Earlier versions of the patch captured the location of the
> > > > dump_*_loc call via preprocessor hacks, or didn't work
> > > > properly;
> > > > this version of the patch works more cleanly: internally,
> > > > dump_location_t is split into two new classes:
> > > >   * dump_user_location_t: the location_t and profile_count
> > > > within
> > > >     the *user's code*, and
> > > >   * dump_impl_location_t: the __builtin_FILE/LINE/FUNCTION
> > > > within
> > > >     the *implementation* code (i.e. GCC or a plugin), captured
> > > >     "automagically" via default params
> > > > 
> > > > These classes are sometimes used elsewhere in the code.  For
> > > > example, "vect_location" becomes a dump_user_location_t
> > > > (location_t and profile_count), so that in e.g:
> > > > 
> > > >   vect_location = find_loop_location (loop);
> > > > 
> > > > it's capturing the location_t and profile_count, and then when
> > > > it's used here:
> > > > 
> > > >   dump_printf_loc (MSG_NOTE, vect_location, "foo");
> > > > 
> > > > the dump_location_t is constructed from the vect_location
> > > > plus the dump_impl_location_t at that callsite.
> > > > 
> > > > In contrast, loop-unroll.c's report_unroll's "locus" param
> > > > becomes a dump_location_t: we're interested in where it was
> > > > called from, not in the locations of the various dump_*_loc
> > > > calls
> > > > within it.
> > > > 
> > > > Previous versions of the patch captured a gimple *, and needed
> > > > GTY markers; in this patch, the dump_user_location_t is now
> > > > just a
> > > > location_t and a profile_count.
> > > > 
> > > > The v2 patch added an overload for dump_printf_loc so that you
> > > > could pass in either a location_t, or the new type; this
> > > > version
> > > > of the patch eliminates that: they all now take
> > > > dump_location_t.
> > > > 
> > > > Doing so required adding support for rtx_insn *, so that one
> > > > can
> > > > write this kind of thing in RTL passes:
> > > > 
> > > >   dump_printf_loc (MSG_NOTE, insn, "foo");
> > > > 
> > > > One knock-on effect is that get_loop_location now returns a
> > > > dump_user_location_t rather than a location_t, so that it has
> > > > hotness information.
> > > > 
> > > > Richi: would you like me to split out this location-handling
> > > > code into a separate patch?  (It's kind of redundant without
> > > > adding the remarks and optimization records work, but if that's
> > > > easier I can do it)
> > > 
> > > I think that would be easier because it doesn't require the JSON
> > > stuff and so I'll happily approve it.
> > > 
> > > Thus - trying to review that bits (and sorry for the delay).
> > > 
> > > +  location_t srcloc = loc.get_location_t ();
> > > +
> > >    if (dump_file && (dump_kind & pflags))
> > >      {
> > > -      dump_loc (dump_kind, dump_file, loc);
> > > +      dump_loc (dump_kind, dump_file, srcloc);
> > >        print_gimple_stmt (dump_file, gs, spc, dump_flags |
> > > extra_dump_flags);
> > >      }
> > > 
> > >    if (alt_dump_file && (dump_kind & alt_flags))
> > >      {
> > > -      dump_loc (dump_kind, alt_dump_file, loc);
> > > +      dump_loc (dump_kind, alt_dump_file, srcloc);
> > >        print_gimple_stmt (alt_dump_file, gs, spc, dump_flags |
> > > extra_dump_flags);
> > >      }
> > > +
> > > +  if (optinfo_enabled_p ())
> > > +    {
> > > +      optinfo &info = begin_next_optinfo (loc);
> > > +      info.handle_dump_file_kind (dump_kind);
> > > +      info.add_stmt (gs, extra_dump_flags);
> > > +    }
> > > 
> > > seeing this in multiple places.  I seem to remember that
> > > dump_file / alt_dump_file was suposed to handle dumping
> > > into two locations - a dump file and optinfo (or stdout).  This
> > > looks
> > > like the optinfo "stream" is even more separate.  Could that
> > > obsolete the alt_dump_file stream?  I'd need to review existing
> > > stuff
> > > in more detail to answer but maybe you already know from recently
> > > digging into this.
> > 
> > Possibly.  I attempted this in v1 of the patch, but it was mixed in
> > with
> > a bunch of other stuff.  I'll have another go at doing this.
> > 
> > > Oh, and all the if (optinfo_enable_p ()) stuff is for the
> > > followup
> > > then, right?
> > 
> > Yes.
> > 
> > > I like the boiler-plate changes to dump_* using stuff a lot, so
> > > the
> > > infrastructure to do that (the location wrapping) and these
> > > boiler-
> > > plate
> > > changes are pre-approved if split out.
> > 
> > Thanks.  I split out the location wrapping, and have committed it
> > to
> > trunk (r262149).  It's not clear to me exactly what other parts you
> > liked,
> > so I'm going to try to split out more of the non-JSON bits in the
> > hope that some parts are good enough as-is, and I'll post them for
> > review
> > as followups.
> > 
> > For reference, here's what I've committed (I added some obvious
> > changes
> > to doc/optinfo.texi).
> > 
> > > I think the *_REMARK stuff should get attention of the respective
> > > maintainers - not sure what the difference between NOTE and
> > > REMARK
> > > is ;)
> > 
> > Me neither :)  I think it might come down to "this is purely a
> > debugging
> > message for a pass maintainer" (MSG_NOTE) vs "this is a high-level
> > thing
> > that an advanced user want to see" (MSG_REMARK???).
> > 
> > One idea that occurred to me: are we overusing dump_flags_t?  It's
> > a
> > mixture of TDF_* bitfields (used by our internal dumps) plus MSG_*
> > bitfields (used with -fopt-info).  IIRC the only TDF_ bitfield
> > that's
> > ever used with the MSG_* bitfields is TDF_DETAILS.  Might it make
> > sense
> > to split out the MSG_* bitfields into a different type
> > (dump_level_t???)
> > to reinforce this split?  (and thus the dump_* entrypoints that
> > don't take
> > a FILE * would take this new type).  I'm not sure if this is a good
> > idea
> > or not.
> 
> Making it even more complicated doesn't make it easier to use it
> "correctly".  So I'd rather try to simplify it.  How passes use
> TDF_DETAILS vs. non-details is already highly inconsistent.  This
> is why I liked the original optinfo work because it somehow made
> user-interesting vs. developer-interesting with the same API
> (and to the same dump-file).  Just that MSG_NOTE is exposed to
> users while I think it should be developer-only...
> 
> IMHO the TDF_ stuff should control things at the stmt/tree/BB level,
> thus be IL specific flags while the MSG_ stuff should control
> pass-specific dumping detail.

You mention user-interesting vs developer-interesting, and whether it
would be the same API and/or the same dump-file.

I've posted a bunch of patches here, some small, some large, trying
various different appoaches, coming at the problem from at least
two directions, so maybe it's worth discussing the overall direction
here (with some ASCII art!)

* what is the status quo?
* what do we want to achieve?
* how do we get there?

Status quo:
* We have the "dump_*(MSG_*)" API which hides the destination of
  where we're dumping to (currently dump_file and/or alt_file_file,
  conditionalized via flags).

  dump_* (MSG_*) ----> dumpfile.c --+--> dump_file
                                    |
                                    `--> alt_dump_file

* As of r262149 (the dump_location_t commit) dumpfile.c receives the
  hotness and emission location of each dump_*_loc call, but currently
  it does nothing with that information.
* The messages that we emit through the "dump_*(MSG_*)" API and their
  "levels" are currently rather haphazard.  Some are close to being
  suitable for end-users, but most seem never intended to be
  end-user-facing.  For reference:
    grep -nH -e MSG_MISSED_OPTIMIZATION *.c *.cc | wc -l
    452
    grep -nH -e MSG_NOTE *.c *.cc | wc -l
    551
    grep -nH -e MSG_OPTIMIZED_LOCATIONS *.c *.cc | wc -l
    39
  (though some of these are support code e.g. in dumpfile.c, rather
  than uses).
* The API builds up messages programatically, which is hostile to
  i18n.  The API isn't marked as needing i18n for its strings (which
  IIRC is via using "gmsgid" as a param name being special-cased in
  the gettext toolchain).

What do we want to achieve?
* I want end-users to be able to enable high-level dump messages about
  optimizations, and for those messages to be decipherable without
  reading GCC sources e.g. the opt_problem idea posted here:
  * "[PATCH] [RFC] Higher-level reporting of vectorization problems"
    * https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01462.html
  Presumably such messages would need to be internationalized.  Many
  other messages are really internal only, and would be a burden to
  impose on our translators.
* I want to support additional destinations for any/all dump messages
  beyond just the dump file and -fopt-info:
  * putting them through the diagnostics subsystem (as "remarks")
  * storing them to disk in a machine-readable format, so that
    3rd-party tools can present useful visualizations of how the
    user's code is being optimized, e.g. prioritized by hotness
* I want to preserve the existing API and most existing uses of it
  (to minimize churn and thus keep our lives easier)

The patches I've been posting so far add additional dump destinations,
like this:

  dump_* (MSG_*) >---> dumpfile.c --+--> dump_file
                                    |
                                    +--> alt_dump_file
                                    |
                                    +--> diagnostic "remarks"
                                    |
                                    +--> "optimization records"
                                         (saved to disk)

I believe you favor an approach of "MSG_NOTE is internal, whereas
MSG_MISSED_OPTIMIZATION and MSG_OPTIMIZED_LOCATIONS are for end users".

I think the "should this dump message be internationalized?" issue
introduces a natural separation between dump messages aimed at
end-users vs purely "internal" dumps; this suggests to me that
we need a new API for such dump messages, designed to support i18n,
and marked as such for gettext, so I favor something like this:

  SOURCES                                DESTINATIONS
  dump_* (MSG_*) >---> dumpfile.c --+--> dump_file
                   |                |
  new dump API -->-+                +--> alt_dump_file
                                    |
                                    +--> diagnostic "remarks"
                                    |
                                    +--> "optimization records"
                                         (saved to disk)

(I'm deliberately being vague about what this i18n-enabled dump API
might look like)

That said, I have a version of the patch kit which does just this:

  dump_* (MSG_*) ----> dumpfile.c --+--> dump_file
                                    |
                                    +--> alt_dump_file
                                    |
                                    +--> diagnostic "remarks"

i.e. generalizing the dump destinations (by adding optinfo internally),
but without requiring the JSON support or touching the dump API . Would
that be suitable as a next step?

[...snip...]

Thoughts?

Thanks
Dave

Reply via email to