On Fri, 2018-06-29 at 09:09 +0200, Richard Biener wrote:
> On Thu, Jun 28, 2018 at 4:29 PM David Malcolm <dmalc...@redhat.com>
> wrote:
> > 
> > On Thu, 2018-06-28 at 13:29 +0200, Richard Biener wrote:
> > > On Tue, Jun 26, 2018 at 3:54 PM David Malcolm <dmalc...@redhat.co
> > > m>
> > > wrote:
> > > > 
> > > > On Mon, 2018-06-25 at 15:34 +0200, Richard Biener wrote:
> > > > > On Wed, Jun 20, 2018 at 6:34 PM David Malcolm <dmalcolm@redha
> > > > > t.co
> > > > > m>
> > > > > wrote:
> > > > > > 
> > > > > > Here's v3 of the patch (one big patch this time, rather
> > > > > > than a
> > > > > > kit).
> > > > > > 
> > > > > > Like the v2 patch kit, this patch reuses the existing dump
> > > > > > API,
> > > > > > rather than inventing its own.
> > > > > > 
> > > > > > Specifically, it uses the dump_* functions in dumpfile.h
> > > > > > that
> > > > > > don't
> > > > > > take a FILE *, the ones that implicitly write to dump_file
> > > > > > and/or
> > > > > > alt_dump_file.  I needed a name for them, so I've taken to
> > > > > > calling
> > > > > > them the "structured dump API" (better name ideas welcome).
> > > > > > 
> > > > > > v3 eliminates v2's optinfo_guard class, instead using
> > > > > > "dump_*_loc"
> > > > > > calls as delimiters when consolidating "dump_*"
> > > > > > calls.  There's
> > > > > > a
> > > > > > new dump_context class which has responsibility for
> > > > > > consolidating
> > > > > > them into optimization records.
> > > > > > 
> > > > > > The dump_*_loc calls now capture more than just a
> > > > > > location_t:
> > > > > > they
> > > > > > capture the profile_count and the location in GCC's own
> > > > > > sources
> > > > > > where
> > > > > > the dump is being emitted from.
> > > > > > 
> > > > > > This works by introducing a new "dump_location_t" class as
> > > > > > the
> > > > > > argument of those dump_*_loc calls.  The dump_location_t
> > > > > > can
> > > > > > be constructed from a gimple * or from an rtx_insn *, so
> > > > > > that
> > > > > > rather than writing:
> > > > > > 
> > > > > >   dump_printf_loc (MSG_NOTE, gimple_location (stmt),
> > > > > >                    "some message: %i", 42);
> > > > > > 
> > > > > > you can write:
> > > > > > 
> > > > > >   dump_printf_loc (MSG_NOTE, stmt,
> > > > > >                    "some message: %i", 42);
> > > > > > 
> > > > > > and the dump_location_t constructor will grab the
> > > > > > location_t
> > > > > > and
> > > > > > profile_count of stmt, and the location of the
> > > > > > "dump_printf_loc"
> > > > > > callsite (and gracefully handle "stmt" being NULL).
> > > > > > 
> > > > > > Earlier versions of the patch captured the location of the
> > > > > > dump_*_loc call via preprocessor hacks, or didn't work
> > > > > > properly;
> > > > > > this version of the patch works more cleanly: internally,
> > > > > > dump_location_t is split into two new classes:
> > > > > >   * dump_user_location_t: the location_t and profile_count
> > > > > > within
> > > > > >     the *user's code*, and
> > > > > >   * dump_impl_location_t: the __builtin_FILE/LINE/FUNCTION
> > > > > > within
> > > > > >     the *implementation* code (i.e. GCC or a plugin),
> > > > > > captured
> > > > > >     "automagically" via default params
> > > > > > 
> > > > > > These classes are sometimes used elsewhere in the
> > > > > > code.  For
> > > > > > example, "vect_location" becomes a dump_user_location_t
> > > > > > (location_t and profile_count), so that in e.g:
> > > > > > 
> > > > > >   vect_location = find_loop_location (loop);
> > > > > > 
> > > > > > it's capturing the location_t and profile_count, and then
> > > > > > when
> > > > > > it's used here:
> > > > > > 
> > > > > >   dump_printf_loc (MSG_NOTE, vect_location, "foo");
> > > > > > 
> > > > > > the dump_location_t is constructed from the vect_location
> > > > > > plus the dump_impl_location_t at that callsite.
> > > > > > 
> > > > > > In contrast, loop-unroll.c's report_unroll's "locus" param
> > > > > > becomes a dump_location_t: we're interested in where it was
> > > > > > called from, not in the locations of the various dump_*_loc
> > > > > > calls
> > > > > > within it.
> > > > > > 
> > > > > > Previous versions of the patch captured a gimple *, and
> > > > > > needed
> > > > > > GTY markers; in this patch, the dump_user_location_t is now
> > > > > > just a
> > > > > > location_t and a profile_count.
> > > > > > 
> > > > > > The v2 patch added an overload for dump_printf_loc so that
> > > > > > you
> > > > > > could pass in either a location_t, or the new type; this
> > > > > > version
> > > > > > of the patch eliminates that: they all now take
> > > > > > dump_location_t.
> > > > > > 
> > > > > > Doing so required adding support for rtx_insn *, so that
> > > > > > one
> > > > > > can
> > > > > > write this kind of thing in RTL passes:
> > > > > > 
> > > > > >   dump_printf_loc (MSG_NOTE, insn, "foo");
> > > > > > 
> > > > > > One knock-on effect is that get_loop_location now returns a
> > > > > > dump_user_location_t rather than a location_t, so that it
> > > > > > has
> > > > > > hotness information.
> > > > > > 
> > > > > > Richi: would you like me to split out this location-
> > > > > > handling
> > > > > > code into a separate patch?  (It's kind of redundant
> > > > > > without
> > > > > > adding the remarks and optimization records work, but if
> > > > > > that's
> > > > > > easier I can do it)
> > > > > 
> > > > > I think that would be easier because it doesn't require the
> > > > > JSON
> > > > > stuff and so I'll happily approve it.
> > > > > 
> > > > > Thus - trying to review that bits (and sorry for the delay).
> > > > > 
> > > > > +  location_t srcloc = loc.get_location_t ();
> > > > > +
> > > > >    if (dump_file && (dump_kind & pflags))
> > > > >      {
> > > > > -      dump_loc (dump_kind, dump_file, loc);
> > > > > +      dump_loc (dump_kind, dump_file, srcloc);
> > > > >        print_gimple_stmt (dump_file, gs, spc, dump_flags |
> > > > > extra_dump_flags);
> > > > >      }
> > > > > 
> > > > >    if (alt_dump_file && (dump_kind & alt_flags))
> > > > >      {
> > > > > -      dump_loc (dump_kind, alt_dump_file, loc);
> > > > > +      dump_loc (dump_kind, alt_dump_file, srcloc);
> > > > >        print_gimple_stmt (alt_dump_file, gs, spc, dump_flags
> > > > > |
> > > > > extra_dump_flags);
> > > > >      }
> > > > > +
> > > > > +  if (optinfo_enabled_p ())
> > > > > +    {
> > > > > +      optinfo &info = begin_next_optinfo (loc);
> > > > > +      info.handle_dump_file_kind (dump_kind);
> > > > > +      info.add_stmt (gs, extra_dump_flags);
> > > > > +    }
> > > > > 
> > > > > seeing this in multiple places.  I seem to remember that
> > > > > dump_file / alt_dump_file was suposed to handle dumping
> > > > > into two locations - a dump file and optinfo (or
> > > > > stdout).  This
> > > > > looks
> > > > > like the optinfo "stream" is even more separate.  Could that
> > > > > obsolete the alt_dump_file stream?  I'd need to review
> > > > > existing
> > > > > stuff
> > > > > in more detail to answer but maybe you already know from
> > > > > recently
> > > > > digging into this.
> > > > 
> > > > Possibly.  I attempted this in v1 of the patch, but it was
> > > > mixed in
> > > > with
> > > > a bunch of other stuff.  I'll have another go at doing this.
> > > > 
> > > > > Oh, and all the if (optinfo_enable_p ()) stuff is for the
> > > > > followup
> > > > > then, right?
> > > > 
> > > > Yes.
> > > > 
> > > > > I like the boiler-plate changes to dump_* using stuff a lot,
> > > > > so
> > > > > the
> > > > > infrastructure to do that (the location wrapping) and these
> > > > > boiler-
> > > > > plate
> > > > > changes are pre-approved if split out.
> > > > 
> > > > Thanks.  I split out the location wrapping, and have committed
> > > > it
> > > > to
> > > > trunk (r262149).  It's not clear to me exactly what other parts
> > > > you
> > > > liked,
> > > > so I'm going to try to split out more of the non-JSON bits in
> > > > the
> > > > hope that some parts are good enough as-is, and I'll post them
> > > > for
> > > > review
> > > > as followups.
> > > > 
> > > > For reference, here's what I've committed (I added some obvious
> > > > changes
> > > > to doc/optinfo.texi).
> > > > 
> > > > > I think the *_REMARK stuff should get attention of the
> > > > > respective
> > > > > maintainers - not sure what the difference between NOTE and
> > > > > REMARK
> > > > > is ;)
> > > > 
> > > > Me neither :)  I think it might come down to "this is purely a
> > > > debugging
> > > > message for a pass maintainer" (MSG_NOTE) vs "this is a high-
> > > > level
> > > > thing
> > > > that an advanced user want to see" (MSG_REMARK???).
> > > > 
> > > > One idea that occurred to me: are we overusing
> > > > dump_flags_t?  It's
> > > > a
> > > > mixture of TDF_* bitfields (used by our internal dumps) plus
> > > > MSG_*
> > > > bitfields (used with -fopt-info).  IIRC the only TDF_ bitfield
> > > > that's
> > > > ever used with the MSG_* bitfields is TDF_DETAILS.  Might it
> > > > make
> > > > sense
> > > > to split out the MSG_* bitfields into a different type
> > > > (dump_level_t???)
> > > > to reinforce this split?  (and thus the dump_* entrypoints that
> > > > don't take
> > > > a FILE * would take this new type).  I'm not sure if this is a
> > > > good
> > > > idea
> > > > or not.
> > > 
> > > Making it even more complicated doesn't make it easier to use it
> > > "correctly".  So I'd rather try to simplify it.  How passes use
> > > TDF_DETAILS vs. non-details is already highly inconsistent.  This
> > > is why I liked the original optinfo work because it somehow made
> > > user-interesting vs. developer-interesting with the same API
> > > (and to the same dump-file).  Just that MSG_NOTE is exposed to
> > > users while I think it should be developer-only...
> > > 
> > > IMHO the TDF_ stuff should control things at the stmt/tree/BB
> > > level,
> > > thus be IL specific flags while the MSG_ stuff should control
> > > pass-specific dumping detail.
> > 
> > You mention user-interesting vs developer-interesting, and whether
> > it
> > would be the same API and/or the same dump-file.
> > 
> > I've posted a bunch of patches here, some small, some large, trying
> > various different appoaches, coming at the problem from at least
> > two directions, so maybe it's worth discussing the overall
> > direction
> > here (with some ASCII art!)
> > 
> > * what is the status quo?
> > * what do we want to achieve?
> > * how do we get there?
> > 
> > Status quo:
> > * We have the "dump_*(MSG_*)" API which hides the destination of
> >   where we're dumping to (currently dump_file and/or alt_file_file,
> >   conditionalized via flags).
> > 
> >   dump_* (MSG_*) ----> dumpfile.c --+--> dump_file
> >                                     |
> >                                     `--> alt_dump_file
> > 
> > * As of r262149 (the dump_location_t commit) dumpfile.c receives
> > the
> >   hotness and emission location of each dump_*_loc call, but
> > currently
> >   it does nothing with that information.
> > * The messages that we emit through the "dump_*(MSG_*)" API and
> > their
> >   "levels" are currently rather haphazard.  Some are close to being
> >   suitable for end-users, but most seem never intended to be
> >   end-user-facing.  For reference:
> >     grep -nH -e MSG_MISSED_OPTIMIZATION *.c *.cc | wc -l
> >     452
> >     grep -nH -e MSG_NOTE *.c *.cc | wc -l
> >     551
> >     grep -nH -e MSG_OPTIMIZED_LOCATIONS *.c *.cc | wc -l
> >     39
> >   (though some of these are support code e.g. in dumpfile.c, rather
> >   than uses).
> 
> Yep.  I believe MSG_NOTE was the fallout of optinfo introduction and
> my
> request to share the same API with dumping to the dumpfile.  MSG_NOTE
> just received "anything else" ...
> 
> > * The API builds up messages programatically, which is hostile to
> >   i18n.  The API isn't marked as needing i18n for its strings
> > (which
> >   IIRC is via using "gmsgid" as a param name being special-cased in
> >   the gettext toolchain).
> > 
> > What do we want to achieve?
> > * I want end-users to be able to enable high-level dump messages
> > about
> >   optimizations, and for those messages to be decipherable without
> >   reading GCC sources e.g. the opt_problem idea posted here:
> >   * "[PATCH] [RFC] Higher-level reporting of vectorization
> > problems"
> >     * https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01462.html
> >   Presumably such messages would need to be
> > internationalized.  Many
> >   other messages are really internal only, and would be a burden to
> >   impose on our translators.
> > * I want to support additional destinations for any/all dump
> > messages
> >   beyond just the dump file and -fopt-info:
> >   * putting them through the diagnostics subsystem (as "remarks")
> >   * storing them to disk in a machine-readable format, so that
> >     3rd-party tools can present useful visualizations of how the
> >     user's code is being optimized, e.g. prioritized by hotness
> > * I want to preserve the existing API and most existing uses of it
> >   (to minimize churn and thus keep our lives easier)
> > 
> > The patches I've been posting so far add additional dump
> > destinations,
> > like this:
> > 
> >   dump_* (MSG_*) >---> dumpfile.c --+--> dump_file
> >                                     |
> >                                     +--> alt_dump_file
> >                                     |
> >                                     +--> diagnostic "remarks"
> >                                     |
> >                                     +--> "optimization records"
> >                                          (saved to disk)
> > 
> > I believe you favor an approach of "MSG_NOTE is internal, whereas
> > MSG_MISSED_OPTIMIZATION and MSG_OPTIMIZED_LOCATIONS are for end
> > users".
> 
> Yeah, kind of.  You introduced another MSG_ variant 

"MSG_REMARK" above, though I'm not sure I like that name.

> and in the end we might
> need to go that way.  I guess renaming MSG_NOTE to MSG_DEBUG would
> be a step in the direction to reflect reality ;)

I think I want to look more at the "opt_problem" idea about high-level
vectorization messages, and attack the problem from that direction;
maybe that will suggest a good way forward for our existing dump messages.

> > I think the "should this dump message be internationalized?" issue
> > introduces a natural separation between dump messages aimed at
> > end-users vs purely "internal" dumps; this suggests to me that
> > we need a new API for such dump messages, designed to support i18n,
> > and marked as such for gettext, so I favor something like this:
> > 
> >   SOURCES                                DESTINATIONS
> >   dump_* (MSG_*) >---> dumpfile.c --+--> dump_file
> >                    |                |
> >   new dump API -->-+                +--> alt_dump_file
> >                                     |
> >                                     +--> diagnostic "remarks"
> >                                     |
> >                                     +--> "optimization records"
> >                                          (saved to disk)
> > 
> > (I'm deliberately being vague about what this i18n-enabled dump API
> > might look like)
> 
> I know I threw i18n into the discussion - but I'd say we should focus
> on other details first.  I think there's no need to _require_ the
> "debug"
> dumpfile stuff not be translated, but to make it easier for
> translators
> I suppose being able to "amend" the API calls with a "do not
> translate"
> variant would be OK.

I'm thinking that only a small fraction of the dump messages will be
user-facing, so the default would be "do not translate" with the rare
special-case being "do translate".

> So in what way is the dump_* API not suitable for translation
> that the diagnostic machinery (warning/error) does not suffer from?

Consider this example, from "vect_create_data_ref_ptr" (I used this
example way back in the v1 patch kit):

  if (dump_enabled_p ())
     {
       tree dr_base_type = TREE_TYPE (DR_BASE_OBJECT (dr));
       dump_printf_loc (MSG_NOTE, vect_location,
                        "create %s-pointer variable to type: ",
                        get_tree_code_name (TREE_CODE (aggr_type)));
       dump_generic_expr (MSG_NOTE, TDF_SLIM, aggr_type);
       if (TREE_CODE (dr_base_type) == ARRAY_TYPE)
         dump_printf (MSG_NOTE, "  vectorizing an array ref: ");
       else if (TREE_CODE (dr_base_type) == VECTOR_TYPE)
         dump_printf (MSG_NOTE, "  vectorizing a vector ref: ");
       else if (TREE_CODE (dr_base_type) == RECORD_TYPE)
         dump_printf (MSG_NOTE, "  vectorizing a record based array ref: ");
       else
         dump_printf (MSG_NOTE, "  vectorizing a pointer ref: ");
       dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_BASE_OBJECT (dr));
       dump_printf (MSG_NOTE, "\n");
     }

where there are two problems with translation:
(a) the message is built up piecewise, with conditional logic.
(b) there's nothing yet marking the string fragments as needing
    translation

> That is, if we internally make dump_* go through the pretty-printers
> we can handle stuff like quoting or even stmts/expressions.  I think
> that would be useful for dumping to dumpfiles as well.

That could work.  Currently dump_printf* use fprintf internally, but I
don't think we make much use of the codes there.  So we could make it
use a pretty_printer internally.  The core pretty-print.c code doesn't
handle any of the interesting middle-end types we'd want to talk about
(gimple, tree, symtab_node *, etc), so I think to have a custom
middle-end dump pp_format_decoder callback.  The above example could
then look something like:

  if (dump_enabled_p ())
     {
       tree dr_base_type = TREE_TYPE (DR_BASE_OBJECT (dr));
       const char *msgid;
       if (TREE_CODE (dr_base_type) == ARRAY_TYPE)
         msgid = ("create %s-pointer variable to type: %T"
                  "  vectorizing an array ref: %T");
       else if (TREE_CODE (dr_base_type) == VECTOR_TYPE)
         msgid = ("create %s-pointer variable to type: %T"
                  "  vectorizing a vector ref: %T");
       else if (TREE_CODE (dr_base_type) == RECORD_TYPE)
         msgid = ("create %s-pointer variable to type: %T"
                  "  vectorizing a record based array ref: %T");
       else
         msgid = ("create %s-pointer variable to type: %T"
                  "  vectorizing a pointer ref: %T");
       dump_printf_loc (MSG_NOTE, vect_location, msgid,
                        get_tree_code_name (TREE_CODE (aggr_type)));
                        aggr_type, DR_BASE_OBJECT (dr));
       dump_printf (MSG_NOTE, "\n");
     }

where I'm using "%T" as the format code for tree with TDF_SLIM.  The
optinfo could could then "print" these more interesting types by
capturing the metadata (and thus expose it in optimization records).

Though, that said, I don't think that message as-is is something we'd
want end-users to see, and thus not something for translators.

Maybe something like:

extern void dump_userfacing_printf_loc (dump_flags_t,
                                        const dump_location_t &,
                                        const char *gmsgid, ...)
  ATTRIBUTE_DUMP_PRINTF_3;

for the small subset of messages that we'd want to be translated?
(with ATTRIBUTE_DUMP_PRINTF_3 to signify that we have a new family
of format codes).

(I'm not in love with that function name)

> > 
> > That said, I have a version of the patch kit which does just this:
> > 
> >   dump_* (MSG_*) ----> dumpfile.c --+--> dump_file
> >                                     |
> >                                     +--> alt_dump_file
> >                                     |
> >                                     +--> diagnostic "remarks"
> > 
> > i.e. generalizing the dump destinations (by adding optinfo
> > internally),
> > but without requiring the JSON support or touching the dump API .
> > Would
> > that be suitable as a next step?
> 
> Yes.  Thanks for going step-by-step btw, this is really useful in
> simplifying
> the picture.
> 
> Richard.

Thanks.

Here's a reduced version of the patch kit which builds on what's
already in trunk.

Patch 1 adds the basic optinfo machinery, consolidating the dump_*
calls into optinfo objects that can be sent to multiple dump
destinations  - but it doesn't add any destinations.  It captures
"rich" information about what is dumped, tagged with things like
code hotness information, enough to eventually implement optimization
records in a followup.

   dump_* (MSG_*) --> dumpfile.c --+--> (a) dump_file
                                   |
                                   +--> (b) alt_dump_file
                                   |
                                   `--> (c) optinfo

Patch 2 adds in the first destination for them: diagnostic
"remarks" (updated to work on top of patch 1).  It's not perfect
yet (we probably need to be able to filter them using
optinfo_group_t and probably other criteria), and I'm posting it now
to provide more motivation for patch 1, but it's stable enough for
trunk (though a couple of FIXMEs remain).

   dump_* (MSG_*) --> dumpfile.c --+--> (a) dump_file
                                   |
                                   +--> (b) alt_dump_file
                                   |
                                   `--> (c) optinfo
                                            `---> optinfo destinations
                                                  (c.1) diagnostic "remarks"

I have followups to add JSON optimizations records as another kind
of optinfo destination, so that the picture would become:

   dump_* (MSG_*) --> dumpfile.c --+--> (a) dump_file
                                   |
                                   +--> (b) alt_dump_file
                                   |
                                   `--> (c) optinfo
                                            `---> optinfo destinations
                                                  (c.1) diagnostic "remarks"
                                                  (c.2) optimization records

(the patches add this ASCII art to dumpfile.h, btw)

As mentioned before, the optinfo consolidation works by making the
dump_*_loc calls implicitly start a new optinfo, terminating and
emitting any pending one.  This avoids having to touch all the dump
code to explicitly terminate messages, but means that optinfo
instances can be pending for a while, until the next dump_*_loc
happens (or the end of the pass, or a GC; see the patch).  Sadly,
this delay seems to make them more suitable for user-facing messages
than for debugging GCC itself.  Because of this, I didn't think it
was appropriate to port the alt_dump_file code to using it (better to
have that be more direct).  But I can do that if you want.
(Alternatively, maybe a dump_printf* call that ends with a newline
should terminate the optinfo?  Doing that would make most of them
terminate themselves)

Dave

David Malcolm (2):
  Add "optinfo" framework
  optinfo: add diagnostic remarks

 gcc/Makefile.in                              |   2 +
 gcc/common.opt                               |  17 +
 gcc/coretypes.h                              |   7 +
 gcc/diagnostic-color.c                       |   2 +
 gcc/diagnostic-core.h                        |   2 +
 gcc/diagnostic.c                             |  17 +
 gcc/diagnostic.def                           |   1 +
 gcc/doc/invoke.texi                          |  67 ++++
 gcc/dump-context.h                           | 128 +++++++
 gcc/dumpfile.c                               | 497 +++++++++++++++++++++++++--
 gcc/dumpfile.h                               |  84 +++--
 gcc/fortran/gfc-diagnostic.def               |   1 +
 gcc/ggc-page.c                               |   2 +
 gcc/opt-functions.awk                        |   1 +
 gcc/optinfo-emit-diagnostics.cc              | 317 +++++++++++++++++
 gcc/optinfo-emit-diagnostics.h               |  26 ++
 gcc/optinfo-internal.h                       | 148 ++++++++
 gcc/optinfo.cc                               | 252 ++++++++++++++
 gcc/optinfo.h                                | 173 ++++++++++
 gcc/opts.c                                   |   4 +
 gcc/opts.h                                   |  13 +-
 gcc/profile-count.c                          |  28 ++
 gcc/profile-count.h                          |   5 +
 gcc/selftest-run-tests.c                     |   2 +
 gcc/selftest.h                               |   2 +
 gcc/testsuite/gcc.dg/plugin/plugin.exp       |   2 +
 gcc/testsuite/gcc.dg/plugin/remarks-1.c      |  30 ++
 gcc/testsuite/gcc.dg/plugin/remarks_plugin.c | 152 ++++++++
 gcc/testsuite/lib/gcc-dg.exp                 |   9 +
 29 files changed, 1924 insertions(+), 67 deletions(-)
 create mode 100644 gcc/dump-context.h
 create mode 100644 gcc/optinfo-emit-diagnostics.cc
 create mode 100644 gcc/optinfo-emit-diagnostics.h
 create mode 100644 gcc/optinfo-internal.h
 create mode 100644 gcc/optinfo.cc
 create mode 100644 gcc/optinfo.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/remarks-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/remarks_plugin.c

-- 
1.8.5.3

Reply via email to