On Fri, 2018-06-29 at 09:09 +0200, Richard Biener wrote: > On Thu, Jun 28, 2018 at 4:29 PM David Malcolm <dmalc...@redhat.com> > wrote: > > > > On Thu, 2018-06-28 at 13:29 +0200, Richard Biener wrote: > > > On Tue, Jun 26, 2018 at 3:54 PM David Malcolm <dmalc...@redhat.co > > > m> > > > wrote: > > > > > > > > On Mon, 2018-06-25 at 15:34 +0200, Richard Biener wrote: > > > > > On Wed, Jun 20, 2018 at 6:34 PM David Malcolm <dmalcolm@redha > > > > > t.co > > > > > m> > > > > > wrote: > > > > > > > > > > > > Here's v3 of the patch (one big patch this time, rather > > > > > > than a > > > > > > kit). > > > > > > > > > > > > Like the v2 patch kit, this patch reuses the existing dump > > > > > > API, > > > > > > rather than inventing its own. > > > > > > > > > > > > Specifically, it uses the dump_* functions in dumpfile.h > > > > > > that > > > > > > don't > > > > > > take a FILE *, the ones that implicitly write to dump_file > > > > > > and/or > > > > > > alt_dump_file. I needed a name for them, so I've taken to > > > > > > calling > > > > > > them the "structured dump API" (better name ideas welcome). > > > > > > > > > > > > v3 eliminates v2's optinfo_guard class, instead using > > > > > > "dump_*_loc" > > > > > > calls as delimiters when consolidating "dump_*" > > > > > > calls. There's > > > > > > a > > > > > > new dump_context class which has responsibility for > > > > > > consolidating > > > > > > them into optimization records. > > > > > > > > > > > > The dump_*_loc calls now capture more than just a > > > > > > location_t: > > > > > > they > > > > > > capture the profile_count and the location in GCC's own > > > > > > sources > > > > > > where > > > > > > the dump is being emitted from. > > > > > > > > > > > > This works by introducing a new "dump_location_t" class as > > > > > > the > > > > > > argument of those dump_*_loc calls. The dump_location_t > > > > > > can > > > > > > be constructed from a gimple * or from an rtx_insn *, so > > > > > > that > > > > > > rather than writing: > > > > > > > > > > > > dump_printf_loc (MSG_NOTE, gimple_location (stmt), > > > > > > "some message: %i", 42); > > > > > > > > > > > > you can write: > > > > > > > > > > > > dump_printf_loc (MSG_NOTE, stmt, > > > > > > "some message: %i", 42); > > > > > > > > > > > > and the dump_location_t constructor will grab the > > > > > > location_t > > > > > > and > > > > > > profile_count of stmt, and the location of the > > > > > > "dump_printf_loc" > > > > > > callsite (and gracefully handle "stmt" being NULL). > > > > > > > > > > > > Earlier versions of the patch captured the location of the > > > > > > dump_*_loc call via preprocessor hacks, or didn't work > > > > > > properly; > > > > > > this version of the patch works more cleanly: internally, > > > > > > dump_location_t is split into two new classes: > > > > > > * dump_user_location_t: the location_t and profile_count > > > > > > within > > > > > > the *user's code*, and > > > > > > * dump_impl_location_t: the __builtin_FILE/LINE/FUNCTION > > > > > > within > > > > > > the *implementation* code (i.e. GCC or a plugin), > > > > > > captured > > > > > > "automagically" via default params > > > > > > > > > > > > These classes are sometimes used elsewhere in the > > > > > > code. For > > > > > > example, "vect_location" becomes a dump_user_location_t > > > > > > (location_t and profile_count), so that in e.g: > > > > > > > > > > > > vect_location = find_loop_location (loop); > > > > > > > > > > > > it's capturing the location_t and profile_count, and then > > > > > > when > > > > > > it's used here: > > > > > > > > > > > > dump_printf_loc (MSG_NOTE, vect_location, "foo"); > > > > > > > > > > > > the dump_location_t is constructed from the vect_location > > > > > > plus the dump_impl_location_t at that callsite. > > > > > > > > > > > > In contrast, loop-unroll.c's report_unroll's "locus" param > > > > > > becomes a dump_location_t: we're interested in where it was > > > > > > called from, not in the locations of the various dump_*_loc > > > > > > calls > > > > > > within it. > > > > > > > > > > > > Previous versions of the patch captured a gimple *, and > > > > > > needed > > > > > > GTY markers; in this patch, the dump_user_location_t is now > > > > > > just a > > > > > > location_t and a profile_count. > > > > > > > > > > > > The v2 patch added an overload for dump_printf_loc so that > > > > > > you > > > > > > could pass in either a location_t, or the new type; this > > > > > > version > > > > > > of the patch eliminates that: they all now take > > > > > > dump_location_t. > > > > > > > > > > > > Doing so required adding support for rtx_insn *, so that > > > > > > one > > > > > > can > > > > > > write this kind of thing in RTL passes: > > > > > > > > > > > > dump_printf_loc (MSG_NOTE, insn, "foo"); > > > > > > > > > > > > One knock-on effect is that get_loop_location now returns a > > > > > > dump_user_location_t rather than a location_t, so that it > > > > > > has > > > > > > hotness information. > > > > > > > > > > > > Richi: would you like me to split out this location- > > > > > > handling > > > > > > code into a separate patch? (It's kind of redundant > > > > > > without > > > > > > adding the remarks and optimization records work, but if > > > > > > that's > > > > > > easier I can do it) > > > > > > > > > > I think that would be easier because it doesn't require the > > > > > JSON > > > > > stuff and so I'll happily approve it. > > > > > > > > > > Thus - trying to review that bits (and sorry for the delay). > > > > > > > > > > + location_t srcloc = loc.get_location_t (); > > > > > + > > > > > if (dump_file && (dump_kind & pflags)) > > > > > { > > > > > - dump_loc (dump_kind, dump_file, loc); > > > > > + dump_loc (dump_kind, dump_file, srcloc); > > > > > print_gimple_stmt (dump_file, gs, spc, dump_flags | > > > > > extra_dump_flags); > > > > > } > > > > > > > > > > if (alt_dump_file && (dump_kind & alt_flags)) > > > > > { > > > > > - dump_loc (dump_kind, alt_dump_file, loc); > > > > > + dump_loc (dump_kind, alt_dump_file, srcloc); > > > > > print_gimple_stmt (alt_dump_file, gs, spc, dump_flags > > > > > | > > > > > extra_dump_flags); > > > > > } > > > > > + > > > > > + if (optinfo_enabled_p ()) > > > > > + { > > > > > + optinfo &info = begin_next_optinfo (loc); > > > > > + info.handle_dump_file_kind (dump_kind); > > > > > + info.add_stmt (gs, extra_dump_flags); > > > > > + } > > > > > > > > > > seeing this in multiple places. I seem to remember that > > > > > dump_file / alt_dump_file was suposed to handle dumping > > > > > into two locations - a dump file and optinfo (or > > > > > stdout). This > > > > > looks > > > > > like the optinfo "stream" is even more separate. Could that > > > > > obsolete the alt_dump_file stream? I'd need to review > > > > > existing > > > > > stuff > > > > > in more detail to answer but maybe you already know from > > > > > recently > > > > > digging into this. > > > > > > > > Possibly. I attempted this in v1 of the patch, but it was > > > > mixed in > > > > with > > > > a bunch of other stuff. I'll have another go at doing this. > > > > > > > > > Oh, and all the if (optinfo_enable_p ()) stuff is for the > > > > > followup > > > > > then, right? > > > > > > > > Yes. > > > > > > > > > I like the boiler-plate changes to dump_* using stuff a lot, > > > > > so > > > > > the > > > > > infrastructure to do that (the location wrapping) and these > > > > > boiler- > > > > > plate > > > > > changes are pre-approved if split out. > > > > > > > > Thanks. I split out the location wrapping, and have committed > > > > it > > > > to > > > > trunk (r262149). It's not clear to me exactly what other parts > > > > you > > > > liked, > > > > so I'm going to try to split out more of the non-JSON bits in > > > > the > > > > hope that some parts are good enough as-is, and I'll post them > > > > for > > > > review > > > > as followups. > > > > > > > > For reference, here's what I've committed (I added some obvious > > > > changes > > > > to doc/optinfo.texi). > > > > > > > > > I think the *_REMARK stuff should get attention of the > > > > > respective > > > > > maintainers - not sure what the difference between NOTE and > > > > > REMARK > > > > > is ;) > > > > > > > > Me neither :) I think it might come down to "this is purely a > > > > debugging > > > > message for a pass maintainer" (MSG_NOTE) vs "this is a high- > > > > level > > > > thing > > > > that an advanced user want to see" (MSG_REMARK???). > > > > > > > > One idea that occurred to me: are we overusing > > > > dump_flags_t? It's > > > > a > > > > mixture of TDF_* bitfields (used by our internal dumps) plus > > > > MSG_* > > > > bitfields (used with -fopt-info). IIRC the only TDF_ bitfield > > > > that's > > > > ever used with the MSG_* bitfields is TDF_DETAILS. Might it > > > > make > > > > sense > > > > to split out the MSG_* bitfields into a different type > > > > (dump_level_t???) > > > > to reinforce this split? (and thus the dump_* entrypoints that > > > > don't take > > > > a FILE * would take this new type). I'm not sure if this is a > > > > good > > > > idea > > > > or not. > > > > > > Making it even more complicated doesn't make it easier to use it > > > "correctly". So I'd rather try to simplify it. How passes use > > > TDF_DETAILS vs. non-details is already highly inconsistent. This > > > is why I liked the original optinfo work because it somehow made > > > user-interesting vs. developer-interesting with the same API > > > (and to the same dump-file). Just that MSG_NOTE is exposed to > > > users while I think it should be developer-only... > > > > > > IMHO the TDF_ stuff should control things at the stmt/tree/BB > > > level, > > > thus be IL specific flags while the MSG_ stuff should control > > > pass-specific dumping detail. > > > > You mention user-interesting vs developer-interesting, and whether > > it > > would be the same API and/or the same dump-file. > > > > I've posted a bunch of patches here, some small, some large, trying > > various different appoaches, coming at the problem from at least > > two directions, so maybe it's worth discussing the overall > > direction > > here (with some ASCII art!) > > > > * what is the status quo? > > * what do we want to achieve? > > * how do we get there? > > > > Status quo: > > * We have the "dump_*(MSG_*)" API which hides the destination of > > where we're dumping to (currently dump_file and/or alt_file_file, > > conditionalized via flags). > > > > dump_* (MSG_*) ----> dumpfile.c --+--> dump_file > > | > > `--> alt_dump_file > > > > * As of r262149 (the dump_location_t commit) dumpfile.c receives > > the > > hotness and emission location of each dump_*_loc call, but > > currently > > it does nothing with that information. > > * The messages that we emit through the "dump_*(MSG_*)" API and > > their > > "levels" are currently rather haphazard. Some are close to being > > suitable for end-users, but most seem never intended to be > > end-user-facing. For reference: > > grep -nH -e MSG_MISSED_OPTIMIZATION *.c *.cc | wc -l > > 452 > > grep -nH -e MSG_NOTE *.c *.cc | wc -l > > 551 > > grep -nH -e MSG_OPTIMIZED_LOCATIONS *.c *.cc | wc -l > > 39 > > (though some of these are support code e.g. in dumpfile.c, rather > > than uses). > > Yep. I believe MSG_NOTE was the fallout of optinfo introduction and > my > request to share the same API with dumping to the dumpfile. MSG_NOTE > just received "anything else" ... > > > * The API builds up messages programatically, which is hostile to > > i18n. The API isn't marked as needing i18n for its strings > > (which > > IIRC is via using "gmsgid" as a param name being special-cased in > > the gettext toolchain). > > > > What do we want to achieve? > > * I want end-users to be able to enable high-level dump messages > > about > > optimizations, and for those messages to be decipherable without > > reading GCC sources e.g. the opt_problem idea posted here: > > * "[PATCH] [RFC] Higher-level reporting of vectorization > > problems" > > * https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01462.html > > Presumably such messages would need to be > > internationalized. Many > > other messages are really internal only, and would be a burden to > > impose on our translators. > > * I want to support additional destinations for any/all dump > > messages > > beyond just the dump file and -fopt-info: > > * putting them through the diagnostics subsystem (as "remarks") > > * storing them to disk in a machine-readable format, so that > > 3rd-party tools can present useful visualizations of how the > > user's code is being optimized, e.g. prioritized by hotness > > * I want to preserve the existing API and most existing uses of it > > (to minimize churn and thus keep our lives easier) > > > > The patches I've been posting so far add additional dump > > destinations, > > like this: > > > > dump_* (MSG_*) >---> dumpfile.c --+--> dump_file > > | > > +--> alt_dump_file > > | > > +--> diagnostic "remarks" > > | > > +--> "optimization records" > > (saved to disk) > > > > I believe you favor an approach of "MSG_NOTE is internal, whereas > > MSG_MISSED_OPTIMIZATION and MSG_OPTIMIZED_LOCATIONS are for end > > users". > > Yeah, kind of. You introduced another MSG_ variant
"MSG_REMARK" above, though I'm not sure I like that name. > and in the end we might > need to go that way. I guess renaming MSG_NOTE to MSG_DEBUG would > be a step in the direction to reflect reality ;) I think I want to look more at the "opt_problem" idea about high-level vectorization messages, and attack the problem from that direction; maybe that will suggest a good way forward for our existing dump messages. > > I think the "should this dump message be internationalized?" issue > > introduces a natural separation between dump messages aimed at > > end-users vs purely "internal" dumps; this suggests to me that > > we need a new API for such dump messages, designed to support i18n, > > and marked as such for gettext, so I favor something like this: > > > > SOURCES DESTINATIONS > > dump_* (MSG_*) >---> dumpfile.c --+--> dump_file > > | | > > new dump API -->-+ +--> alt_dump_file > > | > > +--> diagnostic "remarks" > > | > > +--> "optimization records" > > (saved to disk) > > > > (I'm deliberately being vague about what this i18n-enabled dump API > > might look like) > > I know I threw i18n into the discussion - but I'd say we should focus > on other details first. I think there's no need to _require_ the > "debug" > dumpfile stuff not be translated, but to make it easier for > translators > I suppose being able to "amend" the API calls with a "do not > translate" > variant would be OK. I'm thinking that only a small fraction of the dump messages will be user-facing, so the default would be "do not translate" with the rare special-case being "do translate". > So in what way is the dump_* API not suitable for translation > that the diagnostic machinery (warning/error) does not suffer from? Consider this example, from "vect_create_data_ref_ptr" (I used this example way back in the v1 patch kit): if (dump_enabled_p ()) { tree dr_base_type = TREE_TYPE (DR_BASE_OBJECT (dr)); dump_printf_loc (MSG_NOTE, vect_location, "create %s-pointer variable to type: ", get_tree_code_name (TREE_CODE (aggr_type))); dump_generic_expr (MSG_NOTE, TDF_SLIM, aggr_type); if (TREE_CODE (dr_base_type) == ARRAY_TYPE) dump_printf (MSG_NOTE, " vectorizing an array ref: "); else if (TREE_CODE (dr_base_type) == VECTOR_TYPE) dump_printf (MSG_NOTE, " vectorizing a vector ref: "); else if (TREE_CODE (dr_base_type) == RECORD_TYPE) dump_printf (MSG_NOTE, " vectorizing a record based array ref: "); else dump_printf (MSG_NOTE, " vectorizing a pointer ref: "); dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_BASE_OBJECT (dr)); dump_printf (MSG_NOTE, "\n"); } where there are two problems with translation: (a) the message is built up piecewise, with conditional logic. (b) there's nothing yet marking the string fragments as needing translation > That is, if we internally make dump_* go through the pretty-printers > we can handle stuff like quoting or even stmts/expressions. I think > that would be useful for dumping to dumpfiles as well. That could work. Currently dump_printf* use fprintf internally, but I don't think we make much use of the codes there. So we could make it use a pretty_printer internally. The core pretty-print.c code doesn't handle any of the interesting middle-end types we'd want to talk about (gimple, tree, symtab_node *, etc), so I think to have a custom middle-end dump pp_format_decoder callback. The above example could then look something like: if (dump_enabled_p ()) { tree dr_base_type = TREE_TYPE (DR_BASE_OBJECT (dr)); const char *msgid; if (TREE_CODE (dr_base_type) == ARRAY_TYPE) msgid = ("create %s-pointer variable to type: %T" " vectorizing an array ref: %T"); else if (TREE_CODE (dr_base_type) == VECTOR_TYPE) msgid = ("create %s-pointer variable to type: %T" " vectorizing a vector ref: %T"); else if (TREE_CODE (dr_base_type) == RECORD_TYPE) msgid = ("create %s-pointer variable to type: %T" " vectorizing a record based array ref: %T"); else msgid = ("create %s-pointer variable to type: %T" " vectorizing a pointer ref: %T"); dump_printf_loc (MSG_NOTE, vect_location, msgid, get_tree_code_name (TREE_CODE (aggr_type))); aggr_type, DR_BASE_OBJECT (dr)); dump_printf (MSG_NOTE, "\n"); } where I'm using "%T" as the format code for tree with TDF_SLIM. The optinfo could could then "print" these more interesting types by capturing the metadata (and thus expose it in optimization records). Though, that said, I don't think that message as-is is something we'd want end-users to see, and thus not something for translators. Maybe something like: extern void dump_userfacing_printf_loc (dump_flags_t, const dump_location_t &, const char *gmsgid, ...) ATTRIBUTE_DUMP_PRINTF_3; for the small subset of messages that we'd want to be translated? (with ATTRIBUTE_DUMP_PRINTF_3 to signify that we have a new family of format codes). (I'm not in love with that function name) > > > > That said, I have a version of the patch kit which does just this: > > > > dump_* (MSG_*) ----> dumpfile.c --+--> dump_file > > | > > +--> alt_dump_file > > | > > +--> diagnostic "remarks" > > > > i.e. generalizing the dump destinations (by adding optinfo > > internally), > > but without requiring the JSON support or touching the dump API . > > Would > > that be suitable as a next step? > > Yes. Thanks for going step-by-step btw, this is really useful in > simplifying > the picture. > > Richard. Thanks. Here's a reduced version of the patch kit which builds on what's already in trunk. Patch 1 adds the basic optinfo machinery, consolidating the dump_* calls into optinfo objects that can be sent to multiple dump destinations - but it doesn't add any destinations. It captures "rich" information about what is dumped, tagged with things like code hotness information, enough to eventually implement optimization records in a followup. dump_* (MSG_*) --> dumpfile.c --+--> (a) dump_file | +--> (b) alt_dump_file | `--> (c) optinfo Patch 2 adds in the first destination for them: diagnostic "remarks" (updated to work on top of patch 1). It's not perfect yet (we probably need to be able to filter them using optinfo_group_t and probably other criteria), and I'm posting it now to provide more motivation for patch 1, but it's stable enough for trunk (though a couple of FIXMEs remain). dump_* (MSG_*) --> dumpfile.c --+--> (a) dump_file | +--> (b) alt_dump_file | `--> (c) optinfo `---> optinfo destinations (c.1) diagnostic "remarks" I have followups to add JSON optimizations records as another kind of optinfo destination, so that the picture would become: dump_* (MSG_*) --> dumpfile.c --+--> (a) dump_file | +--> (b) alt_dump_file | `--> (c) optinfo `---> optinfo destinations (c.1) diagnostic "remarks" (c.2) optimization records (the patches add this ASCII art to dumpfile.h, btw) As mentioned before, the optinfo consolidation works by making the dump_*_loc calls implicitly start a new optinfo, terminating and emitting any pending one. This avoids having to touch all the dump code to explicitly terminate messages, but means that optinfo instances can be pending for a while, until the next dump_*_loc happens (or the end of the pass, or a GC; see the patch). Sadly, this delay seems to make them more suitable for user-facing messages than for debugging GCC itself. Because of this, I didn't think it was appropriate to port the alt_dump_file code to using it (better to have that be more direct). But I can do that if you want. (Alternatively, maybe a dump_printf* call that ends with a newline should terminate the optinfo? Doing that would make most of them terminate themselves) Dave David Malcolm (2): Add "optinfo" framework optinfo: add diagnostic remarks gcc/Makefile.in | 2 + gcc/common.opt | 17 + gcc/coretypes.h | 7 + gcc/diagnostic-color.c | 2 + gcc/diagnostic-core.h | 2 + gcc/diagnostic.c | 17 + gcc/diagnostic.def | 1 + gcc/doc/invoke.texi | 67 ++++ gcc/dump-context.h | 128 +++++++ gcc/dumpfile.c | 497 +++++++++++++++++++++++++-- gcc/dumpfile.h | 84 +++-- gcc/fortran/gfc-diagnostic.def | 1 + gcc/ggc-page.c | 2 + gcc/opt-functions.awk | 1 + gcc/optinfo-emit-diagnostics.cc | 317 +++++++++++++++++ gcc/optinfo-emit-diagnostics.h | 26 ++ gcc/optinfo-internal.h | 148 ++++++++ gcc/optinfo.cc | 252 ++++++++++++++ gcc/optinfo.h | 173 ++++++++++ gcc/opts.c | 4 + gcc/opts.h | 13 +- gcc/profile-count.c | 28 ++ gcc/profile-count.h | 5 + gcc/selftest-run-tests.c | 2 + gcc/selftest.h | 2 + gcc/testsuite/gcc.dg/plugin/plugin.exp | 2 + gcc/testsuite/gcc.dg/plugin/remarks-1.c | 30 ++ gcc/testsuite/gcc.dg/plugin/remarks_plugin.c | 152 ++++++++ gcc/testsuite/lib/gcc-dg.exp | 9 + 29 files changed, 1924 insertions(+), 67 deletions(-) create mode 100644 gcc/dump-context.h create mode 100644 gcc/optinfo-emit-diagnostics.cc create mode 100644 gcc/optinfo-emit-diagnostics.h create mode 100644 gcc/optinfo-internal.h create mode 100644 gcc/optinfo.cc create mode 100644 gcc/optinfo.h create mode 100644 gcc/testsuite/gcc.dg/plugin/remarks-1.c create mode 100644 gcc/testsuite/gcc.dg/plugin/remarks_plugin.c -- 1.8.5.3