On Fri, Aug 30, 2013 at 11:23 PM, Teresa Johnson <tejohn...@google.com> wrote:
> On Fri, Aug 30, 2013 at 1:30 PM, Xinliang David Li <davi...@google.com> wrote:
>> On Fri, Aug 30, 2013 at 12:51 PM, Teresa Johnson <tejohn...@google.com>
>> wrote:
>>> On Fri, Aug 30, 2013 at 9:27 AM, Xinliang David Li <davi...@google.com>
>>> wrote:
>>>> Except that in this form, the dump will be extremely large and not
>>>> suitable for very large applications.
>>>
>>> Yes. I did some measurements for both a fairly large source file that
>>> is heavily optimized with LIPO and for a simple toy example that has
>>> some inlining. For the large source file, the output from
>>> -fdump-ipa-inline=stderr was almost 100x the line count of the
>>> -fopt-info output. For the toy source file it was 43x. The size of the
>>> -details output was 250x and 100x, respectively. Which is untenable
>>> for a large app.
>>>
>>> The issue I am having here is that I want a more verbose message, not
>>> a more voluminous set of messages. Using either -fopt-info-all or
>>> -fdump-ipa-inline to provoke the more verbose inline message will give
>>> me a much greater volume of output.
>>>
>>> One compromise could be to emit the more verbose inliner message under
>>> a param (and a more concise "foo inlined into bar" by default with
>>> -fopt-info). Or we could do some variant of what David talks about
>>> below.
>>
>> something like --param=verbose-opt-info=1
>
> Yes. Richard, would this be acceptable for now?
>
> i.e. the inliner messages would be like:
>
> -fopt-info:
> "test.c:8:3: note: foobar inlined into foo with call count 99999000"
> (the "with call count X" only when there is profile feedback)
>
> -fopt-info --param=verbose-opt-info=1:
> "test.c:8:3: note: foobar/0 (99999000) inlined into foo/2 (1000)
> with call count 99999000 (via inline instance bar [3] (99999000))
> (again the call counts only emitted under profile feedback)

It looks like a hack to me. Is -fdump-ipa-inline useful at all? That is,
can't we simply push some of the -details dumping into the non-details
dump?

Richard.

>>
>>
>>>
>>>> Besides, we might also want to
>>>> use the same machinery (dump_printf_loc etc) for dump file dumping.
>>>> The current behavior of using '-details' to turn on opt-info-all
>>>> messages for dump files is not desirable.
>>>
>>> Interestingly, this doesn't even work. When I do
>>> -fdump-ipa-inline-details=stderr (with my patch containing the inliner
>>> messages) I am not getting those inliner messages emitted to stderr.
>>> Even though in dumpfile.c "details" is set to (TDF_DETAILS |
>>> MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE). I'm not
>>> sure why, but will need to debug this.
>>
>> It works for vectorizer pass.
>
> Ok, let me see what is going on - I just confirmed that it is not
> working for the loop unroller messages either.
>
>>
>>>
>>>> How about the following:
>>>>
>>>> 1) add a new dump_kind modifier so that when that modifier is
>>>> specified, the messages won't go to the alt_dumpfile (controlled by
>>>> -fopt-info), but only to primary dump file. With this, the inline
>>>> messages can be dumped via:
>>>>
>>>> dump_printf_loc (OPT_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY, .....)
>>>
>>> (you mean (MSG_OPTIMIZED_LOCATIONS | OPT_DUMP_FILE_ONLY) )
>>>
>>
>> Yes.
>>
>>> Typically OR-ing together flags like this indicates dump under any of
>>> those conditions.
>>> But we could implement special handling for
>>> OPT_DUMP_FILE_ONLY, which in the above case would mean dump only to
>>> the primary dump file, and only under the other conditions specified
>>> in the flag (here under "-optimized")
>>>
>>>>
>>>>
>>>> 2) add more flags in -fdump- support:
>>>>
>>>> -fdump-ipa-inline-opt --> turn on opt-info messages only
>>>> -fdump-ipa-inline-optall --> turn on opt-info-all messages
>>>
>>> According to the documentation (see the -fdump-tree- documentation on
>>> http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html#Debugging-Options),
>>> the above are already supposed to be there (-optimized, -missed, -note
>>> and -optall). However, specifying any of these gives a warning like:
>>> cc1: warning: ignoring unknown option ‘optimized’ in
>>> ‘-fdump-ipa-inline’ [enabled by default]
>>> Probably because none is listed in the dump_options[] array in dumpfile.c.
>>>
>>> However, I don't think there is currently a way to use -fdump- options
>>> and *only* get one of these, as much of the current dump output is
>>> emitted whenever there is a dump_file defined. Until everything is
>>> migrated to the new framework it may be difficult to get this to work.
>>>
>>>> -fdump-tree-pre-ir --> turn on GIMPLE dump only
>>>> -fdump-tree-pre-details --> turn on everything (ir, optall, trace)
>>>>
>>>> With this, developers can really just use
>>>>
>>>>
>>>> -fdump-ipa-inline-opt=stderr for inline messages.
>>>
>>> Yes, if we can figure out a good way to get this to work (i.e. only
>>> emit the optimized messages and not the rest of the dump messages).
>>> And unfortunately to get them all you need to specify
>>> "-fdump-ipa-all-optimized -fdump-tree-all-optimized
>>> -fdump-rtl-all-optimized" instead of just -fopt-info. Unless we can
>>> add -fdump-all-all-optimized.
>>
>> Having general support requires cleanup of all the old style if
>> (dump_file) fprintf (dump_file, ...)
>> instances to be:
>>
>> if (dump_enabled_p ())
>>   dump_printf (dump_kind ....);
>
> Right. But that is going to be a big longer-term effort - grepping for
> dump_file in gcc/*.c gives about 6000 instances.
>
>>
>>
>> However, it might be easier to do this filtering for IR dump only (in
>> execute_function_dump) -- do not dump IR if any of the MSG_xxxx is
>> specified unless IR flag (a new flag) is also specified.
>
> Unfortunately there are a lot of messages that are not from
> execute_function_dump.
>
> Thanks,
> Teresa
>
>>
>> David
>>
>>
>>>
>>> Teresa
>>>
>>>>
>>>> thanks,
>>>>
>>>> David
>>>>
>>>> On Fri, Aug 30, 2013 at 1:30 AM, Richard Biener
>>>> <richard.guent...@gmail.com> wrote:
>>>>> On Thu, Aug 29, 2013 at 5:15 PM, Teresa Johnson <tejohn...@google.com>
>>>>> wrote:
>>>>>> On Thu, Aug 29, 2013 at 3:04 AM, Richard Biener
>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>>>>> New patch below that removes this global variable, and also outputs
>>>>>>>>>> the node->symbol.order (in square brackets after the function name so
>>>>>>>>>> as to not clutter it). Inline messages with profile data look like:
>>>>>>>>>>
>>>>>>>>>> test.c:8:3: note: foobar [0] (99999000) inlined into foo [2] (1000)
>>>>>>>>>> with call count 99999000 (via inline instance bar [3] (99999000))
>>>>>>>>>
>>>>>>>>> Ick. This looks both redundant and cluttered. This is supposed to be
>>>>>>>>> understandable by GCC users, not only GCC developers.
>>>>>>>>
>>>>>>>> The main part that is only useful/understandable to gcc developers is
>>>>>>>> the node->symbol.order in square brackets, requested by Martin. One
>>>>>>>> possibility is that I could put that part under a param, disabled by
>>>>>>>> default. We have something similar on the google branches that emits
>>>>>>>> LIPO module info in the message, enabled via a param.
>>>>>>>
>>>>>>> But we have _dump files_ for that. That's the developer-consumed
>>>>>>> form of opt-info.
>>>>>>> -fopt-info is purely user sugar and for usual translation
>>>>>>> units it shouldn't exceed a single terminal full of output.
>>>>>>
>>>>>> But as a developer I don't want to have to parse lots of dump files
>>>>>> for a summary of the major optimizations performed (e.g. inlining,
>>>>>> unrolling) for an application, unless I am diving into the reasons for
>>>>>> why or why not one of those optimizations occurred in a particular
>>>>>> location. I really do want a summary emitted to stderr so that it is
>>>>>> easily searchable/summarizable for the app as a whole.
>>>>>>
>>>>>> For example, some of the apps I am interested in have thousands of
>>>>>> input files, and trying to collect and parse dump files for each and
>>>>>> every one is overwhelming (it probably would be even if my input files
>>>>>> numbered in the hundreds). What has been very useful is having these
>>>>>> high level summary messages of inlines and unrolls emitted to stderr
>>>>>> by -fopt-info. Then it is easy to search and sort by hotness to get a
>>>>>> feel for things like what inlines are missing when moving to a new
>>>>>> compiler, or compiling a new version of the source, for example. Then
>>>>>> you know which files to focus on and collect dump files for.
>>>>>
>>>>> I thought we can direct dump files to stderr now? So, just use
>>>>> -fdump-tree-all=stderr
>>>>>
>>>>> and grep its contents.
>>>>>
>>>>>>>
>>>>>>>> I'd argue that the other information (the profile counts, emitted only
>>>>>>>> when using -fprofile-use, and the inline call chains) are useful if
>>>>>>>> you want to understand whether and how critical inlines are occurring.
>>>>>>>> I think this is the type of information that users focused on
>>>>>>>> optimizations, as well as gcc developers, want when they use
>>>>>>>> -fopt-info. Otherwise it is difficult to make sense of the inline
>>>>>>>> information.
>>>>>>>
>>>>>>> Well, I doubt that inline information is interesting to users unless
>>>>>>> we are able to aggressively filter it to what users are interested in.
>>>>>>> Which IMHO isn't possible - users are interested in "I have not inlined
>>>>>>> this even though inlining would severely improve performance" which
>>>>>>> would indicate a bug in the heuristics we can reliably detect and thus
>>>>>>> it wouldn't be there.
>>>>>>
>>>>>> I have interacted with users who are aware of optimizations such as
>>>>>> inlining and unrolling and want to look at that information to
>>>>>> diagnose performance differences when refactoring code or using a new
>>>>>> compiler version. I also think inlining (especially cross-module) is
>>>>>> one example of an optimization that is still being tuned, and user
>>>>>> reports of performance issues related to that have been useful.
>>>>>>
>>>>>> I really think that the two groups of people who will find -fopt-info
>>>>>> useful are gcc developers and savvy performance-hungry users. For the
>>>>>> former group the additional info is extremely useful. For the latter
>>>>>> group some of the extra information may not be required (although a
>>>>>> call count is useful for those using profile feedback), but IMO is not
>>>>>> unreasonable.
>>>>>
>>>>> well, your proposed output wrecks my 80x24 terminal already due to overly
>>>>> long lines.
>>>>>
>>>>> In the end we may end up with a verbosity level for each sub-set of
>>>>> opt-info messages. Ick.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> Teresa
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
>>>
>>>
>>>
>>> --
>>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
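
For reference, a minimal sketch of the routing rule from David's point 1)
as Teresa restates it in the thread: OPT_DUMP_FILE_ONLY acts as a modifier
on the other bits rather than as one more condition OR-ed in. This is a
standalone model, not GCC code. The flag values, the DEST_* names and
route_message () are made up for illustration; only the MSG_* and
OPT_DUMP_FILE_ONLY names come from the discussion.

```c
/* Hypothetical flag values for illustration; not GCC's real definitions.  */
enum
{
  MSG_OPTIMIZED_LOCATIONS = 1 << 0,
  MSG_MISSED_OPTIMIZATION = 1 << 1,
  MSG_NOTE = 1 << 2,
  OPT_DUMP_FILE_ONLY = 1 << 3	/* the proposed modifier bit */
};

#define MSG_ALL \
  (MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE)

/* Possible destinations of one message (hypothetical names).  */
enum
{
  DEST_NONE = 0,
  DEST_PRIMARY = 1 << 0,	/* the pass dump file (-fdump-...) */
  DEST_ALT = 1 << 1		/* the -fopt-info stream */
};

/* Decide where a message of kind DUMP_KIND goes.
   PRIMARY_FLAGS: message kinds enabled for the pass dump file.
   ALT_FLAGS: message kinds enabled for the -fopt-info stream.
   The modifier never enables output by itself; it only blocks the
   alternate stream.  */
int
route_message (int dump_kind, int primary_flags, int alt_flags)
{
  int kinds = dump_kind & MSG_ALL;
  int dest = DEST_NONE;

  if (kinds & primary_flags)
    dest |= DEST_PRIMARY;
  if (!(dump_kind & OPT_DUMP_FILE_ONLY) && (kinds & alt_flags))
    dest |= DEST_ALT;

  return dest;
}
```

With this rule, a message tagged (MSG_OPTIMIZED_LOCATIONS |
OPT_DUMP_FILE_ONLY) reaches the dump file only when the dump file has
"-optimized" style messages enabled, and never reaches the -fopt-info
stream, which is the behavior the verbose inline message would need.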