On Wed, 26 Jun 2013, David Malcolm wrote: > FWIW I wonder to what extent the discussions that follow all exhibit a > tradeoff between the desire to provide a clean API vs the desire to > minimize the size of the patch (to reduce backporting pain).
I don't think reducing backporting pain is particularly relevant. There are lots of things to clean up; the question is priorities and ordering rather than whether some particular cleanup should be avoided because of backporting pain it causes. > > * For dump_file and associated variables such as dump_flags, I sort of > > think there should be a proper API for code writing to dumps rather than > > directly accessing dump_file all over the compiler. That should massively > > reduce the number of places needing to access those fields of the > > universe. > > Right, but doing this would be a big patch, touching numerous files. You said you had refactoring tools to make such things easier.... Given such tools, I'm imagining the large but mechanical changes shouldn't be so much more work than smaller mechanical changes for the same issue. > What would a new API look like? The classic problem with > logging/dumping is that you want to avoid doing work in the no-logging > case, so that rather than: > > log ("some message: %s", perform_some_expensive_computation ()); > > with enable/disable in "log" (and thus always doing the expensive work > and typically discarding it), you have things like: > > if (logging) > { > log ("some message: %s", perform_some_expensive_computation ()); > } > > If we're going to have a "logging" conditional there, that's going to > involve passing around or looking up a variable, so why not simply make > it the dump_file? That's a reason for the API to separate whether to generate data for dumps from the actual dumping. But I think the API for whether to dump should (if we think about what a good design would be, rather than what a minimal patch would be) be something that returns a boolean, not a FILE *. The fact that a FILE * is involved internally isn't something most of the code all over the compiler generating data for dumps should need to care about. Here's one motivation for wanting better APIs both for this and for .s output. Some places in the compiler use printf formats such as HOST_WIDE_INT_PRINT_DEC. Others use GCC pretty-printer formats such as %wd (the pretty-printer implementation internally uses HOST_WIDE_INT_PRINT_DEC to implement %wd). People need to remember what to use where, and if they use the wrong format in the wrong place they may get host-dependent bugs. Generally it's a bad idea for host-portability details such as HOST_WIDE_INT_PRINT_DEC to be used all over the place - it's better if they can be used only in a few places implementing formats such as %wd, with everything else using the format %wd. If you have an API such as dump_file_fprintf, you can do that. If code is using fprintf directly on dump_file, you can't. Now, creating interfaces such as dump_file_fprintf doesn't itself clean up this issue - a separate patch would be needed that makes dump_file_fprintf use the pretty-printer code and makes all formats used with dump_file_fprintf use the pretty-printer formats and only those formats (and not any printf features not supported by the pretty-printer code). But it's a useful starting point to facilitate such a cleanup. And changing "fprintf (dump_file, ...)" to "dump_file_fprintf (...)" should be a straightforward change to make, given a refactoring tool that will handle reindenting subsequent lines of arguments to the call to match the new column of the open-parenthesis (and wrapping if that would make lines too long). It might be a larger patch than your suggestion of setting a dump_file local variable from the universe at the start of each function, but I think it's a better interface. > FWIW I originally came up with a very simple API, but I filed it in the > "rejected ideas" appendix: > http://dmalcolm.fedorapeople.org/gcc/global-state/rejected-ideas.html#rejected-idea-dump-if-details That rejected idea involves returning a FILE *, which I don't like if the API for dumping is being rethought. > > * For asm_out_file, see dump_file, stdout and stderr above - there should > > be a well-defined API for writing assembler output and only the > > implementation of that API should refer directly to this global. > > In the plan I suggested using TLS for this in the shared-library build, > out of a desire to minimize the patch size. > > An alternative approach would be to move much of output.h into a class; > something like this: > > class MAYBE_SINGLETON asm_out > { > public: > void putc (char ); > void puts (const char *); > void printf (const char *, ...) some_attribute; > > /* Most out output.h gets moved to here. */ > void assemble_zeros (unsigned HOST_WIDE_INT); > void assemble_align (int); > void assemble_string (const char *, int); > void assemble_external_libcall (rtx); > void assemble_label (FILE *, const char *); > // etc > > private: > FILE *asm_out_file; > }; > > and have an "asm_out_" in the universe. Yes, that's the sort of thing I was thinking of, for much the same reasons as for dump_file including reducing the amount of code that needs to know about any portability issues with host printf. It so happens at least a large proportion of output to asm_out_file goes through asm_fprintf, which uses its own kinds of formats anyway rather than host printf formats - but which takes a FILE * as argument. As far as I know, that FILE * is always asm_out_file being passed around manually; certainly, being a method so the FILE * doesn't need passing around like that would be better. (Every back-end interface that currently takes a FILE * to use for assembler output would change to take a class instance instead, I suppose, and the implementations of the asm_out methods would call such target hooks as needed.) -- Joseph S. Myers jos...@codesourcery.com