On Thu, 2013-06-27 at 14:50 +0000, Joseph S. Myers wrote: > On Wed, 26 Jun 2013, David Malcolm wrote: > > > FWIW I wonder to what extent the discussions that follow all exhibit a > > tradeoff between the desire to provide a clean API vs the desire to > > minimize the size of the patch (to reduce backporting pain). > > I don't think reducing backporting pain is particularly relevant. There > are lots of things to clean up; the question is priorities and ordering > rather than whether some particular cleanup should be avoided because of > backporting pain it causes. > > > > * For dump_file and associated variables such as dump_flags, I sort of > > > think there should be a proper API for code writing to dumps rather than > > > directly accessing dump_file all over the compiler. That should > > > massively > > > reduce the number of places needing to access those fields of the > > > universe. > > > > Right, but doing this would be a big patch, touching numerous files. > > You said you had refactoring tools to make such things easier.... Given > such tools, I'm imagining the large but mechanical changes shouldn't be so > much more work than smaller mechanical changes for the same issue.
I'm wary of scope-creep here. I want to focus on "removal of global state", and I want that to be separate from "cleanups of internal APIs". IMHO the kinds of change you're proposing *would* be much more work. Although I have some tools to help with big refactorings and they do save some time, it's mainly a way to stop my RSI from flaring up (by generating ChangeLogs, and allowing for easy tweaking based on patch reviews). It's still a fair amount of work to write the refactoring script, deal with the cases where it doesn't work, and ensure that whitespace is properly handled. Compared to the above, options like: #define dump_file GET_UNIVERSE().dump_file_ #define dump_flags GET_UNIVERSE().dump_flags_ or: #if SHARED_LIBRARY #define MAYBE_TLS __thread #else #define MAYBE_TLS #endif extern MAYBE_TLS FILE *dump_file; extern MAYBE_TLS FILE *dump_flags; are trivial. So I'd much prefer to keep the globals-removal separate from the API cleanup, and not have the latter be a prerequisite for the former; we can go back later and do API cleanups. > > What would a new API look like? The classic problem with > > logging/dumping is that you want to avoid doing work in the no-logging > > case, so that rather than: > > > > log ("some message: %s", perform_some_expensive_computation ()); > > > > with enable/disable in "log" (and thus always doing the expensive work > > and typically discarding it), you have things like: > > > > if (logging) > > { > > log ("some message: %s", perform_some_expensive_computation ()); > > } > > > > If we're going to have a "logging" conditional there, that's going to > > involve passing around or looking up a variable, so why not simply make > > it the dump_file? > > That's a reason for the API to separate whether to generate data for dumps > from the actual dumping. > > But I think the API for whether to dump should (if we think about what a > good design would be, rather than what a minimal patch would be) be > something that returns a boolean, not a FILE *. The fact that a FILE * is > involved internally isn't something most of the code all over the compiler > generating data for dumps should need to care about. > > Here's one motivation for wanting better APIs both for this and for .s > output. Some places in the compiler use printf formats such as > HOST_WIDE_INT_PRINT_DEC. Others use GCC pretty-printer formats such as > %wd (the pretty-printer implementation internally uses > HOST_WIDE_INT_PRINT_DEC to implement %wd). People need to remember what > to use where, and if they use the wrong format in the wrong place they may > get host-dependent bugs. > > Generally it's a bad idea for host-portability details such as > HOST_WIDE_INT_PRINT_DEC to be used all over the place - it's better if > they can be used only in a few places implementing formats such as %wd, > with everything else using the format %wd. If you have an API such as > dump_file_fprintf, you can do that. If code is using fprintf directly on > dump_file, you can't. > > Now, creating interfaces such as dump_file_fprintf doesn't itself clean up > this issue - a separate patch would be needed that makes dump_file_fprintf > use the pretty-printer code and makes all formats used with > dump_file_fprintf use the pretty-printer formats and only those formats > (and not any printf features not supported by the pretty-printer code). > But it's a useful starting point to facilitate such a cleanup. And > changing "fprintf (dump_file, ...)" to "dump_file_fprintf (...)" should be > a straightforward change to make, given a refactoring tool that will > handle reindenting subsequent lines of arguments to the call to match the > new column of the open-parenthesis (and wrapping if that would make lines > too long). It might be a larger patch than your suggestion of setting a > dump_file local variable from the universe at the start of each function, > but I think it's a better interface. Right, but regardless of what the API looks like, we'd be trading in two global variables (dump_file and dump_flags) for one ("dumper", or whatnot), or using them implicitly behind an API, where presumably we'd need to use TLS in a shared-library build to find them again. > > FWIW I originally came up with a very simple API, but I filed it in the > > "rejected ideas" appendix: > > http://dmalcolm.fedorapeople.org/gcc/global-state/rejected-ideas.html#rejected-idea-dump-if-details > > That rejected idea involves returning a FILE *, which I don't like if the > API for dumping is being rethought. (nods) > > > * For asm_out_file, see dump_file, stdout and stderr above - there should > > > be a well-defined API for writing assembler output and only the > > > implementation of that API should refer directly to this global. > > > > In the plan I suggested using TLS for this in the shared-library build, > > out of a desire to minimize the patch size. > > > > An alternative approach would be to move much of output.h into a class; > > something like this: > > > > class MAYBE_SINGLETON asm_out > > { > > public: > > void putc (char ); > > void puts (const char *); > > void printf (const char *, ...) some_attribute; > > > > /* Most out output.h gets moved to here. */ > > void assemble_zeros (unsigned HOST_WIDE_INT); > > void assemble_align (int); > > void assemble_string (const char *, int); > > void assemble_external_libcall (rtx); > > void assemble_label (FILE *, const char *); > > // etc > > > > private: > > FILE *asm_out_file; > > }; > > > > and have an "asm_out_" in the universe. > > Yes, that's the sort of thing I was thinking of, for much the same reasons > as for dump_file including reducing the amount of code that needs to know > about any portability issues with host printf. > > It so happens at least a large proportion of output to asm_out_file goes > through asm_fprintf, which uses its own kinds of formats anyway rather > than host printf formats - but which takes a FILE * as argument. As far > as I know, that FILE * is always asm_out_file being passed around > manually; certainly, being a method so the FILE * doesn't need passing > around like that would be better. (Every back-end interface that > currently takes a FILE * to use for assembler output would change to take > a class instance instead, I suppose, and the implementations of the > asm_out methods would call such target hooks as needed.) Right, but again, this seems orthogonal to the removal of global variables. If we implemented the above, aren't we trading in one global: FILE *asm_out_file; for another: class asm_writer asm_out; simply moving the problem (when seen from the POV of state-removal). In either case, we can simply make the global into a field of "class universe", and so cleaning up the asm-writing API could happen before or after the state cleanup. Thanks; hope this is constructive Dave