Well, I guess it's safe to say this did not generate resounding interest :-). Just thought I would check once more if anyone thought it was a worthwhile thing to pursue, and/or had any feedback on the attempt at implementing it. FWIW I have been using this myself for a while now and enjoy it. Thanks!
-Lewis

On Wed, Dec 17, 2014 at 1:12 PM, Lewis Hyatt <lhy...@gmail.com> wrote:
> Hello-
>
> I recently started using -flto in my builds, it's a very impressive
> feature, thanks very much for adding it. One thing that occurred to me
> while switching over to using it: In an LTO world, the object files,
> it seems to me, are becoming increasingly less relevant, at least for
> some applications. Since you are already committing to the build
> taking a long time, in return for the run-time performance benefit, it
> makes sense in a lot of cases to go whole-hog and just compile
> everything every time anyway. This comes with a lot of advantages,
> besides fewer large files laying around, it simplifies things a lot,
> say I don't need to worry about accidentally linking in an object file
> compiled differently vs the rest (different -march, different
> compiler, etc.), since I am just rebuilding from scratch every time.
> In my use case, I do such things a lot, and find it very freeing to
> know I don't need to worry about any state from a previous build.
>
> In any case, the above was some justification for why I think the
> following feature would be appreciated and used by others as well.
> It's perhaps a little surprising, or at least disappointing, that
> this:
>
> g++ -flto=jobserver *.o
>
> will be parallelized, but this:
>
> g++ -flto=jobserver *.cpp
>
> will effectively not be; each .cpp is compiled serially, then the LTO
> runs in parallel, but in many cases the first step dominates the build
> time. Now it's clear why things are done this way, if the user wants
> to parallelize the compile, they are free to do so by just naming each
> object as a separate target in their Makefile and running a parallel
> make. But this takes some effort to set up, especially if you want to
> take care to remove the intermediate .o files automatically, and since
> -flto has already opened the door to gcc providing parallelization
> features, it seems like it would be nice to enable parallelizing more
> generally, for all parts of the build that could benefit from it.
>
> I took a stab at implementing this. The below patch adds an option
> -fparallel=(jobserver|N) that works analogously to -flto=, but applies
> to the whole build. It generates a Makefile from each spec, with
> appropriate dependencies, and then runs make to execute it. The
> combination -fparallel=X -flto will also be parallelized on the lto
> side as well, as if -flto=jobserver were specified; the idea would be
> any downstream tool that could naturally offer parallel features would
> do so in the presence of the -fparallel switch.
>
> I am sure this must be very rough around the edges, it's my first-ever
> look at the gcc codebase, but I tried not to make it overly
> restrictive. I only really have experience with Linux and C++ so I may
> have inadvertently specialized something to these cases, but I did try
> to keep it general. Here is a list of potential issues that could be
> addressed:
>
> -For some jobs there are environment variables set on a per-job basis.
> I attempted to identify all of them and came up with COMPILER_PATH,
> LIBRARY_PATH, and COLLECT_GCC_OPTIONS. This would need to be kept up
> to date if others are added.
>
> -The mechanism I used to propagate environment variables (export +
> unset) is probably specific to the Bourne shell and wouldn't work on
> other platforms, but there would be some simple platform-specific code
> to do it right for Windows and others.
>
> -Similarly for -pipe mode, I put pipes into the Makefile recipe, so
> there may be platforms where this is not the correct syntax.
>
> Anyway, here it is, in case there is any interest to pursue it
> further. Thanks for listening...
>
> -Lewis
>
> =============
>
> diff --git gcc/common.opt gcc/common.opt
> index 3b8b14d..4417847 100644
> --- gcc/common.opt
> +++ gcc/common.opt
> @@ -1575,6 +1575,10 @@ flto=
>  Common RejectNegative Joined Var(flag_lto)
>  Link-time optimization with number of parallel jobs or jobserver.
>
> +fparallel=
> +Common Driver RejectNegative Joined Var(flag_parallel)
> +Enable parallel build with number of parallel jobs or jobserver.
> +
>  Enum
>  Name(lto_partition_model) Type(enum lto_partition_model)
>  UnknownError(unknown LTO partitioning model %qs)
>
> diff --git gcc/gcc.c gcc/gcc.c
> index a5408a4..6f9c1cd 100644
> --- gcc/gcc.c
> +++ gcc/gcc.c
> @@ -1716,6 +1716,73 @@ static int have_c = 0;
>  /* Was the option -o passed. */
>  static int have_o = 0;
>
> +/* Parallel mode */
> +static int parallel = 0;
> +static int parallel_ctr = 0;
> +static int parallel_sctr = 0;
> +static enum {
> +  parallel_mode_off,
> +  parallel_mode_first_job_in_spec,
> +  parallel_mode_continued_spec
> +} parallel_mode = parallel_mode_off;
> +static bool jobserver = false;
> +static FILE* mstream = NULL;
> +static const char* makefile = NULL;
> +
> +/* helper to turn $ -> $$ for make and
> +   maybe escape single quotes for the shell. */
> +static void
> +mstream_escape_puts (const char* string, bool single_quote)
> +{
> +  if (single_quote)
> +    fputc ('\'', mstream);
> +  for (; *string; string++)
> +    {
> +      if (*string == '$')
> +        fputs ("$$", mstream);
> +      else if (single_quote && *string == '\'')
> +        fputs ("\'\\\'\'", mstream);
> +      else
> +        fputc (*string, mstream);
> +    }
> +  if (single_quote)
> +    fputc ('\'', mstream);
> +}
> +
> +/* In parallel mode, if environment variables are changing for each job,
> +   then we need to store them in the makefile. */
> +static void
> +propagate_environment_to_makefile ()
> +{
> +  static const char *const vars[] = {
> +    "COMPILER_PATH",
> +    LIBRARY_PATH_ENV,
> +    "COLLECT_GCC_OPTIONS",
> +  };
> +  unsigned int i;
> +  for (i = 0; i < sizeof(vars)/sizeof(*vars); i++)
> +    {
> +      const char *const v = vars[i];
> +      const char *const val = getenv(v);
> +      fprintf (mstream, "job%d: __environment", parallel_ctr);
> +      fputs (i ? "+=" : "=", mstream);
> +      if (val == NULL)
> +        {
> +          fputs ("unset ", mstream);
> +          mstream_escape_puts (v, false);
> +        }
> +      else
> +        {
> +          mstream_escape_puts (v, false);
> +          fputc ('=', mstream);
> +          mstream_escape_puts (val, true);
> +          fputs ("; export ", mstream);
> +          mstream_escape_puts (v, false);
> +        }
> +      fputs (";\n", mstream);
> +    }
> +}
> +
>  /* Pointer to output file name passed in with -o. */
>  static const char *output_file = 0;
>
> @@ -1727,6 +1794,7 @@ static struct temp_name {
>    const char *suffix; /* suffix associated with the code. */
>    int length; /* strlen (suffix). */
>    int unique; /* Indicates whether %g or %u/%U was used. */
> +  int parallel_sctr; /* which parallel spec was it for. */
>    const char *filename; /* associated filename. */
>    int filename_length; /* strlen (filename). */
>    struct temp_name *next;
> @@ -2831,6 +2899,39 @@ execute (void)
>      }
>  #endif
>
> +  /* In parallel mode, just update the Makefile and return. */
> +  if (parallel_mode != parallel_mode_off)
> +    {
> +      parallel_ctr++;
> +      fprintf (mstream,
> +               ".PHONY: job%d\n" "all: job%d\n",
> +               parallel_ctr, parallel_ctr);
> +      propagate_environment_to_makefile ();
> +      fprintf (mstream, "job%d:", parallel_ctr);
> +      if (parallel_mode == parallel_mode_first_job_in_spec)
> +        parallel_mode = parallel_mode_continued_spec;
> +      else
> +        fprintf (mstream, " job%d", parallel_ctr - 1);
> +      fputs ("\n\t@+$(__environment)", mstream);
> +      /* TODO: if -pipe is in effect, probably this only works on
> unix-like systems? */
> +      for (i = 0; i < n_commands; i++)
> +        {
> +          if (i)
> +            fputs(" |", mstream);
> +          const char** arg;
> +          for (arg = commands[i].argv; *arg != NULL; arg++)
> +            {
> +              fputc (' ', mstream);
> +              mstream_escape_puts (*arg, true);
> +            }
> +          if (commands[i].argv[0] != commands[i].prog)
> +            free (CONST_CAST (char*, commands[i].argv[0]));
> +        }
> +      fputc ('\n', mstream);
> +      execution_count++;
> +      return 0;
> +    }
> +
>    /* Run each piped subprocess. */
>
>    pex = pex_init (PEX_USE_PIPES | ((report_times || report_times_to_file)
> @@ -3843,6 +3944,24 @@ driver_handle_option (struct gcc_options *opts,
>        handle_foffload_option (arg);
>        break;
>
> +    case OPT_fparallel_:
> +      if (strcmp (arg, "jobserver") == 0)
> +        {
> +          jobserver = true;
> +          parallel = 1;
> +        }
> +      else
> +        {
> +          parallel = atoi(arg);
> +          if (parallel <= 1)
> +            parallel = 0;
> +        }
> +      /* Downstream tools need jobserver mode since
> +         they will be called from our Makefile. */
> +      if (parallel)
> +        save_switch ("-fparallel=jobserver", 0, NULL, true, true);
> +      return true;
> +
>      default:
>        /* Various driver options need no special processing at this
>           point, having been handled in a prescan above or being
> @@ -4510,6 +4629,12 @@ do_spec (const char *spec)
>  {
>    int value;
>
> +  if (parallel)
> +    {
> +      parallel_mode = parallel_mode_first_job_in_spec;
> +      parallel_sctr++;
> +    }
> +
>    value = do_spec_2 (spec);
>
>    /* Force out any unfinished command.
> @@ -4526,6 +4651,8 @@ do_spec (const char *spec)
>        value = execute ();
>      }
>
> +  parallel_mode = parallel_mode_off;
> +
>    return value;
>  }
>
> @@ -5135,7 +5262,8 @@ do_spec_1 (const char *spec, int inswitch, const
> char *soft_matched_part)
>              for (t = temp_names; t; t = t->next)
>                if (t->length == suffix_length
>                    && strncmp (t->suffix, suffix, suffix_length) == 0
> -                  && t->unique == (c == 'u' || c == 'U' || c == 'j'))
> +                  && t->unique == (c == 'u' || c == 'U' || c == 'j')
> +                  && t->parallel_sctr == parallel_sctr)
>                  break;
>
>              /* Make a new association if needed. %u and %j
> @@ -5161,6 +5289,7 @@ do_spec_1 (const char *spec, int inswitch, const
> char *soft_matched_part)
>                  temp_filename_length = strlen (temp_filename);
>                  t->filename = temp_filename;
>                  t->filename_length = temp_filename_length;
> +                t->parallel_sctr = parallel_sctr;
>                }
>
>              free (saved_suffix);
> @@ -6869,6 +6998,7 @@ class driver
>    bool prepare_infiles ();
>    void do_spec_on_infiles () const;
>    void maybe_run_linker (const char *argv0) const;
> +  void maybe_run_make () const;
>    void final_actions () const;
>    int get_exit_code () const;
>
> @@ -6918,6 +7048,7 @@ driver::main (int argc, char **argv)
>
>    do_spec_on_infiles ();
>    maybe_run_linker (argv[0]);
> +  maybe_run_make ();
>    final_actions ();
>    return get_exit_code ();
>  }
> @@ -7624,6 +7755,17 @@ driver::prepare_infiles ()
>    if (!combine_inputs && have_c && have_o && lang_n_infiles > 1)
>      fatal_error ("cannot specify -o with -c, -S or -E with multiple files");
>
> +  /* Check if we are using a makefile to implement parallel mode. */
> +  if (parallel)
> +    {
> +      makefile = make_temp_file (".mk");
> +      record_temp_file (makefile, 1, 0);
> +      mstream = fopen (makefile, "w");
> +      if (mstream == NULL)
> +        fatal_error ("failed to open temporary Makefile %s",
> +                     makefile);
> +    }
> +
>    /* No early exit needed from main; we can continue. */
>    return false;
>  }
> @@ -7863,6 +8005,75 @@ driver::maybe_run_linker (const char *argv0) const
>          && !(infiles[i].language && infiles[i].language[0] == '*'))
>        warning (0, "%s: linker input file unused because linking not done",
>                 outfiles[i]);
> +
> +  /* in parallel mode, add the dependencies for the final link. */
> +  if (parallel_ctr > 1 && linker_was_run)
> +    {
> +      int j;
> +      fprintf (mstream, "job%d:", parallel_ctr);
> +      for (j = 1; j < parallel_ctr; j++)
> +        fprintf (mstream, " job%d", j);
> +      putc('\n', mstream);
> +    }
> +}
> +
> +/* in parallel mode, actually do the build now. */
> +void
> +driver::maybe_run_make() const
> +{
> +  char jobs[32];
> +  const char *new_argv[6];
> +  const char *errmsg;
> +  int err = 0;
> +  int status = 0;
> +
> +  if (!parallel) return;
> +
> +  if (ferror (mstream) != 0
> +      || fclose (mstream) != 0)
> +    fatal_error ("error writing to Makefile %s", makefile);
> +
> +  if (!jobserver)
> +    {
> +      /* Avoid passing --jobserver-fd= and similar flags
> +         unless jobserver mode is explicitly enabled. */
> +      putenv (xstrdup ("MAKEFLAGS="));
> +      putenv (xstrdup ("MFLAGS="));
> +    }
> +
> +  new_argv[0] = getenv ("MAKE");
> +  if (!new_argv[0])
> +    new_argv[0] = "make";
> +  new_argv[1] = "-f";
> +  new_argv[2] = makefile;
> +  if (!jobserver)
> +    {
> +      snprintf (jobs, 31, "-j%d", parallel);
> +      new_argv[3] = jobs;
> +    }
> +  else
> +    new_argv[3] = "-j";
> +  new_argv[4] = "all";
> +  new_argv[5] = NULL;
> +
> +  errmsg = pex_one (PEX_SEARCH,
> +                    new_argv[0],
> +                    CONST_CAST(char *const*, new_argv),
> +                    new_argv[0],
> +                    NULL, NULL, &status, &err);
> +  if (errmsg != NULL)
> +    {
> +      if (err == 0)
> +        fatal_error (errmsg);
> +      else
> +        {
> +          errno = err;
> +          pfatal_with_name (errmsg);
> +        }
> +    }
> +  if (WIFSIGNALED (status)
> +      || (WIFEXITED (status) && WEXITSTATUS (status) >= MIN_FATAL_STATUS))
> +    errorcount++;
>  }
>
>  /* The end of "main". */
> diff --git gcc/lto-wrapper.c gcc/lto-wrapper.c
> index f75c0dc..ce50269 100644
> --- gcc/lto-wrapper.c
> +++ gcc/lto-wrapper.c
> @@ -982,6 +982,9 @@ run_gcc (unsigned argc, char *argv[])
>            no_partition = true;
>            break;
>
> +        case OPT_fparallel_:
> +          /* Fallthru. */
> +
>          case OPT_flto_:
>            if (strcmp (option->arg, "jobserver") == 0)
>              {
> @@ -1239,7 +1242,11 @@ cont:
>              {
>                fprintf (mstream, "%s:\n\t@%s ", output_name, new_argv[0]);
>                for (j = 1; new_argv[j] != NULL; ++j)
> -                fprintf (mstream, " '%s'", new_argv[j]);
> +                /* don't propagate parallel when we call gcc again,
> +                   it is wasteful since we are only giving it
> +                   one file. */
> +                if (strcmp (new_argv[j], "-fparallel=jobserver") != 0)
> +                  fprintf (mstream, " '%s'", new_argv[j]);
>                fprintf (mstream, "\n");
>                /* If we are not preserving the ltrans input files then
>                   truncate them as soon as we have processed it. This