Hello,

I recently started using -flto in my builds. It's a very impressive feature; thanks very much for adding it.

One thing occurred to me while switching over to it: in an LTO world, object files seem to be becoming less relevant, at least for some applications. Since you are already committing to a long build in return for the run-time performance benefit, in a lot of cases it makes sense to go whole hog and just compile everything every time anyway. This comes with a lot of advantages. Besides leaving fewer large files lying around, it simplifies things considerably: I don't need to worry about accidentally linking in an object file that was compiled differently from the rest (different -march, different compiler, etc.), since I am rebuilding from scratch every time. In my use case I do such things a lot, and I find it very freeing to know that I don't need to worry about any state from a previous build.

In any case, the above is some justification for why I think the following feature would be appreciated and used by others as well. It's perhaps a little surprising, or at least disappointing, that this:

    g++ -flto=jobserver *.o

will be parallelized, but this:

    g++ -flto=jobserver *.cpp

will effectively not be. Each .cpp is compiled serially and then the LTO step runs in parallel, but in many cases the first step dominates the build time. It's clear why things are done this way: if users want to parallelize the compile, they are free to do so by naming each object as a separate target in their Makefile and running a parallel make. But this takes some effort to set up, especially if you want to take care to remove the intermediate .o files automatically. Since -flto has already opened the door to gcc providing parallelization features, it seems like it would be nice to enable parallelization more generally, for all parts of the build that could benefit from it.

I took a stab at implementing this. The patch below adds an option -fparallel=(jobserver|N) that works analogously to -flto=, but applies to the whole build. It writes the commands from each spec into a temporary Makefile, with appropriate dependencies, and then runs make to execute it. With the combination -fparallel=X -flto, the LTO side is parallelized as well, as if -flto=jobserver had been specified; the idea is that any downstream tool that can naturally offer parallel features would do so in the presence of the -fparallel switch.

I am sure this is very rough around the edges, since it's my first-ever look at the gcc codebase, but I tried not to make it overly restrictive. I only really have experience with Linux and C++, so I may have inadvertently specialized something to these cases, but I did try to keep it general. Here is a list of potential issues that could be addressed:

-For some jobs there are environment variables set on a per-job basis.
I attempted to identify all of them and came up with COMPILER_PATH, LIBRARY_PATH, and COLLECT_GCC_OPTIONS. This list would need to be kept up to date if others are added.
-The mechanism I used to propagate environment variables (export + unset) is probably specific to the Bourne shell and wouldn't work on other platforms, but some simple platform-specific code could do it right for Windows and others.
-Similarly for -pipe mode, I put pipes into the Makefile recipe, so there may be platforms where this is not the correct syntax.

Anyway, here it is, in case there is any interest in pursuing it further. Thanks for listening...

-Lewis

=============

diff --git gcc/common.opt gcc/common.opt
index 3b8b14d..4417847 100644
--- gcc/common.opt
+++ gcc/common.opt
@@ -1575,6 +1575,10 @@ flto=
 Common RejectNegative Joined Var(flag_lto)
 Link-time optimization with number of parallel jobs or jobserver.
 
+fparallel=
+Common Driver RejectNegative Joined Var(flag_parallel)
+Enable parallel build with number of parallel jobs or jobserver.
+
 Enum
 Name(lto_partition_model) Type(enum lto_partition_model) UnknownError(unknown LTO partitioning model %qs)
 
diff --git gcc/gcc.c gcc/gcc.c
index a5408a4..6f9c1cd 100644
--- gcc/gcc.c
+++ gcc/gcc.c
@@ -1716,6 +1716,73 @@ static int have_c = 0;
 /* Was the option -o passed.  */
 static int have_o = 0;
 
+/* Parallel mode */
+static int parallel = 0;
+static int parallel_ctr = 0;
+static int parallel_sctr = 0;
+static enum {
+  parallel_mode_off,
+  parallel_mode_first_job_in_spec,
+  parallel_mode_continued_spec
+} parallel_mode = parallel_mode_off;
+static bool jobserver = false;
+static FILE* mstream = NULL;
+static const char* makefile = NULL;
+
+/* helper to turn $ -> $$ for make and
+   maybe escape single quotes for the shell.  */
+static void
+mstream_escape_puts (const char* string, bool single_quote)
+{
+  if (single_quote)
+    fputc ('\'', mstream);
+  for (; *string; string++)
+    {
+      if (*string == '$')
+        fputs ("$$", mstream);
+      else if (single_quote && *string == '\'')
+        fputs ("\'\\\'\'", mstream);
+      else
+        fputc (*string, mstream);
+    }
+  if (single_quote)
+    fputc ('\'', mstream);
+}
+
+/* In parallel mode, if environment variables are changing for each job,
+   then we need to store them in the makefile.  */
+static void
+propagate_environment_to_makefile ()
+{
+  static const char *const vars[] = {
+    "COMPILER_PATH",
+    LIBRARY_PATH_ENV,
+    "COLLECT_GCC_OPTIONS",
+  };
+  unsigned int i;
+  for (i = 0; i < sizeof(vars)/sizeof(*vars); i++)
+    {
+      const char *const v = vars[i];
+      const char *const val = getenv(v);
+      fprintf (mstream, "job%d: __environment", parallel_ctr);
+      fputs (i ? "+=" : "=", mstream);
+      if (val == NULL)
+        {
+          fputs ("unset ", mstream);
+          mstream_escape_puts (v, false);
+        }
+      else
+        {
+          mstream_escape_puts (v, false);
+          fputc ('=', mstream);
+          mstream_escape_puts (val, true);
+          fputs ("; export ", mstream);
+          mstream_escape_puts (v, false);
+        }
+      fputs (";\n", mstream);
+    }
+}
+
 /* Pointer to output file name passed in with -o.  */
 static const char *output_file = 0;
 
@@ -1727,6 +1794,7 @@ static struct temp_name {
   const char *suffix;   /* suffix associated with the code.  */
   int length;           /* strlen (suffix).  */
   int unique;           /* Indicates whether %g or %u/%U was used.  */
+  int parallel_sctr;    /* which parallel spec was it for.  */
   const char *filename; /* associated filename.  */
   int filename_length;  /* strlen (filename).  */
   struct temp_name *next;
@@ -2831,6 +2899,39 @@ execute (void)
     }
 #endif
 
+  /* In parallel mode, just update the Makefile and return.  */
+  if (parallel_mode != parallel_mode_off)
+    {
+      parallel_ctr++;
+      fprintf (mstream,
+               ".PHONY: job%d\n" "all: job%d\n",
+               parallel_ctr, parallel_ctr);
+      propagate_environment_to_makefile ();
+      fprintf (mstream, "job%d:", parallel_ctr);
+      if (parallel_mode == parallel_mode_first_job_in_spec)
+        parallel_mode = parallel_mode_continued_spec;
+      else
+        fprintf (mstream, " job%d", parallel_ctr - 1);
+      fputs ("\n\t@+$(__environment)", mstream);
+      /* TODO: if -pipe is in effect, probably this only works on unix-like systems? */
+      for (i = 0; i < n_commands; i++)
+        {
+          if (i)
+            fputs(" |", mstream);
+          const char** arg;
+          for (arg = commands[i].argv; *arg != NULL; arg++)
+            {
+              fputc (' ', mstream);
+              mstream_escape_puts (*arg, true);
+            }
+          if (commands[i].argv[0] != commands[i].prog)
+            free (CONST_CAST (char*, commands[i].argv[0]));
+        }
+      fputc ('\n', mstream);
+      execution_count++;
+      return 0;
+    }
+
   /* Run each piped subprocess.  */
 
   pex = pex_init (PEX_USE_PIPES | ((report_times || report_times_to_file)
@@ -3843,6 +3944,24 @@ driver_handle_option (struct gcc_options *opts,
       handle_foffload_option (arg);
       break;
 
+    case OPT_fparallel_:
+      if (strcmp (arg, "jobserver") == 0)
+        {
+          jobserver = true;
+          parallel = 1;
+        }
+      else
+        {
+          parallel = atoi(arg);
+          if (parallel <= 1)
+            parallel = 0;
+        }
+      /* Downstream tools need jobserver mode since
+         they will be called from our Makefile.  */
+      if (parallel)
+        save_switch ("-fparallel=jobserver", 0, NULL, true, true);
+      return true;
+
     default:
       /* Various driver options need no special processing at this
          point, having been handled in a prescan above or being
@@ -4510,6 +4629,12 @@ do_spec (const char *spec)
 {
   int value;
 
+  if (parallel)
+    {
+      parallel_mode = parallel_mode_first_job_in_spec;
+      parallel_sctr++;
+    }
+
   value = do_spec_2 (spec);
 
   /* Force out any unfinished command.
@@ -4526,6 +4651,8 @@ do_spec (const char *spec)
       value = execute ();
     }
 
+  parallel_mode = parallel_mode_off;
+
   return value;
 }
 
@@ -5135,7 +5262,8 @@ do_spec_1 (const char *spec, int inswitch, const char *soft_matched_part)
             for (t = temp_names; t; t = t->next)
               if (t->length == suffix_length
                   && strncmp (t->suffix, suffix, suffix_length) == 0
-                  && t->unique == (c == 'u' || c == 'U' || c == 'j'))
+                  && t->unique == (c == 'u' || c == 'U' || c == 'j')
+                  && t->parallel_sctr == parallel_sctr)
                 break;
 
             /* Make a new association if needed.  %u and %j
@@ -5161,6 +5289,7 @@ do_spec_1 (const char *spec, int inswitch, const char *soft_matched_part)
                 temp_filename_length = strlen (temp_filename);
                 t->filename = temp_filename;
                 t->filename_length = temp_filename_length;
+                t->parallel_sctr = parallel_sctr;
               }
 
             free (saved_suffix);
@@ -6869,6 +6998,7 @@ class driver
   bool prepare_infiles ();
   void do_spec_on_infiles () const;
   void maybe_run_linker (const char *argv0) const;
+  void maybe_run_make () const;
   void final_actions () const;
   int get_exit_code () const;
 
@@ -6918,6 +7048,7 @@ driver::main (int argc, char **argv)
 
   do_spec_on_infiles ();
   maybe_run_linker (argv[0]);
+  maybe_run_make ();
   final_actions ();
   return get_exit_code ();
 }
@@ -7624,6 +7755,17 @@ driver::prepare_infiles ()
   if (!combine_inputs && have_c && have_o && lang_n_infiles > 1)
     fatal_error ("cannot specify -o with -c, -S or -E with multiple files");
 
+  /* Check if we are using a makefile to implement parallel mode.  */
+  if (parallel)
+    {
+      makefile = make_temp_file (".mk");
+      record_temp_file (makefile, 1, 0);
+      mstream = fopen (makefile, "w");
+      if (mstream == NULL)
+        fatal_error ("failed to open temporary Makefile %s",
+                     makefile);
+    }
+
   /* No early exit needed from main; we can continue.  */
   return false;
 }
@@ -7863,6 +8005,75 @@ driver::maybe_run_linker (const char *argv0) const
           && !(infiles[i].language && infiles[i].language[0] == '*'))
         warning (0, "%s: linker input file unused because linking not done",
                  outfiles[i]);
+
+  /* in parallel mode, add the dependencies for the final link.  */
+  if (parallel_ctr > 1 && linker_was_run)
+    {
+      int j;
+      fprintf (mstream, "job%d:", parallel_ctr);
+      for (j = 1; j < parallel_ctr; j++)
+        fprintf (mstream, " job%d", j);
+      putc('\n', mstream);
+    }
+}
+
+/* in parallel mode, actually do the build now.  */
+void
+driver::maybe_run_make() const
+{
+  char jobs[32];
+  const char *new_argv[6];
+  const char *errmsg;
+  int err = 0;
+  int status = 0;
+
+  if (!parallel) return;
+
+  if (ferror (mstream) != 0
+      || fclose (mstream) != 0)
+    fatal_error ("error writing to Makefile %s", makefile);
+
+  if (!jobserver)
+    {
+      /* Avoid passing --jobserver-fd= and similar flags
+         unless jobserver mode is explicitly enabled.  */
+      putenv (xstrdup ("MAKEFLAGS="));
+      putenv (xstrdup ("MFLAGS="));
+    }
+
+  new_argv[0] = getenv ("MAKE");
+  if (!new_argv[0])
+    new_argv[0] = "make";
+  new_argv[1] = "-f";
+  new_argv[2] = makefile;
+  if (!jobserver)
+    {
+      snprintf (jobs, 31, "-j%d", parallel);
+      new_argv[3] = jobs;
+    }
+  else
+    new_argv[3] = "-j";
+  new_argv[4] = "all";
+  new_argv[5] = NULL;
+
+  errmsg = pex_one (PEX_SEARCH,
+                    new_argv[0],
+                    CONST_CAST(char *const*, new_argv),
+                    new_argv[0],
+                    NULL, NULL, &status, &err);
+  if (errmsg != NULL)
+    {
+      if (err == 0)
+        fatal_error (errmsg);
+      else
+        {
+          errno = err;
+          pfatal_with_name (errmsg);
+        }
+    }
+  if (WIFSIGNALED (status)
+      || (WIFEXITED (status) && WEXITSTATUS (status) >= MIN_FATAL_STATUS))
+    errorcount++;
 }
 
 /* The end of "main".  */
diff --git gcc/lto-wrapper.c gcc/lto-wrapper.c
index f75c0dc..ce50269 100644
--- gcc/lto-wrapper.c
+++ gcc/lto-wrapper.c
@@ -982,6 +982,9 @@ run_gcc (unsigned argc, char *argv[])
           no_partition = true;
           break;
 
+        case OPT_fparallel_:
+          /* Fallthru.  */
+
         case OPT_flto_:
           if (strcmp (option->arg, "jobserver") == 0)
             {
@@ -1239,7 +1242,11 @@ cont:
             {
               fprintf (mstream, "%s:\n\t@%s ", output_name, new_argv[0]);
               for (j = 1; new_argv[j] != NULL; ++j)
-                fprintf (mstream, " '%s'", new_argv[j]);
+                /* don't propagate parallel when we call gcc again,
+                   it is wasteful since we are only giving it
+                   one file.  */
+                if (strcmp (new_argv[j], "-fparallel=jobserver") != 0)
+                  fprintf (mstream, " '%s'", new_argv[j]);
               fprintf (mstream, "\n");
               /* If we are not preserving the ltrans input files then
                  truncate them as soon as we have processed it.  This