Hello- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608
The PR was originally about an ICE in the linemap code that was patched up for GCC 14. I left it open as a reminder to finish improving the locations assigned to macros that are defined prior to including a PCH. This patch provides that improvement... does it look OK please? Bootstrap + regtest all languages on x86-64 and aarch64 (Linux). Thanks! -Lewis -- >8 -- It is permissible to define macros prior to including a PCH, as long as these definitions are disjoint from or identical to the macros in the PCH. The PCH loading process replaces all libcpp data structures with those from the PCH, so it is necessary to remember the extra macros separately and then restore them after loading the PCH, which all is handled by cpp_save_state() and cpp_read_state() in libcpp/pch.cc. The restoration process consists of pushing a buffer containing the macro definition and then lexing it from there, similar to how a command-line -D option is processed. The current implementation does not attempt to set up the line_map for this process, and so the locations assigned to the macros are often not meaningful. (Similar to what happened in the past with lexing the tokens out of a _Pragma string, lexing out of a buffer rather than a file produces "sorta" reasonable locations that are often close enough, but not reliably correct.) Fix that up by remembering enough additional information (more or less, an expanded_location for each macro definition) to produce a reasonable location for the newly restored macros. One issue that came up is the treatment of command-line-defined macros. From the perspective of the generic line_map data structures, the command-line location is not distinguishable from other locations; it's just an ordinary location created by the front ends with a fake file name by convention. (At the moment, it is always the string `<command-line>', subject to translation.) Since libcpp needs to assign macros to that location, it needs to know what location to use, so I added a new member line_maps::cmdline_location for the front ends to set, similar to how line_maps::builtin_location is handled. This revealed a small issue, in c-opts.cc we have: /* All command line defines must have the same location. */ cpp_force_token_locations (parse_in, line_table->highest_line); But contrary to the comment, all command line defines don't actually end up with the same location anymore. This is because libcpp/lex.cc has been expanded (r6-4873) to include range information on the returned locations. That logic has never been respecting the request of cpp_force_token_locations. I believe this was not intentional, and so I have corrected that here. Prior to this patch, the range logic has been leading to command-line macros all having similar locations in the same line map (or ad-hoc locations based from there for sufficiently long tokens); with this change, they all have exactly the same location and that location is recorded in line_maps::cmdline_location. With that change, then it works fine for pch.cc to restore macros whether they came from the command-line or from the main file. gcc/c-family/ChangeLog: PR preprocessor/105608 * c-opts.cc (c_finish_options): Set new member line_table->cmdline_location. * c-pch.cc (c_common_read_pch): Adapt linemap usage to changes in libcpp pch.cc; it is now possible that the linemap is in a different file after returning from cpp_read_state(). libcpp/ChangeLog: PR preprocessor/105608 * include/line-map.h: Add new member CMDLINE_LOCATION. * lex.cc (get_location_for_byte_range_in_cur_line): Do not expand the token location to include range information if token location override was requested. (warn_about_normalization): Likewise. (_cpp_lex_direct): Likewise. * pch.cc (struct saved_macro): New local struct. (struct save_macro_data): Change DEFNS vector to hold saved_macro rather than uchar*. (save_macros): Adapt to remember the location information for each saved macro in addition to the definition. (cpp_prepare_state): Likewise. (cpp_read_state): Use the saved location information to generate proper locations for the restored macros. gcc/testsuite/ChangeLog: PR preprocessor/105608 * g++.dg/pch/line-map-3.C: Remove xfails. * g++.dg/pch/line-map-4.C: New test. * g++.dg/pch/line-map-4.Hs: New test. --- libcpp/include/line-map.h | 4 + libcpp/lex.cc | 6 +- libcpp/pch.cc | 125 ++++++++++++++++++++----- gcc/c-family/c-opts.cc | 1 + gcc/c-family/c-pch.cc | 12 +-- gcc/testsuite/g++.dg/pch/line-map-3.C | 26 +---- gcc/testsuite/g++.dg/pch/line-map-4.C | 5 + gcc/testsuite/g++.dg/pch/line-map-4.Hs | 1 + 8 files changed, 130 insertions(+), 50 deletions(-) create mode 100644 gcc/testsuite/g++.dg/pch/line-map-4.C create mode 100644 gcc/testsuite/g++.dg/pch/line-map-4.Hs diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h index 21a59af2236..b4560cf3737 100644 --- a/libcpp/include/line-map.h +++ b/libcpp/include/line-map.h @@ -878,6 +878,10 @@ public: built-in tokens. */ location_t builtin_location; + /* The special location value to be used for tokens originating on the + command line. */ + location_t cmdline_location; + /* The default value of range_bits in ordinary line maps. */ unsigned int default_range_bits; diff --git a/libcpp/lex.cc b/libcpp/lex.cc index e7705a64f39..2474d028cf8 100644 --- a/libcpp/lex.cc +++ b/libcpp/lex.cc @@ -1353,6 +1353,8 @@ get_location_for_byte_range_in_cur_line (cpp_reader *pfile, const unsigned char *const start, size_t num_bytes) { + if (pfile->forced_token_location) + return pfile->forced_token_location; gcc_checking_assert (num_bytes > 0); /* CPP_BUF_COLUMN and linemap_position_for_column both refer @@ -2035,6 +2037,7 @@ warn_about_normalization (cpp_reader *pfile, /* If possible, create a location range for the token. */ if (loc >= RESERVED_LOCATION_COUNT && token->type != CPP_EOF + && !pfile->forced_token_location /* There must be no line notes to process. */ && (!(pfile->buffer->cur >= pfile->buffer->notes[pfile->buffer->cur_note].pos @@ -4401,7 +4404,8 @@ _cpp_lex_direct (cpp_reader *pfile) /* Potentially convert the location of the token to a range. */ if (result->src_loc >= RESERVED_LOCATION_COUNT - && result->type != CPP_EOF) + && result->type != CPP_EOF + && !pfile->forced_token_location) { /* Ensure that any line notes are processed, so that we have the correct physical line/column for the end-point of the token even diff --git a/libcpp/pch.cc b/libcpp/pch.cc index fa5efc9f102..495fe7f434b 100644 --- a/libcpp/pch.cc +++ b/libcpp/pch.cc @@ -730,9 +730,20 @@ cpp_valid_state (cpp_reader *r, const char *name, int fd) /* Save all the existing macros. */ +namespace { + +struct saved_macro +{ + uchar *defn; + location_t loc; + expanded_location xloc; +}; + +} /* unnamed namespace */ + struct save_macro_data { - uchar **defns; + saved_macro *defns; size_t count; size_t array_size; char **saved_pragmas; @@ -762,14 +773,29 @@ save_macros (cpp_reader *r, cpp_hashnode *h, void *data_p) if (data->count == data->array_size) { data->array_size *= 2; - data->defns = XRESIZEVEC (uchar *, data->defns, (data->array_size)); + data->defns = XRESIZEVEC (saved_macro, data->defns, data->array_size); } const uchar * defn = cpp_macro_definition (r, h); size_t defnlen = ustrlen (defn); - data->defns[data->count] = (uchar *) xmemdup (defn, defnlen, defnlen + 2); - data->defns[data->count][defnlen] = '\n'; + const auto d = data->defns + data->count; + d->defn = (uchar *) xmemdup (defn, defnlen, defnlen + 2); + d->defn[defnlen] = '\n'; + d->loc = h->value.macro->line; + d->xloc = {}; + if (d->loc == r->line_table->cmdline_location) + { + /* The cmdline_location may be different when it comes time to + restore the macros, so use 0 to indicate this. */ + d->loc = 0; + } + else if (d->loc != r->line_table->builtin_location) + { + const auto map = linemap_lookup (r->line_table, d->loc); + gcc_assert (map); + d->xloc = linemap_expand_location (r->line_table, map, d->loc); + } data->count++; } @@ -785,9 +811,22 @@ cpp_prepare_state (cpp_reader *r, struct save_macro_data **data) struct save_macro_data *d = XNEW (struct save_macro_data); d->array_size = 512; - d->defns = XNEWVEC (uchar *, d->array_size); + d->defns = XNEWVEC (saved_macro, d->array_size); d->count = 0; cpp_forall_identifiers (r, save_macros, d); + + /* Sort the saved macros in order, so the line map gets built up + in chronological order as expected when we load them back in. */ + static line_maps *line_table; + line_table = r->line_table; + qsort (d->defns, d->count, sizeof (saved_macro), + [] (const void *x1, const void *x2) + { + const auto d1 = static_cast<const saved_macro *> (x1); + const auto d2 = static_cast<const saved_macro *> (x2); + return -linemap_compare_locations (line_table, d1->loc, d2->loc); + }); + d->saved_pragmas = _cpp_save_pragma_names (r); *data = d; } @@ -820,16 +859,16 @@ cpp_read_state (cpp_reader *r, const char *name, FILE *f, r->state.prevent_expansion = 1; r->state.angled_headers = 0; + const auto old_directive_line = r->directive_line; + /* Run through the carefully-saved macros, insert them. */ + const char *cur_fname = nullptr; for (i = 0; i < data->count; i++) { - cpp_hashnode *h; - size_t namelen; - uchar *defn; - - namelen = ustrcspn (data->defns[i], "( \n"); - h = cpp_lookup (r, data->defns[i], namelen); - defn = data->defns[i] + namelen; + const auto d = data->defns + i; + const size_t namelen = ustrcspn (d->defn, "( \n"); + cpp_hashnode *const h = cpp_lookup (r, d->defn, namelen); + const auto defn = d->defn + namelen; /* The PCH file is valid, so we know that if there is a definition from the PCH file it must be the same as the one we had @@ -839,28 +878,72 @@ cpp_read_state (cpp_reader *r, const char *name, FILE *f, if (cpp_push_buffer (r, defn, ustrchr (defn, '\n') - defn, true) != NULL) { - _cpp_clean_line (r); + if (d->loc == r->line_table->builtin_location) + r->directive_line = r->line_table->builtin_location; + else if (d->loc == 0) + /* This was a command-line-defined macro, preserve that + aspect now. */ + r->directive_line = r->line_table->cmdline_location; + else + { + /* In practice, almost all of the locations ought to be in the + main file, since a PCH cannot be loaded after including + other headers. The (probably sole) exception is that + implicit preincludes can be loaded, so if an implicit + preinclude such as /usr/include/stdc-predef.h has changed + since the PCH was created (in a way not invalidating the + PCH), we can end up here with locations that are not in the + main file. In that case, we may be adding them to the + line_map out of order (the implicit preinclude already + being represented in the line_map we have loaded from the + PCH), but it is probably not going to cause any observable + issue given the constraints on what can appear in a header + before loading a PCH. */ + if (!cur_fname || strcmp (cur_fname, d->xloc.file)) + { + linemap_add (r->line_table, LC_RENAME, d->xloc.sysp, + d->xloc.file, d->xloc.line); + cur_fname = d->xloc.file; + } + r->directive_line = linemap_line_start (r->line_table, + d->xloc.line, 127); + } + + /* Even when the macro was defined in the main file, we have to + use cpp_force_token_locations here, so all tokens in the macro + will have location assigned to the directive line with no + column information. We no longer know even which line the + tokens appeared on originally now, much less the column, since + we only stored the macro definition string after lexing it. */ + cpp_force_token_locations (r, r->directive_line); - /* ??? Using r->line_table->highest_line is not ideal here, but we - do need to use some location that is relative to the new line - map just loaded, not the old one that was in effect when these - macros were lexed. The proper fix is to remember the file name - and line number where each macro was defined, and then add - these locations into the new line map. See PR105608. */ - if (!_cpp_create_definition (r, h, r->line_table->highest_line)) + _cpp_clean_line (r); + if (!_cpp_create_definition (r, h, r->directive_line)) abort (); + cpp_stop_forcing_token_locations (r); _cpp_pop_buffer (r); + + if (d->loc < RESERVED_LOCATION_COUNT) + /* Macros defined on the command line or as builtins are always + implicitly used. It is necessary to note this here or else + -Wunused-macros will complain, because these locations do + return TRUE for MAIN_FILE_P(). For a command line + definition, d->loc was set to 0 when we saved it, so it does + satisfy the check performed here. */ + h->value.macro->used = true; } else abort (); } - free (data->defns[i]); + free (data->defns[i].defn); } r->state = old_state; + r->directive_line = old_directive_line; _cpp_restore_pragma_names (r, data->saved_pragmas); + XDELETEVEC (data->defns); free (data); if (deps_restore (r->deps, f, CPP_OPTION (r, restore_pch_deps) ? name : NULL) diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc index c652e82a8c7..f8011071103 100644 --- a/gcc/c-family/c-opts.cc +++ b/gcc/c-family/c-opts.cc @@ -1681,6 +1681,7 @@ c_finish_options (void) bool cxx_assert_seen_p = false; /* All command line defines must have the same location. */ + line_table->cmdline_location = line_table->highest_line; cpp_force_token_locations (parse_in, line_table->highest_line); for (size_t i = 0; i < deferred_count; i++) { diff --git a/gcc/c-family/c-pch.cc b/gcc/c-family/c-pch.cc index b8f075e29f7..8f0fea82cad 100644 --- a/gcc/c-family/c-pch.cc +++ b/gcc/c-family/c-pch.cc @@ -347,18 +347,18 @@ c_common_read_pch (cpp_reader *pfile, const char *name, rebuild_location_adhoc_htab (line_table); line_table->trace_includes = saved_trace_includes; - /* Set the current location to the line containing the #include (or the - #pragma GCC pch_preprocess) for the purpose of assigning locations to any - macros that are about to be restored. */ - linemap_add (line_table, LC_ENTER, 0, saved_loc.file, - saved_loc.line > 1 ? saved_loc.line - 1 : saved_loc.line); + /* Set the line_map current location to the start of the file, so that things + remain in order after cpp_read_state() re-adds any macros that were defined + prior to calling gt_pch_restore(). */ + linemap_add (line_table, LC_ENTER, saved_loc.sysp, saved_loc.file, 0); timevar_push (TV_PCH_CPP_RESTORE); cpp_result = cpp_read_state (pfile, name, f, smd); /* Set the current location to the line following the #include, where we were prior to processing the PCH. */ - linemap_line_start (line_table, saved_loc.line, 0); + linemap_add (line_table, LC_RENAME, saved_loc.sysp, saved_loc.file, + saved_loc.line); timevar_pop (TV_PCH_CPP_RESTORE); diff --git a/gcc/testsuite/g++.dg/pch/line-map-3.C b/gcc/testsuite/g++.dg/pch/line-map-3.C index 3390d7adba2..9def08de974 100644 --- a/gcc/testsuite/g++.dg/pch/line-map-3.C +++ b/gcc/testsuite/g++.dg/pch/line-map-3.C @@ -1,23 +1,5 @@ -#define UNUSED_MACRO /* { dg-error "UNUSED_MACRO" "" { xfail *-*-* } } */ -#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" { xfail *-*-* } } */ - -/* { dg-do compile } */ -/* { dg-additional-options "-Werror=unused-macros" } */ - /* PR preprocessor/105608 */ -/* This test case is currently xfailed and requires work in libcpp/pch.cc to - resolve. Currently, the macro location is incorrectly assigned to line 2 - of the header file when read via PCH, because libcpp doesn't try to - assign locations relative to the newly loaded line map after restoring - the PCH. */ - -/* In PCH mode we also complain incorrectly about the command line macro -Dwith_PCH - added by dejagnu; that warning would get suppressed if the macro location were - correctly restored by libcpp to reflect that it was a command line macro. */ -/* { dg-bogus "-:with_PCH" "" { xfail *-*-* } 2 } */ - -/* The reason we used -Werror was to prevent pch.exp from rerunning without PCH; - in that case we would get unnecessary XPASS outputs since the test does work - fine without PCH. Once the bug is fixed, remove the -Werror and switch to - dg-warning. */ -/* { dg-regexp {[^[:space:]]*: some warnings being treated as errors} } */ +/* { dg-do compile } */ +/* { dg-additional-options "-Wunused-macros" } */ +#define UNUSED_MACRO /* { dg-warning "-:UNUSED_MACRO" "" } */ +#include "line-map-3.H" /* { dg-bogus "-:UNUSED_MACRO" "" } */ diff --git a/gcc/testsuite/g++.dg/pch/line-map-4.C b/gcc/testsuite/g++.dg/pch/line-map-4.C new file mode 100644 index 00000000000..cc13a467ca1 --- /dev/null +++ b/gcc/testsuite/g++.dg/pch/line-map-4.C @@ -0,0 +1,5 @@ +/* PR preprocessor/105608 */ +/* { dg-do compile } */ +#define INT int /* { dg-error "-:declaration does not declare anything" } */ +#include "line-map-4.H" +INT; /* { dg-note "in expansion of macro" } */ diff --git a/gcc/testsuite/g++.dg/pch/line-map-4.Hs b/gcc/testsuite/g++.dg/pch/line-map-4.Hs new file mode 100644 index 00000000000..3b6178bfae0 --- /dev/null +++ b/gcc/testsuite/g++.dg/pch/line-map-4.Hs @@ -0,0 +1 @@ +/* This space intentionally left blank. */