> Am 08.09.2022 um 00:05 schrieb Lewis Hyatt via Gcc-patches > <gcc-patches@gcc.gnu.org>: > > The function rebuild_location_adhoc_htab() was meant to reconstruct the > adhoc location hash map after restoring a line_maps instance from a > PCH. However, the function has never performed as intended because it > missed the last step of adding the data into the newly reconstructed hash > map. This patch fixes that. > > It does not seem possible to construct a test case such that the current > incorrect behavior is observable as a compiler issue. It would be > observable, if it were possible for a precompiled header to contain an > adhoc location with a non-zero custom data pointer. But currently, such > data pointers are used only by the middle end to track inlining > information, and this happens later, too late to show up in a PCH. > > I also noted that location_adhoc_data_update, which updates the hash map > pointers in a different scenario, was relying on undefined pointer > arithmetic behavior. I'm not aware of this having caused any issue in > practice, but in this patch I have also changed it to use defined pointer > operations instead. Ok. Thanks, Richard > libcpp/ChangeLog: > > * line-map.cc (location_adhoc_data_update): Remove reliance on > undefined behavior. > (get_combined_adhoc_loc): Likewise. > (rebuild_location_adhoc_htab): Fix issue where the htab was not > properly updated. > --- > > Notes: > Hello- > > While working on something unrelated in line-map.cc, I noticed that the > function rebuild_location_adhoc_htab(), whose job is to reconstruct the > adhoc data hash table after a line_maps instance is reconstructed from PCH, > doesn't actually rebuild the hash table at all: > > void > rebuild_location_adhoc_htab (line_maps *set) > { > unsigned i; > set->location_adhoc_data_map.htab = > htab_create (100, location_adhoc_data_hash, location_adhoc_data_eq, > NULL); > for (i = 0; i < set->location_adhoc_data_map.curr_loc; i++) > htab_find_slot (set->location_adhoc_data_map.htab, > set->location_adhoc_data_map.data + i, INSERT); > ^^^^^^^^^^^^^^ > } > > In order to have the intended effect, it needs to set the return value of > htab_find_slot to be set->location_adhoc_data_map.data + i, otherwise it > doesn't effectively do anything except make the hash table think it has > curr_loc elements set, when in fact it has 0. Subsequent calls to > htab_traverse, for instance, will do nothing, and any lookups will also > fail. > > I tried for some time to construct a test case that would demonstrate an > observable consequence of this issue, but I don't think it's possible... > The > nontrivial uses of this hash map are in the middle end (e.g. to produce the > trace of where a given expression was inlined from), and all this happens > after PCH was read in, and doesn't require any state to be read from the > PCH. It would become apparent in the future, however, if the ability to > attach arbitrary data to an adhoc location were used in other ways, perhaps > somewhere in libcpp; in that hypothetical case, the data would be lost when > reading back in the PCH. > > There is another kinda related function, location_adhoc_data_update, which > updates all the pointers in the hash map whenever they are invalidated. It > seems to me, that this function invokes undefined behavior, since it adds > an > arbitrary offset to the pointers, which do not necessarily point into the > same array after they were realloced. I don't think it's led to any > problems > in practice but in this patch I also changed that to use well-defined > operations. Note sure how people may feel about that, since it does require > on the surface 2x as many operations with this change, but I can't see how > the current approach is guaranteed to be valid on all architectures? > > Bootstrap + regtest looks good on x86-64 Linux. Thanks a lot to whoever may > have time to take a look at it. > > -Lewis > > libcpp/line-map.cc | 41 +++++++++++++++++++++++++++-------------- > 1 file changed, 27 insertions(+), 14 deletions(-) > > diff --git a/libcpp/line-map.cc b/libcpp/line-map.cc > index 62077c3857c..391f1d4bbc1 100644 > --- a/libcpp/line-map.cc > +++ b/libcpp/line-map.cc > @@ -85,27 +85,38 @@ location_adhoc_data_eq (const void *l1, const void *l2) > && lb1->data == lb2->data); > } > > -/* Update the hashtable when location_adhoc_data is reallocated. */ > +/* Update the hashtable when location_adhoc_data_map::data is reallocated. > + The param is an array of two pointers, the previous value of the data > + pointer, and then the new value. The pointers stored in the hash map > + are then rebased to be relative to the new data pointer instead of the > + old one. */ > > static int > -location_adhoc_data_update (void **slot, void *data) > +location_adhoc_data_update (void **slot_v, void *param_v) > { > - *((char **) slot) > - = (char *) ((uintptr_t) *((char **) slot) + *((ptrdiff_t *) data)); > + const auto slot = reinterpret_cast<location_adhoc_data **> (slot_v); > + const auto param = static_cast<location_adhoc_data **> (param_v); > + *slot = (*slot - param[0]) + param[1]; > return 1; > } > > -/* Rebuild the hash table from the location adhoc data. */ > +/* The adhoc data hash table is not part of the GGC infrastructure, so it was > + not initialized when SET was reconstructed from PCH; take care of that by > + rebuilding it from scratch. */ > > void > rebuild_location_adhoc_htab (line_maps *set) > { > - unsigned i; > set->location_adhoc_data_map.htab = > htab_create (100, location_adhoc_data_hash, location_adhoc_data_eq, > NULL); > - for (i = 0; i < set->location_adhoc_data_map.curr_loc; i++) > - htab_find_slot (set->location_adhoc_data_map.htab, > - set->location_adhoc_data_map.data + i, INSERT); > + for (auto p = set->location_adhoc_data_map.data, > + end = p + set->location_adhoc_data_map.curr_loc; > + p != end; ++p) > + { > + const auto slot = reinterpret_cast<location_adhoc_data **> > + (htab_find_slot (set->location_adhoc_data_map.htab, p, INSERT)); > + *slot = p; > + } > } > > /* Helper function for get_combined_adhoc_loc. > @@ -211,8 +222,7 @@ get_combined_adhoc_loc (line_maps *set, > if (set->location_adhoc_data_map.curr_loc >= > set->location_adhoc_data_map.allocated) > { > - char *orig_data = (char *) set->location_adhoc_data_map.data; > - ptrdiff_t offset; > + const auto orig_data = set->location_adhoc_data_map.data; > /* Cast away extern "C" from the type of xrealloc. */ > line_map_realloc reallocator = (set->reallocator > ? set->reallocator > @@ -226,10 +236,13 @@ get_combined_adhoc_loc (line_maps *set, > reallocator (set->location_adhoc_data_map.data, > set->location_adhoc_data_map.allocated > * sizeof (struct location_adhoc_data)); > - offset = (char *) (set->location_adhoc_data_map.data) - orig_data; > if (set->location_adhoc_data_map.allocated > 128) > - htab_traverse (set->location_adhoc_data_map.htab, > - location_adhoc_data_update, &offset); > + { > + location_adhoc_data *param[2] > + = {orig_data, set->location_adhoc_data_map.data}; > + htab_traverse (set->location_adhoc_data_map.htab, > + location_adhoc_data_update, param); > + } > } > *slot = set->location_adhoc_data_map.data > + set->location_adhoc_data_map.curr_loc;
Re: [PATCH] pch: Fix the reconstruction of adhoc data hash table
Richard Biener via Gcc-patches Thu, 08 Sep 2022 00:42:07 -0700
- [PATCH] pch: Fix the reconstruction of adho... Lewis Hyatt via Gcc-patches
- Re: [PATCH] pch: Fix the reconstructio... Richard Biener via Gcc-patches