On Wed, Oct 19, 2022 at 01:17:02PM +0100, Richard Sandiford wrote: > Jakub Jelinek <ja...@redhat.com> writes: > > On Wed, Oct 19, 2022 at 12:54:11PM +0100, Richard Sandiford via Gcc-patches > > wrote: > >> Lewis Hyatt via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > >> > When a GTY'ed struct is streamed to PCH, any plain char* pointers it > >> > contains > >> > (whether they live in GC-controlled memory or not) will be marked for PCH > >> > output by the routine gt_pch_note_object in ggc-common.cc. This routine > >> > special-cases plain char* strings, and in particular it uses strlen() to > >> > get > >> > their length. Thus it does not handle strings with embedded null bytes, > >> > but it > >> > is possible for something PCH cares about (such as a string literal > >> > token in a > >> > macro definition) to contain such embedded nulls. To fix that up, add a > >> > new > >> > GTY option "string_length" so that gt_pch_note_object can be informed the > >> > actual length it ought to use, and use it in the relevant libcpp structs > >> > (cpp_string and ht_identifier) accordingly. > >> > >> This isn't really my area, as I'm about to demonstrate with this > >> question, but: regarding > >> > >> if (note_ptr_fn == gt_pch_p_S) > >> (*slot)->size = strlen ((const char *)obj) + 1; > >> else > >> (*slot)->size = ggc_get_size (obj); > >> > >> do you know why the PCH code goes out of its way to handle the sizes of > >> strings specially? Are there enough garbage strings in the string pool > >> that it's worth optimising the size of the saved memory for strings but > >> not for other types of object? Or is the gt_pch_p_S test needed for > >> correctness, rather than just being an optimisation? > > > > Just guessing, not all GC strings live in the stringpool. > > Isn't e.g. ggc_strdup just a GC allocation where the string length > > isn't stored anywhere? > > Is that different from other GC VLA allocations though? I thought > ultimately we just tried to save and restore the containing pages.
I think just the objects in it, not entire pages (ggc_get_size (obj) sized chunks for non-strings). > > And sometimes it isn't even GC allocated, e.g. ggc_strdup ("") just > > returns ""; I guess const char * pointers in GC memory can also point > > to string literals in .rodata and for PCH we move them. > > Ah, OK, that would definitely explain it, thanks. In that case, > are you OK with the patch, as a way of continuing to support rodata > string pointers while also allowing embedded nuls? LGTM. Jakub