On Wed, Oct 19, 2022 at 8:23 AM Jakub Jelinek <ja...@redhat.com> wrote:
>
> On Wed, Oct 19, 2022 at 01:17:02PM +0100, Richard Sandiford wrote:
> > Jakub Jelinek <ja...@redhat.com> writes:
> > > On Wed, Oct 19, 2022 at 12:54:11PM +0100, Richard Sandiford via 
> > > Gcc-patches wrote:
> > >> Lewis Hyatt via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > >> > When a GTY'ed struct is streamed to PCH, any plain char* pointers it 
> > >> > contains
> > >> > (whether they live in GC-controlled memory or not) will be marked for 
> > >> > PCH
> > >> > output by the routine gt_pch_note_object in ggc-common.cc. This routine
> > >> > special-cases plain char* strings, and in particular it uses strlen() 
> > >> > to get
> > >> > their length. Thus it does not handle strings with embedded null 
> > >> > bytes, but it
> > >> > is possible for something PCH cares about (such as a string literal 
> > >> > token in a
> > >> > macro definition) to contain such embedded nulls. To fix that up, add 
> > >> > a new
> > >> > GTY option "string_length" so that gt_pch_note_object can be informed 
> > >> > the
> > >> > actual length it ought to use, and use it in the relevant libcpp 
> > >> > structs
> > >> > (cpp_string and ht_identifier) accordingly.
> > >>
> > >> This isn't really my area, as I'm about to demonstrate with this
> > >> question, but: regarding
> > >>
> > >>   if (note_ptr_fn == gt_pch_p_S)
> > >>     (*slot)->size = strlen ((const char *)obj) + 1;
> > >>   else
> > >>     (*slot)->size = ggc_get_size (obj);
> > >>
> > >> do you know why the PCH code goes out of its way to handle the sizes of
> > >> strings specially?  Are there enough garbage strings in the string pool
> > >> that it's worth optimising the size of the saved memory for strings but
> > >> not for other types of object?  Or is the gt_pch_p_S test needed for
> > >> correctness, rather than just being an optimisation?
> > >
> > > Just guessing, not all GC strings live in the stringpool.
> > > Isn't e.g. ggc_strdup just a GC allocation where the string length
> > > isn't stored anywhere?
> >
> > Is that different from other GC VLA allocations though?  I thought
> > ultimately we just tried to save and restore the containing pages.
>
> I think just the objects in it, not entire pages (ggc_get_size (obj)
> sized chunks for non-strings).
>
> > > And sometimes it isn't even GC allocated, e.g. ggc_strdup ("") just
> > > returns ""; I guess const char * pointers in GC memory can also point
> > > to string literals in .rodata and for PCH we move them.
> >
> > Ah, OK, that would definitely explain it, thanks.  In that case,
> > are you OK with the patch, as a way of continuing to support rodata
> > string pointers while also allowing embedded nuls?
>
> LGTM.
>

Thank you both very much, I will push that then.
My understanding is that a GTY()ed struct can contain arbitrary char*
pointers as a special case, they need not be in the string pool. They
will be silently ignored by GC marking routines if they are not within
ggc's pages (as opposed to any other pointer, which will abort if it
wasn't under ggc's control), and they will make it into PCH by the
gt_pch_note_object mechanism. For example, struct line_maps contains a
char* to store the file name for each map, which is just an ordinary
malloc()ed string owned by the cpp_reader object.

-Lewis

Reply via email to