On Wed, 2023-08-09 at 18:14 -0400, Lewis Hyatt wrote: > This patch enhances location_get_source_line(), which is the primary > interface provided by the diagnostics infrastructure to obtain the line of > source code corresponding to a given location, so that it understands > generated data locations in addition to normal file-based locations. This > involves changing the argument to location_get_source_line() from a plain > file name, to a source_id object that can represent either type of location. > > gcc/ChangeLog: > > * input.cc (class data_cache_slot): New class. > (file_cache::lookup_data): New function. > (diagnostics_file_cache_forcibly_evict_data): New function. > (file_cache::forcibly_evict_data): New function. > (file_cache::evicted_cache_tab_entry): Generalize (via a template) > to work for both file_cache_slot and data_cache_slot. > (file_cache::add_file): Adapt for new interface to > evicted_cache_tab_entry. > (file_cache::add_data): New function. > (data_cache_slot::create): New function. > (file_cache::file_cache): Support the new m_data_slots member. > (file_cache::~file_cache): Likewise. > (file_cache::lookup_or_add_data): New function. > (file_cache::lookup_or_add): New function that calls either > lookup_or_add_data or lookup_or_add_file as appropriate. > (location_get_source_line): Change the FILE_PATH argument to a > source_id SRC, and use it to support obtaining source lines from > generated data as well as from files. > (location_compute_display_column): Support generated data using the > new features of location_get_source_line. > (dump_location_info): Likewise. > * input.h (location_get_source_line): Adjust prototype. Add a new > convenience overload taking an expanded_location. > (class cache_data_source): Declare. > (class data_cache_slot): Declare. > (class file_cache): Declare new members. > (diagnostics_file_cache_forcibly_evict_data): Declare. > --- > gcc/input.cc | 171 ++++++++++++++++++++++++++++++++++++++++----------- > gcc/input.h | 23 +++++-- > 2 files changed, 153 insertions(+), 41 deletions(-) > > diff --git a/gcc/input.cc b/gcc/input.cc > index 9377020b460..790279d4273 100644 > --- a/gcc/input.cc > +++ b/gcc/input.cc > @@ -207,6 +207,28 @@ private: > void maybe_grow (); > }; > > +/* This is the implementation of cache_data_source for generated > + data that is already in memory. */ > +class data_cache_slot final : public cache_data_source
It occurred to me: why are we caching accessing a buffer that's already in memory - but we're also caching the line-splitting information, and providing the line-splitting algorithm with a consistent interface to the data, right? [...snip...] > @@ -397,6 +434,15 @@ diagnostics_file_cache_forcibly_evict_file (const char > *file_path) > global_dc->m_file_cache->forcibly_evict_file (file_path); > } > > +void > +diagnostics_file_cache_forcibly_evict_data (const char *data, > + unsigned int data_len) > +{ > + if (!global_dc->m_file_cache) > + return; > + global_dc->m_file_cache->forcibly_evict_data (data, data_len); Maybe we should rename diagnostic_context's m_file_cache to m_source_cache? (and class file_cache for that matter?) But if so, that can/should be a followup/separate patch. [...snip...] > @@ -525,10 +582,22 @@ file_cache_slot::create (const > file_cache::input_context &in_context, > return true; > } > > +void > +data_cache_slot::create (const char *data, unsigned int data_len, > + unsigned int highest_use_count) > +{ > + reset (); > + on_create (highest_use_count + 1, > + total_lines_num (source_id {data, data_len})); > + m_data_begin = data; > + m_data_end = data + data_len; > +} > + > /* file_cache's ctor. */ > > file_cache::file_cache () > -: m_file_slots (new file_cache_slot[num_file_slots]) > + : m_file_slots (new file_cache_slot[num_file_slots]), > + m_data_slots (new data_cache_slot[num_file_slots]) Should "num_file_slots" be renamed to "num_slots"? I assume you're using the same value for both kinds of slot since the file_cache::evicted_cache_tab_entry template uses this. I suppose the number could be passed in as an argument to that function if we wanted to have different sizes for the two kinds, but I don't think it matters. [...snip...] > @@ -912,26 +1000,22 @@ cache_data_source::read_line_num (size_t line_num, > If the function fails, a NULL char_span is returned. */ > > char_span > -location_get_source_line (const char *file_path, int line) > +location_get_source_line (source_id src, int line) > { > - const char *buffer = NULL; > - ssize_t len; > - > - if (line == 0) > - return char_span (NULL, 0); > - > - if (file_path == NULL) > - return char_span (NULL, 0); > + const char_span fail (nullptr, 0); > + if (!src || line <= 0) > + return fail; Looking at source_id's operator bool, are there effectively three kinds of source_id? (a) file names (b) generated buffer (c) NULL == m_filename_or_buffer What does (c) mean? Is it a "something's gone wrong/error" state? Or is this more a special-case of (a)? (in that the m_len for such a case would be zero) Should source_id's 2-param ctor have an assert that the ptr is non- NULL? [...snip...] The patch is OK for trunk as-is, but note the question about the source_id ctor above. Thanks Dave