On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote: > Add a new linemap reason LC_GEN which enables encoding the location > of data > that was generated during compilation and does not appear in any > source file. > There could be many use cases, such as, for instance, referring to > the content > of builtin macros (not yet implemented, but an easy lift after this > one.) The > first intended application is to create a place to store the input to > a > _Pragma directive, so that proper locations can be assigned to those > tokens. This will be done in a subsequent commit. > > The actual change needed to the line-maps API in libcpp is not too > large and > requires no space overhead in the line map data structures (on 64-bit > systems > that is; one newly added data member to class line_map_ordinary sits > inside > former padding bytes.) An LC_GEN map is just an ordinary map like any > other, > but the TO_FILE member that normally points to the file name points > instead to > the actual data. This works automatically with PCH as well, for the > same > reason that the file name makes its way into a PCH. In order to > avoid > confusion, the member has been renamed from TO_FILE to DATA, and > associated > accessors adjusted. > > Outside libcpp, there are many small changes but most of them are to > selftests, which are necessarily more sensitive to implementation > details. From the perspective of the user (the "user", here, being a > frontend > using line maps or else the diagnostics infrastructure), the chief > visible > change is that the function location_get_source_line() should be > passed an > expanded_location object instead of a separate filename and line > number. This > is not a big change because in most cases, this information came > anyway from a > call to expand_location and the needed expanded_location object is > readily > available. The new overload of location_get_source_line() uses the > extra > information in the expanded_location object to obtain the data from > the > in-memory buffer when it originated from an LC_GEN map. > > Until the subsequent patch that starts using LC_GEN maps, none are > yet > generated within GCC, hence nothing is added to the testsuite here; > but all > relevant selftests have been extended to cover generated data maps in > addition > to normal files.
[..snip...] Thanks for the updated patch. Reading this patch, it felt a bit unnatural to me to have an (exploded location, source line) pair where the exploded location seems to be representing "which source file or generated buffer", but the line/column info in that exploded_location is to be ignored in favor of the 2nd source line. I think we're missing a class: something that identifies either a specific source file, or a specific generated buffer. How about something like either: class source_id { public: source_id (const char *filename) : m_filename_or_buffer (filename), m_len (0) { } explicit source_id (const char *buffer, unsigned buffer_len) : m_filename_or_buffer (buffer), m_len (buffer_len) { linemap_assert (buffer_len > 0); } private: const char *m_filename_or_buffer; unsigned m_len; // where 0 means "it's a filename" }; or: class source_id { public: source_id (const char *filename) : m_ptr (filename), m_is_buffer (false) { } explicit source_id (const linemap_ordinary *buffer_linemap) : m_ptr (buffer_linemap), m_is_buffer (true) { } private: const void *m_ptr; bool m_is_buffer; }; and use one of these "source_id file" in place of "const char *file", rather than replacing such things with expanded_location? > diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc > index e8d3dece770..4164fa0b1ba 100644 > --- a/gcc/c-family/c-indentation.cc > +++ b/gcc/c-family/c-indentation.cc > @@ -50,7 +50,7 @@ get_visual_column (expanded_location exploc, > unsigned int *first_nws, > unsigned int tab_width) > { > - char_span line = location_get_source_line (exploc.file, exploc.line); > + char_span line = location_get_source_line (exploc); ...so this might contine to be: char_span line = location_get_source_line (exploc.file, exploc.line); ...but expanded_location's "file" field would become a source_id, rather than a const char *. It looks like doing do might make a lot of "is this the same file or buffer?" turn into comparisons of source_id instances. So I think expanded_location would become: typedef struct { /* Either the name of the source file involved, or the specific generated buffer. */ source_id file; /* The line-location in the source file. */ int line; int column; void *data; /* In a system header?. */ bool sysp; } expanded_location; and we wouldn't need to add these extra fields: > + > + /* If generated data, the data and its length. The data may contain > embedded > + nulls and need not be null-terminated. */ > + unsigned int generated_data_len; > + const char *generated_data; > } expanded_location; and we could pass around source_id instances when identifying specific filenames/generated buffers. Does this idea simplify/clarify the patch, or make it more complicated? [...snip...] Thoughts? Dave