On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> Add a new linemap reason LC_GEN which enables encoding the location
> of data
> that was generated during compilation and does not appear in any
> source file.
> There could be many use cases, such as, for instance, referring to
> the content
> of builtin macros (not yet implemented, but an easy lift after this
> one.) The
> first intended application is to create a place to store the input to
> a
> _Pragma directive, so that proper locations can be assigned to those
> tokens. This will be done in a subsequent commit.
> 
> The actual change needed to the line-maps API in libcpp is not too
> large and
> requires no space overhead in the line map data structures (on 64-bit
> systems
> that is; one newly added data member to class line_map_ordinary sits
> inside
> former padding bytes.) An LC_GEN map is just an ordinary map like any
> other,
> but the TO_FILE member that normally points to the file name points
> instead to
> the actual data.  This works automatically with PCH as well, for the
> same
> reason that the file name makes its way into a PCH.  In order to
> avoid
> confusion, the member has been renamed from TO_FILE to DATA, and
> associated
> accessors adjusted.
> 
> Outside libcpp, there are many small changes but most of them are to
> selftests, which are necessarily more sensitive to implementation
> details. From the perspective of the user (the "user", here, being a
> frontend
> using line maps or else the diagnostics infrastructure), the chief
> visible
> change is that the function location_get_source_line() should be
> passed an
> expanded_location object instead of a separate filename and line
> number.  This
> is not a big change because in most cases, this information came
> anyway from a
> call to expand_location and the needed expanded_location object is
> readily
> available. The new overload of location_get_source_line() uses the
> extra
> information in the expanded_location object to obtain the data from
> the
> in-memory buffer when it originated from an LC_GEN map.
> 
> Until the subsequent patch that starts using LC_GEN maps, none are
> yet
> generated within GCC, hence nothing is added to the testsuite here;
> but all
> relevant selftests have been extended to cover generated data maps in
> addition
> to normal files.

[..snip...]

Thanks for the updated patch.

Reading this patch, it felt a bit unnatural to me to have an
  (exploded location, source line) 
pair where the exploded location seems to be representing "which source
file or generated buffer", but the line/column info in that
exploded_location is to be ignored in favor of the 2nd source line.

I think we're missing a class: something that identifies either a
specific source file, or a specific generated buffer.

How about something like either:

class source_id
{
public:
  source_id (const char *filename)
  : m_filename_or_buffer (filename),
    m_len (0)
  {
  }

  explicit source_id (const char *buffer, unsigned buffer_len)
  : m_filename_or_buffer (buffer),
    m_len (buffer_len)
  {
    linemap_assert (buffer_len > 0);
  }

private:
  const char *m_filename_or_buffer;
  unsigned m_len;  // where 0 means "it's a filename"
};

or:

class source_id
{
public:
  source_id (const char *filename)
  : m_ptr (filename),
    m_is_buffer (false)
  {
  }

  explicit source_id (const linemap_ordinary *buffer_linemap)
  : m_ptr (buffer_linemap),
    m_is_buffer (true)
  {
  }

private:
  const void *m_ptr;
  bool m_is_buffer;
};

and use one of these "source_id file" in place of "const char *file",
rather than replacing such things with expanded_location?

> diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
> index e8d3dece770..4164fa0b1ba 100644
> --- a/gcc/c-family/c-indentation.cc
> +++ b/gcc/c-family/c-indentation.cc
> @@ -50,7 +50,7 @@ get_visual_column (expanded_location exploc,
>                  unsigned int *first_nws,
>                  unsigned int tab_width)
>  {
> -  char_span line = location_get_source_line (exploc.file, exploc.line);
> +  char_span line = location_get_source_line (exploc);

...so this might contine to be:

  char_span line = location_get_source_line (exploc.file, exploc.line);

...but expanded_location's "file" field would become a source_id,
rather than a const char *.  It looks like doing do might make a lot of
"is this the same file or buffer?"  turn into comparisons of source_id
instances.

So I think expanded_location would become:

typedef struct
{
  /* Either the name of the source file involved, or the
     specific generated buffer.  */
  source_id file;

  /* The line-location in the source file.  */
  int line;

  int column;

  void *data;

  /* In a system header?. */
  bool sysp;
} expanded_location;

and we wouldn't need to add these extra fields:

> +
> +  /* If generated data, the data and its length.  The data may contain 
> embedded
> +   nulls and need not be null-terminated.  */
> +  unsigned int generated_data_len;
> +  const char *generated_data;
>  } expanded_location;

and we could pass around source_id instances when identifying specific
filenames/generated buffers.
 
Does this idea simplify/clarify the patch, or make it more complicated?

[...snip...]

Thoughts?
Dave

Reply via email to