On Thu, 2025-04-03 at 13:58 +0200, Rasmus Villemoes wrote:
> In many setups, especially when CI and/or some meta-build system like
> Yocto or buildroot, is involved, gcc ends up being invoked using
> absolute path names, which are often long and uninteresting.
> 
> That amounts to a lot of noise both when trying to decipher the
> warning or error, when the warning text is copy-pasted to a commit
> fixing the issue, or posted to a mailing list. In the latter case, the
> path might also reveal details that should not be public; as a made-up
> example, seeing the string
> 
>   /home/ravi/customers/acme-corp/yocto/tmp/work/bird-spa/
> 
> would make Road Runner wary of any "free plumage treatment" sign.
> 
> Removing the prefixes manually is tedious and error-prone. So similar
> to the other f*-prefix-map options, provide an option allowing one to
> strip/remap certain prefixes when emitting diagnostics.
> 
> [Of course, one still has to be very careful whenever there might be
> confidential information in the messages, but having the prefix likely
> to contain customer or project names removed automatically is helpful.]
> ---
> This is mostly just a POC, to ask if something like this could be
> implemented. It obviously lacks documentation and tests.
> 
> I've tested this very lightly, and it works as expected for the simple
> cases I've tried.  But I don't know if I managed to find all the
> locations that would need to call remap_diag_filename().
> 
> To make linking work, I had to do a bit of juggling, moving
> file-prefix-map.o to OBJS-libcommon in order to make that function
> available to the diagnostic*.cc files, and then also move the
> definition of flag_canon_prefix_map to file-prefix-map.cc. It builds
> for me, but maybe it's broken in some way I don't know about.

Hi Rasmus

Thanks for the patch; sorry for the delay in getting back to you.

Various thoughts:

The patch only touches "text" output; it doesn't affect "sarif" (or
"json", but that's deprecated and I plan to remove it soon).

FWIW SARIF has some interesting support for redacting sensitive
information; see e.g. 
"3.5.2 Redactable strings"
https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/sarif-v2.1.0-errata01-os-complete.html#_Toc141790687

and "3.14.28 redactionTokens property"
https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/sarif-v2.1.0-errata01-os-complete.html#_Toc141790762

I don't know if we want to implement redaction within GCC's SARIF
output code, or to simply defer that to 3rd party post-processing
tools.

If we're going to punt on the issue of redaction, then one approach
here might be to say that the new option only affects text output.  I'm
not sure here.  Is redaction the only use-case, or is this also about
cleaning up long and irrelevant paths in CI output?


[...snip...]

>  
>    const char *line_col = maybe_line_and_column (line, col);
> diff --git a/gcc/file-prefix-map.cc b/gcc/file-prefix-map.cc
> index 3a77b195ae3..7ae3e7f95d5 100644
> --- a/gcc/file-prefix-map.cc
> +++ b/gcc/file-prefix-map.cc

[...snip...]

> +
> +/* Remap using -fdiag-prefix-map.  Return the GC-allocated new name
> +   corresponding to FILENAME or FILENAME if no remapping was performed.  */
> +const char *
> +remap_diag_filename (const char *filename)
> +{
> +  return remap_filename (diag_prefix_maps, filename);
> +}

The returned string is GC-allocated, via ggc_internal_alloc.  For host
binaries linked against ggc-page.o this buffer will eventually be
garbage-collected, but for host binaries linked against ggc-none.o this
is a memory leak.  Perhaps the remap_filename code could simply return
a std::string?
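
To make the std::string idea concrete, here is a minimal standalone
sketch.  It is not GCC's actual code: the prefix_map struct and
remap_filename_sketch are hypothetical names, the match is a plain
byte-wise prefix comparison, and the first matching entry wins.  The
point is only that returning by value leaves ownership with the caller,
so nothing depends on which ggc implementation is linked in.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one -f*-prefix-map entry; GCC's real
// file_prefix_map structure differs.
struct prefix_map
{
  std::string old_prefix;
  std::string new_prefix;
};

// Return FILENAME with the first matching old prefix replaced by its
// new prefix, or an unchanged copy of FILENAME if no entry matches.
// The result is returned by value, so there is no GC-allocated buffer
// and nothing for the caller to free.
std::string
remap_filename_sketch (const std::vector<prefix_map> &maps,
                       const std::string &filename)
{
  for (const prefix_map &m : maps)
    {
      // compare () clamps the length to FILENAME's size, so a filename
      // shorter than the prefix simply fails to match.
      if (filename.compare (0, m.old_prefix.size (), m.old_prefix) == 0)
        return m.new_prefix + filename.substr (m.old_prefix.size ());
    }
  return filename;
}
```

With a single entry mapping /home/ravi/customers/acme-corp/yocto/tmp/work
to ".", a (made-up) input of
/home/ravi/customers/acme-corp/yocto/tmp/work/bird-spa/foo.c would come
back as ./bird-spa/foo.c, and a path outside that prefix would be
returned unchanged.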

Dave
