https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105959

--- Comment #10 from Hans-Peter Nilsson <hp at gcc dot gnu.org> ---
(In reply to David Malcolm from comment #8)
> Note that section 3.1 ("File Format" > "General") specifies:
>   "A SARIF log file SHALL be encoded in UTF-8 [RFC3629]."
> https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html
> 
> Though I suppose it would be possible to escape non-ASCII chars so that the
> .sarif file could use the ASCII subset of UTF-8,

ISTM the point of that test is heavy use of UTF-8, so you can't get away with
using the ASCII subset.  (I see an identifier using ideographs?  Wouldn't want
to review that code...  Might as well use Linear A, which you indeed can in
UTF-8; it's all Greek to me!)

> if there's no other way
> around this from the DejaGnu side.

Perhaps add an optional parameter to dg-scan (it enforces exactly two arguments
now) that scan-sarif-file can use, passed as "utf-8" or something like that,
since SARIF files are always UTF-8.  dg-scan would then apply "fconfigure $fd
-encoding [lindex $orig_args 2]" before reading the file.  Assuming that works,
of course; completely untested theory.
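
A minimal sketch of that idea, for illustration only.  The proc name
read_scan_file is hypothetical and this is not the actual dg-scan code; it
only assumes that the encoding name arrives as the third element of
orig_args, and that the channel has to be configured before the read:

```tcl
# Hypothetical helper, not the real dg-scan: read the file to be scanned,
# honouring an optional encoding given as a third element of orig_args.
proc read_scan_file { output_file orig_args } {
    set fd [open $output_file r]
    # Assumption: index 2 of orig_args, if present, names the encoding
    # (e.g. "utf-8").  Without it the channel keeps its default encoding.
    if { [llength $orig_args] > 2 } {
        fconfigure $fd -encoding [lindex $orig_args 2]
    }
    set text [read $fd]
    close $fd
    return $text
}
```

With something like this, scan-sarif-file would append "utf-8" to the
arguments it forwards, and the other scan-* callers would stay as they are.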
