https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105959
--- Comment #10 from Hans-Peter Nilsson <hp at gcc dot gnu.org> ---
(In reply to David Malcolm from comment #8)
> Note that section 3.1 ("File Format" > "General") specifies:
> "A SARIF log file SHALL be encoded in UTF-8 [RFC3629]."
> https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html
>
> Though I suppose it would be possible to escape non-ASCII chars so that the
> .sarif file could use the ASCII subset of UTF-8,

ISTM the point of that test is heavy use of UTF-8, so you can't get away with
using the ASCII subset.  (I see an identifier using ideographs?  Wouldn't want
to review that code...  Might as well use Linear A, which you indeed can in
UTF-8; it's all Greek to me!)

> if there's no other way
> around this from the DejaGnu side.

Perhaps add a parameter to dg-scan (it enforces exactly two arguments now) that
scan-sarif-file can use: dg-scan would apply "fconfigure $fd -encoding [lindex
$orig_args 2]", with the parameter passed as "utf-8" or something like that,
since SARIF files are always UTF-8.  Assuming that works, of course; completely
untested theory.
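
To illustrate the two options discussed above, here is a minimal sketch in Python rather than Tcl; it is not DejaGnu code and has not been tried against the testsuite. The sample message and all variable names are made up for illustration. Decoding with an explicit "utf-8" plays the role of the proposed "fconfigure $fd -encoding utf-8", and decoding as ISO-8859-1 stands in for whatever non-UTF-8 default a host might use.

```python
import json

# Hypothetical SARIF-like fragment containing a non-ASCII identifier.
doc = {"message": {"text": "identifier '文字化け' is unused"}}

# A SARIF file is UTF-8 on disk, so these are the bytes a reader would see.
utf8_bytes = json.dumps(doc, ensure_ascii=False).encode("utf-8")

# Reader told the encoding explicitly (the "extra parameter" idea).
right = utf8_bytes.decode("utf-8")

# Reader using some other default encoding: decoding succeeds, because every
# byte is valid ISO-8859-1, but the text no longer contains the identifier.
wrong = utf8_bytes.decode("iso-8859-1")

# The alternative from comment #8: escape non-ASCII characters as \uXXXX so
# the file stays within the ASCII subset and survives either reader.
ascii_text = json.dumps(doc, ensure_ascii=True)
ascii_bytes = ascii_text.encode("ascii")
roundtrip = json.loads(ascii_bytes.decode("iso-8859-1"))
```

Escaping keeps the file readable under any ASCII-compatible default, but a pattern that matches the literal identifier would then no longer match the escaped text, which is why setting the encoding on the reading side looks like the better fit for a test whose point is UTF-8 content.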