On Tue, 2 May 2017, Martin Sebor wrote:

> In bug 80280 - Missing closing quote (%>) c/semantics.c and
> c/c-typeck.c, a translator points out one of a number of kinds
> of cosmetic problems that tend to come up late in development,
> during translation of GCC messages.  Other, arguably more minor,
> kinds of issues are caused by forgetting to use proper quoting
> when referencing tree nodes, such as %D or %T.
> 
> To help avoid these kinds of bugs, the attached patch enhances
> the -Wformat checker to detect these common quoting issues and
> report them as warnings.

As I see it, there are four kinds of format specifiers in this context:

* Specifiers that have no special interaction with quoting and may be used 
either quoted or unquoted.  For example, %d, %s and %E.  (%E is only of 
that kind when used for expressions - when hopefully it should only be 
used for simple expressions such as constants.  Uses for identifiers 
should be quoted; cases where there can be complex expressions should be 
replaced by use of location ranges.  Ideally identifiers and expressions 
should have different formats, so these cases can be distinguished and so 
identifiers can end up with a static type other than "tree".)

* Specifiers that should be quoted.  For example, %D.

* Specifiers that open quoting (%<).

* Specifiers that close quoting (%>).

(In addition, the q flag serves to quote the specifier it's applied to.)

> The remaining six patches in the series correct the problems
> highlighted by the warning and get GCC to bootstrap and pass
> regression tests on x86_64.  I suspect additional fixes will
> be needed to get GCC to bootstrap on other targets.  I'll do
> powerpc64le next but besides a general review I'm looking for
> suggestions how to do the same cleanup on all the other targets
> with the least disruption. (E.g., if there's a way to roll out
> a warning one target at a time I could introduce it under
> a suboption of -Wformat and enable the subobption with -Werror
> only on the targets that have already been verified.)

Use contrib/config-list.mk, with a native compiler with this patch in the 
PATH, to test building compilers for many configurations.  (No doubt 
you'll also find existing build issues, which may or may not be filed in 
Bugzilla.)

> +      /* Diagnose nested or unmatched quoting directives such as GCC's
> +      "%<...%<" and "%>...%>".  As a special case, a directive can
> +      be considered to both begin and end quoting (e.g., GCC's %E).
> +      Such a directive is recognized but not diagosed.  */

I don't think it makes conceptual sense to consider %E to both begin and 
end quoting.  I'd expect %E to have exactly the same begin_quote and 
end_quote (or whatever) data as %d and %s, because it has the same 
properties (may be used quoted or unquoted).

> diff --git a/gcc/c-family/c-format.h b/gcc/c-family/c-format.h
> index 13ca8ea..8932861 100644
> --- a/gcc/c-family/c-format.h
> +++ b/gcc/c-family/c-format.h
> @@ -132,6 +132,11 @@ struct format_type_detail
>  struct format_char_info
>  {
>    const char *format_chars;
> +  /* Strings of FORMAT_CHARS characters that begin and end quoting,
> +     respectively, and pairs of which should be matched in the same
> +     format string.  NULL when no such pairs exist.  */
> +  const char *quote_begin_chars;
> +  const char *quote_end_chars;

I don't think this comment is sufficiently detailed to make it obvious 
what should appear in each field for each of the four kinds of format 
specifiers I enumerated.  The best I can reverse-engineer from the code 
is: NULL for specifiers that may be used either quoted or unquoted *or* 
listing those specifiers in both quote_begin_chars and quote_end_chars 
(but I think the option of listing them in both should be removed as 
conceptually wrong); if the character is an opening or closing quote, list 
it in the appropriate one; if it must be used quoted, but isn't a quote 
itself, both strings must be non-NULL but it must not be listed in either.

Whether that's right or wrong, the comment needs to make the rules clear.

-- 
Joseph S. Myers
jos...@codesourcery.com

Reply via email to