On 16/04/15 17:01, David Malcolm wrote:
Attached is a work-in-progress patch for a new
-Wmisleading-indentation
warning I've been experimenting with, for GCC 6.
It sounds very cool...
(D) tabs vs spaces. This is probably the biggest can of worms.
I would suggest to be very conservative when warning. In case of doubt, do not
warn.
An issue here is how to determine (i), or if it's OK to default to 8 and
have a command-line option (param?) to override it? (though what about,
say, each header file?)
In case of a potential warning, you could detect if there is a mix of tab and
spaces for the lines affected that may make the warning bogus, and not warn in
that case. That is,
<6 spaces>if (flagC)
<1 tab><space>foo ();
<1 tab><space>bar ();
warns but
<6 spaces>if (flagC)
<8 spaces>foo ();
<2 tabs>bar ();
does not because you don't know how many spaces those 2 tabs take.
Also, in order to detect whether something is a tab or a space, why not simply
re-open and seek? The input file is probably already in memory. See how
diagnostic_show_locus in diagnostics.c gets lines from given locations.
Not even Emacs, which is a GNU package, can get tabs vs. spaces right and
compile-goto-error goes to the wrong column. That seems a can of worms not
worth opening.
+
+ This is fed three things by the frontend:
+
+ (A) a series of location_t by the frontend's tokenizer,
+ corresponding to the locations of the token stream. It uses
+ this to model a stack of "visual blocks", corresponding to
+ indentation levels within the source code.
I don't understand why you need more than three, one for the conditional
statement, another for the solo statement and another for the next-statement.
I also don't get why you need to pass the whole block and new_stmt to
visual_block, don't the locations returned by EXPR_LOCATION suffice?
Given that, the C++ tentative parsing should not be a problem since there is no
tentative parsing (or there should not be) of if/else/while, and you only care
about the first token when parsing the other two statements.
@@ -236,6 +236,9 @@ c_lex_one_token (c_parser *parser, c_token *token)
token->keyword = RID_MAX;
token->pragma_kind = PRAGMA_NONE;
+ if (vis_parser)
+ vis_parser->on_token (token->location);
+
Why you need to save every location? Why not just do this when the parser
detects an if/else/while? Then,
@@ -5081,7 +5086,11 @@ c_parser_if_body (c_parser *parser, bool *if_p)
else if (c_parser_next_token_is (parser, CPP_OPEN_BRACE))
add_stmt (c_parser_compound_statement (parser));
else
- c_parser_statement_after_labels (parser);
+ {
+ c_parser_statement_after_labels (parser);
+ if (vis_parser)
+ vis_parser->on_solo_stmt (block, if_loc);
+ }
return c_end_compound_stmt (body_loc, block, flag_isoc99);
}
You can simply here peek the location of the next token before and after
c_parser_statement_after_labels (parser). Then warn if they are equal or if
they are on the same line, no? You may need to skip pragmas.
+ Note that we can't simply use statement locations; for example, in:
+ if (flag)
+ x = 1;
+ the "if (flag)"'s location is at the open-paren, and that of the
+ assignment "x = 1;" is at the equals-sign, so any attempt to use
+ statement locations can be fooled by varying the spacing:
Aren't these bugs (or missing features) in the FE locations? I know that the
location of the assignment is at '=' because we don't have (yet) locations for
constants or variables (all binary expressions should have 3 locations!).
In any case, don't you have the location of the token that starts the
statements? This is all you need as far as I understand.
And why the 'if' points to the open paren? One can easily find the open-paren
location if needed by seeking in the input buffer.
+ /* FIXME: what about entirely empty lines???
+ Presumably they simply don't get tokens. */
Empty lines no, but preprocessor directives do.
Cheers,
Manuel.