On 05/19/2017 02:14 PM, Marek Polacek wrote: > On Thu, May 18, 2017 at 01:22:02PM +0200, Martin Liška wrote: >> On 05/16/2017 09:14 PM, David Malcolm wrote: >>> On Mon, 2017-05-15 at 15:36 +0200, Martin Liška wrote: >>>> Hi. >>>> >>>> I sent this email to David some time ago, but it should be probably >>>> answered >>>> on gcc mailing list. >>> >>>> I have idea one to improve gcov tool and I'm interested in more >>>> precise locations for gimple >>>> statements. For gcov purpose, we dump location in ipa-profile pass, >>>> which is an early IPA >>>> pass and this data is used by gcov tool to map statements (blocks) to >>>> lines of code. >>>> >>>> I did a small experiment on the place we emit the location data: >>>> inform (gimple_location (stmt), "output_location"); >>>> >>>> and it shows for: >>>> $ cat m2.c >>>> unsigned int >>>> UuT (void) >>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int >>>> ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ >>>> ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ >>>> return ret; } >>>> >>>> int >>>> main (int argc, char **argv) >>>> { >>>> UuT (); >>>> return 0; >>>> } >>>> >>>> $ gcc --coverage m2.c >>>> m2.c: In function ‘main’: >>>> m2.c:8:3: note: output_location >>>> UuT (); >>>> ^~~~~~ >>>> # .MEM_2 = VDEF <.MEM_1(D)> >>>> UuT (); >>>> m2.c:9:10: note: output_location >>>> return 0; >>>> ^ >>>> _3 = 0; >>>> m2.c: In function ‘UuT’: >>>> m2.c:3:16: note: output_location >>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned >>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* >>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* >>>> count(#####) */ return ret; } >>>> ^~~~~~~~ >>>> true_var_3 = 1; >>>> m2.c:3:43: note: output_location >>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned >>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* >>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* >>>> count(#####) */ return ret; } >>>> ^~~~~~~~~ >>>> false_var_4 = 0; >>>> m2.c:3:71: note: output_location >>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned >>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* >>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* >>>> count(#####) */ return ret; } >>>> >>>> ^~~ >>>> ret_5 = 0; >>>> m2.c:3:83: note: output_location >>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned >>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* >>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* >>>> count(#####) */ return ret; } >>>> >>>> ^ >>>> if (true_var_3 != 0) >>>> m2.c:3:114: note: output_location >>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned >>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* >>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* >>>> count(#####) */ return ret; } >>>> >>>> ^ >>>> if (false_var_4 != 0) >>>> m2.c:3:145: note: output_location >>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned >>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* >>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* >>>> count(#####) */ return ret; } >>>> >>>> >>>> ~~~~^~~~~ >>>> ret_7 = 111; >>>> m2.c:3:182: note: output_location >>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned >>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* >>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* >>>> count(#####) */ return ret; } >>>> >>>> >>>> ~~~~^~~~~ >>>> ret_6 = 999; >>>> m2.c:3:215: note: output_location >>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned >>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* >>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* >>>> count(#####) */ return ret; } >>>> >>>> >>>> >>>> ^~~ >>>> _8 = ret_2; >>>> m2.c:3:215: note: output_location >>>> # VUSE <.MEM_9(D)> >>>> return _8; >>>> >>>> Which is not optimal, for some assignments I see just LHS >>>> (false_var_4 = 0), > > Note that > > unsigned int false_var = 0; > > is not an assignment-expression, it's an initialization. Only the > '0' here is parsed as an assignment-expression, but in this case > set_c_expr_source_range isn't called.
Hello. Yes, I noticed that it's not called. > >>> >>> My first though was: are there assignments for which this isn't the >>> case? The only one I see is the: >>> ret = 999; >>> ~~~~^~~~~ >>> >>> Are the locations for these assignments coming through from the >>> frontend? >> >> Hi. >> >> Actually not all, the default assignments are created in gimplifier and >> location is assigned from DECL_EXPR: >> >> (gdb) p debug_tree(*expr_p) >> <decl_expr 0x7ffff6988c80 >> type <void_type 0x7ffff6878f18 void VOID >> align 8 symtab 0 alias set -1 canonical type 0x7ffff6878f18 >> pointer_to_this <pointer_type 0x7ffff68800a8>> >> side-effects >> arg 0 <var_decl 0x7ffff7f9ae10 true_var >> type <integer_type 0x7ffff6878690 unsigned int public unsigned SI >> size <integer_cst 0x7ffff6860f18 constant 32> >> unit size <integer_cst 0x7ffff6860f30 constant 4> >> align 32 symtab 0 alias set -1 canonical type 0x7ffff6878690 >> precision 32 min <integer_cst 0x7ffff6860f48 0> max <integer_cst >> 0x7ffff6860f00 4294967295> >> pointer_to_this <pointer_type 0x7ffff6885dc8>> >> used unsigned SI file /tmp/m2.c line 4 col 16 size <integer_cst >> 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4> >> align 32 context <function_decl 0x7ffff697ce00 UuT> initial >> <integer_cst 0x7ffff698b258 1> >> chain <var_decl 0x7ffff7f9aea0 false_var type <integer_type >> 0x7ffff6878690 unsigned int> >> used unsigned SI file /tmp/m2.c line 4 col 43 size <integer_cst >> 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4> >> align 32 context <function_decl 0x7ffff697ce00 UuT> initial >> <integer_cst 0x7ffff6860f48 0> chain <var_decl 0x7ffff7f9af30 ret>>> >> /tmp/m2.c:4:16 start: /tmp/m2.c:4:16 finish: /tmp/m2.c:4:23> >> >> That explains why only LHS of these assignments is selected. >> >>> >>> I believe that in the C frontend these are assignment-expression, and >>> hence handled by c_parser_expr_no_commas; in particular the use of >>> op_location and the call: >>> >>> set_c_expr_source_range (&ret, lhs.get_start (), rhs.get_finish ()); >>> >>> ought to be setting up the caret of the assignment to be on the >>> operator token, and for the start/finish to range from the start of the >>> lhs to the end of the rhs i.e. what we see for: >>> >>> ret = 999; >>> ~~~~^~~~~ >> >> Yep, MODIFY_EXPRs created in FE go this way and it's fine. >> >>> >>> >>>> for return statements only a returned value is displayed. >>> >>> Is this running on SSA form? If so, I wonder if you're running into >>> something like this: >>> >>> retval_N = PHI <lots of values>; >>> return retval_N; >>> >>> where it's using the location of that "return retval_N;" for all of the >>> return statements in the function, rather than the individual source >>> locations. >> >> Yep, but we properly assign each assignment to a SSA name that's going to >> be merged in exit BB by PHI node: >> >> _8 = ret_2; >> /tmp/m2.c:7:8: note: output_location >> return ret; } >> ^~~ >> >> Here the location comes from c_finish_return function where location >> comes from a value that's returned. >> >>> >>>> For conditions, only condition beginning is showed. >>>> Is this known behavior or do I miss >>>> something? >>> >>> c_parser_if_statement has: >>> >>> loc = c_parser_peek_token (parser)->location; >>> >>> which is that of the open-paren. Maybe we should build a location >>> covering the range of the "if ( expression )" part of the if-statement? >> >> Adding Marek as C FE maintainer to reply the question. > > I suppose we could do better and I'd probably highlight just the expression > part of "if ( expression )". But not sure how many use cases this range > location would have. Works for me. I guess it can take some time to improve locations of GIMPLE expressions. Anyhow, I can start enhancing gcov tool even with current locations. Having that, we can provide users following kind of information: 1|int b, c, d, e; 2| 3|int main() 4|{ ^1 5| int a = b < 1 ? (c < 3 ? d : c) : e; ^1 ^1 ^0 ^0 6| return a; 7|} Where '^0' means the block (statements) are not executed. Martin > > Marek >