On Mon, 2015-11-02 at 14:14 -0500, David Malcolm wrote: > On Fri, 2015-10-30 at 00:15 -0600, Jeff Law wrote: > > On 10/23/2015 02:41 PM, David Malcolm wrote: > > > As in the previous version of this patch > > > "Implement tree expression tracking in C FE (v2)" > > > the patch now captures ranges for all C expressions during parsing within > > > a new field of c_expr, and for all tree nodes with a location_t, it stores > > > them in ad-hoc locations for later use. > > > > > > Hence compound expressions get ranges; see: > > > > > > https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-expressions-1.html > > > > > > and for this example: > > > > > > int test (int foo) > > > { > > > return foo * 100; > > > ^^^ ^^^ > > > } > > > > > > we have access to the ranges of "foo" and "100" during C parsing via > > > the c_expr, but once we have GENERIC, all we have is a VAR_DECL and an > > > INTEGER_CST (the former's location is in at the top of the > > > function, and the latter has no location). > > > > > > gcc/ChangeLog: > > > * Makefile.in (OBJS): Add gcc-rich-location.o. > > > * gcc-rich-location.c: New file. > > > * gcc-rich-location.h: New file. > > > * print-tree.c (print_node): Print any source range information. > > > * tree.c (set_source_range): New functions. > > > * tree.h (CAN_HAVE_RANGE_P): New. > > > (EXPR_LOCATION_RANGE): New. > > > (EXPR_HAS_RANGE): New. > > > (get_expr_source_range): New inline function. > > > (DECL_LOCATION_RANGE): New. > > > (set_source_range): New decls. > > > (get_decl_source_range): New inline function. > > > > > > gcc/c-family/ChangeLog: > > > * c-common.c (c_fully_fold_internal): Capture existing souce_range, > > > and store it on the result. > > > > > > gcc/c/ChangeLog: > > > * c-parser.c (set_c_expr_source_range): New functions. > > > (c_token::get_range): New method. > > > (c_token::get_finish): New method. > > > (c_parser_expr_no_commas): Call set_c_expr_source_range on the ret > > > based on the range from the start of the LHS to the end of the > > > RHS. > > > (c_parser_conditional_expression): Likewise, based on the range > > > from the start of the cond.value to the end of exp2.value. > > > (c_parser_binary_expression): Call set_c_expr_source_range on > > > the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR. > > > (c_parser_cast_expression): Call set_c_expr_source_range on ret > > > based on the cast_loc through to the end of the expr. > > > (c_parser_unary_expression): Likewise, based on the > > > op_loc through to the end of op. > > > (c_parser_sizeof_expression) Likewise, based on the start of the > > > sizeof token through to either the closing paren or the end of > > > expr. > > > (c_parser_postfix_expression): Likewise, using the token range, > > > or from the open paren through to the close paren for > > > parenthesized expressions. > > > (c_parser_postfix_expression_after_primary): Likewise, for > > > various kinds of expression. > > > * c-tree.h (struct c_expr): Add field "src_range". > > > (c_expr::get_start): New method. > > > (c_expr::get_finish): New method. > > > (set_c_expr_source_range): New decls. > > > * c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range > > > on ret for prefix unary ops. > > > (parser_build_binary_op): Likewise, running from the start of > > > arg1.value through to the end of arg2.value. > > > > > > gcc/testsuite/ChangeLog: > > > * gcc.dg/plugin/diagnostic-test-expressions-1.c: New file. > > > * gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c: > > > New file. > > > * gcc.dg/plugin/plugin.exp (plugin_test_list): Add > > > diagnostic_plugin_test_tree_expression_range.c and > > > diagnostic-test-expressions-1.c. > > > > > /* Initialization routine for this file. */ > > > > > > @@ -6085,6 +6112,9 @@ c_parser_expr_no_commas (c_parser *parser, struct > > > c_expr *after, > > > ret.value = build_modify_expr (op_location, lhs.value, > > > lhs.original_type, > > > code, exp_location, rhs.value, > > > rhs.original_type); > > > + set_c_expr_source_range (&ret, > > > + lhs.get_start (), > > > + rhs.get_finish ()); > > One line if it fits. > > > > > > > @@ -6198,6 +6232,9 @@ c_parser_conditional_expression (c_parser *parser, > > > struct c_expr *after, > > > ? t1 > > > : NULL); > > > } > > > + set_c_expr_source_range (&ret, > > > + start, > > > + exp2.get_finish ()); > > Here too. > > > > > @@ -6522,6 +6564,10 @@ c_parser_cast_expression (c_parser *parser, struct > > > c_expr *after) > > > expr = convert_lvalue_to_rvalue (expr_loc, expr, true, true); > > > } > > > ret.value = c_cast_expr (cast_loc, type_name, expr.value); > > > + if (ret.value && expr.value) > > > + set_c_expr_source_range (&ret, > > > + cast_loc, > > > + expr.get_finish ()); > > And here? > > > > With the nits fixed, this is OK. > > > > I think that covers this iteration of the rich location work and that > > you'll continue working with Jason on extending this into the C++ front-end. > > Here's a summary of the current status of this work [1]: > > Patches 1-4 of the kit: these Jeff has approved, with some pre-approved > nit fixes in 4. I see these as relatively low risk, and plan to commit > these today/tomorrow. > > Patches 5-10: Jeff approved these also (again with some nits). These > feel higher-risk to me, owing to the potential for performance > regressions; I haven't yet answered at least one of Richi's performance > questions (impact on time taken to generate the C++ PCH file); the last > performance testing I did can be seen here: > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02283.html > where the right-most column is this kit. > > CCing Richi to keep him in the loop for the above. Richi, is there any > other specific testing you'd want me to do for this? > Or is it OK to commit, and to see what impact it has on your daily > performance testing? (and to revert if the impact is unacceptable). > > Talking about risks: the reduction of the space for ordinary maps by a > factor of 32, by taking up 5 bits for the packed range information > optimization (patch 10): > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02539.html > CCing Dodji: Dodji; is this reasonable? > > I did some experiments back in July on this; instrumented builds of > openldap and openssl to see how much space we have in location_t: > https://dmalcolm.fedorapeople.org/gcc/2015-07-23/openldap.csv > https://dmalcolm.fedorapeople.org/gcc/2015-07-24/openldap.csv > > (these are space-separated: > SRPM name > sourcefile > maximal ordinary location_t > minimal macro location_t) > > openldap build #files: 906 > maximal ordinary location_t was: > sourcefile='/builddir/build/BUILD/openldap-2.4.40/openldap-2.4.40/servers/slapd/bconfig.c' > max_ordinary_location=0x0081bd1b > (and min_macro_location=0x7ffe5903 > minimal macro location_t was: > sourcefile='/builddir/build/BUILD/openldap-2.4.40/openldap-2.4.40/servers/slapd/aclparse.c' > min_macro_location=0x7ffe57e2 > (with max_ordinary_location=0x00719775) > > openssl-1.0.1k-8.fc22.src.rpm.x86_64: > #files: 1495 > max_ordinary_location=0x00be3726 > (openssl-1.0.1k/apps/s_client.c) > with min_macro_location=0x7ffe7b6b > > min_macro_location=0x7ffdf069 > (openssl-1.0.1k/apps/speed.c) > with max_ordinary_location=0x00a1abdf > > In all of the above cases, we had enough room to do the bit-packing > optimization, but this is just two projects (albeit real-world C code). > > Comparing the gap between maximal ordinary map location and minimal > macro map location, and seeing how much we can expand the ordinary map > locations, the openldap build had: > (0x7ffe57e2 - 0x0081bd1b) / 0x0081bd1b == factor of 251 i.e. > 7 bits of space available > > openssl build had: > (0x7ffdf069 - 0x00be3726) / 0x00be3726 == factor of 171 i.e. 7 bits > of space available > > hence allocating 5 bits to packing ranges is (I hope) reasonable.
Actually, I realize now that a better upper limit to be concerned with for ordinary line maps is LINE_MAP_MAX_LOCATION_WITH_COLS i.e. 0x60000000, since there'd be a user-visible impact if we hit that limit. For the openldap build, we'd have: (0x60000000 - 0x0081bd1b) / 0x0081bd1b == factor of 188. and for openssl: (0x60000000 - 0x00be3726) / 0x00be3726 == factor of 128. So paying 5 bits (for a factor of 32) still seems a reasonable cost, for these two builds. > Jeff: I'm working on expression ranges in the C++ FE; is that a > prerequisite for patches 5-10, or can 5-10 go ahead without the C++ > work? (assuming the other issues above are acceptable). > > Hope this all makes sense and sounds sane > Dave > > [1] Together the kit gives us: > * patch 4: infrastructure for printing underlines in > diagnostic_show_locus and for multiple ranges > * patches 5-10: the "source_location" (aka location_t) type becomes a > caret plus a range; the tokens coming from libcpp gain ranges, so > everything using libcpp gains at least underlines of tokens; the C > frontend generates sane ranges for expressions as it parses them, better > showing the user how the parser "sees" their code. > > Hence we ought to get underlined ranges for many diagnostics in C and C > ++ with this (e.g. input_location gives an underline covering the range > of the token starting at the caret). The "caret" should remain > unchanged from the status quo, so e.g. debugging locations shouldn't be > affected by the addition of ranges. > > I'm anticipating that we'd need some followup patches to pick better > ranges for some diagnostics, analogous to the way we convert "warning" > to "warning_at" for where input_location isn't the best location; I'd > expect these followup patches to be relative simple and low-risk. > >