Hi! As mentioned in the PR, when compiling valgrind even on fairly small testcase where in one larger function the location keeps oscillating between a small line number and 8000-ish line number in the same file we very quickly run out of all possible location_t numbers and because of that emit non-sensical line numbers in .debug_line. There are ways how to decrease speed of depleting location_t numbers in libcpp, but the main reason of this is that we use stream_input_location_now for streaming in location_t for gimple_location and phi arg locations. libcpp strongly prefers that the locations it is given are sorted by the different files and by line numbers in ascending order, otherwise it depletes quickly no matter what and is much more costly (many extra file changes etc.). The reason for not caching those were the BLOCKs that were streamed immediately after the location and encoded into the locations (and for PHIs we failed to stream the BLOCKs altogether). This patch enhances the location cache to handle also BLOCKs (but not for everything, only for the spots we care about the BLOCKs) and also optimizes the size of the LTO stream by emitting a single bit into a pack whether the BLOCK changed from last case and only streaming the BLOCK tree if it changed.
Bootstapped/regtested on x86_64-linux with ../configure --enable-languages=default,ada,obj-c++,lto,go,brig,d --with-build-config=bootstrap-lto --enable-checking=yes,rtl,extra && make -j16 bootstrap > LOG 2>&1 && GXX_TESTSUITE_STDS=98,11,14,17,2a make -j32 -k check > LOGC 2>&1; ../contrib/test_summary > LOGT 2>&1 as well as normally bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? BTW, I wonder about one preexisting issue (but that can be done incrementally). On the streamer out side, we call clear_line_info in multiple spots which resets the current_* values to something, but on the reader side, we don't have corresponding resets in the same location, just have the stream_* static variables that keep the current values through the entire stream in (so across all the clear_line_info spots in a single LTO object but also across jumping from one LTO object to another one). Now, in an earlier version of my patch it actually broke LTO bootstrap (and a lot of LTO testcases), so for the BLOCK case I've solved it by clear_line_info setting current_block to something that should never appear, which means that in the LTO stream after the clear_line_info spots including the start of the LTO stream we force the block change bit to be set and thus BLOCK to be streamed and therefore stream_block from earlier to be ignored. But for the rest I think that is not the case, so I wonder if we don't sometimes end up with wrong line/column info because of that, or please tell me what prevents that. clear_line_info does: ob->current_file = NULL; ob->current_line = 0; ob->current_col = 0; ob->current_sysp = false; while I think NULL current_file is something that should likely be different from expanded_location (...).file (UNKNOWN_LOCATION/BUILTINS_LOCATION are handled separately and not go through the caching), I think line number 0 can sometimes occur and especially column 0 occurs frequently if we ran out of location_t with columns info. But then we do: bp_pack_value (bp, ob->current_file != xloc.file, 1); bp_pack_value (bp, ob->current_line != xloc.line, 1); bp_pack_value (bp, ob->current_col != xloc.column, 1); and stream the details only if the != is true. If that happens immediately after clear_line_info and e.g. xloc.column is 0, we would stream 0 bit and not stream the actual value, so on read-in it would reuse whatever stream_col etc. were before. Shouldn't we set some ob->current_* new bit that would signal we are immediately past clear_line_info which would force all these != checks to non-zero? Either by oring something into those tests, or perhaps: if (ob->current_reset) { if (xloc.file == NULL) ob->current_file = ""; if (xloc.line == 0) ob->current_line = 1; if (xloc.column == 0) ob->current_column = 1; ob->current_reset = false; } before doing those bp_pack_value calls with a comment, effectively forcing all 6 != comparisons to be true? 2020-09-03 Jakub Jelinek <ja...@redhat.com> PR lto/94311 * gimple.h (gimple_location_ptr, gimple_phi_arg_location_ptr): New functions. * streamer-hooks.h (struct streamer_hooks): Add output_location_and_block callback. Fix up formatting for output_location. (stream_output_location_and_block): Define. * lto-streamer.h (class lto_location_cache): Fix comment typo. Add current_block member. (lto_location_cache::input_location_and_block): New method. (lto_location_cache::lto_location_cache): Initialize current_block. (lto_location_cache::cached_location): Add block member. (struct output_block): Add current_block member. (lto_output_location): Formatting fix. (lto_output_location_and_block): Declare. * lto-streamer.c (lto_streamer_hooks_init): Initialize streamer_hooks.output_location_and_block. * lto-streamer-in.c (lto_location_cache::cmp_loc): Also compare block members. (lto_location_cache::apply_location_cache): Handle blocks. (lto_location_cache::accept_location_cache, lto_location_cache::revert_location_cache): Fix up function comments. (lto_location_cache::input_location_and_block): New method. (lto_location_cache::input_location): Implement using input_location_and_block. (input_function): Invoke apply_location_cache after streaming in all bbs. * lto-streamer-out.c (clear_line_info): Set current_block. (lto_output_location_1): New function, moved from lto_output_location, added block handling. (lto_output_location): Implement using lto_output_location_1. (lto_output_location_and_block): New function. * gimple-streamer-in.c (input_phi): Use input_location_and_block to input and cache both location and block. (input_gimple_stmt): Likewise. * gimple-streamer-out.c (output_phi): Use stream_output_location_and_block. (output_gimple_stmt): Likewise. --- gcc/gimple.h.jj 2020-08-03 22:54:51.426531549 +0200 +++ gcc/gimple.h 2020-09-02 13:51:38.841642793 +0200 @@ -1879,6 +1879,14 @@ gimple_set_location (gimple *g, location g->location = location; } +/* Return address of the location information for statement G. */ + +static inline location_t * +gimple_location_ptr (gimple *g) +{ + return &g->location; +} + /* Return true if G contains location information. */ @@ -4581,6 +4589,14 @@ gimple_phi_arg_set_location (gphi *phi, gimple_phi_arg (phi, i)->locus = loc; } +/* Return address of source location of gimple argument I of phi node PHI. */ + +static inline location_t * +gimple_phi_arg_location_ptr (gphi *phi, size_t i) +{ + return &gimple_phi_arg (phi, i)->locus; +} + /* Return TRUE if argument I of phi node PHI has a location record. */ static inline bool --- gcc/streamer-hooks.h.jj 2020-01-12 11:54:36.936405518 +0100 +++ gcc/streamer-hooks.h 2020-09-02 14:18:03.304429642 +0200 @@ -54,8 +54,15 @@ struct streamer_hooks { /* [REQ] Called by every streaming routine that needs to read a location. */ void (*input_location) (location_t *, struct bitpack_d *, class data_in *); - /* [REQ] Called by every streaming routine that needs to write a location. */ - void (*output_location) (struct output_block *, struct bitpack_d *, location_t); + /* [REQ] Called by every streaming routine that needs to write a + location. */ + void (*output_location) (struct output_block *, struct bitpack_d *, + location_t); + + /* [REQ] Called by every streaming routine that needs to write a + location, both LOCATION_LOCUS and LOCATION_BLOCK. */ + void (*output_location_and_block) (struct output_block *, struct bitpack_d *, + location_t); }; #define stream_write_tree(OB, EXPR, REF_P) \ @@ -73,6 +80,9 @@ struct streamer_hooks { #define stream_output_location(OB, BP, LOC) \ streamer_hooks.output_location (OB, BP, LOC) +#define stream_output_location_and_block(OB, BP, LOC) \ + streamer_hooks.output_location_and_block (OB, BP, LOC) + /* Streamer hooks. */ extern struct streamer_hooks streamer_hooks; --- gcc/lto-streamer.h.jj 2020-08-03 22:54:51.429531506 +0200 +++ gcc/lto-streamer.h 2020-09-02 14:21:13.829639521 +0200 @@ -262,7 +262,7 @@ typedef void (lto_free_section_data_f) ( /* The location cache holds expanded locations for streamed in trees. This is done to reduce memory usage of libcpp linemap that strongly prefers - locations to be inserted in the soruce order. */ + locations to be inserted in the source order. */ class lto_location_cache { @@ -276,9 +276,13 @@ public: void revert_location_cache (); void input_location (location_t *loc, struct bitpack_d *bp, class data_in *data_in); + void input_location_and_block (location_t *loc, struct bitpack_d *bp, + class lto_input_block *ib, + class data_in *data_in); lto_location_cache () : loc_cache (), accepted_length (0), current_file (NULL), current_line (0), - current_col (0), current_sysp (false), current_loc (UNKNOWN_LOCATION) + current_col (0), current_sysp (false), current_loc (UNKNOWN_LOCATION), + current_block (NULL_TREE) { gcc_assert (!current_cache); current_cache = this; @@ -304,6 +308,7 @@ private: location_t *loc; int line, col; bool sysp; + tree block; }; /* The location cache. */ @@ -325,6 +330,7 @@ private: int current_col; bool current_sysp; location_t current_loc; + tree current_block; }; /* Structure used as buffer for reading an LTO file. */ @@ -711,6 +717,7 @@ struct output_block int current_line; int current_col; bool current_sysp; + tree current_block; /* Cache of nodes written in this section. */ struct streamer_tree_cache_d *writer_cache; @@ -881,7 +888,10 @@ void lto_output_decl_state_streams (stru void lto_output_decl_state_refs (struct output_block *, struct lto_output_stream *, struct lto_out_decl_state *); -void lto_output_location (struct output_block *, struct bitpack_d *, location_t); +void lto_output_location (struct output_block *, struct bitpack_d *, + location_t); +void lto_output_location_and_block (struct output_block *, struct bitpack_d *, + location_t); void lto_output_init_mode_table (void); void lto_prepare_function_for_streaming (cgraph_node *); --- gcc/lto-streamer.c.jj 2020-07-28 15:39:09.963756846 +0200 +++ gcc/lto-streamer.c 2020-09-02 14:19:23.624253409 +0200 @@ -272,4 +272,5 @@ lto_streamer_hooks_init (void) streamer_hooks.read_tree = lto_input_tree; streamer_hooks.input_location = lto_input_location; streamer_hooks.output_location = lto_output_location; + streamer_hooks.output_location_and_block = lto_output_location_and_block; } --- gcc/lto-streamer-in.c.jj 2020-08-27 18:42:35.604712155 +0200 +++ gcc/lto-streamer-in.c 2020-09-02 15:23:12.454221865 +0200 @@ -156,7 +156,18 @@ lto_location_cache::cmp_loc (const void return a->sysp ? 1 : -1; if (a->line != b->line) return a->line - b->line; - return a->col - b->col; + if (a->col != b->col) + return a->col - b->col; + if ((a->block == NULL_TREE) != (b->block == NULL_TREE)) + return a->block ? 1 : -1; + if (a->block) + { + if (BLOCK_NUMBER (a->block) < BLOCK_NUMBER (b->block)) + return -1; + if (BLOCK_NUMBER (a->block) > BLOCK_NUMBER (b->block)) + return 1; + } + return 0; } /* Apply all changes in location cache. Add locations into linemap and patch @@ -191,22 +202,33 @@ lto_location_cache::apply_location_cache linemap_line_start (line_table, loc.line, max + 1); } gcc_assert (*loc.loc == BUILTINS_LOCATION + 1); - if (current_file == loc.file && current_line == loc.line - && current_col == loc.col) - *loc.loc = current_loc; - else - current_loc = *loc.loc = linemap_position_for_column (line_table, - loc.col); + if (current_file != loc.file + || current_line != loc.line + || current_col != loc.col) + { + current_loc = linemap_position_for_column (line_table, loc.col); + if (loc.block) + current_loc = set_block (current_loc, loc.block); + } + else if (current_block != loc.block) + { + if (loc.block) + current_loc = set_block (current_loc, loc.block); + else + current_loc = LOCATION_LOCUS (current_loc); + } + *loc.loc = current_loc; current_line = loc.line; prev_file = current_file = loc.file; current_col = loc.col; + current_block = loc.block; } loc_cache.truncate (0); accepted_length = 0; return true; } -/* Tree merging did not suceed; mark all changes in the cache as accepted. */ +/* Tree merging did not succeed; mark all changes in the cache as accepted. */ void lto_location_cache::accept_location_cache () @@ -215,7 +237,7 @@ lto_location_cache::accept_location_cach accepted_length = loc_cache.length (); } -/* Tree merging did suceed; throw away recent changes. */ +/* Tree merging did succeed; throw away recent changes. */ void lto_location_cache::revert_location_cache () @@ -223,33 +245,46 @@ lto_location_cache::revert_location_cach loc_cache.truncate (accepted_length); } -/* Read a location bitpack from input block IB and either update *LOC directly - or add it to the location cache. +/* Read a location bitpack from bit pack BP and either update *LOC directly + or add it to the location cache. If IB is non-NULL, stream in a block + afterwards. It is neccesary to call apply_location_cache to get *LOC updated. */ void -lto_location_cache::input_location (location_t *loc, struct bitpack_d *bp, - class data_in *data_in) +lto_location_cache::input_location_and_block (location_t *loc, + struct bitpack_d *bp, + class lto_input_block *ib, + class data_in *data_in) { static const char *stream_file; static int stream_line; static int stream_col; static bool stream_sysp; - bool file_change, line_change, column_change; + static tree stream_block; gcc_assert (current_cache == this); *loc = bp_unpack_int_in_range (bp, "location", 0, RESERVED_LOCATION_COUNT); if (*loc < RESERVED_LOCATION_COUNT) - return; + { + if (ib) + { + bool block_change = bp_unpack_value (bp, 1); + if (block_change) + stream_block = stream_read_tree (ib, data_in); + if (stream_block) + *loc = set_block (*loc, stream_block); + } + return; + } /* Keep value RESERVED_LOCATION_COUNT in *loc as linemap lookups will ICE on it. */ - file_change = bp_unpack_value (bp, 1); - line_change = bp_unpack_value (bp, 1); - column_change = bp_unpack_value (bp, 1); + bool file_change = bp_unpack_value (bp, 1); + bool line_change = bp_unpack_value (bp, 1); + bool column_change = bp_unpack_value (bp, 1); if (file_change) { @@ -263,21 +298,48 @@ lto_location_cache::input_location (loca if (column_change) stream_col = bp_unpack_var_len_unsigned (bp); - /* This optimization saves location cache operations druing gimple + tree block = NULL_TREE; + if (ib) + { + bool block_change = bp_unpack_value (bp, 1); + if (block_change) + stream_block = stream_read_tree (ib, data_in); + block = stream_block; + } + + /* This optimization saves location cache operations during gimple streaming. */ - if (current_file == stream_file && current_line == stream_line - && current_col == stream_col && current_sysp == stream_sysp) - { - *loc = current_loc; + if (current_file == stream_file + && current_line == stream_line + && current_col == stream_col + && current_sysp == stream_sysp) + { + if (current_block == block) + *loc = current_loc; + else if (block) + *loc = set_block (current_loc, block); + else + *loc = LOCATION_LOCUS (current_loc); return; } struct cached_location entry - = {stream_file, loc, stream_line, stream_col, stream_sysp}; + = {stream_file, loc, stream_line, stream_col, stream_sysp, block}; loc_cache.safe_push (entry); } +/* Read a location bitpack from bit pack BP and either update *LOC directly + or add it to the location cache. + It is neccesary to call apply_location_cache to get *LOC updated. */ + +void +lto_location_cache::input_location (location_t *loc, struct bitpack_d *bp, + class data_in *data_in) +{ + return input_location_and_block (loc, bp, NULL, data_in); +} + /* Read a location bitpack from input block IB and either update *LOC directly or add it to the location cache. It is neccesary to call apply_location_cache to get *LOC updated. */ @@ -1101,6 +1163,9 @@ input_function (tree fn_decl, class data tag = streamer_read_record_start (ib); } + /* Finalize gimple_location/gimple_block of stmts and phis. */ + data_in->location_cache.apply_location_cache (); + /* Fix up the call statements that are mentioned in the callgraph edges. */ set_gimple_stmt_max_uid (cfun, 0); --- gcc/lto-streamer-out.c.jj 2020-07-28 15:39:09.963756846 +0200 +++ gcc/lto-streamer-out.c 2020-09-02 15:16:27.634141378 +0200 @@ -60,6 +60,10 @@ clear_line_info (struct output_block *ob ob->current_line = 0; ob->current_col = 0; ob->current_sysp = false; + /* Initialize to something that will never appear as block, + so that the first location with block in a function etc. + always streams a change_block bit and the first block. */ + ob->current_block = void_node; } @@ -178,40 +182,72 @@ tree_is_indexable (tree t) After outputting bitpack, lto_output_location_data has to be done to output actual data. */ -void -lto_output_location (struct output_block *ob, struct bitpack_d *bp, - location_t loc) +static void +lto_output_location_1 (struct output_block *ob, struct bitpack_d *bp, + location_t orig_loc, bool block_p) { - expanded_location xloc; + location_t loc = LOCATION_LOCUS (orig_loc); - loc = LOCATION_LOCUS (loc); bp_pack_int_in_range (bp, 0, RESERVED_LOCATION_COUNT, loc < RESERVED_LOCATION_COUNT ? loc : RESERVED_LOCATION_COUNT); - if (loc < RESERVED_LOCATION_COUNT) - return; - - xloc = expand_location (loc); + if (loc >= RESERVED_LOCATION_COUNT) + { + expanded_location xloc = expand_location (loc); - bp_pack_value (bp, ob->current_file != xloc.file, 1); - bp_pack_value (bp, ob->current_line != xloc.line, 1); - bp_pack_value (bp, ob->current_col != xloc.column, 1); + bp_pack_value (bp, ob->current_file != xloc.file, 1); + bp_pack_value (bp, ob->current_line != xloc.line, 1); + bp_pack_value (bp, ob->current_col != xloc.column, 1); + + if (ob->current_file != xloc.file) + { + bp_pack_string (ob, bp, remap_debug_filename (xloc.file), true); + bp_pack_value (bp, xloc.sysp, 1); + } + ob->current_file = xloc.file; + ob->current_sysp = xloc.sysp; + + if (ob->current_line != xloc.line) + bp_pack_var_len_unsigned (bp, xloc.line); + ob->current_line = xloc.line; + + if (ob->current_col != xloc.column) + bp_pack_var_len_unsigned (bp, xloc.column); + ob->current_col = xloc.column; + } - if (ob->current_file != xloc.file) + if (block_p) { - bp_pack_string (ob, bp, remap_debug_filename (xloc.file), true); - bp_pack_value (bp, xloc.sysp, 1); + tree block = LOCATION_BLOCK (orig_loc); + bp_pack_value (bp, ob->current_block != block, 1); + streamer_write_bitpack (bp); + if (ob->current_block != block) + lto_output_tree (ob, block, true, true); + ob->current_block = block; } - ob->current_file = xloc.file; - ob->current_sysp = xloc.sysp; +} + +/* Output info about new location into bitpack BP. + After outputting bitpack, lto_output_location_data has + to be done to output actual data. */ - if (ob->current_line != xloc.line) - bp_pack_var_len_unsigned (bp, xloc.line); - ob->current_line = xloc.line; - - if (ob->current_col != xloc.column) - bp_pack_var_len_unsigned (bp, xloc.column); - ob->current_col = xloc.column; +void +lto_output_location (struct output_block *ob, struct bitpack_d *bp, + location_t loc) +{ + lto_output_location_1 (ob, bp, loc, false); +} + +/* Output info about new location into bitpack BP. + After outputting bitpack, lto_output_location_data has + to be done to output actual data. Like lto_output_location, but + additionally output LOCATION_BLOCK info too and write the BP bitpack. */ + +void +lto_output_location_and_block (struct output_block *ob, struct bitpack_d *bp, + location_t loc) +{ + lto_output_location_1 (ob, bp, loc, true); } --- gcc/gimple-streamer-in.c.jj 2020-01-12 11:54:36.625410210 +0100 +++ gcc/gimple-streamer-in.c 2020-09-02 13:53:39.716871790 +0200 @@ -57,11 +57,7 @@ input_phi (class lto_input_block *ib, ba tree def = stream_read_tree (ib, data_in); int src_index = streamer_read_uhwi (ib); bitpack_d bp = streamer_read_bitpack (ib); - /* Do not cache a location - we do not have API to get pointer to the - location in PHI statement and we may trigger reallocation. */ - location_t arg_loc = stream_input_location_now (&bp, data_in); basic_block sbb = BASIC_BLOCK_FOR_FN (fn, src_index); - edge e = NULL; int j; @@ -72,7 +68,11 @@ input_phi (class lto_input_block *ib, ba break; } - add_phi_arg (result, def, e, arg_loc); + add_phi_arg (result, def, e, UNKNOWN_LOCATION); + /* Read location and lexical block information. */ + location_t *arg_locp = gimple_phi_arg_location_ptr (result, e->dest_idx); + data_in->location_cache.input_location_and_block (arg_locp, &bp, ib, + data_in); } return result; @@ -106,12 +106,9 @@ input_gimple_stmt (class lto_input_block has_hist = bp_unpack_value (&bp, 1); stmt->subcode = bp_unpack_var_len_unsigned (&bp); - /* Read location information. Caching here makes no sense until streamer - cache can handle the following gimple_set_block. */ - gimple_set_location (stmt, stream_input_location_now (&bp, data_in)); - - /* Read lexical block reference. */ - gimple_set_block (stmt, stream_read_tree (ib, data_in)); + /* Read location and lexical block information. */ + data_in->location_cache.input_location_and_block (gimple_location_ptr (stmt), + &bp, ib, data_in); /* Read in all the operands. */ switch (code) --- gcc/gimple-streamer-out.c.jj 2020-01-12 11:54:36.625410210 +0100 +++ gcc/gimple-streamer-out.c 2020-09-02 14:34:20.360123419 +0200 @@ -48,8 +48,8 @@ output_phi (struct output_block *ob, gph stream_write_tree (ob, gimple_phi_arg_def (phi, i), true); streamer_write_uhwi (ob, gimple_phi_arg_edge (phi, i)->src->index); bitpack_d bp = bitpack_create (ob->main_stream); - stream_output_location (ob, &bp, gimple_phi_arg_location (phi, i)); - streamer_write_bitpack (&bp); + location_t loc = gimple_phi_arg_location (phi, i); + stream_output_location_and_block (ob, &bp, loc); } } @@ -84,12 +84,8 @@ output_gimple_stmt (struct output_block bp_pack_value (&bp, hist != NULL, 1); bp_pack_var_len_unsigned (&bp, stmt->subcode); - /* Emit location information for the statement. */ - stream_output_location (ob, &bp, LOCATION_LOCUS (gimple_location (stmt))); - streamer_write_bitpack (&bp); - - /* Emit the lexical block holding STMT. */ - stream_write_tree (ob, gimple_block (stmt), true); + /* Emit location information for the statement, including gimple_block. */ + stream_output_location_and_block (ob, &bp, gimple_location (stmt)); /* Emit the operands. */ switch (gimple_code (stmt)) Jakub