On Tue, Nov 5, 2024 at 6:16 PM Lewis Hyatt <lhy...@gmail.com> wrote: > > On Tue, Nov 05, 2024 at 10:56:30AM +0100, Jakub Jelinek wrote: > > On Tue, Nov 05, 2024 at 10:42:10AM +0100, Richard Biener wrote: > > > > Actually, I think cpp_token isn't that big deal, that should be > > > > short-lived > > > > unless using huge macros. > > > > cp_token in the C++ FE is more important, the FE uses a vector of those > > > > and there is one cp_token per token read from libcpp. > > > > Unfortunately, I'm afraid there is nothing that can be done there, > > > > the struct has currently 29 bits of various flags, then 32 bit > > > > location_t > > > > and then union with a single pointer in it, so nicely 16 bytes. > > > > Now it will be 24 bytes, with 35 spare bits for flags. > > > > And the vector is live across the whole parsing (pointer to it cleared > > > > at > > > > the end of parsing, so GC collect can use it). > > > > > > So cp_token[] could be split into two arrays to avoid the 32bit padding > > > with > > > the enlarged location_t. Maybe that's even more cache efficient if one > > > 32bit field is often the only accessed one when sweeping over a chain > > > of tokens. > > > > Not without rewriting the whole parser, cp_token is the basic structure it > > uses everywhere, most of the code doesn't really care whether the tokens are > > in a vector or where, they just peek a token or peek 2nd token or similarly. > > And having to rewrite all cp_token -> accesses into some inline function > > calls or macros that would for e.g. the 32-bit pointer to cp_token_flags > > look up the corresponding location/u.value/u.tree_check_value would be a > > nightmare, there are about 950 such cases. What is worse, not all cp_tokens > > live in the parser->lexer->buffer vector, some of them live in automatic > > variables. > > > > Jakub > > > > Thank you everyone for considering this change. I guess one reason I thought > about making it configurable was to give some time to optimize the memory > usage before forcing it by default, but I see Richard's point that it is > probably not possible to optimize the memory usage without committing to the > size first. > > I have a version of the patch set which removes the configurability, I can > send it that way for the next version if it makes sense to proceed with that?
Yes please. > Also I attached here the outputs of: > > g++ -O3 -c -fmem-report gcc/testsuite/g++.dg/modules/xtreme-header.h > > with --enable-gather-detailed-mem-stats, and with 32- or 64-bit > location_t... in case that helps with some of this discussion. GIMPLE statements Kind Stmts Bytes --------------------------------------- assignments 0 0 phi nodes 0 0 conditionals 0 0 everything else 0 0 --------------------------------------- Total 0 0 hmm, looks like stats no longer(?) work for GIMPLE :/ So we have 5MB more in trees (up to 100MB now) but the linemap stats show a 100% growth: Duplicated maps locations size: 836k Total allocated maps size: 4365k Total used maps size: 3617k to Duplicated maps locations size: 1673k Total allocated maps size: 8794k Total used maps size: 6238k with "improvement" only for unoptimized_ranges (I didn't go into the code to understand the statistics, so maybe my conclusion is wrong?) We can only guess that the other ~10MB peak memory goes to GIMPLE and possibly GC overhead which goes from (total accumulated) 2.7MB to 9MB. It would be nice to understand why the GIMPLE reports are empty - ah, maybe the module testcase is -fsyntax-only? Richard. > For the above compile I see the peak memory usage (reported by time -v) > change from 430.5 MB to 451.3 MB (+5%). > > -Lewis